: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.

It's pretty easy to just paginate through all docs in Solr, so you could 
do that -- but I'd be really surprised if Nutch wasn't also logging all the 
URLs it indexed, so you could just post-process that log to build the 
sitemap as well.
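
For the pagination route, here is a rough sketch in Python. It is only an 
illustration under some assumptions: that the page URL is stored in a field 
named "url", that Solr's select handler is reachable at the base URL you pass 
in, and that JSON responses (wt=json) are available. The function names 
(fetch_page, iter_urls, build_sitemap) are made up for this example.

```python
import json
import urllib.parse
import urllib.request
from xml.sax.saxutils import escape


def fetch_page(solr_url, start, rows, url_field="url"):
    """Fetch one page of docs from Solr's select handler.

    solr_url and url_field are assumptions; adjust them to your setup.
    """
    params = urllib.parse.urlencode({
        "q": "*:*",        # match every document
        "fl": url_field,   # only return the URL field
        "start": start,
        "rows": rows,
        "wt": "json",
    })
    with urllib.request.urlopen(f"{solr_url}/select?{params}") as resp:
        return json.load(resp)["response"]["docs"]


def iter_urls(fetch, rows=1000, url_field="url"):
    """Yield URLs page by page; fetch(start, rows) returns a list of docs."""
    start = 0
    while True:
        docs = fetch(start, rows)
        if not docs:
            break  # an empty page means we are past the last document
        for doc in docs:
            value = doc.get(url_field)
            if value:
                # A multi-valued field comes back as a list; take the first.
                yield value[0] if isinstance(value, list) else value
        start += rows


def build_sitemap(urls):
    """Return a sitemap XML string with one <url> entry per URL."""
    quotes = {"'": "&apos;", '"': "&quot;"}
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # The sitemap protocol requires entity-escaped URLs.
        lines.append(f"  <url><loc>{escape(url, quotes)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Example wiring against a live Solr (the base URL is an assumption):
#   fetch = lambda start, rows: fetch_page("http://localhost:8983/solr", start, rows)
#   print(build_sitemap(iter_urls(fetch)))
```

Passing the fetch function in keeps the paging loop independent of how Solr 
is reached. Two things to keep in mind: the sitemap protocol caps a single 
file at 50,000 URLs, so a bigger index needs several files plus a sitemap 
index, and start/rows paging gets slow deep into a large result set.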



-Hoss
