Re: Generating a sitemap

2010-09-23 Thread Doki
Hi all, I hate to bring back a zombie thread (Mar 2010 though, not too bad), but I have also been tasked with generating a sitemap for items indexed in a Solr index. I've been at this job for only a few weeks, so Solr and Lucene are all new to me, but I think my path forward on this is to create a requ

Re: Generating a sitemap

2010-03-19 Thread Jon Baer
It's unfortunately a pretty domain-specific thing (URLs, content, etc.), and there are also limits at certain points (see ... but we took CNN.com as a model, for example: http://www.cnn.com/video_sitemap_index.xml http://www.cnn.com/sitemap_videos_0001.xml Then you just line up the big 3 w/
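For illustration, the index-plus-numbered-files layout in the CNN example could be sketched roughly as below. This is a minimal sketch, not what Jon's setup actually does: the 50,000-URLs-per-file cap is the one from the sitemaps.org protocol (assumed here to be one of the limits meant above), and `split_sitemaps`, `base_url`, and the `sitemap_%04d.xml` naming are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Per-file URL cap from the sitemaps.org protocol (assumed to be the relevant limit).
MAX_URLS = 50000


def build_sitemap(urls):
    """Serialize one <urlset> document containing the given page URLs."""
    root = ET.Element("urlset", xmlns=NS)
    for u in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = u
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Serialize a <sitemapindex> document pointing at each sitemap file."""
    root = ET.Element("sitemapindex", xmlns=NS)
    for u in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = u
    return ET.tostring(root, encoding="unicode")


def split_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return (index_xml, {filename: sitemap_xml}) for a list of page URLs.

    base_url is the hypothetical public prefix the files are served from.
    """
    files = {}
    for i in range(0, len(urls), max_urls):
        name = "sitemap_%04d.xml" % (i // max_urls + 1)
        files[name] = build_sitemap(urls[i:i + max_urls])
    index_xml = build_index([base_url + name for name in files])
    return index_xml, files
```

Calling `split_sitemaps(urls, "http://www.example.com/")` returns the index document plus a dict of numbered sitemap files; writing them to disk and registering the index with each search engine is left out.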

Re: Generating a sitemap

2010-03-19 Thread Erik Hatcher
Jon - Very cool use of VelocityResponseWriter! Would you happen to have a sitemap.vm template to contribute? I realize there'd need to be an external URL configurable, but this would be trivially added as a request parameter and leveraged in the template. Erik p.s. Anyone else

Re: Generating a sitemap

2010-03-18 Thread Jon Baer
It's also possible to use the Velocity contrib response writer and page it w/ the sitemap elements. BTW, generating a sitemap was a big reason for the switch we made from GSA to Solr, because (for some reason) the map took way too long to generate (even for simple requests). If you page throug

Re: Generating a sitemap

2010-03-18 Thread Chris Hostetter
: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.

it's pretty easy to just paginate through all docs in solr, so you could do that -- but I'd be really surprised if Nut
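A rough sketch of the "paginate through all docs" approach, using Solr's standard `start`/`rows` parameters and the JSON response writer. The endpoint `http://localhost:8983/solr/select` and the field name `url` are assumptions for illustration; adjust both to match the actual index.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed endpoint
URL_FIELD = "url"  # assumed name of the stored field holding each page URL


def fetch_page(start, rows):
    """Fetch one page of results from Solr as a parsed JSON dict."""
    params = urlencode({
        "q": "*:*",        # match every document
        "fl": URL_FIELD,   # only return the URL field
        "start": start,
        "rows": rows,
        "wt": "json",
    })
    with urlopen(SOLR_SELECT + "?" + params) as resp:
        return json.load(resp)


def iter_urls(fetch=fetch_page, rows=1000):
    """Yield the URL of every document, one page of `rows` docs at a time."""
    start = 0
    while True:
        response = fetch(start, rows)["response"]
        docs = response["docs"]
        if not docs:
            break
        for doc in docs:
            if URL_FIELD in doc:
                yield doc[URL_FIELD]
        start += len(docs)
        if start >= response["numFound"]:
            break
```

`iter_urls()` can then feed whatever writes the sitemap XML. Passing a different `fetch` callable makes it possible to exercise the paging logic without a running Solr instance.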