Re: indexing/crawling HTML + solr

2009-06-03 Thread Otis Gospodnetic
Gena,

Besides Droids (simpler, smaller components you can put together), there is also Nutch, a bigger beast for large-scale crawling that indexes crawled pages into Solr: http://lucene.apache.org/nutch

Otis

----- Original Message -----
> From: Gena Batsyan
> To: solr-user@lucene.apache.org

Re: indexing/crawling HTML + solr

2009-06-03 Thread Olivier Dobberkau
Hi,

Have a look at the Droids project in the Incubator.

Olivier

Sent from my iPhone

On 03.06.2009 at 12:09, Gena Batsyan wrote:

> Hi! To be short, where to start with the subject? Any pointers to some [semi-]functional solutions that crawl the web as a normal crawler, take care ab