Re: Indexing documents with SOLR

2010-12-11 Thread Adam Estrada
Pankaj, check this article out on how to get going with Nutch: http://bit.ly/dbBdK4 This is a few months old, so you will have to note that there is a new parameter called something like -SolrUrl that will allow you to update your solr index with the crawled data. For crawling your local file syste

Re: Indexing documents with SOLR

2010-12-10 Thread Adam Estrada
Nutch is also a great option if you want a crawler. I have found that you will need to use the latest version of PDFBox and its dependencies for better results. Also, make sure to set JAVA_OPT to something really large so that you won't exceed your heap size. Adam On Fri, Dec 10, 2010 at 6:27

Re: Indexing documents with SOLR

2010-12-10 Thread Tommaso Teofili
Hi Pankaj, you can find the needed documentation right here [1]. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/ExtractingRequestHandler 2010/12/10 pankaj bhatt > Hi All, > I am a newbie to SOLR and trying to integrate TIKA + SOLR. > Can anyone please guide me, how to achieve
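
For readers following the ExtractingRequestHandler link above, here is a minimal sketch of posting a file to Solr so that Tika extracts its text. It assumes a Solr instance at http://localhost:8983/solr with the handler mapped to /update/extract, as in the example configuration; the base URL, the document id, and the file name are placeholders, not values from this thread. The helper names build_extract_url and post_document are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the /update/extract URL for one document.

    literal.id sets the unique key of the indexed document;
    commit=true asks Solr to commit right after the update.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


def post_document(solr_base, doc_id, path, content_type="application/pdf"):
    """Send the raw file bytes as the request body (assumed setup, see above)."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(solr_base, doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder values; adjust to your own Solr install and file.
    print(build_extract_url("http://localhost:8983/solr", "doc1"))
```

Calling post_document("http://localhost:8983/solr", "doc1", "report.pdf") would then submit the file, provided the handler is enabled in solrconfig.xml and Solr is running.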