+1 on Nutch!
On Fri, Jan 21, 2011 at 4:11 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> Hi,
>
> Please take a look at Apache Nutch. It can crawl through a file system over
> FTP. After crawling, it can use Tika to extract the content from your PDF
> files and other documents. Finally, you can send the data to your Solr
> server for indexing.
>
> http://nutch.apache.org/
>
>> Hi All,
>> Is there any way in Solr, or any plug-in, through which the folders and
>> documents in an FTP location can be indexed?
>>
>> / Pankaj Bhatt.
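
For anyone wondering what the "crawl over FTP" step amounts to, here is a rough Python sketch of walking an FTP tree and collecting file paths. This is not how Nutch does it internally; it is only an illustration. The function name walk_ftp is made up for this example, and it assumes the server supports the MLSD command (exposed as mlsd() on Python's ftplib.FTP). Extraction with Tika and posting to Solr are left out.

```python
import posixpath


def walk_ftp(ftp, root="/"):
    """Yield the full path of every regular file found below `root`.

    `ftp` is expected to behave like a logged-in ftplib.FTP object,
    i.e. it must provide mlsd(path) returning (name, facts) pairs.
    """
    pending = [root]  # directories still to be listed
    while pending:
        current = pending.pop()
        for name, facts in ftp.mlsd(current):
            kind = facts.get("type")
            path = posixpath.join(current, name)
            if kind == "dir":
                # Descend into subdirectories later.
                pending.append(path)
            elif kind == "file":
                yield path
            # "cdir" and "pdir" entries (. and ..) are skipped on purpose.
```

With a real server you would open the connection with ftplib.FTP, call login(), and iterate over walk_ftp(ftp, "/"); each path could then be downloaded and handed to whatever does the text extraction and indexing.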