+1 on Nutch!

On Fri, Jan 21, 2011 at 4:11 PM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> Hi,
>
> Please take a look at Apache Nutch. It can crawl through a file system over
> FTP. After crawling, it can use Tika to extract the content from your PDF
> files and other documents. Finally, you can send the data to your Solr
> server for indexing.
>
> http://nutch.apache.org/
>
>> Hi All,
>>   Is there any way in Solr, or any plug-in, through which the folders and
>> documents in an FTP location can be indexed?
>>
>> / Pankaj Bhatt.
>
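
For anyone who wants to see the shape of the crawl, extract, index flow Markus describes, here is a rough Python sketch. It is not how Nutch does it internally, only an illustration under some assumptions: the FTP server supports MLSD listings, the text has already been extracted from each file (for example by Tika), and the Solr schema has "id" and "content" fields. The function names, the host and the Solr URL are all made up for the example.

```python
import json
import urllib.request
from ftplib import FTP  # used in the usage note below


def list_ftp_files(ftp, directory):
    """Recursively yield file paths under `directory` (assumes MLSD support)."""
    for name, facts in ftp.mlsd(directory):
        path = directory.rstrip("/") + "/" + name
        if facts.get("type") == "dir":
            yield from list_ftp_files(ftp, path)
        elif facts.get("type") == "file":
            yield path
        # "cdir" / "pdir" entries (current and parent directory) are skipped


def build_solr_docs(host, extracted):
    """Turn {path: extracted_text} into Solr documents.

    The "id" and "content" field names are assumptions about the schema.
    """
    return [
        {"id": "ftp://" + host + path, "content": text}
        for path, text in sorted(extracted.items())
    ]


def make_update_request(solr_update_url, docs):
    """Build (but do not send) a JSON update request for a Solr update handler."""
    return urllib.request.Request(
        solr_update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To try it, you would log in with something like FTP("files.example.org"), collect the paths from list_ftp_files(ftp, "/"), run each file through your extractor, and pass the resulting request to urllib.request.urlopen. The exact update URL depends on your Solr version and core setup.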
