RE: Indexing URLs for Binaries

2014-01-03 Thread Teague James
Re: Indexing URLs for Binaries

Check suffix-urlfilter.txt in your conf directory for Nutch. You might be prohibiting those filetypes from the crawl.

- Mark

On 1/3/14, 10:29 AM, "Teague James" wrote:
>I am using Nutch 1.7 with Solr 4.6.0 to index websites that have links
>

Re: Indexing URLs for Binaries

2014-01-03 Thread Reyes, Mark
Check suffix-urlfilter.txt in your conf directory for Nutch. You might be prohibiting those filetypes from the crawl.

- Mark

On 1/3/14, 10:29 AM, "Teague James" wrote:
>I am using Nutch 1.7 with Solr 4.6.0 to index websites that have links to
>binary files, such as Word, PDF, etc. The craw
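
One quick way to act on the suggestion above is to list the lines in the filter file that mention the binary extensions you care about. The following is only a rough sketch: the helper name find_suffix_rules is hypothetical, the path conf/suffix-urlfilter.txt is assumed from the message above, and the script does not try to interpret Nutch's filter semantics. It only shows which non-comment lines to look at by hand.

```python
from pathlib import Path


def find_suffix_rules(filter_text, extensions):
    """Return (line_number, line) pairs for non-comment lines that
    mention any of the given extensions (case-insensitive).

    Hypothetical helper: it only locates lines to inspect and does
    not decide whether a URL would be accepted or rejected.
    """
    wanted = [ext.lower() for ext in extensions]
    hits = []
    for number, raw in enumerate(filter_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        lowered = line.lower()
        if any(ext in lowered for ext in wanted):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Assumed location, relative to the Nutch install directory.
    path = Path("conf/suffix-urlfilter.txt")
    if path.exists():
        for number, line in find_suffix_rules(path.read_text(), [".pdf", ".doc"]):
            print(f"{number}: {line}")
```

If this prints any lines, those are the entries worth checking against the file types that are missing from the index.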