Re: Exclude urls without 'www' from Nutch 1.7 crawl

2013-11-01 Thread Reyes, Mark
Noted and will do (that goes twice for the suggestions and putting this on the Nutch list instead). Thanks all, Mark On 11/1/13, 10:53 AM, "Furkan KAMACI" wrote: > As Markus pointed out, Nutch has a feature for this kind of situation. This is the Solr list, but one more thing for you: www.mywebsite.com

Re: Exclude urls without 'www' from Nutch 1.7 crawl

2013-11-01 Thread Furkan KAMACI
As Markus pointed out, Nutch has a feature for this kind of situation. This is the Solr list, but one more thing for you: www.mywebsite.com and mywebsite.com may point to "different" pages. 2013/11/1 Markus Jelsma > Hi - Use the domain-urlfilter for host, domain and TLD filtering. > > Also, please ask que

RE: Exclude urls without 'www' from Nutch 1.7 crawl

2013-11-01 Thread Markus Jelsma
Hi - Use the domain-urlfilter for host, domain and TLD filtering. Also, please ask questions on the Nutch list; you're on the Solr list now :) -Original message- > From: Reyes, Mark > Sent: Friday 1st November 2013 17:24 > To: solr-user@lucene.apache.org > Subject: Exclude urls without 'www' fr
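
To illustrate the host-level filtering suggested in this thread, here is a minimal Python sketch of the idea. It is not Nutch's domain-urlfilter and does not use any Nutch API; the function name `accept_url` and the allow-list `ALLOWED_HOSTS` are hypothetical, and the host name is simply the example from the thread. It treats www.mywebsite.com and mywebsite.com as distinct hosts and keeps only the former.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list for illustration; the host comes from the thread's example.
ALLOWED_HOSTS = {"www.mywebsite.com"}


def accept_url(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only when the URL's host exactly matches an allowed host."""
    # urlsplit(...).hostname is lowercased and excludes any port;
    # it is None when the URL has no scheme/netloc part.
    host = urlsplit(url).hostname
    return host is not None and host in allowed_hosts


urls = [
    "http://www.mywebsite.com/page",
    "http://mywebsite.com/page",            # no 'www', so it is rejected
    "http://WWW.mywebsite.com:8080/other",  # case and port do not affect the match
]
kept = [u for u in urls if accept_url(u)]
```

Matching on the exact host rather than on a substring is deliberate here: a substring test for "mywebsite.com" would accept both variants, which is what the original question wanted to avoid.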