Re: Nutch related issue: URL Ignore

2011-08-15 Thread Markus Jelsma
The Solr list is not the appropriate list for this question. Please try the Nutch user mailing list.

> hi
>
> i am using nutch 1.2. in my crawl-urlfilter.txt, i am specifying URLs to be
> skipped. i am giving some patterns that need to be skipped but it is not
> working
>
> e.g.
>
> -^http://([a-z0-9]*\.

Nutch related issue: URL Ignore

2011-08-12 Thread Pawan Darira
Hi,

I am using Nutch 1.2. In my crawl-urlfilter.txt I specify URL patterns that should be skipped, but the skipping is not working, e.g.:

-^http://([a-z0-9]*\.)*domain.com
+^http://([a-z0-9]*\.)*domain.com/([0-9-a-z])*.html
-^http://([a-z0-9]*\.)*domain.com/([a-z/])*
-
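
For readers of the archive, here is a minimal sketch of how an ordered +/- rule list like the one quoted in the message above can be evaluated. It assumes first-match-wins semantics (the first pattern found in the URL decides, and a URL matching no rule is rejected); whether Nutch 1.2 behaves exactly this way should be checked against its documentation. The function name `filter_url`, the `REORDERED` list, and the test URLs are hypothetical, and Python's `re` stands in for Java's regex engine, which may differ in details.

```python
import re

# The three complete rules from the message, in the order they were posted.
# The final truncated "-" entry is omitted because its pattern is unknown.
RULES = [
    ("-", r"^http://([a-z0-9]*\.)*domain.com"),
    ("+", r"^http://([a-z0-9]*\.)*domain.com/([0-9-a-z])*.html"),
    ("-", r"^http://([a-z0-9]*\.)*domain.com/([a-z/])*"),
]

# Hypothetical reordering: the more specific "+" rule comes first and the
# broad "-" rule for the whole domain comes last.
REORDERED = [RULES[1], RULES[2], RULES[0]]


def filter_url(url, rules):
    """Return True if the URL is accepted, False if it is rejected.

    Assumption: rules are tried top to bottom, the first pattern found
    in the URL decides, and a URL that matches no rule is rejected.
    """
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False


if __name__ == "__main__":
    for url in ("http://www.domain.com/page1.html",
                "http://www.domain.com/about/"):
        print(url, filter_url(url, RULES), filter_url(url, REORDERED))
```

Under these assumptions the first rule already matches every URL on domain.com, so the later `+` rule is never consulted. Placing the most specific rule first changes the outcome for the `.html` page. Note also that the unescaped `.` in `domain.com` and `.html` matches any character, not only a literal dot.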