You could have a StatelessScriptUpdateProcessorFactory script detect the file type and then return false, which aborts the update for that document.
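
As a rough sketch of that idea applied to the title case from the question, the script below skips a document when its title is missing, blank, or a known junk value. It assumes the Nutch metatag ends up in a Solr field literally named "metatag.title"; the file name skip-junk-titles.js, the JUNK_TITLES list, and the isJunkTitle helper are made up for illustration. The script would still need to be registered with StatelessScriptUpdateProcessorFactory in an update chain in solrconfig.xml.

```javascript
// skip-junk-titles.js (hypothetical file name)
// Sketch of a script for StatelessScriptUpdateProcessorFactory.
// Assumption: the title arrives in a Solr field named "metatag.title".

// Titles treated as junk; extend as needed (illustrative list).
var JUNK_TITLES = ["Slide 1"];

// Hypothetical helper: true when the title is missing, blank, or junk.
function isJunkTitle(title) {
  if (title === null || title === undefined) {
    return true;
  }
  // String(...) also converts a Java String handed over by the script engine.
  var text = String(title).replace(/^\s+|\s+$/g, "");
  return text.length === 0 || JUNK_TITLES.indexOf(text) !== -1;
}

function processAdd(cmd) {
  var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  if (isJunkTitle(doc.getFieldValue("metatag.title"))) {
    return false; // stop processing: this document is not indexed
  }
  return true; // pass the document on to the rest of the chain
}

// The remaining callbacks are defined as no-ops.
function processDelete(cmd) { }
function processMergeIndexes(cmd) { }
function processCommit(cmd) { }
function processRollback(cmd) { }
function finish() { }
```

The same pattern should work for a file-type check: read whichever field carries the content type and return false for the types you want to skip.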

I'll be sure to add such an example to the next early access release of my book!

-- Jack Krupansky

-----Original Message----- From: stone2dbone
Sent: Thursday, August 01, 2013 2:17 PM
To: solr-user@lucene.apache.org
Subject: Skip Indexing Certain Files on Purpose

I'm using Nutch 1.6 to retrieve metadata from crawled documents (e.g. .doc,
.ppt, .pdf, etc.) for indexing by Solr 4.0. Several of the crawled files
have no value or a junk value for certain metatags. Is there a way to force
Solr to skip indexing of documents where, say metatag.title is empty or
metatag.title is 'Slide 1'?



--
View this message in context: http://lucene.472066.n3.nabble.com/Skip-Indexing-Certain-Files-on-Purpose-tp4082026.html
Sent from the Solr - User mailing list archive at Nabble.com.
