You could have a StatelessScriptUpdateProcessor script detect the file type and
then return false, which aborts the update for that document.
I'll be sure to add such an example to the next early access release of my
book!
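
In the meantime, here is a minimal sketch of what such a script might look
like for the metatag.title case from the question. It assumes the script is
run by StatelessScriptUpdateProcessorFactory with the default JavaScript
engine, and that returning false from processAdd stops further processing of
that add command. The JUNK_TITLES list and the isJunkTitle helper are
hypothetical names introduced for illustration; only 'Slide 1' comes from the
original question.

```javascript
// Sketch only: JUNK_TITLES and isJunkTitle are illustrative names, not Solr API.
var JUNK_TITLES = ["Slide 1"];

function isJunkTitle(value) {
  // getFieldValue returns null when the field is absent from the document.
  var text = (value === null || value === undefined)
    ? ""
    : String(value).replace(/^\s+|\s+$/g, ""); // strip surrounding whitespace
  return text === "" || JUNK_TITLES.indexOf(text) !== -1;
}

function processAdd(cmd) {
  var doc = cmd.solrDoc; // the SolrInputDocument being added
  if (isJunkTitle(doc.getFieldValue("metatag.title"))) {
    return false; // stop processing this add, so the document is skipped
  }
  // Falling through (no false return) lets the rest of the chain run.
}

// The other hooks do nothing in this sketch.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

The script would be referenced from an update request processor chain in
solrconfig.xml, placed ahead of the processor that actually writes the
document, so that a false return keeps the document out of the index.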
-- Jack Krupansky
-----Original Message-----
From: stone2dbone
Sent: Thursday, August 01, 2013 2:17 PM
To: solr-user@lucene.apache.org
Subject: Skip Indexing Certain Files on Purpose
I'm using Nutch 1.6 to retrieve metadata from crawled documents (e.g. .doc,
.ppt, .pdf, etc.) for indexing by Solr 4.0. Several of the crawled files
have no value or a junk value for certain metatags. Is there a way to force
Solr to skip indexing of documents where, say, metatag.title is empty or
metatag.title is 'Slide 1'?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Skip-Indexing-Certain-Files-on-Purpose-tp4082026.html