I'm using Nutch 1.6 to retrieve metadata from crawled documents (e.g. .doc, .ppt, .pdf) for indexing by Solr 4.0. Several of the crawled files have either no value or a junk value for certain metatags. Is there a way to force Solr to skip indexing a document when, say, metatag.title is empty or is 'Slide 1'?
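
To make the condition concrete, here is a rough sketch in plain Java of the check I have in mind. The class and method names (TitleFilter, shouldSkip) are made up for illustration and are not part of Nutch or Solr; I also assume a whitespace-only title counts as empty. Where this check would actually be hooked in is exactly what I am asking about.

```java
public final class TitleFilter {
    // "Slide 1" is the junk title from my data; other junk values would be added here.
    private static final String JUNK_TITLE = "Slide 1";

    private TitleFilter() {}

    /** Returns true when a document with this metatag.title value should not be indexed. */
    public static boolean shouldSkip(String title) {
        if (title == null) {
            return true; // metatag missing entirely
        }
        String trimmed = title.trim();
        // Assumption: whitespace-only titles are treated the same as empty ones.
        return trimmed.isEmpty() || JUNK_TITLE.equals(trimmed);
    }
}
```

So a document titled "Quarterly Report" would be indexed, while one with a missing, blank, or 'Slide 1' title would be dropped.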