I'm using Nutch 1.6 to retrieve metadata from crawled documents (e.g. .doc,
.ppt, .pdf) for indexing by Solr 4.0. Several of the crawled files have no
value or a junk value for certain metatags. Is there a way to force Solr to
skip indexing documents where, say, metatag.title is empty or is
'Slide 1'?
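
To make the rule concrete, here is a rough sketch of the check I have in mind,
written as plain Java. The class name TitleFilter, the method shouldSkip and
the JUNK_TITLES set are made up for illustration; none of this is Nutch or
Solr API, and I'm assuming the title arrives as an ordinary String.

```java
import java.util.Collections;
import java.util.Set;

public class TitleFilter {
    // Hypothetical list of junk titles; "Slide 1" is the one seen in my crawl.
    private static final Set<String> JUNK_TITLES = Collections.singleton("Slide 1");

    /** Returns true if a document with this metatag.title should not be indexed. */
    public static boolean shouldSkip(String title) {
        if (title == null) {
            return true; // no value at all
        }
        String trimmed = title.trim();
        // Skip empty or whitespace-only titles, and known junk values.
        return trimmed.isEmpty() || JUNK_TITLES.contains(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(shouldSkip(""));
        System.out.println(shouldSkip("Slide 1"));
        System.out.println(shouldSkip("Quarterly Report"));
    }
}
```

What I don't know is where a check like this would have to live (on the Nutch
side before the document is sent, or on the Solr side at update time).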



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Skip-Indexing-Certain-Files-on-Purpose-tp4082026.html
Sent from the Solr - User mailing list archive at Nabble.com.
