Jack, thanks for the response. So, in your opinion, adding something as
simple as the following to the processAdd() function should do the trick?

    var this_title = doc.getFieldValue("title");
    if (this_title == "Slide 1") {
        return false;
    }

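To make the question concrete, here is a rough sketch of how that check might sit in a complete update script. It assumes the script runs under StatelessScriptUpdateProcessorFactory, that cmd.solrDoc is the SolrInputDocument, and that returning false from processAdd() drops the document. JUNK_TITLES, isJunkTitle() and fakeCmd() are hypothetical names used only for illustration; fakeCmd() is a stand-in so the logic can be tried outside Solr.

```javascript
// Hypothetical list of placeholder titles to treat as junk (an assumption).
var JUNK_TITLES = ["Slide 1"];

// Returns true when the title is missing, blank, or a known junk value.
function isJunkTitle(title) {
  if (title === null || title === undefined) {
    return true;
  }
  // String() also converts a Java string handed over by the script engine.
  var text = String(title).replace(/^\s+|\s+$/g, "");
  return text.length === 0 || JUNK_TITLES.indexOf(text) !== -1;
}

// Sketch of the script hook. Returning false is assumed to tell Solr
// to skip the document; returning true lets it continue down the chain.
function processAdd(cmd) {
  var doc = cmd.solrDoc; // assumed to be the SolrInputDocument
  if (isJunkTitle(doc.getFieldValue("title"))) {
    return false;
  }
  return true;
}

// Stand-in for Solr's add command, only for trying the logic outside Solr.
function fakeCmd(title) {
  return {
    solrDoc: {
      getFieldValue: function (name) {
        return name === "title" ? title : null;
      }
    }
  };
}
```

With the stand-in, processAdd(fakeCmd("Slide 1")) and processAdd(fakeCmd("")) both return false, while a document with a real title passes through.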
Regards,
ADS
--
I'm using Nutch 1.6 to retrieve metadata from crawled documents (e.g. .doc,
.ppt, .pdf, etc.) for indexing by Solr 4.0. Several of the crawled files
have no value or a junk value for certain metatags. Is there a way to force
Solr to skip indexing of documents where, say, metatag.title is empty or
me