Hi,
I am building a custom UpdateRequestProcessor to intercept every doc on its way
into the index. What I want to do is check whether the current index already
has a doc with the same title (I am using IDs as the unique key, so I can't
rely on that for titles, and besides, the checking logic is a little more
complicated). If the incoming doc has a duplicate and some other conditions
hold, then one of two things can happen:
1- we don't index the incoming document
2- we index the incoming document and delete the duplicate currently in the index
I think (1) can be done by simply not passing the call up the chain (not
calling super.processAdd(cmd)). However, I don't know how to implement the
second case, deleting the duplicate document, from inside a custom
UpdateRequestProcessor. This thread is the closest to my goal:
http://lucene.472066.n3.nabble.com/SOLR-4-3-0-Migration-How-to-use-DeleteUpdateCommand-td4062454.html
However, it is not clear to me how to proceed. Code snippets below.
Thank you in advance for your help.
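import java.io.IOException;
import org.apache.lucene.index.Term;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.search.SolrIndexSearcher;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
// ParseException: import whichever class compareDocs actually throws

class isDuplicate extends UpdateRequestProcessor
{
    public isDuplicate(UpdateRequestProcessor next) {
        super(next);
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException
    {
        try
        {
            // Despite the method name, true means "index the incoming doc".
            boolean indexIncomingDoc = checkIfIsDuplicate(cmd);
            if (indexIncomingDoc)
                super.processAdd(cmd);  // pass the add on to the next processor
            // otherwise the add stops here and the doc is not indexed
        } catch (SolrServerException e) { e.printStackTrace(); }
        catch (ParseException e) { e.printStackTrace(); }
    }

    public boolean checkIfIsDuplicate(AddUpdateCommand cmd) ...{
        SolrInputDocument incomingDoc = cmd.getSolrInputDocument();
        if (incomingDoc == null) return false;

        String title = (String) incomingDoc.getFieldValue("title");
        SolrIndexSearcher searcher = cmd.getReq().getSearcher();
        boolean addIncomingDoc = true;

        // getFirstMatch returns the internal Lucene doc number of the first
        // doc containing the term, or -1 if there is none.
        int idOfDuplicate = searcher.getFirstMatch(new Term("title", title));
        if (idOfDuplicate != -1)
        {
            addIncomingDoc =
                compareDocs(searcher, incomingDoc, idOfDuplicate, title, addIncomingDoc);
        }
        return addIncomingDoc;
    }

    private boolean compareDocs(.....){
        ....
        if( condition 1 )
        {
            // --> DELETE DUPLICATE DOC in INDEX <--
            // (this is the step I don't know how to implement)
            addIncomingDoc = true;
        }
        ....
        return addIncomingDoc;
    }
}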
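To make the two outcomes concrete, here is a minimal stand-alone sketch of the behaviour I am after. It deliberately does not use Solr's classes: Processor, IndexWriterStub, DedupProcessor and the preferIncoming flag are all hypothetical stand-ins, and the in-memory map only models the index. My assumption is that in Solr the delete step would correspond to building a DeleteUpdateCommand and sending it down the chain with processDelete, as the linked thread suggests, but I have not verified that.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model of an update chain; none of these types are Solr classes.
class DedupChainSketch {

    // Hypothetical stand-in for one link in the processor chain.
    static class Processor {
        final Processor next;
        Processor(Processor next) { this.next = next; }
        void processAdd(String id, String title) {
            if (next != null) next.processAdd(id, title);
        }
        void processDelete(String id) {
            if (next != null) next.processDelete(id);
        }
    }

    // Last link: applies adds and deletes to an in-memory "index" (id -> title).
    static class IndexWriterStub extends Processor {
        final Map<String, String> index = new LinkedHashMap<>();
        IndexWriterStub() { super(null); }
        @Override void processAdd(String id, String title) { index.put(id, title); }
        @Override void processDelete(String id) { index.remove(id); }
    }

    // The dedup step: either drop the incoming doc, or delete the existing
    // duplicate and then let the add continue down the chain.
    static class DedupProcessor extends Processor {
        final Map<String, String> view;  // hypothetical read access to the index
        final boolean preferIncoming;    // placeholder for the real comparison logic

        DedupProcessor(Processor next, Map<String, String> view, boolean preferIncoming) {
            super(next);
            this.view = view;
            this.preferIncoming = preferIncoming;
        }

        @Override void processAdd(String id, String title) {
            // Look for an existing doc with the same title but a different id.
            String duplicateId = null;
            for (Map.Entry<String, String> e : view.entrySet()) {
                if (e.getValue().equals(title) && !e.getKey().equals(id)) {
                    duplicateId = e.getKey();
                    break;
                }
            }
            if (duplicateId != null) {
                if (!preferIncoming) {
                    return;                        // outcome 1: skip the incoming doc
                }
                super.processDelete(duplicateId);  // outcome 2: remove the old doc first
            }
            super.processAdd(id, title);           // continue the chain
        }
    }
}
```

The point of the sketch is that the delete travels through the same chain as the add, so whatever sits after the dedup step sees both commands in order.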