Hi

I am building a custom UpdateRequestProcessor to intercept every doc heading to 
the index. Basically, what I want to do is check whether the current index already 
has a doc with the same title (I am using IDs as the unique key, so I can't rely 
on that, and besides, the checking logic is a little more complicated). If the 
incoming doc has a duplicate and some other conditions hold, then one of two 
things can happen:

        1- we don't index the incoming document
        2- we index the incoming and delete the duplicate currently in the index

I think (1) can be done by simply not passing the call along the chain (not 
calling super.processAdd(cmd)). However, I don't know how to implement the 
second case, deleting the duplicate document, inside a custom 
UpdateRequestProcessor. This thread is the closest to my goal:
http://lucene.472066.n3.nabble.com/SOLR-4-3-0-Migration-How-to-use-DeleteUpdateCommand-td4062454.html

However, I am not clear on how to proceed. Code snippets below.

Thank you in advance for your help.

        class isDuplicate extends UpdateRequestProcessor
        {
                public isDuplicate(UpdateRequestProcessor next) {
                        super(next);
                }

                @Override
                public void processAdd(AddUpdateCommand cmd) throws IOException
                {
                        try
                        {
                                // true means the incoming doc should be indexed
                                boolean indexIncomingDoc = checkIfIsDuplicate(cmd);
                                if (indexIncomingDoc)
                                        super.processAdd(cmd);
                        } catch (SolrServerException e) {
                                e.printStackTrace();
                        } catch (ParseException e) {
                                e.printStackTrace();
                        }
                }

                public boolean checkIfIsDuplicate(AddUpdateCommand cmd) ...{
                        SolrInputDocument incomingDoc = cmd.getSolrInputDocument();
                        if (incomingDoc == null) return false;
                        String title = (String) incomingDoc.getFieldValue("title");
                        SolrIndexSearcher searcher = cmd.getReq().getSearcher();
                        boolean addIncomingDoc = true;
                        // getFirstMatch returns an internal doc number, or -1 if nothing matches
                        int idOfDuplicate = searcher.getFirstMatch(new Term("title", title));
                        if (idOfDuplicate != -1)
                        {
                                addIncomingDoc = compareDocs(searcher, incomingDoc, idOfDuplicate, title, addIncomingDoc);
                        }
                        return addIncomingDoc;
                }

                private boolean compareDocs(.....){
                        ....
                        if( condition 1 )
                        {
                                --> DELETE DUPLICATE DOC in INDEX <--
                                addIncomingDoc = true;
                        }
                        ....
                        return addIncomingDoc;
                }
        }
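
To make the two outcomes concrete, here is a minimal, self-contained sketch of the control flow I have in mind. It deliberately does not use Solr's classes: Processor, ToyIndex, DedupProcessor and the incomingWins flag are all hypothetical stand-ins, and incomingWins is only a placeholder for "condition 1". In the real processor I assume the delete step would instead build a DeleteUpdateCommand (as discussed in the thread linked above) and hand it to processDelete, but that part is untested.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupChainSketch {

    /** Hypothetical stand-in for a chain link; NOT Solr's UpdateRequestProcessor. */
    static class Processor {
        private final Processor next;
        Processor(Processor next) { this.next = next; }
        void processAdd(String id, String title) {
            if (next != null) next.processAdd(id, title);
        }
        void processDelete(String id) {
            if (next != null) next.processDelete(id);
        }
    }

    /** End of the chain: a toy in-memory "index" mapping id -> title. */
    static class ToyIndex extends Processor {
        final Map<String, String> docs = new LinkedHashMap<>();
        ToyIndex() { super(null); }
        @Override void processAdd(String id, String title) { docs.put(id, title); }
        @Override void processDelete(String id) { docs.remove(id); }
    }

    /** Intercepts adds and picks one of the two outcomes on a title clash. */
    static class DedupProcessor extends Processor {
        private final ToyIndex index;
        private final boolean incomingWins; // placeholder for "condition 1"

        DedupProcessor(ToyIndex index, boolean incomingWins) {
            super(index);
            this.index = index;
            this.incomingWins = incomingWins;
        }

        @Override void processAdd(String id, String title) {
            String dupId = findIdByTitle(title);
            if (dupId != null && !dupId.equals(id)) {
                if (!incomingWins) return;   // outcome 1: drop the incoming doc
                super.processDelete(dupId);  // outcome 2: delete the duplicate first
            }
            super.processAdd(id, title);     // then pass the add along the chain
        }

        /** Linear scan of the toy index; returns the id with this title, or null. */
        private String findIdByTitle(String title) {
            for (Map.Entry<String, String> e : index.docs.entrySet()) {
                if (e.getValue().equals(title)) return e.getKey();
            }
            return null;
        }
    }

    /** Seeds the index with doc "a", sends doc "b" with the same title, returns the ids left. */
    static List<String> run(boolean incomingWins) {
        ToyIndex index = new ToyIndex();
        index.processAdd("a", "Solr Guide");
        new DedupProcessor(index, incomingWins).processAdd("b", "Solr Guide");
        return new ArrayList<>(index.docs.keySet());
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [a]
        System.out.println(run(true));  // prints [b]
    }
}
```

The point of the sketch is only the ordering: the delete for the existing doc goes down the same chain before the add for the incoming doc, so both commands reach the end of the chain the same way.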
