On Tue, Jan 27, 2009 at 5:03 AM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> : Hi, I added some code to *DirectUpdateHandler2.java's doDeletions()*
> : (solr 1.2.0), and got the solution I wanted (logging duplicate post
> : entries, i.e. the old field and new field of the duplicate post).
> :
> :
> :     Document d1=searcher.doc(prev);         //existing doc to be deleted
> :     Document d2=searcher.doc(tdocs.doc());  //new doc
> :     String oldname=d1.get("name");
> :     String id1=d1.get("id");
> :     String newname=d2.get("name");
> :     String id2=d2.get("id");
> :     out3.write(id1+","+oldname+","+newname+"\n");
> :
> : But I don't know whether the performance of Solr will be affected by this.
> : Any comment on the performance issue for the above solution is welcome...
>
> it's probably going to be painfully slow -- you're probably going to be a
> lot better off avoiding the use of searcher.doc and instead stick with
> using the FieldCache, but there are trade offs there as well, it's largely
> going to depend on how often you're doing adds vs. commits.
>
> BTW: as i mentioned before, it probably makes more sense to implement this
> in an UpdateProcessor instead of hacking DirectUpdateHandler2 ... that way
> you'll be able to upgrade Solr without worrying about losing/redoing
> your changes.
>
>
> -Hoss

Thanks a lot, Chris Hostetter.

I realize I should move this to an UpdateProcessor for best performance, but
I am new to Solr (I started working with it a few months ago) and I found
modifying DirectUpdateHandler2 a bit easier. Since finding duplicate posts is
important for us right now, I made the above modification to
DirectUpdateHandler2.

Note: for your information, we are committing every 1000 posts.

--
Yours,
S.Selvam
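
For anyone following the thread, here is a minimal standalone sketch of the duplicate-logging idea, written in plain Java with no Solr classes at all. It keeps the last "name" seen for each "id" in memory instead of loading stored documents with searcher.doc, and records one "id,oldname,newname" line whenever an id is added again. The class name DuplicatePostLogger and its methods are hypothetical, made up for illustration; the same check could in principle live in a custom update processor, but that wiring is left out because it depends on the Solr version.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of Solr: remembers the last "name" seen for
// each "id" and records a CSV line every time an id is added a second time.
class DuplicatePostLogger {
    // id -> most recently added name. Assumption: this in-memory map stands in
    // for whatever lookup (index, cache) a real deployment would use.
    private final Map<String, String> nameById = new HashMap<String, String>();

    // One "id,oldname,newname" entry per detected duplicate, in arrival order.
    private final List<String> logLines = new ArrayList<String>();

    /** Registers an add; returns true if the id had already been seen. */
    boolean onAdd(String id, String newName) {
        boolean duplicate = nameById.containsKey(id);
        // put() returns the previous value for this id (the "old" name).
        String oldName = nameById.put(id, newName);
        if (duplicate) {
            logLines.add(id + "," + oldName + "," + newName);
        }
        return duplicate;
    }

    /** Returns the duplicate entries recorded so far. */
    List<String> logLines() {
        return logLines;
    }
}
```

The trade-off is memory: the map grows with the number of distinct ids, so whether this is cheaper than the searcher.doc approach depends on the size of the index and on how often adds happen relative to commits.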