Hey there,

I have posted about my situation before, but I think my explanation
was a bit confusing...
I am using DataImportHandler with delta-import and it's working perfectly. I
have also coded my own SqlEntityProcessor to delete expired rows from the
index and the database.

Now I need to do duplicate control at indexing time. In my old Lucene core
I wrote my own duplicate control, but it was very slow because it worked by
comparing strings... I have been investigating Solr deduplication
(http://wiki.apache.org/solr/Deduplication) and it looks promising because it
works with hashes instead of strings.

I have learned how to use deduplication using the /update requestHandler as
the wiki says:
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" >
    <lst name="defaults">
      <str name="update.processor">dedupe</str>
    </lst>
  </requestHandler>
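
For context, the "dedupe" chain that this default refers to is defined in
solrconfig.xml roughly as below. This is only a sketch following the wiki
page; the field names in signatureField and fields (signature, name,
description) are placeholders, not my real schema.

```xml
<!-- Sketch of the chain named "dedupe"; field names are placeholders -->
<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- field that stores the computed hash (must exist in schema.xml) -->
    <str name="signatureField">signature</str>
    <!-- true: a new document replaces older ones with the same hash -->
    <bool name="overwriteDupes">true</bool>
    <!-- fields the hash is computed from -->
    <str name="fields">name,description</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```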

But the thing is that I want to use it with the /dataimport requestHandler
(the one used by DataImportHandler). I don't know whether there is an XML
configuration that adds deduplication to DataImportHandler or whether I
should code a plugin... and in that case, I don't know exactly where.
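
To make the question concrete, this is the kind of configuration I was hoping
would work: the same update.processor default, but set on the /dataimport
handler. I have not confirmed that DataImportHandler reads this parameter at
all, so treat it as a guess; data-config.xml is just the usual config file
name.

```xml
<!-- Untested guess: same default as on /update, applied to /dataimport -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <!-- assumption: the handler honors this parameter like /update does -->
    <str name="update.processor">dedupe</str>
  </lst>
</requestHandler>
```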

I hope my explanation is clearer now...
Thanks in advance!
