: A quick check did show me a couple of duplicates, but if I understand
: correctly, even if two different processes send the same document, the last
: one should update the previous. If I send the same document 10 times, in
: the end, it should only be in my index once, no?

it should, yes ... i didn't say i could explain your problem, i'm just 
trying to speculate about things that might give us insight into figuring 
out if/where a bug exists.

the only thing i can possibly think of that would cause a situation like 
this (where the number of documents decreases w/o any deletes happening) 
is if some of the "add" commands use overwrite="false" and some use 
overwrite="true" ... in that 
situation, you might get 10 docs added with the same uniqueKey 
value using overwrite="false" and so you'll have 10 docs in your index.  
then you might index one more doc with the same uniqueKey value, but this 
time using overwrite="true" and that one document will overwrite all 10 of 
the previous documents, causing your doc count to decrease from 10 to 1.
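to make that scenario concrete, here is a rough sketch in python. it is a toy in-memory model, not Solr code: the ToyIndex class, its add() method and the "id" / "rev" field names are all made up for illustration, and the only behavior it assumes is the one described above (overwrite="true" replaces every existing doc with the same uniqueKey value, overwrite="false" just appends).

```python
class ToyIndex:
    """Toy model of an index with a uniqueKey field (hypothetical, not Solr)."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = []

    def add(self, doc, overwrite=True):
        if overwrite:
            # Drop every existing doc that shares this uniqueKey value.
            key = doc[self.unique_key]
            self.docs = [d for d in self.docs if d[self.unique_key] != key]
        # With overwrite=False the doc is appended with no duplicate check.
        self.docs.append(doc)

    def num_docs(self):
        return len(self.docs)


index = ToyIndex()

# Ten adds of the same uniqueKey value with overwrite disabled.
for rev in range(10):
    index.add({"id": "doc-1", "rev": rev}, overwrite=False)
count_before = index.num_docs()

# One more add of the same uniqueKey value, this time with overwrite enabled.
index.add({"id": "doc-1", "rev": 10}, overwrite=True)
count_after = index.num_docs()

print(count_before, count_after)
```

in this model the doc count drops from 10 to 1 even though no delete was ever sent, which is the kind of decrease i'm describing.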

But nothing in your description of how you are using Solr implies that 
you were doing this, hence my question of what exactly your indexing code 
looks like.

My best guess is that maybe the deduplication UpdateProcessors have a bug 
in them, but w/o a reproducible test case demonstrating the problem it 
will be nearly impossible to even know where (or if that's actually the 
problem at all).



-Hoss
