: You probably have duplicates (docs on different shards with the same id).
: Deeper paging will detect more of them.
: It does raise the question of whether we should be changing numFound, or
: indicating a separate duplicate count.  Duplicates aren't eliminated

random thought (from someone who's never really considered distributed 
searching in much depth) ...

why do we bother detecting/removing the duplicates?

strictly speaking docs with duplicate IDs on multiple shards is a "garbage 
in" situation, i can understand Solr taking a little extra effort to 
not fail hard if this situation is encountered, but why update the 
numFound at all, or remove the duplicates from the list? ... why not leave 
them in as is?  (then numFound would never change)
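
to make the effect concrete, here is a toy model of the behaviour described in the quoted text. it is only a sketch and not Solr's actual merge code: the function name merge_page, the shard contents, and the scores are all made up for illustration. it assumes the coordinator sums the per-shard hit counts, looks only at the top (start + rows) ids from each shard, and subtracts one from numFound for every duplicate id it happens to see in that window.

```python
def merge_page(shards, start, rows):
    """Toy merge of per-shard results (hypothetical, not Solr's code).

    Each shard is a list of (doc_id, score) sorted by score, descending.
    Returns (num_found, page_of_docs).
    """
    # every shard reports its own hit count; the coordinator sums them
    num_found = sum(len(docs) for docs in shards)
    # only the top (start + rows) entries of each shard are inspected
    window = start + rows
    seen = set()
    merged = []
    for docs in shards:
        for doc_id, score in docs[:window]:
            if doc_id in seen:
                # duplicate id detected: drop it and adjust the count
                num_found -= 1
                continue
            seen.add(doc_id)
            merged.append((doc_id, score))
    merged.sort(key=lambda d: -d[1])
    return num_found, merged[start:start + rows]


# two made-up shards that share the ids "dup1" and "dup2"
shard_a = [("a1", 0.9), ("dup1", 0.8), ("a2", 0.5), ("dup2", 0.2)]
shard_b = [("b1", 0.95), ("dup1", 0.7), ("b2", 0.4), ("dup2", 0.1)]
shards = [shard_a, shard_b]

page1_found, page1 = merge_page(shards, start=0, rows=2)
page2_found, page2 = merge_page(shards, start=2, rows=2)
print(page1_found, page2_found)
```

in this toy setup the first page only sees "dup1" twice, while the second page also sees "dup2" twice, so the reported count differs between the two requests. the raw sum of the per-shard counts is the same for both, which is the "leave them in as is" alternative.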


-Hoss
