The uniqueKey is enforced only within a single shard/index, so the same key
can exist in more than one shard.
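
A minimal sketch of what that means, assuming a toy model in which each shard is just a dict keyed by uniqueKey. The `Shard` class and its `add` method are hypothetical illustrations, not Solr APIs. Re-adding an id to the same shard replaces the earlier document, but nothing stops the same id from being indexed into a second shard:

```python
class Shard:
    """Toy stand-in for one index; uniqueness is checked locally only."""

    def __init__(self):
        self.docs = {}  # uniqueKey value -> document

    def add(self, doc):
        # Within one shard, a document with an existing id replaces the old one.
        self.docs[doc["id"]] = doc


shard_a, shard_b = Shard(), Shard()
shard_a.add({"id": "42", "title": "first"})
shard_a.add({"id": "42", "title": "second"})      # overwrites within shard_a
shard_b.add({"id": "42", "title": "stray copy"})  # shard_b never consults shard_a

# Each shard holds exactly one doc with id "42", so two copies exist overall.
total_docs = len(shard_a.docs) + len(shard_b.docs)
```

In this model it is up to whatever routes documents to shards to send a given id to the same shard every time.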

On Fri, May 24, 2013 at 6:39 PM, Valery Giner <valgi...@research.att.com> wrote:

> Shawn,
>
> How is it possible for more than one document with the same unique key to
> appear in the index, even in different shards?
> Isn't it a bug by definition?
> What am I missing here?
>
> Thanks,
> Val
>
>
> On 05/23/2013 09:55 AM, Shawn Heisey wrote:
>
>> On 5/23/2013 1:51 AM, Luis Cappa Banda wrote:
>>
>>> I've queried each Solr shard server one by one and the total number of
>>> documents is correct. However, when I change the rows parameter from 10
>>> to 100, the total numFound of documents changes:
>>>
>> I've seen this problem on the list before, and each time the cause has
>> been determined to be documents with the same uniqueKey value appearing
>> in more than one shard.
>>
>> What I think happens here:
>>
>> With rows=10, you get the top ten docs from each of the three shards,
>> and each shard sends its numFound for that query to the core that's
>> coordinating the search.  The coordinator adds up numFound, looks
>> through those thirty docs, and arranges them according to the requested
>> sort order, returning only the top 10.  In this case, there happen to be
>> no duplicates.
>>
>> With rows=100, you get a total of 300 docs.  This time, duplicates are
>> found and removed by the coordinator.  I think that the coordinator
>> adjusts the total numFound by the number of duplicate documents it
>> removed, in an attempt to be more accurate.
>>
>> I don't know if adjusting numFound when duplicates are found in a
>> sharded query is the right thing to do; I'll leave that for smarter
>> people.  Perhaps Solr should return a message with the results saying
>> that duplicates were found, and if a config option is not enabled, the
>> server should throw an exception and return a 4xx HTTP error code.  One
>> idea for a config parameter name would be allowShardDuplicates, but
>> something better can probably be found.
>>
>> Thanks,
>> Shawn
>>
>>
>
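
The merge behaviour Shawn describes above can be sketched as a small simulation. This is only a model of his hypothesis, not Solr's actual code: the `distributed_search` function, the `score` field, and the three sample shards are all made up for illustration, and the numFound adjustment is the assumed step.

```python
def distributed_search(shards, rows):
    """Merge per-shard results the way the quoted explanation hypothesizes."""
    num_found = 0
    candidates = []
    for shard in shards:
        # Each shard reports its full hit count but sends only its top `rows` docs.
        num_found += len(shard)
        candidates.extend(sorted(shard, key=lambda d: -d["score"])[:rows])

    seen, merged = set(), []
    for doc in sorted(candidates, key=lambda d: -d["score"]):
        if doc["id"] in seen:
            num_found -= 1  # assumed adjustment for each duplicate removed
            continue
        seen.add(doc["id"])
        merged.append(doc)
    return num_found, merged[:rows]


# Three hypothetical shards; id "dup" lives in two of them with a low score.
shard1 = [{"id": f"a{i}", "score": 100 - i} for i in range(20)] + [{"id": "dup", "score": 1}]
shard2 = [{"id": f"b{i}", "score": 100 - i} for i in range(20)] + [{"id": "dup", "score": 1}]
shard3 = [{"id": f"c{i}", "score": 100 - i} for i in range(20)]
shards = [shard1, shard2, shard3]

small_count, _ = distributed_search(shards, rows=10)   # "dup" never reaches the coordinator
large_count, _ = distributed_search(shards, rows=100)  # both copies of "dup" arrive
```

With these made-up shards, rows=10 reports 62 and rows=100 reports 61: the duplicate is only noticed once rows is large enough for both copies to be sent to the coordinator, which matches the symptom in the original report.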


-- 
Regards,
Shalin Shekhar Mangar.
