The current distributed search design assumes that all document ids
are unique across the set of cores. If you have duplicates, you're on
your own.
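To illustrate the behavior discussed below (numFound shrinking as you page deeper), here is a minimal, hypothetical simulation — not Solr's actual merge code — assuming a score-sorted merge where each shard returns its top start+rows docs and duplicate ids are only detected when both copies fall inside the fetched window:

```python
# Hypothetical sketch: duplicate ids across shards are only noticed
# when both copies are fetched, so the reported numFound depends on
# how deep you page. Names and data are illustrative, not Solr's.

def distributed_query(shards, start, rows):
    fetch = start + rows
    candidates = []
    for shard in shards:
        # Each shard contributes its top (start + rows) docs by score.
        ranked = sorted(shard, key=lambda d: -d[1])[:fetch]
        candidates.extend(ranked)
    candidates.sort(key=lambda d: -d[1])
    seen, merged, dupes = set(), [], 0
    for doc_id, score in candidates:
        if doc_id in seen:
            dupes += 1  # duplicate detected only because it was fetched
            continue
        seen.add(doc_id)
        merged.append((doc_id, score))
    # numFound is adjusted only by the duplicates seen so far.
    num_found = sum(len(s) for s in shards) - dupes
    return merged[start:start + rows], num_found

shard_a = [(i, 100 - i) for i in range(100)]      # ids 0..99
shard_b = [(i, 90 - i) for i in range(50, 150)]   # ids 50..149; 50..99 overlap

page1, nf1 = distributed_query([shard_a, shard_b], start=0, rows=20)
deep, nf2 = distributed_query([shard_a, shard_b], start=100, rows=20)
print(nf1, nf2)  # → 200 150: deeper paging detects more duplicates
```

With shallow paging the overlapping ids never both make it into the merge window, so no duplicates are subtracted; paging deeper pulls both copies in and numFound drops — the same inconsistency reported in the thread.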
On Fri, Jan 1, 2010 at 7:10 AM, Yonik Seeley wrote:
> On Thu, Dec 31, 2009 at 10:26 PM, Chris Hostetter
> wrote:
>> why do we bother detecting/removing the duplicates?
On Thu, Dec 31, 2009 at 10:26 PM, Chris Hostetter
wrote:
> why do we bother detecting/removing the duplicates?
>
> strictly speaking docs with duplicate IDs on multiple shards is a "garbage
> in" situation, i can understanding Solr taking a little extra effort to
> not fail hard if this situation
Yonik Seeley-2 wrote:
>
> On Thu, Dec 31, 2009 at 2:29 AM, johnson hong
> wrote:
>>
>> Hi, all.
>> I found a problem on distributed-search.
>> when i use "?q=keyword&start=0&rows=20" to query across
>> distributed-search, it will return numFound="181", then I
>> change the start param from 0 to 100, it will return numFound="13
: You probably have duplicates (docs on different shards with the same id).
: Deeper paging will detect more of them.
: It does raise the question of whether we should be changing numFound, or
: indicating a separate duplicate count. Duplicates aren't eliminated
random thought (from someone who's nev
On Thu, Dec 31, 2009 at 2:29 AM, johnson hong
wrote:
>
> Hi, all.
> I found a problem on distributed-search.
> when i use "?q=keyword&start=0&rows=20" to query across
> distributed-search, it will return numFound="181", then I
> change the start param from 0 to 100, it will return numFound="13