Sorry for introducing bad information.
Because it happens in the JSON Facet API, I thought it would also happen in
regular field faceting. Sorry again for the misunderstanding.

2016-07-07 16:08 GMT-03:00 Chris Hostetter <hossman_luc...@fucit.org>:

>
> : The problem with the shards appears in the following scenario (note that
> : the problem below also applies in a Solr standalone environment with
> : distributed search):
> :
> : Shard1: DATA_SOURCE1 (3 docs), DATA_SOURCE2 (2 docs), DATA_SOURCE3 (2 docs).
> : Shard2: DATA_SOURCE3 (2 docs), DATA_SOURCE2 (1 doc).
> :
> : If you make a distributed search across these two shards, faceting
> : dataSourceName with a limit of 1, it will ask for the top 1 in the first
> : shard (DATA_SOURCE1 (3 docs)) and for the top 1 in the second shard
> : (DATA_SOURCE3 (2 docs)). After that it will merge the results and return
> : DATA_SOURCE1 (3 docs), when it should have returned DATA_SOURCE3 (4 docs).
>
> That's completely false.
>
> a) in the first pass, even if you ask for "top 1" (ie: facet.limit=1) Solr
> will overrequest when communicating with each shard (the amount of
> overrequest is a function of your facet.limit, so as facet.limit increases
> so does the overrequest amount)
>
> b) if *any* (but not *all*) shards return DATA_SOURCE3 from the
> initial shard request, a second "refinement" step will request the count
> for DATA_SOURCE3 from all of the other shards to get an accurate count,
> and to accurately sort DATA_SOURCE3 to the top of the facet constraint
> list.
>
>
> -Hoss
> http://www.lucidworks.com/
>
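
For anyone reading this thread later, the two steps Hoss describes (overrequest, then refinement) can be sketched as a toy simulation over the shard counts from the quoted scenario. This is only an illustration, not Solr's actual code: the function names are hypothetical, the shards are plain dictionaries, and the overrequest formula assumes the defaults facet.overrequest.ratio=1.5 and facet.overrequest.count=10. The sketch also refines every candidate term, which is simpler than what Solr really does.

```python
from collections import Counter

# Per-shard facet counts for dataSourceName, taken from the quoted scenario.
SHARDS = [
    {"DATA_SOURCE1": 3, "DATA_SOURCE2": 2, "DATA_SOURCE3": 2},
    {"DATA_SOURCE3": 2, "DATA_SOURCE2": 1},
]


def shard_limit(limit, ratio=1.5, count=10):
    """Per-shard limit after overrequest (assumed defaults, see lead-in)."""
    return int(limit * ratio) + count


def top_terms(shard, n):
    """Top n (term, count) pairs of one shard, by count desc, then term."""
    return sorted(shard.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def distributed_facet(shards, limit, per_shard_limit):
    """Hypothetical two-pass merge: initial shard requests, then refinement."""
    # Pass 1: each shard returns only its own top terms.
    first_pass = [dict(top_terms(s, per_shard_limit)) for s in shards]
    candidates = set().union(*first_pass)

    totals = Counter()
    for shard, seen in zip(shards, first_pass):
        for term in candidates:
            # Pass 2 (refinement): a shard that did not return a candidate
            # term in pass 1 is asked for its exact count of that term.
            totals[term] += seen[term] if term in seen else shard.get(term, 0)

    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Even with no overrequest at all (each shard returns just its top 1),
# refinement fetches the missing DATA_SOURCE3 count from Shard1.
print(distributed_facet(SHARDS, limit=1, per_shard_limit=1))
# With the assumed default overrequest, facet.limit=1 asks each shard for 11.
print(distributed_facet(SHARDS, limit=1, per_shard_limit=shard_limit(1)))
```

In both calls the merged top entry is DATA_SOURCE3 with 4 docs, because the refinement pass adds Shard1's count to the one Shard2 reported.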
