What version of Solr are you using? There's been quite a bit of work
on this lately, and I'm not even sure how much of it has made it into 3.6.
You might try searching the JIRA list: Martijn van Groningen has done a bunch
of work on this lately, so look for his name. Fortunately, it's not likely to
get a bunch of false hits <G>.

Best
Erick

On Fri, Jul 13, 2012 at 7:50 AM, Agnieszka Kukałowicz
<agnieszka.kukalow...@usable.pl> wrote:
> Hi,
>
> I have a problem with facet counts in distributed grouping. It appears only
> when I run a query that returns almost all of the documents.
>
> My Solr setup has 4 shards and my queries look like:
>
> http://host:port/select/?q=*:*&shards=shard1,shard2,shard3,shard4&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> With a query like the one above I get strange counts for the field category1.
> The counts for the values are very large:
> <int name="val1">9659</int>
> <int name="val2">7015</int>
> <int name="val3">5676</int>
> <int name="val4">1180</int>
> <int name="val5">1105</int>
> <int name="val6">979</int>
> <int name="val7">770</int>
> <int name="val8">701</int>
> <int name="">612</int>
> <int name="val9">422</int>
> <int name="val10">358</int>
>
> When I narrow the results by adding fq=category1:"val1", etc. to the query,
> I get counts different from those the category1 facet shows for the first
> few values:
>
> fq=category1:"val1" - counts: 22
> fq=category1:"val2" - counts: 22
> fq=category1:"val3" - counts: 21
> fq=category1:"val4" - counts: 19
> fq=category1:"val5" - counts: 19
> fq=category1:"val6" - counts: 20
> fq=category1:"val7" - counts: 20
> fq=category1:"val8" - counts: 25
> fq=category1:"val9" - counts: 422
> fq=category1:"val10" - counts: 358
>
> From val9 onward the count is OK.
>
> At first I thought that for some values in the "category1" facet the group
> count does not work and that it returns the count of all documents, not
> grouped by the field id. But the number of all documents matching the query
> fq=category1:"val1" is 45468, so the numbers are not the same.
>
> I checked the queries on each shard for val1 and the results are:
>
> shard1:
> query:
> http://shard1/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> <lst name="fcategory">
> <int name="val1">11</int>
>
> query:
> http://shard1/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1:"val1"
>
> shard2:
> query:
> http://shard2/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> there is no value "val1" in category1 facet.
>
> query:
> http://shard2/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1:"val1"
>
> <int name="ngroups">7</int>
>
> shard3:
> query:
> http://shard3/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1
>
> there is no value val1 in category1 facet
>
> query:
> http://shard3/select/?q=*:*&group=true&group.field=id&group.facet=true&group.ngroups=true&facet.field=category1&facet.missing=false&facet.mincount=1&fq=category1:"val1"
>
> <int name="ngroups">4</int>
>
> So it looks like the detailed query with fq=category1:"val1" returns the
> correct results, but Solr has a problem with facet counts when one of the
> shards does not return a facet value (in this scenario "val1") that exists
> on other shards.
>
> I checked shards for "val10" and I got:
>
> shard1: count for val10 - 142
> shard2: count for val10 - 131
> shard3: count for val10 - 149
> sum of counts 422 - ok.
>
> I'm not sure how to resolve this situation. Surely the counts for val1 to
> val9 should be different, and they should not be at the top of the category1
> facet, because this is very confusing. Do you have any idea how to fix this
> problem?
>
> Best regards
> Agnieszka
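
The per-shard check described in the quoted message can be sketched in a few lines of Python. This is only an illustration, not anything Solr ships: the helper names `merge_shard_counts` and `find_mismatches` are hypothetical, and the numbers are the val1 figures reported above (11, 7 and 4 per shard, against 9659 in the distributed facet). It sums the per-shard counts for each facet value and reports the values whose distributed count disagrees with that sum.

```python
from collections import Counter


def merge_shard_counts(per_shard):
    """Sum facet counts across shards (hypothetical helper, not a Solr API)."""
    total = Counter()
    for counts in per_shard:
        total.update(counts)  # Counter.update adds counts rather than replacing them
    return total


def find_mismatches(distributed, per_shard):
    """Return {value: (distributed_count, summed_shard_count)} where the two differ."""
    merged = merge_shard_counts(per_shard)
    return {
        value: (count, merged.get(value, 0))
        for value, count in distributed.items()
        if count != merged.get(value, 0)
    }


if __name__ == "__main__":
    # val1 figures as reported in the message above, one dict per shard checked.
    per_shard = [{"val1": 11}, {"val1": 7}, {"val1": 4}]
    distributed = {"val1": 9659}
    print(find_mismatches(distributed, per_shard))
```

Running the same comparison over every value in the category1 facet would show whether the mismatch is limited to values that some shard does not return.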
