Re: Nested facet complete wrong counts

2017-11-11 Thread Kenny Knecht
RRGG - [banging my head against the wall] Of course. You are absolutely right about the multi-valuedness. Thanks for the 7.0 hint. Gives a reason to upgrade. Need to re-index when upgrading? Kenny Kenny Knecht, PhD, CTO and technical lead

Re: Nested facet complete wrong counts

2017-11-11 Thread Yonik Seeley
Also, if you're looking at all constraints, you shouldn't need refine:true. But if you do need it, it was only added in Solr 7.0 (and I see you're using 6.6). -Yonik On Sat, Nov 11, 2017 at 9:48 AM, Yonik Seeley wrote: > On Sat, Nov 11, 2017 at 9:18 AM, Kenny Knecht wrote: >> Hi Yonik, >> >> I a

Re: Nested facet complete wrong counts

2017-11-11 Thread Yonik Seeley
On Sat, Nov 11, 2017 at 9:18 AM, Kenny Knecht wrote: > Hi Yonik, > > I am aware of the estimate on the hll. But we don't use the hll as a > baseline for comparison. We ask the values for one facet (for example > Gender). We store these counts for each bucket. Next we do another request. > This tim

Re: Nested facet complete wrong counts

2017-11-11 Thread Kenny Knecht
Hi Yonik, I am aware of the estimate on the hll. But we don't use the hll as a baseline for comparison. We ask the values for one facet (for example Gender). We store these counts for each bucket. Next we do another request. This time for a facet and a subfacet (for example Gender x Type). We sum
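
A minimal sketch of the consistency check described in this message, in plain Python rather than against Solr. The field names `gender` and `type`, the helper functions, and the three sample documents are all hypothetical. It illustrates one case where the summed sub-facet counts legitimately differ from the parent bucket count: when the sub-facet field is multi-valued, a document is counted once per value.

```python
from collections import Counter, defaultdict


def facet_counts(docs, field):
    """Count documents per value of `field` (each doc counted once per distinct value)."""
    counts = Counter()
    for doc in docs:
        for value in set(doc.get(field, [])):
            counts[value] += 1
    return counts


def nested_facet_counts(docs, parent, child):
    """Count documents per (parent value, child value) pair, like a facet with a sub-facet."""
    nested = defaultdict(Counter)
    for doc in docs:
        for p in set(doc.get(parent, [])):
            for c in set(doc.get(child, [])):
                nested[p][c] += 1
    return nested


# Hypothetical sample data: the second document has a multi-valued "type".
docs = [
    {"gender": ["F"], "type": ["A"]},
    {"gender": ["F"], "type": ["A", "B"]},
    {"gender": ["M"], "type": ["B"]},
]

top = facet_counts(docs, "gender")
nested = nested_facet_counts(docs, "gender", "type")

# Sum the sub-facet bucket counts under each parent bucket.
sums = {g: sum(children.values()) for g, children in nested.items()}

for g in sorted(top):
    print(g, "parent count:", top[g], "sum of sub-buckets:", sums[g])
```

In this toy data the `F` bucket holds 2 documents while its sub-buckets sum to 3, so a mismatch between the two numbers is not by itself evidence of a wrong count.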

Re: Nested facet complete wrong counts

2017-11-11 Thread Kenny Knecht
Thank you. But as I showed in my example, we used refine, and overrequest is not strictly needed because we need all buckets anyway. But that can hardly explain an error of 60%, right? On 10 Nov 2017 at 19:29, "Amrit Sarkar" wrote: > Kenny, > > This is a known behavior in multi-sharded collection w

Re: Nested facet complete wrong counts

2017-11-10 Thread Yonik Seeley
I do notice you are using hll (hyper-log-log), which is a distributed cardinality *estimate*: https://en.wikipedia.org/wiki/HyperLogLog -Yonik On Fri, Nov 10, 2017 at 11:32 AM, kenny wrote: > Hi all, > > We are doing some tests in solr 6.6 with json facet api and we get > completely wrong count

Re: Nested facet complete wrong counts

2017-11-10 Thread Amrit Sarkar
Kenny, This is a known behavior in multi-sharded collections where the field values belonging to the same facet don't reside in the same shard. Yonik Seeley has improved the JSON Facet feature by introducing the "overrequest" and "refine" parameters. Kindly check out Jira: https://issues.apache.org/jira/brow
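
For illustration, a rough sketch of where the "overrequest" and "refine" parameters sit in a nested JSON Facet request body. The field names `gender` and `type`, the facet labels `by_parent` and `by_child`, and the helper function are hypothetical, and the parameter values are placeholders rather than recommendations. As noted elsewhere in this thread, `refine` is only available from Solr 7.0.

```python
import json


def nested_terms_facet(parent_field, child_field, refine=True, overrequest=100):
    """Build a JSON request body with a terms facet and one terms sub-facet.

    The facet labels "by_parent" and "by_child" are arbitrary names chosen here.
    """
    return {
        "query": "*:*",
        "limit": 0,  # return no documents, only facet results
        "facet": {
            "by_parent": {
                "type": "terms",
                "field": parent_field,
                "limit": -1,                 # ask for all buckets
                "overrequest": overrequest,  # extra buckets requested per shard
                "refine": refine,            # second pass to fill in shard counts
                "facet": {
                    "by_child": {
                        "type": "terms",
                        "field": child_field,
                        "limit": -1,
                    }
                },
            }
        },
    }


body = nested_terms_facet("gender", "type")
print(json.dumps(body, indent=2))
```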

Nested facet complete wrong counts

2017-11-10 Thread kenny
Hi all, We are doing some tests in Solr 6.6 with the JSON Facet API and we get completely wrong counts for some combinations of facets. Setting: We have a set of fields for 376k documents in our query (total 120M documents). We work with 2 shards. When doing first a faceting over the first facet