After a bit of investigation, I can confirm that QTime more than doubles
for a single Solr query in a distributed environment.
I will go into the details, but before I dig into the code: would the
unique() function benefit from enabling docValues on the field being
counted?

The index holds 50,000,000 docs, the field I am faceting on has a
cardinality of 50 values, each bucket holds around 1,000,000 docs, and the
field passed to unique() has a cardinality of 20,000.
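
A minimal sketch of the kind of request described above, written against
the Solr JSON Facet API. The field names "category" and "user_id", the
facet labels "buckets" and "distinct", and the helper function name are
hypothetical placeholders, not taken from the real schema:

```python
import json


def build_distinct_count_facet(bucket_field, unique_field, limit=50):
    """Build a JSON request body: a terms facet on bucket_field with a
    unique() count of unique_field computed inside every bucket."""
    return {
        "query": "*:*",
        "limit": 0,  # only the facet counts are needed, no documents
        "facet": {
            "buckets": {  # hypothetical label for the terms facet
                "type": "terms",
                "field": bucket_field,
                "limit": limit,
                "facet": {
                    # hypothetical label for the per-bucket distinct count
                    "distinct": f"unique({unique_field})",
                },
            }
        },
    }


# Hypothetical field names, used only for illustration.
body = build_distinct_count_facet("category", "user_id")
print(json.dumps(body, indent=2))
```

The resulting body would be POSTed to a search handler of the collection.
Replacing unique(...) with hll(...) in the same request gives the
HyperLogLog-based estimate, which makes a side-by-side QTime comparison
straightforward.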

I did not think these were big numbers. I will need to speed up my query,
as I am assuming something is not going as expected :)

Cheers

On Mon, Sep 12, 2016 at 11:59 AM, Alessandro Benedetti <
abenede...@apache.org> wrote:

> Hi gents,
> I was taking a look at the ways to calculate a distinct count per facet.
>
> Reading through Yonik's blog posts [1], it seems quite safe to assume that
> "unique(field)" is the way to go.
>
> Do we have any benchmarks or details about the implementation?
> As per Yonik's blog, it is faster than HyperLogLog, so I assume it uses
> different data structures and algorithms.
> Worst case, I will go through the code, but any presentation or blog
> post would be useful!
> Cheers
>
>
> [1] http://yonik.com/solr-count-distinct/ ,
> http://yonik.com/facet-performance/
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>



