After a bit of investigation, I am seeing QTime more than double for a single Solr query in a distributed environment. I will go into the details, but before I dig into the code: would the unique() functionality benefit from storing docValues for the unique field?
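To make the question concrete, here is a minimal sketch of the kind of request I mean: a terms facet with a unique() aggregation per bucket, following the JSON Facet syntax from Yonik's blog. The field names `category` and `user_id`, the facet labels, and the helper function are placeholders for illustration, not my actual schema or code.

```python
import json


def build_unique_facet_params(facet_field, unique_field, limit=50):
    """Build query params for a terms facet with a distinct count per bucket.

    Field names passed in are hypothetical; adapt them to the real schema.
    """
    json_facet = {
        "buckets": {                      # arbitrary label for the facet
            "type": "terms",
            "field": facet_field,
            "limit": limit,
            "facet": {
                # distinct count of unique_field inside each bucket
                "distinct": "unique({})".format(unique_field)
            },
        }
    }
    return {
        "q": "*:*",
        "rows": 0,                        # only facet results are needed
        "json.facet": json.dumps(json_facet),
    }


params = build_unique_facet_params("category", "user_id")
print(params["json.facet"])
```

These params would be sent to a Solr select handler. Swapping `unique(...)` for `hll(...)` in the same sketch gives the HyperLogLog variant to compare timings against.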
My index has 50,000,000 docs. The field I am faceting on has a cardinality of 50 values, each bucket holds around 1,000,000 docs, and the unique field has a cardinality of 20,000.
I did not think these were big numbers. I will need to speed up my query, as I assume something is not going as expected :)

Cheers

On Mon, Sep 12, 2016 at 11:59 AM, Alessandro Benedetti <abenede...@apache.org> wrote:

> Hi gents,
> I was taking a look at the ways to calculate a distinct count per facet.
>
> Reading through Yonik's blog posts [1], it seems quite safe to assume that
> "unique(field)" is the way to go.
>
> Do we have any benchmarks or details about the implementation?
> As per Yonik's blog it is faster than HyperLogLog, so I assume it is
> using different data structures and algorithms.
> Worst case scenario I go through the code, but any presentation or blog
> would be useful!
> Cheers
>
> [1] http://yonik.com/solr-count-distinct/ , http://yonik.com/facet-performance/
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti