lakshmanan-v opened a new issue #7014:
URL: https://github.com/apache/incubator-pinot/issues/7014


   The accuracy and memory footprint of DISTINCTCOUNTHLL can be improved by 
adopting a more recent HLL algorithm. We have two options: replace the 
existing implementation with a better one, or leave DISTINCTCOUNTHLL on the 
original HLL and add a separate function (e.g. 
DISTINCTCOUNTHLLPLUSPLUS).
   
   [Google's 
HLL++](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/40671.pdf), 
a popular algorithm in the community, offers many improvements over the 
original HLL. There are multiple Java implementations of HLL++, and they 
differ in performance because of register size and other implementation 
choices. Clearspring 
[stream-lib](https://github.com/addthis/stream-lib), which backs the current 
HyperLogLog function, implements HLL++ as 
[HyperLogLogPlus](https://github.com/addthis/stream-lib/commits/master/src/main/java/com/clearspring/analytics/stream/cardinality/HyperLogLogPlus.java).
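
   To make the register-size trade-off concrete, here is a minimal sketch 
(not Pinot or stream-lib code) that tabulates the usual HyperLogLog estimate 
of relative standard error, roughly 1.04 / sqrt(m) for m = 2^p registers, 
against the size of a dense register array. The class name 
`HllPrecisionSketch`, its helper methods, and the 6-bits-per-register figure 
are assumptions for illustration; actual implementations may pack registers 
differently and add a sparse representation for low cardinalities.

```java
public class HllPrecisionSketch {

    // Number of registers for precision p: m = 2^p.
    static int registers(int p) {
        return 1 << p;
    }

    // Approximate relative standard error of a HyperLogLog estimate: 1.04 / sqrt(m).
    static double relativeStandardError(int p) {
        return 1.04 / Math.sqrt(registers(p));
    }

    // Size in bytes of a dense register array, rounded up to a whole byte.
    // bitsPerRegister is an assumption supplied by the caller (e.g. 6).
    static int denseBytes(int p, int bitsPerRegister) {
        return (registers(p) * bitsPerRegister + 7) / 8;
    }

    public static void main(String[] args) {
        // Walk a few precisions to show how error falls as memory grows.
        for (int p = 10; p <= 16; p += 2) {
            System.out.printf(
                "p=%d registers=%d stderr=%.4f dense(6-bit)=%d bytes%n",
                p, registers(p), relativeStandardError(p), denseBytes(p, 6));
        }
    }
}
```

   Each increment of p doubles the dense memory but shrinks the error by 
only a factor of sqrt(2), so the default precision chosen for any new 
function has a direct cost in both directions.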
 
   
    
   
   


