Jackie-Jiang commented on pull request #5872:
URL: https://github.com/apache/incubator-pinot/pull/5872#issuecomment-675798956


   @mayankshriv Currently `DistinctCount` does not return the exact distinct 
count as expected, because it stores the `hashCode()` of each value instead of 
the actual value, so it undercounts whenever a hash collision occurs. This PR 
fixes that unexpected behavior, but the fix comes with a performance overhead.
   `DistinctCountBitmap` will have the same behavior as the current 
`DistinctCount` (storing the hash of the values) with similar or better 
performance. If you have performance-sensitive use cases that rely on 
`DistinctCount`, you might consider using `DistinctCountBitmap` instead. For 
non-performance-sensitive use cases, nothing needs to change, as 
`DistinctCount` will return the exact distinct count, which is the expected 
behavior.
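
   To illustrate the collision issue described above, here is a minimal 
sketch. It is not the Pinot implementation; the class and method names are 
hypothetical and only model the two strategies (storing hashes vs. storing 
the values). It relies on the fact that in Java `"Aa"` and `"BB"` have the 
same `hashCode()`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of Pinot.
public class DistinctCountSketch {

    // Models the hash-based approach: only hashCode() of each value is kept,
    // so two different values with the same hash are counted once.
    static int hashBasedDistinctCount(List<String> values) {
        Set<Integer> hashes = new HashSet<>();
        for (String value : values) {
            hashes.add(value.hashCode());
        }
        return hashes.size();
    }

    // Models the exact approach: the actual values are kept,
    // so collisions do not affect the count.
    static int exactDistinctCount(List<String> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same hashCode() (2112).
        List<String> values = Arrays.asList("Aa", "BB", "Aa");
        System.out.println("hash-based: " + hashBasedDistinctCount(values)); // hash-based: 1
        System.out.println("exact:      " + exactDistinctCount(values));     // exact:      2
    }
}
```

   The hash-based variant only has to keep one `int` per value, which is 
why it can be cheaper, at the cost of undercounting on collisions.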


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


