Jackie-Jiang commented on pull request #5872: URL: https://github.com/apache/incubator-pinot/pull/5872#issuecomment-675798956
@mayankshriv Currently `DistinctCount` does not return the exact distinct count as expected, because it stores the `hashCode()` of each value instead of the actual value, so it returns an inaccurate (lower) result when hash collisions occur. This PR fixes that unexpected behavior, but the fix has a performance overhead. `DistinctCountBitmap` will keep the same behavior as the current `DistinctCount` (storing hashes of the values), with similar or better performance. For performance-sensitive use cases that rely on `DistinctCount`, you might consider using `DistinctCountBitmap` instead. For non-performance-sensitive use cases, nothing needs to change: `DistinctCount` will return the exact distinct count, which is the expected behavior.
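
To illustrate the collision issue described in the comment, here is a minimal sketch in plain Java. It is not Pinot's actual implementation: the class `DistinctCountSketch` and the methods `hashBasedCount` and `exactCount` are hypothetical names chosen for this example. It relies only on the fact that the strings `"Aa"` and `"BB"` share the same `String.hashCode()`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not code from the Pinot repository.
public class DistinctCountSketch {

    // Mimics the old behavior: keep only hashCode() of each value.
    // Two different values with the same hash are counted once.
    static int hashBasedCount(List<String> values) {
        Set<Integer> hashes = new HashSet<>();
        for (String value : values) {
            hashes.add(value.hashCode());
        }
        return hashes.size();
    }

    // Mimics the fixed behavior: keep the actual values.
    // Costs more memory and time, but the count is exact.
    static int exactCount(List<String> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" both hash to 2112, so they collide.
        List<String> values = Arrays.asList("Aa", "BB", "C");
        System.out.println("hash-based: " + hashBasedCount(values)); // prints "hash-based: 2"
        System.out.println("exact:      " + exactCount(values));     // prints "exact:      3"
    }
}
```

The hash-based variant stores one `int` per distinct hash, which is why it tends to be cheaper, while the exact variant has to retain every distinct value.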