richardstartin commented on PR #15049: URL: https://github.com/apache/pinot/pull/15049#issuecomment-2656427339
Just a note on whether to use the `RoaringBitmapWriter`, the benchmark results, and the necessity for the benchmark itself which might have saved a bit of time, and to avoid the result of this benchmark being generalised inappropriately. This isn't really the use case it was designed for, and just adding to the bitmap makes perfect sense. It's intended for consuming ascending or nearly ascending values, which the flattened doc id mapping won't preserve in general. The rationale was outlined [here](https://richardstartin.github.io/posts/roaringbitmaps-construction-performance). If the added doc id is outside of the current window of 2^16 values, it falls back to simply adding to the underlying bitmap so can only add overhead. On the slow partial radix sort option, the intended use case is for adding unsorted values _in one go_ as in `RoaringBitmap.bitmapOfUnordered` - batching as done here was never considered and wouldn't be worthwhile because the cost the approach saves can only be amortised in one go. It was benchmarked for intended use [here](https://github.com/RoaringBitmap/RoaringBitmap/pull/199#issuecomment-352427889). I can't think of a use case in Pinot for using the partial radix sort option. The rest of the changes look good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org