richardstartin commented on pull request #8411:
URL: https://github.com/apache/pinot/pull/8411#issuecomment-1081007479


   > Suggest just adding `int getNumMatchingDocs()` (the return type can be 
`int` because the count is within a single segment) and `RoaringBitmap 
getBitmap()` without the boolean check methods. We can add default 
implementations of them that just scan the `BlockDocIdIterator`, and override 
them in the filter operators that can accelerate this process.
   > 
   > This way, every single-`COUNT` aggregation query can be answered by the 
short-circuited operator. Even if there is no special optimization for the 
filter (scan-based), we can still save the overhead of creating 10000-doc 
blocks.
   
   Makes sense. I will follow up with this.
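
   The suggestion quoted above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `DocIdIterator`, `FilterOperatorSketch`, `ScanFilter` and `BitmapFilter` are hypothetical names, the `EOF` sentinel value is an assumption, and `java.util.BitSet` stands in for `RoaringBitmap` so the example has no dependencies.

```java
import java.util.BitSet;

// Hypothetical stand-in for BlockDocIdIterator: yields ascending doc ids, then EOF.
interface DocIdIterator {
  int EOF = Integer.MIN_VALUE; // assumed sentinel for this sketch

  int next();
}

// Hypothetical filter-operator contract carrying the two proposed methods.
interface FilterOperatorSketch {
  DocIdIterator iterator();

  // Default: count matches by scanning the iterator; no doc-id blocks are built.
  default int getNumMatchingDocs() {
    DocIdIterator it = iterator();
    int count = 0;
    while (it.next() != DocIdIterator.EOF) {
      count++;
    }
    return count;
  }

  // Default: collect matches by scanning. BitSet stands in for RoaringBitmap.
  default BitSet getBitmap() {
    DocIdIterator it = iterator();
    BitSet bits = new BitSet();
    for (int docId = it.next(); docId != DocIdIterator.EOF; docId = it.next()) {
      bits.set(docId);
    }
    return bits;
  }
}

// Scan-based filter: no acceleration available, so it relies on the defaults.
final class ScanFilter implements FilterOperatorSketch {
  private final int[] sortedDocIds;

  ScanFilter(int... sortedDocIds) {
    this.sortedDocIds = sortedDocIds;
  }

  @Override
  public DocIdIterator iterator() {
    return new DocIdIterator() {
      private int pos = 0;

      @Override
      public int next() {
        return pos < sortedDocIds.length ? sortedDocIds[pos++] : EOF;
      }
    };
  }
}

// Index-backed filter: overrides both methods to answer without scanning.
final class BitmapFilter implements FilterOperatorSketch {
  private final BitSet bits;

  BitmapFilter(BitSet bits) {
    this.bits = bits;
  }

  @Override
  public DocIdIterator iterator() {
    return new DocIdIterator() {
      private int cursor = bits.nextSetBit(0);

      @Override
      public int next() {
        if (cursor < 0) {
          return EOF;
        }
        int current = cursor;
        cursor = bits.nextSetBit(current + 1);
        return current;
      }
    };
  }

  @Override
  public int getNumMatchingDocs() {
    return bits.cardinality();
  }

  @Override
  public BitSet getBitmap() {
    return (BitSet) bits.clone();
  }
}
```

   With this shape, a `COUNT` query only ever calls `getNumMatchingDocs()`, so a scan-based filter still avoids materializing blocks and an index-backed filter answers from its cardinality.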
   
   > Not sure how much extra optimization we can get from `BitmapCollection`, 
but that does add complexity to the logic, so I'd suggest first adding the 
basic `RoaringBitmap` one, then have a separate enhancement to compare their 
performance
   
   I'd prefer to keep it this way:
   
   * A lot of collective effort has gone into the counting methods over the 
years, and `BitmapCollection` just encapsulates them. 
   * Negating a compressed bitmap by flipping its bits produces a very dense 
bitmap, unless the bitmap was already dense, in which case that cost has 
already been paid.
   * I don't anticipate it needing to change.
   * It has high test coverage, if you are concerned about correctness.
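
   To illustrate the point about negation: the number of documents that do not match a filter can be derived from the cardinality of the matching set, so the complement never has to be materialized. The sketch below shows only that arithmetic identity; it is not the `BitmapCollection` implementation. `NegationCount` and its methods are hypothetical, `java.util.BitSet` stands in for a compressed bitmap, and every set bit is assumed to be below `numDocs`.

```java
import java.util.BitSet;

// Hypothetical helper; BitSet stands in for a compressed bitmap.
final class NegationCount {
  private NegationCount() {
  }

  // Counts docs that do NOT match, without building the complement.
  // Assumes every set bit in `matching` is < numDocs.
  static int countNot(BitSet matching, int numDocs) {
    return numDocs - matching.cardinality();
  }

  // Naive alternative: flip the range [0, numDocs) and count the result.
  // For a sparse input the flipped copy is almost entirely set bits.
  static int countNotByFlipping(BitSet matching, int numDocs) {
    BitSet flipped = (BitSet) matching.clone();
    flipped.flip(0, numDocs);
    return flipped.cardinality();
  }
}
```

   Both methods return the same count; the first one just avoids allocating and populating the dense intermediate bitmap.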


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


