richardstartin opened a new issue #7747:
URL: https://github.com/apache/pinot/issues/7747


   Aggregation performance appears to be bottlenecked on `DirectByteBuffer` 
bounds checks, as can be seen in this profile taken from a Pinot server in a 
live system executing group-by queries (among other things): nearly 30% of 
samples were in `Buffer.checkIndex`.
   
   <img width="755" alt="Screenshot 2021-11-11 at 11 16 33" 
src="https://user-images.githubusercontent.com/16439049/141289056-82af984c-c54c-4b68-9e6c-3c42b7c84c97.png">
   
   
   <img width="1282" alt="Screenshot 2021-11-11 at 11 15 05" 
src="https://user-images.githubusercontent.com/16439049/141289372-61db4b6f-b358-497a-9468-a9b3d0590750.png">
   
   These checks can usually be eliminated when the accesses are sequential, 
e.g. within a counted loop with a known bound. #7708 produced a ~5x speedup by 
reducing the number of accesses to `PinotDataBuffer` by a factor of 8, but a 
more general technique applicable to aggregations would be to pass a primitive 
array to be filled in a for loop, rather than assuming that 
`PinotDataBuffer.getX` calls behave like array accesses. This would have to 
support the Cartesian product of numeric types to cover some of the 
conversions we need (e.g. `int` to `double`).
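
   As a rough illustration of the idea, here is a minimal sketch that uses a 
plain `java.nio.ByteBuffer` in place of `PinotDataBuffer`. The class and method 
names (`BulkReadSketch`, `fillDoublesFromInts`, `sum`) are hypothetical and not 
part of Pinot. The range is validated once up front and the reads happen in a 
counted loop with a known bound, so the aggregation afterwards only touches a 
primitive array. Whether the JIT actually removes the per-element checks in 
the fill loop depends on the JVM and would need to be confirmed with a profile.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BulkReadSketch {

    // Hypothetical bulk read (not Pinot's API): widens `length` ints, starting
    // at element index `start`, into `dest` as doubles.
    static void fillDoublesFromInts(ByteBuffer buffer, int start, int length, double[] dest) {
        // Validate the whole range once, before the loop.
        long endByte = ((long) start + length) * Integer.BYTES;
        if (start < 0 || length < 0 || length > dest.length || endByte > buffer.limit()) {
            throw new IndexOutOfBoundsException("start=" + start + ", length=" + length);
        }
        // Counted loop with a known bound over sequential offsets.
        for (int i = 0; i < length; i++) {
            dest[i] = buffer.getInt((start + i) * Integer.BYTES);
        }
    }

    // The aggregation itself only sees a primitive array.
    static double sum(double[] values, int length) {
        double total = 0;
        for (int i = 0; i < length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(8 * Integer.BYTES).order(ByteOrder.nativeOrder());
        for (int i = 0; i < 8; i++) {
            buffer.putInt(i * Integer.BYTES, i + 1);
        }
        double[] block = new double[8];
        fillDoublesFromInts(buffer, 0, 8, block);
        System.out.println(sum(block, 8));
    }
}
```

   Covering the Cartesian product of numeric types would mean one such fill 
method per (stored type, requested type) pair, e.g. `int` to `long`, `int` to 
`double`, `long` to `double`, and so on.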
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


