jbewing opened a new pull request, #10936:
URL: https://github.com/apache/pinot/pull/10936

   ### What
   This PR fixes a bug that can be triggered when an mmap'd segment larger than 
`Integer.MAX_VALUE` bytes (roughly 2GB) is prefetched by `SegmentLocalFSDirectory`.
   
   There is an (outdated) comment above the prefetching code stating that 
`buffer.size()` is 32-bit and that `pos` therefore cannot be greater than 
`Integer.MAX_VALUE`. That isn't true: LArray-backed buffers can exceed 
`Integer.MAX_VALUE` in size (and `PinotDataBuffer#size` returns a `long`). If a 
buffer whose size exceeds `Integer.MAX_VALUE` is prefetched, `pos` is cast to 
an `int`, overflows to a negative value (e.g. `Integer.MIN_VALUE`), and is 
passed to `PinotDataBuffer#getByte`. Some implementations throw loudly on such 
an offset. On others (like LArray), the behavior is undefined, because that 
buffer implementation is backed by Unsafe and the memory address being accessed 
lies outside the range of the buffer.
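
The narrowing cast described above can be reproduced in isolation. The sketch below is not the Pinot code: `LongIndexedBuffer`, `virtualBuffer`, `narrowedOffset`, and `touchPages` are hypothetical names used only for illustration, and the 4096-byte page size is an assumption. It shows that casting the first offset past `Integer.MAX_VALUE` yields `Integer.MIN_VALUE`, and how a page-touching loop stays in range when the offset is kept as a `long`.

```java
class PrefetchCastDemo {
    /** Hypothetical stand-in for a buffer addressed by long offsets. */
    interface LongIndexedBuffer {
        long size();
        byte getByte(long offset);
    }

    /** A size-only buffer that rejects out-of-range offsets instead of reading memory. */
    static LongIndexedBuffer virtualBuffer(final long size) {
        return new LongIndexedBuffer() {
            @Override
            public long size() {
                return size;
            }

            @Override
            public byte getByte(long offset) {
                if (offset < 0 || offset >= size) {
                    throw new IndexOutOfBoundsException("offset " + offset + " outside [0, " + size + ")");
                }
                return 0;
            }
        };
    }

    /** Reproduces the narrowing cast: only the low 32 bits of pos survive. */
    static int narrowedOffset(long pos) {
        return (int) pos;
    }

    /** Touches one byte per page using long offsets; returns the number of pages touched. */
    static long touchPages(LongIndexedBuffer buffer, int pageSize) {
        long pages = 0;
        for (long pos = 0; pos < buffer.size(); pos += pageSize) {
            buffer.getByte(pos); // no cast, so pos stays within [0, size)
            pages++;
        }
        return pages;
    }

    public static void main(String[] args) {
        long firstBadPos = Integer.MAX_VALUE + 1L;
        System.out.println(narrowedOffset(firstBadPos)); // prints -2147483648

        LongIndexedBuffer threeGiB = virtualBuffer(3L << 30);
        System.out.println(touchPages(threeGiB, 4096)); // prints 786432
    }
}
```

A bounds-checked buffer like the one in this sketch fails fast on a negative offset. A buffer that translates the offset directly into a raw memory address has no such guard, which matches the undefined behavior described above.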
   
   I've observed this cause a (hard) JVM crash when running batch ingestion 
jobs on Java 17 with a recent patch https://github.com/apache/pinot/pull/10528 
that adds a substitute for LArray for buffers that map more than 2^31-1 bytes. 
The logs for that crash are captured in this gist: 
https://gist.github.com/jbewing/471fb86f419c9975e5f22994673b0c1b
   
   I hypothesize that this bug is less noticeable on Java versions older than 
17 because it may not result in a hard crash of the JVM when using the LArray 
large buffer implementation.
   
   ### Testing
   I've tested this patch by re-running the batch ingestion job that caused the 
crash and verifying that it no longer crashes. I welcome any advice on 
additional testing steps needed here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

