jbewing opened a new pull request, #10936: URL: https://github.com/apache/pinot/pull/10936
### What

This PR fixes a bug that can be triggered when an mmap'd segment larger than 2GB (`Integer.MAX_VALUE` bytes) is prefetched by `SegmentLocalFSDirectory`. An outdated comment above the prefetching code states that `buffer.size()` is 32-bit and that `pos` therefore cannot be greater than `Integer.MAX_VALUE`. That isn't true: LArray-backed buffers can exceed `Integer.MAX_VALUE` in size, and `PinotDataBuffer#size` returns a `long`.

If a buffer whose size exceeds `Integer.MAX_VALUE` is prefetched, `pos` is cast to an `int`, wraps around to a negative value (`Integer.MIN_VALUE`), and is passed to `PinotDataBuffer#getByte`. On some implementations, this throws loudly. On others (like LArray), the behavior is undefined, because that buffer implementation is backed by Unsafe and the memory address being accessed is outside the range of the buffer.

I've observed this cause a hard JVM crash when running batch ingestion jobs on Java 17 with a recent patch, https://github.com/apache/pinot/pull/10528, that adds a substitute for LArray for buffers that map more than 2^31-1 bytes. The logs for that crash are captured in this gist: https://gist.github.com/jbewing/471fb86f419c9975e5f22994673b0c1b

I hypothesize that this bug is less noticeable on Java versions earlier than 17, as it may not result in a hard crash of the JVM when using the LArray large buffer implementation.

### Testing

I've tested this patch by re-running the batch ingestion job that caused the crash and verifying that the crash no longer occurs. I welcome any advice on additional testing steps needed here.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
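The narrowing cast described in the PR description can be reproduced in isolation. The sketch below is not Pinot's code: `LongIndexedBuffer`, `fakeBuffer`, `narrow`, and `touchPages` are hypothetical names introduced only for illustration, and the 4096-byte page size is an assumption. It shows how a `long` offset just past `Integer.MAX_VALUE` wraps to a negative `int`, and how a page-touching loop avoids the problem by keeping offsets as `long` throughout.

```java
class PrefetchCastDemo {
    // Hypothetical stand-in for a buffer whose size and offsets are 64-bit.
    interface LongIndexedBuffer {
        long size();
        byte getByte(long offset);
    }

    // The problematic narrowing: only the low 32 bits of pos survive.
    static int narrow(long pos) {
        return (int) pos;
    }

    // Touches one byte per page using long offsets end to end.
    // Returns the number of pages touched.
    static long touchPages(LongIndexedBuffer buffer, long pageSize) {
        long touched = 0;
        for (long pos = 0; pos < buffer.size(); pos += pageSize) {
            buffer.getByte(pos);
            touched++;
        }
        return touched;
    }

    // Fake buffer (assumption for the demo): reports a size without
    // allocating memory and rejects any offset outside [0, sizeBytes).
    static LongIndexedBuffer fakeBuffer(long sizeBytes) {
        return new LongIndexedBuffer() {
            @Override
            public long size() {
                return sizeBytes;
            }

            @Override
            public byte getByte(long offset) {
                if (offset < 0 || offset >= sizeBytes) {
                    throw new IndexOutOfBoundsException("offset " + offset);
                }
                return 0;
            }
        };
    }

    public static void main(String[] args) {
        long firstBad = (long) Integer.MAX_VALUE + 1; // 2^31
        System.out.println(narrow(firstBad));         // prints -2147483648

        // A 3 GiB buffer walked in 4 KiB steps never produces a bad offset.
        System.out.println(touchPages(fakeBuffer(3L << 30), 4096L)); // prints 786432
    }
}
```

A bounds-checked buffer turns the negative offset into an exception, whereas an implementation that does raw address arithmetic has no such guard, which is consistent with the crash reported above.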
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org