richardstartin opened a new pull request #7934: URL: https://github.com/apache/pinot/pull/7934
This introduces a simple evolution of the raw index format for fixed-width data types which merely enforces that chunk sizes are a power of 2. For example, if a chunk size of 1000 is chosen, the writer will round up to 1024. This allows the reader to assume that the chunk size is a power of 2 and replace integer remainder calculations and divisions with masks and shifts respectively. The format is otherwise identical.

This helps when the index is compressed and the accesses are non-contiguous but there are many accesses per chunk. In the benchmark below, the V4 runs are about 17% faster than the V3 runs for doubles and about 31% faster for longs:

```
Benchmark                                                                    (_blockSize)  (_numBlocks)  Mode  Cnt   Score   Error  Units
BenchmarkFixedByteSVForwardIndexReader.readCompressedDoublesNonContiguousV3         10000          1000  avgt    5  39.976 ± 0.439  ms/op
BenchmarkFixedByteSVForwardIndexReader.readCompressedDoublesNonContiguousV4         10000          1000  avgt    5  33.110 ± 0.588  ms/op
BenchmarkFixedByteSVForwardIndexReader.readCompressedLongsNonContiguousV3           10000          1000  avgt    5  46.568 ± 0.440  ms/op
BenchmarkFixedByteSVForwardIndexReader.readCompressedLongsNonContiguousV4           10000          1000  avgt    5  31.989 ± 0.419  ms/op
```
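The rounding and the shift/mask substitution described above can be sketched as follows. This is a minimal illustration of the arithmetic only, not the code in this PR: the class `PowerOfTwoChunks` and its method names are hypothetical, and it assumes non-negative document IDs and a requested chunk size between 1 and 2^30.

```java
// Hypothetical helper class for illustration; not part of the Pinot code base.
final class PowerOfTwoChunks {

  private PowerOfTwoChunks() {
  }

  // Writer side: smallest power of two >= requested (e.g. 1000 -> 1024).
  static int roundUpToPowerOfTwo(int requested) {
    if (requested < 1 || requested > (1 << 30)) {
      throw new IllegalArgumentException("chunk size out of range: " + requested);
    }
    // highestOneBit(requested - 1) is the largest power of two strictly below
    // 'requested' when 'requested' > 1; doubling it gives the rounded-up size.
    return requested == 1 ? 1 : Integer.highestOneBit(requested - 1) << 1;
  }

  // Reader side: equivalent to docId / chunkSize when chunkSize == 1 << shift.
  static int chunkId(int docId, int shift) {
    return docId >>> shift;
  }

  // Reader side: equivalent to docId % chunkSize when mask == chunkSize - 1.
  static int offsetInChunk(int docId, int mask) {
    return docId & mask;
  }

  public static void main(String[] args) {
    int chunkSize = roundUpToPowerOfTwo(1000);
    // The shift and mask are derived once, then reused for every lookup.
    int shift = Integer.numberOfTrailingZeros(chunkSize);
    int mask = chunkSize - 1;
    int docId = 5000;
    System.out.println("chunkSize=" + chunkSize
        + " chunkId=" + chunkId(docId, shift)
        + " offset=" + offsetInChunk(docId, mask));
  }
}
```

Because the shift and mask are computed once per reader, each lookup avoids an integer division and a remainder, which is where the benefit for many non-contiguous accesses per chunk would come from.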