richardstartin opened a new pull request #7934:
URL: https://github.com/apache/pinot/pull/7934


   This introduces a simple evolution of the raw index format for fixed-width 
data types: the only change is that chunk sizes must be a power of 2. For 
example, if a chunk size of 1000 is requested, the writer rounds up to 1024. 
The reader can then assume a power-of-2 chunk size and replace integer 
divisions and remainder calculations with shifts and masks, respectively. The 
format is otherwise identical. 
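
To illustrate the arithmetic described above, here is a minimal Java sketch. It is not the code from this PR: the class name `PowerOfTwoChunks` and its method names are hypothetical, chosen only for illustration. It shows the writer-side rounding and the reader-side shift and mask that stand in for division and remainder.

```java
// Hypothetical helper for illustration only; not part of the Pinot codebase.
public final class PowerOfTwoChunks {
  private PowerOfTwoChunks() {
  }

  /** Rounds a positive chunk size up to the next power of 2, e.g. 1000 becomes 1024. */
  public static int roundUpToPowerOfTwo(int numDocsPerChunk) {
    if (numDocsPerChunk <= 0 || numDocsPerChunk > (1 << 30)) {
      throw new IllegalArgumentException("Unsupported chunk size: " + numDocsPerChunk);
    }
    return 1 << (32 - Integer.numberOfLeadingZeros(numDocsPerChunk - 1));
  }

  /** Shift that replaces division by a power-of-2 chunk size. */
  public static int shiftFor(int powerOfTwoChunkSize) {
    return Integer.numberOfTrailingZeros(powerOfTwoChunkSize);
  }

  /** Equivalent to docId / chunkSize when chunkSize == 1 << shift and docId >= 0. */
  public static int chunkIndex(int docId, int shift) {
    return docId >>> shift;
  }

  /** Equivalent to docId % chunkSize when chunkSize is a power of 2 and docId >= 0. */
  public static int offsetInChunk(int docId, int powerOfTwoChunkSize) {
    return docId & (powerOfTwoChunkSize - 1);
  }

  public static void main(String[] args) {
    int chunkSize = roundUpToPowerOfTwo(1000);
    int shift = shiftFor(chunkSize);
    int docId = 123456;
    System.out.println("chunkSize = " + chunkSize);
    System.out.println("chunk     = " + chunkIndex(docId, shift) + " (division gives " + (docId / chunkSize) + ")");
    System.out.println("offset    = " + offsetInChunk(docId, chunkSize) + " (remainder gives " + (docId % chunkSize) + ")");
  }
}
```

The equivalence only holds for non-negative document ids and chunk sizes that are exact powers of 2, which is why the rounding has to be enforced by the writer.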
   
   
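   This improves read latency when the index is compressed and the accesses 
are non-contiguous but there are many accesses per chunk. In the benchmark 
below, the V4 runs are about 17% faster than their V3 counterparts for doubles 
and about 31% faster for longs: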
   
   ```
   Benchmark                                                                    (_blockSize)  (_numBlocks)  Mode  Cnt   Score   Error  Units
   BenchmarkFixedByteSVForwardIndexReader.readCompressedDoublesNonContiguousV3         10000          1000  avgt    5  39.976 ± 0.439  ms/op
   BenchmarkFixedByteSVForwardIndexReader.readCompressedDoublesNonContiguousV4         10000          1000  avgt    5  33.110 ± 0.588  ms/op
   BenchmarkFixedByteSVForwardIndexReader.readCompressedLongsNonContiguousV3           10000          1000  avgt    5  46.568 ± 0.440  ms/op
   BenchmarkFixedByteSVForwardIndexReader.readCompressedLongsNonContiguousV4           10000          1000  avgt    5  31.989 ± 0.419  ms/op
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


