siddharthteotia opened a new pull request #5409:
URL: https://github.com/apache/incubator-pinot/pull/5409


   This PR makes a couple of improvements to bit-unpacking.
   
   - Use hand-written unpack methods when the number of bits used to encode 
the dictionaryId is a power of 2 (1, 2, 4, 8, 16, 32). The hand-written 
methods are faster than the generic one because the bit math is simpler.
   - Amortize the overhead of function calls.
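
To illustrate the two points above, here is a minimal sketch for the 4-bit case. It is not the code in this PR: the class and method names are hypothetical, and it assumes values are packed most-significant-bit first into a plain `byte[]`. The generic reader assembles a value bit by bit, the specialized reader needs one load plus a shift or mask, and the bulk method decodes two values per byte load so the per-call overhead is paid once per batch.

```java
// Hypothetical sketch, not the PR's PinotDataBitSetV2.
// Assumption: values are packed MSB-first into a byte[].
final class PowerOfTwoUnpackSketch {

  // Generic path: works for any numBits in [1, 32], assembling the value bit by bit.
  static int readGeneric(byte[] packed, int index, int numBits) {
    long bitOffset = (long) index * numBits;
    int value = 0;
    for (int i = 0; i < numBits; i++) {
      long bit = bitOffset + i;
      int b = (packed[(int) (bit >>> 3)] >>> (7 - (int) (bit & 7))) & 1;
      value = (value << 1) | b;
    }
    return value;
  }

  // Specialized path for 4 bits: a value never straddles a byte boundary,
  // so a single load plus one shift or mask is enough.
  static int read4Bit(byte[] packed, int index) {
    int b = packed[index >>> 1] & 0xFF;
    return (index & 1) == 0 ? (b >>> 4) : (b & 0x0F);
  }

  // Bulk path: one call decodes `length` values, two per byte load.
  static void unpack4Bit(byte[] packed, int startIndex, int length, int[] out) {
    int i = 0;
    int index = startIndex;
    // An odd start index begins in the low nibble of a byte.
    if (length > 0 && (index & 1) != 0) {
      out[i++] = packed[index >>> 1] & 0x0F;
      index++;
    }
    // Main loop: each byte yields two values.
    while (i + 2 <= length) {
      int b = packed[index >>> 1] & 0xFF;
      out[i++] = b >>> 4;
      out[i++] = b & 0x0F;
      index += 2;
    }
    // Tail: one value left in the high nibble of the next byte.
    if (i < length) {
      out[i] = (packed[index >>> 1] & 0xFF) >>> 4;
    }
  }
}
```

For example, with `packed = {0x12, 0x34, 0x56}` the six stored values are 1 through 6, and `unpack4Bit(packed, 1, 4, out)` fills `out` with 2, 3, 4, 5.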
   
   The new code is not yet wired into the existing bit reader and writer. 
A couple of follow-ups will be coming soon:
   
   - Evaluate this optimization for non-power-of-2 bit widths. It is 
feasible, but the performance benefit of a special hand-written unpack 
function seems to get lost because the bit math itself gets complicated, with 
branches, when the number of bits is not a power of 2.
   - Consider a new format where, if the number of bits to encode is not a 
power of 2, we round it up to the next power of 2. This means that if you need 
more than 16 bits, we use 32 bits (the raw value). Packing gives diminishing 
returns at that size: the overhead of unpacking increases while the saving is 
only 10-12 bits. 
   - Integrate the new changes with existing code. 
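
The rounding rule in the second follow-up can be sketched as below. This is an illustration only: the helper name is hypothetical and the format itself is still a proposal.

```java
// Hypothetical helper illustrating the proposed rounding rule.
final class BitWidthRoundingSketch {

  // Rounds a bit width in [1, 32] up to the next power of 2 (1, 2, 4, 8, 16, 32).
  static int roundUpToPowerOfTwoBits(int numBits) {
    if (numBits < 1 || numBits > 32) {
      throw new IllegalArgumentException("numBits must be in [1, 32]: " + numBits);
    }
    if (numBits == 1) {
      return 1;
    }
    // highestOneBit(n - 1) << 1 is the smallest power of 2 that is >= n, for n >= 2.
    return Integer.highestOneBit(numBits - 1) << 1;
  }

  // Extra bits spent per value as a result of the rounding.
  static int paddingBits(int numBits) {
    return roundUpToPowerOfTwoBits(numBits) - numBits;
  }
}
```

Under this rule a 17-bit encoding is stored as 32 bits, and a 20-bit encoding spends 12 extra bits per value compared with exact packing.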
   
   **Description of changes:**
   
   A new version of FixedBitIntReaderWriter has been written. Underneath, it 
uses PinotDataBitSetV2, a new fast bit-unpacking reader.
   
   There are 3 important APIs here:
   
   `public int readInt(int index)` 
   Exists in the current code as well. Used by the scan operator to read 
through the forward index and get the dictId for each docId.
   
   `public void readInt(int startDocId, int length, int[] buffer)`
   Exists in the current code as well. Used by the multi-value bit reader to 
get the dictIds for all MVs in a given cell. 
   
   `public void readValues(int[] docIds, int docIdStartIndex, int docIdLength, 
int[] values, int valuesStartIndex)`
   Exists in the FixedBitSingleColumnSingleValueReader interface and is used by 
the dictionary-based group-by executor to get dictIds for a set of docIds 
(monotonically increasing but not necessarily contiguous). Until now, however, 
the API issued single read calls underneath. This PR introduces the API at the 
FixedBitIntReaderWriterV2 level so that the group-by executor can use bulk 
read semantics.  
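
As a rough sketch of what bulk read semantics could look like for monotonically increasing docIds, the example below splits the docIds into contiguous runs and issues one bulk read per run. This is one possible strategy and not necessarily what the PR implements. The class is hypothetical, a plain `int[]` stands in for the bit-packed forward index, and `readRun` is an assumed helper that plays the role of a bulk unpack call.

```java
// Hypothetical sketch; an int[] stands in for the bit-packed forward index.
final class BulkReadSketch {
  private final int[] dictIds;
  int bulkCalls = 0; // counts bulk reads, for illustration only

  BulkReadSketch(int[] dictIds) {
    this.dictIds = dictIds;
  }

  // Assumed helper standing in for a bulk unpack of `length` consecutive docIds.
  private void readRun(int startDocId, int length, int[] out, int outStart) {
    bulkCalls++;
    System.arraycopy(dictIds, startDocId, out, outStart, length);
  }

  // Same signature as the API described above. docIds must be monotonically
  // increasing but need not be contiguous.
  void readValues(int[] docIds, int docIdStartIndex, int docIdLength,
      int[] values, int valuesStartIndex) {
    int i = docIdStartIndex;
    int end = docIdStartIndex + docIdLength;
    int out = valuesStartIndex;
    while (i < end) {
      // Extend the run while the next docId directly follows the previous one.
      int runEnd = i + 1;
      while (runEnd < end && docIds[runEnd] == docIds[runEnd - 1] + 1) {
        runEnd++;
      }
      int runLength = runEnd - i;
      readRun(docIds[i], runLength, values, out);
      out += runLength;
      i = runEnd;
    }
  }
}
```

With docIds `{1, 2, 3, 6, 7}` this issues two bulk reads (for 1-3 and 6-7) instead of five single reads. When the docIds are sparse, every run has length 1 and the approach degrades to one read per docId.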
   
   When this code is wired in, the scan operator will start using either the 
second or the third API.
   
   Please see the 
[spreadsheet](https://docs.google.com/spreadsheets/d/1mz_TQe0rXadWPtA_Xov6cXwYrSvUpQB1p1b_ZqROTDQ/edit?usp=sharing) 
for performance numbers.
   
   Two kinds of tests were done:
   
   - Compare the performance of sequential consecutive reads through the 
single-read API `getInt(index)` with and without the faster bit-unpacking code.
   - Compare the performance of sequential consecutive reads through the array 
API `readInt(int startDocId, int length, int[] buffer)` with and without the 
faster bit-unpacking code. 
   
   Unit tests will be added. The current PR includes a performance test.
   




