itschrispeck opened a new pull request, #12945:
URL: https://github.com/apache/pinot/pull/12945

   **Background**
   - The V4 format was introduced to better handle chunk sizing for variable 
length data by reducing the potential for large allocations 
(https://github.com/apache/pinot/pull/7661).
   - V3 format allocated direct memory based on `numDocPerChunk * 
lengthOfLongestEntry`, which was very efficient for near-constant length/short 
data. 
   - For example, in each format a null column’s chunk size would be:
     - V3: 1000 docs * 4 bytes (‘null’) = `4KB`
     - V4: `1MB` hardcoded target
   
   **Problem**
   - Making V4 the default results in a large direct memory increase for the 
values we typically see. 
   - With the static 1000 docs/chunk used in V2/V3 (assuming 
`deriveNumDocsPerChunk` is not set), V4's 1MB target only breaks even with V3 
when the `lengthOfLongestEntry` of a column in a segment is ~1KB (1MB / 1000 
docs). Columns with shorter entries allocate more under V4 than under V3.
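   
   The arithmetic behind the breakeven point can be sketched as follows. This 
is only an illustration using the numbers quoted above; the class and method 
names are hypothetical and are not taken from the Pinot codebase.

```java
// Hypothetical sketch; constants mirror the numbers quoted in this description.
final class StaticChunkSizeSketch {
    static final int DOCS_PER_CHUNK = 1000;                // static docs/chunk in V2/V3
    static final int V4_TARGET_CHUNK_BYTES = 1024 * 1024;  // 1MB hardcoded V4 target

    // V3 sizes a chunk as numDocsPerChunk * lengthOfLongestEntry.
    static long v3ChunkBytes(int lengthOfLongestEntry) {
        return (long) DOCS_PER_CHUNK * lengthOfLongestEntry;
    }

    // Smallest longest-entry length (in bytes) at which the V3 chunk
    // is at least as large as the static V4 target.
    static int breakevenEntryLength() {
        return (V4_TARGET_CHUNK_BYTES + DOCS_PER_CHUNK - 1) / DOCS_PER_CHUNK;
    }

    public static void main(String[] args) {
        System.out.println(v3ChunkBytes(4));        // chunk bytes for a 'null' column
        System.out.println(breakevenEntryLength()); // roughly 1KB
    }
}
```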
   
   We have seen this behavior first hand after making V4 default internally. We 
have many columns for which we do not know if they will contain variable length 
data or ‘short data’, and it's desirable to handle both cases with a single 
format.
   
   **Change**
   This PR introduces dynamic chunk sizing for V4 format. Target chunk size is 
calculated based on the heuristic:
   ```
   min(maxLength * DEFAULT_NUM_DOCS_PER_CHUNK, TARGET_MAX_CHUNK_SIZE)
   ```
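   
   A minimal Java sketch of this heuristic is shown below. The constant values 
(1000 docs per chunk, 1MB cap) are assumptions taken from the background 
section, and the class and method names are hypothetical; the actual change may 
differ in details such as rounding, minimum sizes, or per-entry overhead.

```java
// Hypothetical sketch of the heuristic; not the code from this PR.
final class DynamicChunkSizeSketch {
    static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;    // assumed value
    static final int TARGET_MAX_CHUNK_SIZE = 1024 * 1024;  // assumed 1MB cap

    // min(maxLength * DEFAULT_NUM_DOCS_PER_CHUNK, TARGET_MAX_CHUNK_SIZE)
    static int targetChunkSize(int maxLength) {
        // Multiply as long so very long entries cannot overflow int.
        long uncapped = (long) maxLength * DEFAULT_NUM_DOCS_PER_CHUNK;
        return (int) Math.min(uncapped, TARGET_MAX_CHUNK_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(targetChunkSize(4));      // short data: small chunk
        System.out.println(targetChunkSize(100000)); // long data: capped at 1MB
    }
}
```

   Under these assumptions, a column whose longest entry is 4 bytes gets the 
same ~4KB chunk it would have had under V3, while columns with long entries 
remain bounded by the V4 cap.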
   
   In testing, I’ve found that this reduces direct memory spikes, especially 
for wide tables and high QPS. The graph below shows the improvement in direct 
memory spikes for an environment where the majority of tables use a 3-7 day TTL 
and serve ad hoc QPS. Some spikes remain because not all segments with the old 
static chunk size have expired (some tables with a 30 day TTL exist).
   
   
![image](https://github.com/apache/pinot/assets/27231838/f33003cd-141d-478a-93d7-adf9995a4c67)
   
   I think dynamic chunk sizing should be the default implementation for V4, so 
I have not put this behind a config. It bridges the gap between the variable 
length data behavior of V4 and the 'short data' behavior of V2/V3. 
   
   There are no backward compatibility concerns with this PR. 
   
   tags: `performance`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

