richardstartin opened a new issue #7616:
URL: https://github.com/apache/pinot/issues/7616


   We would like to introduce a new format for raw forward indexes that does 
not require a constant number of documents per chunk and instead partitions 
columns based on uncompressed size. We expect this design to provide:
   
   * lower memory consumption when a raw column contains large values
   * fewer chunks than when the number of documents per chunk is derived
   * more balanced chunk sizes than when the number of documents per chunk is 
derived
   * support for realtime segments, by breaking the dependency on column 
statistics for sizing
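
   As a rough illustration of partitioning by uncompressed size rather than 
by document count, here is a minimal sketch. It is not the proposed format or 
any existing Pinot API: the class `SizeBasedChunker`, the method `partition`, 
and the `targetChunkBytes` parameter are hypothetical names used only for this 
example, and compression, chunk metadata, and on-disk layout are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: groups raw values into chunks by uncompressed size.
public class SizeBasedChunker {

  /**
   * Splits values into chunks whose total uncompressed size stays within
   * targetChunkBytes where possible. A single value larger than the target
   * is placed in a chunk of its own.
   */
  public static List<List<byte[]>> partition(List<byte[]> values, int targetChunkBytes) {
    List<List<byte[]>> chunks = new ArrayList<>();
    List<byte[]> current = new ArrayList<>();
    int currentBytes = 0;
    for (byte[] value : values) {
      // Close the current chunk if adding this value would exceed the target.
      if (!current.isEmpty() && currentBytes + value.length > targetChunkBytes) {
        chunks.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(value);
      currentBytes += value.length;
    }
    if (!current.isEmpty()) {
      chunks.add(current);
    }
    return chunks;
  }

  public static void main(String[] args) {
    List<byte[]> values =
        List.of(new byte[3], new byte[3], new byte[10], new byte[2], new byte[2]);
    for (List<byte[]> chunk : partition(values, 6)) {
      System.out.println("documents in chunk: " + chunk.size());
    }
  }
}
```

   Because the chunk boundary depends only on the bytes seen so far, a sketch 
like this needs no column statistics up front, and the number of documents per 
chunk varies with the size of the values.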
   
   The format would be opt-in for the foreseeable future. 
   
   [Design document](https://docs.google.com/document/d/1Y7MyQGmDD2fI7brOOFQtToxd8ML837qRuc3IlNYFvCw/edit?usp=sharing)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


