Jackie-Jiang commented on issue #16618: URL: https://github.com/apache/pinot/issues/16618#issuecomment-3192961981
We should consider supporting the following types of String dictionaries:

1. A true fixed-length dictionary (all values have the same length), where we don't need to handle padding
2. The existing var-length dictionary (value length encoded in the header)
3. The existing fixed-length dictionary with padding (only for backward compatibility)

When creating the dictionary, Pinot should choose between the fixed-length and the var-length format based on the stats of the values.

For the var-length dictionary, we pay a storage overhead of 1 integer per entry to store the entry's offset. We can also consider a 2-level index to reduce this overhead, where the first level is the index for a block and the second level is the index within a block.
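To make the two ideas concrete, here is a minimal Java sketch, not Pinot code. Every name in it (`StringDictionarySketch`, `chooseFormat`, `TwoLevelOffsets`, `BLOCK_SIZE`) is hypothetical. It assumes "stats of the values" means "do all values have the same byte length", a block size of 16 entries, and 2-byte in-block offsets. None of these choices come from the existing dictionary format.

```java
/** Hypothetical sketch only; none of these names are Pinot APIs. */
final class StringDictionarySketch {
  enum Format { FIXED_LENGTH, VAR_LENGTH }

  /** Encodes strings as UTF-8 bytes, the unit in which lengths and offsets are measured here. */
  static byte[][] utf8(String... values) {
    byte[][] out = new byte[values.length][];
    for (int i = 0; i < values.length; i++) {
      out[i] = values[i].getBytes(java.nio.charset.StandardCharsets.UTF_8);
    }
    return out;
  }

  /** Assumed rule: use fixed length (no padding needed) only when every value has the same byte length. */
  static Format chooseFormat(byte[][] values) {
    for (byte[] v : values) {
      if (v.length != values[0].length) {
        return Format.VAR_LENGTH;
      }
    }
    return Format.FIXED_LENGTH;
  }

  /** Two-level offset index: one int per block plus one 2-byte relative offset per entry. */
  static final class TwoLevelOffsets {
    static final int BLOCK_SIZE = 16;  // entries per block (assumed tuning knob)
    private final int[] blockStarts;   // level 1: absolute byte offset where each block begins
    private final short[] inBlock;     // level 2: entry offset relative to its block start (unsigned)
    private final int totalBytes;      // end offset of the last entry

    TwoLevelOffsets(byte[][] values) {
      blockStarts = new int[(values.length + BLOCK_SIZE - 1) / BLOCK_SIZE];
      inBlock = new short[values.length];
      int offset = 0;
      for (int i = 0; i < values.length; i++) {
        int block = i / BLOCK_SIZE;
        if (i % BLOCK_SIZE == 0) {
          blockStarts[block] = offset;
        }
        int relative = offset - blockStarts[block];
        if (relative > 0xFFFF) {
          throw new IllegalArgumentException("Block too large for 2-byte in-block offsets");
        }
        inBlock[i] = (short) relative;
        offset += values[i].length;
      }
      totalBytes = offset;
    }

    /** Absolute start offset of the value for the given dictionary id. */
    int start(int dictId) {
      return blockStarts[dictId / BLOCK_SIZE] + (inBlock[dictId] & 0xFFFF);
    }

    /** Absolute end offset (exclusive) of the value for the given dictionary id. */
    int end(int dictId) {
      return dictId + 1 < inBlock.length ? start(dictId + 1) : totalBytes;
    }

    /** Bytes spent on the index itself, to compare against 4 bytes per entry for a flat int array. */
    int indexBytes() {
      return blockStarts.length * Integer.BYTES + inBlock.length * Short.BYTES;
    }
  }
}
```

Under these assumptions the index costs 2 bytes per entry plus 4 bytes per block, roughly 2.25 bytes per entry at 16 entries per block, compared with 4 bytes per entry for a flat offset array. The trade-off is one extra addition per lookup and a cap of 65,535 bytes on in-block offsets, so a real implementation would need a fallback for blocks with very long values.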
