vvivekiyer opened a new pull request, #12223: URL: https://github.com/apache/pinot/pull/12223
This PR corresponds to issue [12078](https://github.com/apache/pinot/issues/12078).

- Interning is disabled by default. It can be enabled by adding the following config: `"fieldConfigList": [ { "name": "dimInt", "encodingType": "DICTIONARY", "indexTypes": [], "indexes": { "dictionary": { "disabled": false, "onHeap": true, "useVarLengthDictionary": true, "onHeapConfig": { "enableInterning": true, "internerCapacity": 32000000 } } }, "tierOverwrites": null } ]`
- This change helps reduce heap usage for high-cardinality columns that contain a reasonable number of duplicate values.
- The FALFInterner implementation in this PR is based on https://dzone.com/articles/duplicate-strings-how-to-get-rid-of-them-and-save

Heap Usage Before Interning: [image]

Heap Usage After Interning: [image]
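To illustrate the idea behind a fixed-size, lock-free interner of the kind the linked DZone article describes, here is a minimal sketch. It is not the FALFInterner code from this PR: the class name `FixedArrayInterner`, the power-of-two rounding, and the hash-spreading step are assumptions made for illustration. The constructor argument plays a role similar to `internerCapacity`, in that it bounds how many canonical instances the cache can hold.

```java
/**
 * Hypothetical sketch of a fixed-size, array-backed interner.
 * Each value hashes to exactly one slot; a colliding value simply
 * overwrites the slot, so memory use stays bounded by the array size.
 */
final class FixedArrayInterner<T> {
  private final Object[] cache;
  private final int mask;

  FixedArrayInterner(int expectedCapacity) {
    if (expectedCapacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    // Assumption: round up to a power of two (capped at 2^30) so a bit mask
    // can replace the modulo operation when picking a slot.
    int size = 1;
    while (size < expectedCapacity && size < (1 << 30)) {
      size <<= 1;
    }
    cache = new Object[size];
    mask = size - 1;
  }

  @SuppressWarnings("unchecked")
  T intern(T value) {
    if (value == null) {
      return null;
    }
    int h = value.hashCode();
    h ^= (h >>> 16); // fold high bits into the low bits used by the mask
    int slot = h & mask;
    Object cached = cache[slot];
    if (value.equals(cached)) {
      return (T) cached; // reuse the canonical instance
    }
    // No lock: concurrent writers may overwrite each other. The worst case
    // is a missed deduplication, which is acceptable for immutable values
    // such as String.
    cache[slot] = value;
    return value;
  }
}

class InternerDemo {
  public static void main(String[] args) {
    FixedArrayInterner<String> interner = new FixedArrayInterner<>(1024);
    String first = interner.intern(new String("pinot"));
    String second = interner.intern(new String("pinot"));
    System.out.println(first == second); // prints "true"
  }
}
```

Because a collision evicts the previous occupant of a slot, the interner does not guarantee that every duplicate is removed. That trade-off is what keeps it bounded in size and free of locks, and it fits the stated goal of reducing heap usage when a column contains a reasonable number of duplicate values.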