[ https://issues.apache.org/jira/browse/LUCENE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292542#comment-17292542 ]
ASF subversion and git services commented on LUCENE-9816:
---------------------------------------------------------

Commit dade99cb4d70a8d1fd9eecd85741668828f7b874 in lucene-solr's branch refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dade99c ]

LUCENE-9816: lazy-init LZ4-HC hashtable in BlockTreeTermsWriter

LZ4-HC hashtable is heavy (128kb int[] + 128kb short[]) and must be
filled with special values on initialization. This is a lot of overhead
for fields that might not use the compression at all.

Don't initialize this for a field until we see hints that the data might
be compressible and need to use the table in order to test it out.

> lazy-init LZ4-HC hashtable in blocktreewriter
> ---------------------------------------------
>
>                 Key: LUCENE-9816
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9816
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9816.patch
>
>
> Based upon the data for a field, blocktree may compress with LZ4-HC (or with
> simple lowercase compression or none at all).
> But we currently eagerly initialize HC hashtable (132k) for each field
> regardless of whether it will be even "tried". This shows up as top cpu and
> heap hotspot when profiling tests. It creates unnecessary overhead for small
> flushes.
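
For readers who want to see the shape of the change, here is a minimal sketch of the lazy-init pattern the commit message describes. It is not the actual patch: `HeavyTable`, `FieldWriter`, `writeBlock`, and the `looksCompressible` heuristic are hypothetical names invented for illustration, and the `-1` fill value is an assumption standing in for the "special values" mentioned above. Only the array sizes (128kb int[] + 128kb short[]) come from the commit message.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the heavy LZ4-HC table; not Lucene's actual class. */
final class HeavyTable {
    /** Counts constructions so the laziness can be observed (illustration only). */
    static int allocations = 0;

    final int[] hashTable = new int[1 << 15];       // 32768 ints   = 128 KB
    final short[] chainTable = new short[1 << 16];  // 65536 shorts = 128 KB

    HeavyTable() {
        allocations++;
        // Assumed sentinel: the commit only says "special values".
        Arrays.fill(hashTable, -1);
        Arrays.fill(chainTable, (short) -1);
    }
}

/** Hypothetical per-field writer that defers the table until it is needed. */
final class FieldWriter {
    private HeavyTable table; // stays null until a block looks compressible

    void writeBlock(byte[] suffixes) {
        if (looksCompressible(suffixes)) {
            if (table == null) {
                table = new HeavyTable(); // pay the allocation and fill cost once
            }
            // A real writer would now try compression using the table.
        }
        // Otherwise the block is written uncompressed and the table is never touched.
    }

    boolean tableAllocated() {
        return table != null;
    }

    /** Toy heuristic (an assumption): long enough and few distinct byte values. */
    private static boolean looksCompressible(byte[] data) {
        if (data.length < 16) {
            return false;
        }
        boolean[] seen = new boolean[256];
        int distinct = 0;
        for (byte b : data) {
            if (!seen[b & 0xFF]) {
                seen[b & 0xFF] = true;
                distinct++;
            }
        }
        return distinct * 2 < data.length;
    }
}
```

With this shape, a field whose blocks never look compressible never allocates or fills the two arrays, which is the overhead the issue reports for small flushes. A field that does trigger the hint allocates the table once and reuses it for later blocks.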