[ https://issues.apache.org/jira/browse/LUCENE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292542#comment-17292542 ]

ASF subversion and git services commented on LUCENE-9816:
---------------------------------------------------------

Commit dade99cb4d70a8d1fd9eecd85741668828f7b874 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dade99c ]

LUCENE-9816: lazy-init LZ4-HC hashtable in BlockTreeTermsWriter

The LZ4-HC hashtable is heavy (a 128 KB int[] plus a 128 KB short[]) and
must be filled with special values on initialization. This is a lot of
overhead for fields that might not use the compression at all.

Don't initialize this for a field until we see hints that the data might
be compressible and need to use the table in order to test it out.
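
The lazy-initialization idea described above can be sketched roughly as
follows. This is a minimal illustration, not the code from the commit: the
class names (LazyHcTableSketch, HeavyTable, FieldWriter), the sentinel
values, and the looksCompressible heuristic are all hypothetical stand-ins.
Only the array sizes (128 KB int[] + 128 KB short[]) come from the commit
message.

```java
import java.util.Arrays;

/** Hypothetical sketch; names are illustrative and are not Lucene's API. */
final class LazyHcTableSketch {

  /** Stand-in for a heavy LZ4-HC style table: 128 KB int[] + 128 KB short[]. */
  static final class HeavyTable {
    final int[] hashTable = new int[1 << 15];      // 32768 ints   = 128 KB
    final short[] chainTable = new short[1 << 16]; // 65536 shorts = 128 KB

    HeavyTable() {
      // Filling both arrays with sentinel values is the cost worth deferring.
      // The exact sentinels used here are an assumption.
      Arrays.fill(hashTable, -1);
      Arrays.fill(chainTable, (short) 0xFFFF);
    }
  }

  /** Per-field writer state: the table starts as null and is built on first use. */
  static final class FieldWriter {
    private HeavyTable table; // stays null until compression is actually attempted

    /** Returns true if the block was handed to the (omitted) compressor. */
    boolean writeBlock(byte[] block) {
      if (!looksCompressible(block)) {
        return false; // written uncompressed; no table is allocated
      }
      if (table == null) {
        table = new HeavyTable(); // pay the allocation and fill cost only now
      }
      // A real writer would run LZ4-HC here using `table`.
      return true;
    }

    boolean tableInitialized() {
      return table != null;
    }

    /** Toy hint (an assumption): at least 8 bytes and some byte value repeats. */
    private static boolean looksCompressible(byte[] block) {
      if (block.length < 8) {
        return false;
      }
      boolean[] seen = new boolean[256];
      for (byte b : block) {
        if (seen[b & 0xFF]) {
          return true;
        }
        seen[b & 0xFF] = true;
      }
      return false;
    }
  }
}
```

With this shape, a field whose blocks never look compressible never pays
for the two arrays, which is the saving the commit is after for small
flushes.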


> lazy-init LZ4-HC hashtable in blocktreewriter
> ---------------------------------------------
>
>                 Key: LUCENE-9816
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9816
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Priority: Major
>         Attachments: LUCENE-9816.patch
>
>
> Based upon the data for a field, blocktree may compress with LZ4-HC (or with 
> simple lowercase compression, or not at all).
> But we currently eagerly initialize the HC hashtable (132k) for each field, 
> regardless of whether it will even be "tried". This shows up as a top CPU and 
> heap hotspot when profiling tests. It creates unnecessary overhead for small 
> flushes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

