gf2121 opened a new pull request, #12604: URL: https://github.com/apache/lucene/pull/12604
### Description

https://blunders.io/jfr-demo/indexing-4kb-2023.09.25.18.03.36/allocations-drill-down

The nightly benchmark shows that `FSTCompiler#init` accounts for most of the memory allocated during indexing. This is because `FSTCompiler#init` always allocates 32k bytes, since the `bytesPageBits` parameter defaults to 15.

I recorded the usage of BytesStore (the value of `getPosition()` when `BytesStore#finish` is called) during wikimediumall indexing, and the result shows that 99% of FSTs use no more than 1k bytes.

```
BytesStore#finish called: 1000000 times

min: 1
mid: 16
avg: 64.555987
pct75: 28
pct90: 57
pct99: 525
pct999: 4957
pct9999: 29124
max: 631700
```

This PR proposes to reduce the block size of `FST` in `Lucene90BlockTreeTermsWriter`.

closes https://github.com/apache/lucene/issues/12598
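
To make the arithmetic behind the numbers above concrete, here is a minimal standalone sketch. It is not Lucene code: the class `PageSizeSketch` and both helper methods are hypothetical names used only for illustration. It assumes, as the description implies, that one page is `1 << bytesPageBits` bytes, and it reports the smallest page-bits value whose single page would cover each measured percentile. It does not state which value the PR actually picks.

```java
// Hypothetical illustration only; none of these names exist in Lucene.
final class PageSizeSketch {

    /** Bytes in one page for a given page-bits setting (2^pageBits). */
    static int pageSize(int pageBits) {
        return 1 << pageBits;
    }

    /** Smallest page-bits value whose single page holds at least {@code bytes} bytes. */
    static int minPageBitsFor(long bytes) {
        int bits = 0;
        while ((1L << bits) < bytes) {
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        // Percentiles copied from the measurement in the description.
        String[] labels = {"mid", "pct75", "pct90", "pct99", "pct999", "pct9999", "max"};
        long[] observed = {16, 28, 57, 525, 4957, 29124, 631700};

        System.out.println("default page (bits=15): " + pageSize(15) + " bytes");
        for (int i = 0; i < labels.length; i++) {
            int bits = minPageBitsFor(observed[i]);
            System.out.println(labels[i] + "=" + observed[i]
                + " bytes fits in one page at bits=" + bits
                + " (" + pageSize(bits) + " bytes)");
        }
    }
}
```

Under these assumptions a single 1024-byte page (10 bits) already covers the measured pct99 of 525 bytes, while the 15-bit default reserves 32768 bytes up front for every FST.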