luyuncheng commented on PR #11987: URL: https://github.com/apache/lucene/pull/11987#issuecomment-1337530535
> the idea is not to have one instance per segment.

> but it means we can reuse buffers per-search and per-merge rather than creating per-document garbage. basically it would work like every other part of lucene.

@rmuir That is absolutely right!

How about decoupling the buffer from the `Decompressor`? When an instance wants to reuse the buffer per merge or while retrieving docs, the `reader` or `BlockState` would own the buffer's lifecycle, rather than every decompressor owning one.

In commit [2f676e6](https://github.com/apache/lucene/pull/11987/commits/2f676e6d16afdc3ce890899ab961264de7b5d4b5), I moved the buffer into `BytesRef` and let the `BytesRef` decide the buffer's lifecycle.

During a merge, the bytes buffer is reused, as in this code: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene90/compressing/Lucene90CompressingStoredFieldsReader.java#L527-L532

Benchmarks show almost no performance regression with enableBulkMerge=false and -Xmx256m:

| | Baseline | Candidate |
| :--- | :----: | ---: |
| indexing_time_msec | | |
| BEST_SPEED | 365247.00 | 370257.00 |
| BEST_COMPRESSION | 850573.00 | 850017.00 |
| retrieved_time_msec | | |
| BEST_SPEED | 256.48 | 268.60 |
| BEST_COMPRESSION | 2646.73 | 2593.71 |
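
To illustrate the ownership idea discussed in this comment (the caller owns the buffer, the decompressor only fills it), here is a minimal, self-contained sketch. It does not use Lucene's classes: `Buf` is a hypothetical stand-in for `BytesRef`, `BufferReuseSketch` and its methods are made-up names, and `java.util.zip` is swapped in for the codec's real compression so the example runs on a plain JDK (11+).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class BufferReuseSketch {

  /** Hypothetical stand-in for BytesRef: a byte array plus the number of valid bytes. */
  static final class Buf {
    byte[] bytes = new byte[0];
    int length;
    int grows; // how many times the array had to be reallocated
  }

  /** Demo helper only: deflate a byte array so there is something to decompress. */
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater(Deflater.BEST_SPEED);
    deflater.setInput(raw);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] chunk = new byte[256];
    while (!deflater.finished()) {
      out.write(chunk, 0, deflater.deflate(chunk));
    }
    deflater.end();
    return out.toByteArray();
  }

  /**
   * Decompresses into a buffer owned by the caller. The array is reallocated
   * only when it is too small; otherwise the previous array is overwritten.
   * (For brevity a new Inflater is created per call; that could be reused too.)
   */
  static void decompress(byte[] compressed, int originalLength, Buf out) {
    if (out.bytes.length < originalLength) {
      out.bytes = new byte[originalLength];
      out.grows++;
    }
    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      int n = 0;
      while (n < originalLength) {
        int read = inflater.inflate(out.bytes, n, originalLength - n);
        if (read == 0) {
          throw new IllegalArgumentException("compressed data ended early");
        }
        n += read;
      }
      out.length = n;
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("corrupt compressed data", e);
    } finally {
      inflater.end();
    }
  }

  public static void main(String[] args) {
    String[] docs = {"a".repeat(100), "b".repeat(50), "c".repeat(80)};
    Buf shared = new Buf(); // owned by the caller (e.g. one merge), not by the decompressor
    for (String doc : docs) {
      byte[] raw = doc.getBytes(StandardCharsets.UTF_8);
      decompress(compress(raw), raw.length, shared);
      String roundTrip = new String(shared.bytes, 0, shared.length, StandardCharsets.UTF_8);
      System.out.println(roundTrip.equals(doc) + " grows=" + shared.grows);
    }
  }
}
```

In this sketch the shared buffer is allocated once for the first (largest) document and then overwritten for the later ones. A caller that needs a private copy per document would pass a fresh `Buf` each time.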