jpountz opened a new pull request, #892: URL: https://github.com/apache/lucene/pull/892
With this change, if the first segment of the merge is too dirty, stored fields merging still performs a bulk merge up to the first dirty chunk. This greatly improves merge performance in the degenerate case where a large segment keeps being rewritten because it keeps getting merged with tiny segments. Before this change, luceneutil's StoredFieldsBenchmark runs in 78s on my machine with 1M docs and `BEST_COMPRESSION`; after this change, it runs in 22s. The reason is that merging a clean large segment with 9 small segments produces a new large segment whose dirty chunks are located at the very end, so the next merge can still bulk-copy most of the data.
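The idea can be sketched as follows. This is a minimal, hypothetical simulation (not the actual Lucene `Lucene90CompressingStoredFieldsWriter` code): clean chunks at the front of a segment are bulk-copied as raw compressed bytes, and only from the first dirty chunk onward does the merge fall back to the slow path of decompressing and re-adding documents one by one. `Chunk`, `merge`, and the doc-count bookkeeping are all illustrative names, not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkMergeSketch {
    // Hypothetical stand-in for a stored-fields chunk: a flag for whether
    // it is "dirty" (incomplete/suboptimal) and the number of docs it holds.
    static final class Chunk {
        final boolean dirty;
        final int docCount;
        Chunk(boolean dirty, int docCount) {
            this.dirty = dirty;
            this.docCount = docCount;
        }
    }

    /**
     * Merges one segment's chunks into {@code mergedChunkSizes}.
     * Returns the number of docs copied via the fast bulk path.
     */
    static int merge(List<Chunk> segment, List<Integer> mergedChunkSizes) {
        int bulkCopied = 0;
        int i = 0;
        // Fast path: bulk-copy the leading run of clean chunks verbatim,
        // without decompressing and recompressing them.
        while (i < segment.size() && !segment.get(i).dirty) {
            bulkCopied += segment.get(i).docCount;
            mergedChunkSizes.add(segment.get(i).docCount);
            i++;
        }
        // Slow path: from the first dirty chunk onward, re-add documents
        // one at a time so they get recompressed into fresh chunks.
        for (; i < segment.size(); i++) {
            for (int d = 0; d < segment.get(i).docCount; d++) {
                mergedChunkSizes.add(1);
            }
        }
        return bulkCopied;
    }

    public static void main(String[] args) {
        // Two clean chunks followed by a dirty one: the first 256 docs
        // take the bulk path, the rest go through the slow path.
        List<Chunk> seg = List.of(new Chunk(false, 128), new Chunk(false, 128),
                                  new Chunk(true, 5), new Chunk(false, 128));
        List<Integer> out = new ArrayList<>();
        System.out.println("bulk-copied docs: " + merge(seg, out));
    }
}
```

The key point matches the PR description: only the prefix of clean chunks qualifies for bulk copying, so a merge that appends tiny (dirty) segments after a large clean one leaves the dirty chunks at the end, and the next merge can again bulk-copy most of the data.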