jpountz opened a new pull request, #892: URL: https://github.com/apache/lucene/pull/892
With this change, if the first segment of the merge is too dirty, stored fields merging still performs a bulk merge up to the first dirty chunk. This greatly improves merge performance in the degenerate case where a large segment keeps being rewritten because it keeps getting merged with tiny segments. Before this change, luceneutil's StoredFieldsBenchmark runs in 78s on my machine with 1M docs and `BEST_COMPRESSION`; after this change, it runs in 22s. The reason is that merging a clean large segment with 9 small segments produces a new large segment whose dirty chunks are located at the very end, so the next merge can still bulk-copy most of the data.
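The idea can be sketched as follows. This is a minimal, hypothetical simulation (not the actual Lucene `Lucene90CompressingStoredFieldsWriter` code): clean chunks at the front of a segment are bulk-copied as raw compressed bytes, and only from the first dirty chunk onward does the merge fall back to the slow path of decompressing and re-adding documents one by one. `Chunk`, `merge`, and the doc-count bookkeeping are all illustrative names, not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkMergeSketch {
    // Hypothetical stand-in for a stored-fields chunk: a flag for whether
    // it is "dirty" (incomplete/suboptimal) and the number of docs it holds.
    static final class Chunk {
        final boolean dirty;
        final int docCount;
        Chunk(boolean dirty, int docCount) {
            this.dirty = dirty;
            this.docCount = docCount;
        }
    }

    /**
     * Merges one segment's chunks into {@code mergedChunkSizes}.
     * Returns the number of docs copied via the fast bulk path.
     */
    static int merge(List<Chunk> segment, List<Integer> mergedChunkSizes) {
        int bulkCopied = 0;
        int i = 0;
        // Fast path: bulk-copy the leading run of clean chunks verbatim,
        // without decompressing and recompressing them.
        while (i < segment.size() && !segment.get(i).dirty) {
            bulkCopied += segment.get(i).docCount;
            mergedChunkSizes.add(segment.get(i).docCount);
            i++;
        }
        // Slow path: from the first dirty chunk onward, re-add documents
        // one at a time so they get recompressed into fresh chunks.
        for (; i < segment.size(); i++) {
            for (int d = 0; d < segment.get(i).docCount; d++) {
                mergedChunkSizes.add(1);
            }
        }
        return bulkCopied;
    }

    public static void main(String[] args) {
        // Two clean chunks followed by a dirty one: the first 256 docs
        // take the bulk path, the rest go through the slow path.
        List<Chunk> seg = List.of(new Chunk(false, 128), new Chunk(false, 128),
                                  new Chunk(true, 5), new Chunk(false, 128));
        List<Integer> out = new ArrayList<>();
        System.out.println("bulk-copied docs: " + merge(seg, out));
    }
}
```

The key point matches the PR description: only the prefix of clean chunks qualifies for bulk copying, so a merge that appends tiny (dirty) segments after a large clean one leaves the dirty chunks at the end, and the next merge can again bulk-copy most of the data.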