jpountz opened a new pull request, #13294: URL: https://github.com/apache/lucene/pull/13294
You need as many merge threads as necessary to make sure that merges can keep up with indexing. But this number depends on the data that you are indexing: if you are only indexing stored fields, merges can copy compressed data directly, and merging is only a small fraction of the total indexing+flushing+merging cost. But if you primarily index KNN vectors, merging N docs may require about as much work as flushing N docs. Add the fact that documents typically go through multiple rounds of merging, and the merging cost can end up being more than half of the total indexing+flushing+merging cost.

This change proposes to update the default number of merge threads assuming an intermediate scenario where merges perform about half of the total indexing+flushing+merging work, i.e. it gives half the threads of the system to merges.

One goal of this change is to no longer have to configure a custom number of merge threads on nightly benchmarks, which run on a highly concurrent machine.
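As a rough illustration of the "half the threads of the system" rule described above, here is a minimal sketch in plain Java. The class name `MergeThreadDefaults` and the method `mergeThreads` are hypothetical and exist only for this example; this is not the code from the pull request, and the floor of one thread is an assumption added so that single-core machines can still merge.

```java
/**
 * Hypothetical helper illustrating the proposed default:
 * give about half of the available CPU threads to merges.
 * Not the actual Lucene implementation.
 */
public final class MergeThreadDefaults {

    private MergeThreadDefaults() {}

    /**
     * Returns half of the given core count, rounded down,
     * but never less than 1 (assumed floor, so merges can always run).
     */
    static int mergeThreads(int availableCores) {
        if (availableCores < 1) {
            throw new IllegalArgumentException("availableCores must be >= 1");
        }
        return Math.max(1, availableCores / 2);
    }

    public static void main(String[] args) {
        // Number of hardware threads the JVM reports for this machine.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " cores -> " + mergeThreads(cores) + " merge threads");
    }
}
```

Applications that need a different split can still set an explicit value, for example through `ConcurrentMergeScheduler#setMaxMergesAndThreads(maxMergeCount, maxThreadCount)`.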