[PR] Increase the default number of merge threads. [lucene]

via GitHub Thu, 11 Apr 2024 02:11:31 -0700


jpountz opened a new pull request, #13294:
URL: https://github.com/apache/lucene/pull/13294


   You need as many merge threads as necessary to make sure that merges can 
keep up with indexing. But this number depends on the data that you are 
indexing: if you are only indexing stored fields, merges can copy compressed 
data directly and merges are only a small fraction of the total 
indexing+flushing+merging cost. But if you primarily index KNN vectors, merging N 
docs may require about as much work as flushing N docs. If you add the fact 
that documents typically go through multiple rounds of merging, the merging 
cost can end up being more than half of the total indexing+flushing+merging 
cost.
   
   This change proposes to update the default number of merge threads assuming 
an intermediate scenario where merges perform about half of the total 
indexing+flushing+merging work, i.e. it gives half the threads of the system to 
merges.
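   
   As a rough illustration of the "half the threads" rule described above, 
here is a minimal sketch. It is not the code from this pull request: the class 
`MergeThreadDefaults` and the method `mergeThreadsFor` are hypothetical names, 
and the floor of one thread is an assumption made so that single-core machines 
can still merge.

```java
// Hypothetical helper, not part of Lucene: illustrates giving half of the
// system's cores to merges.
final class MergeThreadDefaults {
    private MergeThreadDefaults() {}

    /**
     * Returns half of the given core count, rounded down.
     * Assumption: never return fewer than one merge thread.
     */
    static int mergeThreadsFor(int coreCount) {
        if (coreCount < 1) {
            throw new IllegalArgumentException("coreCount must be >= 1, got " + coreCount);
        }
        return Math.max(1, coreCount / 2);
    }

    public static void main(String[] args) {
        // availableProcessors() reports the cores visible to the JVM.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " cores -> " + mergeThreadsFor(cores) + " merge threads");
    }
}
```

   A value computed this way could also be applied explicitly through 
`ConcurrentMergeScheduler#setMaxMergesAndThreads(maxMergeCount, maxThreadCount)`, 
where `maxMergeCount` must be at least `maxThreadCount`; which merge count to 
pair with it is left open here.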
   
   One goal of this change is to no longer have to configure a custom number of 
merge threads on nightly benchmarks, which run on a highly concurrent machine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

