Two more tidbits to add to Shawn’s explanation:

There are heuristics built in to ConcurrentMergeScheduler. From the Javadocs:

 * If it's an SSD,
 * {@code maxThreadCount} is set to {@code max(1, min(4, cpuCoreCount/2))},
 * otherwise 1. Note that detection only currently works on
 * Linux; other platforms will assume the index is not on an SSD.
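
To make the Javadoc formula concrete, here is a minimal sketch of that arithmetic. It is not the actual ConcurrentMergeScheduler source: the class name, the method name, and the boolean SSD flag are all made up for illustration, and the real SSD detection is not modeled at all.

```java
// Illustrative sketch only: mirrors the formula quoted from the Javadocs,
// not the real ConcurrentMergeScheduler code. All names here are hypothetical.
final class MergeThreadHeuristic {

    // Returns the maxThreadCount the quoted heuristic would pick.
    // indexOnSsd stands in for Lucene's own (Linux-only) detection.
    static int autoMaxThreadCount(boolean indexOnSsd, int cpuCoreCount) {
        if (!indexOnSsd) {
            return 1; // spinning disk, or a platform where detection does not work
        }
        // Integer division: 1 core gives 0 here, which max(1, ...) lifts back to 1.
        return Math.max(1, Math.min(4, cpuCoreCount / 2));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("SSD, " + cores + " cores: "
                + autoMaxThreadCount(true, cores));
        System.out.println("Not SSD, " + cores + " cores: "
                + autoMaxThreadCount(false, cores));
    }
}
```

So under this formula an SSD-backed index never gets more than 4 merge threads automatically, no matter how many cores the machine has.
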
Second, TieredMergePolicy (the default) merges in “tiers” that are of similar size. So you can have multiple merges going on at the same time on disjoint sets of segments.

Best,
Erick

> On Jul 3, 2019, at 7:54 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 7/2/2019 10:53 PM, Rahul Goswami wrote:
>> Hi Shawn,
>> Thank you for the detailed suggestions. Although, I would like to
>> understand the maxMergeCount and maxThreadCount params better. The
>> documentation
>> <https://lucene.apache.org/solr/guide/7_3/indexconfig-in-solrconfig.html#mergescheduler>
>> mentions
>> that
>> maxMergeCount : The maximum number of simultaneous merges that are allowed.
>> maxThreadCount : The maximum number of simultaneous merge threads that
>> should be running at once
>> Since one thread can only do 1 merge at any given point of time, how does
>> maxMergeCount being greater than maxThreadCount help anyway? I am having
>> difficulty wrapping my head around this, and would appreciate if you could
>> help clear it for me.
>
> The maxMergeCount setting controls the number of merges that can be
> *scheduled* at the same time. As soon as that number of merges is reached,
> the indexing thread(s) will be paused until the number of merges in the
> schedule drops below this number. This ensures that no more merges will be
> scheduled.
>
> By setting maxMergeCount higher than the number of merges that are expected
> in the schedule, you can ensure that indexing will never be paused. It would
> require very atypical merge policy settings for the number of scheduled
> merges to ever reach six. On my own indexing, I reached three scheduled
> merges quite frequently. The default setting for maxMergeCount is three.
>
> The maxThreadCount setting controls how many of the scheduled merges will be
> simultaneously executed. With index data on standard spinning disks, you do
> not want to increase this number beyond 1, or you will have a performance
> problem due to thrashing disk heads. If your data is on SSD, you can make it
> larger than 1.
>
> Thanks,
> Shawn
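
The difference between the two settings in Shawn's explanation can be sketched as a toy counter model. This is only an illustration of the behavior described above, not how Lucene implements it: the class and its methods are hypothetical, there are no real threads, and a "merge" is just an incremented count.

```java
// Toy model of the behavior described in the quoted explanation.
// Hypothetical names throughout; this is not Lucene code.
final class MergeScheduleModel {
    private final int maxMergeCount;   // limit on merges in the schedule
    private final int maxThreadCount;  // limit on merges executing at once
    private int scheduled;             // merges in the schedule (running + waiting)

    MergeScheduleModel(int maxMergeCount, int maxThreadCount) {
        this.maxMergeCount = maxMergeCount;
        this.maxThreadCount = maxThreadCount;
    }

    void scheduleMerge() { scheduled++; }

    void finishMerge() { if (scheduled > 0) scheduled--; }

    // Only maxThreadCount of the scheduled merges execute simultaneously.
    int running() { return Math.min(scheduled, maxThreadCount); }

    // The remaining scheduled merges wait for a free merge thread.
    int waiting() { return scheduled - running(); }

    // Indexing is paused once the schedule reaches maxMergeCount and
    // resumes when the count drops below it again.
    boolean indexingPaused() { return scheduled >= maxMergeCount; }

    public static void main(String[] args) {
        // maxMergeCount=3 (the default mentioned above), maxThreadCount=1.
        MergeScheduleModel m = new MergeScheduleModel(3, 1);
        for (int i = 1; i <= 3; i++) {
            m.scheduleMerge();
            System.out.println("scheduled=" + i
                    + " running=" + m.running()
                    + " waiting=" + m.waiting()
                    + " indexingPaused=" + m.indexingPaused());
        }
    }
}
```

In this model, a maxMergeCount above maxThreadCount simply leaves room for merges to queue behind the running one without stalling indexing, which is the point of Shawn's answer.
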