On 9/7/2017 4:25 AM, yasoobhaider wrote:
> So I did a little more digging around why the merging is taking so
> long, and it looks like merging postings is the culprit. On the 5.4
> version, merging 500 docs is taking approximately 100 msec, while on
> the 6.6 version, it is taking more than 3000 msec. The difference
> seems to get worse when more docs are being merged. Any ideas why this
> may be the case?
The rest of this thread has been completely lost here; I only found the
earlier messages by going to Nabble, which mirrors the mailing list in
forum format. The mailing list itself is the canonical repository.

Setting ramBufferSizeMB to nearly 5 gigabytes is only going to help if
the documents you are indexing into Solr are enormous -- many megabytes
of text in each one. Testing by Solr developers has shown that values
above about 128MB typically provide no performance advantage with
normal-sized documents. Your commit characteristics will have far more
influence on segment size than ramBufferSizeMB does. The default
ramBufferSizeMB value in modern Solr versions is 100.

Assuming we are dealing with relatively small documents, I would
recommend the settings below, removing ramBufferSizeMB,
mergePolicyFactory, and maxBufferedDocs entirely. Note that autoCommit
and autoSoftCommit belong in the updateHandler section of
solrconfig.xml, while mergeScheduler goes in indexConfig (see the P.S.
for a sketch of the layout):

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>600000</maxTime>
</autoSoftCommit>

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxMergeCount">6</int>
  <int name="maxThreadCount">1</int>
</mergeScheduler>

If your data is on standard spinning disks, you want maxThreadCount at
one. If it's on SSD, you can raise it a little, but I wouldn't go
beyond about 2 or 3. On standard disks, multiple threads writing merged
segments at the same time will make the disk thrash excessively, and
I/O will slow to a crawl.

If the documents are huge, then you can raise ramBufferSizeMB, but five
gigabytes is REALLY BIG and will require a very large heap.

If there is a good reason to increase the values in mergePolicy, this
is what I would recommend:

<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">30</int>
  <int name="segmentsPerTier">30</int>
  <int name="maxMergeAtOnceExplicit">90</int>
</mergePolicyFactory>

The settings I've described here may help, or they may do nothing at
all. If they don't help, then the problem may be memory-related, which
is a whole separate discussion.

When Lucene says "too many merge threads, stalling", it means that many
merges are scheduled at the same time, which usually means that
multiple *levels* of merging are scheduled -- one merge that combines a
bunch of initial-level segments into a second-level segment, another
that combines several second-level segments into a third-level segment,
and so on. The "stalling" means that the *indexing* thread is paused
until the number of scheduled merges drops below maxMergeCount. If this
is happening with maxMergeCount at eight, it is likely because of your
current autoCommit maxDocs setting of 10000: each initial segment is
very small, so there are a LOT of segments that need merging. The
autoCommit and autoSoftCommit settings I provided should make that less
of a problem.

Merging segments is slower than the raw speed of your disks, because
Lucene must gather a lot of information from each source segment and
combine it all in memory before it can write the new segment. That
gathering and combining is much slower than modern disk transfer rates.

Thanks,
Shawn
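
P.S. To make the placement concrete, here is a minimal sketch of how
the elements above fit into solrconfig.xml. This assumes the stock
layout of the file; the class attribute on updateHandler is the
standard one, and everything not shown is left at its defaults:

<!-- Commit settings live under updateHandler, not indexConfig. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit once a minute, without opening a new searcher. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit every ten minutes controls document visibility. -->
  <autoSoftCommit>
    <maxTime>600000</maxTime>
  </autoSoftCommit>
</updateHandler>

<!-- Merge tuning lives under indexConfig. -->
<indexConfig>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">6</int>
    <int name="maxThreadCount">1</int>
  </mergeScheduler>
</indexConfig>

With this arrangement, new segments are created on a time schedule
rather than every 10000 docs, so they are fewer and larger, and the
merge scheduler is far less likely to pile up enough merges to stall
indexing.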