On 5/17/2012 3:01 PM, Dyer, James wrote:
Do you think this behavior is because, while the indexing is paused, you reach
some type of timeout so either your db or the jdbc cuts the connection? Or, are
you thinking something in the DIH/JDBCDataSource code is causing the connection
to drop under these circumstances?
I'm almost positive that it's the db or jdbc that cuts the connection,
probably the former. The last time I ran into it (which was before Solr
3.5), Solr's indexing was paused for eight minutes while merges
finished, and I think we have a five-minute timeout. I don't think I
saw the same exception that Jon is seeing, but I don't have a record, so
I can't check.
My test server is SATA RAID1, and I have also done some indexing onto a
USB2/SATA drive, which is SLOW. I've never run into the timeout problem
on my production system, but those machines are running six 1TB drives
in RAID10. Lots of IOPS.
With an effective mergeFactor of 35, I merge much less often and I never
see a third-level merge. I haven't calculated how big my index has to
get before I will see a third-level merge, but with my settings (see
below, because I modified the config snippet I pasted in earlier) I
should keep indexing even with three merges happening.
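As a rough way to estimate when a third-level merge would show up, here
is a minimal sketch. It assumes a simplified model in which a merge at
each level combines exactly mergeFactor segments from the level below.
TieredMergePolicy picks merges by segment size and does not strictly
follow that model, so treat the numbers as an approximation. The class
and method names are made up for illustration.

```java
// Simplified model (an assumption, not how TieredMergePolicy is specified):
// a level-N merge needs mergeFactor segments produced by level N-1, and a
// level-0 segment is one flush of the RAM buffer.
class MergeLevelEstimate {

    // Number of flushed segments that must accumulate before the first
    // merge at the given level can happen: mergeFactor ^ level.
    static long flushedSegmentsForLevel(int mergeFactor, int level) {
        long segments = 1;
        for (int i = 0; i < level; i++) {
            // multiplyExact throws ArithmeticException on overflow
            // instead of silently wrapping.
            segments = Math.multiplyExact(segments, (long) mergeFactor);
        }
        return segments;
    }

    public static void main(String[] args) {
        int mergeFactor = 35; // the effective mergeFactor from the config
        for (int level = 1; level <= 3; level++) {
            System.out.println("level " + level + " merge after "
                    + flushedSegmentsForLevel(mergeFactor, level)
                    + " flushed segments");
        }
    }
}
```

Under that model, a third-level merge would need 35 * 35 * 35 = 42875
flushed segments, compared with 1000 for a mergeFactor of 10.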
Solr 3.6 API for ConcurrentMergeScheduler:
http://bit.ly/JNmNY4
I removed one line from the indexDefaults that I pasted in earlier: my
real config also sets maxThreadCount to 4, even though I am not doing a
multithreaded DIH. I left it out because I thought it might be confusing
to have it there. Turns out that was a bad idea. After looking at the 3.6
source code for ConcurrentMergeScheduler, I believe that maxThreadCount is
the setting that is required. maxMergeCount defaults to maxThreadCount
plus two, so setting maxMergeCount explicitly would not actually be
required, as long as maxThreadCount is set to at least 4. Without the
explicit configuration, maxThreadCount would default to three or less,
depending on how many CPUs you have. The relevant defaults from the
source:
private int maxThreadCount = Math.max(1, Math.min(3, Runtime.getRuntime().availableProcessors()/2));
private int maxMergeCount = maxThreadCount+2;
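To make those defaults concrete, here is a small sketch that evaluates
the same two expressions for a few core counts. The class and method
names are hypothetical and not part of Lucene, and the core counts are
just sample values. It only mirrors the arithmetic of the initializers
quoted above.

```java
// Illustrative only: reproduces the arithmetic of the two default
// initializers quoted from ConcurrentMergeScheduler, with the processor
// count passed in as a parameter instead of read from the Runtime.
class MergeSchedulerDefaults {

    // Same expression as the quoted maxThreadCount initializer.
    static int defaultMaxThreadCount(int availableProcessors) {
        return Math.max(1, Math.min(3, availableProcessors / 2));
    }

    // Same expression as the quoted maxMergeCount initializer.
    static int defaultMaxMergeCount(int maxThreadCount) {
        return maxThreadCount + 2;
    }

    public static void main(String[] args) {
        int[] cores = {1, 2, 4, 6, 8, 16}; // sample core counts
        for (int c : cores) {
            int threads = defaultMaxThreadCount(c);
            System.out.println(c + " cores: maxThreadCount=" + threads
                    + ", maxMergeCount=" + defaultMaxMergeCount(threads));
        }
    }
}
```

By this arithmetic the default maxThreadCount never exceeds 3 and the
default maxMergeCount never exceeds 5, no matter how many cores the
machine has.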
<indexDefaults>
  <useCompoundFile>false</useCompoundFile>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
    <int name="maxMergeAtOnceExplicit">105</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">4</int>
    <int name="maxThreadCount">4</int>
  </mergeScheduler>
  <ramBufferSizeMB>128</ramBufferSizeMB>
  <maxFieldLength>32768</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>10000</commitLockTimeout>
  <lockType>native</lockType>
</indexDefaults>