I am using DIH with the MySQL connector to import data into my index.
When doing a full import in my Solr 3.1 test environment, it sometimes loses
its connection to the database and ends up rolling back the import. My
import configuration uses a single query, so there's no possibility of a
reconnect fixing this. Visit http://pastebin.com/Ya9DBMEP for the error
log. I'm using mysql-connector-java-5.1.15-bin.jar.

This seems to happen when Solr is busy with multiple segment merges:
with two merges partially complete and a third under way, ongoing
indexing activity stops for several minutes.
Indexing activity seems to be fine up until there are three merges in
progress.

This is a virtual environment using Xen on CentOS 5, with two VMs. The host
has SATA RAID1, so there's not a lot of I/O capacity. When both virtual
machines are busy indexing, the host can't keep up with the load, and one
segment merge doesn't have time to complete before Solr has built up enough
segments to start another one, which puts the first one on hold. If I
build one virtual machine at a time, this doesn't happen, but then it
takes twice as long. My 1.4.1 production system builds all six shards
at the same time when it's doing a full rebuild, but that's using RAID10.

I grabbed a sniffer trace of the MySQL connection from the database
server. After the last actual data packet in the capture, there is a
173-second pause followed by a "Request Quit" packet from the VM, then
the connection is torn down normally.

My best guess right now is that the "idle-timeout-minutes" setting in
JDBC is coming into play here during my single query, and that it's set
to 3 minutes. I can't find the default value for this setting documented
anywhere online, and I do not see it mentioned anywhere in the
MySQL Connector/J source code. I tried adding idle-timeout-minutes="30" to
the datasource definition in my DIH config, but it didn't seem to do anything.

Am I on the right track? Is there any way to configure DIH so that it
won't do this?
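
For illustration, the attribute placement I described looks roughly like
the sketch below. This is not my actual configuration: the host, database,
credentials, table, and field names are all placeholders, and whether DIH
or the MySQL driver recognizes idle-timeout-minutes at all is exactly the
part I'm unsure of.

```xml
<dataConfig>
  <!-- Placeholder values throughout; only the position of the
       idle-timeout-minutes attribute reflects what I tried. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost.example.com:3306/exampledb"
              user="solr_user"
              password="changeme"
              idle-timeout-minutes="30"/>
  <document>
    <!-- A single query feeds the whole import, so a dropped
         connection cannot be recovered by reconnecting. -->
    <entity name="item" query="SELECT id, title FROM example_table">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
    </entity>
  </document>
</dataConfig>
```

With the attribute in that position, the connection was still dropped
during the import.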

Thanks,
Shawn