Hi, folks,

I am using Solr 1.3 pretty successfully, but am running into an issue that
hits once in a long while.  I'm still using 1.3 since I have some custom
code I will have to port forward to 1.4.

My basic setup is that I have data sources continually pushing data into
Solr, around 20K adds per day.  The index is currently around 100G, stored
on local disk on a fast Linux server.  I'm trying to make new docs
searchable as quickly as possible, so I currently have autocommit set to
15s.  I originally had it at 3s, but that seemed to be a little too
unstable.  I never optimize the index, since an optimize locks things up
solid for about two hours and drops incoming docs until it completes.  I'm
using the default segment merging settings.
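
For reference, the relevant piece of my solrconfig.xml looks roughly like
this (a sketch from memory rather than a verbatim copy; the 15000 is the
15s mentioned above):

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit at most 15s after the first uncommitted add -->
    <maxTime>15000</maxTime>
  </autoCommit>
</updateHandler>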

Every once in a while I'm getting a socket timeout when trying to add a
document.  I traced it to a 20s client-side socket timeout and then found
the corresponding point in the Solr log:

Jan 13, 2010 2:59:15 PM org.apache.solr.core.SolrCore execute
INFO: [tales] webapp=/solr path=/update params={} status=0 QTime=2
Jan 13, 2010 2:59:15 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
Jan 13, 2010 2:59:56 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@26e926e9 main
Jan 13, 2010 2:59:56 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush

Solr locked up for 41 seconds here (2:59:15 to 2:59:56) while doing some of
the commit work.  So, I have a few questions:

- Is this related to GC?
- Does Solr always lock up like this while merging segments, and do I just
  have to live with losing the doc I'm trying to add?
- Is there a timeout setting that would guarantee the write succeeds?
- Should I just retry in this situation?  If so, how do I distinguish
  between a slow commit and Solr actually being down?  (A sketch of the
  retry I have in mind is below.)

I have already had issues in the past with too many open files, so
increasing the merge factor isn't an option.
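
To make the retry question concrete, here is a minimal sketch of the
client-side add I have in mind, using SolrJ's CommonsHttpSolrServer.  The
class name, doc id, URL, and backoff values are placeholders, and the
assumption that a mid-commit timeout surfaces as a SocketTimeoutException
wrapped in a SolrServerException is my guess, not verified behavior:

import java.net.SocketTimeoutException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class RetryingAdd {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");
        server.setSoTimeout(20000);        // the 20s read timeout that trips
        server.setConnectionTimeout(5000); // connect failure = Solr is down

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "example-1");   // placeholder doc

        int attempts = 0;
        while (true) {
            try {
                server.add(doc);
                break; // success
            } catch (SolrServerException e) {
                // Guess: a commit-in-progress read timeout comes through as
                // a SocketTimeoutException wrapped by SolrJ; anything else
                // (connection refused, etc.) means Solr itself is down.
                if (!(e.getCause() instanceof SocketTimeoutException)
                        || ++attempts >= 3) {
                    throw e;
                }
                Thread.sleep(5000); // back off and let the commit finish
            }
        }
    }
}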


On a related note, I had previously asked about optimizing and was told
that segment merging would take care of cleaning up deleted docs.  However,
I have the following stats for my index:

numDocs : 2791091
maxDoc : 4811416

My understanding is that numDocs is the number of docs being searched and
maxDoc also counts deleted docs that won't go away until an optimize, which
means roughly 2 million deleted docs (4811416 - 2791091 = 2020325, about
42% of maxDoc) are still sitting in the index.  How do I get this cleanup
without using optimize, given that optimize locks up Solr for multiple
hours?  I'm deleting old docs daily as well.
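
On the cleanup question, I've seen an expungeDeletes flag on commit
mentioned as a way to merge away deleted docs without a full optimize; as
far as I can tell it's a 1.4 feature, which would be one more reason for me
to port forward.  My untested understanding (host and port are
placeholders) is that it's issued like an ordinary commit:

curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' \
     --data-binary '<commit expungeDeletes="true"/>'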

Thanks for all the help,
Jerry
