You have to be a little careful here. One thing I learned relatively recently 
is that there are in-memory structures that hold pointers to _all_ 
un-searchable docs (i.e. docs added/updated since the last new searcher was 
opened) to support real-time get. So if you’re indexing a _lot_ of docs 
without opening a new searcher, that internal structure can grow quite large.

FWIW, delete-by-query is painful. Each one has to lock all indexing on all 
replicas while it completes. If you can use delete-by-id it’d be better.
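
To make the difference concrete, here is a rough sketch of the two request bodies in Solr's JSON update format. The helper names, the ids and the query are made up for illustration; the only thing taken from Solr is the shape of the "delete" command.

```python
import json

def delete_by_id_payload(ids):
    """JSON update body that deletes documents by their unique key."""
    # A list under "delete" is treated as a list of document ids.
    return json.dumps({"delete": list(ids)})

def delete_by_query_payload(query):
    """JSON update body that deletes every document matching a query."""
    # This is the expensive form discussed above.
    return json.dumps({"delete": {"query": query}})

# Hypothetical ids and query, for illustration only.
print(delete_by_id_payload(["doc1", "doc2"]))
print(delete_by_query_payload("batch_id:42"))
```

If the loading program already knows which ids it wants gone, it can send the first form and skip the query entirely.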

Let’s back up a bit and look at _why_ your nodes go into recovery. Leave the 
replicas on if you can and look in the logs for “Leader Initiated Recovery” 
(not sure that’s the exact phrase, but you’ll see something very like that). 
If you find it, then one situation we’ve seen is that a request takes too 
long to return from a follower. So the sequence looks like this:

- leader gets update
- leader indexes locally _and_ forwards to follower
- follower is busy (and the delete-by-query could be why) and takes too long to 
respond so the request times out
- leader says “hmmm, I don’t know what happened so I’ll tell the follower to 
recover”.
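
The sequence above can be sketched as a toy model. This is not Solr's code, just an illustration of the decision the leader makes; the class, the function and all the numbers are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Follower:
    name: str
    response_time_s: float  # how long this follower takes to answer an update
    state: str = "active"

def forward_update(follower, timeout_s):
    """Leader-side view of forwarding one update to a follower.

    Returns True if the follower answered in time. On a timeout the leader
    cannot tell whether the update was applied, so it marks the follower
    as needing recovery.
    """
    if follower.response_time_s > timeout_s:
        follower.state = "recovering"
        return False
    return True

# A follower slowed down by, say, a long delete-by-query (made-up numbers).
busy = Follower("replica2", response_time_s=90.0)
forward_update(busy, timeout_s=60.0)
print(busy.state)  # recovering
```

With the same slow follower and a timeout longer than its response time, the update goes through and the state stays "active".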

Given your heavy update rate, there’ll be no chance for “peer sync” to fully 
recover so it’ll go into full recovery. That can sometimes be fixed by simply 
lengthening the timeout.

Otherwise also take a look at the logs and see if you can find a root cause for 
the replica going into recovery and we should see if we can fix that.

I didn’t ask what version of Solr you’re using, but in the 7.x code line (7.3 
IIRC) significant work was done to make recovery less likely.

Best,
Erick

> On May 22, 2019, at 10:27 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
> On 5/22/2019 10:47 AM, Russell Taylor wrote:
>> I will add that we have set commits to be only called by the loading 
>> program. We have turned off soft and autoCommits in the solrconfig.xml.
> 
> Don't turn off autoCommit.  Regular hard commits, typically with openSearcher 
> set to false so they don't interfere with change visibility, are extremely 
> important for good Solr operation.  Without them, the transaction logs will 
> grow out of control.  In addition to taking a lot of disk space, that will 
> cause a Solr restart to happen VERY slowly.  Note that a hard commit with 
> openSearcher set to false will be VERY fast -- doing them frequently is 
> usually not a problem for performance.  Sample configs in recent Solr 
> versions ship with autoCommit set to 15 seconds and openSearcher set to false.
> 
> Not using autoSoftCommit is a reasonable thing to do if you do not need that 
> functionality ... but don't disable autoCommit.
> 
> Thanks,
> Shawn
