You have to be a little careful here. One thing I learned relatively recently is that there are in-memory structures that hold pointers to _all_ un-searchable docs (i.e. no new searchers have been opened since the doc was added/updated) to support real-time get. So if you’re indexing a _lot_ of docs between searcher opens, that internal structure can grow quite large.

FWIW, delete-by-query is painful. Each one has to lock all indexing on all replicas while it completes. If you can use delete-by-id it’d be better.

Let’s back up a bit and look at _why_ your nodes go into recovery…. Leave the replicas on if you can and look for “Leader Initiated Recovery” (not sure that’s the exact phrase, but you’ll see something very like that). If that’s the case, then one situation we’ve seen is that a request takes too long to return from a follower. So the sequence looks like this:

- leader gets update
- leader indexes locally _and_ forwards to follower
- follower is busy (and the delete-by-query could be why) and takes too long to respond so the request times out
- leader says “hmmm, I don’t know what happened so I’ll tell the follower to recover”.

Given your heavy update rate, there’ll be no chance for “peer sync” to fully recover so it’ll go into full recovery. That can sometimes be fixed by simply lengthening the timeout.

Otherwise also take a look at the logs and see if you can find a root cause for the replica going into recovery and we should see if we can fix that.

I didn’t ask what versions of Solr you’re using, but in the 7x code line (7.3 IIRC) significant work was done to make recovery less likely.

Best,
Erick

> On May 22, 2019, at 10:27 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 5/22/2019 10:47 AM, Russell Taylor wrote:
>> I will add that we have set commits to be only called by the loading
>> program. We have turned off soft and autoCommits in the solrconfig.xml.
>
> Don't turn off autoCommit. Regular hard commits, typically with openSearcher
> set to false so they don't interfere with change visibility, are extremely
> important for good Solr operation. Without it, the transaction logs will
> grow out of control. In addition to taking a lot of disk space, that will
> cause a Solr restart to happen VERY slowly. Note that a hard commit with
> openSearcher set to false will be VERY fast -- doing them frequently is
> usually not a problem for performance. Sample configs in recent Solr
> versions ship with autoCommit set to 15 seconds and openSearcher set to false.
>
> Not using autoSoftCommit is a reasonable thing to do if you do not need that
> functionality ... but don't disable autoCommit.
>
> Thanks,
> Shawn
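
For reference, the autoCommit setup Shawn describes (hard commit every 15 seconds, openSearcher set to false) would look roughly like the fragment below inside the updateHandler section of solrconfig.xml. Treat it as a sketch modeled on the sample configs rather than a drop-in for your setup: the solr.autoCommit.maxTime property name and its 15000 ms default are assumed from those samples, and the right interval depends on your update rate.

```xml
<!-- Sketch only: goes inside <updateHandler class="solr.DirectUpdateHandler2"> -->
<autoCommit>
  <!-- Hard commit at most 15 s after the first uncommitted update;
       the property name and default are assumed from the sample configs -->
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <!-- Flush to disk and roll over the transaction log without opening
       a new searcher, so change visibility is not affected -->
  <openSearcher>false</openSearcher>
</autoCommit>
```

With no autoSoftCommit configured, newly indexed documents would still only become searchable when the loading program issues its own commit, which matches what Russell described.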