Things shouldn't be going into recovery that often. Exceeding maxWarmingSearchers indicates that you're committing very often, and that your autowarming time exceeds the interval between commits (either hard commits with openSearcher set to true, or soft commits).
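
To make the knobs concrete, here's a rough sketch of the relevant pieces of solrconfig.xml. The numbers are placeholders I picked for illustration, not recommendations for your setup; the idea is just that hard commits don't open a searcher, soft commits control visibility on an interval longer than your warmup time, and autowarm counts stay modest:

```xml
<!-- Illustrative values only; tune against your own indexing rate and warmup times. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk and rolls the transaction log,
       but does not open a new searcher. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: makes documents visible; each one opens (and warms) a searcher,
       so this interval should be longer than the autowarm time. -->
  <autoSoftCommit>
    <maxTime>30000</maxTime>
  </autoSoftCommit>
</updateHandler>

<query>
  <!-- Keep autowarmCount modest; large values lengthen every searcher warmup. -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="16"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="16"/>
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```

If the client is also sending explicit commits, those come on top of whatever is configured here.
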
I'd focus on that bit first. How are you committing, what are your autowarm settings etc? Are you committing from the client? Do you have very high (> 32 IMO) autowarm counts for your caches in solrconfig.xml? etc.

Here's a long writeup of commits -n- stuff:
https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Tue, Sep 22, 2015 at 9:24 AM, vsilgalis <vsilga...@gmail.com> wrote:
> We have a collection with 2 shards, 3 nodes per shard running solr 4.10.2
>
> Our issue is that cores that get in recovery never recover, they are in a
> constant state of recovery unless we restart the node and then reload the
> core on the leader. Updates seem to get to the server fine as the
> transaction log grows over time and when we restart the node it replays the
> transaction log successfully and chugs along in recovery until we reload the
> core on the leader. If we hit the maxwarmingsearchers error would that
> break something that prevents recovery?
>
> here is log i have for the node that is in recovery:
> INFO - 2015-09-18 15:10:25.332; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ {suggestions={}}
> INFO - 2015-09-18 15:10:25.332; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ {suggestions={}}
> INFO - 2015-09-18 15:10:25.609; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> WARN - 2015-09-18 15:10:25.642; org.apache.solr.core.SolrCore; [collection1] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
> ERROR - 2015-09-18 15:10:25.642; org.apache.solr.common.SolrException; auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later.
>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1663)
>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1421)
>     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:615)
>     at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> INFO - 2015-09-18 15:10:26.429; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:26.429; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:26.430; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:26.430; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:27.359; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.40:8080/solr/collection1/|http://0.0.0.42:8080/solr/collection1/|http://0.0.0.44:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:27.359; org.apache.solr.handler.component.SpellCheckComponent; http://0.0.0.41:8080/solr/collection1/|http://0.0.0.45:8080/solr/collection1/ null
> INFO - 2015-09-18 15:10:27.710; org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; Building spell index for spell checker: default
> INFO - 2015-09-18 15:10:27.766; org.apache.solr.cloud.RecoveryStrategy; PeerSync Recovery was successful - registering as Active. core=collection1
> INFO - 2015-09-18 15:10:27.766; org.apache.solr.cloud.ZkController; publishing core=collection1 state=active collection=collection1
> INFO - 2015-09-18 15:10:27.773; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
> WARN - 2015-09-18 15:10:27.774; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
> INFO - 2015-09-18 15:10:27.774; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
> INFO - 2015-09-18 15:10:27.776; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
> INFO - 2015-09-18 15:10:27.776; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
> INFO - 2015-09-18 15:10:27.776; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
> WARN - 2015-09-18 15:10:27.777; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
> INFO - 2015-09-18 15:10:27.777; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
> INFO - 2015-09-18 15:10:27.778; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
> INFO - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
> WARN - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
> INFO - 2015-09-18 15:10:27.778; org.apache.solr.cloud.RecoveryStrategy; Finished recovery process. core=collection1
> INFO - 2015-09-18 15:10:27.779; org.apache.solr.update.DefaultSolrCoreState; Running recovery - first canceling any ongoing recovery
> INFO - 2015-09-18 15:10:27.779; org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. core=collection1 recoveringAfterStartup=false
> WARN - 2015-09-18 15:10:27.779; org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for core=collection1 coreNodeName=solrserver4
>
> The starting stopping recovery just replays constantly.
>
> Let me know what else is needed to help troubleshoot this issue.
>
> Thanks
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-4-10-2-Cores-in-Recovery-tp4230598.html
> Sent from the Solr - User mailing list archive at Nabble.com.