Maybe I have been working too many long hours, as I missed the obvious 
solution: restarting the Solr node backing one of the replicas, and then doing 
the same for the second node.  This did the trick.

Since I brought this topic up, I will narrow the question a bit:  Would there 
be a way to recover without restarting the Solr node?  Basically to delete one 
replica and then somehow declare the other replica the leader and break it out 
of its recovery process?
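
For reference, the "delete one replica" half of that idea can be sketched 
against the Collections API DELETEREPLICA action. This is only a rough sketch: 
the host/port and the replica name core_node5 are hypothetical placeholders 
(the real replica name would have to be read from the cluster state), and it 
does nothing about the second half of the question, promoting the surviving 
replica to leader.

```python
from urllib.parse import urlencode


def delete_replica_url(base_url, collection, shard, replica):
    """Build a Collections API DELETEREPLICA request URL.

    base_url and replica are assumptions supplied by the caller; nothing
    is sent to Solr here, the function only assembles the URL.
    """
    params = urlencode({
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,  # replica name from cluster state (hypothetical below)
    })
    return f"{base_url}/admin/collections?{params}"


# Hypothetical host and replica name, for illustration only.
url = delete_replica_url(
    "http://localhost:8983/solr", "kla_collection", "shard6", "core_node5"
)
print(url)
```

The URL is only built and printed, not requested, so it can be reviewed before 
anything is removed from the cluster.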

Thanks,
Matt


From: Matt Kuiper
Sent: Wednesday, April 01, 2015 8:43 PM
To: solr-user@lucene.apache.org
Subject: How to recover a Shard

Hello,

I have a SolrCloud (4.10.1) cluster where, for one of the shards, both replicas 
are in a "Recovery Failed" state per the Solr Admin Cloud page.  The logs 
contain the following type of entries for the two Solr nodes involved, 
including statements that recovery will be retried.

Is there a way to recover from this state?

Maybe bring down one replica, and then somehow declare that the remaining 
replica is to be the leader?  I understand this would not be ideal, as the new 
leader may be missing documents that were sent its way to be indexed while it 
was down, but it would be better than having to rebuild the whole cloud.

Any tips or suggestions would be appreciated.

Thanks,
Matt

Solr node .65

Error while trying to recover. core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
         at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
         at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)

Solr node .64

Error while trying to recover. core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: kla_collection slice: shard6
         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
         at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
         at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
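
To see at a glance which cores and shards are affected across several nodes' 
logs, entries like the two above can be pulled apart with a small script. This 
is a minimal sketch that assumes the message wording matches the entries 
quoted here exactly; the function name and the regular expression are my own 
and not part of Solr.

```python
import re

# Matches the "No registered leader" recovery error as worded in the logs above.
LEADER_ERROR = re.compile(
    r"core=(?P<core>\S+?):org\.apache\.solr\.common\.SolrException: "
    r"No registered leader was found after waiting for (?P<wait_ms>\d+)ms\s*, "
    r"collection: (?P<collection>\S+) slice: (?P<shard>\S+)"
)


def parse_leader_errors(log_text):
    """Return one dict per matching error: core, wait_ms, collection, shard."""
    # Mail clients and terminals wrap long log lines, so collapse all
    # whitespace (including newlines) to single spaces before matching.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in LEADER_ERROR.finditer(flat)]


if __name__ == "__main__":
    sample = """Error while trying to recover.
    core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException: No
    registered leader was found after waiting for 4000ms , collection:
    kla_collection slice: shard6"""
    for entry in parse_leader_errors(sample):
        print(entry["core"], entry["collection"], entry["shard"])
```

Grouping the results by shard would show whether the problem is limited to 
shard6 or also affects other shards.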
