Hello,

Are there any tips on how to recover a shard when all nodes for the shard are down and ZK cannot find a leader (clusterstate.json has no replica marked as leader for the shard)? Bouncing the nodes does not seem to help. It seems like I need to reset the clusterstate...
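For anyone who wants to confirm which shards are affected, a check along these lines can be run against the "shards" section of clusterstate.json. This is only a minimal sketch: the function name shards_without_leader and the trimmed sample data are illustrative assumptions, not part of Solr, and it only reads the state; it does not repair anything.

```python
import json


def shards_without_leader(shards):
    """Return the sorted names of shards in which no replica has "leader":"true".

    `shards` is assumed to be the mapping of shard name -> shard properties,
    in the same layout as the clusterstate.json fragment in this post.
    """
    missing = []
    for name, shard in shards.items():
        replicas = shard.get("replicas", {})
        # Solr 4.x stores the leader flag as the string "true" on one replica.
        if not any(r.get("leader") == "true" for r in replicas.values()):
            missing.append(name)
    return sorted(missing)


# Trimmed sample modeled on the fragment in this post (most fields omitted).
sample = json.loads("""
{
  "shard34": {"state": "active", "replicas": {
    "core_node49": {"state": "active", "leader": "true"},
    "core_node71": {"state": "active"}}},
  "shard35": {"state": "active", "replicas": {
    "core_node51": {"state": "down"},
    "core_node75": {"state": "down"}}}
}
""")

print(shards_without_leader(sample))  # prints ['shard35']
```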
Running Solr 4.10.1

From clusterstate.json:

//Problem with Shard35
"shard34":{
  "range":"28f50000-2e13ffff",
  "state":"active",
  "replicas":{
    "core_node49":{
      "state":"active",
      "core":"kla_collection_shard34_replica1",
      "node_name":"172.29.24.54:8983_solr",
      "base_url":"http://172.29.24.54:8983/solr",
      "leader":"true"}, //No such line for Shard35
    "core_node71":{
      "state":"active",
      "core":"kla_collection_shard34_replica2",
      "node_name":"172.29.24.53:8983_solr",
      "base_url":"http://172.29.24.53:8983/solr"}}},
"shard35":{
  "range":"2e140000-3332ffff",
  "state":"active",
  "replicas":{
    "core_node51":{
      "state":"down",
      "core":"kla_collection_shard35_replica1",
      "node_name":"172.29.24.54:8983_solr",
      "base_url":"http://172.29.24.54:8983/solr"},
    "core_node75":{
      "state":"down",
      "core":"kla_collection_shard35_replica2",
      "node_name":"172.29.24.53:8983_solr",
      "base_url":"http://172.29.24.53:8983/solr"}}},

Related log entries:

7/31/2015, 1:25:17 PM ERROR ZkController Error getting leader from zk
org.apache.solr.common.SolrException: Could not get leader props
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:950)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:914)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:870)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:815)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:763)
    at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/kla_collection/leaders/shard39
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:307)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:304)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:304)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:928)
    ... 8 more
--
:org.apache.solr.common.SolrException: Error getting leader from zk for shard shard35
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:903)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:815)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:763)
    at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:221)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Could not get leader props
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:950)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:914)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:870)
    ... 6 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/kla_collection/leaders/shard35
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:307)
    at org.apache.solr.common.cloud.SolrZkClient$8.execute(SolrZkClient.java:304)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:74)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:304)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:928)
    ... 8 more

Thanks,
Matt