Hi All,

Solr version 5.3.2
ZooKeeper 3.6.2
SolrCloud - 2 shards, 4 replicas, 4 nodes

Above is the setup. 3 of the shards (replicas) went into recovery mode with the following ERROR in the logs. Has anyone experienced this before? I had to restart the Solr server nodes to bring them all back up. Looks like a leader election issue?

2016-07-29 06:52:48.610 ERROR (coreZkRegister-1-thread-32-processing-s:shard2 x:tCollection_shard2_replica4 c:tCollection n:tsolr.prod2.xxx.com:8983_solr r:core_node6) [c:tCollection s:shard2 r:core_node6 x:tCollection_shard2_replica4] o.a.s.c.ZkController Error getting leader from zk
org.apache.solr.common.SolrException: No registered leader was found after waiting for 1560000ms , collection: tCollection slice: shard2
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:637)
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderUrl(ZkStateReader.java:604)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:970)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:907)
    at org.apache.solr.cloud.ZkController$RegisterCoreAsync.call(ZkController.java:227)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

2016-07-29 09:17:14.440 WARN (ShutdownMonitor) [ ] o.a.s.c.RecoveryStrategy Stopping recovery for core=tCollection_shard1_replica4 coreNodeName=core_node5
2016-07-29 09:17:14.683 WARN (zkCallback-3-thread-380-processing-n:tsolr.prod2.xxx.com:8983_solr) [ ] o.a.s.c.c.ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
2016-07-29 09:17:14.684 WARN (zkCallback-3-thread-374-processing-n:tsolr.prod2.xxx.com:8983_solr) [ ] o.a.s.c.c.ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
2016-07-29 09:17:14.684 ERROR (zkCallback-3-thread-9-processing-n:tsolr.prod2.xxx.com:8983_solr-EventThread) [ ] o.a.z.ClientCnxn Error while calling watcher
java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1@7402ec22 rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@73ee87d4[Shutting down, pool size = 9, active threads = 2, queued tasks = 0, completed tasks = 1585]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:193)
    at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
    at org.apache.solr.common.cloud.SolrZkClient$3.process(SolrZkClient.java:261)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)

Thank you,
Aswath NS