After a full bounce of Tomcat, I'm now getting a new exception (below). I can browse the ZooKeeper config in the Solr admin UI, and can confirm that there's a node for '/collections/customerOrderSearch/leaders/shard2', but no node for '/collections/customerOrderSearch/leaders/shard1'. Any ideas or guidance on how to recover would be appreciated. We've restarted all three ZooKeeper instances and both Solr instances, but that hasn't made any appreciable difference.
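To make the check above concrete, here is a minimal sketch of it in plain Java. It assumes the leader node paths follow the '/collections/<collection>/leaders/<shard>' form seen in the stack trace, and that the set of existing paths is typed in by hand from what the admin UI tree shows. It does not connect to ZooKeeper, and the class and method names (LeaderNodeCheck, leaderPath, missingLeaders) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: compares the leader nodes we expect against a
// hand-entered snapshot of the paths visible in the admin UI.
class LeaderNodeCheck {

    // Builds a leader node path in the form that appears in the stack trace.
    static String leaderPath(String collection, String shard) {
        return "/collections/" + collection + "/leaders/" + shard;
    }

    // Returns the expected leader paths that are absent from existingPaths.
    static List<String> missingLeaders(String collection,
                                       List<String> shards,
                                       Set<String> existingPaths) {
        List<String> missing = new ArrayList<String>();
        for (String shard : shards) {
            String path = leaderPath(collection, shard);
            if (!existingPaths.contains(path)) {
                missing.add(path);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: snapshot copied by hand from the admin UI, not read from ZooKeeper.
        Set<String> existing = new HashSet<String>(Arrays.asList(
                "/collections/customerOrderSearch/leaders/shard2"));
        List<String> shards = Arrays.asList("shard1", "shard2");

        for (String path : missingLeaders("customerOrderSearch", shards, existing)) {
            System.out.println("missing leader node: " + path);
        }
    }
}
```

With the snapshot above, the only path reported is the shard1 leader node, which matches the NoNode error in the trace.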
--p.

2014-01-07 10:06:14,980 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.cloud.ZooKeeperException:
        at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:309)
        at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:556)
        at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:365)
        at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
        at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:864)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:773)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:723)
        at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:286)
        ... 11 more
Caused by: org.apache.solr.common.SolrException: Could not get leader props
        at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:911)
        at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:875)
        at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:839)
        ... 14 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
        at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
        at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
        at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
        at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:889)
        ... 16 more

On Tue, Jan 7, 2014 at 9:57 AM, patrick conant <patrick.con...@gmail.com> wrote:

> In our Solr instance we have two shards, each running on two servers. The
> server that was the leader for one of the shards ran into a problem, and
> since we restarted the service, Solr is no longer electing a leader for the
> shard.
>
> The stack traces from the logs are below, and the 'Cloud Dump' from the
> Solr console is attached. We're running Solr 4.4.0. Any guidance on how
> to recover from this? Restarting or redeploying the service doesn't seem
> to make any difference.
>
> Thanks,
> Pat.
>
>
> 2014-01-07 00:00:10,754 [http-8080-62] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: no servers hosting shard:
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>         at java.lang.Thread.run(Thread.java:662)
>
> 2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: No registered leader was found, collection:customerOrderSearch slice:shard1
>         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:487)
>         at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:470)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:223)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
>         at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>         at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
>         at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
>         at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
>         at java.lang.Thread.run(Thread.java:662)