We found a way to recover. This sequence allowed everything to start up successfully.

- Stop all Solr instances
- Stop all Zookeeper instances
- Start all Zookeeper instances
- Start Solr instances one at a time.

Restarting the first Solr instance took several minutes, but the subsequent
instances started up much more quickly.

Cheers,
Pat.

On Tue, Jan 7, 2014 at 10:20 AM, patrick conant <patrick.con...@gmail.com> wrote:

> After a full bounce of Tomcat, I'm now getting a new exception (below). I
> can browse the Zookeeper config in the Solr admin UI, and can confirm that
> there's a node for '/collections/customerOrderSearch/leaders/shard2', but
> no node for '/collections/customerOrderSearch/leaders/shard1'. Still, any
> ideas or guidance on how to recover would be appreciated. We've restarted
> all three zookeeper instances and both Solr instances, but that hasn't made
> any appreciable difference.
>
> --p.
>
> 2014-01-07 10:06:14,980 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.cloud.ZooKeeperException:
>     at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:309)
>     at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:556)
>     at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:365)
>     at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
>     at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:864)
>     at org.apache.solr.cloud.ZkController.register(ZkController.java:773)
>     at org.apache.solr.cloud.ZkController.register(ZkController.java:723)
>     at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:286)
>     ... 11 more
> Caused by: org.apache.solr.common.SolrException: Could not get leader props
>     at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:911)
>     at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:875)
>     at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:839)
>     ... 14 more
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
>     at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
>     at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
>     at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
>     at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
>     at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:889)
>     ... 16 more
>
> On Tue, Jan 7, 2014 at 9:57 AM, patrick conant <patrick.con...@gmail.com> wrote:
>
>> In our Solr instance we have two shards each running on two servers. The
>> server that was the leader for one of the shards ran into a problem, and
>> when we restarted the service, Solr is no longer electing a leader for the
>> shard.
>>
>> The stack traces from the logs are below, and the 'Cloud Dump' from the
>> Solr console is attached. We're running Solr 4.4.0. Any guidance on how
>> to recover from this? Restarting or redeploying the service doesn't seem
>> to make any difference.
>>
>> Thanks,
>> Pat.
>>
>> 2014-01-07 00:00:10,754 [http-8080-62] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: no servers hosting shard:
>>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
>>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> 2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: No registered leader was found, collection:customerOrderSearch slice:shard1
>>     at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:487)
>>     at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:470)
>>     at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:223)
>>     at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
>>     at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>>     at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
>>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
>>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
>>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
>>     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
>>     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
>>     at java.lang.Thread.run(Thread.java:662)
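
For anyone who wants to script the recovery sequence from the top of this message, here is a rough Python sketch of the ordering only. Everything in it is an assumption for illustration: the action names ("stop-solr", "wait-solr", and so on), the node names, and the `execute` callback are hypothetical, and nothing here calls a real Solr, Tomcat, or ZooKeeper command. You would plug in whatever your environment uses to stop and start each service and to check that a Solr instance has finished starting.

```python
from typing import Callable, List, Sequence, Tuple

# (action, node) pairs; action names are hypothetical labels, not real commands.
Step = Tuple[str, str]


def recovery_plan(solr_nodes: Sequence[str], zk_nodes: Sequence[str]) -> List[Step]:
    """Return the restart order described in the thread as a list of steps."""
    plan: List[Step] = []
    # 1. Stop every Solr instance first.
    plan += [("stop-solr", node) for node in solr_nodes]
    # 2. Stop, then 3. start, the whole ZooKeeper ensemble.
    plan += [("stop-zookeeper", node) for node in zk_nodes]
    plan += [("start-zookeeper", node) for node in zk_nodes]
    # 4. Bring Solr back one instance at a time, waiting for each one
    #    before starting the next (the first start took several minutes).
    for node in solr_nodes:
        plan.append(("start-solr", node))
        plan.append(("wait-solr", node))
    return plan


def run_plan(plan: Sequence[Step], execute: Callable[[str, str], None]) -> None:
    """Run the steps strictly in order; `execute` is supplied by the caller."""
    for action, node in plan:
        execute(action, node)
```

With two Solr instances and three ZooKeeper instances, as in our setup, `recovery_plan(["solr1", "solr2"], ["zk1", "zk2", "zk3"])` yields twelve steps. Passing a callback that just prints each step to `run_plan` is a quick way to dry-run the order before wiring in real commands.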