Re: no servers hosting shard
After a full bounce of Tomcat, I'm now getting a new exception (below). I can browse the Zookeeper config in the Solr admin UI, and can confirm that there's a node for '/collections/customerOrderSearch/leaders/shard2', but no node for '/collections/customerOrderSearch/leaders/shard1'. Still, any ideas or guidance on how to recover would be appreciated. We've restarted all three Zookeeper instances and both Solr instances, but that hasn't made any appreciable difference.

--p.

2014-01-07 10:06:14,980 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.cloud.ZooKeeperException:
    at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:309)
    at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:556)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:365)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:864)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:773)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:723)
    at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:286)
    ... 11 more
Caused by: org.apache.solr.common.SolrException: Could not get leader props
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:911)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:875)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:839)
    ... 14 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:889)
    ... 16 more

On Tue, Jan 7, 2014 at 9:57 AM, patrick conant wrote:
> In our Solr instance we have two shards each running on two servers. The
> server that was the leader for one of the shards ran into a problem, and
> since we restarted the service, Solr is no longer electing a leader for the
> shard.
>
> The stack traces from the logs are below, and the 'Cloud Dump' from the
> Solr console is attached. We're running Solr 4.4.0. Any guidance on how
> to recover from this? Restarting or redeploying the service doesn't seem
> to make any difference.
>
> Thanks,
> Pat.
>
>
> 2014-01-07 00:00:10,754 [http-8080-62] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: no servers hosting shard:
>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
>
> 2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: No registered leader was found, collection:customerOrderSearch slice:shard1
>     at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:487)
>     at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:470)
>     at org.
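
For anyone reading a similar trace: the innermost "Caused by" line is the one that names the znode ZooKeeper could not find, and that path encodes the collection and shard that has no leader. Below is a minimal Python sketch for pulling that path out of a log, assuming the log text is already in a string. The function names `missing_znodes` and `shard_of` are made up for this illustration and are not part of Solr or ZooKeeper.

```python
import re

# Matches the ZooKeeper "NoNode" message and captures the znode path.
NO_NODE = re.compile(r"KeeperErrorCode = NoNode for (/\S+)")


def missing_znodes(log_text):
    """Return the distinct znode paths reported missing, in first-seen order."""
    seen = []
    for path in NO_NODE.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


def shard_of(leader_path):
    """Split '/collections/<collection>/leaders/<shard>' into (collection, shard).

    Returns None for any path that does not have that shape.
    """
    parts = leader_path.strip("/").split("/")
    if len(parts) == 4 and parts[0] == "collections" and parts[2] == "leaders":
        return parts[1], parts[3]
    return None


# Sample line taken from the trace in this thread.
sample = (
    "Caused by: org.apache.zookeeper.KeeperException$NoNodeException: "
    "KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1 "
    "at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)"
)

for path in missing_znodes(sample):
    print(path, shard_of(path))
```

This only reads logs. It does not talk to ZooKeeper, so the result still needs to be confirmed against the live tree (for example in the Solr admin UI, as described above).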
Re: no servers hosting shard
We found a way to recover. This sequence allowed everything to start up successfully:

- Stop all Solr instances
- Stop all Zookeeper instances
- Start all Zookeeper instances
- Start Solr instances one at a time

Restarting the first Solr instance took several minutes, but the subsequent instances started up much more quickly.

Cheers,
Pat.

On Tue, Jan 7, 2014 at 10:20 AM, patrick conant wrote:
> After a full bounce of Tomcat, I'm now getting a new exception (below). I
> can browse the Zookeeper config in the Solr admin UI, and can confirm that
> there's a node for '/collections/customerOrderSearch/leaders/shard2', but
> no node for '/collections/customerOrderSearch/leaders/shard1'. Still, any
> ideas or guidance on how to recover would be appreciated. We've restarted
> all three Zookeeper instances and both Solr instances, but that hasn't made
> any appreciable difference.
>
> --p.
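
The recovery sequence from this thread (stop Solr, stop ZooKeeper, start ZooKeeper, then start Solr one instance at a time) can be written down as an ordered plan so that nobody skips a step under pressure. The Python sketch below only builds and walks that plan. The host names are hypothetical, the "wait" step is an assumption about what "one at a time" means in practice, and the `execute` callable is a placeholder for whatever actually stops and starts services in a given deployment (Tomcat scripts, ssh, and so on).

```python
def recovery_steps(solr_hosts, zk_hosts):
    """Build the ordered (action, service, host) steps for the restart sequence."""
    steps = [("stop", "solr", h) for h in solr_hosts]
    steps += [("stop", "zookeeper", h) for h in zk_hosts]
    steps += [("start", "zookeeper", h) for h in zk_hosts]
    # Solr instances come back one at a time. The "wait" step is an assumption:
    # it stands for "confirm this instance is up before starting the next one".
    for h in solr_hosts:
        steps.append(("start", "solr", h))
        steps.append(("wait", "solr", h))
    return steps


def run(steps, execute):
    """Run steps in order and stop at the first failure.

    `execute` is a caller-supplied callable (action, service, host) -> bool.
    """
    done = []
    for step in steps:
        if not execute(*step):
            raise RuntimeError("step failed: %s %s on %s" % step)
        done.append(step)
    return done


# Hypothetical host names, sized like the setup in this thread:
# two Solr instances and three ZooKeeper instances.
plan = recovery_steps(["solr1", "solr2"], ["zk1", "zk2", "zk3"])
for action, service, host in plan:
    print(action, service, host)
```

Keeping the plan as plain data makes it easy to print and review before running anything. Since the first Solr instance reportedly took several minutes to come back, any real wait step should allow for a generous timeout.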