Thanks Alexey -

Could you move this to a JIRA issue? 

- Mark

On Oct 25, 2012, at 7:53 AM, AlexeyK <lex.kudi...@gmail.com> wrote:

> setup:
> 1 node, 4 cores, 2 shards.
> 15 documents indexed.
> 
> problem:
> init stage times out.
> 
> probable cause:
> According to the init flow, cores are initialized one by one, synchronously.
> The main thread waits in
> ShardLeaderElectionContext.waitForReplicasToComeUp until the retry threshold
> is reached, while the replica cores are *not* yet initialized. In other words,
> the other replicas have no chance to come up in the meantime.
> 
> stack trace:
> Thread [main] (Suspended)     
>       owns: HashMap<K,V>  (id=3876)   
>       owns: StandardContext  (id=3877)        
>       owns: HashMap<K,V>  (id=3878)   
>       owns: StandardHost  (id=3879)   
>       owns: StandardEngine  (id=3880) 
>       owns: Service[]  (id=3881)      
>       Thread.sleep(long) line: not available [native method]  
>       ShardLeaderElectionContext.waitForReplicasToComeUp(boolean, String) line: 298
>       ShardLeaderElectionContext.runLeaderProcess(boolean) line: 143  
>       LeaderElector.runIamLeaderProcess(ElectionContext, boolean) line: 152   
>       LeaderElector.checkIfIamLeader(int, ElectionContext, boolean) line: 96  
>       LeaderElector.joinElection(ElectionContext) line: 262   
>       ZkController.joinElection(CoreDescriptor, boolean) line: 733    
>       ZkController.register(String, CoreDescriptor, boolean, boolean) line: 566
>       ZkController.register(String, CoreDescriptor) line: 532 
>       CoreContainer.registerInZk(SolrCore) line: 709  
>       CoreContainer.register(String, SolrCore, boolean) line: 693     
>       CoreContainer.load(String, InputSource) line: 535       
>       CoreContainer.load(String, File) line: 356      
>       CoreContainer$Initializer.initialize() line: 308        
>       SolrDispatchFilter.init(FilterConfig) line: 107 
>       ApplicationFilterConfig.getFilter() line: 295   
>       ApplicationFilterConfig.setFilterDef(FilterDef) line: 422       
>       ApplicationFilterConfig.<init>(Context, FilterDef) line: 115    
>       StandardContext.filterStart() line: 4072        
>       StandardContext.start() line: 4726      
>       StandardHost(ContainerBase).addChildInternal(Container) line: 799       
>       StandardHost(ContainerBase).addChild(Container) line: 779       
>       StandardHost.addChild(Container) line: 601      
>       HostConfig.deployDescriptor(String, File, String) line: 675     
>       HostConfig.deployDescriptors(File, String[]) line: 601  
>       HostConfig.deployApps() line: 502       
>       HostConfig.start() line: 1317   
>       HostConfig.lifecycleEvent(LifecycleEvent) line: 324     
>       LifecycleSupport.fireLifecycleEvent(String, Object) line: 142   
>       StandardHost(ContainerBase).start() line: 1065  
>       StandardHost.start() line: 840  
>       StandardEngine(ContainerBase).start() line: 1057        
>       StandardEngine.start() line: 463        
>       StandardService.start() line: 525       
>       StandardServer.start() line: 754        
>       Catalina.start() line: 595      
>       NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
>       NativeMethodAccessorImpl.invoke(Object, Object[]) line: not available
>       DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: not available
>       Method.invoke(Object, Object...) line: not available    
>       Bootstrap.start() line: 289     
>       Bootstrap.main(String[]) line: 414
> 
>       
> After a while, the session times out and the following exception appears:
> Oct 25, 2012 1:16:56 PM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp
> INFO: Waiting until we see more replicas up: total=2 found=0 timeoutin=-95
> Oct 25, 2012 1:16:56 PM org.apache.solr.cloud.ShardLeaderElectionContext waitForReplicasToComeUp
> INFO: Was waiting for replicas to come up, but they are taking too long - assuming they won't come back till later
> Oct 25, 2012 1:16:56 PM org.apache.solr.common.SolrException log
> SEVERE: Errir checking for the number of election participants:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/collection1/leader_elect/shard2/election
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>       at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>       at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1249)
>       at org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:227)
>       at org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:224)
>       at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:63)
>       at org.apache.solr.common.cloud.SolrZkClient.getChildren(SolrZkClient.java:224)
>       at org.apache.solr.cloud.ShardLeaderElectionContext.waitForReplicasToComeUp(ElectionContext.java:276)
>       at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:143)
>       at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:152)
>       at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:96)
>       at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:262)
>       at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:733)
>       at org.apache.solr.cloud.ZkController.register(ZkController.java:566)
>       at org.apache.solr.cloud.ZkController.register(ZkController.java:532)
>       at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:709)
>       at org.apache.solr.core.CoreContainer.register(CoreContainer.java:693)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:535)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>       at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>       at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>       at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
>       at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
>       at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:115)
>       at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4072)
>       at org.apache.catalina.core.StandardContext.start(StandardContext.java:4726)
>       at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799)
>       at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
>       at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601)
>       at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:675)
>       at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:601)
>       at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:502)
>       at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1317)
>       at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324)
>       at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
>       at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1065)
>       at org.apache.catalina.core.StandardHost.start(StandardHost.java:840)
>       at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1057)
>       at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
>       at org.apache.catalina.core.StandardService.start(StandardService.java:525)
>       at org.apache.catalina.core.StandardServer.start(StandardServer.java:754)
>       at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>       at java.lang.reflect.Method.invoke(Unknown Source)
>       at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
>       at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
> 
> Followed by:
> Oct 25, 2012 1:17:27 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
> SEVERE: Recovery failed - trying again... core=collection1
> Oct 25, 2012 1:18:32 PM org.apache.solr.common.SolrException log
> SEVERE: Error while trying to recover. core=collection1
> Oct 25, 2012 1:18:32 PM org.apache.solr.common.SolrException log
> SEVERE: Error while trying to recover. core=collection1:org.apache.solr.common.SolrException: No registered leader was found, collection:collection1 slice:shard1
>       at org.apache.solr.common.cloud.ZkStateReader.getLeaderProps(ZkStateReader.java:413)
>       at org.apache.solr.common.cloud.ZkStateReader.getLeaderProps(ZkStateReader.java:399)
>       at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:318)
>       at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:220)
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/SolrCloud-leader-election-on-single-node-tp4015804.html
> Sent from the Solr - User mailing list archive at Nabble.com.
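
For anyone reading this thread later, the probable cause described in the quoted message can be illustrated with a minimal, self-contained sketch. This is not Solr code: the class SequentialInitSketch and the methods waitForReplicas and loadSequentially are hypothetical stand-ins, and the sketch assumes a single shard with two replicas whose cores are registered one after another on the same thread. It also counts the waiting core itself as "seen", which is a simplification and may not match how Solr counts election participants.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SequentialInitSketch {

    // Polls until `expected` replicas have joined or the timeout elapses;
    // returns how many replicas were visible when the wait ended.
    static int waitForReplicas(List<String> joined, int expected, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (joined.size() < expected && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return joined.size();
    }

    // Registers the cores one by one on the calling thread. Each core joins
    // and then blocks in waitForReplicas before the next core is touched,
    // so the first wait can only end by timing out.
    static List<Integer> loadSequentially(List<String> cores, int expected, long timeoutMs) {
        List<String> joined = new ArrayList<>();
        List<Integer> seen = new ArrayList<>();
        for (String core : cores) {
            joined.add(core);
            seen.add(waitForReplicas(joined, expected, timeoutMs));
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> seen = loadSequentially(Arrays.asList("core1", "core2"), 2, 200);
        System.out.println("replicas seen per wait: " + seen);
    }
}
```

In this sketch the first wait always runs for the full timeout, because the second core cannot join until the first wait returns. If the timeout is longer than the ZooKeeper session timeout, that would be consistent with the session expiry reported in the logs above, though the sketch itself does not model ZooKeeper.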
