Re: Leaders in Recovery Failed state

Nitin Solanki Tue, 20 Jan 2015 10:20:41 -0800

I am also facing the same issue. My Solr version is 4.10.2.

On Tue, Jan 20, 2015 at 11:33 PM, Erick Erickson <erickerick...@gmail.com>
wrote:


> What version of Solr?
>
>
> On Tue, Jan 20, 2015 at 7:07 AM, anand.mahajan <an...@zerebral.co.in>
> wrote:
> > Hi all,
> >
> >
> > I have a cluster with 36 shards and 3 replicas per shard. I recently had to
> > restart the entire cluster. Most of the shards and replicas are back up, but
> > a few shards had no leader for a long time (close to 18 hours now). I tried
> > reloading these cores and even the servlet containers hosting them. It's
> > only now that all the shards have leaders allocated, but a few of these
> > leaders are still shown with Recovery Failed status in the Solr Cloud tree
> > view.
> >
> >
> > I see the following in the logs for these shards -
> > INFO  - 2015-01-20 14:38:19.797; org.apache.solr.handler.admin.CoreAdminHandler;
> > In WaitForState(recovering): collection=collection1, shard=shard1,
> > thisCore=collection1_shard1_replica3, leaderDoesNotNeedRecovery=false,
> > isLeader? true, live=true, checkLive=true, currentState=recovering,
> > localState=recovery_failed, nodeName=10.68.77.9:8983_solr,
> > coreNodeName=core_node2, onlyIfActiveCheckResult=true, nodeProps:
> > core_node2:{"state":"recovering","core":"collection1_shard1_replica1","node_name":"10.68.77.9:8983_solr","base_url":"http://10.68.77.9:8983/solr"}
> >
> >
> > And on other server hosting the replica for this shard -
> > ERROR - 2015-01-20 14:38:20.768; org.apache.solr.common.SolrException;
> > org.apache.solr.common.SolrException: I was asked to wait on state
> > recovering for shard3 in collection1 on 10.68.77.9:8983_solr but I still do
> > not see the requested state. I see state: recovering live:true leader from
> > ZK: http://10.68.77.3:8983/solr/collection1_shard3_replica3/
> >         at org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:999)
> >         at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:245)
> >         at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
> >         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >         at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
> >         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
> >         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
> >         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> >         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> >         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> >         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> >         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> >         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> >         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> >         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> >         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> >         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> >         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> >         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> >         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> >         at org.eclipse.jetty.server.Server.handle(Server.java:368)
> >         at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> >         at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> >         at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
> >         at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
> >         at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
> >         at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> >         at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> >         at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> >         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> >         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> >         at java.lang.Thread.run(Unknown Source)
> >
> >
> > I see that there is no replica catch-up going on between any of these
> > servers now.
> > A couple of questions:
> > 1. What is SolrCloud waiting on before it allocates leaders for such
> > shards?
> > 2. Why do a few of these shards show leaders in Recovery Failed state, and
> > how do I recover such shards?
> >
> > Thanks,
> > Anand
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Leaders-in-Recovery-Failed-state-tp4180611.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
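
One way to see which shards lack an active leader, and which replicas are stuck in a state such as recovery_failed, is to pull the cluster state from the Collections API and scan it. Below is a minimal Python sketch, not a definitive tool. It assumes the CLUSTERSTATUS action is available on your Solr version and returns the usual cluster/collections/shards/replicas JSON layout. The function names and the sample data are hypothetical, made up for illustration (the sample only borrows the node and core names from the log above).

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url):
    """Fetch cluster state from one node; base_url looks like http://host:8983/solr.

    Assumption: the Collections API CLUSTERSTATUS action is available.
    """
    url = base_url.rstrip("/") + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def find_problem_shards(status):
    """Return (shards_without_active_leader, replicas_not_active).

    A leader only counts if its replica is in state "active" and its node
    is listed in live_nodes.
    """
    cluster = status["cluster"]
    live = set(cluster.get("live_nodes", []))
    no_leader, not_active = [], []
    for coll, cdata in cluster["collections"].items():
        for shard, sdata in cdata["shards"].items():
            has_active_leader = False
            for name, replica in sdata["replicas"].items():
                is_live = replica.get("node_name") in live
                if replica.get("state") != "active" or not is_live:
                    not_active.append((coll, shard, name, replica.get("state"), is_live))
                elif replica.get("leader") == "true":
                    has_active_leader = True
            if not has_active_leader:
                no_leader.append((coll, shard))
    return no_leader, not_active


# Hypothetical sample shaped like a CLUSTERSTATUS response (not real output).
sample = {
    "cluster": {
        "live_nodes": ["10.68.77.9:8983_solr"],
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node2": {
                                "state": "recovering",
                                "core": "collection1_shard1_replica1",
                                "node_name": "10.68.77.9:8983_solr",
                                "leader": "true",
                            }
                        }
                    }
                }
            }
        },
    }
}

no_leader, not_active = find_problem_shards(sample)
print("shards without an active leader:", no_leader)
print("replicas not active:", not_active)
```

Against a real cluster you would call find_problem_shards(fetch_cluster_status("http://10.68.77.9:8983/solr")) instead of using the sample. This only reports what ZooKeeper currently records; it does not explain why recovery failed or fix anything.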
