It sounds like the node-local copy of the cluster state has diverged from what 
is actually stored in ZK. You should check the contents of ZooKeeper and verify 
that the state there looks sane. I've had issues (on 5.4) on a few occasions 
where leader election got screwed up to the point where I had to delete data 
from ZK manually, usually after a ZK issue.

In particular, take a look at 
collections/<collection>/leader_elect/shard<x>/election

Non-sane states would be the same core_nodeX listed more than once, or fewer 
entries than there are (up) replicas. If you're having trouble getting a sane 
election, you can try deleting the lowest-numbered entry (as well as any 
lower-numbered duplicates) and then issuing FORCELEADER again, possibly 
followed by restarting the node that owns that lowest-numbered entry.
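
If it helps, here's a minimal sketch using the plain ZooKeeper Java client that 
dumps the election queue so you can spot duplicates before touching anything. 
The class name, connect string, collection, and shard are just placeholders for 
whatever your cluster uses, and the delete is left commented out on purpose:

import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public class InspectElection {
    public static void main(String[] args) throws Exception {
        // Placeholder connect string / collection / shard; adjust for your cluster.
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 15000, event -> {});
        String election = "/collections/lookups/leader_elect/shard1/election";

        List<String> entries = zk.getChildren(election, false);
        Collections.sort(entries);
        for (String entry : entries) {
            // Entries normally look like {sessionId}-core_nodeX-n_{sequence}. Watch for
            // the same core_nodeX appearing twice, and note which has the lowest n_ number.
            System.out.println(entry);
        }

        // Only once you're sure which entry is the stale one, remove it explicitly, e.g.:
        // zk.delete(election + "/<sessionId>-core_nodeX-n_0000000005", -1);

        zk.close();
    }
}

After cleaning up, re-issue the force against any node, e.g. 
/solr/admin/collections?action=FORCELEADER&collection=lookups&shard=shard1, and 
watch the logs on the node that owned the entry you deleted.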

Also make sure that this znode exists and names the expected replica: 
collections/<collection>/leader/shard<x>

collections/<collection>/leader_initiated_recovery 
can be informative too; it lists replicas that the *leader* thinks are out of 
sync, usually due to a failed update request.
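
A quick way to eyeball both of those (again just a sketch with placeholder 
names; the exact leader znode name can vary by version, so the first print 
shows what actually exists under the collection node):

import org.apache.zookeeper.ZooKeeper;

public class InspectLeaderState {
    public static void main(String[] args) throws Exception {
        // Placeholder connect string / collection / shard; adjust for your cluster.
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 15000, event -> {});
        String base = "/collections/lookups";

        // First see what actually exists under the collection node.
        System.out.println(zk.getChildren(base, false));

        // The leader znode's JSON should name the core/node you expect to be leading.
        String leaderPath = base + "/leaders/shard1";  // adjust to whatever the listing shows
        if (zk.exists(leaderPath, false) != null) {
            System.out.println(new String(zk.getData(leaderPath, false, null), "UTF-8"));
        }

        // Replicas the leader has flagged as out of sync, grouped per shard underneath.
        String lirPath = base + "/leader_initiated_recovery";
        if (zk.exists(lirPath, false) != null) {
            System.out.println(zk.getChildren(lirPath, false));
        }

        zk.close();
    }
}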


On 7/19/16, 9:20 AM, "Tom Evans" <tevans...@googlemail.com> wrote:

    On the nodes that have the replica in a recovering state we now see:
    
    19-07-2016 16:18:28 ERROR RecoveryStrategy:159 - Error while trying to
    recover. core=lookups_shard1_replica8:org.apache.solr.common.SolrException:
    No registered leader was found after waiting for 4000ms , collection:
    lookups slice: shard1
            at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:607)
            at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:593)
            at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:308)
            at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    
    19-07-2016 16:18:28 INFO  RecoveryStrategy:444 - Replay not started,
    or was not successful... still buffering updates.
    19-07-2016 16:18:28 ERROR RecoveryStrategy:481 - Recovery failed -
    trying again... (164)
    19-07-2016 16:18:28 INFO  RecoveryStrategy:503 - Wait [12.0] seconds
    before trying to recover again (attempt=165)
    
    
    This is with the "leader that is not the leader" shut down.
    
    Issuing a FORCELEADER via collections API doesn't in fact force a
    leader election to occur.
    
    Is there any other way to prompt Solr to have an election?
    
    Cheers
    
    Tom
    
    On Tue, Jul 19, 2016 at 5:10 PM, Tom Evans <tevans...@googlemail.com> wrote:
    > There are 11 collections, each only has one shard, and each node has
    > 10 replicas (9 collections are on every node, 2 are just on one node).
    > We're not seeing any OOM errors on restart.
    >
    > I think we're being patient waiting for the leader election to occur.
    > We stopped the troublesome "leader that is not the leader" server
    > about 15-20 minutes ago, but we still have not had a leader election.
    >
    > Cheers
    >
    > Tom
    >
    > On Tue, Jul 19, 2016 at 4:30 PM, Erick Erickson <erickerick...@gmail.com> wrote:
    >> How many replicas per Solr JVM? And do you
    >> see any OOM errors when you bounce a server?
    >> And how patient are you being, because it can
    >> take 3 minutes for a leaderless shard to decide
    >> it needs to elect a leader.
    >>
    >> See SOLR-7280 and SOLR-7191 for the case
    >> where lots of replicas are in the same JVM,
    >> the tell-tale symptom is errors in the log as you
    >> bring Solr up saying something like
    >> "OutOfMemory error.... unable to create native thread"
    >>
    >> SOLR-7280 has patches for 6x and 7x, with a 5x one
    >> being added momentarily.
    >>
    >> Best,
    >> Erick
    >>
    >> On Tue, Jul 19, 2016 at 7:41 AM, Tom Evans <tevans...@googlemail.com> wrote:
    >>> Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
    >>> of the collections on it marked as "Recovering" or "Recovery Failed".
    >>> It attempts to recover from the leader, but the leader responds with:
    >>>
    >>> Error while trying to recover.
    >>> core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
    >>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
    >>> Error from server at http://172.31.1.171:30000/solr: We are not the
    >>> leader
    >>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    >>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    >>> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
    >>> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
    >>> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
    >>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    >>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
    >>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    >>> at java.lang.Thread.run(Thread.java:745)
    >>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
    >>> Error from server at http://172.31.1.171:30000/solr: We are not the
    >>> leader
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
    >>> ... 5 more
    >>>
    >>> and recovery never occurs.
    >>>
    >>> Each collection in this state has plenty (10+) of active replicas, but
    >>> stopping the server that is marked as the leader doesn't trigger a
    >>> leader election amongst these replicas.
    >>>
    >>> REBALANCELEADERS did nothing.
    >>> FORCELEADER complains that there is already a leader.
    >>> FORCELEADER with the purported leader stopped took 45 seconds,
    >>> reported status of "0" (and no other message) and kept the down node
    >>> as the leader (!)
    >>> Deleting the failed collection from the failed node and re-adding it
    >>> has the same "Leader said I'm not the leader" error message.
    >>>
    >>> Any other ideas?
    >>>
    >>> Cheers
    >>>
    >>> Tom
    
