replica reports recovery_failed but is considered the leader

Oliver Schrenk Tue, 11 Mar 2014 08:26:25 -0700

Hi,

After an unsuccessful indexing run on a Solr Cloud cluster with four machines, 
where we experienced a lot of errors that we are still investigating, we found 
the cluster in a weird state:


    {"collection_v1":{
        "shards":{
          "shard1":{
            "range":"80000000-bfffffff",
            "state":"active",
            "replicas":{
              "core_node1":{
                "state":"recovery_failed",
                "base_url":"http://solr-host9:7070/solr",
                "core":"elmar_v1_shard1_replica1",
                "node_name":"solr-host9:7070_solr",
                "leader":"true"},
              "core_node2":{
                "state":"active",
                "base_url":"http://solr-host8:7070/solr",
                "core":"elmar_v1_shard1_replica2",
                "node_name":"solr-host8:7070_solr"}}},

        ...

        "maxShardsPerNode":"2",
        "router":{"name":"compositeId"},
        "replicationFactor":"2"}}
    }


From my point of view, it doesn’t make sense that core_node1 is the leader of 
shard1 when it can’t even be recovered. With the other machine working fine, 
why is core_node2 not the leader? Am I wrong in my assumption? In the same 
vein, how can I manually set the leader?
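
To make the inconsistency concrete, here is a minimal sketch of a check that walks a cluster state shaped like the one above and reports every replica that is flagged as leader but is not in the "active" state. The function name find_inactive_leaders and the trimmed SAMPLE dictionary are my own illustration, not part of Solr, and the check assumes the "leader" flag is stored as the string "true", as it is in the dump above.

```python
def find_inactive_leaders(cluster_state):
    """Return (collection, shard, replica, state) tuples for every replica
    that is marked as leader but whose state is not 'active'."""
    problems = []
    for coll_name, coll in cluster_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                # Assumption: the leader flag is the string "true", as in the dump.
                is_leader = replica.get("leader") == "true"
                if is_leader and replica.get("state") != "active":
                    problems.append(
                        (coll_name, shard_name, replica_name, replica.get("state"))
                    )
    return problems


# Trimmed copy of the state from this message (only shard1, fewer fields).
SAMPLE = {
    "collection_v1": {
        "shards": {
            "shard1": {
                "state": "active",
                "replicas": {
                    "core_node1": {
                        "state": "recovery_failed",
                        "node_name": "solr-host9:7070_solr",
                        "leader": "true",
                    },
                    "core_node2": {
                        "state": "active",
                        "node_name": "solr-host8:7070_solr",
                    },
                },
            }
        }
    }
}

if __name__ == "__main__":
    for coll, shard, replica, state in find_inactive_leaders(SAMPLE):
        print(f"{coll}/{shard}: leader {replica} is in state {state}")
```

Run against the trimmed sample, this flags core_node1 in shard1, which is exactly the situation I am asking about.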

Regards
Oliver
