Hi,
After an unsuccessful indexing run on a SolrCloud cluster with four machines, where
we experienced a lot of errors that we are still trying to investigate, we found the
cluster in a weird state:
{"collection_v1":{
"shards":{
"shard1":{
"range":"80000000-bfffffff",
"state":"active",
"replicas":{
"core_node1":{
"state":"recovery_failed",
"base_url":"http://solr-host9:7070/solr",
"core":"elmar_v1_shard1_replica1",
"node_name":"solr-host9:7070_solr",
"leader":"true"},
"core_node2":{
"state":"active",
"base_url":"http://solr-host8:7070/solr",
"core":"elmar_v1_shard1_replica2",
"node_name":"solr-host8:7070_solr"}}},
...
"maxShardsPerNode":"2",
"router":{"name":"compositeId"},
"replicationFactor":"2"}}
}
From my point of view it doesn’t make sense that core_node1 is the leader of
shard1 when its state is recovery_failed. With the other machine fully
working, why is core_node2 not the leader? Am I wrong in my assumption? In the
same vein, how can I manually set the leader?
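
To make the inconsistency easy to spot, here is a minimal Python sketch that scans a cluster state document for replicas flagged as leader whose state is not "active". It only assumes the JSON layout shown above; the function name find_inactive_leaders is made up for illustration, and the sample is a trimmed copy of the state from this post, not a complete clusterstate.json.

```python
import json


def find_inactive_leaders(cluster_state):
    """Return (collection, shard, replica, state) for every replica that is
    marked as leader but whose state is not "active".

    Hypothetical helper: it relies only on the JSON layout shown in this post.
    """
    problems = []
    for coll_name, coll in cluster_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                # In the state above the leader flag is the string "true".
                is_leader = replica.get("leader") == "true"
                if is_leader and replica.get("state") != "active":
                    problems.append(
                        (coll_name, shard_name, replica_name, replica.get("state"))
                    )
    return problems


# Trimmed copy of the cluster state from this post (only shard1 is included).
sample = json.loads("""
{"collection_v1": {
  "shards": {
    "shard1": {
      "range": "80000000-bfffffff",
      "state": "active",
      "replicas": {
        "core_node1": {
          "state": "recovery_failed",
          "core": "elmar_v1_shard1_replica1",
          "node_name": "solr-host9:7070_solr",
          "leader": "true"},
        "core_node2": {
          "state": "active",
          "core": "elmar_v1_shard1_replica2",
          "node_name": "solr-host8:7070_solr"}}}}}}
""")

for problem in find_inactive_leaders(sample):
    print(problem)
```

This only reports the mismatch between the leader flag and the replica state; it does not change anything in the cluster.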
Regards
Oliver