I don't think the process Shalin describes applies to clusterstate.json. That JSON object reflects the status Solr "knows" about, i.e. the "last known status". When Solr is properly shut down, I believe those attributes are cleared from clusterstate.json, and the leaders give up their lease.
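If you want to see the difference for yourself, here is a rough sketch (my own example, not anything shipped with Solr) that reads both the cached clusterstate.json and the live ephemeral znodes directly from ZooKeeper. The ZK address and the /collections/collection2/leaders/shard53 path are assumptions based on the 4.x znode layout, so adjust them to your cluster:

import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class LeaderCheck {
  public static void main(String[] args) throws Exception {
    // Plain ZK client, 30s session timeout, no-op watcher. "zk1:2181" is an example host.
    ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, event -> {});

    // What Solr last wrote: this can still say "leader":"true" for a dead core.
    byte[] state = zk.getData("/clusterstate.json", false, new Stat());
    System.out.println(new String(state, "UTF-8"));

    // The actual lease: an ephemeral znode that ZK removes only after the
    // dead node's session expires (path assumes the Solr 4.x layout).
    Stat lease = zk.exists("/collections/collection2/leaders/shard53", false);
    System.out.println("leader lease present: " + (lease != null));

    // Live Solr nodes are ephemeral as well.
    System.out.println("live nodes: " + zk.getChildren("/live_nodes", false));

    zk.close();
  }
}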
However, when Solr is killed, it takes ZK the 30 seconds or so of session timeout to expire the ephemeral node and release the leader lease. ZK is unaware of Solr's clusterstate.json and cannot update the 'leader' property to false; it simply releases the lease so that other cores may claim it. Perhaps that explains the confusion?

Shai

On Mon, Sep 21, 2015 at 4:36 PM, Jeff Wu <wuhai...@gmail.com> wrote:
> Hi Shalin, thank you for the response.
>
> We waited longer than the ZK session timeout, and it still did not kick
> off any leader election for these "remained down-leader" cores. That's
> the question I'm actually asking.
>
> Our test scenario:
>
> Each Solr server has 64 cores; they are all active, and all are leader cores.
> Shut down the Linux OS.
> Monitor clusterstate.json over ZK, after waiting well past the ZK session
> timeout value.
> We noticed that some cores had a leader election happen, but we still saw
> some down cores remain leader.
>
> 2015-09-21 9:15 GMT-04:00 Shalin Shekhar Mangar <shalinman...@gmail.com>:
>
> > Hi Jeff,
> >
> > The leader election relies on ephemeral nodes in ZooKeeper to detect
> > when the leader or other nodes have gone down (abruptly). These
> > ephemeral nodes are automatically deleted by ZooKeeper after the ZK
> > session timeout, which is 30 seconds by default. So if you kill a node,
> > it can take up to 30 seconds for the cluster to detect it and start a
> > new leader election. This isn't necessary during a graceful shutdown,
> > because on shutdown the node gives up its leader position so that a new
> > one can be elected. You could tune the ZK session timeout to a lower
> > value, but then the cluster becomes more sensitive to GC pauses, which
> > can also trigger new leader elections.
> >
> > On Mon, Sep 21, 2015 at 5:55 PM, Jeff Wu <wuhai...@gmail.com> wrote:
> > > Our environment still runs Solr 4.7. Recently we noticed in a test
> > > that when we stopped one Solr server (solr02, via an OS shutdown),
> > > all the cores of solr02 were shown as "down", but a few cores still
> > > remained leaders. After that, we quickly saw all other servers still
> > > sending requests to the downed solr02, and therefore we saw a lot of
> > > TCP waiting threads in the thread pools of the other Solr servers.
> > >
> > > "shard53":{
> > >   "range":"26660000-2998ffff",
> > >   "state":"active",
> > >   "replicas":{
> > >     "core_node102":{
> > >       "state":"down",
> > >       "base_url":"https://solr02.myhost/solr",
> > >       "core":"collection2_shard53_replica1",
> > >       "node_name":"https://solr02.myhost_solr",
> > >       "leader":"true"},
> > >     "core_node104":{
> > >       "state":"active",
> > >       "base_url":"https://solr04.myhost/solr",
> > >       "core":"collection2_shard53_replica2",
> > >       "node_name":"https://solr04.myhost/solr_solr"}}},
> > >
> > > Is this a known bug in 4.7 that was later fixed? Is there a reference
> > > JIRA we can study? If the Solr service is stopped gracefully, we can
> > > see the leader core election happen and leadership switch to another
> > > active core. But if we just shut down a Solr server's OS directly, we
> > > can reproduce in our environment that some "down" cores remain
> > > "leader" in the ZK clusterstate.json.
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
>
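P.S. If it helps to convince yourself of the timing, here is another small sketch (again my own example, with an example ZK address) that watches /live_nodes and prints when a killed node's ephemeral entry finally drops out; after a hard kill the change shows up only once ZK expires that node's session, not at the moment of the kill:

import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class LiveNodesWatch {
  private static ZooKeeper zk;

  public static void main(String[] args) throws Exception {
    Watcher watcher = new Watcher() {
      public void process(WatchedEvent event) {
        if (event.getType() != Event.EventType.NodeChildrenChanged) return;
        try {
          // Re-arm the watch and print which nodes are still registered.
          List<String> nodes = zk.getChildren("/live_nodes", this);
          System.out.println(System.currentTimeMillis() + " live_nodes=" + nodes);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }
    };

    zk = new ZooKeeper("zk1:2181", 30000, watcher);
    System.out.println("initial live_nodes=" + zk.getChildren("/live_nodes", watcher));

    // Kill a Solr JVM now: the children-changed event arrives only after ZK
    // expires that node's session (~30s by default).
    new CountDownLatch(1).await();
  }
}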