Hey Mark, thanks again for your reply.

/*"The way we know that its no longer connected to zookeeper is looking at
live_nodes - which are ephemeral and will go away if a node goes away"*/

I am not sure this is really the case. As far as I remember, even
after a node was dead, live_nodes still reported that node as active, /but/
the leader was changed to one that was /really/ alive.

I had a look at the Overseer code, and it seems to loop on a FIFO queue,
waiting for new state update requests. So if a node is killed, it never
sends a state update request, and I guess that is why the state is
out of sync.
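
To make the concern concrete, here is a rough sketch of that kind of loop. This is not the actual Overseer code; the class StateUpdateLoop, the StateUpdate message type, and the state strings are names I made up for illustration. It only shows that a state map driven purely by incoming messages keeps the last reported state of a node that has stopped sending.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, NOT Solr's Overseer: a state map driven only by queued messages.
final class StateUpdateLoop {
    // Hypothetical message: a node announcing its own state.
    static final class StateUpdate {
        final String nodeName;
        final String state;

        StateUpdate(String nodeName, String state) {
            this.nodeName = nodeName;
            this.state = state;
        }
    }

    final Queue<StateUpdate> queue = new ConcurrentLinkedQueue<>();
    final Map<String, String> clusterState = new HashMap<>();

    // Applies all pending updates in FIFO order. A real loop would block and
    // wait for the next message instead of returning when the queue is empty.
    void drainPending() {
        StateUpdate update;
        while ((update = queue.poll()) != null) {
            clusterState.put(update.nodeName, update.state);
        }
    }
}
```

If node1 enqueues "active" and is then killed, nothing ever overwrites that entry, so clusterState keeps reporting node1 as active.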

Could we set up a wait time for each known node, and have the Overseer
declare a node INACTIVE if it does not hear from that node within the wait
time? This would be similar to the heartbeats used in several other systems.
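
A minimal sketch of what I have in mind, assuming each node reports in periodically. HeartbeatTracker, recordHeartbeat and findInactive are hypothetical names, not anything that exists in Solr, and timestamps are passed in explicitly just to keep the example easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, NOT Solr code: remembers when each node was last heard from.
final class HeartbeatTracker {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a node reports in (a state update or an explicit heartbeat).
    void recordHeartbeat(String nodeName, long nowMillis) {
        lastSeen.put(nodeName, nowMillis);
    }

    // Returns the nodes whose last report is older than the wait time.
    List<String> findInactive(long nowMillis) {
        List<String> inactive = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                inactive.add(entry.getKey());
            }
        }
        return inactive;
    }
}
```

The Overseer (or whatever owns the cluster state) could call findInactive on a timer and mark the returned nodes as INACTIVE. Choosing the wait time is the hard part: too short and a slow node gets flagged by mistake, too long and a dead node stays listed as active.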



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Programmatically-create-multiple-collections-tp3916927p3944327.html
Sent from the Solr - User mailing list archive at Nabble.com.
