On Mar 5, 2012, at 10:01 AM, dar...@ontrenet.com wrote:

> If one of those 10 indexing nodes goes down or falls out of sync and comes
> back, does ZK block the state of indexing until that single node catches
> back up?

No - if a node falls out of sync or comes back after going down, the rest of the cluster continues as normal and the node goes into recovery.

In recovery, the node tries two things to catch up. First it tries a peer sync: if it's off by fewer than 100 updates, it will simply exchange updates with the leader and come back into sync. If it's off by more than that, it will start buffering updates from the leader, replicate the full index from the leader, and then apply its buffered updates to come back in sync.

The only time indexing is stopped for a node is if that node loses its connection to ZooKeeper. All other nodes that can still talk to ZooKeeper will continue indexing. How soon we decide that we can't talk to ZooKeeper depends on the ZK session timeout - I have to look, but for an embedded ensemble, we may be defaulting this a little low currently.

- Mark Miller
lucidimagination.com
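
To make the recovery decision above concrete, here is a minimal Python sketch. It is not Solr's actual code: the names `PEER_SYNC_LIMIT`, `recovery_steps`, and `can_index` are hypothetical, and treating a gap of exactly 100 updates as "too far behind" is an assumption, since the message only says "less than 100" and "more than that".

```python
# Illustrative sketch only; names and structure are hypothetical, not Solr's API.

# Threshold taken from the message: fewer than 100 missed updates -> peer sync.
PEER_SYNC_LIMIT = 100


def recovery_steps(missed_updates, limit=PEER_SYNC_LIMIT):
    """Return the ordered steps a recovering node would take to catch up."""
    if missed_updates < 0:
        raise ValueError("missed_updates must be non-negative")
    if missed_updates < limit:
        # Small gap: exchange the missing updates with the leader.
        return ["peer_sync"]
    # Large gap (assumption: exactly `limit` also lands here):
    # buffer new updates, copy the full index, then replay the buffer.
    return [
        "buffer_updates_from_leader",
        "replicate_full_index_from_leader",
        "apply_buffered_updates",
    ]


def can_index(connected_to_zookeeper):
    """A node keeps indexing only while it can still talk to ZooKeeper."""
    return connected_to_zookeeper


if __name__ == "__main__":
    print(recovery_steps(42))
    print(recovery_steps(5000))
    print(can_index(False))
```

The point of the sketch is that the decision is local to the recovering node: the other nodes never appear in it, which is why the rest of the cluster keeps indexing.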