Re: SolrCloud and split-brain

Mark Miller Fri, 15 Jun 2012 10:33:44 -0700

Zookeeper avoids split brain using a quorum-based consensus protocol very much 
like Paxos (Zookeeper's own variant is called Zab, Zookeeper Atomic Broadcast).

So you will only ever see one Zookeeper cluster - any partition that cannot 
reach a majority of the Zookeeper nodes will be down. There is a proof for 
Paxos if I remember right.
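
As a rough illustration of the majority rule, here is a minimal Python sketch 
applied to the scenario from the question (3 Zookeeper nodes split 2 / 1). The 
function name and variables are made up for this example and are not part of 
Zookeeper or Solr.

```python
def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """Return True if `reachable` servers are a strict majority of the ensemble."""
    return reachable > ensemble_size // 2


# Scenario from the question: 3 Zookeeper nodes, split 2 / 1 by the dead switch.
ENSEMBLE_SIZE = 3

# Side with 3 Solr nodes can still reach 2 of the 3 Zookeeper nodes.
small_side_has_zk = has_quorum(2, ENSEMBLE_SIZE)

# Side with 7 Solr nodes can reach only 1 of the 3 Zookeeper nodes.
large_side_has_zk = has_quorum(1, ENSEMBLE_SIZE)

print("3-node side has a Zookeeper quorum:", small_side_has_zk)
print("7-node side has a Zookeeper quorum:", large_side_has_zk)
```

Two disjoint partitions can never both hold a strict majority, so at most one 
side keeps a working Zookeeper. Note that in this particular scenario it is the 
side with fewer Solr nodes that keeps it, because that side has 2 of the 3 
Zookeeper nodes.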

Zookeeper then acts as the system of record for Solr. Solr won't auto form its 
own new little clusters - *the* cluster is modeled in Zookeeper and that's the 
cluster. So Solr does not find itself organizing new mini clusters on 
partition splits.

When we lose our connection to Zookeeper, update requests are no longer 
accepted, because we may have a stale cluster view and not know it for a long 
period of time.
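
The behavior can be pictured with a small Python sketch. This is only a model 
of the idea, not Solr's actual code: the `Node` class, its methods, and 
`ZkUnavailableError` are all hypothetical names invented for the example.

```python
class ZkUnavailableError(Exception):
    """Raised when an update arrives while the Zookeeper connection is down."""


class Node:
    """Hypothetical stand-in for a Solr node; not a real Solr class."""

    def __init__(self) -> None:
        self.zk_connected = True   # assume we start with a live connection
        self.accepted = []         # documents accepted so far

    def on_zk_connection_change(self, connected: bool) -> None:
        # Called (in this model) whenever the Zookeeper connection state changes.
        self.zk_connected = connected

    def handle_update(self, doc: dict) -> None:
        # Without Zookeeper the cluster view may be stale, so refuse the update.
        if not self.zk_connected:
            raise ZkUnavailableError("no Zookeeper connection; cluster view may be stale")
        self.accepted.append(doc)


node = Node()
node.handle_update({"id": "1"})          # accepted while connected

node.on_zk_connection_change(False)      # connection to Zookeeper is lost
try:
    node.handle_update({"id": "2"})
    rejected = False
except ZkUnavailableError:
    rejected = True

print("second update rejected:", rejected)
```

The point of the design is that a node on the wrong side of a partition fails 
updates outright rather than applying them against a cluster state it can no 
longer verify.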

On Jun 15, 2012, at 12:49 PM, Otis Gospodnetic wrote:

> Hi,
> 
> How exactly does SolrCloud handle split brain situations?
> 
> Imagine a cluster of 10 nodes.
> Imagine 3 of them being connected to the network by some switch and imagine 
> the out port of this switch dies.
> When that happens, these 3 nodes will be disconnected from the other 7 nodes 
> and we'll have 2 clusters, one with 3 nodes and one with 7 nodes and we'll 
> have a split brain situation.  
> Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are 
> connected to the dead switch and are thus aware only of the 3 node cluster 
> now, and 1 ZK instance which is on a different switch and is thus aware only 
> of the 7 node cluster.
> 
> At this point how exactly does ZK make SolrCloud immune to split brain?
> 
> 
> Does LBHttpSolrServer play a key role here? (I see LBHttpSolrServer mentioned 
> only once on http://wiki.apache.org/solr/SolrCloud and with a question mark 
> next to it)
> 
> 
> Thanks,
> Otis
> ----
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm

- Mark Miller
lucidimagination.com
