Hi,
 
> Zookeeper avoids split brain using Paxos (or something very like it - I 
> can't remember if they extended it or modified and/or what they call it).
> 
> So you will only ever see one Zookeeper cluster - the smaller partition will 
> be down. There is a proof for Paxos if I remember right.
> 
> Zookeeper then acts as the system of record for Solr. Solr won't auto form 
> its own new little clusters - *the* cluster is modeled in Zookeeper and 
> that's the cluster. So Solr does not find itself organizing new mini 
> clusters on partition splits.
> 
> When we lose our connection to Zookeeper, update requests are no longer 
> accepted, because we may have a stale cluster view and not know it for a long 
> period of time.
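
(Side note, mostly to check my own understanding: since the cluster view lives 
in ZK, a client can read it directly. Below is only a rough sketch, assuming 
the /clusterstate.json znode that SolrCloud keeps its cluster model in and the 
plain ZooKeeper Java client; the connect string and host names are made up.)

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Rough sketch: read the shared cluster view straight from Zookeeper.
// Assumes the /clusterstate.json znode SolrCloud models the cluster in;
// adjust the connect string for your ensemble.
public class ReadClusterState {
    public static void main(String[] args) throws Exception {
        // The client only needs to reach a live member of the majority side
        // of the ensemble to get the authoritative view.
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 15000,
                new Watcher() {
                    public void process(WatchedEvent event) { /* ignore */ }
                });
        Stat stat = new Stat();
        byte[] data = zk.getData("/clusterstate.json", false, stat);
        System.out.println("cluster state (znode version " + stat.getVersion() + "):");
        System.out.println(new String(data, "UTF-8"));
        zk.close();
    }
}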


Does this work even when outside clients (apps for indexing or searching) send 
their requests directly to individual nodes?
Let's use the example from my earlier email, where we end up with 2 groups of 
nodes: a 7-node group with 1 ZK node on its network and a 3-node group with 2 
ZK nodes on its network.

If a client sends a request to a node in the 7-node group, what happens?
And if a client sends a request to a node in the 3-node group, what happens?
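
To make the question concrete, here is roughly what I mean by the two kinds of 
clients. This is only a sketch with 4.x-style SolrJ class names and 
hypothetical host names (solr3 in the 3-node group, plus the zk1..zk3 
ensemble from the example): one client talks straight to a single node, the 
other is the ZK-aware client that routes based on the Zookeeper view.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Sketch only - what "outside clients sending requests directly to nodes"
// looks like vs. a Zookeeper-aware client.
public class ClientSketch {
    public static void main(String[] args) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");

        // "Dumb" client: talks to exactly one node and knows nothing about ZK.
        // What happens to this add/query if solr3 ends up on the minority side?
        HttpSolrServer direct =
                new HttpSolrServer("http://solr3:8983/solr/collection1");
        direct.add(doc);
        direct.query(new SolrQuery("*:*"));

        // ZK-aware client: reads the cluster view from Zookeeper and should
        // only route to nodes the (majority) cluster state considers live.
        CloudSolrServer cloud = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        cloud.setDefaultCollection("collection1");
        cloud.add(doc);
        cloud.query(new SolrQuery("*:*"));
    }
}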

Yury wrote:
> A quorum of N/2+1 nodes is required to operate (that's also the reason you 
> need at least 3 to begin with)

N=3 (ZK nodes), right?
Taken literally, that means we need at least 3/2+1 => 2.5 ZK nodes to operate.  
So in my example neither the 7-node group nor the 3-node group will operate 
(does that mean requests get rejected, or something else?) because neither 
sees 2.5 ZK nodes?
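
(If that N/2+1 is meant as integer division, i.e. a strict majority, the 
numbers come out differently; here is a tiny sketch of both readings, just to 
pin down which one I'm asking about.)

// Tiny sketch of the two readings of "N/2+1" for a 3-node ZK ensemble.
public class QuorumMath {
    public static void main(String[] args) {
        int n = 3;                      // ZK ensemble size from the example

        double literal = n / 2.0 + 1;   // 2.5 - the reading in my question above
        int majority   = n / 2 + 1;     // 2   - integer division, strict majority

        System.out.println("literal N/2+1 = " + literal);
        System.out.println("integer N/2+1 = " + majority);

        // With the majority reading: the partition holding 2 of the 3 ZK
        // nodes still has a quorum (2 >= 2), the partition with 1 does not.
        System.out.println("2 ZK nodes quorate? " + (2 >= majority));
        System.out.println("1 ZK node quorate?  " + (1 >= majority));
    }
}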

Thanks,
Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 

> On Jun 15, 2012, at 12:49 PM, Otis Gospodnetic wrote:
> 
>>  Hi,
>> 
>>  How exactly does SolrCloud handle split brain situations?
>> 
>>  Imagine a cluster of 10 nodes.
>>  Imagine 3 of them being connected to the network by some switch and 
>>  imagine the out port of this switch dies.
>>  When that happens, these 3 nodes will be disconnected from the other 7 
>>  nodes and we'll have 2 clusters, one with 3 nodes and one with 7 nodes 
>>  and we'll have a split brain situation.
>>  Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are 
>>  connected to the dead switch and are thus aware only of the 3-node 
>>  cluster now, and 1 ZK instance which is on a different switch and is thus 
>>  aware only of the 7-node cluster.
>> 
>>  At this point how exactly does ZK make SolrCloud immune to split brain?
>> 
>> 
>>  Does LBHttpSolrServer play a key role here? (I see LBHttpSolrServer 
>>  mentioned only once on http://wiki.apache.org/solr/SolrCloud and with a 
>>  question mark next to it)
>> 
>> 
>>  Thanks,
>>  Otis
>>  ----
>>  Performance Monitoring for Solr / ElasticSearch / HBase - 
>>  http://sematext.com/spm
> 
> - Mark Miller
> lucidimagination.com
>
