On Jun 15, 2012, at 3:21 PM, Otis Gospodnetic wrote:

> Thanks Mark, will open an issue in a bit.
> 
> But I think the following is the real meat of the Q about split brain and 
> SolrCloud, especially when it comes to how indexing is handled during split 
> brain:
> 
>>> Does this work even when outside clients (apps for indexing or searching)
>>> send their requests directly to individual nodes?
>>> Let's use the example from my email where we end up with 2 groups of
>>> nodes: 7-node group with 2 ZK nodes on the same network and 3-node group
>>> with 1 ZK node on the same network.
>>
>> The 3-node group with 1 ZK would not have a functioning zk - so it would
>> stop accepting updates. If it could serve a complete view of the index, it
>> would though, for searches.
> 
> So in this case information in this 1 ZK node would tell the 3 Solr nodes 
> whether they have all index data or if some shards are missing (i.e. were 
> only on nodes in the other 7-node group)?
> And if nodes figure out they don't have all index data they will reject 
> search requests?  Or will they accept and perform searches, but return 
> responses that tell the client that the searched index was not complete?

The 1 ZK node will not function on its own (one node out of three cannot form 
a quorum), so the 3 Solr nodes will not accept updates.

If at least one replica of each shard is available, search will still work. I 
don't think partial results have been committed yet for distributed search. 
When results are partial, we will put something in the header to indicate that 
a full copy of the index was not available. I think we can also add something 
in the header when we know we cannot talk to ZooKeeper, to let the client know 
it could be seeing stale state. SmartClients that talk to ZooKeeper would see 
those nodes appear as down in ZooKeeper and stop trying to talk to them.
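
As a rough illustration of the two conditions above, here is a minimal Python 
sketch. It is not Solr or ZooKeeper code: the function names, the shard names 
and the replica counts are all made up for the example, and it only models the 
majority rule for the ZK ensemble and the "one replica per shard" rule for 
search.

```python
def has_quorum(reachable_zk, ensemble_size):
    """A ZooKeeper ensemble works only if a strict majority is reachable."""
    return reachable_zk > ensemble_size // 2


def can_serve_full_search(all_shards, live_replicas_by_shard):
    """Search covers the whole index only if every shard has a live replica."""
    return all(live_replicas_by_shard.get(shard, 0) >= 1 for shard in all_shards)


# The split from this thread: 3 ZK nodes, divided 2 / 1.
print(has_quorum(2, 3))  # 7-node group: True
print(has_quorum(1, 3))  # 3-node group: False

# Hypothetical shard layout, purely for illustration.
all_shards = ["shard1", "shard2"]
print(can_serve_full_search(all_shards, {"shard1": 1, "shard2": 1}))  # True
print(can_serve_full_search(all_shards, {"shard1": 2}))               # False
```

So the 3-node group can lose ZK and still answer searches in full, but only if 
it happens to hold a replica of every shard.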

> 
>> The 7-node group would have a working ZK it could talk to, and it would
>> continue to accept updates as long as a node for a shard for that hash
>> range is up. It would also of course serve searches.
> 
> Right, so if the node for the shard where a doc is supposed to go to is in 
> that 3-node group, then the indexing request will be rejected.  Is this 
> correct? 

It depends on what is available. Your partition needs at least one live 
replica of each shard, i.e. one full copy of the index. An update is rejected 
if no node in the partition hosts the shard for that document's hash range. So 
if a replica of that shard made it into the larger partition, you will be 
fine: it will become the leader.
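
A minimal sketch of that rule in Python, under the assumption that an update 
is decided per document: the partition needs a working ZK, and the shard that 
owns the document's hash range needs at least one live replica in the 
partition. The function name, shard names and replica counts are hypothetical 
and not taken from Solr.

```python
def accepts_update(zk_quorum, target_shard, live_replicas_by_shard):
    """An update is accepted only with a working ZK and a live replica
    of the shard that owns the document's hash range."""
    return zk_quorum and live_replicas_by_shard.get(target_shard, 0) >= 1


# Hypothetical layout of the larger (7-node) partition, which has ZK quorum.
larger_partition = {"shard1": 2, "shard2": 1}

print(accepts_update(True, "shard2", larger_partition))   # True: a replica made it in
print(accepts_update(True, "shard3", larger_partition))   # False: no node hosts shard3
print(accepts_update(False, "shard1", larger_partition))  # False: no working ZK
```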

> 
> 
> 
> Otis 
> ----
> Performance Monitoring for Solr / ElasticSearch / HBase - 
> http://sematext.com/spm 
> 
> 
> 
> ----- Original Message -----
>> From: Mark Miller <markrmil...@gmail.com>
>> To: solr-user <solr-user@lucene.apache.org>
>> Cc: 
>> Sent: Friday, June 15, 2012 2:22 PM
>> Subject: Re: SolrCloud and split-brain
>> 
>> 
>> On Jun 15, 2012, at 2:12 PM, Otis Gospodnetic wrote:
>> 
>>> Makes sense.  Do responses carry something to alert the client that
>>> "something is rotten in the state of cluster"?
>> 
>> No, I don't think so - we should probably add that to the header similar to 
>> how I assume partial results will work.
>> 
>> Feel free to fire up a JIRA issue for that.
>> 
>> - Mark Miller
>> lucidimagination.com
>> 

- Mark Miller
lucidimagination.com










