Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-30 Thread Erick Erickson
But there's still the latency issue. Draw a diagram of all the communications that have to go on to do an update and it's a _lot_ of arrows going across DCs. My suspicion is that it'll be much easier to just treat the separate DCs as separate clusters that don't know about each other. That is, you

Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-23 Thread Upayavira
The way Zookeeper is set up, requiring 'quorum' is aimed at avoiding 'split brain', where two halves of your cluster start to operate independently. This means that you *have* to favour one half of your cluster over the other in the case that they cannot communicate with each other. For example, i
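The split-brain argument above is just majority arithmetic, and can be sketched in a few lines of Python (an illustration, not ZooKeeper's actual leader-election code): a partition splits the ensemble into two disjoint groups, and at most one group can hold a strict majority, so both halves can never operate independently.

```python
def has_quorum(group_size: int, ensemble_size: int) -> bool:
    """A group may operate only if it holds a strict majority of the ensemble."""
    return group_size > ensemble_size // 2

# For any way a 5-node ensemble is split into two disjoint groups,
# the two sides can never both have quorum -- hence no split brain.
ensemble = 5
for left in range(ensemble + 1):
    right = ensemble - left
    assert not (has_quorum(left, ensemble) and has_quorum(right, ensemble))
```

This is also why one half must be "favoured": with 5 nodes, a 3/2 partition leaves only the 3-node side operational, and a clean 2/2 split of a 4-node ensemble leaves neither side operational.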

Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-23 Thread Daniel Collins
This is exactly the problem we are encountering as well: how to deal with the ZK quorum when we have multiple DCs. Our index is spread so that each DC has a complete copy and *should* be able to survive on its own, but how to arrange ZK to deal with that? The problem with quorum is we need an odd

Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-22 Thread Timothy Potter
For the ZK quorum issue, we'll put nodes in 3 different AZs so we can lose 1 AZ and still establish quorum with the other 2.
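The arithmetic behind this placement can be checked directly (the AZ names below are hypothetical examples, not from the original post): with one ZK node in each of three AZs, losing any single AZ still leaves 2 of 3 nodes, which is a majority.

```python
# One ZooKeeper node per availability zone (AZ names are illustrative).
zk_nodes_per_az = {"us-east-1a": 1, "us-east-1b": 1, "us-east-1c": 1}
total = sum(zk_nodes_per_az.values())   # 3-node ensemble

for lost_az in zk_nodes_per_az:
    surviving = total - zk_nodes_per_az[lost_az]
    # 2 of 3 remain: still a strict majority, so quorum holds.
    assert surviving > total // 2
```

Note this only covers losing one AZ at a time; losing two AZs (or a partition that isolates two of the three) still costs you quorum.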

Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-22 Thread Timothy Potter
Hi Markus, Thanks for the insight. There's a pretty high cost to using the approach you suggest, in that I'd have to double my node count, which won't make my acct'ing dept. very happy. As for cross-AZ latency, I'm already running my cluster with nodes in 3 different AZs and our distributed query

Re: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-22 Thread Erick Erickson
Aside from the latency, how would you deal with the Zookeeper quorum? Say DC1 had ZK1 and ZK2, and DC2 had ZK3. Now anytime any server in DC2 can't talk to DC1, there is no Zookeeper quorum. So if DC1 goes down, having nodes in DC2 doesn't do you any good since there's no ZK quorum. I guess things
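The 2+1 layout described above can be written out explicitly (a sketch of the scenario in the email, not real ZooKeeper behaviour code): whichever DC you give the majority to is the only one that can survive alone.

```python
# DC1 hosts ZK1 and ZK2; DC2 hosts ZK3 (the scenario from the email).
ensemble = {"ZK1": "DC1", "ZK2": "DC1", "ZK3": "DC2"}
majority = len(ensemble) // 2 + 1       # 2 of 3 nodes required

def surviving(down_dc: str) -> list:
    """ZK nodes still reachable after an entire DC goes down."""
    return [zk for zk, dc in ensemble.items() if dc != down_dc]

assert len(surviving("DC2")) >= majority   # DC1 alone keeps quorum
assert len(surviving("DC1")) < majority    # DC2 alone: 1 of 3, no quorum
```

With only two DCs there is no split of an odd-sized ensemble that lets *either* DC survive on its own, which is why a third location (or treating the DCs as independent clusters) keeps coming up in this thread.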

RE: Manually assigning shard leader and replicas during initial setup on EC2

2013-01-22 Thread Markus Jelsma
Hi, Regarding availability: since SolrCloud is not DC-aware at this moment, we 'solve' the problem by simply operating multiple identical clusters in different DCs and sending updates to them all. This works quite well but it requires some manual intervention if a DC is down due to a prolonged DOS
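The dual-write approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-cluster `send` callables stand in for whatever HTTP transport you use to post updates to each SolrCloud, and are not part of any Solr client API. Tracking per-cluster success is what makes the "manual intervention" possible, since you know which DC missed which updates.

```python
from typing import Callable, Dict

def dual_write(doc: dict, senders: Dict[str, Callable[[dict], bool]]) -> Dict[str, bool]:
    """Send the same document to every independent cluster.

    Returns per-cluster success so a DC that was down can be
    replayed/repaired manually later.
    """
    results = {}
    for name, send in senders.items():
        try:
            results[name] = send(doc)
        except Exception:
            results[name] = False    # e.g. the whole DC is unreachable
    return results

# Usage with stub transports standing in for HTTP posts to each DC:
ok = lambda doc: True
down = lambda doc: (_ for _ in ()).throw(ConnectionError("DC unreachable"))
print(dual_write({"id": "1"}, {"dc1": ok, "dc2": down}))  # {'dc1': True, 'dc2': False}
```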