For the ZooKeeper quorum issue, we'll put the ZooKeeper nodes in 3 different AZs so we can lose any one AZ and still establish quorum with the nodes in the other 2.
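For reference, here's a minimal sketch of what that ensemble's zoo.cfg might look like (the hostnames are hypothetical). With one server per AZ, losing any single AZ still leaves 2 of the 3 servers, which is enough for a majority:

  # zoo.cfg shared by all three ZooKeeper nodes (hostnames are made up)
  tickTime=2000
  initLimit=10
  syncLimit=5
  dataDir=/var/lib/zookeeper
  clientPort=2181
  # one server per availability zone; each node also needs a matching
  # myid file (containing 1, 2 or 3) in its dataDir
  server.1=zk-us-east-1a.example.com:2888:3888
  server.2=zk-us-east-1b.example.com:2888:3888
  server.3=zk-us-east-1c.example.com:2888:3888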
On Tue, Jan 22, 2013 at 10:44 PM, Timothy Potter <thelabd...@gmail.com> wrote:

> Hi Markus,
>
> Thanks for the insight. There's a pretty high cost to using the approach
> you suggest in that I'd have to double my node count, which won't make my
> accounting dept. very happy.
>
> As for cross-AZ latency, I'm already running my cluster with nodes in 3
> different AZs and our distributed query performance is acceptable for us.
> Our AZs are in the same region.
>
> However, I'm not sure I understand your point about Solr modifying
> clusterstate.json when a node goes down. From what I understand, it will
> assign a new shard leader, but in my case that's expected and doesn't seem
> to cause an issue. The new shard leader will be the previous replica from
> the other AZ, but that's OK. In this case, the cluster is still functional.
> In other words, from my understanding, Solr is not going to change shard
> assignments on the nodes; it's just going to select a new leader, which in
> my case is in another AZ.
>
> Lastly, Erick raises a good point about ZooKeeper and cross-AZ quorum. I
> don't have a good answer to that issue but will post back if I come up
> with something.
>
> Cheers,
> Tim
>
> On Tue, Jan 22, 2013 at 3:11 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
>> Hi,
>>
>> Regarding availability: since SolrCloud is not DC-aware at this moment, we
>> 'solve' the problem by simply operating multiple identical clusters in
>> different DCs and sending updates to them all. This works quite well, but
>> it requires some manual intervention if a DC is down due to a prolonged
>> DoS attack or a network or power failure.
>>
>> I don't think it's a very good idea to change clusterstate.json because
>> Solr will modify it when, for example, a node goes down. Your preconfigured
>> state doesn't exist anymore. It's also a bad idea because distributed
>> queries are going to be sent to remote locations, adding a lot of latency.
>> Again, because it's not DC-aware.
>>
>> Any good solution to this problem should be in Solr itself.
>>
>> Cheers,
>>
>>
>> -----Original message-----
>> > From: Timothy Potter <thelabd...@gmail.com>
>> > Sent: Tue 22-Jan-2013 22:46
>> > To: solr-user@lucene.apache.org
>> > Subject: Manually assigning shard leader and replicas during initial
>> > setup on EC2
>> >
>> > Hi,
>> >
>> > I want to split my existing Solr 4 cluster across 2 different
>> > availability zones in EC2, i.e. have my initial leaders in one zone and
>> > their replicas in the other AZ. My thinking here is that if one zone
>> > goes down, my cluster stays online. This is the recommendation of the
>> > Amazon EC2 docs.
>> >
>> > My plan is to just cook up a clusterstate.json file to manually set my
>> > desired shard/replica assignments to specific nodes, after which I can
>> > update the clusterstate.json file in ZooKeeper and then bring the nodes
>> > online.
>> >
>> > The other thing to mention is that I have existing indexes that need to
>> > be preserved, as I don't want to re-index. For this I'm planning to just
>> > move the data directories to where they need to be based on my changes
>> > to clusterstate.json.
>> >
>> > Does this sound reasonable? Any pitfalls I should look out for?
>> >
>> > Thanks,
>> > Tim
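For anyone trying the hand-edited clusterstate.json approach discussed above, a minimal sketch of one shard with its leader and replica pinned to nodes in two different AZs might look roughly like the following; the exact fields vary between 4.x releases, and the IP addresses, node names and core names below are made up:

  {"collection1": {
      "shards": {
        "shard1": {
          "range": "80000000-ffffffff",
          "state": "active",
          "replicas": {
            "10.0.1.10:8983_solr_collection1_shard1_replica1": {
              "state": "active",
              "core": "collection1_shard1_replica1",
              "node_name": "10.0.1.10:8983_solr",
              "base_url": "http://10.0.1.10:8983/solr",
              "leader": "true"},
            "10.0.2.10:8983_solr_collection1_shard1_replica2": {
              "state": "active",
              "core": "collection1_shard1_replica2",
              "node_name": "10.0.2.10:8983_solr",
              "base_url": "http://10.0.2.10:8983/solr"}}}}}}

The file could then be written to the /clusterstate.json znode in ZooKeeper (for example with the stock zkCli.sh "set" command) before starting the Solr nodes, keeping in mind Markus's caveat that Solr will overwrite parts of it as nodes come and go.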