On 6/24/2017 2:14 AM, Arcadius Ahouansou wrote:
> Interpretation 1:
>
> - On slide 6 and 7: Only 2 DC used, so the ZK quorum will not survive and
> recover after 1 DC failure
>
> - On slide 8: We have 3 DCs which OK for ZK.
> But we have 6 ZK nodes.
> This is a problem because ZK likes 3, 5, 7 ... odd nodes.

On both slide 6 and slide 7, Solr stays completely operational in DC1 if DC2 goes down. It all falls apart if DC1 goes down. For clients that can still reach them, the remaining Solr servers are read only in that situation.

Slide 8 is very similar -- if DC1 goes down, Solr is read only. If either DC2 or DC3 goes down, everything is fine for clients that can still get to Solr. One additional consideration: If both DC2 and DC3 go down, then the remaining Solr servers in DC1 are read only.

ZooKeeper doesn't *need* an odd number of servers, but there's no benefit to an even number, because it needs a majority of the ensemble to be running. If you have 5 servers, two can go down. If you have 6 servers, a majority is 4, so you can still only lose two, and you might as well just run 5. You'd have fewer possible points of failure, less power usage, and less bandwidth usage.

The best minimal setup is an odd number of data centers, at least 3, with one ZooKeeper server in each location. For Solr, you want at least two servers, which should be split evenly between at least two of those data center locations.

If you're really stuck with only two data centers, then you can follow the advice in the presentation: Set up a full cloud in each data center and use CDCR between them.

> Interpretation 2:
>
> Any SolrCloud deployment with "Remote SolrCloud nodes" i.e. solrCloud not in
> same DC as ZK is deemed an anti-pattern (note that DCs can be just a couple
> of miles apart and could be connected by high speed network)

I'm not sure that this is actually true, but it does introduce latency and more moving parts in the form of network connections between data centers -- connections which might go down. I wouldn't do it, but I also wouldn't automatically dismiss it as a viable setup, as long as it meets ZooKeeper's requirements and there are two complete copies of the Solr collections, each in different data centers.

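If it helps to see the ZooKeeper majority arithmetic from earlier in this message spelled out, here is a rough Python sketch. It is only an illustration: the function names are made up for this example, and the per-data-center server counts in the layouts are assumptions of mine, not the actual numbers from the slides.

```python
from typing import Iterable, Mapping


def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that is a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


def survives(zk_per_dc: Mapping[str, int], down: Iterable[str]) -> bool:
    """True if the servers outside the failed data centers still form a majority."""
    failed = set(down)
    total = sum(zk_per_dc.values())
    alive = sum(count for dc, count in zk_per_dc.items() if dc not in failed)
    return alive >= quorum_size(total)


if __name__ == "__main__":
    # 5 and 6 servers tolerate the same number of failures.
    for n in (3, 4, 5, 6, 7):
        print(n, "servers tolerate", tolerated_failures(n), "failures")

    # Hypothetical layouts (the counts per data center are assumptions).
    two_dcs = {"DC1": 2, "DC2": 1}
    three_dcs = {"DC1": 1, "DC2": 1, "DC3": 1}

    print("two DCs, DC2 down:", survives(two_dcs, ["DC2"]))
    print("two DCs, DC1 down:", survives(two_dcs, ["DC1"]))
    print("three DCs, any one down:", survives(three_dcs, ["DC1"]))
    print("three DCs, two down:", survives(three_dcs, ["DC2", "DC3"]))
```

With only two data centers, one of them always holds the majority, so losing that one loses the quorum no matter how the servers are divided. That is why the odd number of locations matters more than the odd number of servers.
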
Typical designs only stay viable if one data center goes down, but if you were to use five data centers and have enough Solr servers for three complete copies of your collections, you could survive two data center outages.

Thanks,
Shawn