Your last comment really answered it. A ZK quorum is explicitly ((num zk instances)/2) + 1, using integer division, so with three ZK instances you keep quorum as long as two are up.
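For what it's worth, here's a quick sketch of that arithmetic in Python. The function names are just mine for illustration, they're not anything from ZooKeeper itself:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble: floor(n / 2) + 1."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one instance")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many instances can be down while a quorum is still present."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} ZK instances: quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

With three instances the quorum is two, so you can lose one. Note that four instances need a quorum of three, so they still only tolerate one failure, which is why odd-sized ensembles are the usual choice.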
So no, you don't need 6 nodes at all. It's perfectly reasonable to run a Solr instance on each of three nodes and a ZK instance (not embedded) on those same three nodes.

I think you're over-thinking the problem though. How often does a machine fail? If it's more often than once in a blue moon, you have _other_ problems.

The sole caution is that when running _embedded_ Zookeeper with Solr (as opposed to stand-alone, even if on the same node), you'll be bringing the Solr instance up and down repeatedly when developing your app. Trust me on this ;). Having Zookeeper embedded just makes it more likely that ZK falls below quorum. Not to mention that you'll want to upgrade sometime or...

That said, I know of production environments where all the Zookeepers are embedded in Solr, no independent ZKs at all.

And in large installations, as Shawn mentions, you probably don't want the additional load on the Solr nodes.

Finally, the ZK nodes don't have much in the way of CPU load. So one popular option is to put the ZK instances on lightweight nodes that you happen to have lying around anyway and reserve the bigger iron for Solr.

Best,
Erick

On Tue, Apr 28, 2015 at 10:20 AM, shacky <shack...@gmail.com> wrote:
>> Yeah, it took me a few tries to get it all straight in my head.
>
> Thanks Erick for your fast answer!
>
>> The only "problem" with running ZK on the same node as Solr is that if the
>> node goes down, it takes _both_ zookeeper and Solr with it. If running
>> the "embedded zookeeper", then you can't even bounce the Solr server without
>> taking down the ZK node. Solr will run fine even with embedded ZK,
>> you just have to be very careful when you take the node up or down.
>
> Yes, but what happens when a Zookeeper node goes down if I have three nodes?
> As a Solr node could go down, even a Zookeeper one could go down, so
> this _needs_ to be an expected issue in a highly available
> infrastructure, doesn't it?
>
>> Bottom line: It's just easier, from an administrative standpoint, to
>> run Zookeeper as an external process. That way, you can freely bounce
>> your Solr nodes without falling below quorum. Whether or not it shares
>> the same machine as a running instance of Solr is up to you.
>
>> You absolutely _do_ want to
>> 1> have at least one replica for each and every shard on a different box
>> 2> have each Zookeeper running on a separate box.
>
> But doing so I need 6 nodes, am I wrong?
>
>> That way, if any single box dies you have a complete collection available and
>> a quorum of ZK nodes present. How many more machines you have and
>> how you distribute your collections amongst them is up to you.
>
> If I have three ZK nodes I will have the quorum even with two
> available nodes, right?