Thanks for the explanations. My idea about 4 zookeepers is a result of having the same software (java, zookeeper, solr, ...) installed on all 4 servers. But yes, I don't need to start a zookeeper on the 4th server.
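Just to make sure I get the ensemble part right: the zoo.cfg on server_1 to
server_3 would then look roughly like this? (The paths and ports below are
only my assumptions / the usual defaults, not taken from any of your mails.)

  tickTime=2000
  initLimit=10
  syncLimit=5
  dataDir=/var/lib/zookeeper
  clientPort=2181
  # ensemble members - server_4 is left out on purpose
  server.1=server_1:2888:3888
  server.2=server_2:2888:3888
  server.3=server_3:2888:3888

Plus a myid file in dataDir (containing 1, 2 or 3) on each of the three
servers, and server_4 simply never starts a zookeeper.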
3 other machines outside the cloud for ZK seems a bit oversized. And you have
another point of failure with the network between ZK and the cloud. If one of
the cloud servers ends up in smoke, the ZK ensemble should still work with ZK
and the cloud on the same servers.

So the offline argument means: the first thing I start is ZK and the last
thing I shut down is ZK. Good point.

While moving from master-slave to cloud I'm aware of the fact that all shards
have to be connected to ZK. But how can I tell ZK that on server_1 there is
the leader of shard_1 AND a replica of shard_4? (I've put a rough sketch of
what I have in mind as a PS below the quoted mails.) Unfortunately the
"Getting Started with SolrCloud" page is a bit short on this.

Regards
Bernd

On 28.10.2014 at 09:15, Daniel Collins wrote:
> As Michael says, you really want an odd number of zookeepers in order to
> meet the quorum requirements (which based on your comments you seem to be
> aware of). There is nothing "wrong" with 4 ZKs as such, just that it
> doesn't buy you anything above having 3, so it's one more thing that might
> go wrong and cause you problems. In your case, I would suggest you just
> pick the first 3 machines to run ZK or even have 3 other machines "outside"
> the cloud to house ZK.
>
> The offline argument is also a good one, you really want your ZK instances
> to be longer lived than Solr; whilst you can restart individual cores
> within a Solr instance, it is often (at least for us) more convenient to
> bounce the whole java instance. In that scenario (again just re-iterating
> what Michael said), you don't want ZK to be down at the same time.
>
> If you are using SolrCloud, then all your replicas need to be connected to
> ZK; you can't have the master instances in ZK and the replicas not
> connected (that's more of the old master-slave replication system which is
> still available but orthogonal to Cloud).
>
>
> On 28 October 2014 07:01, Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
> wrote:
>
>> Yes, garbage collection is a very good argument to have external
>> zookeepers. I haven't thought about that.
>> But does this also mean a separate server for each zookeeper or
>> can they live side by side with Solr on the same server?
>>
>>
>> What is the problem with 4 zookeepers besides that I have no real
>> gain over 3 zookeepers (only 1 can fail)?
>>
>>
>> Regards
>> Bernd
>>
>>
>> On 27.10.2014 at 15:41, Michael Della Bitta wrote:
>>> You want external zookeepers. Partially because you don't want your
>>> Solr garbage collections holding up zookeeper availability,
>>> but also because you don't want your zookeepers going offline if
>>> you have to restart Solr for some reason.
>>>
>>> Also, you want 3 or 5 zookeepers, not 4 or 8.
>>>
>>> On 10/27/14 10:35, Bernd Fehling wrote:
>>>> While starting now with SolrCloud I tried to understand the sense
>>>> of an external zookeeper.
>>>>
>>>> Let's assume I want to split 1 huge collection across 4 servers.
>>>> My straightforward idea is to set up a cloud with 4 shards (one
>>>> on each server) and also have a replica of each shard on another
>>>> server.
>>>> server_1: shard_1, shard_replication_4
>>>> server_2: shard_2, shard_replication_1
>>>> server_3: shard_3, shard_replication_2
>>>> server_4: shard_4, shard_replication_3
>>>>
>>>> In this configuration I always have all 4 shards available if
>>>> one server fails.
>>>>
>>>> But now to zookeeper. I would start the internal zookeeper for
>>>> all shards including replicas. Does this make sense?
>>>>
>>>>
>>>> Or I only start the internal zookeeper for shards 1 to 4 but not
>>>> the replicas.
>>>> Should be good enough, one server can fail, or not?
>>>>
>>>>
>>>> Or I follow the recommendations and install on all 4 servers
>>>> an external separate zookeeper, but what is the advantage over
>>>> having the internal zookeeper on each server?
>>>>
>>>>
>>>> I really don't get it at this point. Can anyone help me here?
>>>>
>>>> Regards
>>>> Bernd
>>>
>>
>
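PS: Regarding my question above about server_1 holding the leader of shard_1
plus a replica of shard_4: as far as I understand it now, I don't tell ZK
about leaders at all - the leader is elected among the replicas of a shard,
and I only control on which nodes the replicas live. My rough plan would be
something like this (collection name "mycollection", config name "myconf",
ports, paths and node names are just placeholders; please correct me if this
is wrong):

  # start every Solr instance against the external ensemble
  java -DzkHost=server_1:2181,server_2:2181,server_3:2181 -jar start.jar

  # upload the config set once (zkcli.sh from the Solr distribution)
  ./zkcli.sh -zkhost server_1:2181,server_2:2181,server_3:2181 \
    -cmd upconfig -confdir ./conf -confname myconf

  # create the collection with one shard per server, no replicas yet
  curl "http://server_1:8983/solr/admin/collections?action=CREATE&name=mycollection&collection.configName=myconf&numShards=4&replicationFactor=1&maxShardsPerNode=2"

  # then place the replicas explicitly, e.g. replica of shard1 on server_2
  # and replica of shard4 on server_1 (likewise shard2 -> server_3 and
  # shard3 -> server_4)
  curl "http://server_1:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1&node=server_2:8983_solr"
  curl "http://server_1:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard4&node=server_1:8983_solr"

I assume I still have to check in the cloud admin UI (or under /live_nodes
in ZK) where the leaders actually ended up and what the exact node names
are, and adjust the node parameters accordingly.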