Hi Varun,

Running many Zookeeper instances improves read time but has a negative impact on writing state to Zookeeper. Having a node talk only to its local Zookeeper instance limits availability: your Zookeeper daemon will die at some point, and that will cut your Solr node off from the entire cluster. Running so many Zookeeper daemons is also a waste of resources, mostly RAM, which you should use for Solr's mmapped files.
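To put rough numbers on the write cost: Zookeeper commits a write only after a majority of the voting servers has acknowledged it. The sketch below is only an illustration, assuming the default majority quorum and that every server in the ensemble is a voting member (no observers); the function names are made up for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest majority of an ensemble of voting servers."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare small ensembles with the proposed 200-server ensemble.
    for n in (3, 4, 5, 6, 7, 200):
        print(
            f"{n:>3} servers: a write needs {quorum_size(n):>3} acks, "
            f"ensemble survives {tolerated_failures(n):>2} failures"
        )
```

Under these assumptions a 200-server ensemble waits on 101 acknowledgements for every write, and 3 and 4 servers both survive only a single failure.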
As a minimum you must run three Zookeeper daemons in the network, but never an even number: an even-sized ensemble tolerates no more failures than the next smaller odd one, so the extra daemon does nothing for the quorum that Zookeeper needs. In your case I would start with five or seven daemons spread across the network, not sharing virtual machines and, if possible, not sharing switches.

Cheers,
Markus

-----Original message-----
> From:varun srivastava <varunmail...@gmail.com>
> Sent: Mon 01-Oct-2012 23:56
> To: solr-user@lucene.apache.org
> Subject: Re: Zookeeper setup for solr cloud
>
> Hi,
> Rephrasing my question ... Let me know if anyone feel some problem with
> following deployment of solrcloud
>
> 1) Have 200 solrcloud nodes ( serv1, serv2, .. serv200) with each machine
> having both zookeeper and solr both.
> 2) zookeeper config contain the list of all servers
>
> server.1=serv1:2888:3888
> server.2=serv2:2888:3888
>
> ...
> server.200=serv200:2888:3888
>
>
> 3) Each solrconfig only talks to localhost zookeeper -
>
> -DzkHost=localhost:9983
>
>
> Thanks
> Varun
>
>
>
> On Sun, Sep 30, 2012 at 4:51 PM, Lance Norskog <goks...@gmail.com> wrote:
>
> > You can find Solr information with this:
> > http://find.searchhub.org/?q=zookeeper+cluster
> >
> > http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrCloud
> >
> >
> > ----- Original Message -----
> > | From: "varun srivastava" <varunmail...@gmail.com>
> > | To: solr-user@lucene.apache.org
> > | Sent: Saturday, September 29, 2012 9:38:16 PM
> > | Subject: Zookeeper setup for solr cloud
> > |
> > | Hi,
> > | I would like to get recommendation on zookeeper ensemble
> > | architecture. I
> > | am thinking of following options, please let me know if I am correct
> > | in
> > | pros and con of each option. Also please feel free to add
> > | differentiating
> > | points I am missing.
> > |
> > | 1) Have separate boxes for zookeeper ensemble and all the solrcloud
> > | instances access it on runtime.
> > | Pros: Small set of zookeeper instances to maintain. May be sync up
> > | between zookeeper boxes will be fast and reliable.
> > |
> > | 2) Let each solr box have zookeeper instance also. Each solr instance
> > | accessing the localhost zookeeper.
> > | Pros: solr will not incur over the wire cost at runtime, hence
> > | should be
> > | fast. More fault tolerant as solr not going over the wire to access
> > | zookeeper.
> > | Con: Lots of zookeeper instances and hence may be slow to update.
> > |
> > |
> > | Thanks
> > | Varun
> > |
> >
>