Running ZK on all the cloud servers makes it very, very hard to add a new Solr node. You have to reconfigure every ZK server to do that.
Manage the ZK cluster and the Solr cluster separately. I'm not sure it is worth configuring Solr Cloud if you are only going to run two servers. Instead, run one server as live, and use simple replication to the second as a hot backup. If you need four or more Solr servers and you need NRT, run Solr Cloud.

wunder

On Jun 1, 2013, at 1:55 AM, Daniel Collins wrote:

> Document updates will fail with less than the quorum of ZKs, so you won't be
> able to index anything when 1 server is down.
>
> It's the one area that always seems counterintuitive (to me at any rate).
> After all, you have your 2 instances on 1 server, so you have all the shard
> data, and logically you should be able to index just using that (and if you had a
> single ZK running on that server it would indeed be fine)... However, ZK
> needs a 3rd instance running somewhere in order to maintain its majority rule.
>
> The consensus I've seen tends to be to run a ZK on all your cloud servers, and
> then run some "outside" the cloud on other machines. If you had a 3rd VM
> that just ran ZK and nothing else, you could lose any 1 of the 3 machines and
> still be ok. But if you lose 2 you are in trouble.
>
> -----Original Message-----
> From: James Dulin
> Sent: Friday, May 31, 2013 10:28 PM
> To: solr-user@lucene.apache.org
> Subject: RE: 2 VM setup for SOLRCLOUD?
>
> Thanks. When you say updates will fail, do you mean document updates will
> fail, or updates to the cluster, like adding a new node? If adding new data
> will fail, I will definitely need to figure out a different way to set this
> up.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Friday, May 31, 2013 4:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: 2 VM setup for SOLRCLOUD?
>
> Be really careful here. Zookeeper requires a quorum, which is ((zk
> nodes)/2) + 1. So the problem here is that if (zk nodes) is 2, both of them
> need to be up. If either of them is down, searches will still work, but
> updates will fail.
>
> Best
> Erick
>
> On Fri, May 31, 2013 at 11:39 AM, James Dulin <jdu...@crelate.com> wrote:
>>
>> Thanks, I think that the load balancer will be simple enough to set up in
>> Azure. My only other current concern is having the zookeepers on the same
>> VMs as Solr. While not ideal, we basically just need simple redundancy, so
>> my theory is that if VM1 goes down, VM 2 will have the shard, node, and
>> zookeeper to keep everything going smoothly.
>>
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Friday, May 31, 2013 8:07 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: 2 VM setup for SOLRCLOUD?
>>
>> Actually, you don't technically _need_ a load balancer, you could hard code
>> all requests to the same node and internally, everything would "just work".
>> But then you'd be _creating_ a single point of failure if that node went
>> down, so a fronting LB is usually indicated.
>>
>> Perhaps the thing you're missing is that Zookeeper is there explicitly for
>> the purpose of knowing where all the nodes are and what their state is. Solr
>> communicates with ZK and any incoming requests (update or query) are handled
>> appropriately, thus Jason's comment that once a request gets to any node in
>> the cluster, things are handled automatically.
>>
>> All that said, if you're using SolrJ and use CloudSolrServer exclusively,
>> then the load balancer isn't necessary. Internally CloudSolrServer (the
>> client) reads the list of accessible nodes from Zookeeper and will be fault
>> tolerant and load balance internally.
>>
>> Best
>> Erick
>>
>> On Thu, May 30, 2013 at 3:51 PM, Jason Hellman
>> <jhell...@innoventsolutions.com> wrote:
>>> Jamey,
>>>
>>> You will need a load balancer on the front end to direct traffic into one
>>> of your SolrCore entry points. It doesn't matter, technically, which one,
>>> though you will find benefits to narrowing traffic to fewer (for purposes
>>> of better cache management).
>>>
>>> Internally SolrCloud will round-robin distribute requests to other shards
>>> once a query begins execution. But you do need an entry point externally
>>> to be defined through your load balancer.
>>>
>>> Hope this is useful!
>>>
>>> Jason
>>>
>>> On May 30, 2013, at 12:48 PM, James Dulin <jdu...@crelate.com> wrote:
>>>
>>>> Working to set up SolrCloud in Windows Azure. I have read over the
>>>> Solr Cloud wiki, but am a little confused about some of the
>>>> deployment options. I am attaching an image for what I am thinking
>>>> we want to do. 2 VMs that will have 2 shards spanning across them.
>>>> 4 nodes total across the two machines, and a zookeeper on each VM.
>>>> I think this is feasible, but I am a little confused about how each
>>>> node knows how to respond to requests (do I need a load balancer in
>>>> front, or can we just reference the "collection" etc.)
>>>>
>>>> Thanks!
>>>>
>>>> Jamey
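
For anyone who wants to check the quorum arithmetic quoted in the thread, ((zk nodes)/2) + 1 with integer division, here is a minimal Python sketch. The function names are made up for this illustration and are not part of ZooKeeper or Solr; it only does the arithmetic and does not talk to a real ensemble.

```python
def quorum_size(zk_nodes: int) -> int:
    """Smallest majority of an ensemble: floor(zk_nodes / 2) + 1."""
    if zk_nodes < 1:
        raise ValueError("an ensemble needs at least one node")
    return zk_nodes // 2 + 1


def tolerated_failures(zk_nodes: int) -> int:
    """Number of ZK nodes that can be down while a quorum still exists."""
    return zk_nodes - quorum_size(zk_nodes)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in (1, 2, 3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

With 2 ZK nodes the quorum is 2, so no node can be lost, which is the two-VM problem discussed above. With 3 nodes the quorum is still 2, so any single machine can go down.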