Use direct connections to Zookeeper. Using a load balancer or proxy is not recommended.
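In practice that means handing the client every ZooKeeper server's host:port rather than one proxy address. A rough sketch, using only the JDK, of building the comma-separated connect string that ZooKeeper clients accept (optionally followed by a chroot path). The host names zk1/zk2/zk3, the /solr chroot, and the class and method names are placeholders of mine, not anything from this thread:

```java
import java.util.Arrays;
import java.util.List;

public class ZkConnectString {
    // Joins host:port entries into a ZooKeeper connect string such as
    // "zk1:2181,zk2:2181,zk3:2181/solr". Pass null or "" for no chroot.
    public static String build(List<String> hosts, String chroot) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one ZooKeeper host is required");
        }
        String joined = String.join(",", hosts);
        return (chroot == null || chroot.isEmpty()) ? joined : joined + chroot;
    }

    public static void main(String[] args) {
        // Hypothetical host names: list every ZooKeeper server, not a proxy.
        List<String> hosts = Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181");
        System.out.println(build(hosts, "/solr"));
    }
}
```

The resulting string (or the host list itself) is what you would give to a ZK-aware client such as CloudSolrClient. The exact builder call depends on your SolrJ version, so check its javadoc.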
Zookeeper needs direct TCP connections. It is not an HTTP server.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Sep 13, 2018, at 5:23 AM, Gu, Steve (CDC/OD/OADS) (CTR) <c...@cdc.gov> wrote:
>
> Hi, Florian
>
> We need to pass the zookeeper url to CloudSolrClient. Since there are multiple
> zk servers, is it the common practice to set a proxy server in front of
> zookeeper?
>
> Thanks for your advice.
> Steve
>
> -----Original Message-----
> From: Florian Gleixner <f...@redflo.de>
> Sent: Wednesday, September 12, 2018 6:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: how to access solr in solrcloud
>
> On 9/12/18 8:21 PM, Shawn Heisey wrote:
>> On 9/12/2018 7:38 AM, Gu, Steve (CDC/OD/OADS) (CTR) wrote:
>>> I am upgrading our solr to 7.4 and would like to set up solrcloud for
>>> failover and load balance. There are three zookeeper servers
>>> (zk1:2181, zk1:2182) and two solr instances solr1:8983, solr2:8983.
>>> So what will be the solr url the client should use for access?
>>> Will it be solr1:8983, the leader?
>>>
>>> If we use solr1:8983 to access solr, what happens if solr1:8983 is
>>> down? Will the request be routed to solr2:8983 via the zookeeper? I
>>> understand that zookeeper is doing all the coordination work but
>>> wanted to understand how this works.
>>
>> Zookeeper does not handle Solr requests. It doesn't know anything at
>> all about Solr. It is Solr that uses ZK to coordinate the cluster.
>>
>> If you are using the Java client called CloudSolrClient, then you will
>> most likely be informing it about ZK, not Solr, and it will
>> automatically determine what Solr servers there are by talking to ZK,
>> and then will talk directly to the correct Solr servers. If you are
>> not using a client that is ZK-aware, then you will need a load
>> balancer sitting in front of your Solr servers. Your clients will then
>> talk to the load balancer. Don't put a load balancer in front of
>> ZooKeeper.
>
> The advantage over haproxy/nginx/... solutions is that a client that uses
> zookeeper registers with zookeeper, and if a solr node goes down, the solr
> node may inform zookeeper, which may inform all registered clients.
> Failover can be much faster with CloudSolrClient than with haproxy or
> similar solutions.
> CloudSolrClient also knows which node is the leader, and when indexing it
> routes documents to the leader, which avoids overhead.
> I've written a SolrCloudProxy which can be used to connect non-cloud-aware
> clients to a solr cloud. The proxy uses CloudSolrClient with all its
> advantages. It is not yet production ready, but you may want to try it:
> https://gitlab.lrz.de/a2814ad/SolrCloudProxy