Thanks all - very helpful. @Shawn - your reply implies that even if I'm hitting the URL for a single endpoint via HTTP - the "balancing" will still occur across the Solr Cloud (I understand the caveat about that single endpoint being a potential point of failure). I just want to verify that I'm interpreting your response correctly...
(I have been asked to provide IT with a comprehensive list of options prior to a design discussion - which is why I'm trying to get clear about the various options)

In a nutshell, I think I understand the following:

a. Even if hitting a single URL, the Solr Cloud will "balance" across all available nodes for searching
   Caveat: That single URL represents a potential single point of failure and this should be taken into account

b. SolrJ's CloudSolrClient API provides the ability to distribute load -- based on Zookeeper's "knowledge" of all available Solr instances.
   Note: This is more robust than "a" due to the fact that it eliminates the "single point of failure"

c. Use of a load balancer hitting all known Solr instances will be fine - although the search requests may not run on the Solr instance the load balancer targeted - due to "a" above.

Corrections or refinements welcomed...

On Mon, Apr 18, 2016 at 7:21 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/17/2016 10:35 PM, John Bickerstaff wrote:
> > My prior use of SOLR in production was pre SOLR cloud. We put a
> > round-robin load balancer in front of replicas for searching.
> >
> > Do I understand correctly that a load balancer is unnecessary with SOLR
> > Cloud? I. E. -- SOLR and Zookeeper will balance the load, regardless of
> > which replica's URL is getting hit?
>
> Your understanding is correct -- queries sent to a single SolrCloud node
> will be balanced across the cloud, although the node you are sending the
> queries to might represent a single point of failure.
>
> If your program is written in Java, you can use CloudSolrClient in SolrJ
> -- this client talks to the zookeeper ensemble and dynamically adjusts
> to the addition and removal of Solr nodes in the cloud. All
> notifications from the cloud to the client about servers going up or
> down are nearly instantaneous -- the client does not need to poll for
> status.
>
> For other programming languages, if your client code is not capable of
> failing over to a second node when the primary goes down, then you would
> still need a load balancer.
>
> Thanks,
> Shawn
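
For reference, the client-side failover Shawn describes (try a second node when the first is down) might look roughly like the sketch below. It is written in plain Java without SolrJ, purely to show the logic; it is not how CloudSolrClient works internally. The class name, the `send` function standing in for a real HTTP call, and the `solr1`/`solr2` hostnames are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class FailoverSketch {

    /**
     * Tries each node URL in order and returns the first successful response.
     * "send" is a stand-in for a real HTTP request (hypothetical); it is
     * expected to throw a RuntimeException when a node cannot be reached.
     */
    public static Optional<String> queryWithFailover(List<String> nodeUrls,
                                                     Function<String, String> send) {
        for (String url : nodeUrls) {
            try {
                return Optional.of(send.apply(url));
            } catch (RuntimeException e) {
                // This node failed: move on to the next one in the list.
            }
        }
        // Every node failed.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical node URLs; replace with real Solr endpoints.
        List<String> nodes = Arrays.asList(
                "http://solr1:8983/solr",
                "http://solr2:8983/solr");

        // Fake transport that simulates solr1 being down.
        Function<String, String> fakeSend = url -> {
            if (url.contains("solr1")) {
                throw new IllegalStateException("node down");
            }
            return "response from " + url;
        };

        System.out.println(queryWithFailover(nodes, fakeSend).orElse("all nodes down"));
    }
}
```

If the client cannot do something along these lines, that is the case where a load balancer in front of the Solr nodes (option "c") is still needed.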