Thanks all - very helpful.

@Shawn - your reply implies that even if I send every request to the URL
of a single node over HTTP, the "balancing" will still occur across the
SolrCloud cluster (I understand the caveat about that single endpoint being
a potential point of failure).  I just want to verify that I'm interpreting
your response correctly...

(I have been asked to provide IT with a comprehensive list of options prior
to a design discussion - which is why I'm trying to get clear about the
various options)

In a nutshell, I think I understand the following:

a. Even when every request goes to a single URL, SolrCloud will "balance"
searches across all available nodes.
   Caveat: That single URL is a potential single point of failure, and
   this should be taken into account.

b. SolrJ's CloudSolrClient API (for Java clients) can distribute load based
on ZooKeeper's "knowledge" of all available Solr instances.
   Note: This is more robust than "a" because it eliminates the single
   point of failure.

c. A load balancer in front of all known Solr instances will also work,
although a search request may not run on the instance the load balancer
targeted, because of "a" above.
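
To make the difference between "a" and the alternatives concrete, here is
a minimal sketch of client-side failover, the capability Shawn says a
non-SolrJ client needs if there is no load balancer in front of the nodes.
It is not SolrJ code and not how CloudSolrClient works internally. The class
name, the method name, and the `send` callback (which stands in for whatever
HTTP call the client makes) are all hypothetical, and it assumes the client
already has a list of node URLs.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: try each known Solr node in turn until one answers.
final class FailoverQuery {

    // nodeUrls: base URLs of the Solr nodes the client knows about (assumed).
    // send: hypothetical callback that performs the request against one URL
    //       and throws a RuntimeException if that node cannot be reached.
    static String queryWithFailover(List<String> nodeUrls,
                                    Function<String, String> send) {
        RuntimeException last = null;
        for (String url : nodeUrls) {
            try {
                // The first node that responds wins. Per "a", that node may
                // still spread the search across the rest of the cloud.
                return send.apply(url);
            } catch (RuntimeException e) {
                // Node down or unreachable: remember why, try the next one.
                last = e;
            }
        }
        throw new IllegalStateException("no Solr node responded", last);
    }
}
```

With something like this in place, the first URL in the list stops being a
single point of failure, but the list itself is static. It will not notice
nodes being added or removed, which is what CloudSolrClient gets from
ZooKeeper.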

Corrections or refinements welcomed...

On Mon, Apr 18, 2016 at 7:21 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/17/2016 10:35 PM, John Bickerstaff wrote:
> > My prior use of SOLR in production was pre SOLR cloud.  We put a
> > round-robin load balancer in front of replicas for searching.
> >
> > Do I understand correctly that a load balancer is unnecessary with SOLR
> > Cloud?  I.e., SOLR and Zookeeper will balance the load, regardless of
> > which replica's URL is getting hit?
>
> Your understanding is correct -- queries sent to a single SolrCloud node
> will be balanced across the cloud, although the node you are sending the
> queries to might represent a single point of failure.
>
> If your program is written in Java, you can use CloudSolrClient in SolrJ
> -- this client talks to the zookeeper ensemble and dynamically adjusts
> to the addition and removal of Solr nodes in the cloud.  All
> notifications from the cloud to the client about servers going up or
> down are nearly instantaneous -- the client does not need to poll for
> status.
>
> For other programming languages, if your client code is not capable of
> failing over to a second node when the primary goes down, then you would
> still need a load balancer.
>
> Thanks,
> Shawn
>
>
