Excellent - thanks!

On Mon, Apr 18, 2016 at 9:16 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Your summary pretty much nails it.
>
> For (b) note that CloudSolrClient uses an internal software load
> balancer to distribute queries, FWIW.
>
> On Mon, Apr 18, 2016 at 7:52 AM, John Bickerstaff
> <j...@johnbickerstaff.com> wrote:
> > Thanks all - very helpful.
> >
> > @Shawn - your reply implies that even if I'm hitting the URL for a single
> > endpoint via HTTP - the "balancing" will still occur across the Solr Cloud
> > (I understand the caveat about that single endpoint being a potential point
> > of failure). I just want to verify that I'm interpreting your response
> > correctly...
> >
> > (I have been asked to provide IT with a comprehensive list of options prior
> > to a design discussion - which is why I'm trying to get clear about the
> > various options)
> >
> > In a nutshell, I think I understand the following:
> >
> > a. Even if hitting a single URL, the Solr Cloud will "balance" across all
> >    available nodes for searching
> >    Caveat: That single URL represents a potential single point of
> >    failure and this should be taken into account
> >
> > b. SolrJ's CloudSolrClient API provides the ability to distribute load --
> >    based on Zookeeper's "knowledge" of all available Solr instances.
> >    Note: This is more robust than "a" due to the fact that it
> >    eliminates the "single point of failure"
> >
> > c. Use of a load balancer hitting all known Solr instances will be fine -
> >    although the search requests may not run on the Solr instance the load
> >    balancer targeted - due to "a" above.
> >
> > Corrections or refinements welcomed...
> >
> > On Mon, Apr 18, 2016 at 7:21 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> >
> >> On 4/17/2016 10:35 PM, John Bickerstaff wrote:
> >> > My prior use of SOLR in production was pre SOLR cloud. We put a
> >> > round-robin load balancer in front of replicas for searching.
> >> >
> >> > Do I understand correctly that a load balancer is unnecessary with SOLR
> >> > Cloud? I. E. -- SOLR and Zookeeper will balance the load, regardless of
> >> > which replica's URL is getting hit?
> >>
> >> Your understanding is correct -- queries sent to a single SolrCloud node
> >> will be balanced across the cloud, although the node you are sending the
> >> queries to might represent a single point of failure.
> >>
> >> If your program is written in Java, you can use CloudSolrClient in SolrJ
> >> -- this client talks to the zookeeper ensemble and dynamically adjusts
> >> to the addition and removal of Solr nodes in the cloud. All
> >> notifications from the cloud to the client about servers going up or
> >> down are nearly instantaneous -- the client does not need to poll for
> >> status.
> >>
> >> For other programming languages, if your client code is not capable of
> >> failing over to a second node when the primary goes down, then you would
> >> still need a load balancer.
> >>
> >> Thanks,
> >> Shawn
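
For reference, here is a rough sketch of the client-side failover Shawn mentions for clients that do not use CloudSolrClient. It is only an illustration under stated assumptions: FailoverClient and NodeRequest are hypothetical names, not part of SolrJ, the node list is fixed rather than discovered through Zookeeper, and the caller supplies the code that actually sends the HTTP request.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, NOT part of SolrJ: rotates through a fixed list of
// Solr node base URLs and fails over to the next node on an I/O error.
class FailoverClient {

    // A request that can be attempted against one node's base URL.
    interface NodeRequest<T> {
        T send(String baseUrl) throws IOException;
    }

    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    FailoverClient(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = List.copyOf(nodes);
    }

    <T> T execute(NodeRequest<T> request) throws IOException {
        // Start at a different node on each call (simple round-robin).
        int start = Math.floorMod(next.getAndIncrement(), nodes.size());
        IOException last = null;
        for (int i = 0; i < nodes.size(); i++) {
            String node = nodes.get((start + i) % nodes.size());
            try {
                return request.send(node);
            } catch (IOException e) {
                last = e; // this node failed: try the next one
            }
        }
        // Every node was tried once and none answered.
        throw new IOException("all " + nodes.size() + " nodes failed", last);
    }
}
```

A caller would pass its own node URLs (placeholders here) and a lambda that performs the query against the given base URL. Unlike CloudSolrClient, this sketch never learns about nodes being added or removed, so the list has to be maintained by hand.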