Re: Load Balancing between Two Cloud Clusters

Monica Skidmore Mon, 30 Apr 2018 11:04:27 -0700

Thank you, Erick.  That confirms our understanding for a single cluster, or 
once we select a node from one of the two clusters to query.


As we try to set up an external load balancer to go between two clusters, 
though, we still have some questions.  We need a way to determine that a node 
is still 'alive' and should be in the load balancer, and we need a way to know 
that a new node is now available and fully ready with its replicas to add to 
the load balancer.
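
One possible way to approach both checks, offered only as a sketch: poll the Collections API (action=CLUSTERSTATUS) on any node and treat a node as eligible when it appears in live_nodes and every replica it hosts for the collection reports state "active". The function name ready_nodes, the sample host names, and the response shape shown below are assumptions for illustration; the HTTP call that fetches the response is left out so the logic can be checked on its own.

```python
from typing import Any, Dict, List


def ready_nodes(cluster_status: Dict[str, Any], collection: str) -> List[str]:
    """Return live nodes whose replicas of `collection` are all 'active'.

    `cluster_status` is assumed to be the parsed JSON of a CLUSTERSTATUS
    response; the key layout used here is an assumption, not verified
    against a running cluster.
    """
    cluster = cluster_status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    shards = cluster.get("collections", {}).get(collection, {}).get("shards", {})

    # Group replica states by the node that hosts them.
    states: Dict[str, List[str]] = {}
    for shard in shards.values():
        for replica in shard.get("replicas", {}).values():
            node = replica.get("node_name", "")
            states.setdefault(node, []).append(replica.get("state", ""))

    # A node qualifies only if it is live and none of its replicas
    # are down, recovering, or in any other non-active state.
    return sorted(
        node
        for node, node_states in states.items()
        if node in live and all(s == "active" for s in node_states)
    )


# Hypothetical sample data: host2 is still recovering, host3 is not live.
sample = {
    "cluster": {
        "live_nodes": ["host1:8983_solr", "host2:8983_solr"],
        "collections": {
            "myCollection": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"node_name": "host1:8983_solr", "state": "active"},
                            "core_node2": {"node_name": "host2:8983_solr", "state": "recovering"},
                            "core_node3": {"node_name": "host3:8983_solr", "state": "active"},
                        }
                    }
                }
            }
        },
    }
}

print(ready_nodes(sample, "myCollection"))
```

This sketch deliberately skips nodes that host no replicas of the collection. Such nodes can still forward queries, so whether to include them in the load balancer is a design choice rather than a requirement.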

How does ZooKeeper make this determination?  Does it do something different if 
multiple collections are on a single cluster?  And, even with just one cluster, 
what is best practice for keeping a current list of active nodes in the 
cluster, especially for extremely high query rates?

Again, if there's some good documentation on this, I'd love a pointer...

Monica Skidmore
Senior Software Engineer
 

 
On 4/30/18, 1:09 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:

    Multiple clusters with the same dataset aren't load-balanced by Solr;
    you'll have to accomplish that from "outside", e.g. something that sends
    queries to each cluster.
    
    _Within_ a cluster (collection), as long as a request gets to any Solr
    node, sub-requests are distributed with an internal software LB. As far as
    a single collection, you're fine just sending any query to any node. Even
    if you send a query to a node that hosts no replicas for a collection, Solr
    will "do the right thing" and forward it appropriately.
    
    HTH,
    Erick
    
    On Mon, Apr 30, 2018 at 9:46 AM, Monica Skidmore <
    monica.skidm...@careerbuilder.com> wrote:
    
    > We are migrating from a master-slave configuration to Solr cloud (7.3) and
    > have questions about the preferred way to load balance between the two
    > clusters.  It looks like we want to use a load balancer that directs
    > queries to any of the server nodes in either cluster, trusting that node to
    > handle the query correctly – true?  If we auto-scale nodes into the
    > cluster, are there considerations about when a node becomes ‘ready’ to
    > query from a Solr perspective and when it is added to the load balancer?
    > Also, what is the preferred method of doing a health-check for the load
    > balancer – would it be “bin/solr healthcheck -c myCollection”?
    >
    >
    >
    > Pointers in the right direction – especially to any documentation on
    > running multiple clusters with the same dataset – would be appreciated.
    >
    >
    >
    > *Monica Skidmore*
    > *Senior Software Engineer*
    >
    >
    >
    >
    >
    >
