Thank you, Erick. That confirms our understanding for a single cluster, or once we select a node from one of the two clusters to query.
As we try to set up an external load balancer to go between two clusters, though, we still have some questions. We need a way to determine that a node is still 'alive' and should be in the load balancer, and we need a way to know that a new node is now available and fully ready with its replicas to add to the load balancer. How does ZooKeeper make this determination? Does it do something different if multiple collections are on a single cluster? And, even with just one cluster, what is best practice for keeping a current list of active nodes in the cluster, especially for extremely high query rates?

Again, if there's some good documentation on this, I'd love a pointer...

Monica Skidmore
Senior Software Engineer

On 4/30/18, 1:09 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:

Multiple clusters with the same dataset aren't load-balanced by Solr, you'll have to accomplish that from "outside", e.g. something that sends queries to each cluster.

_Within_ a cluster (collection), as long as a request gets to any Solr node, sub-requests are distributed with an internal software LB. As far as a single collection goes, you're fine just sending any query to any node. Even if you send a query to a node that hosts no replicas for a collection, Solr will "do the right thing" and forward it appropriately.

HTH,
Erick

On Mon, Apr 30, 2018 at 9:46 AM, Monica Skidmore <monica.skidm...@careerbuilder.com> wrote:

> We are migrating from a master-slave configuration to SolrCloud (7.3) and
> have questions about the preferred way to load balance between the two
> clusters. It looks like we want to use a load balancer that directs
> queries to any of the server nodes in either cluster, trusting that node to
> handle the query correctly – true? If we auto-scale nodes into the
> cluster, are there considerations about when a node becomes ‘ready’ to
> query from a Solr perspective and when it is added to the load balancer?
> Also, what is the preferred method of doing a health-check for the load
> balancer – would it be “bin/solr healthcheck -c myCollection”?
>
> Pointers in the right direction – especially to any documentation on
> running multiple clusters with the same dataset – would be appreciated.
>
> *Monica Skidmore*
> *Senior Software Engineer*
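
For readers following the 'alive' and 'ready' question above, here is a minimal sketch of one way an external load balancer could build its node list from the Collections API CLUSTERSTATUS action, which reports both the cluster's live_nodes and the state of every replica. It is not from this thread and not a recommended practice from the Solr project. The assumptions are mine: a node counts as ready when it appears in live_nodes and hosts no replica of the collection in a state other than "active"; the helper names ready_nodes and fetch_cluster_status are hypothetical; and base_url is whatever Solr base URL you use (something like http://host:8983/solr).

```python
import json
import urllib.request


def ready_nodes(cluster_status, collection):
    """Return live node names that host no non-active replica of `collection`.

    `cluster_status` is the parsed JSON from a CLUSTERSTATUS call.
    The readiness rule here is an assumption, not an official definition.
    """
    cluster = cluster_status["cluster"]
    live = set(cluster.get("live_nodes", []))

    # Collect nodes that host at least one replica not yet in the active state.
    not_ready = set()
    shards = cluster["collections"][collection]["shards"]
    for shard in shards.values():
        for replica in shard["replicas"].values():
            if replica.get("state") != "active":
                not_ready.add(replica["node_name"])

    # Subtracting from live_nodes also drops replicas whose recorded state
    # looks active but whose node is no longer live.
    return live - not_ready


def fetch_cluster_status(base_url, collection, timeout=5):
    """Fetch CLUSTERSTATUS for one collection from any Solr node."""
    url = (f"{base_url}/admin/collections"
           f"?action=CLUSTERSTATUS&collection={collection}&wt=json")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A poller could call fetch_cluster_status against any reachable node in each cluster every few seconds and push the result of ready_nodes into the load balancer's pool. Under this rule a live node that hosts no replicas of the collection still counts as ready, which matches Erick's point that such a node forwards the query. Polling from one place rather than from every client keeps the extra load on the cluster small at high query rates.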