On 10/16/2013 3:52 AM, michael.boom wrote:
> I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
> machines along with 3 Zookeeper instances.
>
> My web application makes queries to Solr specifying the hostname of one of
> the machines. So that machine will always get the request and the other ones
> will just serve as an aid.
> So I would like to setup a load balancer that would fix that, balancing the
> queries to all machines.
> Maybe doing the same while indexing.

SolrCloud actually handles load balancing for you. When you send requests to one server, they are distributed across the entire cloud unless you include a "distrib=false" parameter on the request. That parameter would also limit the search to one shard, which is probably not what you want.

The only thing that you don't get with a non-Java client is redundancy: if the one host your application points at goes down, your requests fail even though the rest of the cloud is still up. If you can't build in failover capability yourself, which is a very advanced programming technique, then you need a load balancer.

For my large non-Cloud Solr install, I use haproxy as a load balancer. Most of the time, it doesn't actually balance the load; it just makes sure that Solr is always reachable even if part of it goes down. The haproxy program is simple and easy to use, and it performs extremely well. I've got a pacemaker cluster making sure that the shared IP address, haproxy, and other homegrown utility applications related to Solr are only running on one machine.

Thanks,
Shawn
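
To give a rough idea of what client-side failover involves, here is a minimal sketch in Python. It is only an illustration under assumptions, not a production design: the class name FailoverClient, the http_get helper, and the host and collection names in the usage note are all hypothetical, and it assumes each node answers queries at a "/select" path under its base URL. It rotates the starting node on each query and falls through to the next node when one does not answer.

```python
import urllib.parse
import urllib.request


def http_get(url, timeout=5.0):
    """Default fetcher (hypothetical helper): return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


class FailoverClient:
    """Rotate through Solr nodes and skip any node that fails to answer."""

    def __init__(self, base_urls, fetch=http_get):
        if not base_urls:
            raise ValueError("need at least one base URL")
        self.base_urls = list(base_urls)
        self.fetch = fetch  # injectable so the logic can be tested offline
        self._next = 0      # index of the node to try first on the next query

    def query(self, params):
        qs = urllib.parse.urlencode(params)
        n = len(self.base_urls)
        start = self._next
        self._next = (start + 1) % n  # spread first attempts across nodes
        errors = []
        for i in range(n):
            base = self.base_urls[(start + i) % n]
            try:
                return self.fetch(f"{base}/select?{qs}")
            except OSError as exc:
                # URLError, HTTPError and socket timeouts are all OSError subclasses.
                errors.append((base, exc))
        raise ConnectionError(f"all Solr nodes failed: {errors}")
```

Usage would be something like FailoverClient(["http://solr1:8983/solr/collection1", "http://solr2:8983/solr/collection1"]).query({"q": "*:*"}), with made-up host names. Note what this leaves out: there is no health checking, no backoff for a node that keeps failing, and every request to a dead node still costs a timeout. Those gaps are a large part of why a load balancer is usually the easier route.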