You do need to load balance the initial query request across the SolrCloud nodes. SolrJ's CloudSolrServer and LBHttpSolrServer can perform the load balancing for you in the client. Alternatively, you can use a hardware load balancer.
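
To make the client-side option concrete, here is a rough sketch of the general idea: rotate the initial request across nodes and fall through to the next node if one is down. This is not SolrJ code and not how LBHttpSolrServer is actually implemented; the class name RoundRobinBalancer, its methods, and the node names are all made up for illustration. In a real client you would let SolrJ do this for you.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical illustration of client-side load balancing; not part of SolrJ. */
final class RoundRobinBalancer {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Returns the next node in rotation, so initial requests are spread evenly. */
    String pick() {
        // floorMod keeps the index non-negative even if the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }

    /**
     * Sends the request to the next node in rotation. If that node throws,
     * tries the following nodes, each at most once, before giving up.
     */
    <T> T execute(Function<String, T> request) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < nodes.size(); attempt++) {
            String node = pick();
            try {
                return request.apply(node);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next node
            }
        }
        throw new IllegalStateException("all nodes failed", last);
    }
}
```

The function passed to execute would be whatever actually sends the query to the given node. A hardware load balancer gives you the same rotation and failover outside the client.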

Joel Bernstein
Search Engineer at Heliosearch

On Thu, Jan 9, 2014 at 5:58 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 1/9/2014 4:09 PM, Garth Grimm wrote:
>
>> As a follow-up question on this....
>>
>> One would want to use some kind of load balancing 'above' the SolrCloud
>> installation for search queries, correct? To ensure that the initial
>> requests would get distributed evenly to all nodes?
>>
>> If you don't have that, and send all requests to M2S2 (IRT OP), it would
>> be the only node that would ever act as controller, and it could become a
>> bottleneck that further replicas won't be able to alleviate. Correct?
>>
>> Or is there something in the SolrCloud itself that evenly distributes the
>> controller role, regardless of which node the query initially arrives at?
>>
>
> Queries are automatically load balanced across the cloud, even if they all
> hit the same host. This *probably* includes the controller role, but I am
> not sure about that.
>
> Unless you are using a zookeeper aware client, a load balancer is a good
> idea just from a redundancy perspective -- if the host you're hitting goes
> down, you'll want to automatically switch to another one. The only
> zookeeper aware client that I know of is CloudSolrServer, which is part of
> SolrJ and allows you to write Java programs that access Solr.
>
> Thanks,
> Shawn
>