All,

This is my current understanding of how SolrCloud load balancing works:
Within SolrCloud, for a cluster with more than 1 shard and at least 1 replica, the ZooKeeper-aware SolrJ client uses LBHttpSolrServer, which round-robins across the replicas and leaders in the cluster. In turn, the node (which can be a leader or a replica) that performs the distributed query may then go to either the leader or a replica of each shard, again chosen by round robin via LBHttpSolrServer.

If this is correct, then in a SolrCloud instance that has, let's say, 1 replica, the initial query from the user may go to the leader for shard 1, and when the user paginates to the second page, the subsequent query may go to the replica of shard 1. This seems inefficient from a caching perspective, since the queryResultCache and possibly the filterCache would need to be repopulated on the second node.

From what I can find, there does not appear to be any option for session affinity within SolrCloud query execution. Is that right?

Thanks!
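
To make the concern concrete, here is a rough sketch in plain Java. It is not SolrJ code: the class name, the method names, and the node URLs are all made up for illustration. roundRobin only mimics the behaviour as I understand it, and sticky shows the kind of session affinity I was hoping to find, assuming a simple hash of some session id is good enough to pick a node.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: a toy model of node selection, not the SolrJ implementation.
class ReplicaSelectionSketch {

    // Hypothetical URLs for the leader and the single replica of shard 1.
    static final List<String> SHARD1_NODES = List.of(
            "http://host1:8983/solr/collection1",
            "http://host2:8983/solr/collection1");

    private static final AtomicInteger COUNTER = new AtomicInteger();

    // Round robin: each call moves on to the next node in the list,
    // so two consecutive requests land on different nodes.
    static String roundRobin(List<String> nodes) {
        int index = Math.floorMod(COUNTER.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    // Session affinity (assumed scheme): the same session id always maps
    // to the same node, so that node's caches can be reused across pages.
    static String sticky(List<String> nodes, String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Page 1 and page 2 of the same search, routed both ways.
        System.out.println("round robin, page 1: " + roundRobin(SHARD1_NODES));
        System.out.println("round robin, page 2: " + roundRobin(SHARD1_NODES));
        System.out.println("sticky, page 1:      " + sticky(SHARD1_NODES, "user-42"));
        System.out.println("sticky, page 2:      " + sticky(SHARD1_NODES, "user-42"));
    }
}
```

With two nodes, the round robin calls alternate between them, so page 2 hits a node whose queryResultCache has not seen the query. The sticky calls return the same node both times.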