Re: Setup cloud collection

Erick Erickson Thu, 16 Jul 2015 17:46:11 -0700

Piling on to Shawn's comments. Leadership is a very misunderstood
role when people start using SolrCloud, and it often gets conflated
with the old "master" role in master/slave.

There is, indeed, a small additional bit of processing that happens on
the leader node that's not done on replicas. But the REBALANCELEADERS
code was put in place to handle situations where 100+ leaders happened
to be on the _same_ node. It took many tens of leaders being on a node
for the additional work imposed by being a leader to be noticed in a very
demanding environment.

Indexing is done _both_ on the leader and the replicas, so the workload
for indexing isn't substantially different. And, as Shawn says, querying is
done on all replicas by a software load balancer, although you can reasonably
put a HW load balancer in front of the whole thing too.

So by and large you can completely ignore leaders that aren't evenly
distributed. The additional load isn't worth the headache of trying to control
it. And it will change as you bounce Solr servers, since leadership is assigned
to the node that contains the first replica of a shard to come up.

Best,
Erick

On Thu, Jul 16, 2015 at 8:23 AM,  <solr.user.1...@gmail.com> wrote:
> Thank you, very good explanation.
>
> Regards
>
>> On 16 Jul 2015, at 17:12, Shawn Heisey <apa...@elyograg.org> wrote:
>>
>>> On 7/16/2015 7:47 AM, solr.user.1...@gmail.com wrote:
>>> Thanks Shawn, but don't want to build something in front of Solr cloud to 
>>> help Solr assign leader role to distribute load of indexing.
>>>
>>> Instead of doing this manual step (rebalance leaders) maybe one host should 
>>> not take the leader role of multiple shards for same collection if the 
>>> number of live nodes are equal to number of shards.
>>>
>>> But assuming that when you say it will happen "over time", Maybe I'll 
>>> continue indexing and see that leaders will be rebalanced soon.
>>
>> Unless you have a fairly major event (like Solr restarting or an
>> operation taking longer than zkClientTimeout) your leaders will never
>> change.  It's a semi-permanent role.  When a qualifying event happens,
>> SolrCloud does an election process to determine the leader, but
>> elections do not happen unless you force them with a REBALANCELEADERS
>> action or one of several errors occurs.
>>
>> You don't have to build anything in front of Solr.  You simply have to
>> assign a preferred leader for each shard, an action that can be done
>> with an HTTP call in a browser.  I don't think we have anything in the
>> admin UI to assign preferred leaders ... I will look into it and open an
>> issue if necessary.
>>
>> The thing that I'm saying will happen over time is that all replicas
>> will be used for queries.  If you send a thousand queries, you'll find
>> that they will be divided fairly evenly among all replicas.  The fact
>> that you have one node as leader for three of your shards is not very
>> much of a big deal, but if you really want to change it, you can do so
>> with the preferred leader feature.
>>
>> Thanks,
>> Shawn
>>
