First, get out of the habit of thinking about the replication API
(DISABLEPOLL and the like) when you're in SolrCloud mode. The
"old style" replication is only used under the control of the syncing
strategy. Unless you've configured master/slave sections in
your solrconfig.xml files and somehow dealt with the leader
changing (who should be polled?), I'm pretty sure this is a total red herring.
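
(For reference, the "old style" knobs are the ReplicationHandler HTTP
commands issued against an individual core, roughly like the following,
where the host, port and core name are just placeholders:

  http://some_host:8983/solr/core_name/replication?command=disablepoll
  http://some_host:8983/solr/core_name/replication?command=enablepoll

The point is that these per-core commands aren't the thing to reach for
in SolrCloud.)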

As for the rest, that's just the way it works. In SolrCloud, the
raw documents are forwarded from the leader to the followers.
Outside of a node going into recovery, replication isn't used
at all.

However, when a node goes into recovery (which by definition it does
when the core is reloaded or the Solr instance is restarted), the
replica checks with the leader to see if it's "too far" out of date. The
default "too far" is 100 docs, although this can be changed by setting
the updateLog's numRecordsToKeep to a higher number in solrconfig.xml.
If the replica is too far out of date, a full index replication is done, which
is what you're observing.

If the number of updates the leader has received is < 100
(or numRecordsToKeep), the leader sends the raw documents to the
follower from its update log and there is no "old style" replication there
at all.
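
If it helps to see where that knob lives: numRecordsToKeep sits inside
the <updateLog> section of the <updateHandler> in solrconfig.xml.
A minimal sketch, with 500 as a purely illustrative value:

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog>
      <str name="dir">${solr.ulog.dir:}</str>
      <int name="numRecordsToKeep">500</int>
    </updateLog>
  </updateHandler>

Keep in mind the transaction logs get bigger the more records you keep,
so there's a disk and replay-time cost to cranking it way up.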

So, the net-net here is that your choices are limited:

1> stop indexing while doing the restart.

2> bump numRecordsToKeep to some larger number that
     you expect not to be exceeded for the time it takes to
     restart each node.

3> live with the full index replication in this situation.

I'll add parenthetically that having to redeploy plugins and the like
_should_ be a relatively rare operation, and it seems (at least from
the outside) to be a perfectly reasonable thing to do in a maintenance
window when index updates are disabled.

You can also consider using collection aliasing to switch back and
forth between two collections so you can manipulate the current
cold one and, when you're satisfied, switch the alias.
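
If you go the aliasing route, the Collections API CREATEALIAS command
is what does the switch. Something like the following, where the alias
and collection names are just made-up examples:

  http://any_node:8983/solr/admin/collections?action=CREATEALIAS&name=live&collections=collection2

Your clients always talk to the "live" alias, so the flip is transparent
to them.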

Best,
Erick

On Fri, Nov 25, 2016 at 1:40 PM, Jichi Guo <jichi...@gmail.com> wrote:
> Hi,
>
>
>
> I am seeking the best practice for restarting a sharded SolrCloud that is taking
> search traffic as well as realtime updates, without downtime.
>
> When I deploy new customized Solr plugins, for example, it requires
> restarting the whole SolrCloud cluster.
>
> I am testing Solr 6.2.1 with 4 shards.
>
> And I find that when SolrCloud is taking updates and I restart any Solr node
> (no matter whether it is a leader node or overseer or other normal replica),
> the restarted node reindexes its whole data from its leader, i.e., it
> redownloads the whole index data and then drops its old data.
>
> The only way I have found to avoid such reindexing is to temporarily disable
> updates, such as invoking disableReplication on the leader node before
> restarting.
>
>
>
> Additionally, I didn't find a way to temporarily pause Solr replication to a
> single replica. Before sharding, we could do disablePoll to disable replication
> in a slave. But after sharding, disabling replication from the leader node is
> the only way I found, which pauses not only the replication to the one
> node to restart, but also replication to all nodes in the same shard.
>
>
>
> The procedure becomes more complex if I want to restart a leader node: I
> first need to manually trigger a leader failover through rebalancing, then
> disable replication on the new leader node, then restart the old leader node,
> and at last re-enable replication on the new leader node.
>
>
>
> As you can see, it takes many steps to restart SolrCloud node by node
> this way.
>
> I am not sure if this is the best procedure to restart the whole SolrCloud
> while it is taking realtime updates.
>
>
>
> Thanks!
>
>
>
