Re: No Leader after nodes restart

Erick Erickson Sun, 23 Dec 2018 17:09:49 -0800

There are a couple of options:

1> Stop all your nodes. Start one node and wait for leader election
to occur. This can take several minutes, but eventually the replicas
on that node will become the leaders of their shards. Then start the
other nodes, again one at a time, waiting for each to recover fully
before starting the next.


2> You can try the FORCELEADER Collections API option.
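
As a rough sketch of both options (not tested against your cluster): the
Python below checks a CLUSTERSTATUS response for shards that have no
active leader, which is what you would poll while bringing nodes up one
at a time, and builds the FORCELEADER request URL. CLUSTERSTATUS and
FORCELEADER are real Collections API actions; the helper names and the
collection name "my_collection" are placeholders, since the collection
name isn't given in this thread. The host and shard name are taken from
the quoted message.

```python
from urllib.parse import urlencode


def leaderless_shards(cluster_status, collection):
    """Return the shards of `collection` that have no active leader.

    `cluster_status` is the parsed JSON body of a CLUSTERSTATUS call.
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    missing = []
    for shard_name, shard in shards.items():
        # Solr reports the leader flag as the string "true" on one replica.
        has_leader = any(
            replica.get("leader") == "true" and replica.get("state") == "active"
            for replica in shard["replicas"].values()
        )
        if not has_leader:
            missing.append(shard_name)
    return sorted(missing)


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL (hypothetical helper, not part of Solr)."""
    query = urlencode({"action": action, **params, "wt": "json"})
    return f"{base_url}/admin/collections?{query}"


# "my_collection" is an assumed placeholder; substitute the real name.
force_leader_url = collections_api_url(
    "http://solr00:8983/solr",
    "FORCELEADER",
    collection="my_collection",
    shard="rpk51_222_306",
)
print(force_leader_url)
```

To use it, fetch the URL built with action "CLUSTERSTATUS" (for example
with urllib.request.urlopen), parse the JSON, and only move on to the
next node once leaderless_shards() returns an empty list. Only send the
FORCELEADER request if waiting does not help.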

The leader election and retry logic has been vastly improved in Solr 7.3+
(with some of the last improvements in 7.5).

Best,
Erick

On Sun, Dec 23, 2018 at 1:43 AM Vadim Ivanov
<vadim.iva...@spb.ntk-intourist.ru> wrote:
>
> Hi!
> After restarting the nodes, I have a situation where no leader can be
> elected for a shard.
> Shard rpk51_222_306 resides on 3 nodes (solr00, solr06, solr09) with
> corresponding replica names
> (rpk51_222_306_00, rpk51_222_306_06, rpk51_222_306_09)
> The logs look like this:
> PeerSync: core=rpk51_222_306_00 url=http://solr00:8983/solr Requested 26
> updates from http://solr06:8983/solr/rpk51_222_306_06/ but retrieved 25
> PeerSync: core=rpk51_222_306_06 url=http://solr06:8983/solr Requested 29
> updates from http://solr00:8983/solr/rpk51_222_306_00/ but retrieved 24
> PeerSync: core=rpk51_222_306_09 url=http://solr09:8983/solr Requested 26
> updates from http://solr06:8983/solr/rpk51_222_306_06/ but retrieved 25
>
> 00 and 09 try to recover from 06 and fail.
> 06 tries to recover from 00 and fails.
>
> This repeats every minute, indefinitely.
>
> How can I break this deadlock loop?
> --
> Vadim
>
>
