On 4/7/20 1:16 PM, Andrei Borzenkov wrote:
07.04.2020 00:21, Sherrard Burton wrote:

It looks like some timing issue or race condition. After reboot, the node
manages to contact qnetd first, before the connection to the other node
is established. Qnetd behaves as documented: it sees two equal-size
partitions and favors the partition that includes the tie breaker
(lowest node id). So the existing node goes out of quorum. A second
later both nodes see each other, and quorum is regained.
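For anyone wanting to see where that tie-breaker behavior comes from: it is governed by the quorum/device section of corosync.conf. A typical two-node-plus-qnetd configuration looks roughly like the sketch below (the hostname is a placeholder; with the ffsplit algorithm, the tie_breaker defaults to the lowest node id, which matches the behavior described above):

```
quorum {
    provider: corosync_votequorum
    device {
        model: net
        net {
            # placeholder address of the qnetd host
            host: qnetd.example.com
            # on an even split, ffsplit favors the partition
            # containing the tie breaker (default: lowest node id)
            algorithm: ffsplit
        }
    }
}
```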


Define the right problem to solve?

An educated guess is that your problem is not corosync but pacemaker
stopping resources. In that case, just do what has been done for years in
two-node clusters: set no-quorum-policy=ignore and rely on stonith to
resolve split-brain.
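For anyone following along, a minimal sketch of that approach using the pcs CLI (crmsh equivalents exist; the property names are the standard pacemaker cluster properties):

```
# keep resources running when quorum is lost
# (only sane in two-node clusters with working fencing)
pcs property set no-quorum-policy=ignore

# stonith is now the only split-brain protection, so it must be enabled
pcs property set stonith-enabled=true
```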

I dropped the idea of using qdevice in a two-node cluster. If you have a
reliable stonith device it is not needed, and without stonith, relying
on watchdog suicide has too many problems.


Andrei,
in a two-node cluster with stonith only, but no qdevice, how do you avoid the dreaded stonith death match and the resulting flip-flopping of services?

and are you using this configuration with stateful services? my main use case is DRBD, so i am very cautious about making sure that there is no data corruption or disruption. the qdevice is part of my "belt and suspenders" approach.
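for context, the usual belt-and-suspenders addition on the DRBD side is DRBD-level fencing wired into pacemaker, so a node that loses its replication link also loses the ability to promote stale data. a rough sketch for DRBD 9 (resource name is illustrative; the handler script paths vary by version and distribution):

```
resource r0 {
    net {
        # freeze I/O and require the peer to be fenced
        # before continuing after a replication link loss
        fencing resource-and-stonith;
    }
    handlers {
        # add a pacemaker location constraint that pins the
        # master role away from the outdated peer ...
        fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
        # ... and remove it once the peer has resynced
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.9.sh";
    }
}
```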
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
