On 2020/3/23 7:57, Ulrich Windl wrote:
Ken Gaillot <[email protected]> wrote on 21.03.2020 at 18:07:
Hi all,

I am happy to announce a feature that was discussed on this list a
while back. It will be in Pacemaker 2.0.4 (the first release candidate
is expected in about three weeks).

A longstanding concern in two-node clusters is that in a split brain,
one side must get a fencing delay to avoid simultaneous fencing of both
nodes, but there is no perfect way to determine which node gets the
delay.

The most common approach is to configure a static delay on one node.
This is particularly useful in an active/passive setup where one
particular node is normally assigned the active role.
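As a sketch of that static-delay approach (device names, IPs, and node names here are placeholders, not from the original mail): the delay goes on the fence device that *targets* the favored node, so the other node waits before fencing it, giving the favored node time to win the race.

```shell
# Hypothetical two-node setup: node1 is the normally-active node we
# want to survive a split brain, so fencing *against* node1 is delayed.
pcs stonith create fence-node1 fence_ipmilan ip=10.0.0.1 \
    pcmk_host_list=node1 pcmk_delay_base=10s
# Fencing against node2 happens immediately.
pcs stonith create fence-node2 fence_ipmilan ip=10.0.0.2 \
    pcmk_host_list=node2
```
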

Actually, with sbd there could be a simpler approach: allocate a
pseudo-slot named "DC" or "locker" and then use a SCSI lock mechanism to update
that slot atomically. Only the node that "has the lock" may issue fence
commands. Once the fencing is confirmed, the locker slot is released
(wiped)...

It doesn't sound as simple as directly introducing a delay. What if the lock holder itself runs into an issue or dies after the fencing is issued but before it's confirmed? The other node would then have to gain the lock somehow after, well, a "delay" anyway?



Another approach is to use the relatively new fence_heuristics_ping
agent in a topology with your real fencing agent. A node that can ping
a configured IP will be more likely to survive.
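A rough sketch of how fence_heuristics_ping is combined with a real agent in a fencing topology (all names and IPs are placeholders; check the agent's metadata on your system for its exact parameters):

```shell
# The heuristic agent never actually fences anything; it "succeeds"
# only if the node can reach the configured IP, so a node that has
# lost that connectivity cannot proceed to the real fencing device.
pcs stonith create ping-check fence_heuristics_ping \
    ping_targets=192.168.1.1
pcs stonith create real-fence fence_ipmilan ip=10.0.0.1 \
    pcmk_host_list=node1
# Level 1 for node1: ping heuristic first, then the real agent.
pcs stonith level add 1 node1 ping-check,real-fence
```
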

In addition, we now have a new cluster-wide property,
priority-fencing-delay, that bases the delay on what resources were known
to be active where just before the split. If you set the new property and
configure priorities for your resources, the node with the highest combined
priority of all resources running on it will be more likely to survive.

Or combined with a ping-like mechanism: Each node periodically sends an "I'm
alive" message that updates the node's timestamp in the CIB status. The node
that was alive most recently will survive. If it doesn't react within the
fencing timeout, the second-newest node (in a two-node cluster: the other
one) may fence and try to form a cluster.

Why would such an outdated node state matter more than what corosync tells us? And the point here is not to pick just *a* node. The point is to pick the more "significant" node, the one potentially hosting the more significant resources/instances, to help it win the inevitable fencing match in case of a split brain.

Regards,
  Yan



As an example, if you set a default priority of 1 for all resources,
and set priority-fencing-delay to 15s, then the node running the most
resources will be more likely to survive because the other node will
wait 15 seconds before initiating fencing. If a particular resource is
more important than the rest, you can give it a higher priority.
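That example might be configured along these lines (a sketch only; resource names are placeholders and the exact pcs syntax can vary between pcs versions):

```shell
# Every resource counts 1 point toward its node's total by default.
pcs resource defaults priority=1
# Weight one particularly important resource higher.
pcs resource meta important-db priority=10
# The node with the lower combined priority waits 15s before fencing.
pcs property set priority-fencing-delay=15s
```
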

The master role of a promotable clone will get an extra point, if a
priority has been configured for that clone.

If both nodes have equal priority, or fencing is needed for some reason
other than node loss (e.g. on-fail=fencing for some monitor), then the
usual delay properties apply (pcmk_delay_base, etc.).

I'd like to recognize the primary authors of the 2.0.4 features
announced so far:
- shutdown locks: myself
- switch to clock_gettime() for monotonic clock: Jan Pokorný
- crm_mon --include/--exclude: Chris Lumens
- priority-fencing-delay: Gao,Yan
--
Ken Gaillot <[email protected]>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


