Hello,

I have a two-node (HPE DL345 Gen12 servers), shared-nothing, distributed active/standby Pacemaker storage metro-cluster with synchronous DRBD replication (Protocol C). The cluster is configured with a qdevice, heuristics (parallel fping), and fencing via fence_ipmilan plus diskless SBD (hpwdt, /dev/watchdog). All cluster resources are colocated so that they always run together on the same node.
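For reference, the qdevice heuristics are set up roughly like this in corosync.conf (a sketch, not a copy-paste: the qnetd host, the pinged addresses and the ffsplit algorithm are placeholders/assumptions):

    quorum {
        provider: corosync_votequorum
        device {
            votes: 1
            model: net
            net {
                host: <qnetd-host>
                algorithm: ffsplit
            }
            heuristics {
                mode: on
                # corosync-qdevice runs all exec_* commands in parallel
                exec_ping1: /usr/sbin/fping -q <gateway-1>
                exec_ping2: /usr/sbin/fping -q <gateway-2>
            }
        }
    }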
Both storage cluster nodes and the qdevice run Rocky Linux 10.1, with Pacemaker 3.0.1, Corosync 3.1.9 and DRBD 9.3.0.

So, the question is: what is the most correct way of implementing STONITH/fencing with fence_ipmilan + diskless SBD (hpwdt, /dev/watchdog)? I'm not sure a two-level fencing topology is appropriate here, because diskless SBD is not an external fence agent/resource... Currently it works without a fencing topology and both mechanisms run "in parallel". It really doesn't matter which one wins; I just want to be sure the fenced node ends up powered off or rebooted.

Here is a log of how it works now "in parallel":

[root@memverge2 ~]# cat /var/log/messages|grep -i fence
Feb 2 12:46:07 memverge2 pacemaker-fenced[3902]: notice: Node memverge state is now lost
Feb 2 12:46:07 memverge2 pacemaker-fenced[3902]: notice: Removed 1 inactive node with cluster layer ID 27 from the membership cache
Feb 2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: Cluster node memverge will be fenced: peer is no longer part of the cluster
Feb 2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: ipmi-fence-memverge2_stop_0 on memverge is unrunnable (node is offline)
Feb 2 12:46:10 memverge2 pacemaker-schedulerd[3905]: warning: ipmi-fence-memverge2_stop_0 on memverge is unrunnable (node is offline)
Feb 2 12:46:10 memverge2 pacemaker-schedulerd[3905]: notice: Actions: Fence (reboot) memverge 'peer is no longer part of the cluster'
Feb 2 12:46:10 memverge2 pacemaker-schedulerd[3905]: notice: Actions: Stop ipmi-fence-memverge2 ( memverge ) due to node availability
Feb 2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Client pacemaker-controld.3906 wants to fence (reboot) memverge using any device
Feb 2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Requesting peer fencing (reboot) targeting memverge
Feb 2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Requesting that memverge2 perform 'reboot' action targeting memverge
Feb 2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Waiting 25s for memverge to self-fence (reboot) for client pacemaker-controld.3906
Feb 2 12:46:10 memverge2 pacemaker-fenced[3902]: notice: Delaying 'reboot' action targeting memverge using ipmi-fence-memverge for 5s
Feb 2 12:46:36 memverge2 pacemaker-fenced[3902]: notice: Self-fencing (reboot) by memverge for pacemaker-controld.3906 assumed complete
Feb 2 12:46:36 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' targeting memverge by memverge2 for pacemaker-controld.3906@memverge2: OK (Done)
Feb 2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: helper command: /sbin/drbdadm fence-peer
Feb 2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: helper command: /sbin/drbdadm fence-peer
Feb 2 12:46:36 memverge2 crm-fence-peer.9.sh[7332]: DRBD_BACKING_DEV_1=/dev/mapper/object_block_nfs_vg-ha_nfs_exports_lv_with_vdo_1x8 DRBD_BACKING_DEV_2=/dev/mapper/object_block_nfs_vg-ha_nfs_internal_lv_without_vdo DRBD_BACKING_DEV_5=/dev/mapper/object_block_nfs_vg-ha_samba_exports_lv_with_vdo_1x8 DRBD_CONF=/etc/drbd.conf DRBD_CSTATE=Connecting DRBD_LL_DISK=/dev/mapper/object_block_nfs_vg-ha_nfs_exports_lv_with_vdo_1x8\ /dev/mapper/object_block_nfs_vg-ha_nfs_internal_lv_without_vdo\ /dev/mapper/object_block_nfs_vg-ha_samba_exports_lv_with_vdo_1x8 DRBD_MINOR=1\ 2\ 5 DRBD_MINOR_1=1 DRBD_MINOR_2=2 DRBD_MINOR_5=5 DRBD_MY_ADDRESS=192.168.0.8 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=28 DRBD_NODE_ID_27=memverge DRBD_NODE_ID_28=memverge2 DRBD_PEER_ADDRESS=192.168.0.6 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=27 DRBD_RESOURCE=ha-nfs DRBD_VOLUME=1\ 2\ 5 UP_TO_DATE_NODES=0x10000000 /usr/lib/drbd/crm-fence-peer.9.sh
Feb 2 12:46:36 memverge2 crm-fence-peer.9.sh[7333]: DRBD_BACKING_DEV_3=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_with_vdo_1x8 DRBD_BACKING_DEV_4=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_without_vdo DRBD_CONF=/etc/drbd.conf DRBD_CSTATE=Connecting DRBD_LL_DISK=/dev/mapper/object_block_nfs_vg-ha_block_exports_lv_with_vdo_1x8\ /dev/mapper/object_block_nfs_vg-ha_block_exports_lv_without_vdo DRBD_MINOR=3\ 4 DRBD_MINOR_3=3 DRBD_MINOR_4=4 DRBD_MY_ADDRESS=192.168.0.8 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=28 DRBD_NODE_ID_27=memverge DRBD_NODE_ID_28=memverge2 DRBD_PEER_ADDRESS=192.168.0.6 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=27 DRBD_RESOURCE=ha-iscsi DRBD_VOLUME=3\ 4 UP_TO_DATE_NODES=0x10000000 /usr/lib/drbd/crm-fence-peer.9.sh
Feb 2 12:46:36 memverge2 crm-fence-peer.9.sh[7333]: INFO Concurrency check: Peer is already marked clean/fenced by another resource. Returning success (Exit 4).
Feb 2 12:46:36 memverge2 crm-fence-peer.9.sh[7332]: INFO Concurrency check: Peer is already marked clean/fenced by another resource. Returning success (Exit 4).
Feb 2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: helper command: /sbin/drbdadm fence-peer exit code 4 (0x400)
Feb 2 12:46:36 memverge2 kernel: drbd ha-iscsi memverge: fence-peer helper returned 4 (peer was fenced)
Feb 2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: helper command: /sbin/drbdadm fence-peer exit code 4 (0x400)
Feb 2 12:46:36 memverge2 kernel: drbd ha-nfs memverge: fence-peer helper returned 4 (peer was fenced)
Feb 2 12:46:37 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' [7068] targeting memverge using ipmi-fence-memverge returned 0
Feb 2 12:46:37 memverge2 pacemaker-fenced[3902]: notice: Operation 'reboot' targeting memverge by memverge2 for pacemaker-controld.3906@memverge2: Result arrived too late
[root@memverge2 ~]#

Anton
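P.S. For completeness, here is roughly how the fencing pieces are defined today. This is a sketch rather than a copy-paste from the cluster: IPMI addresses, credentials, the 5s delay and the 25s watchdog timeout are placeholders/assumptions reconstructed from the log above.

    # IPMI/iLO fence devices, one per node, named as they appear in the log
    pcs stonith create ipmi-fence-memverge  fence_ipmilan ip=<ilo-memverge>  username=<user> password=<pass> lanplus=1 \
        pcmk_host_list=memverge pcmk_delay_base=5s
    pcs stonith create ipmi-fence-memverge2 fence_ipmilan ip=<ilo-memverge2> username=<user> password=<pass> lanplus=1 \
        pcmk_host_list=memverge2

    # Diskless SBD is not a cluster resource: sbd.service runs with only the hardware
    # watchdog (hpwdt -> /dev/watchdog), i.e. SBD_WATCHDOG_DEV=/dev/watchdog in
    # /etc/sysconfig/sbd, and Pacemaker assumes a lost node has self-fenced after:
    pcs property set stonith-watchdog-timeout=25s

On the DRBD side, resource-level fencing is hooked into the cluster roughly like this (the fence-peer handler is the one visible in the log; the fencing policy and unfence-peer handler are my assumptions about the usual DRBD 9 + Pacemaker setup):

    resource ha-nfs {
        net { fencing resource-and-stonith; }
        handlers {
            fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
            unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
        }
        ...
    }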
