Thanks. This implies that I officially do not understand what it is that fencing can do for me, in my simple cluster. Back to the drawing board.
On Wed, Apr 17, 2019 at 3:33 PM digimer <[email protected]> wrote:

> Fencing requires some mechanism, outside the nodes themselves, that can
> terminate the nodes. Typically, IPMI (iLO, iRMC, RSA, DRAC, etc) is used
> for this. Alternatively, switched PDUs are common. If you don't have these
> but do have a watchdog timer on your nodes, SBD (storage-based death) can
> work.
>
> You can use 'fence_<device> <options> -o status' at the command line to
> figure out what will work with your hardware. Once you can call
> 'fence_foo ... -o status' and get the status of each node, translating
> that into a pacemaker configuration is pretty simple. That's when you
> enable stonith.
>
> Once stonith is set up and working in pacemaker (ie: you can crash a node
> and the peer reboots it), then you will go to DRBD and set 'fencing:
> resource-and-stonith;' (tells DRBD to block on communication failure with
> the peer and request a fence), and then set up the 'fence-handler
> /path/to/crm-fence-peer.sh' and 'unfence-handler
> /path/to/crm-unfence-handler.sh' (I am going from memory, check the man
> page to verify syntax).
>
> With all this done, if either pacemaker/corosync or DRBD loses contact
> with the peer, it will block and fence. Only after the peer has been
> confirmed terminated will IO resume. This way, split-brains become
> effectively impossible.
>
> digimer
>
> On 2019-04-17 5:17 p.m., JCA wrote:
> > Here is what I did:
> >
> > # pcs stonith create disk_fencing fence_scsi pcmk_host_list="one two"
> > pcmk_monitor_action="metadata" pcmk_reboot_action="off"
> > devices="/dev/disk/by-id/ata-VBOX_HARDDISK_VBaaa429e4-514e8ecb" meta
> > provides="unfencing"
> >
> > where ata-VBOX-... corresponds to the device where I have the partition
> > that is shared between both nodes in my cluster. The command completes
> > without any errors (that I can see), and after that I have
> >
> > # pcs status
> > Cluster name: ClusterOne
> > Stack: corosync
> > Current DC: one (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
> > Last updated: Wed Apr 17 14:35:25 2019
> > Last change: Wed Apr 17 14:11:14 2019 by root via cibadmin on one
> >
> > 2 nodes configured
> > 5 resources configured
> >
> > Online: [ one two ]
> >
> > Full list of resources:
> >
> >  MyCluster (ocf::myapp:myapp-script): Stopped
> >  Master/Slave Set: DrbdDataClone [DrbdData]
> >      Stopped: [ one two ]
> >  DrbdFS (ocf::heartbeat:Filesystem): Stopped
> >  disk_fencing (stonith:fence_scsi): Stopped
> >
> > Daemon Status:
> >   corosync: active/enabled
> >   pacemaker: active/enabled
> >   pcsd: active/enabled
> >
> > Things stay that way indefinitely, until I set stonith-enabled to false -
> > at which point all the resources above get started immediately.
> >
> > Obviously, I am missing something big here. But what is it?
> >
> > On Wed, Apr 17, 2019 at 2:59 PM Adam Budziński <[email protected]>
> > wrote:
> >
> >> You did not configure any fencing device.
> >>
> >> On Wed, 17.04.2019, 22:51, JCA <[email protected]> wrote:
> >>
> >>> I am trying to get fencing working, as described in the "Cluster from
> >>> Scratch" guide, and I am stymied at the get-go :-(
> >>>
> >>> The document mentions a property named stonith-enabled. When I was
> >>> trying to get my first cluster going, I noticed that my resources would
> >>> start only when this property is set to false, by means of
> >>>
> >>> # pcs property set stonith-enabled=false
> >>>
> >>> Otherwise, all the resources remain stopped.
> >>>
> >>> I created a fencing resource for the partition that I am sharing across
> >>> the nodes, by means of DRBD.
> >>> This works fine - but I still have the same problem as above - i.e.
> >>> when stonith-enabled is set to true, all the resources get stopped,
> >>> and remain in that state.
> >>>
> >>> I am very confused here. Can anybody point me in the right direction
> >>> out of this conundrum?
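
For anyone landing on this thread later, here is a rough sketch of the order of operations digimer describes above, using IPMI-based fencing as the example. The IP addresses, credentials, node names and resource names below are placeholders, not anything from the original poster's cluster, and fence agent parameter names vary a bit between fence-agents versions ('pcs stonith describe <agent>' lists the ones your version accepts).

Step 1 - prove that a fence agent can reach each node, outside of pacemaker:

  # fence_ipmilan -a 192.168.122.201 -l admin -p secret -o status
  # fence_ipmilan -a 192.168.122.202 -l admin -p secret -o status

Step 2 - once both status calls work, translate them into stonith resources and turn stonith back on:

  # pcs stonith create fence_one fence_ipmilan ipaddr=192.168.122.201 login=admin passwd=secret pcmk_host_list=one
  # pcs stonith create fence_two fence_ipmilan ipaddr=192.168.122.202 login=admin passwd=secret pcmk_host_list=two
  # pcs property set stonith-enabled=true

Step 3 - verify that pacemaker can actually fence a node before relying on it:

  # pcs stonith fence two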
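
The DRBD side that digimer mentions would then end up looking roughly like the fragment below. This is only illustrative: the section the 'fencing' option belongs in and the handler script names/paths differ between DRBD 8.4 and 9, so verify against drbd.conf(5) and the scripts that drbd-utils actually installed under /usr/lib/drbd/ on your systems.

  resource r0 {
    net {
      # block I/O and request a fence when the peer becomes unreachable
      fencing resource-and-stonith;
    }
    handlers {
      # script names may be crm-fence-peer.9.sh / crm-unfence-peer.9.sh on DRBD 9
      fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    ...
  }

With something like that in place, DRBD blocks I/O on loss of contact with the peer and asks the cluster to fence it, which is the behaviour digimer describes: I/O only resumes once the peer is confirmed dead.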
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
