On 08/14/2017 12:20 PM, Ulrich Windl wrote:
> Hi!
>
> Have you tried studying the logs? Usually you get useful information from
> there (to share!).
>
> Regards,
> Ulrich
>
>>>> Edwin Török <[email protected]> wrote on 14.08.2017 at 11:51 in
> message <[email protected]>:
>> Hi,
>>
>> When setting up a cluster with just 1 node with auto-tie-breaker and
>> DLM, and incrementally adding more, I got some unexpected fencing if the
>> 2nd node doesn't join the cluster soon enough.
>>
>> What I also found surprising is that if the cluster has ever seen 2
>> nodes, then turning off the 2nd node works fine and doesn't cause
>> fencing (using auto-tie-breaker).
>>
>> I have a hardware watchdog, and can reproduce the problem with these (or
>> older) versions and sequence of steps:
>>
>> corosync-2.4.0-9.el7.x86_64
>> pacemaker-1.1.16-12.el7.x86_64
>> sbd-1.3.0-3.el7.x86_64
>> pcs-0.9.158-6.el7.x86_64
>>
>> pcs cluster destroy
>> rm /var/lib/corosync/* -f
>> pcs cluster auth -u hacluster cluster1 cluster2
>> pcs cluster setup --name cluster cluster1 --auto_tie_breaker=1
>> pcs stonith sbd enable

What does your /etc/sysconfig/sbd look like? With just that pcs command you
get a default config with watchdog-only support. Without the cluster
property stonith-watchdog-timeout set to a value matching the
watchdog-timeout configured in /etc/sysconfig/sbd (default = 5s; twice that
is a good choice), a node will never assume the unseen partner has been
fenced.
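For illustration, a matching pair of settings might look like this (a
minimal sketch with example values; SBD_WATCHDOG_TIMEOUT has to suit your
hardware watchdog, and 10s is simply twice the 5s default):

  # relevant bits of /etc/sysconfig/sbd (watchdog-only, no shared disk):
  SBD_WATCHDOG_DEV=/dev/watchdog
  SBD_WATCHDOG_TIMEOUT=5

  # matching cluster property, so pacemaker assumes an unseen node is
  # fenced once its watchdog must have fired (twice the watchdog timeout):
  pcs property set stonith-watchdog-timeout=10s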
Anyway, watchdog-only sbd is of very limited use in 2-node scenarios: it
more or less limits availability to that of the node that would win the
tie-breaker game. But it might still be useful in certain scenarios, of
course (like load-sharing ...).

>> pcs cluster start --all
>> pcs property set no-quorum-policy=ignore
>> # or pcs property set no-quorum-policy=freeze
>> # or pcs property set no-quorum-policy=suicide
>> pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s
>> on-fail=fence clone interleave=true ordered=true
>> while ! dlm_tool join testls; do sleep 1; done
>> crm_mon -1
>> pcs cluster node add cluster2&
>> journalctl --follow
>>
>> What am I doing wrong, and how can I avoid fencing?
>> I thought that setting no-quorum-policy to ignore would prevent this (if

That will just prevent self-fencing in case of lost quorum. Other reasons
for self-fencing are still possible, e.g. failure of dlm in your case, or a
node becoming unclean.
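If the question is how to avoid the fencing entirely, one option (an
untested sketch reusing the node names and resource definition from the
steps above; the grep pattern is an assumption about crm_mon's output
format) is to not create the dlm clone until the 2nd node is actually
online:

  # hold off on DLM until cluster2 has joined, so a slow join can't
  # leave an active lockspace behind on a lone node
  until crm_mon -1 | grep -q 'Online:.*cluster2'; do
      sleep 5
  done
  pcs resource create dlm ocf:pacemaker:controld \
      op monitor interval=30s on-fail=fence \
      clone interleave=true ordered=true

As long as dlm has no lockspace there is nothing it needs fencing to
protect, which is exactly the window your reproduction hits.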
Regards,
Klaus

>> I have just 1 node I don't really need fencing until the 2nd node is
>> actually up), but if there are any active DLM lockspaces that doesn't
>> seem to be the case.
>>
>> Thanks,
>> --Edwin

_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org