Hi!

I'm sorry that DLM/cLVM does not work for you. Did you double-check the configuration (meta interleave=true, colocation and ordering), especially for the clones? Also, since you have shared storage, why don't you use SBD for fencing?
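
For reference, here is a minimal sketch of the kind of clone configuration I mean, in crm shell syntax. The resource and constraint names (dlm, clvm, cl-dlm, cl-clvm, o-dlm-before-clvm, c-clvm-with-dlm), the ocf:lvm2:clvmd agent and the monitor timings are assumptions for illustration, not taken from your cluster. Use whichever agents your distribution ships.

```
# Illustrative only: names, agent choice and timings are assumptions
primitive dlm ocf:pacemaker:controld \
    op monitor interval=60 timeout=60
primitive clvm ocf:lvm2:clvmd \
    op monitor interval=60 timeout=60
clone cl-dlm dlm meta interleave=true
clone cl-clvm clvm meta interleave=true
order o-dlm-before-clvm Mandatory: cl-dlm cl-clvm
colocation c-clvm-with-dlm inf: cl-clvm cl-dlm
```

With interleave=true, each clvm instance depends only on the dlm instance on its own node rather than on every dlm instance in the cluster, so a problem on one node should not force a restart on the other.
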
Regards,
Ulrich

>>> Patrick Whitney <[email protected]> wrote on 01.10.2018 at 22:01 in message
<cae0zlk_va6gthz9tg3woecua2ridaehaoq8ieqdz4meokcy...@mail.gmail.com>:
> Hi Ulrich,
>
> When I first encountered this issue, I posted this:
>
> https://lists.clusterlabs.org/pipermail/users/2018-September/015637.html
>
> ... I was using resource fencing in this example, but, as I've mentioned
> before, the issue would come about, not when fencing occurred, but when the
> fenced node was shut down (we were using resource fencing).
>
> During that discussion, you and others suggested that power fencing
> was the only way DLM was going to cooperate, and one suggestion of using
> meatware was proposed.
>
> Unfortunately, I found out later that meatware was no longer available (
> https://lists.clusterlabs.org/pipermail/users/2018-September/015715.html),
> but we were lucky enough that our test environment is a KVM/libvirt
> environment, so I used fence_virsh. Again, I had the same problem... when
> the "bad" node was fenced, dlm_controld would issue (what appears to be) a
> fence_all, and I would receive messages that the dlm clone was down on all
> members and would have a log message that the clvm lockspace was
> abandoned.
>
> It was only when I disabled fencing for dlm (enable_fencing=0 in dlm.conf,
> but kept fencing enabled in pcmk) that things began to work as expected.
>
> One suggestion earlier in this thread was to try the dlm configuration
> of disabling startup fencing (enable_startup_fencing=0), which sounds like
> a plausible solution after looking over the logs, but I haven't tested it
> yet.
>
> The conclusion I'm coming to is:
> 1. The reason DLM cannot handle resource fencing is that it keeps its
> own "heartbeat/control" channel (for lack of a better term) via the
> network, and pcmk cannot instruct DLM "Don't worry about that guy over
> there", which means we must use power fencing; but
> 2. DLM does not like to see one of its members disappear; when that does
> happen, DLM does "something" which causes the lockspace to disappear...
> unless you disable fencing for DLM.
>
> I am now speculating that DLM restarts when the communications fail, and
> that disabling startup fencing for DLM (enable_startup_fencing=0) may be
> the solution to my problem (reverting my enable_fencing=0 DLM config).
>
> Best,
> -Pat
>
> On Mon, Oct 1, 2018 at 3:38 PM Ulrich Windl <
> [email protected]> wrote:
>
>> Hi!
>>
>> It would be much more helpful if you could provide logs around the
>> problem events. Personally I think you _must_ implement proper fencing. In
>> addition, DLM seems to do its own fencing when there is a communication
>> problem.
>>
>> Regards,
>> Ulrich
>>
>>
>> >>> Patrick Whitney <[email protected]> 01.10.18 16:25 >>>
>> Hi Everyone,
>>
>> I wanted to solicit input on my configuration.
>>
>> I have a two node (test) cluster running corosync/pacemaker with DLM and
>> CLVM.
>>
>> I was running into an issue where when one node failed, the remaining node
>> would appear to do the right thing, from the pcmk perspective, that is.
>> It would create a new cluster (of one) and fence the other node, but
>> then, rather surprisingly, DLM would see the other node offline, and it
>> would go offline itself, abandoning the lockspace.
>>
>> I changed my DLM settings to "enable_fencing=0", disabling DLM fencing, and
>> our tests are now working as expected.
>>
>> I'm a little concerned I have masked an issue by doing this, as in all of the
>> tutorials and docs I've read, there is no mention of having to configure
>> DLM whatsoever.
>>
>> Is anyone else running a similar stack and can comment?
>>
>> Best,
>> -Pat
>> --
>> Patrick Whitney
>> DevOps Engineer -- Tools
>>
>> _______________________________________________
>> Users mailing list: [email protected]
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>
> --
> Patrick Whitney
> DevOps Engineer -- Tools
