On Fri, 2021-08-06 at 14:53 +0200, Ulrich Windl wrote:
> Hi!
>
> I had this unexpected behavior this morning:
> A VM resource failed to stop and the cluster node (hypervisor) was
> fenced.
> However there were VMs running that could have been live-migrated.
> That wasn't even tried.
>
> On another occasion when multiple VMs were to be live-migrated and
> one failed,
> I thought the cluster would wait for the migrations to finish before
> issuing the fence command.
>
> So a few questions:
> Is it intended that no migrations are attempted before fencing a
> node?

Yes, it's assumed that if a node needs fencing, it can't be relied on to
do anything properly.

> If so, is there a way to make the cluster attempt migration before
> fencing?

No.

> Is it true that the cluster will delay fencing while there are
> outstanding (not finished) operations?

It depends.

Was the fencing scheduled in the same transition as the pending
operation? If not, the cluster will have to wait for the pending
operation to complete before it can start a new transition that will
schedule the fencing.

Can the cluster execute multiple actions simultaneously (per
batch-limit, CPU core count and load, etc.)? If not, the fencing might
have to wait for the pending operation.

Was the fencing scheduled by the cluster itself (vs DLM or
stonith_admin)? If not, none of the above applies, and the fencing won't
wait for the pending operation.

> OS being used is SLES15 SP2 (pacemaker version
> 2.0.4+20200616.2deceaa3a-3.9.1-2.0.4+20200616.2deceaa3a,
> corosync-2.4.5-6.3.2.x86_64,
> resource-agents-4.4.0+git57.70549516-3.23.1.x86_64)
>
> Regards,
> Ulrich

-- 
Ken Gaillot
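
For reference, the three questions in the "It depends" answer can be read as a small decision procedure. The sketch below is only a toy model of that reasoning in Python, not Pacemaker code: the function name, the enum, and the three boolean parameters are all made up for illustration, and the final branch (same transition, parallel execution possible) is inferred from the answer rather than stated in it.

```python
from enum import Enum


class FenceDelay(Enum):
    """Hypothetical outcomes; these names are not Pacemaker terminology."""
    NO_WAIT = "fencing does not wait for the pending operation"
    MAY_WAIT = "fencing might have to wait for the pending operation"
    WAITS = "fencing waits until the pending operation completes"


def fencing_delay(scheduled_by_cluster: bool,
                  same_transition: bool,
                  can_run_in_parallel: bool) -> FenceDelay:
    """Toy model of the three questions asked in the reply above."""
    # Fencing requested from outside the cluster's scheduler (for example
    # by DLM or stonith_admin) is not subject to the other two checks.
    if not scheduled_by_cluster:
        return FenceDelay.NO_WAIT
    # Fencing that needs a new transition cannot be scheduled until the
    # pending operation in the current transition has completed.
    if not same_transition:
        return FenceDelay.WAITS
    # Same transition, but the cluster cannot run more actions at once
    # (batch-limit, CPU core count and load, etc.).
    if not can_run_in_parallel:
        return FenceDelay.MAY_WAIT
    # Assumption: same transition with spare capacity means no delay.
    return FenceDelay.NO_WAIT


if __name__ == "__main__":
    # Cluster-scheduled fencing that needs a new transition.
    print(fencing_delay(True, False, True).value)
```

The order of the checks mirrors the remark that for externally requested fencing "none of the above applies", so that case is tested first.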
