On Mon, 2018-12-03 at 00:05 -0700, Casey & Gina wrote:
> So I've been using the fence_vmware_rest fence agent for a long while
> now. It seems to work great, except that after a few days or weeks,
> a given cluster will end up showing it as failed and stopped.
>
> For whatever reason, fencing continues to work when needed, but
> seeing the fence agent errored out and stopped when looking at the
> cluster status is alarming.

The fence resource is used primarily to control recurring monitoring of
the device. The only effect it has on the use of the device for fencing
is that disabling the resource (or banning it from running anywhere)
will make the cluster stop using the device.

So, having the resource stopped means the device is no longer being
monitored, which is bad but doesn't prevent it from being used for
fencing.

> I'm hoping to write a cron script that detects when there are failed
> actions for this agent, and issue a `pcs resource cleanup
> vmware_fence` when there are, which always seems to fix things back
> to normal.
>
> Is there a Pacemaker command that will show *just* the failed
> actions, as opposed to the combined output of utilities like crm_mon
> and pcs status?
>
> Thank you,

You probably want to use crm_mon -X to generate XML output and grab the
part you're interested in. It's a stable machine-readable interface as
opposed to the usual user-friendly output that can change release to
release.
-- 
Ken Gaillot <[email protected]>
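
As a rough illustration of the crm_mon -X approach, here is a minimal Python sketch of what such a cron script might look like. It is not taken from the thread. It assumes the XML contains <failure> elements whose op_key attribute (or id, as a fallback) starts with the resource name followed by an underscore, so check that against the output of your own Pacemaker version first. The function names failed_actions and cleanup_if_failed are hypothetical.

```python
import subprocess
import xml.etree.ElementTree as ET


def failed_actions(xml_text, resource_id):
    """Return the attributes of each <failure> entry that belongs to resource_id."""
    root = ET.fromstring(xml_text)
    matches = []
    for failure in root.iter("failure"):
        # Assumption: the operation key looks like "<resource>_<action>_<interval>",
        # e.g. "vmware_fence_monitor_60000"; verify against real crm_mon output.
        key = failure.get("op_key") or failure.get("id") or ""
        if key.startswith(resource_id + "_"):
            matches.append(dict(failure.attrib))
    return matches


def cleanup_if_failed(resource_id):
    """Run 'pcs resource cleanup' only when crm_mon reports failures for resource_id."""
    # crm_mon -X prints the cluster status as XML to stdout.
    xml_text = subprocess.run(
        ["crm_mon", "-X"], check=True, capture_output=True, text=True
    ).stdout
    failures = failed_actions(xml_text, resource_id)
    if failures:
        subprocess.run(["pcs", "resource", "cleanup", resource_id], check=True)
    return failures
```

A cron job could then call cleanup_if_failed("vmware_fence") and log whatever it returns. Keeping the parsing separate from the commands makes it easy to try failed_actions on a saved copy of the XML before letting the script clean anything up. On newer Pacemaker releases the -X option may need to be replaced with --output-as=xml.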
