On 02/22/2018 02:55 PM, [email protected] wrote:
> Hi,
>
> I am trying to configure the failure-timeout for stonith, but I can only do
> it for the other resources.
> When I try to enable it for stonith, I get this error: "Error: resource
> option(s): 'failure-timeout', are not recognized for resource type:
> 'stonith::fence_vmware_soap'".

It is a meta-attribute, so 'pcs stonith update ... meta failure-timeout=...'
should work, although I'm not 100% sure whether it is honored properly.

Regards,
Klaus

>
> Thanks.
>
> On 22 February 2018 13:46, "Andrei Borzenkov" <[email protected]> wrote:
>
>> On Thu, Feb 22, 2018 at 2:40 PM, <[email protected]> wrote:
>>
>>> Thanks for the responses.
>>>
>>> So, if I understand correctly, this is the right behaviour and it does
>>> not affect the stonith mechanism.
>>>
>>> If I remember correctly, the fault status persists for hours until I fix it
>>> manually.
>>> Is there any way to modify the expiry time so that it clears itself?
>> Yes, as mentioned, set the failure-timeout resource meta-attribute.
>>
>>> On 22 February 2018 12:28, "Andrei Borzenkov" <[email protected]> wrote:
>>>
>>>> Stonith resource state should have no impact on actual stonith
>>>> operation. It only reflects whether monitor was successful or not and
>>>> serves as warning to administrator that something may be wrong. It
>>>> should automatically clear itself after failure-timeout has expired.
>>>>
>>>> On Thu, Feb 22, 2018 at 1:58 PM, <[email protected]> wrote:
>>> Hi,
>>>
>>> I have a 2 node pacemaker cluster configured with the fence agent
>>> vmware_soap.
>>> Everything works fine until the vCenter is restarted. After that, stonith
>>> fails and stops.
>>>
>>> [root@node1 ~]# pcs status
>>> Cluster name: psqltest
>>> Stack: corosync
>>> Current DC: node2 (version 1.1.16-12.el7_4.7-94ff4df) - partition with
>>> quorum
>>> Last updated: Thu Feb 22 11:30:22 2018
>>> Last change: Mon Feb 19 09:28:37 2018 by root via crm_resource on node1
>>>
>>> 2 nodes configured
>>> 6 resources configured
>>>
>>> Online: [ node1 node2 ]
>>>
>>> Full list of resources:
>>>
>>> Master/Slave Set: ms_drbd_psqltest [drbd_psqltest]
>>>     Masters: [ node1 ]
>>>     Slaves: [ node2 ]
>>> Resource Group: pgsqltest
>>>     psqltestfs (ocf::heartbeat:Filesystem): Started node1
>>>     psqltest_vip (ocf::heartbeat:IPaddr2): Started node1
>>>     postgresql-94 (ocf::heartbeat:pgsql): Started node1
>>> vmware_soap (stonith:fence_vmware_soap): Stopped
>>>
>>> Failed Actions:
>>> * vmware_soap_start_0 on node1 'unknown error' (1): call=38, status=Error,
>>> exitreason='none',
>>> last-rc-change='Thu Feb 22 10:55:46 2018', queued=0ms, exec=5374ms
>>> * vmware_soap_start_0 on node2 'unknown error' (1): call=56, status=Error,
>>> exitreason='none',
>>> last-rc-change='Thu Feb 22 10:55:39 2018', queued=0ms, exec=5479ms
>>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>>
>>> [root@node1 ~]# pcs stonith show --full
>>> Resource: vmware_soap (class=stonith type=fence_vmware_soap)
>>> Attributes: inet4_only=1 ipaddr=192.168.1.1 ipport=443 login=MYDOMAIN\User
>>> passwd=mypass pcmk_host_list=node1,node2 power_wait=3 ssl_insecure=1 action=
>>> pcmk_list_timeout=120s pcmk_monitor_timeout=120s pcmk_status_timeout=120s
>>> Operations: monitor interval=60s (vmware_soap-monitor-interval-60s)
>>>
>>> I need to manually perform a "resource cleanup vmware_soap" to put it online
>>> again.
>>> Is there any way to do this automatically?
>>> Is it possible to detect vSphere online again and enable stonith?
>>>
>>> Thanks.
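
Following Klaus's suggestion above, here is a minimal sketch of setting failure-timeout as a meta attribute on the stonith resource, together with the manual cleanup the original poster already uses. The 300s value, the function names and the DRY_RUN wrapper are assumptions for illustration and do not come from the thread. The pcs syntax is the one quoted in the reply, and as noted there it is not confirmed that the setting is honored for stonith resources.

```shell
#!/bin/sh
# Sketch only: the timeout value and the DRY_RUN wrapper are assumptions.

STONITH_ID="vmware_soap"     # resource name taken from the pcs status output
FAILURE_TIMEOUT="300s"       # hypothetical value; choose one that suits the site

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set failure-timeout as a meta attribute (syntax as suggested in the thread).
set_failure_timeout() {
    run pcs stonith update "$1" meta "failure-timeout=$2"
}

# Manual fallback: clear the failed state of the resource by hand.
cleanup_resource() {
    run pcs resource cleanup "$1"
}

set_failure_timeout "$STONITH_ID" "$FAILURE_TIMEOUT"
cleanup_resource "$STONITH_ID"
```

With DRY_RUN left at its default, the script only prints the two pcs commands. Running it with DRY_RUN=0 on a cluster node would apply them.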
_______________________________________________
Users mailing list: [email protected]
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
