I don't know why this happens, but I encounter this often.  My workaround is 
this:

killall -9 pacemakerd; killall pengine; killall lrmd; killall cib; killall corosync
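
The same workaround can be wrapped in a small script so the commands can be reviewed before they run. This is only a rough sketch: the daemon names and signals are copied from the line above, while `force_stop_cluster` and the `DRY_RUN` switch are hypothetical names made up for illustration. It assumes `killall` (psmisc) is installed and only affects the node it runs on.

```shell
#!/bin/sh
# Sketch: force-stop leftover cluster daemons on the local node.
# force_stop_cluster and DRY_RUN are hypothetical names, not part of pcs.

# Default to dry-run: print the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

force_stop_cluster() {
    # pacemakerd gets SIGKILL (-9); the others get killall's default
    # SIGTERM, mirroring the one-liner above.
    for spec in "-9 pacemakerd" "pengine" "lrmd" "cib" "corosync"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "killall $spec"
        else
            # $spec is left unquoted on purpose so "-9 pacemakerd"
            # splits into two arguments. A daemon that is already
            # gone is not treated as an error.
            killall $spec || true
        fi
    done
}

force_stop_cluster
```

Run it as-is to see what would be killed, then with `DRY_RUN=0` as root to actually send the signals.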

> On May 10, 2018, at 11:26 PM, 范国腾 <[email protected]> wrote:
> 
> Hi,
> 
> When I run "pcs cluster stop --all", it sometimes hangs with no response.
> The log is below. Can we tell from the log why it hangs, and how can we
> make the cluster stop immediately?
> 
> [root@node2 pg_log]# pcs status
> Cluster name: hgpurog
> Stack: corosync
> Current DC: sds1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
> Last updated: Fri May 11 01:11:26 2018
> Last change: Fri May 11 01:09:24 2018 by hacluster via crmd on sds1
> 
> 2 nodes configured
> 3 resources configured
> 
> Online: [ sds1 sds2 ]
> 
> Full list of resources:
> 
> Master/Slave Set: pgsql-ha [pgsqld]
>     Stopped: [ sds1 sds2 ]
> Resource Group: mastergroup
>     master-vip (ocf::heartbeat:IPaddr2):       Started sds1
> 
> Daemon Status:
>  corosync: active/enabled
>  pacemaker: active/enabled
>  pcsd: active/enabled
> [root@node2 pg_log]# pcs cluster stop --all
> 
> 
> The /var/log/messages output is as follows:
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> 
> S_NOT_DC
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_NOT_DC -> 
> S_PENDING
> May 11 01:07:50 node2 crmd[5365]:  notice: State transition S_PENDING -> 
> S_NOT_DC
> May 11 01:07:51 node2 pgsqlms(pgsqld)[5371]: INFO: Execute action monitor and 
> the result 7
> May 11 01:07:51 node2 pgsqlms(undef)[5408]: INFO: Execute action meta-data 
> and the result 0
> May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for 
> pgsqld on sds2: 7 (not running)
> May 11 01:07:51 node2 crmd[5365]:  notice: sds2-pgsqld_monitor_0:6 [ 
> /tmp:5866 - no response\n ]
> May 11 01:07:51 node2 crmd[5365]:  notice: Result of probe operation for 
> master-vip on sds2: 7 (not running)
> May 11 01:10:02 node2 systemd: Started Session 16 of user root.
> May 11 01:10:02 node2 systemd: Starting Session 16 of user root.
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node2 systemd: Stopping Pacemaker High Availability Cluster 
> Manager...
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Shutting down Pacemaker
> May 11 01:11:33 node2 pacemakerd[5357]:  notice: Stopping crmd
> May 11 01:11:33 node2 crmd[5365]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node2 crmd[5365]:  notice: Shutting down cluster resource 
> manager
> May 11 01:12:49 node2 systemd: Started Session 17 of user root.
> May 11 01:12:49 node2 systemd-logind: New session 17 of user root.
> May 11 01:12:49 node2 gdm-launch-environment]: AccountsService: 
> ActUserManager: user (null) has no username (object path: 
> /org/freedesktop/Accounts/User0, uid: 0)
> May 11 01:12:49 node2 journal: ActUserManager: user (null) has no username 
> (object path: /org/freedesktop/Accounts/User0, uid: 0)
> May 11 01:12:49 node2 systemd: Starting Session 17 of user root.
> May 11 01:12:49 node2 dbus[648]: [system] Activating service 
> name='org.freedesktop.problems' (using servicehelper)
> May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Activating service 
> name='org.freedesktop.problems' (using servicehelper)
> May 11 01:12:49 node2 dbus[648]: [system] Successfully activated service 
> 'org.freedesktop.problems'
> May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Successfully activated 
> service 'org.freedesktop.problems'
> May 11 01:12:49 node2 journal: g_dbus_interface_skeleton_unexport: assertion 
> 'interface_->priv->connections != NULL' failed
> 
> Here is the log on the peer node:
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: No secondary connected 
> to the master
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: "sds2" is not 
> connected to the primary
> May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: INFO: Execute action monitor 
> and the result 8
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: No secondary connected 
> to the master
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: "sds2" is not 
> connected to the primary
> May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: INFO: Execute action monitor 
> and the result 8
> May 11 01:09:24 node1 crmd[1111]:  notice: sds1-pgsqld_monitor_10000:19 [ 
> /tmp:5866 - accepting connections\n ]
> May 11 01:09:24 node1 crmd[1111]:  notice: Transition aborted by deletion of 
> lrm_resource[@id='pgsqld']: Resource state removal
> May 11 01:10:02 node1 systemd: Started Session 17 of user root.
> May 11 01:10:02 node1 systemd: Starting Session 17 of user root.
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node1 systemd: Stopping Pacemaker High Availability Cluster 
> Manager...
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Shutting down Pacemaker
> May 11 01:11:33 node1 pacemakerd[1042]:  notice: Stopping crmd
> May 11 01:11:33 node1 crmd[1111]:  notice: Caught 'Terminated' signal
> May 11 01:11:33 node1 crmd[1111]:  notice: Shutting down cluster resource 
> manager
> May 11 01:11:33 node1 crmd[1111]: warning: Input I_SHUTDOWN received in state 
> S_TRANSITION_ENGINE from crm_shutdown
> 
> 
> _______________________________________________
> Users mailing list: [email protected]
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
