Hi,
When I run "pcs cluster stop --all", it sometimes hangs and never returns any
response. The logs are below. Can the reason for the hang be determined from
them, and how can I make the cluster stop immediately?
[root@node2 pg_log]# pcs status
Cluster name: hgpurog
Stack: corosync
Current DC: sds1 (version 1.1.16-12.el7-94ff4df) - partition with quorum
Last updated: Fri May 11 01:11:26 2018
Last change: Fri May 11 01:09:24 2018 by hacluster via crmd on sds1
2 nodes configured
3 resources configured
Online: [ sds1 sds2 ]
Full list of resources:
Master/Slave Set: pgsql-ha [pgsqld]
Stopped: [ sds1 sds2 ]
Resource Group: mastergroup
master-vip (ocf::heartbeat:IPaddr2): Started sds1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@node2 pg_log]# pcs cluster stop --all
The /var/log/messages on this node (node2) is as follows:
May 11 01:07:50 node2 crmd[5365]: notice: State transition S_PENDING ->
S_NOT_DC
May 11 01:07:50 node2 crmd[5365]: notice: State transition S_NOT_DC ->
S_PENDING
May 11 01:07:50 node2 crmd[5365]: notice: State transition S_PENDING ->
S_NOT_DC
May 11 01:07:51 node2 pgsqlms(pgsqld)[5371]: INFO: Execute action monitor and
the result 7
May 11 01:07:51 node2 pgsqlms(undef)[5408]: INFO: Execute action meta-data and
the result 0
May 11 01:07:51 node2 crmd[5365]: notice: Result of probe operation for pgsqld
on sds2: 7 (not running)
May 11 01:07:51 node2 crmd[5365]: notice: sds2-pgsqld_monitor_0:6 [ /tmp:5866
- no response\n ]
May 11 01:07:51 node2 crmd[5365]: notice: Result of probe operation for
master-vip on sds2: 7 (not running)
May 11 01:10:02 node2 systemd: Started Session 16 of user root.
May 11 01:10:02 node2 systemd: Starting Session 16 of user root.
May 11 01:11:33 node2 pacemakerd[5357]: notice: Caught 'Terminated' signal
May 11 01:11:33 node2 systemd: Stopping Pacemaker High Availability Cluster
Manager...
May 11 01:11:33 node2 pacemakerd[5357]: notice: Shutting down Pacemaker
May 11 01:11:33 node2 pacemakerd[5357]: notice: Stopping crmd
May 11 01:11:33 node2 crmd[5365]: notice: Caught 'Terminated' signal
May 11 01:11:33 node2 crmd[5365]: notice: Shutting down cluster resource
manager
May 11 01:12:49 node2 systemd: Started Session 17 of user root.
May 11 01:12:49 node2 systemd-logind: New session 17 of user root.
May 11 01:12:49 node2 gdm-launch-environment]: AccountsService: ActUserManager:
user (null) has no username (object path: /org/freedesktop/Accounts/User0, uid:
0)
May 11 01:12:49 node2 journal: ActUserManager: user (null) has no username
(object path: /org/freedesktop/Accounts/User0, uid: 0)
May 11 01:12:49 node2 systemd: Starting Session 17 of user root.
May 11 01:12:49 node2 dbus[648]: [system] Activating service
name='org.freedesktop.problems' (using servicehelper)
May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Activating service
name='org.freedesktop.problems' (using servicehelper)
May 11 01:12:49 node2 dbus[648]: [system] Successfully activated service
'org.freedesktop.problems'
May 11 01:12:49 node2 dbus-daemon: dbus[648]: [system] Successfully activated
service 'org.freedesktop.problems'
May 11 01:12:49 node2 journal: g_dbus_interface_skeleton_unexport: assertion
'interface_->priv->connections != NULL' failed
Here is the log from the peer node (node1):
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: No secondary connected
to the master
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: WARNING: "sds2" is not connected
to the primary
May 11 01:09:08 node1 pgsqlms(pgsqld)[28599]: INFO: Execute action monitor and
the result 8
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: No secondary connected
to the master
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: WARNING: "sds2" is not connected
to the primary
May 11 01:09:18 node1 pgsqlms(pgsqld)[28679]: INFO: Execute action monitor and
the result 8
May 11 01:09:24 node1 crmd[1111]: notice: sds1-pgsqld_monitor_10000:19 [
/tmp:5866 - accepting connections\n ]
May 11 01:09:24 node1 crmd[1111]: notice: Transition aborted by deletion of
lrm_resource[@id='pgsqld']: Resource state removal
May 11 01:10:02 node1 systemd: Started Session 17 of user root.
May 11 01:10:02 node1 systemd: Starting Session 17 of user root.
May 11 01:11:33 node1 pacemakerd[1042]: notice: Caught 'Terminated' signal
May 11 01:11:33 node1 systemd: Stopping Pacemaker High Availability Cluster
Manager...
May 11 01:11:33 node1 pacemakerd[1042]: notice: Shutting down Pacemaker
May 11 01:11:33 node1 pacemakerd[1042]: notice: Stopping crmd
May 11 01:11:33 node1 crmd[1111]: notice: Caught 'Terminated' signal
May 11 01:11:33 node1 crmd[1111]: notice: Shutting down cluster resource
manager
May 11 01:11:33 node1 crmd[1111]: warning: Input I_SHUTDOWN received in state
S_TRANSITION_ENGINE from crm_shutdown