Hi Robert,
Your corosync.conf looks fine to me: node app3 has been removed and the
two_node flag has been set properly. Try running 'pcs cluster reload
corosync' and then check the quorum status again. If that doesn't fix
it, take a look at /var/log/cluster/corosync.log to see if any issues
are reported.
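
Something along these lines should be enough (corosync-quorumtool is just
another way to read the same quorum status):

  pcs cluster reload corosync
  corosync-quorumtool -s     # or: pcs quorum status
  grep -i quorum /var/log/cluster/corosync.log | tail -n 50
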
Regards,
Tomas
On 31. 05. 21 at 20:55, Hayden,Robert wrote:
-----Original Message-----
From: Users <[email protected]> On Behalf Of Tomas Jelinek
Sent: Monday, May 31, 2021 6:29 AM
To: [email protected]
Subject: Re: [ClusterLabs] Quorum when reducing cluster from 3 nodes to 2 nodes
Hi Robert,
Can you share your /etc/corosync/corosync.conf file? Also check if it's
the same on all nodes.
I verified that the corosync.conf file is the same across the nodes. As part of
the troubleshooting, I manually ran "crm_node --remove=app3 --force" to remove
the third node from the corosync configuration. My concern is why the quorum
number did not automatically drop to 1, especially since we run with the
"last_man_standing" flag. I suspect the issue is in the two-node special case;
that is, if I were removing a node from a 4+ node cluster, I would not have had
an issue.
Here is the information you requested, slightly redacted for security.
root:@app1:/root
#20:45:02 # cat /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: XXXX_app_2
    secauth: off
    transport: udpu
    token: 61000
}

nodelist {
    node {
        ring0_addr: app1
        nodeid: 1
    }

    node {
        ring0_addr: app2
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
    wait_for_all: 1
    last_man_standing: 1
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
root:@app1:/root
#20:45:12 # ssh app2 md5sum /etc/corosync/corosync.conf
d69b80cd821ff44224b56ae71c5d731c /etc/corosync/corosync.conf
root:@app1:/root
#20:45:30 # md5sum /etc/corosync/corosync.conf
d69b80cd821ff44224b56ae71c5d731c /etc/corosync/corosync.conf
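
If it helps, I can also run "corosync-cmapctl | grep -i quorum" on both nodes
to compare what the running corosync has actually loaded (assuming the runtime
cmap reflects the quorum settings).
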
Thanks
Robert
On 26. 05. 21 at 17:48, Hayden,Robert wrote:
I had a SysAdmin reduce the number of nodes in an OL 7.9 cluster from
three nodes to two.
From internal testing, I found the following commands would work and
the 2Node attribute would be automatically added. The other cluster
parameters we use are WaitForAll and LastManStanding.
pcs resource disable res_app03
pcs resource delete res_app03
pcs cluster node remove app03
pcs stonith delete fence_app03
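
To confirm the change took effect, I would normally follow up with something
like:

pcs quorum status
pcs cluster corosync | grep -A5 quorum

(the second command just prints the corosync.conf that pcs sees; the grep is
only for convenience).
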
Unfortunately, the real world didn't go as planned. I am unsure whether the
commands were run out of order or something else was going on (e.g.
unexpected location constraints). When I got involved, I noticed that
pcs status showed the app3 node in an OFFLINE state, even though the
"pcs cluster node remove app03" command had been successful. I also noticed
some leftover location constraints from past "moves" of resources. I manually
removed those constraints and ended up removing the app03 node from the
corosync configuration with the "crm_node --remove=app3 --force" command.
This removes the node from the pacemaker config, not from the corosync config.
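(The corosync side is normally handled by "pcs cluster node remove" itself,
which rewrites corosync.conf; a "pcs cluster reload corosync" afterwards may
still be needed before the runtime quorum settings pick up the change.)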
Regards,
Tomas
Now pcs status no longer shows any information for app3 and crm_node
-l does not show app3.
My concern is with quorum. From the pcs quorum status output below, I
still see Quorum set at 2 (I expected it to be 1) and the 2Node flag was
not added. Am I stuck in this state until the next full cluster
downtime, or is there a way to manipulate the expected quorum votes in
the running cluster?
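(For example, would something like "pcs quorum expected-votes 1" or
"corosync-quorumtool -e 1" be safe to run here? I have not tried either on
this cluster.)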
#17:25:08 # pcs quorum status
Quorum information
------------------
Date: Wed May 26 17:25:16 2021
Quorum provider: corosync_votequorum
Nodes: 2
Node ID: 3
Ring ID: 1/85
Quorate: Yes
Votequorum information
----------------------
Expected votes: 2
Highest expected: 2
Total votes: 2
Quorum: 2
Flags: Quorate WaitForAll LastManStanding
Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1         NR app1
         3          1         NR app2 (local)
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/