Ciao,

I'm among the people that have to deal with the infamous two-node problem (http://www.beekhof.net/blog/2018/two-node-problems).

I am not sure whether to open a bug for this, so I'm reporting it on the list first, in the hope of getting fast feedback.

Problem statement

I have a cluster made up of two nodes with a shared DRBD partition to which some resources (systemd services) have to stick.

Software versions

  • corosync -v
    Corosync Cluster Engine, version '2.4.5'
    Copyright (c) 2006-2009 Red Hat, Inc.
  • pacemakerd --version
    Pacemaker 1.1.21-4.el7
  • drbdadm --version
    DRBDADM_BUILDTAG=GIT-hash:\ fb98589a8e76783d2c56155c645dbaf02ac7ece7\ build\ by\ mockbuild@\,\ 2020-04-05\ 03:21:05
    DRBDADM_API_VERSION=2
    DRBD_KERNEL_VERSION_CODE=0x090010
    DRBD_KERNEL_VERSION=9.0.16
    DRBDADM_VERSION_CODE=0x090c02
    DRBDADM_VERSION=9.12.2

corosync.conf (nodelist and quorum sections):

nodelist {
    node {
        ring0_addr: 10.1.3.1
        nodeid: 1
    }
    node {
        ring0_addr: 10.1.3.2
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}

DRBD resource config:

resource myresource {

  volume 0 {
    device    /dev/drbd0;
    disk      /dev/mapper/vg0-res--etc;
    meta-disk internal;
  }

  on 123z555666y0 {
    node-id 0;
    address 10.1.3.1:7789;
  }

  on 123z555666y1 {
    node-id 1;
    address 10.1.3.2:7789;
  }

  connection {
    host 123z555666y0;
    host 123z555666y1;
  }

  handlers {
    before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
    after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
  }

}
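
As far as I understand, drbdadm matches the "on <name>" sections against the local hostname (uname -n), which is why this file has to be touched as part of the rename described below. Using the example names from further down, the relevant sections would end up looking like this (a sketch only, addresses and node-ids unchanged):

  on hostname20 {
    node-id 0;
    address 10.1.3.1:7789;
  }

  on hostname21 {
    node-id 1;
    address 10.1.3.2:7789;
  }

  connection {
    host hostname20;
    host hostname21;
  }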

I need to reconfigure the hostnames of both nodes of the cluster. I've gathered some literature around this, but have not yet found a way to address it (other than a simultaneous reboot of both nodes).

The procedure (condensed into a single shell transcript after this list):

  • Update the hostname on both Master and Slave nodes
    • update /etc/hostname
    • update /etc/hosts
    • update system with hostname -F /etc/hostname
  • Reconfigure drbd on Master and Slave nodes
    • modify drbd.01.conf (attached) to reflect new hostname
    • invoke drbdadm adjust all
  • Update pacemaker config on Master node only
    • crm configure property maintenance-mode=true
    • crm configure delete --force 1
    • crm configure delete --force 2
    • crm configure xml ' <node id="1" uname="newhostname0">
              <instance_attributes id="node-1">
                <nvpair id="node-1-standby" name="standby" value="off"/>
              </instance_attributes>
            </node>'
    • crm configure xml ' <node id="2" uname="newhostname1">
              <instance_attributes id="node-2">
                <nvpair id="node-2-standby" name="standby" value="off"/>
              </instance_attributes>
            </node>'
    • crm resource reprobe
    • crm configure refresh
    • crm configure property maintenance-mode=false
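
Put together, the above as a condensed shell transcript (a sketch only: the sed one-liners and the bare drbd.01.conf path stand in for the manual edits, and the hostnames are the example ones used below):

# On BOTH nodes (shown for the node going hostname10 -> hostname20;
# the other node does hostname11 -> hostname21 accordingly)
echo hostname20 > /etc/hostname
sed -i 's/hostname10/hostname20/g; s/hostname11/hostname21/g' /etc/hosts
hostname -F /etc/hostname

# On BOTH nodes: make the DRBD "on"/"host" entries carry the new names, then re-read the config
sed -i 's/hostname10/hostname20/g; s/hostname11/hostname21/g' drbd.01.conf
drbdadm adjust all

# On the Master node only
crm configure property maintenance-mode=true
crm configure delete --force 1
crm configure delete --force 2
crm configure xml '<node id="1" uname="hostname20"><instance_attributes id="node-1"><nvpair id="node-1-standby" name="standby" value="off"/></instance_attributes></node>'
crm configure xml '<node id="2" uname="hostname21"><instance_attributes id="node-2"><nvpair id="node-2-standby" name="standby" value="off"/></instance_attributes></node>'
crm resource reprobe
crm configure refresh
crm configure property maintenance-mode=false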

Let's say, for example, that I migrate the hostnames like this:

hostname10 -> hostname20
hostname11 -> hostname21

After the above procedure has completed, the cluster is correctly reconfigured: when I check with crm_mon, crm status, crm configure show xml, or even by inspecting cib.xml, I find the proper new hostnames picked up by pacemaker/corosync (hostname20 and hostname21).
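
For reference, these are the checks (the cibadmin query is just an equivalent way of reading the nodes section of the CIB):

crm_mon -1
crm status
crm configure show xml | grep uname
cibadmin --query --scope nodes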

The documentation reports that the pacemaker node name is taken from:

  1. corosync.conf nodelist->ring0_addr, if it is not an IP address: NOT MY CASE => skip
  2. corosync.conf nodelist->name, if available: NOT MY CASE => skip (see the sketch after this list)
  3. uname -n [MY CASE SHOULD FALL HERE]
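
For reference, rule 2 would only kick in with explicit name: entries in the nodelist, i.e. something like this (a sketch, not my current config):

nodelist {
    node {
        ring0_addr: 10.1.3.1
        name: hostname20
        nodeid: 1
    }
    node {
        ring0_addr: 10.1.3.2
        name: hostname21
        nodeid: 2
    }
}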

Apparently case number 3 does not apply:

[root@hostname20 ~]# crm_node -n
hostname10
[root@hostname20 ~]# uname -n
hostname20

This becomes evident as soon as I reboot/power off one of the two nodes: crm_mon, which after the reconfiguration was correctly showing

Online: [ hostname21 hostname20 ]

"rolls back" the configuration without any notice and starts showing the old one

Online: [ hostname10 ]
OFFLINE: [ hostname11 ]

Do you have any idea where on earth pacemaker is recovering the old hostnames from?

I've even checked the code and can see that there are cmaps involved, so I suspect some caching issue is at play.

It looks like it is retaining the old hostnames in memory and, when something "fails", it restores them.
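
A quick way to compare what the different layers believe (corosync-cmapctl dumps the whole cmap, so grepping the node-related keys is enough):

corosync-cmapctl | grep -i node    # corosync's view (nodelist and runtime members)
crm_node -l                        # the nodes pacemaker knows about
crm_node -n                        # the local node name pacemaker resolves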

Besides, don't blame me for this use case (reconfiguring hostnames in a two-node cluster); I didn't make it up, I just carry the pain.

R





Riccardo Manfrin
R&D DEPARTMENT
t +39 (0)444 750045
e [email protected]
ATHONET | Via Cà del Luogo, 6/8 - 36050 Bolzano Vicentino (VI) Italy