I have reworked csync2's SSL keys, and I was able to use ha-cluster-join to add
the second node to the cluster. Thank you for the guidance!
However, not all the resources are happy with this.
eagnmnmeqfc1:/var/lib/pacemaker/cib # crm status
Stack: corosync
Current DC: eagnmnmeqfc0 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Tue Dec 31 15:16:14 2019
Last change: Tue Dec 31 15:01:34 2019 by hacluster via crmd on eagnmnmeqfc0
2 nodes configured
16 resources configured
Online: [ eagnmnmeqfc0 eagnmnmeqfc1 ]
Full list of resources:
Resource Group: grp_ncoa
ncoa_dg_mqm (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a01 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a02 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a03 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a04 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a05 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_mqm (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a01shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a02shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a03shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a04shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a05shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
IP_56.76.161.36 (ocf::heartbeat:IPaddr2): Started eagnmnmeqfc0
ncoa_apache (systemd:apache2): Started eagnmnmeqfc0
ncoa_dg_a00 (ocf::heartbeat:LVM): FAILED [ eagnmnmeqfc0 eagnmnmeqfc1 ]
ncoa_a00shared (ocf::heartbeat:Filesystem): FAILED eagnmnmeqfc0 (blocked)
Failed Actions:
* ncoa_a00shared_stop_0 on eagnmnmeqfc0 'unknown error' (1): call=206, status=complete, exitreason='Couldn't unmount /ncoa/qncoa/a00shared, giving up!',
    last-rc-change='Tue Dec 31 15:01:35 2019', queued=0ms, exec=7478ms
* ncoa_dg_a00_monitor_0 on eagnmnmeqfc1 'unknown error' (1): call=141, status=complete, exitreason='WARNING: vg_qncoa_noncloned-a00 is active without the cluster tag, "pacemaker"',
    last-rc-change='Tue Dec 31 15:01:34 2019', queued=0ms, exec=287ms
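The second failed action says the VG is active on eagnmnmeqfc1 without the "pacemaker" tag that the heartbeat LVM agent uses for exclusive (tag-based) activation. A rough sketch of how one might inspect and clear that on the joining node, assuming the VG name from the error message (not verified against this setup):

```shell
# On eagnmnmeqfc1: show which tags the VG currently carries
vgs -o vg_name,vg_tags vg_qncoa_noncloned-a00

# If the VG was activated outside cluster control, deactivate it so the
# LVM resource agent can activate and tag it itself on the proper node
vgchange -an vg_qncoa_noncloned-a00

# If a stale tag lingers from an earlier activation, drop it
# (hypothetical cleanup step; check the tag name your agent expects)
vgchange --deltag pacemaker vg_qncoa_noncloned-a00
```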
The PV and VG are present on both servers. The resource is defined in cib.xml as:
<primitive id="ncoa_dg_a00" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a00-instance_attributes">
    <nvpair name="volgrpname" value="vg_qncoa_noncloned-a00" id="ncoa_dg_a00-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a00-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a00-monitor-60">
      <instance_attributes id="ncoa_dg_a00-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a00-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
<primitive id="ncoa_a00shared" class="ocf" provider="heartbeat" type="Filesystem">
  <instance_attributes id="ncoa_a00shared-instance_attributes">
    <nvpair name="device" value="/dev/vg_qncoa_noncloned-a00/lv_a00shared" id="ncoa_a00shared-instance_attributes-device"/>
    <nvpair name="directory" value="/ncoa/qncoa/a00shared" id="ncoa_a00shared-instance_attributes-directory"/>
    <nvpair name="fstype" value="xfs" id="ncoa_a00shared-instance_attributes-fstype"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_a00shared-monitor-60"/>
  </operations>
</primitive>
This is, as far as I can see, identical to one of the resources that is working:
<primitive id="ncoa_dg_a01" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a01-instance_attributes">
    <nvpair name="volgrpname" value="vg_qncoa_noncloned-a01" id="ncoa_dg_a01-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a01-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a01-monitor-60">
      <instance_attributes id="ncoa_dg_a01-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a01-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
<primitive id="ncoa_dg_a02" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a02-instance_attributes">
    <nvpair name="volgrpname" value="vg_ncoa_cloned-a02" id="ncoa_dg_a02-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a02-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a02-monitor-60">
      <instance_attributes id="ncoa_dg_a02-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a02-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
"crm resource cleanup' doesn’t fix the problem.
Now, the /ncoa/qncoa/a00shared filesystem can't be unmounted because there are open files on it. Could the problem simply be that the node join wanted to unmount and remount all the disk resources and, since it couldn't, flagged that as an error?
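If that's the case, it may help to find what is holding the filesystem open before the stop action times out again. A sketch, assuming fuser/lsof are installed and using the mount point from the error:

```shell
# List processes with open files under the mount point
fuser -vm /ncoa/qncoa/a00shared

# Alternative view of the same information
lsof +f -- /ncoa/qncoa/a00shared

# Once the filesystem is free, retry the cleanup so Pacemaker re-probes
crm resource cleanup ncoa_a00shared
crm resource cleanup ncoa_dg_a00
```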
Thank you.
John Reynolds
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/