I have reworked csync2's SSL keys, and I was able to use ha-cluster-join to add
the second node to the cluster. Thank you for the guidance!
However, not all the resources are happy with this.
eagnmnmeqfc1:/var/lib/pacemaker/cib # crm status
Stack: corosync
Current DC: eagnmnmeqfc0 (version 1.1.16-4.8-77ea74d) - partition with quorum
Last updated: Tue Dec 31 15:16:14 2019
Last change: Tue Dec 31 15:01:34 2019 by hacluster via crmd on eagnmnmeqfc0
2 nodes configured
16 resources configured
Online: [ eagnmnmeqfc0 eagnmnmeqfc1 ]
Full list of resources:
Resource Group: grp_ncoa
ncoa_dg_mqm (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a01 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a02 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a03 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a04 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_dg_a05 (ocf::heartbeat:LVM): Started eagnmnmeqfc0
ncoa_mqm (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a01shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a02shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a03shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a04shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
ncoa_a05shared (ocf::heartbeat:Filesystem): Started eagnmnmeqfc0
IP_56.76.161.36 (ocf::heartbeat:IPaddr2): Started eagnmnmeqfc0
ncoa_apache (systemd:apache2): Started eagnmnmeqfc0
ncoa_dg_a00 (ocf::heartbeat:LVM): FAILED [ eagnmnmeqfc0 eagnmnmeqfc1 ]
ncoa_a00shared (ocf::heartbeat:Filesystem): FAILED eagnmnmeqfc0 (blocked)
Failed Actions:
* ncoa_a00shared_stop_0 on eagnmnmeqfc0 'unknown error' (1): call=206, status=complete, exitreason='Couldn't unmount /ncoa/qncoa/a00shared, giving up!',
    last-rc-change='Tue Dec 31 15:01:35 2019', queued=0ms, exec=7478ms
* ncoa_dg_a00_monitor_0 on eagnmnmeqfc1 'unknown error' (1): call=141, status=complete, exitreason='WARNING: vg_qncoa_noncloned-a00 is active without the cluster tag, "pacemaker"',
    last-rc-change='Tue Dec 31 15:01:34 2019', queued=0ms, exec=287ms
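The second failed action says the VG is active on eagnmnmeqfc1 without the "pacemaker" tag that the heartbeat LVM agent uses for exclusive (tag-based) activation. A rough sketch of how one might inspect and clear that on the joining node, assuming the VG name from the error message (not verified against this setup):

```shell
# On eagnmnmeqfc1: show which tags the VG currently carries
vgs -o vg_name,vg_tags vg_qncoa_noncloned-a00

# If the VG was activated outside cluster control, deactivate it so the
# LVM resource agent can activate and tag it itself on the proper node
vgchange -an vg_qncoa_noncloned-a00

# If a stale tag lingers from an earlier activation, drop it
# (hypothetical cleanup step; check the tag name your agent expects)
vgchange --deltag pacemaker vg_qncoa_noncloned-a00
```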
The PV and VG are present on both servers. The resource is defined in cib.xml as:
<primitive id="ncoa_dg_a00" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a00-instance_attributes">
    <nvpair name="volgrpname" value="vg_qncoa_noncloned-a00" id="ncoa_dg_a00-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a00-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a00-monitor-60">
      <instance_attributes id="ncoa_dg_a00-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a00-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
<primitive id="ncoa_a00shared" class="ocf" provider="heartbeat" type="Filesystem">
  <instance_attributes id="ncoa_a00shared-instance_attributes">
    <nvpair name="device" value="/dev/vg_qncoa_noncloned-a00/lv_a00shared" id="ncoa_a00shared-instance_attributes-device"/>
    <nvpair name="directory" value="/ncoa/qncoa/a00shared" id="ncoa_a00shared-instance_attributes-directory"/>
    <nvpair name="fstype" value="xfs" id="ncoa_a00shared-instance_attributes-fstype"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_a00shared-monitor-60"/>
  </operations>
</primitive>
This is, as far as I can see, identical to one of the resources that is working:
<primitive id="ncoa_dg_a01" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a01-instance_attributes">
    <nvpair name="volgrpname" value="vg_qncoa_noncloned-a01" id="ncoa_dg_a01-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a01-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a01-monitor-60">
      <instance_attributes id="ncoa_dg_a01-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a01-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
<primitive id="ncoa_dg_a02" class="ocf" provider="heartbeat" type="LVM">
  <instance_attributes id="ncoa_dg_a02-instance_attributes">
    <nvpair name="volgrpname" value="vg_ncoa_cloned-a02" id="ncoa_dg_a02-instance_attributes-volgrpname"/>
    <nvpair name="exclusive" value="true" id="ncoa_dg_a02-instance_attributes-exclusive"/>
  </instance_attributes>
  <operations>
    <op name="monitor" interval="60" timeout="60" id="ncoa_dg_a02-monitor-60">
      <instance_attributes id="ncoa_dg_a02-monitor-60-instance_attributes">
        <nvpair name="is_managed" value="true" id="ncoa_dg_a02-monitor-60-instance_attributes-is_managed"/>
      </instance_attributes>
    </op>
  </operations>
</primitive>
"crm resource cleanup' doesn’t fix the problem.
Now, the /ncoa/qncoa/a00shared filesystem can't be unmounted because there are open files on it. Could the problem simply be that the node join wanted to unmount and remount all the disk resources and, since it couldn't, flagged that as an error?
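If that's the case, it may help to find what is holding the filesystem open before the stop action times out again. A sketch, assuming fuser/lsof are installed and using the mount point from the error:

```shell
# List processes with open files under the mount point
fuser -vm /ncoa/qncoa/a00shared

# Alternative view of the same information
lsof +f -- /ncoa/qncoa/a00shared

# Once the filesystem is free, retry the cleanup so Pacemaker re-probes
crm resource cleanup ncoa_a00shared
crm resource cleanup ncoa_dg_a00
```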
Thank you.
John Reynolds
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/