Re: [ClusterLabs] Help with tweaking an active/passive NFS cluster

Ronny Adsetts Mon, 17 Apr 2023 03:17:56 -0700

Andrei Borzenkov wrote on 05/04/2023 08:36:
> On Fri, Mar 31, 2023 at 12:42 AM Ronny Adsetts wrote:
>>
>> Hi,
>>
>> I wonder if someone more familiar with the workings of pacemaker/corosync 
>> would be able to assist in solving an issue.
>>
>> I have a 3-node NFS cluster which exports several iSCSI LUNs. The LUNs are 
>> presented to the nodes via multipathd.
>>
>> This all works fine except that I can't stop just one export. Sometimes I 
>> need to take a single filesystem offline for maintenance for example. Or if 
>> there's an issue and a filesystem goes offline and can't come back.
>>
>> There's a trimmed down config below but essentially I want all the NFS 
>> exports on one node but I don't want any of the exports to block. So it's OK 
>> to stop (or fail) a single export.
>>
>> My config has a group for each export and filesystem and another group for 
>> the NFS server and VIP. I then co-locate them together.
>>
>> Cut-down config to limit the number of exports:
>>
>> node 1: nfs-01
>> node 2: nfs-02
>> node 3: nfs-03
>> primitive NFSExportAdminHomes exportfs \
>>         params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/adminhomes" fsid=dcfd1bbb-c026-4d6d-8541-7fc29d6fef1a \
>>         op monitor timeout=20 interval=10 \
>>         op_params interval=10
>> primitive NFSExportArchive exportfs \
>>         params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/archive" fsid=3abb6e34-bff2-4896-b8ff-fc1123517359 \
>>         op monitor timeout=20 interval=10 \
>>         op_params interval=10 \
>>         meta target-role=Started
>> primitive NFSExportDBBackups exportfs \
>>         params clientspec="172.16.40.0/24" options="rw,async,no_root_squash" directory="/srv/dbbackups" fsid=df58b9c0-593b-45c0-9923-155b3d7d9483 \
>>         op monitor timeout=20 interval=10 \
>>         op_params interval=10
>> primitive NFSFSAdminHomes Filesystem \
>>         params device="/dev/mapper/adminhomes-part1" directory="/srv/adminhomes" fstype=xfs \
>>         op start interval=0 timeout=120 \
>>         op monitor interval=60 timeout=60 \
>>         op_params OCF_CHECK_LEVEL=20 \
>>         op stop interval=0 timeout=240
>> primitive NFSFSArchive Filesystem \
>>         params device="/dev/mapper/archive-part1" directory="/srv/archive" fstype=xfs \
>>         op start interval=0 timeout=120 \
>>         op monitor interval=60 timeout=60 \
>>         op_params OCF_CHECK_LEVEL=20 \
>>         op stop interval=0 timeout=240 \
>>         meta target-role=Started
>> primitive NFSFSDBBackups Filesystem \
>>         params device="/dev/mapper/dbbackups-part1" directory="/srv/dbbackups" fstype=xfs \
>>         op start timeout=60 interval=0 \
>>         op monitor interval=20 timeout=40 \
>>         op stop timeout=60 interval=0 \
>>         op_params OCF_CHECK_LEVEL=20
>> primitive NFSIP-01 IPaddr2 \
>>         params ip=172.16.40.17 cidr_netmask=24 nic=ens14 \
>>         op monitor interval=30s
>> group AdminHomes NFSFSAdminHomes NFSExportAdminHomes \
>>         meta target-role=Started
>> group Archive NFSFSArchive NFSExportArchive \
>>         meta target-role=Started
>> group DBBackups NFSFSDBBackups NFSExportDBBackups \
>>         meta target-role=Started
>> group NFSServerIP NFSIP-01 NFSServer \
>>         meta target-role=Started
>> colocation NFSMaster inf: NFSServerIP AdminHomes Archive DBBackups
> 
> This is entirely equivalent to defining a group and says that
> resources must be started in strict order on the same node. Like with
> a group, if an earlier resource cannot be started, all following
> resources are not started either.
> 
>> property cib-bootstrap-options: \
>>         have-watchdog=false \
>>         dc-version=2.0.1-9e909a5bdd \
>>         cluster-infrastructure=corosync \
>>         cluster-name=nfs-cluster \
>>         stonith-enabled=false \
>>         last-lrm-refresh=1675344768
>> rsc_defaults rsc-options: \
>>         resource-stickiness=200
>>
>>
>> The problem is that if one export fails, none of the following exports will 
>> be attempted. Reading the docs, that's to be expected as each item in the 
>> colocation needs the preceding item to succeed.
>>
>> I tried changing the colocation line like so to remove the dependency:
>>
>> colocation NFSMaster inf: NFSServerIP ( AdminHomes Archive DBBackups )
>>
> 
> 1. The ( AdminHomes Archive DBBackups ) creates a set with
> sequential=false. Now, the documentation for "sequential" is one of
> the most obscure I have seen, but judging by "the individual members
> within any one set may or may not be colocated relative to each other
> (determined by the set’s sequential property)" and "A colocated set
> with sequential="false" makes sense only if there is another set in
> the constraint. Otherwise, the constraint has no effect" members of a
> set with sequential=false are not colocated on the same node.
> 
> 2. The condition is backward. You colocate NFSServerIP *with* set (
> AdminHomes Archive DBBackups ), while you actually want to colocate
> set ( AdminHomes Archive DBBackups ) *with* NFSServerIP.
> 
> So the
> 
> colocation NFSMaster inf: ( AdminHomes Archive DBBackups ) ( NFSServerIP )
> 
> may work.
> 
> The pacemaker behavior is rather puzzling though. According to
> documentation "in order for any member of one set in the constraint to
> be active, all members of sets listed after it must also be active
> (and naturally on the same node)", but in your case members of set are
> on the same node which would imply that NFSServerIP (which is a sole
> member of an implicit set) should not be active.


Thanks for the explainer here, that's really useful.

I don't spend much time tinkering with pacemaker, as it's only a tiny part of 
what I do, so I suffer from a lack of in-depth knowledge, which can be both 
painful and annoying. :-)

This particular issue only came to the fore and therefore became more urgent to 
solve when one of the LUNs failed to mount.

> Anyway, an alternative is to define separate colocation for each group
> which likely makes configuration more clear

Yes, this seems the sensible way forward. I'll reconfigure and give it a go.
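
For reference, a minimal sketch of what the per-group colocation might look 
like in crmsh syntax, reusing the group names from the config above. The 
constraint IDs are made up for illustration, and this is untested; the 
existing NFSMaster constraint would presumably need to be deleted first.

```
# Hypothetical constraint IDs; each export group is colocated *with*
# NFSServerIP, and the groups no longer depend on each other.
colocation AdminHomes-with-NFSServerIP inf: AdminHomes NFSServerIP
colocation Archive-with-NFSServerIP inf: Archive NFSServerIP
colocation DBBackups-with-NFSServerIP inf: DBBackups NFSServerIP
```

With constraints along these lines, something like "crm resource stop Archive" 
should take only that one group offline, and a failed mount in one group 
should not block the others. The dependency still runs the other way: if 
NFSServerIP cannot run, none of the export groups will either.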

I've no idea why I did it the way I did - it was a couple of years ago now. 
Probably some aversion to having NFSServerIP listed in multiple colocation 
lines.

Ronny

-- 
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957