On Tue, 2020-03-31 at 07:37 +0300, Strahil Nikolov wrote:
> On March 31, 2020 6:01:35 AM GMT+03:00, Ken Gaillot <[email protected]> wrote:
> > On Sun, 2020-03-08 at 18:11 +0000, Strahil Nikolov wrote:
> > > Hello All,
> > >
> > > can someone help me figure something out.
> > >
> > > I have a test cluster with 2 resource groups:
> > >
> > > [root@node3 cluster]# pcs status
> > > Cluster name: HACLUSTER16
> > > Stack: corosync
> > > Current DC: node3.localdomain (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
> > > Last updated: Sun Mar 8 20:00:48 2020
> > > Last change: Sun Mar 8 20:00:04 2020 by root via cibadmin on node3.localdomain
> > >
> > > 3 nodes configured
> > > 14 resources configured
> > >
> > > Node node2.localdomain: standby
> > > Node node3.localdomain: standby
> > > Online: [ node1.localdomain ]
> > >
> > > Full list of resources:
> > >
> > >  RHEVM (stonith:fence_rhevm): Started node1.localdomain
> > >  MPATH (stonith:fence_mpath): Started node1.localdomain
> > >  Resource Group: NFS
> > >      NFS_LVM (ocf::heartbeat:LVM): Started node1.localdomain
> > >      NFS_infodir (ocf::heartbeat:Filesystem): Started node1.localdomain
> > >      NFS_data (ocf::heartbeat:Filesystem): Started node1.localdomain
> > >      NFS_IP (ocf::heartbeat:IPaddr2): Started node1.localdomain
> > >      NFS_SRV (ocf::heartbeat:nfsserver): Started node1.localdomain
> > >      NFS_XPRT1 (ocf::heartbeat:exportfs): Started node1.localdomain
> > >      NFS_NTFY (ocf::heartbeat:nfsnotify): Started node1.localdomain
> > >  Resource Group: APACHE
> > >      APACHE_LVM (ocf::heartbeat:LVM): Started node1.localdomain
> > >      APACHE_cfg (ocf::heartbeat:Filesystem): Started node1.localdomain
> > >      APACHE_data (ocf::heartbeat:Filesystem): Started node1.localdomain
> > >      APACHE_IP (ocf::heartbeat:IPaddr2): Started node1.localdomain
> > >      APACHE_SRV (ocf::heartbeat:apache): Started node1.localdomain
> > >
> > > The constraints I have put are:
> > >
> > > [root@node3 cluster]# pcs constraint
> > > Location Constraints:
> > >   Resource: APACHE
> > >     Enabled on: node1.localdomain (score:3000)
> > >     Enabled on: node2.localdomain (score:2000)
> > >     Enabled on: node3.localdomain (score:1000)
> > >   Resource: NFS
> > >     Enabled on: node1.localdomain (score:1000)
> > >     Enabled on: node2.localdomain (score:2000)
> > >     Enabled on: node3.localdomain (score:3000)
> > > Ordering Constraints:
> > > Colocation Constraints:
> > >   APACHE with NFS (score:-1000)
> > > Ticket Constraints:
> > >
> > > [root@node3 cluster]# pcs resource defaults
> > > resource-stickiness=1000
> > >
> > > As you can see the default stickiness is 1000 per resource or 7000
> > > for the APACHE group.
> > > The colocation rule score is just -1000 and as per my understanding
> > > it should be ignored when the 2 nodes are removed from standby.
> > >
> > > Can someone clarify why the APACHE group is moved, when the resource
> > > stickiness score is higher than the colocation score.
> > >
> > > I have attached a file with the crm_simulate output (the output is
> > > correct, when the standby is removed - the group is moved).
> > >
> > > Best Regards,
> > > Strahil Nikolov
> >
> > Coincidentally I just fixed a bug last week that I believe is the
> > culprit here. I expect if you test the current master branch it won't
> > happen. The fix will be in 2.0.4 (the first release candidate is
> > expected in a couple of weeks).
> >
> > The problem was in the code that incorporates colocation dependencies'
> > node preferences. If a group was colocated with some resource, the
> > resource would incorporate the scores from each member of the group in
> > turn. However each member of the group would also incorporate its own
> > dependencies' scores in its score -- which includes the internal group
> > colocation of all members after it. So, the members of the colocated
> > group were being counted multiple times, and therefore having a bigger
> > impact than the configured colocation score. The fix was just to
> > incorporate scores from the first group member since it would
> > incorporate all the rest.
>
> Hey Ken,
>
> Thanks for the detailed explanation and good job!
> So, in latest upstream version the bug is fixed. What about RHEL -
> should I open a bugzilla?
>
> Best Regards,
> Strahil Nikolov

The fix is expected to land in RHEL 7.9 and 8.3.
-- 
Ken Gaillot <[email protected]>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
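
For readers following the thread, the over-counting described in the quoted explanation can be illustrated with a toy model. This is only a sketch under stated assumptions, not Pacemaker's actual allocation code: the function names (rolled_up_scores, buggy_group_total, fixed_group_total) are hypothetical, and each group member is assumed to contribute a flat preference of 1000 (the resource-stickiness default from the thread) for its current node. Each member "rolls up" the scores of every member after it, which mimics the internal group colocation.

```python
from typing import List


def rolled_up_scores(member_scores: List[int]) -> List[int]:
    """Score each member reports after folding in all members after it.

    Toy stand-in for a group member incorporating its own colocation
    dependencies (hypothetical helper, not a Pacemaker function).
    """
    rolled = []
    running = 0
    # Walk from the last member to the first, accumulating as we go.
    for score in reversed(member_scores):
        running += score
        rolled.append(running)
    rolled.reverse()
    return rolled


def buggy_group_total(member_scores: List[int]) -> int:
    """Pre-fix behaviour in this toy model: add every member's rolled-up score.

    Later members are counted repeatedly because each earlier member
    has already included them.
    """
    return sum(rolled_up_scores(member_scores))


def fixed_group_total(member_scores: List[int]) -> int:
    """Post-fix behaviour in this toy model: use only the first member,
    whose rolled-up score already contains all the rest."""
    rolled = rolled_up_scores(member_scores)
    return rolled[0] if rolled else 0


if __name__ == "__main__":
    # Assumption: five members (as listed for the APACHE group), 1000 each.
    scores = [1000] * 5
    print("rolled-up per member:", rolled_up_scores(scores))
    print("buggy total:", buggy_group_total(scores))
    print("fixed total:", fixed_group_total(scores))
```

In this model the k-th member of the group ends up counted k times before the fix, so the group's combined influence grows roughly quadratically with its size instead of linearly. The numbers are illustrative only and are not the scores crm_simulate would report for this cluster.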
