On Sun, 2020-03-08 at 18:11 +0000, Strahil Nikolov wrote:
> Hello All,
>
> can someone help me figure something out.
>
> I have a test cluster with 2 resource groups:
>
> [root@node3 cluster]# pcs status
> Cluster name: HACLUSTER16
> Stack: corosync
> Current DC: node3.localdomain (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
> Last updated: Sun Mar 8 20:00:48 2020
> Last change: Sun Mar 8 20:00:04 2020 by root via cibadmin on node3.localdomain
>
> 3 nodes configured
> 14 resources configured
>
> Node node2.localdomain: standby
> Node node3.localdomain: standby
> Online: [ node1.localdomain ]
>
> Full list of resources:
>
>  RHEVM (stonith:fence_rhevm): Started node1.localdomain
>  MPATH (stonith:fence_mpath): Started node1.localdomain
>  Resource Group: NFS
>      NFS_LVM (ocf::heartbeat:LVM): Started node1.localdomain
>      NFS_infodir (ocf::heartbeat:Filesystem): Started node1.localdomain
>      NFS_data (ocf::heartbeat:Filesystem): Started node1.localdomain
>      NFS_IP (ocf::heartbeat:IPaddr2): Started node1.localdomain
>      NFS_SRV (ocf::heartbeat:nfsserver): Started node1.localdomain
>      NFS_XPRT1 (ocf::heartbeat:exportfs): Started node1.localdomain
>      NFS_NTFY (ocf::heartbeat:nfsnotify): Started node1.localdomain
>  Resource Group: APACHE
>      APACHE_LVM (ocf::heartbeat:LVM): Started node1.localdomain
>      APACHE_cfg (ocf::heartbeat:Filesystem): Started node1.localdomain
>      APACHE_data (ocf::heartbeat:Filesystem): Started node1.localdomain
>      APACHE_IP (ocf::heartbeat:IPaddr2): Started node1.localdomain
>      APACHE_SRV (ocf::heartbeat:apache): Started node1.localdomain
>
> The constraints I have put are:
>
> [root@node3 cluster]# pcs constraint
> Location Constraints:
>   Resource: APACHE
>     Enabled on: node1.localdomain (score:3000)
>     Enabled on: node2.localdomain (score:2000)
>     Enabled on: node3.localdomain (score:1000)
>   Resource: NFS
>     Enabled on: node1.localdomain (score:1000)
>     Enabled on: node2.localdomain (score:2000)
>     Enabled on: node3.localdomain (score:3000)
> Ordering Constraints:
> Colocation Constraints:
>   APACHE with NFS (score:-1000)
> Ticket Constraints:
>
> [root@node3 cluster]# pcs resource defaults
> resource-stickiness=1000
>
> As you can see the default stickiness is 1000 per resource or 7000
> for the APACHE group.
> The colocation rule score is just -1000 and as per my understanding
> it should be ignored when the 2 nodes are removed from standby.
>
> Can someone clarify why the APACHE group is moved , when the resource
> stickiness score is higher than the colocation score.
>
> I have attached a file with the crm_simulate output (the output is
> correct, when the standby is removed - the group is moved).
>
> Best Regards,
> Strahil Nikolov

Coincidentally I just fixed a bug last week that I believe is the
culprit here. I expect if you test the current master branch it won't
happen. The fix will be in 2.0.4 (the first release candidate is
expected in a couple of weeks).

The problem was in the code that incorporates colocation dependencies'
node preferences. If a group was colocated with some resource, the
resource would incorporate the scores from each member of the group in
turn. However each member of the group would also incorporate its own
dependencies' scores in its score -- which includes the internal group
colocation of all members after it. So, the members of the colocated
group were being counted multiple times, and therefore having a bigger
impact than the configured colocation score.

The fix was just to incorporate scores from the first group member
since it would incorporate all the rest.

--
Ken Gaillot <[email protected]>
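
To make the double counting described above concrete, here is a toy model in Python. It is only a sketch of the counting problem, not Pacemaker's actual scheduler code: the function names, the flat list of per-member scores, and the value of 1000 per member are all assumptions made for illustration, and the real code also weights incorporated scores by the colocation score, which this sketch ignores.

```python
# Toy model of the double counting (NOT Pacemaker's real implementation).
# All names and numbers here are hypothetical.

def member_contribution(scores, index):
    """Score a group member reports when asked for its preferences.

    Because of the internal group colocation, a member also folds in
    the scores of every member that comes after it in the group.
    """
    return sum(scores[index:])


def incorporate_every_member(scores):
    """Behaviour described as the bug: visit each member in turn.

    Later members get counted once per earlier member as well as
    once for themselves.
    """
    return sum(member_contribution(scores, i) for i in range(len(scores)))


def incorporate_first_member(scores):
    """Behaviour described as the fix: ask only the first member,
    which already includes all the rest exactly once.
    """
    return member_contribution(scores, 0) if scores else 0


# Hypothetical five-member group, each member preferring a node by 1000.
group_scores = [1000] * 5

inflated = incorporate_every_member(group_scores)  # 5000+4000+3000+2000+1000
expected = incorporate_first_member(group_scores)  # each member counted once

print(f"every member: {inflated}, first member only: {expected}")
```

In this model the inflated total grows roughly with the square of the group size, which is consistent with the observation that a larger colocated group had a bigger impact than the configured colocation score would suggest.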
