I confirmed that the problem was fixed. Many thanks!

> -----Original Message-----
> From: Ken Gaillot [mailto:[email protected]]
> Sent: Thursday, August 17, 2017 12:25 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] Updated attribute is not displayed in crm_mon
>
> I have a fix for this issue ready. I am running some tests on it, then
> will merge it into the upstream master branch, to become part of the next
> release.
>
> The fix is to clear the transient attributes from the CIB when attrd
> starts, rather than when the crmd completes its first join. This
> eliminates the window where attributes can be set before the CIB is
> cleared.
>
> On Tue, 2017-08-15 at 08:42 +0000, Kazunori INOUE wrote:
> > Hi Ken,
> >
> > Thanks for the explanation.
> >
> > As additional information, we are using a daemon(*1) that registers
> > Corosync's ring status as attributes, so I want to avoid cases where
> > attributes are not displayed.
> >
> > *1 It's ifcheckd, which is always running (it is not a resource) and
> > registers attributes when Pacemaker is running.
> > ( https://github.com/linux-ha-japan/pm_extras/tree/master/tools )
> > Attribute example:
> >
> > Node Attributes:
> > * Node rhel73-1:
> >     + ringnumber_0                      : 192.168.101.131 is UP
> >     + ringnumber_1                      : 192.168.102.131 is UP
> > * Node rhel73-2:
> >     + ringnumber_0                      : 192.168.101.132 is UP
> >     + ringnumber_1                      : 192.168.102.132 is UP
> >
> > Regards,
> > Kazunori INOUE
> >
> > > -----Original Message-----
> > > From: Ken Gaillot [mailto:[email protected]]
> > > Sent: Tuesday, August 15, 2017 2:42 AM
> > > To: Cluster Labs - All topics related to open-source clustering welcomed
> > > Subject: Re: [ClusterLabs] Updated attribute is not displayed in crm_mon
> > >
> > > On Mon, 2017-08-14 at 12:33 -0500, Ken Gaillot wrote:
> > > > On Wed, 2017-08-02 at 09:59 +0000, Kazunori INOUE wrote:
> > > > > Hi,
> > > > >
> > > > > In Pacemaker-1.1.17, an attribute updated while Pacemaker is starting
> > > > > is not displayed in crm_mon.
> > > > > In Pacemaker-1.1.16 it is displayed, so the results differ.
> > > > >
> > > > > https://github.com/ClusterLabs/pacemaker/commit/fe44f400a3116a158ab331a92a49a4ad8937170d
> > > > > This commit is the cause, but is the following result (3.) the
> > > > > expected behavior?
> > > >
> > > > This turned out to be an odd one. The sequence of events is:
> > > >
> > > > 1. When the node leaves the cluster, the DC (correctly) wipes all of its
> > > > transient attributes from attrd and the CIB.
> > > >
> > > > 2. Pacemaker is newly started on the node, and a transient attribute is
> > > > set before the node joins the cluster.
> > > >
> > > > 3. The node joins the cluster, and its transient attributes (including
> > > > the new value) are sync'ed with the rest of the cluster, in both attrd
> > > > and the CIB. So far, so good.
> > > >
> > > > 4. Because this is the node's first join since its crmd started, its
> > > > crmd wipes all of its transient attributes again.
> > > > The idea is that the
> > > > node may have restarted so quickly that the DC hasn't yet done it (step
> > > > 1 here), so clear them now to avoid any problems with old values.
> > > > However, the crmd wipes only the CIB -- not attrd (arguably a bug).
> > >
> > > Whoops, clarification: the node may have restarted so quickly that
> > > corosync didn't notice it left, so the DC would never have gotten the
> > > "peer lost" message that triggers wiping its transient attributes.
> > >
> > > I suspect the crmd wipes only the CIB in this case because we assumed
> > > attrd would be empty at this point -- missing exactly this case where a
> > > value was set between start-up and first join.
> > >
> > > > 5. With the older pacemaker version, both the joining node and the DC
> > > > would request a full write-out of all values from attrd. Because step 4
> > > > only wiped the CIB, this ends up restoring the new value. With the newer
> > > > pacemaker version, this step is no longer done, so the value winds up
> > > > staying in attrd but not in the CIB (until the next write-out naturally
> > > > occurs).
> > > >
> > > > I don't have a solution yet, but step 4 is clearly the problem (rather
> > > > than the new code that skips step 5, which is still a good idea
> > > > performance-wise). I'll keep working on it.
> > > >
> > > > > [test case]
> > > > > 1. Start pacemaker on two nodes at the same time and update the
> > > > > attribute during startup.
> > > > > In this case, the attribute is displayed in crm_mon.
> > > > >
> > > > > [root@node1 ~]# ssh -f node1 'systemctl start pacemaker ; attrd_updater -n KEY -U V-1' ; \
> > > > >                 ssh -f node3 'systemctl start pacemaker ; attrd_updater -n KEY -U V-3'
> > > > > [root@node1 ~]# crm_mon -QA1
> > > > > Stack: corosync
> > > > > Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> > > > >
> > > > > 2 nodes configured
> > > > > 0 resources configured
> > > > >
> > > > > Online: [ node1 node3 ]
> > > > >
> > > > > No active resources
> > > > >
> > > > >
> > > > > Node Attributes:
> > > > > * Node node1:
> > > > >     + KEY                               : V-1
> > > > > * Node node3:
> > > > >     + KEY                               : V-3
> > > > >
> > > > >
> > > > > 2. Restart pacemaker on node1, and update the attribute during
> > > > > startup.
> > > > >
> > > > > [root@node1 ~]# systemctl stop pacemaker
> > > > > [root@node1 ~]# systemctl start pacemaker ; attrd_updater -n KEY -U V-10
> > > > >
> > > > >
> > > > > 3. The attribute is registered in attrd, but it is not registered
> > > > > in the CIB, so the updated attribute is not displayed in crm_mon.
> > > > >
> > > > > [root@node1 ~]# attrd_updater -Q -n KEY -A
> > > > > name="KEY" host="node3" value="V-3"
> > > > > name="KEY" host="node1" value="V-10"
> > > > >
> > > > > [root@node1 ~]# crm_mon -QA1
> > > > > Stack: corosync
> > > > > Current DC: node3 (version 1.1.17-1.el7-b36b869) - partition with quorum
> > > > >
> > > > > 2 nodes configured
> > > > > 0 resources configured
> > > > >
> > > > > Online: [ node1 node3 ]
> > > > >
> > > > > No active resources
> > > > >
> > > > >
> > > > > Node Attributes:
> > > > > * Node node1:
> > > > > * Node node3:
> > > > >     + KEY                               : V-3
> > > > >
> > > > >
> > > > > Best Regards
> > > > >
> > > > > _______________________________________________
> > > > > Users mailing list: [email protected]
> > > > > http://lists.clusterlabs.org/mailman/listinfo/users
> > > > >
> > > > > Project Home: http://www.clusterlabs.org
> > > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > > Bugs: http://bugs.clusterlabs.org
> > >
> > > --
> > > Ken Gaillot <[email protected]>
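In case it helps anyone else monitoring attributes around startup: until the fix is released, the value that attrd holds (but crm_mon does not yet show) can be read straight from the `attrd_updater -Q` output quoted above. A rough sketch of the parsing we used; the `attrd_value_for_host` helper is our own naming, not a Pacemaker tool, and it only assumes the `name=... host=... value=...` line format shown in the thread:

```shell
# Print the value attrd holds for a given host, reading
# `attrd_updater -Q -n KEY -A` output on stdin.
attrd_value_for_host() {
  # $1 = host name; extract the value="..." field of the matching line
  sed -n "s/.*host=\"$1\" value=\"\([^\"]*\)\".*/\1/p"
}

# Sample data copied from the query output above; on a live node you would
# pipe `attrd_updater -Q -n KEY -A` into the helper instead.
sample='name="KEY" host="node3" value="V-3"
name="KEY" host="node1" value="V-10"'

printf '%s\n' "$sample" | attrd_value_for_host node1   # prints V-10
```

This made it easy to alert on the mismatch (attrd has a value, crm_mon shows none) while waiting for the CIB write-out.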
