On Wed, 2023-01-25 at 16:16 +0300, Andrei Borzenkov wrote:
> On Wed, Jan 25, 2023 at 3:49 PM Antony Stone
> <[email protected]> wrote:
> > Hi.
> >
> > I have a corosync / pacemaker 3-node cluster with a resource group
> > which can run on any node in the cluster.
> >
> > Every night a cron job on the node which is running the resources
> > performs "crm_standby -v on" followed a short while later by
> > "crm_standby -v off" in order to force the resources to migrate to
> > another node member.
> >
> > We do this partly to verify that all nodes are capable of running
> > the resources, and partly because some of those resources generate
> > significant log files, and if one machine just keeps running them
> > day after day, we run out of disk space (which effectively means we
> > just need to add more capacity to the machines, which can be done,
> > but at a cost).
> >
> > So long as a machine gets a day when it's not running the
> > resources, a combination of migrating the log files to a central
> > server, plus standard logfile rotation, takes care of managing the
> > disk space.
> >
> > What I notice, though, is that two of the machines tend to swap the
> > resources between them, and the third machine hardly ever becomes
> > the active node.
> >
>
> Pacemaker simply checks each eligible node whether it can run a
> resource and I believe the order of the node list does not change (at
> least as long as there is no join/leave event). So effectively the
> resource just oscillates between the first two nodes in the list.
>
> > Is there some way of influencing the node selection mechanism when
> > resources need to move away from the currently active node, so
> > that, for example, the least recently used node could be favoured
> > over the rest?
> >
>
> I do not think pacemaker even knows which node is "the least recently
> used", it does not keep this history. You can add a rule to define
> location constraint based on some node attribute(s) and set this
> attribute in the same script where you call crm_standby. E.g. you
> could set a timestamp on the node where the resource is currently
> active before doing crm_standby and select the node with the oldest
> timestamp (I do not think pacemaker supports such computation in its
> rules).
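
The "select the node with the oldest timestamp" step suggested above has to happen in the script, since the rules apparently cannot compute it. A minimal sketch in Python, assuming the script has already collected a per-node "last active" timestamp into a dictionary. The function name, the node names, and the way the timestamps are stored and read back are all hypothetical and not part of Pacemaker:

```python
from typing import Dict, Optional


def least_recently_used(last_active: Dict[str, float],
                        exclude: Optional[str] = None) -> str:
    """Return the node with the oldest "last active" timestamp.

    last_active maps node name -> timestamp (seconds since the epoch);
    a node that has never run the resources can be given 0.0.
    exclude is the currently active node, which must not be chosen.
    """
    candidates = {n: t for n, t in last_active.items() if n != exclude}
    if not candidates:
        raise ValueError("no candidate nodes to choose from")
    # Oldest timestamp wins; ties are broken by node name so the
    # result is deterministic.
    return min(candidates, key=lambda n: (candidates[n], n))
```

The script would then raise the chosen node's preference (for example by setting the node attribute that the location rule tests) before calling crm_standby. That part depends on how the constraint is written and is left out here.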
You could do it entirely with rules, without needing the cron job.
Configure a location constraint for each node, with a high score, using
rsc-pattern to match all the resources you want to move (for example,
".*" matches all). Then add a date-based rule to each constraint, making
it effective only on certain days of the week or month, such that the
nodes alternate.
-- 
Ken Gaillot <[email protected]>
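
To plan which days each node's constraint should be effective, the alternation can be worked out up front. A minimal sketch in Python, assuming ISO weekday numbering (1 = Monday to 7 = Sunday; check how your Pacemaker version numbers weekdays in date rules) and hypothetical node names. It only computes the schedule and does not generate any cluster configuration:

```python
from typing import Dict, List, Sequence


def weekday_schedule(nodes: Sequence[str],
                     days: Sequence[int] = tuple(range(1, 8))) -> Dict[int, str]:
    """Assign each day to a node in round-robin order."""
    if not nodes:
        raise ValueError("at least one node is required")
    return {day: nodes[i % len(nodes)] for i, day in enumerate(days)}


def days_per_node(schedule: Dict[int, str]) -> Dict[str, List[int]]:
    """Invert the schedule: node name -> sorted list of its days."""
    result: Dict[str, List[int]] = {}
    for day in sorted(schedule):
        result.setdefault(schedule[day], []).append(day)
    return result


if __name__ == "__main__":
    plan = days_per_node(weekday_schedule(["node1", "node2", "node3"]))
    for node, node_days in plan.items():
        print(node, node_days)
```

With three nodes and seven weekdays the split is uneven: node1 gets days 1, 4 and 7, so it is preferred on both Sunday and Monday. Every node still gets days off, but if strict alternation matters, day-of-month or another cycle length may suit better.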
