“But in general I guess the idea of rechecking resource after failure timeout once (similar to initial probe) sounds interesting. It could be more robust in that resource agent could check whether resource start is possible now at all and prevent unsuccessful attempt to migrate resource back to original node.”
Yes! This is exactly the behavior I would like to produce. If this is not possible with LSB, is it possible with an OCF resource? I have also considered having it simply not retry, but I would prefer this other configuration if it is at all possible.

On Tue, Jun 15, 2021 at 10:54 PM Andrei Borzenkov <[email protected]> wrote:

> On 16.06.2021 01:49, Michael Romero wrote:
> >
> > At which point an administrator or an automated script could intervene
>
> If you are going to always use manual intervention outside of pacemaker,
> just leave failure timeout on default 0 so cluster will never clear
> failure count automatically on a node.
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/

--
Michael Romero
Lead Infrastructure Engineer
Engineering | Convoso
562-338-9868
[email protected]
www.convoso.com
<https://linkedin.com/in/romerom>
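For reference, the "leave failure timeout on default 0" suggestion quoted above could be sketched with pcs roughly as follows. This is only a sketch under assumptions: the resource name `my_service` is hypothetical, the cluster is assumed to be managed with pcs, and the script only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
# Hypothetical resource ID (not from the thread); replace with your own LSB or OCF resource.
RESOURCE="${RESOURCE:-my_service}"

# Dry run by default: print each command instead of executing it.
# Set APPLY=1 to actually run the commands on a cluster node.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Keep failure-timeout at 0 so the cluster never expires the failure count by itself.
run pcs resource meta "$RESOURCE" failure-timeout=0

# Inspect the recorded failures before intervening manually.
run pcs resource failcount show "$RESOURCE"

# After the underlying problem is fixed, clear the failures by hand
# so the resource is allowed to run on the original node again.
run pcs resource cleanup "$RESOURCE"
```

With this setup the cleanup step is the manual (or scripted) intervention point: until it is run, the failure count stays in place and the cluster will not clear it automatically.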
