On Tue, Sep 20, 2016 at 09:43:23AM -0500, Ken Gaillot wrote:
> On 09/20/2016 07:38 AM, Lars Ellenberg wrote:
> > From the point of view of the resource agent,
> > you configured it to use a non-existing network.
> > Which it considers to be a configuration error,
> > which is treated by pacemaker as
> > "don't try to restart anywhere
> > but let someone else configure it properly, first".
> >
> > I think the OCF_ERR_CONFIGURED is good, though, otherwise
> > configuration errors might go unnoticed for quite some time.
> > A network interface is not supposed to "vanish".
> >
> > You may disagree with that choice,
>
> This is a point we should settle in the upcoming changes to the OCF
> standard.

I meant "that choice of this RA", namely to return this error code
in this situation: interface specified in cluster configuration
does not exist.

I find OCF_ERR_CONFIGURED appropriate.
One could argue that OCF_ERR_INSTALLED or OCF_ERR_GENERIC
would be more appropriate.

All with current pacemaker semantics, which you referenced below.

> The OCF 1.0 standard
> (https://github.com/ClusterLabs/OCF-spec/blob/master/ra/resource-agent-api.md)
> merely says it means "Program is not configured". That is open to
> interpretation.
>
> Pacemaker
> (http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes)
> has a more narrow view: "The resource's configuration is invalid. E.g.
> required parameters are missing."
>
> The reason Pacemaker considers it a fatal error is that it expects it to
> be returned only for an error in the resource agent's configuration *in
> the cluster*. If the cluster config is bad, it doesn't matter which node
> we try it on. For example, if an agent takes a parameter "frobble" with
> valid values from 1 to 10, and the user supplies "frobble=-1", that
> would be a configuration error.
>
> I think in OCF 2.0 we should distinguish "supplied RA parameters are
> bad" from "service's configuration on this host is bad". Currently,
> Pacemaker expects the latter error to generate OCF_ERR_GENERIC,
> OCF_ERR_ARGS, OCF_ERR_PERM, or OCF_ERR_INSTALLED, which allows it to try
> the resource on another node.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT
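
To make the two cases from the quoted text concrete, here is a minimal sketch in Python. It is not code from any real resource agent: the validate() helper, its parameters and the may_try_other_node() check are made up for illustration, and only the numeric return codes and the "frobble" example come from the OCF spec and the discussion above. Which code to return for a missing interface is left as a parameter, since that is exactly the choice under debate.

```python
# Return codes as defined by the OCF resource agent API.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_ARGS = 2
OCF_ERR_UNIMPLEMENTED = 3
OCF_ERR_PERM = 4
OCF_ERR_INSTALLED = 5
OCF_ERR_CONFIGURED = 6
OCF_NOT_RUNNING = 7

# Codes that, per the quoted text, let Pacemaker try the resource
# on another node. OCF_ERR_CONFIGURED is deliberately absent: it is fatal.
RETRY_ELSEWHERE = {
    OCF_ERR_GENERIC,
    OCF_ERR_ARGS,
    OCF_ERR_PERM,
    OCF_ERR_INSTALLED,
}


def validate(frobble, nic, local_interfaces,
             missing_nic_code=OCF_ERR_CONFIGURED):
    """Hypothetical validate step of an agent with parameters frobble and nic.

    local_interfaces stands in for whatever the agent would discover on
    this host; missing_nic_code is the code returned when nic is not there.
    """
    # "Supplied RA parameters are bad": wrong on every node.
    try:
        value = int(frobble)
    except ValueError:
        return OCF_ERR_CONFIGURED
    if not 1 <= value <= 10:
        return OCF_ERR_CONFIGURED

    # "Interface specified in cluster configuration does not exist":
    # the case this thread is about, so the code is left to the caller.
    if nic not in local_interfaces:
        return missing_nic_code

    return OCF_SUCCESS


def may_try_other_node(rc):
    """True if the return code leaves other nodes eligible."""
    return rc in RETRY_ELSEWHERE


if __name__ == "__main__":
    here = {"lo", "eth0"}
    for code in (OCF_ERR_CONFIGURED, OCF_ERR_INSTALLED):
        rc = validate("5", "eth1", here, missing_nic_code=code)
        print(rc, may_try_other_node(rc))
```

With the default, a missing interface is reported like frobble=-1 and the resource is not tried anywhere; passing OCF_ERR_INSTALLED instead keeps the other nodes eligible.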
