Hi Ken, let me sum it up:
In recent versions, Pacemaker is smart enough to run (trigger, execute) the fence operation on a node that is not the target. If I have an external stonith device that can fence multiple nodes, a single primitive is enough in Pacemaker. If with external/ipmi I can only address a single node, I need multiple primitives - one for each node. In that case it is recommended to let each primitive run on the opposite node - right?

Thank you.

Stefan

-----Original Message-----
> From: Ken Gaillot <[email protected]>
> Sent: Tue, 20 September 2016 16:49
> To: [email protected]
> Subject: Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups /
> cloneresource/monitor/timeout
>
> On 09/20/2016 06:42 AM, Digimer wrote:
> > On 20/09/16 06:59 AM, Stefan Bauer wrote:
> >> Hi,
> >>
> >> I run a 2-node cluster and want to be safe in split-brain scenarios. For
> >> this I set up external/ipmi to stonith the other node.
> >
> > Please use 'fence_ipmilan'. I believe that the older external/ipmi are
> > deprecated (someone correct me if I am wrong on this).
>
> It's just an alternative. The "external/" agents come with the
> cluster-glue package, which isn't provided by some distributions (such
> as RHEL and its derivatives), so it's "deprecated" on those only.
>
> >> Some possible issues jumped to my mind and I would like to find the
> >> best-practice solution:
> >>
> >> - I have a primitive for each node to stonith. Many documents and guides
> >> recommend never letting them run on the host they should fence. I would
> >> set up clone resources to avoid dealing with locations that would also
> >> influence scoring. Does that make sense?
> >
> > Since v1.1.10 of Pacemaker, you don't have to worry about this.
> > Pacemaker is smart enough to know where to run a fence call from in
> > order to terminate a target.
>
> Right, fence devices can run anywhere now, and in fact they don't even
> have to be "running" for Pacemaker to use them -- as long as they are
> configured and not intentionally disabled, Pacemaker will use them.
>
> There is still a slight advantage to not running a fence device on a
> node it can fence. "Running" a fence device in Pacemaker really means
> running the recurring monitor for it. Since the node that runs the
> monitor has "verified" access to the device, Pacemaker will prefer to
> use it to execute that device. However, Pacemaker will not use a node to
> fence itself, except as a last resort if no other node is available. So,
> running a fence device on a node it can fence means that the preference
> is lost.
>
> That's a very minor detail, not worth worrying about. It's more a matter
> of personal preference.
>
> In this particular case, a more relevant concern is that you need
> different configurations for the different targets (the IPMI address is
> different).
>
> One approach is to define two different fence devices, each with one
> IPMI address. In that case, it makes sense to use location
> constraints to ensure each device prefers the node that is not its target.
>
> Another approach (if the fence agent supports it) is to use
> pcmk_host_map to provide a different "port" (IPMI address) depending on
> which host is being fenced. In this case, you need only one fence device
> to be able to fence both hosts. You don't need a clone. (Remember, the
> node "running" the device merely refers to its monitor, so the cluster
> can still use the fence device even if that node crashes.)
>
> >> - A monitoring operation on the stonith primitive is dangerous. I read
> >> that if monitor operations fail for the stonith device, the stonith action
> >> is triggered. I think it's not clever to give the cluster the option to
> >> fence a node just because it has an issue monitoring a fence device.
> >> That should not be a reason to shut down a node.
> >> What is your opinion on this? Can I just set the primitive monitor
> >> operation to disabled?
> >
> > Monitoring is how you will detect that, for example, the IPMI cable
> > failed or was unplugged. I do not believe the node will get fenced on
> > a fence agent monitor failing... at least not by default.
>
> I am not aware of any situation in which a failing fence monitor
> triggers a fence. Monitoring is good -- it verifies that the fence
> device is still working.
>
> One concern particular to on-board IPMI devices is that they typically
> share the same power supply as their host. So if the machine loses
> power, the cluster can't contact the IPMI to fence it -- which means it
> will be unable to recover any resources from the lost node. (It can't
> assume the node lost power -- it's possible just network connectivity
> between the two nodes was lost.)
>
> The only way around that is to have a second fence device (such as an
> intelligent power switch). If the cluster can't reach the IPMI, it will
> try the second device.
>
> _______________________________________________
> Users mailing list: [email protected]
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
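For anyone following along: Ken's first approach (one fence device per node, each preferring the opposite node) could be sketched with pcs roughly as below. The node names, IPMI addresses, and credentials are made-up placeholders, and fence_ipmilan parameter names (ipaddr/login/passwd vs. ip/username/password) vary between fence-agents versions, so check your agent's metadata first.

```shell
# Sketch only -- node names, addresses, and credentials are assumptions.
# One fence_ipmilan device per node, each knowing one IPMI address.
pcs stonith create fence-node1 fence_ipmilan \
    pcmk_host_list="node1" ipaddr=192.168.1.11 \
    login=admin passwd=secret lanplus=1 \
    op monitor interval=60s
pcs stonith create fence-node2 fence_ipmilan \
    pcmk_host_list="node2" ipaddr=192.168.1.12 \
    login=admin passwd=secret lanplus=1 \
    op monitor interval=60s
# Prefer running each device (i.e. its recurring monitor) on the node
# that is NOT its target:
pcs constraint location fence-node1 prefers node2=INFINITY
pcs constraint location fence-node2 prefers node1=INFINITY
```

As Ken notes, these constraints only affect where the monitor runs; even if a device's preferred node is down, the cluster can still execute the device from the surviving node.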
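Ken's second approach (a single device with pcmk_host_map) is easiest to illustrate with a multi-port device such as an intelligent power switch, where the mapped value fills the agent's "port" parameter. The device name, address, and outlet numbers below are assumptions:

```shell
# Sketch: one device fences both nodes. pcmk_host_map maps each node
# name to the "port" (here: a PDU outlet number) the agent acts on.
pcs stonith create fence-pdu fence_apc_snmp \
    ipaddr=192.168.1.20 \
    pcmk_host_map="node1:1;node2:2" \
    op monitor interval=60s
```

No clone and no location constraints are needed here: whichever node "runs" the device only hosts its monitor, and either node can use it to fence the other.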
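The fallback Ken describes at the end (try IPMI first, then a second device such as a power switch) is expressed in Pacemaker as fencing levels. Assuming hypothetical devices named fence-ipmi-node1, fence-ipmi-node2, and fence-pdu are already configured, a sketch with pcs:

```shell
# Fencing topology sketch -- device and node names are hypothetical.
# Level 1: try the on-board IPMI first.
pcs stonith level add 1 node1 fence-ipmi-node1
pcs stonith level add 1 node2 fence-ipmi-node2
# Level 2: if the IPMI is unreachable (e.g. the host lost power),
# fall back to the intelligent power switch.
pcs stonith level add 2 node1 fence-pdu
pcs stonith level add 2 node2 fence-pdu
```

The cluster works through the levels in order for the target node and only considers fencing successful once some level succeeds completely.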
