Hi,

On Tue, Oct 28, 2014 at 09:51:02AM -0400, Digimer wrote:
> On 28/10/14 05:59 AM, [email protected] wrote:
>> hi,
>>
>> any recommendation/documentation for a reliable fencing implementation
>> on a multi-node cluster (4 or 6 nodes on 2 site).
>> i think of implementing multiple node-fencing devices for each host to
>> stonith remaining nodes on other site?
>>
>> thank you!
>> Philipp
>
> Multi-site clustering is very hard to do well because of fencing issues.
> How do you distinguish a site failure from severed links?

Indeed. There's booth, a server that manages tickets in pacemaker and
uses arbitrators to resolve ties. The booth source is available at
github.com, and it is packaged for several distributions at OBS
(http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/).
It's also supported in the newly released SLE12.

Thanks,

Dejan

> Given that a
> failed fence action can not be assumed to be a success, then the only
> safe option is to block until a human intervenes. This makes your
> cluster as reliable as your WAN between the sites, which is to say, not
> very reliable. In any case, the destruction of a site will require
> manual failover, which can be complicated if insufficient nodes remain
> to form quorum.
>
> Generally, I'd recommend two different clusters, one per site, with
> manual/service-level failover in the case of a disaster.
>
> In any case, a good fencing setup should have two fence methods.
> Personally, I always use IPMI as a primary fence method (routed through
> one switch) and a pair of switched PDUs as backup (via a backup switch).
> This way, when IPMI is available, a confirmed fence is 100% certain to
> be good. However, if the node is totally disabled/destroyed, IPMI will
> be lost and the cluster will switch to the switched PDUs, cutting the
> power outlets feeding the node.
>
> I've got a block diagram of how I do this:
>
> https://alteeve.ca/w/AN!Cluster_Tutorial_2#A_Map.21
>
> It's trivial to scale the idea up to multiple node clusters.
>
> Cheers
>
> --
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
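
For illustration, the two-method fencing described in the quoted message
(IPMI first, switched PDUs as backup, and never treating a failed fence as
a success) could be sketched roughly as below. This is only a toy model in
Python, not how pacemaker or any fence agent is implemented; fence_node,
ipmi_off and pdu_off are made-up names, and the two stand-in functions just
simulate a node that has lost all power.

```python
from typing import Callable, Sequence, Tuple

# A fence method takes a node name and returns True only on confirmed success.
FenceMethod = Callable[[str], bool]


def fence_node(node: str, methods: Sequence[Tuple[str, FenceMethod]]) -> str:
    """Try each fence method in order and return the name of the one that
    confirmed the fence.

    A failed or unconfirmed fence is never assumed to have worked: if every
    method fails, raise so the caller blocks until a human intervenes.
    """
    for name, method in methods:
        try:
            if method(node):
                return name
        except Exception:
            # An error (e.g. the BMC is unreachable) counts as "not confirmed".
            continue
    raise RuntimeError(
        f"all fence methods failed for {node}; manual intervention required"
    )


# Hypothetical stand-ins for real fence agents (assumptions, not real APIs).
def ipmi_off(node: str) -> bool:
    # The node is totally dead, so its IPMI interface is gone too.
    raise ConnectionError("BMC unreachable")


def pdu_off(node: str) -> bool:
    # Pretend both PDU outlets feeding the node were confirmed off.
    return True


if __name__ == "__main__":
    # IPMI fails, so the backup PDU method is used.
    print(fence_node("node1", [("ipmi", ipmi_off), ("pdu", pdu_off)]))
```

With only the IPMI method configured, the same call raises instead of
reporting success, which is the "block until a human intervenes" case.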

_______________________________________________
Pacemaker mailing list: [email protected]
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
