On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
> On 03/17/2016 05:10 PM, Christopher Harvey wrote:
> > If I ignore pacemaker's existence, and just run corosync, corosync
> > disagrees about node membership in the situation presented in the first
> > email. While it's true that stonith just happens to quickly correct the
> > situation after it occurs it still smells like a bug in the case where
> > corosync is used in isolation. Corosync is after all a membership and
> > total ordering protocol, and the nodes in the cluster are unable to
> > agree on membership.
> >
> > The Totem protocol specifies a ring_id in the token passed in a ring.
> > Since all of the 3 nodes but one have formed a new ring with a new id
> > how is it that the single node can survive in a ring with no other
> > members passing a token with the old ring_id?
> >
> > Are there network failure situations that can fool the Totem membership
> > protocol or is this an implementation problem? I don't see how it could
> > not be one or the other, and it's bad either way.
>
> Neither, really. In a split brain situation, there simply is not enough
> information for any protocol or implementation to reliably decide what
> to do. That's what fencing is meant to solve -- it provides the
> information that certain nodes are definitely not active.
>
> There's no way for either side of the split to know whether the opposite
> side is down, or merely unable to communicate properly. If the latter,
> it's possible that they are still accessing shared resources, which
> without proper communication, can lead to serious problems (e.g. data
> corruption of a shared volume).

The Totem protocol is silent on the topic of fencing and resources, much the way TCP is. Please explain to me what needs to be fenced in a cluster without resources, where membership and total message ordering are the only concerns. If fencing were a requirement for membership and ordering, wouldn't stonith be part of corosync and not pacemaker?
