Omer Frenkel has posted comments on this change. Change subject: core: prevent split brain caused by fence commands ......................................................................
Patch Set 3: (5 comments) .................................................... File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RestartVdsCommand.java Line 30: import org.ovirt.engine.core.compat.Guid; Line 31: import org.ovirt.engine.core.utils.lock.EngineLock; Line 32: Line 33: /** Line 34: * Send a Stop followed bt Start action to a power control device. typo 'by' Line 35: * Line 36: * This command should be run exclusively on a host for it is assuming that when Line 37: * a host is down it can mark all VMs as DOWN and start them on other host. Line 38: * 2 parallel action like that on the same server can lead to a race where Line 39: * the 1st flow end by starting VMs and the 2nd flow marking them as down and Line 40: * starting another instance of VMs on other hosts, leading to split-brain where Line 41: * 2 exact instances of VMs running in 2 different hosts and writing to the same disk. Line 42: * Line 43: * In order to make this flow distinct the child commands, Start, FenceManually and Start typo 'stop' before FenceManually Line 44: * are under the same lock as the parent, preventing other Restart, Start,Stop,FenceVdsManually to interleave. Line 45: * Line 46: * @see FenceVdsBaseCommand#restartVdsVms() The critical section restaring the VMs Line 47: */ Line 133: makeLockingPair(VDS_FENCE, SAME_ACTION_ON_ENTITY_ALREADY_IN_PROGRESS)); Line 134: } Line 135: Line 136: @Override Line 137: protected EngineLock getLock() { not sure why you need this override? this is done in commandBase, there is no reason for the commandLock to be null, no? Line 138: if (commandLock == null) { Line 139: commandLock = new EngineLock(getExclusiveLocks(), null); Line 140: } Line 141: return commandLock; .................................................... File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/FenceVdsManualyCommand.java Line 50: private final VDS _problematicVds; Line 51: Line 52: /** Line 53: * Constructor for command creation when compensation is applied on startup Line 54: * please remove white space Line 55: * @param commandId Line 56: */ Line 57: protected FenceVdsManualyCommand(Guid commandId) { Line 58: super(commandId); .................................................... File backend/manager/modules/dal/src/main/resources/bundles/AppErrors.properties Line 1057: SCHEDULING_ALL_HOSTS_FILTERED_OUT=Cannot ${action} ${type}. There is no host that satisfies current scheduling constraints. See bellow for details: Line 1058: SCHEDULING_HOST_FILTERED_REASON=The host ${hostName} did not satisfy ${filterType} filter ${filterName}. Line 1059: VAR__FILTERTYPE__EXTERNAL=$filterType external Line 1060: VAR__FILTERTYPE__INTERNAL=$filterType internal Line 1061: SAME_ACTION_ON_ENTITY_ALREADY_IN_PROGRESS=Same action, ${action}, is already in progress. not must, but maybe it would have been better to do a fencing specific message here so it would give meaningful info to the user, something like: "Cannot run power management action on host since ${action} is already in progress." -- To view, visit http://gerrit.ovirt.org/21049 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iaf021010188e3c3662fbf8a057da4c58f2600c02 Gerrit-PatchSet: 3 Gerrit-Project: ovirt-engine Gerrit-Branch: master Gerrit-Owner: Roy Golan <rgo...@redhat.com> Gerrit-Reviewer: Eli Mesika <emes...@redhat.com> Gerrit-Reviewer: Martin Peřina <mper...@redhat.com> Gerrit-Reviewer: Omer Frenkel <ofren...@redhat.com> Gerrit-Reviewer: Roy Golan <rgo...@redhat.com> Gerrit-Reviewer: oVirt Jenkins CI Server Gerrit-HasComments: Yes _______________________________________________ Engine-patches mailing list Engine-patches@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-patches