Martin Peřina has posted comments on this change. Change subject: core: Add kdump detection into fencing flow ......................................................................
Patch Set 3: (11 comments) http://gerrit.ovirt.org/#/c/28072/3/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsKdumpDetectionCommand.java File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsKdumpDetectionCommand.java: Line 19: import org.ovirt.engine.core.utils.ThreadUtils; Line 20: import org.ovirt.engine.core.vdsbroker.ResourceManager; Line 21: Line 22: /** Line 23: * Tries to detect if VDS is kdumping. > VDS=> host Done Line 24: */ Line 25: @NonTransactiveCommandAttribute Line 26: public class VdsKdumpDetectionCommand<T extends VdsActionParameters> extends VdsCommand<T> { Line 27: /** Line 24: */ Line 25: @NonTransactiveCommandAttribute Line 26: public class VdsKdumpDetectionCommand<T extends VdsActionParameters> extends VdsCommand<T> { Line 27: /** Line 28: * Name of external variable tp store fence_kdump listener heartbeat > tp=>to Done Line 29: */ Line 30: private static final String LISTENER_HEARTBEAT = "fence-kdump-listener-heartbeat"; Line 31: Line 32: public VdsKdumpDetectionCommand(T parameters) { Line 35: Line 36: @Override Line 37: protected boolean canDoAction() { Line 38: if (getVds().getKdumpStatus() != KdumpStatus.ENABLED) { Line 39: log.infoFormat("Kdump detection support is not configured on host {0}({1}).", > Should add the proper message to can do action messages as well (as done in Done Line 40: getVdsName(), Line 41: getVdsId()); Line 42: return false; Line 43: } Line 66: int interval = Config.<Integer>getValue(ConfigValues.FenceKdumpMessageInterval) * 1000; Line 67: Calendar cal = Calendar.getInstance(); Line 68: cal.add(Calendar.SECOND, Config.<Integer>getValue(ConfigValues.KdumpStartedTimeout)); Line 69: long timeout = cal.getTimeInMillis(); Line 70: VdsKdumpStatus kdumpStatus = getDbFacade().getVdsKdumpStatusDao().get(getVdsId()); > Can't you initialize that one in the loop? Done Line 71: Line 72: while (!detected && timeout > System.currentTimeMillis()) { Line 73: detected = kdumpStatus != null; Line 74: if (!detected) { Line 76: } Line 77: kdumpStatus = getDbFacade().getVdsKdumpStatusDao().get(getVdsId()); Line 78: } Line 79: Line 80: if (detected) { > please apply this block as other commands Done Line 81: AuditLogableBase base = new AuditLogableBase(); Line 82: base.setVds(getVds()); Line 83: AuditLogDirector.log(base, AuditLogType.KDUMP_FLOW_DETECTED_ON_VDS); Line 84: } else { Line 92: // kdump detected set status to reboot and wait until kdump fisnihes Line 93: Backend.getInstance() Line 94: .getResourceManager() Line 95: .RunVdsCommand(VDSCommandType.SetVdsStatus, Line 96: new SetVdsStatusVDSCommandParameters(getVdsId(), VDSStatus.Reboot)); > So we're setting it to reboot before it really happened? If it's set to Reboot, then user cannot do anything with the host in UI and VURTI doesn't display connection timeout errors. Line 97: Line 98: while (kdumpStatus.getStatus() != KdumpFlowStatus.FINISHED) { Line 99: ThreadUtils.sleep(interval); Line 100: kdumpStatus = getDbFacade().getVdsKdumpStatusDao().get(getVdsId()); Line 111: Line 112: getReturnValue().setSucceeded(true); Line 113: } Line 114: Line 115: private boolean isListenerTimeouted(Date lastHeartbeat) { > I think isListenerTimeoutExceeded Changed to isListenerAlive() Line 116: long timeout = lastHeartbeat.getTime() Line 117: + Config.<Integer>getValue(ConfigValues.FenceKdumpListenerTimeout) * 1000; Line 118: return System.currentTimeMillis() > timeout; Line 119: } Line 119: } Line 120: Line 121: private void checkListenerAlive() { Line 122: ExternalVariable fkAlive = getDbFacade().getExternalVariableDao().get(LISTENER_HEARTBEAT); Line 123: if (fkAlive == null || isListenerTimeouted(fkAlive.getUpdated())) { > same Changed to isListenerAlive() Line 124: AuditLogableBase base = new AuditLogableBase(); Line 125: base.setVds(getVds()); Line 126: AuditLogDirector.log(base, AuditLogType.FENCE_KDUMP_LISTENER_IS_NOT_ALIVE); Line 127: } http://gerrit.ovirt.org/#/c/28072/3/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsNotRespondingTreatmentCommand.java File backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsNotRespondingTreatmentCommand.java: Line 70: Backend.getInstance().runInternalAction(VdcActionType.VdsKdumpDetection, Line 71: getParameters(), Line 72: ExecutionHandler.createInternalJobContext()); Line 73: Line 74: if (retVal.getSucceeded()) { > IIRC potential NPE here on retval AFAIK the result of runInternalAction() cannot be null. Or am I missing something? Line 75: // kdump on host detected and finished successfully, stop hard fencing execution Line 76: getReturnValue().setSucceeded(true); Line 77: return; Line 78: } http://gerrit.ovirt.org/#/c/28072/3/backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/AuditLogType.java File backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/AuditLogType.java: Line 113: 613 > As any other command we should have a log message for command failure as we Done http://gerrit.ovirt.org/#/c/28072/3/backend/manager/modules/dal/src/main/resources/bundles/AuditLogMessages.properties File backend/manager/modules/dal/src/main/resources/bundles/AuditLogMessages.properties: Line 825: ISCSI_BOND_REMOVE_FAILED=Failed to remove iSCSI bond '${IscsiBondName}' from Data Center '${StoragePoolName}' Line 826: Line 827: KDUMP_FLOW_DETECTED_ON_VDS=Kdump flow detected on host '${VdsName}'. Line 828: KDUMP_FLOW_FINISHED_ON_VDS=Kdump flow finished on host '${VdsName}'. Line 829: FENCE_KDUMP_LISTENER_IS_NOT_ALIVE=Kdump detection for host '${VdsName}' started, but fence_kdump listener is not running. > started=> had started Done -- To view, visit http://gerrit.ovirt.org/28072 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: I18145bc5814dc0b97b8ed6593f0c76473285ae16 Gerrit-PatchSet: 3 Gerrit-Project: ovirt-engine Gerrit-Branch: master Gerrit-Owner: Martin Peřina <mper...@redhat.com> Gerrit-Reviewer: Barak Azulay <bazu...@redhat.com> Gerrit-Reviewer: Eli Mesika <emes...@redhat.com> Gerrit-Reviewer: Martin Peřina <mper...@redhat.com> Gerrit-Reviewer: Oved Ourfali <oourf...@redhat.com> Gerrit-Reviewer: automat...@ovirt.org Gerrit-Reviewer: oVirt Jenkins CI Server Gerrit-HasComments: Yes _______________________________________________ Engine-patches mailing list Engine-patches@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-patches