Liron Aravot has submitted this change and it was merged.

Change subject: core: introducing host immediate domain recovery mechanism
......................................................................
core: introducing host immediate domain recovery mechanism

oVirt engine allows hosts to be activated even if they can't access some of the data center's storage domains, provided those domains are marked as "inactive". Inactive means that all the hosts already in status Up reported the domain as problematic, so there is no need to prevent "new" hosts from being activated.

If there is an inactive domain whose storage server the host failed to connect to, the host won't have the link for that domain and won't be able to produce it (as the mount was possibly unavailable when the connection to the storage server was attempted). If connectivity to that domain returns, a host that was already active before might report that it has access to the domain, which causes the engine to change that domain's status to "active".

The issue is that hosts that were activated after the connectivity was lost would move to non-operational (causing VM migration, etc.), because they possibly won't have a connection to the domain and won't have the needed links of that domain. It is a race between the domain status being changed to Active and the domain auto recovery mechanism.

The implemented solution attempts to prevent hosts from moving to non-operational status, to avoid the related effects. A new Quartz job is set to run every 30 seconds; the job inspects all host reports that were gathered since its last run. The motivation for this implementation is to aggregate the operations on the different hosts together, to avoid long wait times and blocking other "pool" operations. If any host has a "new" report on a domain that is active or unknown and that it can't access for a "storage" reason, that host is reconnected to the storage servers of the active/unknown domains and refreshes its storage pool metadata.
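The aggregation step of the periodic job can be sketched roughly as follows. This is a minimal illustration only, not the code from this change: the class `DomainRecoverySketch`, its nested types, and all method names are hypothetical, and the Quartz scheduling itself (every 30 seconds) is left out. It drains the reports gathered since the last run, keeps only storage-related reports on active or unknown domains, and records each host/domain pair so that it is selected for a recovery attempt only once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the ovirt-engine patch.
public class DomainRecoverySketch {

    public enum DomainStatus { ACTIVE, UNKNOWN, INACTIVE }

    // One problem report sent by a host about a storage domain.
    public static final class Report {
        final String hostId;
        final String domainId;
        final DomainStatus domainStatus;
        final boolean storageRelated;

        public Report(String hostId, String domainId,
                      DomainStatus domainStatus, boolean storageRelated) {
            this.hostId = hostId;
            this.domainId = domainId;
            this.domainStatus = domainStatus;
            this.storageRelated = storageRelated;
        }
    }

    // Reports gathered since the previous run of the periodic job.
    private final List<Report> pending = new ArrayList<>();

    // (host, domain) pairs that were already selected for a recovery attempt.
    private final Set<List<String>> attempted = new HashSet<>();

    public synchronized void addReport(Report report) {
        pending.add(report);
    }

    // Body of the periodic job: drains the pending reports and returns,
    // per host, the domains whose storage servers should be reconnected.
    public synchronized Map<String, Set<String>> collectHostsToRecover() {
        Map<String, Set<String>> toRecover = new LinkedHashMap<>();
        for (Report r : pending) {
            // Only storage-related problems on active/unknown domains qualify.
            boolean eligible = r.storageRelated
                    && r.domainStatus != DomainStatus.INACTIVE;
            // Set.add returns false if this pair was already attempted.
            if (eligible && attempted.add(List.of(r.hostId, r.domainId))) {
                toRecover.computeIfAbsent(r.hostId, h -> new LinkedHashSet<>())
                         .add(r.domainId);
            }
        }
        pending.clear();
        return toRecover;
    }
}
```

In this sketch the caller would invoke `collectHostsToRecover()` from the scheduled job and then issue one reconnect-and-refresh operation per returned host, so that work on several hosts is batched into a single run instead of being triggered report by report.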
The engine will attempt to "recover" each host only once for each problematic report, to avoid flooding the system with recovery attempts. If the host still has a problem accessing the domain, it is moved to non-operational as usual.

Change-Id: Idb7b2fe8c87805986aaf25cd0f24f605d67d4186
Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=1093924
Signed-off-by: Liron Aravot <lara...@redhat.com>
---
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/VdsEventListener.java
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/ConnectHostToStoragePoolServerCommandBase.java
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/storage/ConnectHostToStoragePoolServersCommand.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/action/ConnectHostToStoragePoolServersParameters.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/businessentities/IVdsEventListener.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/config/ConfigValues.java
M backend/manager/modules/common/src/main/java/org/ovirt/engine/core/common/locks/LockingGroup.java
M backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/irsbroker/IrsProxyData.java
M backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/storage/StoragePoolDomainHelper.java
M packaging/dbscripts/upgrade/pre_upgrade/0000_config.sql
10 files changed, 363 insertions(+), 56 deletions(-)

Approvals:
  Allon Mureinik: Looks good to me, approved
  Liron Aravot: Verified; Looks good to me, approved

--
To view, visit http://gerrit.ovirt.org/27523
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idb7b2fe8c87805986aaf25cd0f24f605d67d4186
Gerrit-PatchSet: 4
Gerrit-Project: ovirt-engine
Gerrit-Branch: master
Gerrit-Owner: Liron Aravot <lara...@redhat.com>
Gerrit-Reviewer: Allon Mureinik <amure...@redhat.com>
Gerrit-Reviewer: Daniel Erez <de...@redhat.com>
Gerrit-Reviewer: Federico Simoncelli <fsimo...@redhat.com>
Gerrit-Reviewer: Liron Aravot <lara...@redhat.com>
Gerrit-Reviewer: Maor Lipchuk <mlipc...@redhat.com>
Gerrit-Reviewer: Oved Ourfali <oourf...@redhat.com>
Gerrit-Reviewer: Roy Golan <rgo...@redhat.com>
Gerrit-Reviewer: Tal Nisan <tni...@redhat.com>
Gerrit-Reviewer: automat...@ovirt.org
Gerrit-Reviewer: oVirt Jenkins CI Server
_______________________________________________
Engine-patches mailing list
Engine-patches@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-patches