Arik Hadas has uploaded a new change for review.

Change subject: core: prevent possible deadlock between monitoring threads
......................................................................

core: prevent possible deadlock between monitoring threads

If VM went down during migration in the source host, the monitoring
(VURTI) thread of the source host tries to destroy the VM in the
destination host. In order to destroy the VM in the destination host, it
tries to lock the destination host.

When we have two monitoring threads that try to do it, we end-up in a
deadlock because each of them locked the host it monitors and tries to
lock the other one.

This patch solves this problem by postponing the destroy operation on
the destination host to take place after the lock on the monitored host
is released.

It is safe to do it that way because the status of the VM in the
destination host must be either Down or MigratingTo. If the status is
MigratingTo, the monitoring of the destination host just ignores the VM.
If the status is Down, the monitoring of the destination host will just
destroy the VM and the second destroy operation will fail.

Change-Id: Ia4c52e0f8109f9599965078d2c6dd4db705cec97
Bug-Url: https://bugzilla.redhat.com/1142776
Signed-off-by: Arik Hadas <aha...@redhat.com>
---
M 
backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VdsUpdateRunTimeInfo.java
1 file changed, 20 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/ovirt-engine refs/changes/95/34995/1

diff --git 
a/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VdsUpdateRunTimeInfo.java
 
b/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VdsUpdateRunTimeInfo.java
index 5771ea2..b115436 100644
--- 
a/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VdsUpdateRunTimeInfo.java
+++ 
b/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/VdsUpdateRunTimeInfo.java
@@ -130,6 +130,7 @@
     private static Map<Guid, Long> hostDownTimes = new HashMap<>();
     private boolean vdsMaintenanceTimeoutOccurred;
     private Map<String, InterfaceStatus> oldInterfaceStatus = new 
HashMap<String, InterfaceStatus>();
+    private final Set<VM> vmsToRemoveFromDestinationVds = new HashSet<VM>();
 
     private static final Logger log = 
LoggerFactory.getLogger(VdsUpdateRunTimeInfo.class);
 
@@ -621,6 +622,9 @@
                                     : "unknown");
                 }
             }
+            for (VM vmToRemoveFromDestinationVds : 
vmsToRemoveFromDestinationVds) {
+                removeFromDestinationVds(vmToRemoveFromDestinationVds);
+            }
             // rerun all vms from rerun list
             for (Guid vm_guid : _vmsToRerun) {
                 log.error("Rerun vm '{}'. Called from vds '{}'", vm_guid, 
_vds.getName());
@@ -659,6 +663,21 @@
         } catch (RuntimeException ex) {
             logFailureMessage("Could not finish afterRefreshTreatment", ex);
             log.error("Exception", ex);
+        }
+    }
+
+    private void removeFromDestinationVds(VM vm) {
+        DestroyVmVDSCommand<DestroyVmVDSCommandParameters> destroyCmd =
+                new DestroyVmVDSCommand<>
+                (new DestroyVmVDSCommandParameters(vm.getMigratingToVds(),
+                        vm.getId(), true, false, 0));
+        destroyCmd.execute();
+        if (destroyCmd.getVDSReturnValue().getSucceeded()) {
+            log.info("Stopped migrating vm: '{}' on vds: '{}'", vm.getName(),
+                    vm.getMigratingToVds());
+        } else {
+            log.info("Could not stop migrating vm: '{}' on vds: '{}': {}", 
vm.getName(),
+                    vm.getMigratingToVds(), 
destroyCmd.getVDSReturnValue().getExceptionString());
         }
     }
 
@@ -1674,21 +1693,7 @@
             // exit status that's OK, otherwise..
             if (vmDynamic != null && vmDynamic.getExitStatus() != 
VmExitStatus.Normal) {
                 if (curVm.getMigratingToVds() != null) {
-                    DestroyVmVDSCommand<DestroyVmVDSCommandParameters> 
destroyCmd =
-                            new 
DestroyVmVDSCommand<DestroyVmVDSCommandParameters>
-                            (new DestroyVmVDSCommandParameters(new 
Guid(curVm.getMigratingToVds().toString()),
-                                    curVm.getId(),
-                                    true,
-                                    false,
-                                    0));
-                    destroyCmd.execute();
-                    if (destroyCmd.getVDSReturnValue().getSucceeded()) {
-                        log.info("Stopped migrating vm: '{}' on vds: '{}'", 
curVm.getName(),
-                                curVm.getMigratingToVds());
-                    } else {
-                        log.info("Could not stop migrating vm: '{}' on vds: 
'{}': {}", curVm.getName(),
-                                curVm.getMigratingToVds(), 
destroyCmd.getVDSReturnValue().getExceptionString());
-                    }
+                    vmsToRemoveFromDestinationVds.add(curVm);
                 }
                 // set vm status to down if source vm crushed
                 ResourceManager.getInstance().InternalSetVmStatus(curVm,


-- 
To view, visit http://gerrit.ovirt.org/34995
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4c52e0f8109f9599965078d2c6dd4db705cec97
Gerrit-PatchSet: 1
Gerrit-Project: ovirt-engine
Gerrit-Branch: master
Gerrit-Owner: Arik Hadas <aha...@redhat.com>
_______________________________________________
Engine-patches mailing list
Engine-patches@ovirt.org
http://lists.ovirt.org/mailman/listinfo/engine-patches

Reply via email to