Arik Hadas has uploaded a new change for review.

Change subject: core: free pending resources properly on migration failure
......................................................................

core: free pending resources properly on migration failure

There is a race when a VM switches from MIGRATING_FROM to UP due to a
migration failure, between:
1. Rerun operation which is executed from the monitoring (VURTI) on a
separate thread
2. Powering up handling which is executed from the monitoring thread

When #2 happens before #1, the powering up handling decreases the
pending resources on the destination host properly.

But when #1 happens first, we have a problem: it sets the destination
host to null, so neither #1 nor #2 decreases the pending resources on
the destination host.

This patch fixes the pending resources decrement in the rerun flow by
resetting the destination host only when the command is re-executed.
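
The ordering problem described above can be sketched with a small,
single-threaded model of the "#1 before #2" case. This is only an
illustration under stated assumptions: the class, the field names and
the pendingVms map are hypothetical stand-ins rather than engine code,
and it assumes the rerun flow decreases the pending count before the
command is re-executed.

```java
import java.util.HashMap;
import java.util.Map;

public class PendingResourcesSketch {
    // Hypothetical stand-in for the per-host pending VM counters.
    static final Map<String, Integer> pendingVms = new HashMap<>();

    static class MigrationSketch {
        String destinationHost;
        final boolean clearDestinationEarly;

        MigrationSketch(String destinationHost, boolean clearDestinationEarly) {
            this.destinationHost = destinationHost;
            this.clearDestinationEarly = clearDestinationEarly;
            // Starting the migration reserves pending resources on the destination.
            pendingVms.merge(destinationHost, 1, Integer::sum);
        }

        // Releases the reservation; a no-op once the destination is unknown.
        void decreasePending() {
            if (destinationHost != null) {
                pendingVms.merge(destinationHost, -1, Integer::sum);
            }
        }

        // Models #1, the rerun operation.
        void rerun() {
            if (clearDestinationEarly) {
                destinationHost = null; // old ordering: cleared before the decrement
            }
            decreasePending(); // assumption: the rerun flow releases the reservation here
            reexecute();
        }

        // New ordering: the destination is cleared only on re-execution.
        void reexecute() {
            destinationHost = null;
        }
    }

    // Returns the pending count left on the destination after #1 and then #2 ran.
    static int leakedPending(boolean clearDestinationEarly) {
        pendingVms.clear();
        MigrationSketch migration = new MigrationSketch("hostB", clearDestinationEarly);
        migration.rerun();           // #1 runs first
        migration.decreasePending(); // #2, powering up handling, runs afterwards
        return pendingVms.get("hostB");
    }

    public static void main(String[] args) {
        System.out.println("old ordering, leaked pending: " + leakedPending(true));
        System.out.println("new ordering, leaked pending: " + leakedPending(false));
    }
}
```

In this model the old ordering leaves one pending reservation on the
destination, while the new ordering releases it exactly once.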

Change-Id: Ib866d7da8a3e1fa7fe2023adc49f3b4f559c8982
Bug-Url: https://bugzilla.redhat.com/1157211
Signed-off-by: Arik Hadas <aha...@redhat.com>
---
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MigrateVmCommand.java
M backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RunVmCommandBase.java
2 files changed, 21 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/ovirt-engine refs/changes/74/34674/1

diff --git a/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MigrateVmCommand.java b/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MigrateVmCommand.java
index 3e7354b..8defd00 100644
--- a/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MigrateVmCommand.java
+++ b/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MigrateVmCommand.java
@@ -384,7 +384,6 @@
 
         // if vm is up and rerun is called then it got up on the source, try to rerun
         if (getVm() != null && getVm().getStatus() == VMStatus.Up) {
-            setDestinationVdsId(null);
             super.rerun();
         } else {
             // vm went down on the destination and source, migration failed.
@@ -395,6 +394,12 @@
         }
     }
 
+    @Override
+    protected void reexecuteCommand() {
+        setDestinationVdsId(null);
+        super.reexecuteCommand();
+    }
+
     /**
      * Log that the migration had failed with the error code that is in the VDS and needs to be retrieved.
      */
diff --git a/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RunVmCommandBase.java b/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RunVmCommandBase.java
index a514e37..55afb8d 100644
--- a/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RunVmCommandBase.java
+++ b/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/RunVmCommandBase.java
@@ -107,17 +107,7 @@
          */
         if (getRunVdssList().size() < Config.<Integer> getValue(ConfigValues.MaxRerunVmOnVdsCount)
                 && getVm().getStatus() != VMStatus.Paused) {
-            // restore CanDoAction value to false so CanDoAction checks will run again
-            getReturnValue().setCanDoAction(false);
-            if (getExecutionContext() != null) {
-                Job job = getExecutionContext().getJob();
-                if (job != null) {
-                    // mark previous steps as fail
-                    JobRepositoryFactory.getJobRepository().closeCompletedJobSteps(job.getId(), JobExecutionStatus.FAILED);
-                }
-            }
-            insertAsyncTaskPlaceHolders();
-            executeAction();
+            reexecuteCommand();
 
             // if there was no rerun attempt in the previous executeAction call and the command
             // wasn't done because canDoAction check returned false..
@@ -132,6 +122,20 @@
         }
     }
 
+    protected void reexecuteCommand() {
+        // restore CanDoAction value to false so CanDoAction checks will run again
+        getReturnValue().setCanDoAction(false);
+        if (getExecutionContext() != null) {
+            Job job = getExecutionContext().getJob();
+            if (job != null) {
+                // mark previous steps as fail
+                JobRepositoryFactory.getJobRepository().closeCompletedJobSteps(job.getId(), JobExecutionStatus.FAILED);
+            }
+        }
+        insertAsyncTaskPlaceHolders();
+        executeAction();
+    }
+
     protected void runningFailed() {
         try {
             decreasePendingVms();


-- 
To view, visit http://gerrit.ovirt.org/34674

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib866d7da8a3e1fa7fe2023adc49f3b4f559c8982
Gerrit-PatchSet: 1
Gerrit-Project: ovirt-engine
Gerrit-Branch: ovirt-engine-3.5
Gerrit-Owner: Arik Hadas <aha...@redhat.com>