Jiří Moskovčák has uploaded a new change for review.

Change subject: try harder when starting vdsmd
......................................................................

try harder when starting vdsmd

I've observed a situation where ha-agent is started too early and
fails to start vdsmd; later, once vdsmd has been started, ha-agent
also starts fine. This patch makes the agent wait a while and retry
if the first attempt to start vdsmd fails.

Change-Id: I806437b8c5eafd32fb37d0b3f2f59995faa28cdc
Signed-off-by: Jiri Moskovcak <[email protected]>
---
M ovirt_hosted_engine_ha/agent/constants.py.in
M ovirt_hosted_engine_ha/agent/hosted_engine.py
2 files changed, 16 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.ovirt.org:29418/ovirt-hosted-engine-ha refs/changes/32/28432/1

diff --git a/ovirt_hosted_engine_ha/agent/constants.py.in b/ovirt_hosted_engine_ha/agent/constants.py.in
index 5ce56e7..3c28849 100644
--- a/ovirt_hosted_engine_ha/agent/constants.py.in
+++ b/ovirt_hosted_engine_ha/agent/constants.py.in
@@ -51,6 +51,7 @@
 ENGINE_BAD_HEALTH_EXPIRATION_SECS = 600
 VM_UNEXPECTED_SHUTDOWN_EXPIRATION_SECS = 600
 MAX_VDSM_WAIT_SECS = 15
+MAX_VDSM_START_RETRIES = 5
 MAX_DOMAIN_MONITOR_WAIT_SECS = 240
 METADATA_LOG_PERIOD_SECS = 600
 
diff --git a/ovirt_hosted_engine_ha/agent/hosted_engine.py b/ovirt_hosted_engine_ha/agent/hosted_engine.py
index ff9a17e..7d367b3 100644
--- a/ovirt_hosted_engine_ha/agent/hosted_engine.py
+++ b/ovirt_hosted_engine_ha/agent/hosted_engine.py
@@ -410,8 +410,21 @@
         self._log.info("Broker initialized, all submonitors started")
 
     def _initialize_vdsm(self):
-        # TODO not the most efficient means to maintain vdsmd...
-        self._cond_start_service('vdsmd')
+        tries = 0
+        while tries < constants.MAX_VDSM_START_RETRIES:
+            tries += 1
+            try:
+                self._cond_start_service('vdsmd')
+                break
+            except Exception as _ex:
+                if tries >= constants.MAX_VDSM_START_RETRIES:
+                    self._log.error("Can't start vdsmd, the number of errors "
+                                    "has exceeded the limit: '{0}'".format(_ex))
+                    raise
+                self._log.warn("Can't start vdsmd, waiting '{0}' seconds before"
+                               " the next attempt"
+                               .format(constants.MAX_VDSM_WAIT_SECS))
+                time.sleep(constants.MAX_VDSM_WAIT_SECS)
 
         self._log.debug("Verifying storage is attached")
         tries = 0
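
For readers who want to try the retry pattern outside the agent, here is a minimal standalone sketch. It is not part of the change: `start_with_retries`, its `start` and `sleep` parameters, and the two module-level constants are hypothetical stand-ins for `_cond_start_service('vdsmd')`, `MAX_VDSM_START_RETRIES` and `MAX_VDSM_WAIT_SECS`. One detail worth noting in a loop of this shape: the counter is incremented before the attempt, so the give-up check has to compare with `>=`, because a strict `>` can never be true inside the loop.

```python
import logging
import time

log = logging.getLogger("retry_sketch")

# Hypothetical stand-ins for the constants touched by the patch.
MAX_START_RETRIES = 5
WAIT_SECS = 15


def start_with_retries(start, max_tries=MAX_START_RETRIES,
                       wait_secs=WAIT_SECS, sleep=time.sleep):
    """Call start() until it succeeds or max_tries attempts have failed.

    Returns the number of attempts used; re-raises the last exception
    once the limit is reached.
    """
    tries = 0
    while tries < max_tries:
        tries += 1
        try:
            start()
            return tries
        except Exception as ex:
            # tries was already incremented, so the give-up check must be
            # '>=': a strict '>' could never be true inside this loop.
            if tries >= max_tries:
                log.error("Can't start service, giving up after %d "
                          "attempts: %s", tries, ex)
                raise
            log.warning("Can't start service, waiting %s seconds before "
                        "the next attempt", wait_secs)
            sleep(wait_secs)
    # Only reached when max_tries is zero or negative.
    raise ValueError("max_tries must be at least 1")
```

Passing `sleep` as a parameter is only a convenience for this sketch, so the loop can be exercised without actually waiting between attempts.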


-- 
To view, visit http://gerrit.ovirt.org/28432
To unsubscribe, visit http://gerrit.ovirt.org/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I806437b8c5eafd32fb37d0b3f2f59995faa28cdc
Gerrit-PatchSet: 1
Gerrit-Project: ovirt-hosted-engine-ha
Gerrit-Branch: master
Gerrit-Owner: Jiří Moskovčák <[email protected]>
_______________________________________________
Engine-patches mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/engine-patches
