(2.10) SimpleRetry on BlockDev.Remove()
author Dimitris Aragiorgis <dimara@grnet.gr>
Fri, 25 Oct 2013 15:43:06 +0000 (18:43 +0300)
committer Dimitris Aragiorgis <dimara@grnet.gr>
Thu, 27 Mar 2014 07:57:05 +0000 (09:57 +0200)
Sometimes, upon disk removal, corresponding file descriptors
are kept briefly open by various processes (hypervisor, blkid, etc.).
With this patch, we retry several times before raising the appropriate
error, thus making disk removal more robust against those corner cases.

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>

In stable-2.10 we have constants auto-generated from Haskell.

Conflicts:
lib/constants.py
src/Ganeti/HsConstants.hs

Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>

lib/backend.py
lib/constants.py

diff --git a/lib/backend.py b/lib/backend.py
index e0f2256..fbf8645 100644
--- a/lib/backend.py
+++ b/lib/backend.py
@@ -1838,10 +1838,18 @@ def BlockdevRemove(disk):
     rdev = None
   if rdev is not None:
     r_path = rdev.dev_path
-    try:
-      rdev.Remove()
-    except errors.BlockDeviceError, err:
-      msgs.append(str(err))
+
+    def _TryRemove():
+      try:
+        rdev.Remove()
+        return []
+      except errors.BlockDeviceError, err:
+        return [str(err)]
+
+    msgs.extend(utils.SimpleRetry([], _TryRemove,
+                                  constants.DISK_REMOVE_RETRY_INTERVAL,
+                                  constants.DISK_REMOVE_RETRY_TIMEOUT))
+
     if not msgs:
       DevCacheManager.RemoveCache(r_path)
 
diff --git a/lib/constants.py b/lib/constants.py
index 64d9836..7d00387 100644
--- a/lib/constants.py
+++ b/lib/constants.py
@@ -2508,5 +2508,9 @@ OPCODE_REASON_SOURCES = compat.UniqueFrozenset([
   OPCODE_REASON_SRC_USER,
   ])
 
+# disk removal timeouts
+DISK_REMOVE_RETRY_INTERVAL = 3
+DISK_REMOVE_RETRY_TIMEOUT = 30
+
 # Do not re-export imported modules
 del re, _vcsversion, _autoconf, socket, pathutils, compat
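
The retry pattern used in BlockdevRemove() can be tried outside Ganeti with a
small standalone sketch. Everything below is an illustrative assumption rather
than code from this patch: simple_retry is a hypothetical stand-in that only
mimics the call shape utils.SimpleRetry has in the diff (expected value,
function, interval, timeout), and FlakyDevice and BlockDeviceError are
invented test doubles. The sketch is Python 3, whereas the patch itself uses
Python 2 syntax.

```python
import time


class BlockDeviceError(Exception):
    """Stand-in for errors.BlockDeviceError (hypothetical)."""


def simple_retry(expected, fn, delay, timeout,
                 sleep_fn=time.sleep, time_fn=time.monotonic):
    """Call fn() until it returns `expected` or `timeout` seconds elapse.

    Returns the last value fn() produced. This is an assumed
    approximation, not the Ganeti implementation.
    """
    deadline = time_fn() + timeout
    while True:
        result = fn()
        if result == expected:
            return result
        if time_fn() >= deadline:
            # Out of time: hand back the last (failing) result.
            return result
        sleep_fn(delay)


class FlakyDevice:
    """Test double whose Remove() fails for the first N attempts."""

    def __init__(self, busy_attempts):
        self.busy_attempts = busy_attempts
        self.calls = 0

    def Remove(self):
        self.calls += 1
        if self.calls <= self.busy_attempts:
            raise BlockDeviceError("device busy (attempt %d)" % self.calls)


def remove_with_retry(dev, interval=3, timeout=30,
                      sleep_fn=time.sleep, time_fn=time.monotonic):
    """Mirror the patched logic: an empty list means success."""

    def _try_remove():
        try:
            dev.Remove()
            return []
        except BlockDeviceError as err:
            return [str(err)]

    return simple_retry([], _try_remove, interval, timeout,
                        sleep_fn, time_fn)
```

Returning a list from the helper keeps the caller's msgs.extend(...) call
unchanged: success contributes nothing, and a timeout contributes only the
last error message rather than one per attempt.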