Pause DRBD sync for OS install if not wait_for_sync
authorApollon Oikonomopoulos <apollon@noc.grnet.gr>
Wed, 3 Aug 2011 15:03:38 +0000 (18:03 +0300)
committerIustin Pop <iustin@google.com>
Wed, 3 Aug 2011 16:35:13 +0000 (18:35 +0200)
When wait_for_sync is set to False in LUInstanceCreate, Ganeti lets DRBD sync
in the background while performing the rest of the installation steps,
including OS installation.

However, OS installation is a very disk-intensive task that intereferes badly
with the background I/O caused by DRBD's initial sync. To this end, we pause
the background sync before OS installation and unpause it afterwards, which
yields a significant speed boost for OS installation. The following should be
noted:

a) The user has requested not to wait for sync, i.e. the instance will be
   non-redundant for an unspecified interval anyway and delaying this by a
   couple of minutes is not a big compromise.

b) This approach is also followed during disk wiping.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
[iustin@google.com: simplify an if check]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

lib/cmdlib.py

index 038bbb6..a70dda0 100644 (file)
@@ -8861,10 +8861,30 @@ class LUInstanceCreate(LogicalUnit):
     if iobj.disk_template != constants.DT_DISKLESS and not self.adopt_disks:
       if self.op.mode == constants.INSTANCE_CREATE:
         if not self.op.no_install:
+          pause_sync = (iobj.disk_template in constants.DTS_INT_MIRROR and
+                        not self.op.wait_for_sync)
+          if pause_sync:
+            feedback_fn("* pausing disk sync to install instance OS")
+            result = self.rpc.call_blockdev_pause_resume_sync(pnode_name,
+                                                              iobj.disks, True)
+            for idx, success in enumerate(result.payload):
+              if not success:
+                logging.warn("pause-sync of instance %s for disk %d failed",
+                             instance, idx)
+
           feedback_fn("* running the instance OS create scripts...")
           # FIXME: pass debug option from opcode to backend
           result = self.rpc.call_instance_os_add(pnode_name, iobj, False,
                                                  self.op.debug_level)
+          if pause_sync:
+            feedback_fn("* resuming disk sync")
+            result = self.rpc.call_blockdev_pause_resume_sync(pnode_name,
+                                                              iobj.disks, False)
+            for idx, success in enumerate(result.payload):
+              if not success:
+                logging.warn("resume-sync of instance %s for disk %d failed",
+                             instance, idx)
+
           result.Raise("Could not add os for instance %s"
                        " on node %s" % (instance, pnode_name))