Fix cluster-merging by not stopping noded
author Stephen Shirley <diamond@google.com>
Thu, 10 Feb 2011 10:52:13 +0000 (11:52 +0100)
committer Iustin Pop <iustin@google.com>
Mon, 14 Feb 2011 11:18:16 +0000 (12:18 +0100)
cli.RunWhileClusterStopped() stops noded on all of the nodes in the
original cluster. With noded stopped, /etc/hosts updates on the master
are prevented, and config redistribution doesn't reach the other nodes
in the original cluster. As all we want to do is merge while the master
daemon is stopped, simply stop it and start it again afterwards.

Signed-off-by: Stephen Shirley <diamond@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
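
The ordering described above can be sketched as follows. This is a minimal illustration, not the code in tools/cluster-merge: SketchMerger is a hypothetical stand-in that only records which step ran, CLUSTER_CONF_FILE is a placeholder string for constants.CLUSTER_CONF_FILE, and the sketch is written for Python 3. The method names mirror those in the diff below.

```python
# Placeholder for constants.CLUSTER_CONF_FILE (assumption, not the real path).
CLUSTER_CONF_FILE = "<cluster config file>"


class SketchMerger:
    """Hypothetical stand-in: records calls instead of touching daemons."""

    def __init__(self):
        self.calls = []    # order in which the steps ran
        self.rbsteps = []  # manual rollback hints, as in the real tool

    def _KillMasterDaemon(self):
        self.calls.append("kill-master")

    def _StartMasterDaemon(self, no_vote=False):
        self.calls.append("start-master-no-vote" if no_vote else "start-master")

    def _MergeConfig(self):
        self.calls.append("merge-config")

    def _ReaddMergedNodesAndRedist(self):
        self.calls.append("readd-and-redist")

    def merge(self):
        # Stop only the master daemon; noded keeps running on every node.
        self._KillMasterDaemon()

        # Until the merged config is live, a manual rollback is still possible.
        self.rbsteps.append("Restore %s from another master candidate"
                            " and restart master daemon" % CLUSTER_CONF_FILE)
        self._MergeConfig()
        self._StartMasterDaemon(no_vote=True)

        # Point of no return: the rollback hints no longer apply.
        del self.rbsteps[:]

        # The master daemon is running here while nodes are re-added.
        self._ReaddMergedNodesAndRedist()

        # Restart the master daemon normally, without no_vote.
        self._KillMasterDaemon()
        self._StartMasterDaemon()
        return self.calls


if __name__ == "__main__":
    for step in SketchMerger().merge():
        print(step)
```

The point of the sketch is that only the master daemon is cycled, so the re-add and redistribution step runs while noded is still up on the original nodes.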

tools/cluster-merge

index 85d3a1d..f2410f4 100755 (executable)
@@ -403,25 +403,26 @@ class Merger(object):
       logging.info("Merging config")
       self._FetchRemoteConfig()
 
-      def _OfflineClusterMerge(_):
-        """Closure run when master daemons stopped
-
-        """
-        rbsteps.append("Restore %s from another master candidate" %
-                       constants.CLUSTER_CONF_FILE)
-        self._MergeConfig()
-        self._StartMasterDaemon(no_vote=True)
-
-        # Point of no return, delete rbsteps
-        del rbsteps[:]
-
-        logging.warning("We are at the point of no return. Merge can not easily"
-                        " be undone after this point.")
-        logging.info("Readd nodes and redistribute config")
-        self._ReaddMergedNodesAndRedist()
-        self._KillMasterDaemon()
-
-      cli.RunWhileClusterStopped(logging.info, _OfflineClusterMerge)
+      logging.info("Stopping master daemon")
+      self._KillMasterDaemon()
+
+      rbsteps.append("Restore %s from another master candidate"
+                     " and restart master daemon" %
+                     constants.CLUSTER_CONF_FILE)
+      self._MergeConfig()
+      self._StartMasterDaemon(no_vote=True)
+
+      # Point of no return, delete rbsteps
+      del rbsteps[:]
+
+      logging.warning("We are at the point of no return. Merge can not easily"
+                      " be undone after this point.")
+      logging.info("Readd nodes")
+      self._ReaddMergedNodesAndRedist()
+
+      logging.info("Merge done, restart master daemon normally")
+      self._KillMasterDaemon()
+      self._StartMasterDaemon()
 
       logging.info("Starting instances again")
       self._StartupAllInstances()