It seems that Xen can have issues that are triggered by migrations that
happen too quickly. For now, until we have more experience with it, add
a two-second delay before and after the migration call so that things
(e.g. hotplug scripts) have enough time to settle down.
Note that I don't have solid data showing that this will fix the
problem, or that Xen is indeed the culprit, but for now this seems the
simplest way to try to mitigate it.
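
As an illustration only, the "pause, call, pause" pattern can be sketched as
a small standalone helper. The name call_with_settle_delays and its
parameters are hypothetical and are not part of this patch, which simply
inlines time.sleep(2) around rpc.call_instance_migrate. The sketch also
assumes Python 3 (keyword-only arguments). Unlike the patch, which sleeps
the second time only on the success path, this helper pauses after any call
that returns without raising.

```python
import time


def call_with_settle_delays(func, *args, before=2.0, after=2.0,
                            sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs) with a pause before and after it.

    Hypothetical helper: the delays give external machinery (for example
    hotplug scripts) time to settle around a sensitive operation.
    The sleep function is injectable so the helper can be tested
    without actually waiting.
    """
    sleep(before)                    # let things settle before the call
    result = func(*args, **kwargs)   # the sensitive operation itself
    sleep(after)                     # let things settle after the call
    return result
```

A caller would wrap the migration RPC in it and then check the result as
before, for example call_with_settle_delays(migrate, instance), where
migrate stands in for whatever function performs the migration.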
Reviewed-by: ultrotter
self._WaitUntilSync()
self.feedback_fn("* migrating instance to %s" % target_node)
+ time.sleep(2)
result = rpc.call_instance_migrate(source_node, instance,
self.nodes_ip[target_node],
self.op.live)
if not result or not result[0]:
logger.Error("Instance migration failed, trying to revert disk status")
-
try:
self._EnsureSecondary(target_node)
self._GoStandalone()
raise errors.OpExecError("Could not migrate instance %s: %s" %
(instance.name, result[1]))
+ time.sleep(2)
instance.primary_node = target_node
# distribute new instance config to the other nodes