====
+Version 2.9.1
+-------------
+
+*(Released Wed, 13 Nov 2013)*
+
+- fix a bug that kept nodes offline when re-adding them
+- when verifying DRBD versions, ignore unavailable nodes
+- fix a bug that made the console unavailable on kvm in split-user
+ setup (issue 608)
+- DRBD: ensure peers are UpToDate for dual-primary (inherited 2.8.2)
+
+
+Version 2.9.0
+-------------
+
+*(Released Tue, 5 Nov 2013)*
+
+Incompatible/important changes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- hroller now also plans for capacity to move non-redundant instances off
+ any node to be rebooted; the old behavior of completely ignoring any
+ non-redundant instances can be restored by adding the --ignore-non-redundant
+ option.
+- The cluster option '--no-lvm-storage' was removed in favor of the new option
+ '--enabled-disk-templates'.
+- On instance creation, disk templates no longer need to be specified
+ with '-t'. The default disk template will be taken from the list of
+ enabled disk templates.
+- The monitoring daemon now runs as root, in order to collect information
+ that is only available to root (such as the state of Xen instances).
+- The ConfD client is now IPv6 compatible.
+- File and shared file storage is no longer enabled or disabled at configure
+ time; instead, it is controlled with the option '--enabled-disk-templates'
+ at cluster initialization and modification.
+- The default directories for file and shared file storage are no longer
+ specified at configure time, but are taken from the cluster's configuration.
+ They can be set at cluster initialization and modification with
+ '--file-storage-dir' and '--shared-file-storage-dir'.
+- Cluster verification now includes stricter checks regarding the
+ default file and shared file storage directories. It now checks that
+ the directories are explicitly allowed in the 'file-storage-paths' file and
+ that the directories exist on all nodes.
+- The list of allowed disk templates in the instance policy and the list
+ of cluster-wide enabled disk templates are now checked for consistency
+ on cluster or group modification. On cluster initialization, the ipolicy
+ disk templates are ensured to be a subset of the cluster-wide enabled
+ disk templates.
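+
+ As an illustration only (a hedged sketch, not the actual Ganeti code), the
+ initialization-time requirement amounts to a subset check along these
+ lines::
+
+   def check_ipolicy_disk_templates(ipolicy_templates, enabled_templates):
+     """Hypothetical helper: ipolicy templates must be enabled cluster-wide."""
+     missing = set(ipolicy_templates) - set(enabled_templates)
+     if missing:
+       raise ValueError("disk templates %s are allowed by the instance policy"
+                        " but not enabled on the cluster" %
+                        ", ".join(sorted(missing)))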
+
+New features
+~~~~~~~~~~~~
+
+- DRBD 8.4 support. Depending on the installed DRBD version, Ganeti now uses
+ the correct command syntax. It is possible to use different DRBD versions
+ on different nodes as long as they are compatible with each other. This
+ enables rolling upgrades of DRBD with no downtime. As permanent operation
+ of different DRBD versions within a node group is discouraged,
+ ``gnt-cluster verify`` will emit a warning if it detects such a situation.
+ A sketch of the underlying version detection follows this list.
+- New "inst-status-xen" data collector for the monitoring daemon, providing
+ information about the state of the Xen instances on the nodes.
+- New "lv" data collector for the monitoring daemon, collecting data about the
+ logical volumes on the nodes, and pairing them with the name of the instances
+ they belong to.
+- New "diskstats" data collector, collecting the data from /proc/diskstats and
+ presenting them over the monitoring daemon interface.
+- The ConfD client is now IPv6 compatible.
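+
+A minimal sketch of the DRBD version detection mentioned above (illustrative
+only, not Ganeti's actual implementation): the running version can be read
+from ``/proc/drbd``, whose first line looks like
+``version: 8.4.3 (api:1/proto:86-101)``::
+
+  import re
+
+  def drbd_version(proc_drbd_text):
+    """Return the (major, minor) DRBD version found in /proc/drbd contents."""
+    m = re.search(r"^version:\s*(\d+)\.(\d+)\.", proc_drbd_text, re.M)
+    if m is None:
+      raise ValueError("cannot determine DRBD version")
+    return (int(m.group(1)), int(m.group(2)))
+
+  def uses_drbd84_syntax(proc_drbd_text):
+    """Whether to use the DRBD 8.4 command syntax on this node."""
+    return drbd_version(proc_drbd_text) >= (8, 4)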
+
+New dependencies
+~~~~~~~~~~~~~~~~
+
+The following new dependencies have been added.
+
+Python
+
+- ``python-mock`` (http://www.voidspace.org.uk/python/mock/) is now required
+ for the unit tests (and only used for testing).
+
+Haskell
+
+- ``hslogger`` (http://software.complete.org/hslogger) is now always
+ required, even if confd is not enabled.
+
+Since 2.9.0 rc3
+~~~~~~~~~~~~~~~
+
+- Correctly start/stop luxid during gnt-cluster master-failover (inherited
+ from stable-2.8)
+- Improved error messages (inherited from stable-2.8)
+
+
+Version 2.9.0 rc3
+-----------------
+
+*(Released Tue, 15 Oct 2013)*
+
+The third release candidate in the 2.9 series. Since 2.9.0 rc2:
+
+- in implicit configuration upgrade, match ipolicy with enabled disk templates
+- improved harep documentation (inherited from stable-2.8)
+
+
+Version 2.9.0 rc2
+-----------------
+
+*(Released Wed, 9 Oct 2013)*
+
+The second release candidate in the 2.9 series. Since 2.9.0 rc1:
+
+- Fix bug in cfgupgrade that led to failure when upgrading from 2.8 with
+ at least one DRBD instance.
+- Fix bug in cfgupgrade that led to an invalid 2.8 configuration after
+ downgrading.
+
+
+Version 2.9.0 rc1
+-----------------
+
+*(Released Tue, 1 Oct 2013)*
+
+The first release candidate in the 2.9 series. Since 2.9.0 beta1:
+
+- various bug fixes
+- update of the documentation, in particular installation instructions
+- merging of LD_* constants into DT_* constants
+- python style changes to be compatible with newer versions of pylint
+
+
+Version 2.9.0 beta1
+-------------------
+
+*(Released Thu, 29 Aug 2013)*
+
+This was the first beta release of the 2.9 series. All important changes
+are listed in the latest 2.9 entry.
+
+
+Version 2.8.3
+-------------
+
+*(Released Thu, 12 Dec 2013)*
+
+- Fixed Luxi daemon socket permissions after master-failover
+- Improve IP version detection code by directly checking for colons rather
+ than passing the family from the cluster object
+- Fix NODE/NODE_RES locking in LUInstanceCreate by not acquiring NODE_RES
+ locks opportunistically anymore (Issue 622)
+- Allow link local IPv6 gateways (Issue 624)
+- Fix error printing (Issue 616)
+- Fix a bug in InstanceSetParams concerning names: in case no name is passed
+ in disk modifications, keep the old one. If name=none then set disk name to
+ None.
+- Update build_chroot script to work with the latest hackage packages
+- Add a packet number limit to "fping" in master-ip-setup (Issue 630)
+- Fix evacuation out of drained node (Issue 615)
+- Add default file_driver if missing (Issue 571)
+- Fix job error message after unclean master shutdown (Issue 618)
+- Lock group(s) when creating instances (Issue 621)
+- SetDiskID() before accepting an instance (Issue 633)
+- Allow the ext template disks to receive arbitrary parameters, both at
+ creation time and while being modified
+- Xen: handle domain shutdown (future-proofing cherry-pick)
+- Refactor reading live data in htools (future-proofing cherry-pick)
+
Version 2.8.2
-------------
cabal update
in_chroot -- \
- $APT_INSTALL libpcre3-dev
-
-in_chroot -- \
cabal install --global \
+ blaze-builder==0.3.1.1 \
network==2.3 \
regex-pcre==0.94.2 \
hinotify==0.3.2 \
if (not self.op.file_driver and
self.op.disk_template in [constants.DT_FILE,
constants.DT_SHARED_FILE]):
- self.op.file_driver = constants.FD_LOOP
+ self.op.file_driver = constants.FD_DEFAULT
- if self.op.disk_template == constants.DT_FILE:
- opcodes.RequireFileStorage()
- elif self.op.disk_template == constants.DT_SHARED_FILE:
- opcodes.RequireSharedFileStorage()
-
### Node/iallocator related checks
CheckIAllocatorOrNode(self, "iallocator", "pnode")
if self.op.opportunistic_locking:
self.opportunistic_locks[locking.LEVEL_NODE] = True
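+ # NODE_RES locks are no longer acquired opportunistically here (Issue 622);
+ # only the NODE level is taken opportunistically.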
- self.opportunistic_locks[locking.LEVEL_NODE_RES] = True
else:
- self.op.pnode = ExpandNodeName(self.cfg, self.op.pnode)
- nodelist = [self.op.pnode]
+ (self.op.pnode_uuid, self.op.pnode) = \
+ ExpandNodeUuidAndName(self.cfg, self.op.pnode_uuid, self.op.pnode)
+ nodelist = [self.op.pnode_uuid]
if self.op.snode is not None:
- self.op.snode = ExpandNodeName(self.cfg, self.op.snode)
- nodelist.append(self.op.snode)
+ (self.op.snode_uuid, self.op.snode) = \
+ ExpandNodeUuidAndName(self.cfg, self.op.snode_uuid, self.op.snode)
+ nodelist.append(self.op.snode_uuid)
self.needed_locks[locking.LEVEL_NODE] = nodelist
# in case of import lock the source node too
else:
raise errors.ProgrammerError("Unhandled operation '%s'" % op)
- @staticmethod
- def _VerifyDiskModification(op, params, excl_stor):
- def _VerifyDiskModification(self, op, params):
++ def _VerifyDiskModification(self, op, params, excl_stor):
"""Verifies a disk modification.
"""
self._GoReconnect(True)
self._WaitUntilSync()
- self.feedback_fn("* preparing %s to accept the instance" % target_node)
+ self.feedback_fn("* preparing %s to accept the instance" %
+ self.cfg.GetNodeName(self.target_node_uuid))
+ # This fills the physical_id slot that may be missing on newly created disks
+ for disk in instance.disks:
+ self.cfg.SetDiskID(disk, target_node)
- result = self.rpc.call_accept_instance(target_node,
- instance,
+ result = self.rpc.call_accept_instance(self.target_node_uuid,
+ self.instance,
migration_info,
- self.nodes_ip[target_node])
+ self.nodes_ip[self.target_node_uuid])
msg = result.fail_msg
if msg:
# file backend driver
FD_LOOP = "loop"
FD_BLKTAP = "blktap"
+ FD_DEFAULT = FD_LOOP
-# the set of drbd-like disk types
-LDS_DRBD = compat.UniqueFrozenset([LD_DRBD8])
-
# disk access mode
DISK_RDONLY = "ro"
DISK_RDWR = "rw"
raise errors.HypervisorError(errmsg)
- return _ParseXmList(lines, include_node)
+ return _ParseInstanceList(lines, include_node)
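+ # The state column of "xm list"/"xl list" is a six-flag string:
+ # r(unning), b(locked), p(aused), s(hutdown), c(rashed), d(ying);
+ # an idle but healthy domain typically shows as "-b----".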
+ def _IsInstanceRunning(instance_info):
+ return instance_info == "r-----" \
+ or instance_info == "-b----"
+
+
+ def _IsInstanceShutdown(instance_info):
+ return instance_info == "---s--"
+
+
def _ParseNodeInfo(info):
"""Return information about the node.
if name is None:
name = instance.name
- return self._StopInstance(name, force)
+ return self._StopInstance(name, force, instance.hvparams)
- def _ShutdownInstance(self, name):
++ def _ShutdownInstance(self, name, hvparams):
+ """Shutdown an instance if the instance is running.
+
+ @type name: string
+ @param name: name of the instance to stop
++ @type hvparams: dict of string
++ @param hvparams: hypervisor parameters of the instance
+
+ The '-w' flag waits for shutdown to complete which avoids the need
+ to poll in the case where we want to destroy the domain
+ immediately after shutdown.
+
+ """
+ instance_info = self.GetInstanceInfo(name)
+
+ if instance_info is None or _IsInstanceShutdown(instance_info[4]):
+ logging.info("Failed to shutdown instance %s, not running", name)
+ return None
+
- return self._RunXen(["shutdown", "-w", name])
++ return self._RunXen(["shutdown", "-w", name], hvparams)
+
- def _DestroyInstance(self, name):
++ def _DestroyInstance(self, name, hvparams):
+ """Destroy an instance if the instance if the instance exists.
+
+ @type name: string
+ @param name: name of the instance to destroy
++ @type hvparams: dict of string
++ @param hvparams: hypervisor parameters of the instance
+
+ """
+ instance_info = self.GetInstanceInfo(name)
+
+ if instance_info is None:
+ logging.info("Failed to destroy instance %s, does not exist", name)
+ return None
+
- return self._RunXen(["destroy", name])
++ return self._RunXen(["destroy", name], hvparams)
+
- def _StopInstance(self, name, force):
+ def _StopInstance(self, name, force, hvparams):
"""Stop an instance.
@type name: string
- @param name: name of the instance to be shutdown
+ @param name: name of the instance to destroy
+
@type force: boolean
- @param force: flag specifying whether shutdown should be forced
+ @param force: whether to do a "hard" stop (destroy)
+
+ @type hvparams: dict of string
+ @param hvparams: hypervisor parameters of the instance
+
"""
if force:
- action = "destroy"
- result = self._DestroyInstance(name)
++ result = self._DestroyInstance(name, hvparams)
else:
- action = "shutdown"
- self._ShutdownInstance(name)
- result = self._DestroyInstance(name)
++ self._ShutdownInstance(name, hvparams)
++ result = self._DestroyInstance(name, hvparams)
- result = self._RunXen([action, name], hvparams)
- if result.failed:
+ if result is not None and result.failed and \
+ self.GetInstanceInfo(name) is not None:
raise errors.HypervisorError("Failed to stop instance %s: %s, %s" %
(name, result.fail_reason, result.output))
#TODO(dynmem): compute the right data on MAX and MIN memory
# make a copy of the current dict
node_results = dict(node_results)
- for nname, nresult in node_data.items():
- assert nname in node_results, "Missing basic data for node %s" % nname
- ninfo = node_cfg[nname]
+ for nuuid, nresult in node_data.items():
+ ninfo = node_cfg[nuuid]
+ assert ninfo.name in node_results, "Missing basic data for node %s" % \
+ ninfo.name
- if not (ninfo.offline or ninfo.drained):
+ if not ninfo.offline:
- nresult.Raise("Can't get data for node %s" % nname)
- node_iinfo[nname].Raise("Can't get node instance info from node %s" %
- nname)
- remote_info = rpc.MakeLegacyNodeInfo(nresult.payload,
- require_vg_info=has_lvm)
-
- def get_attr(attr):
- if attr not in remote_info:
- raise errors.OpExecError("Node '%s' didn't return attribute"
- " '%s'" % (nname, attr))
- value = remote_info[attr]
- if not isinstance(value, int):
- raise errors.OpExecError("Node '%s' returned invalid value"
- " for '%s': %s" %
- (nname, attr, value))
- return value
-
- mem_free = get_attr("memory_free")
-
- # compute memory used by primary instances
- i_p_mem = i_p_up_mem = 0
- for iinfo, beinfo in i_list:
- if iinfo.primary_node == nname:
- i_p_mem += beinfo[constants.BE_MAXMEM]
- if iinfo.name not in node_iinfo[nname].payload:
- i_used_mem = 0
- else:
- i_used_mem = int(node_iinfo[nname].payload[iinfo.name]["memory"])
- i_mem_diff = beinfo[constants.BE_MAXMEM] - i_used_mem
- mem_free -= max(0, i_mem_diff)
-
- if iinfo.admin_state == constants.ADMINST_UP:
- i_p_up_mem += beinfo[constants.BE_MAXMEM]
-
- # TODO: replace this with proper storage reporting
- if has_lvm:
- total_disk = get_attr("vg_size")
- free_disk = get_attr("vg_free")
- else:
- # we didn't even ask the node for VG status, so use zeros
- total_disk = free_disk = 0
+ nresult.Raise("Can't get data for node %s" % ninfo.name)
+ node_iinfo[nuuid].Raise("Can't get node instance info from node %s" %
+ ninfo.name)
+ (_, space_info, (hv_info, )) = nresult.payload
+
+ mem_free = self._GetAttributeFromHypervisorNodeData(hv_info, ninfo.name,
+ "memory_free")
+
+ (i_p_mem, i_p_up_mem, mem_free) = self._ComputeInstanceMemory(
+ i_list, node_iinfo, nuuid, mem_free)
+ (total_disk, free_disk, total_spindles, free_spindles) = \
+ self._ComputeStorageDataFromSpaceInfo(space_info, ninfo.name,
+ has_lvm)
# compute memory used by instances
pnr_dyn = {
child.UpgradeConfig()
# FIXME: Make this configurable in Ganeti 2.7
- self.params = {}
+ # Params should be an empty dict that gets filled any time needed
+ # In case of ext template we allow arbitrary params that should not
+ # be overridden during a config reload/upgrade.
+ if not self.params or not isinstance(self.params, dict):
+ self.params = {}
+
# add here config upgrade for this disk
- # If the file driver is empty, fill it up with the default value
- if self.dev_type == constants.LD_FILE and self.physical_id[0] is None:
- self.physical_id[0] = constants.FD_DEFAULT
+ # map of legacy device types (mapping differing LD constants to new
+ # DT constants)
+ LEG_DEV_TYPE_MAP = {"lvm": constants.DT_PLAIN, "drbd8": constants.DT_DRBD8}
+ if self.dev_type in LEG_DEV_TYPE_MAP:
+ self.dev_type = LEG_DEV_TYPE_MAP[self.dev_type]
@staticmethod
def ComputeLDParams(disk_template, disk_params):
let vm_capable' = fromMaybe True vm_capable
gidx <- lookupGroup ktg n guuid
ndparams <- extract "ndparams" >>= asJSObject
- spindles <- tryFromObj desc (fromJSObject ndparams) "spindle_count"
+ excl_stor <- tryFromObj desc (fromJSObject ndparams) "exclusive_storage"
- let live = not offline && not drained && vm_capable'
+ let live = not offline && vm_capable'
lvextract def = eitherLive live def . extract
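+ -- with exclusive storage the spindle counts come from the node's live
+ -- data; otherwise fall back to the static "spindle_count" ndparam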
+ sptotal <- if excl_stor
+ then lvextract 0 "total_spindles"
+ else tryFromObj desc (fromJSObject ndparams) "spindle_count"
+ spfree <- lvextract 0 "free_spindles"
mtotal <- lvextract 0.0 "total_memory"
mnode <- lvextract 0 "reserved_memory"
mfree <- lvextract 0 "free_memory"
dtotal <- lvextract 0.0 "total_disk"
dfree <- lvextract 0 "free_disk"
ctotal <- lvextract 0.0 "total_cpus"
- let node = Node.create n mtotal mnode mfree dtotal dfree ctotal
- (not live || drained) spindles gidx
+ cnos <- lvextract 0 "reserved_cpus"
+ let node = Node.create n mtotal mnode mfree dtotal dfree ctotal cnos
- (not live) sptotal spfree gidx excl_stor
++ (not live || drained) sptotal spfree gidx excl_stor
return (n, node)
-- | Parses a group as found in the cluster group list.
xoffline <- convert "offline" offline
xdrained <- convert "drained" drained
xvm_capable <- convert "vm_capable" vm_capable
- xspindles <- convert "spindles" spindles
xgdx <- convert "group.uuid" g_uuid >>= lookupGroup ktg xname
+ xtags <- convert "tags" tags
+ xexcl_stor <- convert "exclusive_storage" excl_stor
- let live = not xoffline && not xdrained && xvm_capable
+ let live = not xoffline && xvm_capable
lvconvert def n d = eitherLive live def $ convert n d
+ xsptotal <- if xexcl_stor
+ then lvconvert 0 "sptotal" sptotal
+ else convert "spindles" spindles
+ xspfree <- lvconvert 0 "spfree" spfree
xmtotal <- lvconvert 0.0 "mtotal" mtotal
xmnode <- lvconvert 0 "mnode" mnode
xmfree <- lvconvert 0 "mfree" mfree
xdtotal <- lvconvert 0.0 "dtotal" dtotal
xdfree <- lvconvert 0 "dfree" dfree
xctotal <- lvconvert 0.0 "ctotal" ctotal
- let node = Node.create xname xmtotal xmnode xmfree xdtotal xdfree
- xctotal (not live || xdrained) xspindles xgdx
+ xcnos <- lvconvert 0 "cnos" cnos
+ let node = flip Node.setNodeTags xtags $
+ Node.create xname xmtotal xmnode xmfree xdtotal xdfree
- xctotal xcnos (not live) xsptotal xspfree xgdx xexcl_stor
++ xctotal xcnos (not live || xdrained) xsptotal xspfree xgdx xexcl_stor
return (xname, node)
parseNode _ v = fail ("Invalid node query result: " ++ show v)
vm_cap <- annotateResult desc $ maybeFromObj a "vm_capable"
let vm_cap' = fromMaybe True vm_cap
ndparams <- extract "ndparams" >>= asJSObject
- spindles <- tryFromObj desc (fromJSObject ndparams) "spindle_count"
+ excl_stor <- tryFromObj desc (fromJSObject ndparams) "exclusive_storage"
guuid <- annotateResult desc $ maybeFromObj a "group.uuid"
guuid' <- lookupGroup ktg name (fromMaybe defaultGroupID guuid)
- let live = not offline && not drained && vm_cap'
+ let live = not offline && vm_cap'
lvextract def = eitherLive live def . extract
+ sptotal <- if excl_stor
+ then lvextract 0 "sptotal"
+ else tryFromObj desc (fromJSObject ndparams) "spindle_count"
+ spfree <- lvextract 0 "spfree"
mtotal <- lvextract 0.0 "mtotal"
mnode <- lvextract 0 "mnode"
mfree <- lvextract 0 "mfree"
dtotal <- lvextract 0.0 "dtotal"
dfree <- lvextract 0 "dfree"
ctotal <- lvextract 0.0 "ctotal"
- let node = Node.create name mtotal mnode mfree dtotal dfree ctotal
- (not live || drained) spindles guuid'
+ cnos <- lvextract 0 "cnos"
+ tags <- extract "tags"
+ let node = flip Node.setNodeTags tags $
+ Node.create name mtotal mnode mfree dtotal dfree ctotal cnos
- (not live) sptotal spfree guuid' excl_stor
++ (not live || drained) sptotal spfree guuid' excl_stor
return (name, node)
-- | Construct a group from a JSON object.
if fail:
try:
- hv._StopInstance(name, force)
+ hv._StopInstance(name, force, None)
except errors.HypervisorError, err:
- self.assertTrue(str(err).startswith("Failed to stop instance"))
- self.assertTrue(str(err).startswith("xm list failed"),
++ self.assertTrue(str(err).startswith("listing instances failed"),
+ msg=str(err))
else:
self.fail("Exception was not raised")
self.assertEqual(utils.ReadFile(cfgfile), cfgdata,