doc/design-2.5.rst \
doc/design-2.6.rst \
doc/design-2.7.rst \
+ doc/design-2.8.rst \
doc/design-autorepair.rst \
doc/design-bulk-create.rst \
doc/design-chained-jobs.rst \
man/gnt-os.8 \
man/gnt-storage.8 \
man/hail.1 \
+ man/harep.1 \
man/hbal.1 \
man/hcheck.1 \
man/hinfo.1 \
Version 2.7.0 rc2
-----------------
-*(unreleased)*
-
-- ``devel/upload`` now works when ``/var/run`` on the target nodes is a
- symlink.
-- Disks added through ``gnt-instance modify`` or created through
- ``gnt-instance recreate-disks`` are wiped, if the
- ``prealloc_wipe_disks`` flag is set.
-- If wiping newly created disks fails, the disks are removed. Also,
- partial failures in creating disks through ``gnt-instance modify``
- triggers a cleanup of the partially-created disks.
-- Removing the master IP address doesn't fail if the address has been
- already removed.
-
-
-Version 2.7.0 rc1
------------------
-
-*(Released Fri, 3 May 2013)*
+*(Released Fri, 24 May 2013)*
Incompatible/important changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The functionality for allocating multiple instances at once has been
overhauled and is now also available through :doc:`RAPI <rapi>`.
+Since rc1:
+
+- ``devel/upload`` now works when ``/var/run`` on the target nodes is a
+ symlink.
+- Disks added through ``gnt-instance modify`` or created through
+ ``gnt-instance recreate-disks`` are wiped, if the
+ ``prealloc_wipe_disks`` flag is set.
+- If wiping newly created disks fails, the disks are removed. Also,
+ partial failures in creating disks through ``gnt-instance modify``
+ trigger a cleanup of the partially-created disks.
+- Removing the master IP address doesn't fail if the address has been
+ already removed.
+- Fix ownership of the OS log dir
+- Work around missing SO_PEERCRED constant (Issue 191)
+
+
+Version 2.7.0 rc1
+-----------------
+
+*(Released Fri, 3 May 2013)*
-Since beta3:
+This was the first release candidate of the 2.7 series. Since beta3:
- Fix kvm compatibility with qemu 1.4 (Issue 389)
- Documentation updates (admin guide, upgrade notes, install
m4_define([gnt_version_major], [2])
m4_define([gnt_version_minor], [7])
m4_define([gnt_version_revision], [0])
-m4_define([gnt_version_suffix], [~rc1])
+m4_define([gnt_version_suffix], [~rc2])
m4_define([gnt_version_full],
m4_format([%d.%d.%d%s],
gnt_version_major, gnt_version_minor,
--- /dev/null
+=================
+Ganeti 2.8 design
+=================
+
+The following design documents have been implemented in Ganeti 2.8:
+
+- :doc:`design-reason-trail`
+- :doc:`design-autorepair`
+
+The following designs have been partially implemented in Ganeti 2.8:
+
+- :doc:`design-storagetypes`
+- :doc:`design-hroller`
+- :doc:`design-query-splitting`: everything except instance queries.
+- :doc:`design-partitioned`: "Constrained instance sizes" implemented.
+- :doc:`design-monitoring-agent`: all the core functionality of the
+ monitoring agent is implemented. The reason trail was implemented as
+ part of the work on the instance status collector.
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
design-impexp2.rst
design-resource-model.rst
design-query-splitting.rst
- design-autorepair.rst
design-partitioned.rst
design-monitoring-agent.rst
design-hroller.rst
design-storagetypes.rst
- design-reason-trail.rst
design-device-uuid-name.rst
design-internal-shutdown.rst
design-2.5.rst
design-2.6.rst
design-2.7.rst
+ design-2.8.rst
design-draft.rst
cluster-merge.rst
locking.rst
.. toctree::
:hidden:
+ design-autorepair.rst
design-bulk-create.rst
design-chained-jobs.rst
design-cpu-pinning.rst
design-opportunistic-locking.rst
design-ovf-support.rst
design-query2.rst
+ design-reason-trail.rst
design-restricted-commands.rst
design-shared-storage.rst
design-virtual-clusters.rst
line assumes that all your nodes have secondary IPs in the
192.0.2.0/24 network, adjust it accordingly to your setup.
-.. admonition:: Debian
-
- Besides the ballooning change which you need to set in
- ``/etc/xen/xend-config.sxp``, you need to set the memory and nosmp
- parameters in the file ``/boot/grub/menu.lst``. You need to modify
- the variable ``xenhopt`` to add ``dom0_mem=1024M`` like this:
-
- .. code-block:: text
-
- ## Xen hypervisor options to use with the default Xen boot option
- # xenhopt=dom0_mem=1024M
-
- and the ``xenkopt`` needs to include the ``maxcpus`` option like
- this:
-
- .. code-block:: text
-
- ## Xen Linux kernel options to use with the default Xen boot option
- # xenkopt=maxcpus=1
-
- Any existing parameters can be left in place: it's ok to have
- ``xenkopt=console=tty0 maxcpus=1``, for example. After modifying the
- files, you need to run::
-
- $ /sbin/update-grub
-
If you want to run HVM instances too with Ganeti and want VNC access to
the console of your instances, set the following two entries in
``/etc/xen/xend-config.sxp``:
else:
if not set(disks).issubset(instance.disks):
raise errors.ProgrammerError("Can only act on disks belonging to the"
- " target instance")
+ " target instance: expected a subset of %r,"
+ " got %r" % (instance.disks, disks))
return disks
"""
utils.RemoveFile(self._ConfigFileName(instance_name))
+ def _StashConfigFile(self, instance_name):
+ """Move the Xen config file to the log directory and return its new path.
+
+ """
+ old_filename = self._ConfigFileName(instance_name)
+ base = ("%s-%s" %
+ (instance_name, utils.TimestampForFilename()))
+ new_filename = utils.PathJoin(pathutils.LOG_XEN_DIR, base)
+ utils.RenameFile(old_filename, new_filename)
+ return new_filename
+
def _GetXmList(self, include_node):
"""Wrapper around module level L{_GetXmList}.
result = self._RunXen(cmd)
if result.failed:
- raise errors.HypervisorError("Failed to start instance %s: %s (%s)" %
+ # Move the Xen configuration file to the log directory to avoid
+ # leaving a stale config file behind.
+ stashed_config = self._StashConfigFile(instance.name)
+ raise errors.HypervisorError("Failed to start instance %s: %s (%s). Moved"
+ " config file to %s" %
(instance.name, result.fail_reason,
- result.output))
+ result.output, stashed_config))
def StopInstance(self, instance, force=False, retry=False, name=None):
"""Stop an instance.
_STRUCT_UCRED = "iII"
_STRUCT_UCRED_SIZE = struct.calcsize(_STRUCT_UCRED)
+# Work around a bug in some Linux distributions that don't define SO_PEERCRED
+try:
+ _SO_PEERCRED = IN.SO_PEERCRED
+except AttributeError:
+ _SO_PEERCRED = 17
+
# Regexes used to find IP addresses in the output of ip.
_IP_RE_TEXT = r"[.:a-z0-9]+" # separate for testing purposes
_IP_FAMILY_RE = re.compile(r"(?P<family>inet6?)\s+(?P<ip>%s)/" % _IP_RE_TEXT,
@return: The PID, UID and GID of the connected foreign process.
"""
- peercred = sock.getsockopt(socket.SOL_SOCKET, IN.SO_PEERCRED,
+ peercred = sock.getsockopt(socket.SOL_SOCKET, _SO_PEERCRED,
_STRUCT_UCRED_SIZE)
return struct.unpack(_STRUCT_UCRED, peercred)
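The fallback introduced in the hunk above can be sketched in isolation as follows. This is a minimal illustration rather than the Ganeti code: it assumes Python 3 on Linux, looks the constant up on the ``socket`` module instead of ``IN``, and the names ``get_peer_credentials`` and ``demo`` are hypothetical. The value 17 is the one used on common Linux architectures; other platforms may differ.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid, uid_t uid, gid_t gid
_STRUCT_UCRED = "iII"
_STRUCT_UCRED_SIZE = struct.calcsize(_STRUCT_UCRED)

# Fall back to the numeric value when the constant is not exported
# (assumption: 17, as on common Linux architectures).
_SO_PEERCRED = getattr(socket, "SO_PEERCRED", 17)


def get_peer_credentials(sock):
    """Return (pid, uid, gid) of the process on the other end of a Unix socket."""
    peercred = sock.getsockopt(socket.SOL_SOCKET, _SO_PEERCRED,
                               _STRUCT_UCRED_SIZE)
    return struct.unpack(_STRUCT_UCRED, peercred)


def demo():
    """Query the credentials of a socketpair created by this process."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        return get_peer_credentials(left)
    finally:
        left.close()
        right.close()
```

Calling ``demo()`` on a socket pair created by the current process should report that process's own PID and effective UID/GID.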
LOG_OS_DIR = LOG_DIR + "/os"
LOG_ES_DIR = LOG_DIR + "/extstorage"
+#: Directory for storing Xen config files after failed instance starts
+LOG_XEN_DIR = LOG_DIR + "/xen"
# Job queue paths
JOB_QUEUE_LOCK_FILE = QUEUE_DIR + "/lock"
(confd_log, FILE, 0600, getent.confd_uid, getent.masterd_gid, False),
(noded_log, FILE, 0600, getent.noded_uid, getent.masterd_gid, False),
(rapi_log, FILE, 0600, getent.rapi_uid, getent.masterd_gid, False),
- (pathutils.LOG_OS_DIR, DIR, 0750, getent.masterd_uid, getent.daemons_gid),
+ (pathutils.LOG_OS_DIR, DIR, 0750, getent.noded_uid, getent.daemons_gid),
+ (pathutils.LOG_XEN_DIR, DIR, 0750, getent.noded_uid, getent.daemons_gid),
(cleaner_log_dir, DIR, 0750, getent.noded_uid, getent.noded_gid),
(master_cleaner_log_dir, DIR, 0750, getent.masterd_uid, getent.masterd_gid),
(pathutils.INSTANCE_REASON_DIR, DIR, 0755, getent.noded_uid,
to a network via the ``network`` NIC parameter. See **gnt-instance**\(8)
for more details.
+BUGS
+----
+
+The ``hail`` iallocator hasn't been updated to take networks into
+account in Ganeti 2.7. The only way to guarantee that it works correctly
+is having your networks connected to all nodegroups. This will be fixed
+in a future version.
+
COMMANDS
--------
The exist status of the command will be zero, unless for some reason
the algorithm fatally failed (e.g. wrong node or instance data).
+BUGS
+----
+
+Networks (as configured by **gnt-network**\(8)) are not taken into
+account in Ganeti 2.7. The only way to guarantee that they work
+correctly is having your networks connected to all nodegroups. This will
+be fixed in a future version.
+
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
--- /dev/null
+HAREP(1) Ganeti | Version @GANETI_VERSION@
+=========================================
+
+NAME
+----
+
+harep - Ganeti auto-repair tool
+
+SYNOPSIS
+--------
+
+**harep** [ [**-L** | **\--luxi** ] = *socket* ] [ **\--job-delay** = *seconds* ]
+
+**harep** \--version
+
+DESCRIPTION
+-----------
+
+harep is the Ganeti auto-repair tool. It is able to detect that an instance is
+broken and to generate a sequence of jobs that will fix it, in accordance with
+the policies set by the administrator.
+
+OPTIONS
+-------
+
+The options that can be passed to the program are as follows:
+
+-L *socket*, \--luxi=*socket*
+ collect data via Luxi, optionally using the given *socket* path.
+
+\--job-delay=*seconds*
+ insert a delay of this many seconds before the execution of repair jobs, to
+ allow the tool to continue processing instances.
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
[FORMAT]
max-line-length = 80
-max-module-lines = 10000
+max-module-lines = 4500
indent-string = " "
[MISCELLANEOUS]
from ganeti import constants
from ganeti import objects
+from ganeti import pathutils
from ganeti import hypervisor
from ganeti import utils
from ganeti import errors
self.fail("Unhandled command: %s" % (cmd, ))
return self._SuccessCommand(output, cmd)
- #return self._FailingCommand(cmd)
def _MakeInstance(self):
# Copy default parameters
def testStartInstance(self):
(inst, disks) = self._MakeInstance()
+ pathutils.LOG_XEN_DIR = self.tmpdir
for failcreate in [False, True]:
for paused in [False, True]:
if failcreate:
self.assertRaises(errors.HypervisorError, hv.StartInstance,
inst, disks, paused)
+ # Check whether a stale config file is left behind
+ self.assertFalse(os.path.exists(cfgfile))
else:
hv.StartInstance(inst, disks, paused)
-
- # Check if configuration was updated
- lines = utils.ReadFile(cfgfile).splitlines()
+ # Check if configuration was updated
+ lines = utils.ReadFile(cfgfile).splitlines()
if constants.HV_VNC_PASSWORD_FILE in inst.hvparams:
self.assertTrue(("vncpasswd = '%s'" % self.vncpw) in lines)
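The stash-on-failure behaviour exercised by the test above can be sketched outside Ganeti as follows. This is an illustration under stated assumptions, not the project's implementation: it uses only the Python 3 standard library in place of ``utils.RenameFile``, ``utils.PathJoin`` and ``utils.TimestampForFilename``, the timestamp format is an assumption, and the names ``stash_config_file`` and ``demo`` are hypothetical.

```python
import os
import tempfile
import time


def stash_config_file(config_path, log_dir, instance_name):
    """Move a config file into log_dir under a timestamped name.

    Returns the new path, so the caller can mention it in an error message.
    """
    # Assumed timestamp format; the real helper may format it differently.
    base = "%s-%s" % (instance_name, time.strftime("%Y-%m-%d_%H_%M_%S"))
    new_path = os.path.join(log_dir, base)
    os.rename(config_path, new_path)
    return new_path


def demo():
    """Stash a dummy config and report (old exists, new exists, in log dir)."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg_dir = os.path.join(tmp, "xen")
        log_dir = os.path.join(tmp, "log")
        os.mkdir(cfg_dir)
        os.mkdir(log_dir)
        cfg = os.path.join(cfg_dir, "inst1.example.com")
        with open(cfg, "w") as fh:
            fh.write("name = 'inst1.example.com'\n")
        new_path = stash_config_file(cfg, log_dir, "inst1.example.com")
        return (os.path.exists(cfg),
                os.path.exists(new_path),
                os.path.dirname(new_path) == log_dir)
```

After a failed start, the original config path should no longer exist while the stashed copy remains available in the log directory for inspection, which is the property the updated unit test checks with ``assertFalse(os.path.exists(cfgfile))``.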