- the :command:`ganeti-masterd` daemon which runs on the master node and
allows control of the cluster
+Besides the node role, there are other node flags that influence its
+behaviour:
+
+- the *master_capable* flag denotes whether the node can ever become a
+ master candidate; setting this to 'no' means that auto-promotion will
+ never make this node a master candidate; this flag can be useful for a
+ remote node that only runs local instances and for which becoming a
+ master is impractical due to networking or other constraints
+- the *vm_capable* flag denotes whether the node can host instances or
+ not; for example, one might use a non-vm_capable node just as a master
+ candidate, for configuration backups; setting this flag to no
+ disallows placement of instances on this node, deactivates hypervisor
+ and related checks on it (e.g. bridge checks, LVM check, etc.), and
+ removes it from cluster capacity computations
+
+
Instance
~~~~~~~~
internall by Ganeti as an *OpCode* (abbreviation from operation
code). These OpCodes are executed as part of a *Job*. The OpCodes in a
single Job are processed serially by Ganeti, but different Jobs will be
-processed (depending on resource availability) in parallel.
+processed (depending on resource availability) in parallel. Jobs are
+not necessarily executed in submission order; execution depends on resource
+availability, locks and (starting with Ganeti 2.3) priority. An earlier
+job may have to wait for a lock while a newer job doesn't need any locks
+and can be executed right away. Operations requiring a certain order
+need to be submitted as a single job, or the client must submit one job
+at a time and wait for it to finish before continuing.
For example, shutting down the entire cluster can be done by running the
command ``gnt-instance shutdown --all``, which will submit for each
Importing an instance is similar to creating a new one, but additionally
one must specify the location of the snapshot. The command is::
- gnt-backup import -n TARGET_NODE -t DISK_TEMPLATE \
+ gnt-backup import -n TARGET_NODE \
--src-node=NODE --src-dir=DIR INSTANCE_NAME
-Most of the options available for the command :command:`gnt-instance
-add` are supported here too.
+By default, parameters will be read from the export information, but you
+can also pass them on the command line; most of the options
+available for the command :command:`gnt-instance add` are supported here
+too.
+
+Import of foreign instances
++++++++++++++++++++++++++++
+
+A foreign instance whose disk data is already stored as LVM volumes can
+be imported without copying the data, using the disk adoption mode.
+
+For this, ensure that the original, non-managed instance is stopped,
+then create a Ganeti instance in the usual way, except that instead of
+passing the disk information you specify the current volumes::
+
+ gnt-instance add -t plain -n HOME_NODE ... \
+ --disk 0:adopt=lv_name[,vg=vg_name] INSTANCE_NAME
+
+This will take over the given logical volumes, rename them to the Ganeti
+standard (UUID-based), and start the instance directly, without
+installing the OS on them. If you configure the hypervisor similarly to the
+non-managed configuration that the instance had, the transition should
+be seamless for the instance. For more than one disk, just pass another
+disk parameter (e.g. ``--disk 1:adopt=...``).
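
The ``--disk N:adopt=lv_name[,vg=vg_name]`` syntax above can be assembled mechanically when several volumes are adopted at once. The following is a minimal sketch, not part of Ganeti: the helper name ``adopt_disk_args`` and the volume names are hypothetical, and the result is only the list of ``--disk`` arguments, to be appended to a ``gnt-instance add`` command line.

```python
def adopt_disk_args(volumes):
    """Build ``--disk N:adopt=LV[,vg=VG]`` arguments, one pair per volume.

    ``volumes`` is a list of (lv_name, vg_name) tuples in disk order;
    vg_name may be None, in which case the ``vg=`` part is omitted.
    This helper is hypothetical and only formats the arguments.
    """
    args = []
    for index, (lv_name, vg_name) in enumerate(volumes):
        spec = f"{index}:adopt={lv_name}"
        if vg_name:
            spec += f",vg={vg_name}"
        args.extend(["--disk", spec])
    return args


if __name__ == "__main__":
    # Hypothetical volume names, for illustration only.
    print(" ".join(adopt_disk_args([("root_lv", "xenvg"), ("data_lv", None)])))
```

Disk indices follow the order of the list, so the first volume becomes disk 0, the second disk 1, and so on.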
Instance HA features
--------------------
Note that this will fail if the disks already exists.
+Conversion of an instance's disk type
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is possible to convert between a non-redundant instance of type
+``plain`` (LVM storage) and redundant ``drbd`` via the ``gnt-instance
+modify`` command::
+
+ # start with a non-redundant instance
+ gnt-instance add -t plain ... INSTANCE
+
+ # later convert it to redundant
+ gnt-instance stop INSTANCE
+ gnt-instance modify -t drbd -n NEW_SECONDARY INSTANCE
+ gnt-instance start INSTANCE
+
+ # and convert it back
+ gnt-instance stop INSTANCE
+ gnt-instance modify -t plain INSTANCE
+ gnt-instance start INSTANCE
+
+The conversion must be done while the instance is stopped, and
+converting from the plain to the drbd template presents a small risk,
+especially if the instance has multiple disks and/or if one node fails
+during the conversion procedure. As such, it's recommended (as always) to make
+sure that downtime for manual recovery is acceptable and that the
+instance has up-to-date backups.
+
Debugging instances
+++++++++++++++++++
If you want to promote a different node to the master role (for whatever
reason), run on any other master-candidate node the command::
- gnt-cluster masterfailover
+ gnt-cluster master-failover
and the node you ran it on is now the new master. In case you try to run
this on a non master-candidate node, you will get an error telling you
.. note:: this command only stores a local flag file, and if you
failover the master, it will not have effect on the new master.
+Node auto-maintenance
++++++++++++++++++++++
+
+If the cluster parameter ``maintain_node_health`` is enabled (see the
+manpage for :command:`gnt-cluster`, the init and modify subcommands),
+then the following will happen automatically:
+
+- the watcher will shut down any instances running on offline nodes
+- the watcher will deactivate any DRBD devices on offline nodes
+
+In the future, more actions are planned, so only enable this parameter
+if the nodes are completely dedicated to Ganeti; otherwise it might be
+possible to lose data due to auto-maintenance actions.
+
Removing a cluster entirely
+++++++++++++++++++++++++++
Note that the set of characters present in a tag and the maximum tag
length are restricted. Currently the maximum length is 128 characters,
there can be at most 4096 tags per object, and the set of characters is
-comprised by alphanumeric characters and additionally ``.+*/:-``.
+comprised by alphanumeric characters and additionally ``.+*/:@-``.
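
The restrictions above can be checked on the client side before tags are submitted. This is a minimal sketch and not Ganeti's own validation code: the function name ``check_tags`` is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and empty tags are assumed to be invalid.

```python
import re

MAX_TAG_LENGTH = 128        # maximum length of a single tag
MAX_TAGS_PER_OBJECT = 4096  # maximum number of tags on one object

# Assumption: "alphanumeric" means ASCII letters and digits.
TAG_RE = re.compile(r"[A-Za-z0-9.+*/:@-]+")


def check_tags(tags):
    """Return a list of human-readable problems; empty if all tags pass."""
    problems = []
    if len(tags) > MAX_TAGS_PER_OBJECT:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS_PER_OBJECT}")
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            problems.append(f"tag too long ({len(tag)} characters): {tag[:20]}...")
        elif not TAG_RE.fullmatch(tag):
            problems.append(f"tag is empty or has disallowed characters: {tag!r}")
    return problems


if __name__ == "__main__":
    # Hypothetical tags, for illustration only.
    for problem in check_tags(["role:web", "owner@example.com", "bad tag"]):
        print(problem)
```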
Operations
++++++++++
Mon Oct 26 00:22:52 2009 adding instance instance1 to cluster config
Mon Oct 26 00:22:52 2009 - INFO: Waiting for instance instance1 to sync disks.
…
- Mon Oct 26 00:23:03 2009 creating os for instance xen-devi-18.fra.corp.google.com on node mpgntac4.fra.corp.google.com
+ Mon Oct 26 00:23:03 2009 creating os for instance instance1 on node node1
Mon Oct 26 00:23:03 2009 * running the instance OS create scripts...
Mon Oct 26 00:23:13 2009 * starting instance...
node1#
top-level queue directory, or look at its contents (it's a
JSON-formatted file).
+Special Ganeti deployments
+--------------------------
+
+Since Ganeti 2.4, it is possible to extend the Ganeti deployment with
+two custom scenarios: Ganeti inside Ganeti and the multi-site model.
+
+Running Ganeti under Ganeti
++++++++++++++++++++++++++++
+
+It is sometimes useful to be able to use a Ganeti instance as a Ganeti
+node (part of another cluster, usually). One example scenario is two
+small clusters, where we want to have an additional master candidate
+that holds the cluster configuration and can be used for helping with
+the master voting process.
+
+However, these Ganeti instances should not host instances themselves, and
+should not be considered in the normal capacity planning, evacuation
+strategies, etc. In order to accomplish this, mark these nodes as
+non-``vm_capable``::
+
+ node1# gnt-node modify --vm-capable=no node3
+
+The vm_capable status can be listed as usual via ``gnt-node list``::
+
+ node1# gnt-node list -oname,vm_capable
+ Node VMCapable
+ node1 Y
+ node2 Y
+ node3 N
+
+When this flag is set, the cluster will not do any operations that
+relate to instances on such nodes, e.g. hypervisor operations,
+disk-related operations, etc. Such nodes will just keep the ssconf
+files and, if they are master candidates, the full configuration.
+
+Multi-site model
+++++++++++++++++
+
+If Ganeti is deployed in a multi-site model, with each site being a node
+group (so that instances are not relocated across the WAN by mistake),
+it is conceivable that either the WAN latency is high or that some sites
+have a lower reliability than others. In this case, it doesn't make
+sense to replicate the job information across all sites (or even outside
+of a “central” node group), so it should be possible to restrict which
+nodes can become master candidates via the auto-promotion algorithm.
+
+Ganeti 2.4 introduces for this purpose a new ``master_capable`` flag,
+which (when unset) prevents nodes from being marked as master
+candidates, either manually or automatically.
+
+As usual, the node modify operation can change this flag::
+
+ node1# gnt-node modify --auto-promote --master-capable=no node3
+ Fri Jan 7 06:23:07 2011 - INFO: Demoting from master candidate
+ Fri Jan 7 06:23:08 2011 - INFO: Promoted nodes to master candidate role: node4
+ Modified node node3
+ - master_capable -> False
+ - master_candidate -> False
+
+And the node list operation will list this flag::
+
+ node1# gnt-node list -oname,master_capable node1 node2 node3
+ Node MasterCapable
+ node1 Y
+ node2 Y
+ node3 N
+
+Note that marking a node both not ``vm_capable`` and not
+``master_capable`` makes the node practically unusable from Ganeti's
+point of view. Hence these two flags should probably be used in
+contrast: some nodes will be only master candidates (master_capable but
+not vm_capable), and other nodes will only hold instances (vm_capable
+but not master_capable).
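
The four combinations of the two flags can be spotted by post-processing a node listing. The sketch below is illustrative only: it assumes a listing that shows both fields side by side, in the ``Y``/``N`` format of the examples above, and the function name ``classify_nodes`` and the sample data are hypothetical.

```python
def classify_nodes(listing):
    """Map node name to a role description based on its two flags.

    ``listing`` is text with a header line followed by lines of the form
    ``NAME MASTER_CAPABLE VM_CAPABLE`` using Y/N values (assumed layout).
    """
    roles = {}
    for line in listing.strip().splitlines()[1:]:  # skip the header line
        name, master_capable, vm_capable = line.split()
        if master_capable == "Y" and vm_capable == "Y":
            roles[name] = "general purpose"
        elif master_capable == "Y":
            roles[name] = "master candidate only"
        elif vm_capable == "Y":
            roles[name] = "instances only"
        else:
            roles[name] = "practically unusable"
    return roles


# Hypothetical sample data, for illustration only.
SAMPLE = """
Node  MasterCapable VMCapable
node1 Y             Y
node2 Y             N
node3 N             Y
node4 N             N
"""

if __name__ == "__main__":
    for node, role in classify_nodes(SAMPLE).items():
        print(node, role)
```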
+
+
Ganeti tools
------------
More information about the upgrade procedure is listed on the wiki at
http://code.google.com/p/ganeti/wiki/UpgradeNotes.
+There is also a script designed to upgrade from Ganeti 1.2 to 2.0,
+called ``cfgupgrade12``.
+
cfgshell
++++++++
instance OS definitions are executing properly the rename, import and
export operations.
+sanitize-config
++++++++++++++++
+
+This tool takes the Ganeti configuration and outputs a "sanitized"
+version, by randomizing or clearing:
+
+- DRBD secrets and cluster public key (always)
+- host names (optional)
+- IPs (optional)
+- OS names (optional)
+- LV names (optional, only useful for very old clusters which still have
+ instances whose LVs are based on the instance name)
+
+By default, all optional items are activated except the LV name
+randomization. Passing ``--no-randomization`` disables the
+optional items (i.e. just the DRBD secrets and cluster public key are
+randomized); the resulting file can then be used as a safety copy of the
+cluster config. While not trivial, the layout of the cluster can be
+recreated from it, and if the instance disks have not been lost it
+permits recovery from the loss of all master candidates.
+
+move-instance
++++++++++++++
+
+See :doc:`separate documentation for move-instance <move-instance>`.
+
+.. TODO: document cluster-merge tool
+
+
Other Ganeti projects
---------------------