--- /dev/null
+<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
+]>
+ <article class="specification">
+ <articleinfo>
+ <title>Ganeti administrator's guide</title>
+ </articleinfo>
+ <para>Documents Ganeti version 1.2</para>
+ <sect1>
+ <title>Introduction</title>
+
+ <para>Ganeti is virtualization cluster management software. Before using
+ it, you are expected to be a system administrator familiar with your Linux
+ distribution and the Xen virtualization environment.
+ </para>
+
+ <para>The various components of Ganeti all have man pages and interactive
+ help. This manual, however, will help you get familiar with the system by
+ explaining the most common operations, grouped by related use.
+ </para>
+
+ <para>After a terminology glossary and a section on the prerequisites
+ needed to use this manual, the rest of this document is divided into three
+ main sections, which group different features of Ganeti:
+ <itemizedlist>
+ <listitem>
+ <simpara>Instance Management</simpara>
+ </listitem>
+ <listitem>
+ <simpara>High Availability Features</simpara>
+ </listitem>
+ <listitem>
+ <simpara>Debugging Features</simpara>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <sect2>
+ <title>Ganeti Terminology</title>
+
+ <para>This section provides a short introduction to Ganeti terminology,
+ which may be useful when reading the rest of the document.
+
+ <variablelist>
+ <varlistentry>
+ <term>Cluster</term>
+ <listitem><para>A set of machines (nodes) that cooperate to offer a
+ coherent highly available virtualization service.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Node</term>
+ <listitem><para>A physical machine which is a member of a cluster.
+ Nodes are the basic cluster infrastructure, and they are not fault
+ tolerant.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Master Node</term>
+ <listitem><para>The node which controls the Cluster, and from which all
+ Ganeti commands must be run.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Instance</term>
+ <listitem><para>A virtual machine which runs on a cluster. It can be
+ a fault tolerant highly available entity.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Pool</term>
+ <listitem><para>A pool is a set of clusters sharing the same
+ network.</para></listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Meta-Cluster</term>
+ <listitem><para>Anything that concerns more than one
+ cluster.</para></listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Prerequisites</title>
+
+ <para>You need to have your Ganeti cluster installed and configured
+ before you try any of the commands in this document. Please follow the
+ "installation tutorial" for instructions on how to do that.
+ </para>
+ </sect2>
+
+ </sect1>
+
+ <sect1>
+ <title>Managing Instances</title>
+
+ <sect2>
+ <title>Adding/Removing an instance</title>
+
+ <para>Adding a new virtual instance to your Ganeti cluster is really
+ easy. The command is:
+ <programlisting>
+gnt-instance add -n TARGET_NODE -o OS_TYPE -t DISK_TEMPLATE INSTANCE_NAME
+ </programlisting>
+ The instance name must be resolvable in DNS and must map to an address in
+ the same subnet as the cluster itself. Options you can give to this
+ command include:
+ <itemizedlist>
+ <listitem>
+ <simpara>The disk size (-s)</simpara>
+ </listitem>
+ <listitem>
+ <simpara>The swap size (--swap-size)</simpara>
+ </listitem>
+ <listitem>
+ <simpara>The memory size (-m)</simpara>
+ </listitem>
+ <listitem>
+ <simpara>The number of virtual CPUs (-p)</simpara>
+ </listitem>
+ <listitem>
+ <simpara>The instance IP address (-i) (use -i auto to have Ganeti
+ record the address from DNS)</simpara>
+ </listitem>
+ <listitem>
+ <simpara>The bridge to connect the instance to (-b), if you don't
+ want to use the default one</simpara>
+ </listitem>
+ </itemizedlist>
+ If you want to create a highly available instance, use the remote_raid1
+ disk template:
+ <programlisting>
+gnt-instance add -n TARGET_NODE -o OS_TYPE -t remote_raid1 \
+ --secondary-node=SECONDARY_NODE INSTANCE_NAME
+ </programlisting>
+ To see which operating systems your cluster supports, you can use:
+ <programlisting>
+gnt-os list
+ </programlisting>
+ </para>
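+
+ <para>Putting the options together, a complete invocation might look like
+ this (the node, OS and instance names are only examples; substitute
+ values valid for your own cluster, and note that the disk and memory
+ sizes are assumed here to be given in megabytes):
+ <programlisting>
+gnt-instance add -n node1.example.com -o debian-etch -t plain \
+  -s 10240 -m 512 instance1.example.com
+ </programlisting>
+ This would create a non-redundant instance named instance1.example.com on
+ node1.example.com, with a 10 GB disk and 512 MB of memory.
+ </para>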
+
+ <para>
+ Removing an instance is even easier than creating one. This operation is
+ irreversible and destroys all the contents of your instance. Use with
+ care:
+ <programlisting>
+gnt-instance remove INSTANCE_NAME
+ </programlisting>
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Starting/Stopping an instance</title>
+
+ <para>Instances are automatically started at instance creation time. To
+ manually start one which is currently stopped you can run:
+ <programlisting>
+gnt-instance startup INSTANCE_NAME
+ </programlisting>
+ While the command to stop one is:
+ <programlisting>
+gnt-instance shutdown INSTANCE_NAME
+ </programlisting>
+ The command to see all the instances configured and their status is:
+ <programlisting>
+gnt-instance list
+ </programlisting>
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Exporting/Importing an instance</title>
+
+ <para>You can create a snapshot of an instance's disk and Ganeti
+ configuration, which you can then back up or import into another cluster.
+ The way to export an instance is:
+ <programlisting>
+gnt-backup export -n TARGET_NODE INSTANCE_NAME
+ </programlisting>
+ The target node can be any node in the cluster with enough space under
+ /srv/ganeti to hold the instance image. Use the --noshutdown option to
+ snapshot an instance without rebooting it. Any previous snapshots of the
+ same instance existing cluster-wide under /srv/ganeti will be removed by
+ this operation: if you want to keep them, move them out of the Ganeti
+ exports directory first.
+ </para>
+
+ <para>Importing an instance is as easy as creating a new one. The command
+ is:
+ <programlisting>
+gnt-backup import -n TRGT_NODE -t DISK_TMPL --src-node=NODE --src-dir=DIR INST_NAME
+ </programlisting>
+ Most of the options available for gnt-instance add are supported here
+ too.
+ </para>
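+
+ <para>For example, an import of a previously exported instance might look
+ like this (all names are illustrative, and the exact source directory
+ under /srv/ganeti depends on your installation):
+ <programlisting>
+gnt-backup import -n node2.example.com -t plain \
+  --src-node=node1.example.com --src-dir=/srv/ganeti/export/instance1.example.com \
+  instance1.example.com
+ </programlisting>
+ </para>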
+ </sect2>
+
+ </sect1>
+
+
+ <sect1>
+ <title>High availability features</title>
+
+ <sect2>
+ <title>Failing over an instance</title>
+
+ <para>If an instance is built in highly available mode, you can at any
+ time fail it over to its secondary node, even if the primary has somehow
+ failed and is no longer up. Doing so is easy; on the master node, just
+ run:
+ <programlisting>
+gnt-instance failover INSTANCE_NAME
+ </programlisting>
+ That's it. After the command completes the secondary node is now the
+ primary, and vice versa.
+ </para>
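+
+ <para>To check which node is currently the primary, you can inspect the
+ instance afterwards:
+ <programlisting>
+gnt-instance info INSTANCE_NAME
+ </programlisting>
+ The output includes, among other details, the current primary and
+ secondary nodes of the instance.
+ </para>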
+ </sect2>
+ <sect2>
+ <title>Replacing an instance's disks</title>
+
+ <para>What if, instead, the secondary node for an instance has failed, or
+ you plan to remove a node from your cluster and have failed over all of
+ its instances, but it is still the secondary for some? The solution is to
+ replace the instance disks, changing the secondary node:
+ <programlisting>
+gnt-instance replace-disks -n NEW_SECONDARY INSTANCE_NAME
+ </programlisting>
+ This process takes a bit longer, but involves no instance downtime; at
+ the end of it the instance has a new secondary node, to which it can be
+ failed over if necessary.
+ </para>
+ </sect2>
+ <sect2>
+ <title>Failing over the master node</title>
+
+ <para>This is all well and good as long as the Ganeti master node is up.
+ Should it go down, or should you wish to decommission it, run the
+ following command on any other node:
+ <programlisting>
+gnt-cluster masterfailover
+ </programlisting>
+ and the node you ran it on is now the new master.
+ </para>
+ </sect2>
+ <sect2>
+ <title>Adding/Removing nodes</title>
+
+ <para>Now that you know how to move instances around, it's easy to free
+ up a node, after which you can remove it from the cluster:
+ <programlisting>
+gnt-node remove NODE_NAME
+ </programlisting>
+ and maybe add a new one:
+ <programlisting>
+gnt-node add [--secondary-ip=ADDRESS] NODE_NAME
+ </programlisting>
+ </para>
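+
+ <para>Tying these together, decommissioning a node typically combines the
+ commands introduced earlier (the names are placeholders):
+ <programlisting>
+# fail over the instances for which the node is primary
+gnt-instance failover INSTANCE_NAME
+# move the disks of instances for which it is secondary
+gnt-instance replace-disks -n NEW_SECONDARY INSTANCE_NAME
+# once the node hosts no instances, remove it from the cluster
+gnt-node remove NODE_NAME
+ </programlisting>
+ </para>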
+ </sect2>
+ </sect1>
+
+ <sect1>
+ <title>Debugging Features</title>
+
+ <para>At some point you might need to do some debugging operations on your
+ cluster or on your instances. This section will help you with the most used
+ debugging functionalities.
+ </para>
+
+ <sect2>
+ <title>Accessing an instance's disks</title>
+
+ <para>From an instance's primary node you have access to its disks. Never
+ mount the underlying logical volumes manually on a fault tolerant
+ instance, though, or you risk breaking replication. The correct way to
+ access them is to run the command:
+ <programlisting>
+gnt-instance activate-disks INSTANCE_NAME
+ </programlisting>
+ and then access the device that gets created. After you've finished,
+ deactivate the disks with the deactivate-disks command, which works in
+ the same way.
+ </para>
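+
+ <para>A typical debugging session on the primary node therefore looks
+ like this (the exact block device names reported by activate-disks depend
+ on your configuration):
+ <programlisting>
+gnt-instance activate-disks INSTANCE_NAME
+# ... inspect the reported block devices, e.g. read-only with dd ...
+gnt-instance deactivate-disks INSTANCE_NAME
+ </programlisting>
+ </para>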
+ </sect2>
+
+ <sect2>
+ <title>Accessing an instance's console</title>
+
+ <para>The command to access a running instance's console is:
+ <programlisting>
+gnt-instance console INSTANCE_NAME
+ </programlisting>
+ Use the console normally, then type ^] (Ctrl-]) when done to exit.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Instance Operating System Debugging</title>
+
+ <para>Should you have any problems with operating system support, the
+ command to run to see a complete status for all your nodes is:
+ <programlisting>
+gnt-os diagnose
+ </programlisting>
+ </para>
+
+ </sect2>
+
+ <sect2>
+ <title>Cluster-wide debugging</title>
+
+ <para>The gnt-cluster command offers several options to run tests or
+ execute cluster-wide operations. For example:
+ <programlisting>
+gnt-cluster command
+gnt-cluster copyfile
+gnt-cluster verify
+gnt-cluster getmaster
+gnt-cluster version
+ </programlisting>
+ See the respective help texts or man pages to learn more about their
+ usage.
+ </para>
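+
+ <para>As an illustration, gnt-cluster command runs a shell command on all
+ cluster nodes, which is handy for quick cluster-wide checks (the command
+ shown is just an example):
+ <programlisting>
+gnt-cluster command uptime
+ </programlisting>
+ </para>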
+ </sect2>
+
+ </sect1>
+
+ </article>