<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
<article class="specification">
<title>Ganeti administrator's guide</title>
<para>Documents Ganeti version 1.2</para>
<title>Introduction</title>
<para>Ganeti is virtualization cluster management software. You are
expected to be a system administrator familiar with your Linux distribution
and the Xen virtualization environment before using it.
<para>The various components of Ganeti all have man pages and interactive
help. This manual, though, will help you get familiar with the system by
explaining the most common operations, grouped by related use.
<para>After a terminology glossary and a section on the prerequisites
needed to use this manual, the rest of this document is divided into three
main sections, which group different features of Ganeti:
<simpara>Instance Management</simpara>
<simpara>High Availability Features</simpara>
<simpara>Debugging Features</simpara>
<title>Ganeti terminology</title>
<para>This section provides a short introduction to Ganeti terminology,
which may be useful when reading the rest of the document.
<glossterm>Cluster</glossterm>
A set of machines (nodes) that cooperate to offer a
coherent, highly available virtualization service.
<glossterm>Node</glossterm>
A physical machine which is a member of a cluster.
Nodes are the basic cluster infrastructure, and are
not fault tolerant.
<glossterm>Master node</glossterm>
The node which controls the Cluster, from which all
Ganeti commands must be given.
<glossterm>Instance</glossterm>
A virtual machine which runs on a cluster. It can be a
fault-tolerant, highly available entity.
<glossterm>Pool</glossterm>
A pool is a set of clusters sharing the same network.
<glossterm>Meta-Cluster</glossterm>
Anything that concerns more than one cluster.
<title>Prerequisites</title>
You need to have your Ganeti cluster installed and configured
before you try any of the commands in this document. Please
follow the <emphasis>Ganeti installation tutorial</emphasis>
for instructions on how to do that.
<title>Managing Instances</title>
<title>Adding/Removing an instance</title>
Adding a new virtual instance to your Ganeti cluster is really
easy. The command is:
<synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable> -o <replaceable>OS_TYPE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
The instance name must be resolvable (e.g. exist in DNS) and
of course map to an address in the same subnet as the cluster
itself. Options you can give to this command include:
<simpara>The disk size (<option>-s</option>)</simpara>
<simpara>The swap size (<option>--swap-size</option>)</simpara>
<simpara>The memory size (<option>-m</option>)</simpara>
<simpara>The number of virtual CPUs (<option>-p</option>)</simpara>
<simpara>The instance IP address (<option>-i</option>) (use
the value <literal>auto</literal> to make Ganeti record the
address from DNS)</simpara>
<simpara>The bridge to connect the instance to
(<option>-b</option>), if you don't want to use the default
one</simpara>
<para>There are four types of disk template you can choose from:</para>
<term>diskless</term>
<listitem><para>The instance has no disks. Only used for special
purpose operating systems or for testing.</para></listitem>
<term>plain</term>
<listitem><para>The instance will use LVM devices as backend for its
disks. No redundancy is provided.</para></listitem>
<term>local_raid1</term>
<listitem><para>A local mirror is set up between LVM devices to back the
instance. This provides some redundancy for the instance's
data.</para></listitem>
<term>remote_raid1</term>
<simpara><emphasis role="strong">Note:</emphasis> This is
only valid for multi-node clusters.</simpara>
A mirror is set up between the local node and a remote
one, which must be specified with the
<option>--secondary-node</option> option. Use this template
to obtain a highly available instance that can be failed
over to a remote node should the primary one fail.
For example, if you want to create a highly available instance,
use the remote_raid1 disk template:
<synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable> -o <replaceable>OS_TYPE</replaceable> -t remote_raid1 \
  --secondary-node=<replaceable>SECONDARY_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
To find out which operating systems your cluster supports you can use:
<synopsis>gnt-os list</synopsis>
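<para>
Putting the options together, a complete invocation might look like the
following sketch. The node, OS and instance names (and the sizes) are
hypothetical examples, not defaults:
</para>
<screen>
gnt-instance add -n node1.example.com -o debian-etch -t plain \
  -s 10g -m 512 instance1.example.com
</screen>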
Removing an instance is even easier than creating one. This
operation is irreversible and destroys all the contents of
your instance. Use with care:
<synopsis>gnt-instance remove <replaceable>INSTANCE_NAME</replaceable></synopsis>
<title>Starting/Stopping an instance</title>
Instances are automatically started at instance creation
time. To manually start one which is currently stopped, you can
run:
<synopsis>gnt-instance startup <replaceable>INSTANCE_NAME</replaceable></synopsis>
While the command to stop one is:
<synopsis>gnt-instance shutdown <replaceable>INSTANCE_NAME</replaceable></synopsis>
The command to see all the instances configured and their
status is:
<synopsis>gnt-instance list</synopsis>
Do not use the Xen commands to stop instances. If you run, for
example, <command>xm shutdown</command> or <command>xm destroy</command>
on an instance, Ganeti will automatically restart it (via the
<citerefentry><refentrytitle>ganeti-watcher</refentrytitle>
<manvolnum>8</manvolnum></citerefentry>)
<title>Exporting/Importing an instance</title>
You can create a snapshot of an instance's disk and Ganeti
configuration, which you can then back up, or import into
another cluster. The way to export an instance is:
<synopsis>gnt-backup export -n <replaceable>TARGET_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
The target node can be any node in the cluster with enough
space under <filename class="directory">/srv/ganeti</filename>
to hold the instance image. Use the
<option>--noshutdown</option> option to snapshot an instance
without rebooting it. Any previous snapshot of the same
instance existing cluster-wide under <filename
class="directory">/srv/ganeti</filename> will be removed by
this operation: if you want to keep them, move them out of the
Ganeti exports directory.
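<para>
For example, to snapshot a hypothetical instance
<literal>instance1.example.com</literal> onto
<literal>node3.example.com</literal> without shutting it down (names
are illustrative):
</para>
<screen>
gnt-backup export -n node3.example.com --noshutdown instance1.example.com
</screen>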
Importing an instance is similar to creating a new one. The
command is:
<synopsis>gnt-backup import -n <replaceable>TARGET_NODE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> --src-node=<replaceable>NODE</replaceable> --src-dir=<replaceable>DIR</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
Most of the options available for the command
<emphasis>gnt-instance add</emphasis> are supported here too.
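<para>
As a sketch, importing the hypothetical export above onto another
node; the source directory shown is illustrative, so use the directory
where your export actually lives:
</para>
<screen>
gnt-backup import -n node1.example.com -t plain \
  --src-node=node3.example.com --src-dir=/srv/ganeti/export \
  instance1.example.com
</screen>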
<title>High availability features</title>
<simpara>This section only applies to multi-node clusters.</simpara>
<title>Failing over an instance</title>
If an instance is built in highly available mode, you can at
any time fail it over to its secondary node, even if the
primary has somehow failed and is no longer up. Doing so
is easy: on the master node, just run:
<synopsis>gnt-instance failover <replaceable>INSTANCE_NAME</replaceable></synopsis>
That's it. After the command completes, the secondary node is
now the primary, and vice versa.
<title>Replacing an instance's disks</title>
So what if, instead, the secondary node for an instance has
failed, or you plan to remove a node from your cluster and
have failed over all its instances, but it is still the secondary
for some? The solution here is to replace the instance's disks,
changing the secondary node:
<synopsis>gnt-instance replace-disks -n <replaceable>NEW_SECONDARY</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
This process is a bit longer, but involves no instance
downtime, and at the end of it the instance has changed its
secondary node, to which it can, if necessary, be failed over.
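<para>
For example, to evacuate a hypothetical node
<literal>node2.example.com</literal>, you might fail over the instance
it is primary for, and then re-mirror the instance it is secondary for
onto another node (all names here are illustrative):
</para>
<screen>
gnt-instance failover instance1.example.com
gnt-instance replace-disks -n node3.example.com instance2.example.com
</screen>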
<title>Failing over the master node</title>
This is all good as long as the Ganeti master node is
up. Should it go down, or should you wish to decommission it,
just run the following command on any other node:
<synopsis>gnt-cluster masterfailover</synopsis>
and the node you ran it on is now the new master.
<title>Adding/Removing nodes</title>
And of course, now that you know how to move instances around,
it's easy to free up a node, and then you can remove it from
the cluster:
<synopsis>gnt-node remove <replaceable>NODE_NAME</replaceable></synopsis>
and maybe add a new one:
<synopsis>gnt-node add <optional><option>--secondary-ip=<replaceable>ADDRESS</replaceable></option></optional> <replaceable>NODE_NAME</replaceable></synopsis>
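<para>
For example (hypothetical names and address), once a node is freed you
could remove it and add a replacement that uses a dedicated address
for replication traffic:
</para>
<screen>
gnt-node remove node2.example.com
gnt-node add --secondary-ip=192.168.1.4 node4.example.com
</screen>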
<title>Debugging Features</title>
At some point you might need to do some debugging operations on
your cluster or on your instances. This section will help you
with the most commonly used debugging functionality.
<title>Accessing an instance's disks</title>
From an instance's primary node you have access to its
disks. Never ever mount the underlying logical volume manually
on a fault-tolerant instance, or you risk breaking
replication. The correct way to access them is to run the
command:
<synopsis>gnt-instance activate-disks <replaceable>INSTANCE_NAME</replaceable></synopsis>
and then access the device that gets created. After you've
finished, you can deactivate them with the deactivate-disks
command, which works in the same way.
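<para>
A typical debugging session on the primary node might look like the
following sketch; the device to mount is the one reported by
activate-disks, and the instance name is a hypothetical example:
</para>
<screen>
gnt-instance activate-disks instance1.example.com
mount <replaceable>DISK_DEVICE</replaceable> /mnt
# ... inspect the instance's filesystem ...
umount /mnt
gnt-instance deactivate-disks instance1.example.com
</screen>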
<title>Accessing an instance's console</title>
The command to access a running instance's console is:
<synopsis>gnt-instance console <replaceable>INSTANCE_NAME</replaceable></synopsis>
Use the console normally and then type
<userinput>^]</userinput> when done, to exit.
<title>Instance Operating System Debugging</title>
Should you have any problems with operating system support,
the command to run to see a complete status for all your nodes
is:
<synopsis>gnt-os diagnose</synopsis>
<title>Cluster-wide debugging</title>
The <command>gnt-cluster</command> command offers several options to run tests or
execute cluster-wide operations. For example:
<synopsis>gnt-cluster getmaster</synopsis>
See the man page <citerefentry>
<refentrytitle>gnt-cluster</refentrytitle>
<manvolnum>8</manvolnum> </citerefentry> to learn more about