1 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
3 <article class="specification">
5 <title>Ganeti administrator's guide</title>
7 <para>Documents Ganeti version 1.2</para>
9 <title>Introduction</title>
<para>Ganeti is virtualization cluster management software. You are
expected to be a system administrator familiar with your Linux distribution
and the Xen virtualization environment before using it.
<para>The various components of Ganeti all have man pages and interactive
help. This manual, however, will help you get familiar with the system by
explaining the most common operations, grouped by related use.
<para>After a terminology glossary and a section on the prerequisites
needed to use this manual, the rest of this document is divided into three
main sections, which group different features of Ganeti:
26 <simpara>Instance Management</simpara>
29 <simpara>High Availability Features</simpara>
32 <simpara>Debugging Features</simpara>
38 <title>Ganeti terminology</title>
This section provides a short introduction to Ganeti terminology, which
may be useful when reading the rest of the document.
46 <glossterm>Cluster</glossterm>
49 A set of machines (nodes) that cooperate to offer a
50 coherent highly available virtualization service.
55 <glossterm>Node</glossterm>
A physical machine which is a member of a cluster.
59 Nodes are the basic cluster infrastructure, and are
65 <glossterm>Master node</glossterm>
The node which controls the cluster, and from which all
Ganeti commands must be issued.
74 <glossterm>Instance</glossterm>
A virtual machine which runs on a cluster. It can be a
fault-tolerant, highly available entity.
83 <glossterm>Pool</glossterm>
86 A pool is a set of clusters sharing the same network.
91 <glossterm>Meta-Cluster</glossterm>
94 Anything that concerns more than one cluster.
103 <title>Prerequisites</title>
106 You need to have your Ganeti cluster installed and configured before
107 you try any of the commands in this document. Please follow the
108 <emphasis>Ganeti installation tutorial</emphasis> for instructions on
116 <title>Managing Instances</title>
119 <title>Adding/Removing an instance</title>
122 Adding a new virtual instance to your Ganeti cluster is really easy.
125 <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable> -o <replaceable>OS_TYPE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
127 The instance name must be resolvable (e.g. exist in DNS) and
128 of course map to an address in the same subnet as the cluster
129 itself. Options you can give to this command include:
133 <simpara>The disk size (<option>-s</option>)</simpara>
136 <simpara>The swap size (<option>--swap-size</option>)</simpara>
139 <simpara>The memory size (<option>-m</option>)</simpara>
142 <simpara>The number of virtual CPUs (<option>-p</option>)</simpara>
<simpara>The instance IP address (<option>-i</option>) (use the value
146 <literal>auto</literal> to make Ganeti record the address from
150 <simpara>The bridge to connect the instance to (<option>-b</option>),
151 if you don't want to use the default one</simpara>
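<para>As an illustration only, the options listed above can be assembled programmatically. The following Python sketch builds the argument list for <command>gnt-instance add</command> without running it. The function name <literal>build_add_command</literal>, the host names, the OS name and the numeric values are hypothetical, and the units each option expects should be checked against the man page.</para>

```python
from typing import List, Optional


def build_add_command(
    name: str,
    node: str,
    os_type: str,
    disk_template: str,
    disk_size: Optional[str] = None,
    swap_size: Optional[str] = None,
    memory: Optional[str] = None,
    vcpus: Optional[int] = None,
    ip: Optional[str] = None,
    bridge: Optional[str] = None,
) -> List[str]:
    """Return the argv for 'gnt-instance add'; nothing is executed."""
    cmd = ["gnt-instance", "add", "-n", node, "-o", os_type, "-t", disk_template]
    # Optional flags as listed in this guide; unset ones are left out.
    optional = [
        ("-s", disk_size),
        ("--swap-size", swap_size),
        ("-m", memory),
        ("-p", vcpus),
        ("-i", ip),
        ("-b", bridge),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    cmd.append(name)  # the instance name goes last
    return cmd


# Hypothetical names and values, for illustration only.
print(" ".join(build_add_command(
    "instance1.example.com", "node1.example.com", "debian-etch",
    "local_raid1", memory="512", vcpus=1, ip="auto")))
```

<para>Returning a list rather than a single string means the result can be handed to <literal>subprocess.check_call</literal> on the master node without any shell quoting concerns.</para>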
156 <para>There are four types of disk template you can choose from:</para>
160 <term>diskless</term>
<para>The instance has no disks. Only used for special-purpose
operating systems or for testing.</para>
170 <para>The instance will use LVM devices as backend for its disks.
171 No redundancy is provided.</para>
176 <term>local_raid1</term>
<para>A local mirror is set up between LVM devices to back the
179 instance. This provides some redundancy for the instance's
185 <term>remote_raid1</term>
187 <simpara><emphasis role="strong">Note:</emphasis> This is only
188 valid for multi-node clusters using drbd 0.7.</simpara>
A mirror is set up between the local node and a remote one, which
must be specified as the second value of the --node option. Use
this option to obtain a highly available instance that can be
failed over to a remote node should the primary one fail.
201 <simpara><emphasis role="strong">Note:</emphasis> This is only
202 valid for multi-node clusters using drbd 8.0.</simpara>
204 This is similar to the
205 <replaceable>remote_raid1</replaceable> option, but uses
206 new features in drbd 8 to simplify the device
207 stack. From a user's point of view, this will improve
208 the speed of the <command>replace-disks</command>
209 command and (in future versions) provide more
For example, to create a highly available instance, use the
remote_raid1 or drbd disk template:
220 <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable><optional>:<replaceable>SECONDARY_NODE</replaceable></optional> -o <replaceable>OS_TYPE</replaceable> -t remote_raid1 \
221 <replaceable>INSTANCE_NAME</replaceable></synopsis>
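<para>The primary and secondary nodes are passed as a single colon-separated value, as in the synopsis above. A minimal Python sketch of building that command follows; the helper name <literal>build_ha_add_command</literal> and all host and OS names are hypothetical, and the command is only constructed, not executed.</para>

```python
from typing import List

# Disk templates this guide describes as mirrored across two nodes.
MIRRORED_TEMPLATES = ("remote_raid1", "drbd")


def build_ha_add_command(
    name: str,
    primary: str,
    secondary: str,
    os_type: str,
    disk_template: str = "remote_raid1",
) -> List[str]:
    """Return the argv for creating a highly available instance."""
    if disk_template not in MIRRORED_TEMPLATES:
        raise ValueError("not a mirrored disk template: %s" % disk_template)
    if primary == secondary:
        raise ValueError("primary and secondary must be different nodes")
    # Both nodes go into one '-n' value: PRIMARY:SECONDARY
    node_spec = "%s:%s" % (primary, secondary)
    return ["gnt-instance", "add", "-n", node_spec,
            "-o", os_type, "-t", disk_template, name]


# Hypothetical names, for illustration only.
print(" ".join(build_ha_add_command(
    "instance1.example.com", "node1.example.com",
    "node2.example.com", "debian-etch")))
```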
To find out which operating systems your cluster supports, use
225 <synopsis>gnt-os list</synopsis>
Removing an instance is even easier than creating one. This operation
is irreversible and destroys all the contents of your instance. Use
233 <synopsis>gnt-instance remove <replaceable>INSTANCE_NAME</replaceable></synopsis>
238 <title>Starting/Stopping an instance</title>
241 Instances are automatically started at instance creation time. To
242 manually start one which is currently stopped you can run:
244 <synopsis>gnt-instance startup <replaceable>INSTANCE_NAME</replaceable></synopsis>
The command to stop one is:
248 <synopsis>gnt-instance shutdown <replaceable>INSTANCE_NAME</replaceable></synopsis>
250 The command to see all the instances configured and their status is:
252 <synopsis>gnt-instance list</synopsis>
Do not use the Xen commands to stop instances. If you run, for
example, xm shutdown or xm destroy on an instance, Ganeti will
automatically restart it (via the
260 <citerefentry><refentrytitle>ganeti-watcher</refentrytitle>
261 <manvolnum>8</manvolnum></citerefentry>)
267 <title>Exporting/Importing an instance</title>
You can create a snapshot of an instance's disk and Ganeti
configuration, which you can then back up or import into
another cluster. The way to export an instance is:
274 <synopsis>gnt-backup export -n <replaceable>TARGET_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
276 The target node can be any node in the cluster with enough
277 space under <filename class="directory">/srv/ganeti</filename>
278 to hold the instance image. Use the
279 <option>--noshutdown</option> option to snapshot an instance
without rebooting it. Any previous snapshots of the same
instance that exist cluster-wide under <filename
class="directory">/srv/ganeti</filename> will be removed by
this operation; if you want to keep them, move them out of the
Ganeti exports directory.
288 Importing an instance is similar to creating a new one. The command is:
<synopsis>gnt-backup import -n <replaceable>TARGET_NODE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> --src-node=<replaceable>NODE</replaceable> --src-dir=<replaceable>DIR</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>
292 Most of the options available for the command
293 <emphasis>gnt-instance add</emphasis> are supported here too.
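<para>The export and import commands above can be paired in a script. The Python sketch below only builds the two argument lists; the function names, host names and the export directory path are hypothetical, and the exact directory layout under <filename class="directory">/srv/ganeti</filename> is not described in this guide, so the source directory is passed in by the caller.</para>

```python
from typing import List


def build_export_command(instance: str, target_node: str,
                         shutdown: bool = True) -> List[str]:
    """Return the argv for 'gnt-backup export'; nothing is executed."""
    cmd = ["gnt-backup", "export", "-n", target_node]
    if not shutdown:
        # Snapshot the instance while it keeps running.
        cmd.append("--noshutdown")
    cmd.append(instance)
    return cmd


def build_import_command(instance: str, target_node: str, disk_template: str,
                         src_node: str, src_dir: str) -> List[str]:
    """Return the argv for 'gnt-backup import'; nothing is executed."""
    return ["gnt-backup", "import", "-n", target_node, "-t", disk_template,
            "--src-node=%s" % src_node, "--src-dir=%s" % src_dir, instance]


# Hypothetical names and path, for illustration only.
print(" ".join(build_export_command(
    "instance1.example.com", "node3.example.com", shutdown=False)))
print(" ".join(build_import_command(
    "instance1.example.com", "node1.example.com", "local_raid1",
    "node3.example.com", "/srv/ganeti/some-export-dir")))
```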
302 <title>High availability features</title>
305 <simpara>This section only applies to multi-node clusters.</simpara>
309 <title>Failing over an instance</title>
If an instance is built in highly available mode, you can at
any time fail it over to its secondary node, even if the
primary has somehow failed and is no longer up. Doing so
is easy: on the master node, just run:
317 <synopsis>gnt-instance failover <replaceable>INSTANCE_NAME</replaceable></synopsis>
That's it. After the command completes, the secondary node is
now the primary, and vice versa.
<title>Replacing an instance's disks</title>
What if, instead, the secondary node for an instance has
failed, or you plan to remove a node from your cluster and
have failed over all its instances, but it is still the secondary
for some? The solution is to replace the instance's disks,
changing the secondary node. This is done in one of two ways, depending on the disk template type. For <literal>remote_raid1</literal>:
334 <synopsis>gnt-instance replace-disks <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>
336 and for <literal>drbd</literal>:
337 <synopsis>gnt-instance replace-disks <option>-s</option> <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>
339 This process is a bit longer, but involves no instance
340 downtime, and at the end of it the instance has changed its
341 secondary node, to which it can if necessary be failed over.
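<para>Because the flags differ between the two mirrored disk templates, a script has to branch on the template. A minimal Python sketch of that choice follows; the function name <literal>build_replace_disks_command</literal> and the host names are hypothetical, and the command is only built, not run.</para>

```python
from typing import List


def build_replace_disks_command(instance: str, new_secondary: str,
                                disk_template: str) -> List[str]:
    """Return the argv for 'gnt-instance replace-disks' for a new secondary."""
    if disk_template == "remote_raid1":
        flags = ["-n", new_secondary]
    elif disk_template == "drbd":
        # The drbd template additionally takes '-s', as shown above.
        flags = ["-s", "-n", new_secondary]
    else:
        raise ValueError("no secondary node to replace for template: %s"
                         % disk_template)
    return ["gnt-instance", "replace-disks"] + flags + [instance]


# Hypothetical names, for illustration only.
print(" ".join(build_replace_disks_command(
    "instance1.example.com", "node3.example.com", "drbd")))
```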
345 <title>Failing over the master node</title>
This is all good as long as the Ganeti master node is
up. Should it go down, or should you wish to decommission it,
just run the following command on any other node:
352 <synopsis>gnt-cluster masterfailover</synopsis>
354 and the node you ran it on is now the new master.
358 <title>Adding/Removing nodes</title>
361 And of course, now that you know how to move instances around,
362 it's easy to free up a node, and then you can remove it from
365 <synopsis>gnt-node remove <replaceable>NODE_NAME</replaceable></synopsis>
367 and maybe add a new one:
369 <synopsis>gnt-node add <optional><option>--secondary-ip=<replaceable>ADDRESS</replaceable></option></optional> <replaceable>NODE_NAME</replaceable>
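<para>Freeing up a node combines the operations from the previous sections: fail over the instances for which it is primary, move the secondary role elsewhere, then remove the node. The Python sketch below produces that command sequence as a plan, without executing anything. The function name, the shape of the <literal>instances</literal> dictionary and all host names are assumptions made for illustration; Ganeti does not provide this data in this form.</para>

```python
from typing import Dict, List


def plan_node_removal(node: str, replacement: str,
                      instances: Dict[str, Dict[str, str]]) -> List[List[str]]:
    """Return the commands needed to empty 'node' and then remove it.

    'instances' maps an instance name to a dict with the keys
    'primary', 'secondary' and 'template' (a hypothetical structure).
    """
    flags_by_template = {
        "remote_raid1": ["-n", replacement],
        "drbd": ["-s", "-n", replacement],
    }
    commands: List[List[str]] = []
    for name in sorted(instances):
        info = instances[name]
        nodes = (info["primary"], info.get("secondary"))
        if node not in nodes:
            continue  # this instance does not use the node at all
        if info["template"] not in flags_by_template:
            raise ValueError("%s is not mirrored and cannot be moved" % name)
        if replacement in nodes:
            raise ValueError("%s already uses %s" % (name, replacement))
        if info["primary"] == node:
            # After the failover, 'node' becomes the secondary.
            commands.append(["gnt-instance", "failover", name])
        commands.append(["gnt-instance", "replace-disks"]
                        + flags_by_template[info["template"]] + [name])
    commands.append(["gnt-node", "remove", node])
    return commands


# Hypothetical cluster layout, for illustration only.
example = {
    "inst-a": {"primary": "node1", "secondary": "node2", "template": "drbd"},
    "inst-b": {"primary": "node2", "secondary": "node1",
               "template": "remote_raid1"},
}
for command in plan_node_removal("node1", "node4", example):
    print(" ".join(command))
```

<para>Printing the plan first and running it by hand keeps the administrator in control of each step, which matters because a failover interrupts the instance.</para>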
377 <title>Debugging Features</title>
At some point you might need to perform debugging operations on
your cluster or on your instances. This section will help you
with the most commonly used debugging features.
386 <title>Accessing an instance's disks</title>
389 From an instance's primary node you have access to its
390 disks. Never ever mount the underlying logical volume manually
391 on a fault tolerant instance, or you risk breaking
392 replication. The correct way to access them is to run the
395 <synopsis>gnt-instance activate-disks <replaceable>INSTANCE_NAME</replaceable></synopsis>
Then access the devices that get created. When you have
finished, deactivate them with the deactivate-disks
command, which works in the same way.
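<para>Since the disks should always be deactivated again, even if the work in between fails, the pairing fits a context manager. The following Python sketch assumes it runs on the instance's primary node; the name <literal>activated_disks</literal> and the injectable <literal>run</literal> parameter are hypothetical conveniences, the latter so that the sequence can be tried out without a cluster.</para>

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def activated_disks(instance, run=subprocess.check_call):
    """Activate an instance's disks for the duration of a 'with' block."""
    run(["gnt-instance", "activate-disks", instance])
    try:
        yield
    finally:
        # Runs even if the body of the 'with' block raises.
        run(["gnt-instance", "deactivate-disks", instance])


# Dry run: record the commands instead of executing them.
recorded = []
with activated_disks("instance1.example.com", run=recorded.append):
    recorded.append("inspect the devices here")
for entry in recorded:
    print(entry)
```

<para>With the default <literal>run</literal>, a failing <command>activate-disks</command> raises before the block is entered, so no deactivation is attempted for disks that were never activated.</para>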
404 <title>Accessing an instance's console</title>
407 The command to access a running instance's console is:
409 <synopsis>gnt-instance console <replaceable>INSTANCE_NAME</replaceable></synopsis>
411 Use the console normally and then type
412 <userinput>^]</userinput> when done, to exit.
<title>Instance OS definitions debugging</title>
Should you have any problems with operating system support,
the command to run to see a complete status for all your nodes
424 <synopsis>gnt-os diagnose</synopsis>
431 <title>Cluster-wide debugging</title>
434 The gnt-cluster command offers several options to run tests or
435 execute cluster-wide operations. For example:
441 gnt-cluster getmaster
445 See the man page <citerefentry>
446 <refentrytitle>gnt-cluster</refentrytitle>
447 <manvolnum>8</manvolnum> </citerefentry> to know more about