<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
]>
<article class="specification">
  <articleinfo>
    <title>Ganeti administrator's guide</title>
  </articleinfo>
  <para>Documents Ganeti version 1.2</para>
  <sect1>
    <title>Introduction</title>

    <para>Ganeti is virtualization cluster management software. You are
    expected to be a system administrator familiar with your Linux
    distribution and the Xen virtualization environment before using it.
    </para>

    <para>The various components of Ganeti all have man pages and
    interactive help. This manual, however, will help you become familiar
    with the system by explaining the most common operations, grouped by
    related use.
    </para>

    <para>After a terminology glossary and a section on the prerequisites
    needed to use this manual, the rest of this document is divided into
    three main sections, which group different features of Ganeti:
      <itemizedlist>
        <listitem>
          <simpara>Instance Management</simpara>
        </listitem>
        <listitem>
          <simpara>High Availability Features</simpara>
        </listitem>
        <listitem>
          <simpara>Debugging Features</simpara>
        </listitem>
      </itemizedlist>
    </para>

    <sect2>
      <title>Ganeti terminology</title>

      <para>
        This section provides a short introduction to Ganeti terminology,
        which may be useful when reading the rest of the document.

        <glosslist>
          <glossentry>
            <glossterm>Cluster</glossterm>
            <glossdef>
              <simpara>
                A set of machines (nodes) that cooperate to offer a
                coherent, highly available virtualization service.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Node</glossterm>
            <glossdef>
              <simpara>
                A physical machine which is a member of a cluster. Nodes
                are the basic cluster infrastructure, and are not fault
                tolerant.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Master node</glossterm>
            <glossdef>
              <simpara>
                The node which controls the cluster, and from which all
                Ganeti commands must be run.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Instance</glossterm>
            <glossdef>
              <simpara>
                A virtual machine which runs on a cluster. It can be a
                fault-tolerant, highly available entity.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Pool</glossterm>
            <glossdef>
              <simpara>
                A pool is a set of clusters sharing the same network.
              </simpara>
            </glossdef>
          </glossentry>
          <glossentry>
            <glossterm>Meta-Cluster</glossterm>
            <glossdef>
              <simpara>
                Anything that concerns more than one cluster.
              </simpara>
            </glossdef>
          </glossentry>
        </glosslist>
      </para>
    </sect2>

    <sect2>
      <title>Prerequisites</title>

      <para>
        You need to have your Ganeti cluster installed and configured
        before you try any of the commands in this document. Please follow
        the <emphasis>Ganeti installation tutorial</emphasis> for
        instructions on how to do that.
      </para>
    </sect2>

  </sect1>

  <sect1>
    <title>Managing Instances</title>

    <sect2>
      <title>Adding/Removing an instance</title>

      <para>
        Adding a new virtual instance to your Ganeti cluster is really
        easy. The command is:

        <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable> -o <replaceable>OS_TYPE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The instance name must be resolvable (e.g. exist in DNS) and of
        course map to an address in the same subnet as the cluster
        itself. Options you can give to this command include:

        <itemizedlist>
          <listitem>
            <simpara>The disk size (<option>-s</option>)</simpara>
          </listitem>
          <listitem>
            <simpara>The swap size (<option>--swap-size</option>)</simpara>
          </listitem>
          <listitem>
            <simpara>The memory size (<option>-m</option>)</simpara>
          </listitem>
          <listitem>
            <simpara>The number of virtual CPUs (<option>-p</option>)</simpara>
          </listitem>
          <listitem>
            <simpara>The instance IP address (<option>-i</option>) (use the
            value <literal>auto</literal> to make Ganeti record the address
            from DNS)</simpara>
          </listitem>
          <listitem>
            <simpara>The bridge to connect the instance to
            (<option>-b</option>), if you don't want to use the default
            one</simpara>
          </listitem>
        </itemizedlist>
      </para>

      <para>There are five types of disk template you can choose
      from:</para>

      <variablelist>
        <varlistentry>
          <term>diskless</term>
          <listitem>
            <para>The instance has no disks. Only used for special-purpose
            operating systems or for testing.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>plain</term>
          <listitem>
            <para>The instance will use LVM devices as the backend for its
            disks. No redundancy is provided.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>local_raid1</term>
          <listitem>
            <para>A local mirror is set up between LVM devices to back the
            instance. This provides some redundancy for the instance's
            data.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>remote_raid1</term>
          <listitem>
            <simpara><emphasis role="strong">Note:</emphasis> This is only
            valid for multi-node clusters using drbd 0.7.</simpara>
            <simpara>
              A mirror is set up between the local node and a remote one,
              which must be specified as the second value of the
              <option>--node</option> option. Use this option to obtain a
              highly available instance that can be failed over to the
              remote node should the primary one fail.
            </simpara>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term>drbd</term>
          <listitem>
            <simpara><emphasis role="strong">Note:</emphasis> This is only
            valid for multi-node clusters using drbd 8.0.</simpara>
            <simpara>
              This is similar to the <literal>remote_raid1</literal>
              template, but uses new features of drbd 8 to simplify the
              device stack. From a user's point of view, this improves the
              speed of the <command>replace-disks</command> command and (in
              future versions) will provide more functionality.
            </simpara>
          </listitem>
        </varlistentry>

      </variablelist>

      <para>
        For example, if you want to create a highly available instance,
        use the remote_raid1 or drbd disk templates:
        <synopsis>gnt-instance add -n <replaceable>TARGET_NODE</replaceable><optional>:<replaceable>SECONDARY_NODE</replaceable></optional> -o <replaceable>OS_TYPE</replaceable> -t remote_raid1 \
  <replaceable>INSTANCE_NAME</replaceable></synopsis>
      </para>
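
      <para>
        As a concrete sketch (the node, OS and instance names below are
        hypothetical examples, and the exact option value formats should
        be checked against the <command>gnt-instance</command> man page),
        a plain, non-redundant instance with a 10 GB disk and 512 MB of
        memory could be created with:
        <screen>
gnt-instance add -n node1.example.com -o debian-etch -t plain \
  -s 10g -m 512 instance1.example.com
        </screen>
      </para>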

      <para>
        To see which operating systems your cluster supports, you can use:
        <synopsis>gnt-os list</synopsis>
      </para>

      <para>
        Removing an instance is even easier than creating one. This
        operation is irreversible and destroys all the contents of your
        instance. Use with care:

        <synopsis>gnt-instance remove <replaceable>INSTANCE_NAME</replaceable></synopsis>
      </para>
    </sect2>

    <sect2>
      <title>Starting/Stopping an instance</title>

      <para>
        Instances are automatically started at instance creation time. To
        manually start one which is currently stopped, you can run:

        <synopsis>gnt-instance startup <replaceable>INSTANCE_NAME</replaceable></synopsis>

        while the command to stop one is:

        <synopsis>gnt-instance shutdown <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The command to see all the configured instances and their status
        is:

        <synopsis>gnt-instance list</synopsis>
      </para>
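
      <para>
        For example, to cleanly restart a hypothetical instance named
        <literal>instance1.example.com</literal> (the name is purely
        illustrative):
        <screen>
gnt-instance shutdown instance1.example.com
gnt-instance startup instance1.example.com
        </screen>
      </para>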

      <para>
        Do not use the Xen commands to stop instances. If you run, for
        example, <command>xm shutdown</command> or <command>xm
        destroy</command> on an instance, Ganeti will automatically
        restart it (via the
        <citerefentry><refentrytitle>ganeti-watcher</refentrytitle>
        <manvolnum>8</manvolnum></citerefentry>).
      </para>

    </sect2>

    <sect2>
      <title>Exporting/Importing an instance</title>

      <para>
        You can create a snapshot of an instance's disk and its Ganeti
        configuration, which you can then back up, or import into another
        cluster. The way to export an instance is:

        <synopsis>gnt-backup export -n <replaceable>TARGET_NODE</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        The target node can be any node in the cluster with enough space
        under <filename class="directory">/srv/ganeti</filename> to hold
        the instance image. Use the <option>--noshutdown</option> option
        to snapshot an instance without rebooting it. Any previous
        snapshots of the same instance that exist cluster-wide under
        <filename class="directory">/srv/ganeti</filename> will be removed
        by this operation; if you want to keep them, move them out of the
        Ganeti exports directory.
      </para>

      <para>
        Importing an instance is similar to creating a new one. The
        command is:

        <synopsis>gnt-backup import -n <replaceable>TARGET_NODE</replaceable> -t <replaceable>DISK_TEMPLATE</replaceable> --src-node=<replaceable>NODE</replaceable> --src-dir=<replaceable>DIR</replaceable> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        Most of the options available for the command
        <command>gnt-instance add</command> are supported here too.
      </para>
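
      <para>
        As a rough sketch (all host names and the export directory below
        are assumptions for illustration; check where <command>gnt-backup
        export</command> actually placed the image), exporting an instance
        and later re-importing it might look like:
        <screen>
gnt-backup export -n node3.example.com instance1.example.com
gnt-backup import -n node2.example.com -t plain \
  --src-node=node3.example.com --src-dir=/srv/ganeti/export \
  instance1.example.com
        </screen>
      </para>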
    </sect2>

  </sect1>

  <sect1>
    <title>High availability features</title>

    <note>
      <simpara>This section only applies to multi-node clusters.</simpara>
    </note>

    <sect2>
      <title>Failing over an instance</title>

      <para>
        If an instance is built in highly available mode, you can at any
        time fail it over to its secondary node, even if the primary has
        somehow failed and is no longer up. Doing so is really easy; on
        the master node, just run:

        <synopsis>gnt-instance failover <replaceable>INSTANCE_NAME</replaceable></synopsis>

        That's it. After the command completes, the secondary node is now
        the primary, and vice versa.
      </para>
    </sect2>

    <sect2>
      <title>Replacing an instance's disks</title>

      <para>
        What if, instead, the secondary node for an instance has failed?
        Or what if you plan to remove a node from your cluster, have
        failed over all its instances, but it is still the secondary for
        some? The solution here is to replace the instance's disks,
        changing the secondary node. The command depends on the disk
        template type. For <literal>remote_raid1</literal>:

        <synopsis>gnt-instance replace-disks <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        and for <literal>drbd</literal>:

        <synopsis>gnt-instance replace-disks <option>-s</option> <option>-n <replaceable>NEW_SECONDARY</replaceable></option> <replaceable>INSTANCE_NAME</replaceable></synopsis>

        This process takes a bit longer, but involves no instance
        downtime; at the end of it the instance has a new secondary node,
        to which it can be failed over if necessary.
      </para>
    </sect2>
    <sect2>
      <title>Failing over the master node</title>

      <para>
        This is all good as long as the Ganeti master node is up. Should
        it go down, or should you wish to decommission it, just run the
        following command on any other node:

        <synopsis>gnt-cluster masterfailover</synopsis>

        and the node you ran it on is now the new master.
      </para>
    </sect2>
    <sect2>
      <title>Adding/Removing nodes</title>

      <para>
        And of course, now that you know how to move instances around,
        it's easy to free up a node, and then you can remove it from the
        cluster:

        <synopsis>gnt-node remove <replaceable>NODE_NAME</replaceable></synopsis>

        and maybe add a new one:

        <synopsis>gnt-node add <optional><option>--secondary-ip=<replaceable>ADDRESS</replaceable></option></optional> <replaceable>NODE_NAME</replaceable></synopsis>
      </para>
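
      <para>
        For instance, adding a hypothetical node with a separate secondary
        (replication) address might look like this (both values below are
        illustrative, not defaults):
        <screen>
gnt-node add --secondary-ip=192.168.1.4 node4.example.com
        </screen>
      </para>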
    </sect2>
  </sect1>

  <sect1>
    <title>Debugging Features</title>

    <para>
      At some point you might need to do some debugging operations on
      your cluster or on your instances. This section will help you with
      the most commonly used debugging functionality.
    </para>

    <sect2>
      <title>Accessing an instance's disks</title>

      <para>
        From an instance's primary node you have access to its disks.
        Never ever mount the underlying logical volume manually on a
        fault-tolerant instance, or you risk breaking replication. The
        correct way to access them is to run the command:

        <synopsis>gnt-instance activate-disks <replaceable>INSTANCE_NAME</replaceable></synopsis>

        and then access the device that gets created. After you've
        finished, you can deactivate them with the
        <command>deactivate-disks</command> command, which works in the
        same way.
      </para>
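
      <para>
        A minimal sketch with a hypothetical instance name (rely on the
        device paths actually printed by <command>activate-disks</command>
        rather than guessing them):
        <screen>
gnt-instance activate-disks instance1.example.com
# ... inspect the device(s) listed in the command's output ...
gnt-instance deactivate-disks instance1.example.com
        </screen>
      </para>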
    </sect2>

    <sect2>
      <title>Accessing an instance's console</title>

      <para>
        The command to access a running instance's console is:

        <synopsis>gnt-instance console <replaceable>INSTANCE_NAME</replaceable></synopsis>

        Use the console normally, and then type
        <userinput>^]</userinput> when done, to exit.
      </para>
    </sect2>

    <sect2>
      <title>Debugging instance OS definitions</title>

      <para>
        Should you have any problems with operating system support, the
        command to run to see a complete status for all your nodes is:

        <synopsis>gnt-os diagnose</synopsis>
      </para>

    </sect2>

    <sect2>
      <title>Cluster-wide debugging</title>

      <para>
        The <command>gnt-cluster</command> command offers several options
        to run tests or execute cluster-wide operations. For example:

        <screen>
gnt-cluster command
gnt-cluster copyfile
gnt-cluster verify
gnt-cluster getmaster
gnt-cluster version
        </screen>

        See the man page <citerefentry>
        <refentrytitle>gnt-cluster</refentrytitle>
        <manvolnum>8</manvolnum></citerefentry> to learn more about their
        usage.
      </para>
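
      <para>
        As a small sketch, <command>gnt-cluster command</command> runs a
        shell command on all nodes and <command>gnt-cluster
        copyfile</command> copies a file to all nodes (the file below is a
        hypothetical example):
        <screen>
gnt-cluster command date
gnt-cluster copyfile /etc/hosts
        </screen>
      </para>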
    </sect2>

  </sect1>

</article>