Ganeti administrator's guide
============================

Documents Ganeti version |version|

.. contents::

Introduction
------------

Ganeti is cluster virtualization management software. You are
expected to be a system administrator familiar with your Linux
distribution and the Xen or KVM virtualization environments before
using it.

The various components of Ganeti all have man pages and interactive
help. This manual, though, will help you become familiar with the
system by explaining the most common operations, grouped by related
use.

After a terminology glossary and a section on the prerequisites needed
to use this manual, the rest of this document is divided into three
main sections, which group different features of Ganeti:

- Instance Management
- High Availability Features
- Debugging Features

Ganeti terminology
~~~~~~~~~~~~~~~~~~

This section provides a short introduction to Ganeti terminology,
which may be useful when reading the rest of the document.

Cluster
  A set of machines (nodes) that cooperate to offer a coherent,
  highly available virtualization service.

Node
  A physical machine which is a member of a cluster.
  Nodes are the basic cluster infrastructure, and are
  not fault tolerant.

Master node
  The node which controls the cluster, and from which all
  Ganeti commands must be given.

Instance
  A virtual machine which runs on a cluster. It can be a
  fault-tolerant, highly available entity.

Pool
  A pool is a set of clusters sharing the same network.

Meta-Cluster
  Anything that concerns more than one cluster.

Prerequisites
~~~~~~~~~~~~~

You need to have your Ganeti cluster installed and configured before
you try any of the commands in this document. Please follow the
*Ganeti installation tutorial* for instructions on how to do that.

Managing Instances
------------------

Adding/Removing an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new virtual instance to your Ganeti cluster, run::

  gnt-instance add \
    -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t DISK_TEMPLATE \
    INSTANCE_NAME

The instance name must be resolvable (e.g. exist in DNS), usually
to an address in the same subnet as the cluster itself. Options you
can give to this command include:

- The disk size (``-s``) for a single-disk instance, or multiple
  ``--disk N:size=SIZE`` options for multi-disk instances

- The memory size (``-B memory``)

- The number of virtual CPUs (``-B vcpus``)

- Arguments for the NICs of the instance; by default, a single-NIC
  instance is created. The IP and/or bridge of the NIC can be changed
  via ``--nic 0:ip=IP,bridge=BRIDGE``
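
The options above can be combined in a single call. The following is a
minimal shell sketch, not part of Ganeti itself: the ``add_instance``
and ``run`` helpers, the node, OS and instance names, and the value
formats passed to ``-s`` and ``-B`` are assumptions made for
illustration; check the :manpage:`gnt-instance` man page for the
formats your version accepts. With ``DRY_RUN`` left at its default the
command is only printed, not executed.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# add_instance NODES OS TEMPLATE DISK_SIZE MEMORY VCPUS NAME
# NODES is "primary:secondary" for drbd, or a single node otherwise.
# The "memory=...,vcpus=..." form of -B is an assumption of this sketch.
add_instance() {
    run gnt-instance add \
        -n "$1" -o "$2" -t "$3" \
        -s "$4" -B "memory=$5,vcpus=$6" \
        "$7"
}
```

For instance, ``add_instance node1.example.com:node2.example.com
debootstrap drbd 10G 512 2 instance1.example.com`` prints the resulting
``gnt-instance add`` command line so it can be reviewed before use.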

There are four types of disk template you can choose from:

diskless
  The instance has no disks. Only used for special-purpose operating
  systems or for testing.

file
  The instance will use plain files as backend for its disks. No
  redundancy is provided, and this is somewhat more difficult to
  configure for high performance.

plain
  The instance will use LVM devices as backend for its disks. No
  redundancy is provided.

drbd
  .. note:: This is only valid for multi-node clusters using DRBD 8.0.x

  A mirror is set up between the local node and a remote one, which
  must be specified with the second value of the ``--node`` option. Use
  this option to obtain a highly available instance that can be failed
  over to a remote node should the primary one fail.

For example, if you want to create a highly available instance, use the
drbd disk template::

  gnt-instance add -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t drbd \
    INSTANCE_NAME

To see which operating systems your cluster supports, use
the command::

  gnt-os list

Removing an instance is even easier than creating one. This operation
is irreversible and destroys all the contents of your instance. Use
with care::

  gnt-instance remove INSTANCE_NAME

Starting/Stopping an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instances are automatically started at instance creation time. To
manually start one which is currently stopped, you can run::

  gnt-instance startup INSTANCE_NAME

The command to stop one is::

  gnt-instance shutdown INSTANCE_NAME

The command to see all the instances configured and their status is::

  gnt-instance list

Do not use the Xen commands to stop instances. If you run, for example,
``xm shutdown`` or ``xm destroy`` on an instance, Ganeti will
automatically restart it (via the ``ganeti-watcher``).
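
Restarting an instance through Ganeti, rather than through the Xen
tools, can be sketched by chaining the two commands above. The
``restart_instance`` and ``run`` helpers are hypothetical names used
only for this illustration; by default the commands are printed rather
than executed.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# restart_instance INSTANCE_NAME
# Stops the instance via Ganeti and starts it again only if the
# shutdown succeeded.
restart_instance() {
    run gnt-instance shutdown "$1" && run gnt-instance startup "$1"
}
```

Calling ``restart_instance instance1.example.com`` keeps Ganeti's view
of the instance consistent, so the watcher has no reason to intervene.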

Exporting/Importing an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can create a snapshot of an instance's disk and Ganeti
configuration, which you can then back up or import into another
cluster. The way to export an instance is::

  gnt-backup export -n TARGET_NODE INSTANCE_NAME

The target node can be any node in the cluster with enough space under
``/srv/ganeti`` to hold the instance image. Use the ``--noshutdown``
option to snapshot an instance without rebooting it. Any previous
snapshot of the same instance existing cluster-wide under
``/srv/ganeti`` will be removed by this operation: if you want to keep
them, move them out of the Ganeti exports directory first.

Importing an instance is similar to creating a new one. The command is::

  gnt-backup import -n TARGET_NODE -t DISK_TEMPLATE \
    --src-node=NODE --src-dir=DIR INSTANCE_NAME

Most of the options available for the command :command:`gnt-instance
add` are supported here too.
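
The export and import commands above can be wrapped as a pair of
helpers. This is only a sketch under stated assumptions: the helper
names (``run``, ``export_instance``, ``import_instance``) are
hypothetical, and the source directory is passed in by the caller
because its exact location under ``/srv/ganeti`` depends on your
installation.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# export_instance TARGET_NODE INSTANCE_NAME
export_instance() {
    run gnt-backup export -n "$1" "$2"
}

# import_instance TARGET_NODE DISK_TEMPLATE SRC_NODE SRC_DIR INSTANCE_NAME
import_instance() {
    run gnt-backup import -n "$1" -t "$2" \
        --src-node="$3" --src-dir="$4" "$5"
}
```

Run the first helper on the cluster that currently hosts the instance
and the second on the cluster that should receive it; how the exported
files travel between the two is outside the scope of this sketch.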

High availability features
--------------------------

.. note:: This section only applies to multi-node clusters

Failing over an instance
~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode, you can at any time
fail it over to its secondary node, even if the primary has somehow
failed and is not up anymore. To do so, run on the master node::

  gnt-instance failover INSTANCE_NAME

That's it. After the command completes, the secondary node is now the
primary, and vice versa.

Live migrating an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode, is currently
running, and both its nodes are running fine, you can migrate it over
to its secondary node without downtime. On the master node you need to
run::

  gnt-instance migrate INSTANCE_NAME
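
The choice between failing over and live migrating can be written down
as a small helper. This is a sketch only: ``move_to_secondary`` and
``run`` are hypothetical names, and the second argument is a flag
supplied by the operator; the sketch does not query the cluster state
itself.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# move_to_secondary INSTANCE_NAME HEALTHY
# HEALTHY is "yes" when the instance is running and both of its nodes
# are up; in that case a live migration avoids downtime. Anything else
# falls back to a failover.
move_to_secondary() {
    if [ "$2" = "yes" ]; then
        run gnt-instance migrate "$1"
    else
        run gnt-instance failover "$1"
    fi
}
```

For example, ``move_to_secondary instance1.example.com no`` selects
the failover command, which is the only option once the primary node
is down.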

Replacing an instance's disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What if, instead, the secondary node for an instance has failed, or
you plan to remove a node from your cluster and have failed over all
its instances, but it is still secondary for some? The solution here is
to replace the instance disks, changing the secondary node::

  gnt-instance replace-disks -n NODE INSTANCE_NAME

This process takes a while but involves no instance downtime, and at
the end of it the instance has changed its secondary node, to which it
can if necessary be failed over.

Failing over the master node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is all good as long as the Ganeti Master Node is up. Should it go
down, or should you wish to decommission it, just run on any other
node the command::

  gnt-cluster masterfailover

and the node you ran it on is now the new master.

Adding/Removing nodes
~~~~~~~~~~~~~~~~~~~~~

Now that you know how to move instances around, it is
easy to free up a node, and then you can remove it from the cluster::

  gnt-node remove NODE_NAME

and maybe add a new one::

  gnt-node add --secondary-ip=ADDRESS NODE_NAME
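
Freeing up a node before removing it combines the failover and
replace-disks commands described earlier. The following is a sketch
under stated assumptions: the ``evacuate_node`` and ``run`` helpers are
hypothetical, the two instance lists are supplied by the operator, and
the new secondary node must not be the primary of any listed instance.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# evacuate_node OLD_NODE NEW_SECONDARY "PRIMARY_INSTS" "SECONDARY_INSTS"
# PRIMARY_INSTS: instances whose primary node is OLD_NODE.
# SECONDARY_INSTS: instances whose secondary node is OLD_NODE.
evacuate_node() {
    old_node=$1
    new_secondary=$2
    # Step 1: fail over instances running on the old node; afterwards
    # the old node is only their secondary.
    for inst in $3; do
        run gnt-instance failover "$inst"
    done
    # Step 2: move every secondary off the old node.
    for inst in $3 $4; do
        run gnt-instance replace-disks -n "$new_secondary" "$inst"
    done
    # Step 3: the node no longer holds any instance data.
    run gnt-node remove "$old_node"
}
```

Reviewing the printed commands first is advisable, since a failover
implies downtime for the affected instances.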

Debugging Features
------------------

At some point you might need to do some debugging operations on your
cluster or on your instances. This section will help you with the most
commonly used debugging functionality.

Accessing an instance's disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From an instance's primary node you have access to its disks. Never
ever mount the underlying logical volume manually on a fault tolerant
instance, or you risk breaking replication. The correct way to access
them is to run the command::

  gnt-instance activate-disks INSTANCE_NAME

And then access the device that gets created. After you've finished
you can deactivate them with the ``deactivate-disks`` command, which
works in the same way.
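
To make sure the disks are always deactivated again, the two commands
can be wrapped around whatever inspection you need to do. This is a
minimal sketch: ``with_disks`` and ``run`` are hypothetical helpers,
and the inspection command is entirely up to the caller.

```shell
#!/bin/sh
# Print Ganeti commands by default; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# with_disks INSTANCE_NAME CMD [ARGS...]
# Activates the disks, runs CMD, then deactivates the disks again even
# if CMD failed. Returns the exit status of CMD.
with_disks() {
    inst=$1
    shift
    run gnt-instance activate-disks "$inst" || return 1
    "$@"    # caller-supplied inspection command, always executed
    rc=$?
    run gnt-instance deactivate-disks "$inst"
    return $rc
}
```

Note that the caller's command is executed even in dry-run mode; only
the Ganeti commands are printed instead of run.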

Accessing an instance's console
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The command to access a running instance's console is::

  gnt-instance console INSTANCE_NAME

Use the console normally and then type ``^]`` when
done, to exit.

Instance OS definitions debugging
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Should you have any problems with operating system support, the
command to run to see a complete status for all your nodes is::

  gnt-os diagnose

Cluster-wide debugging
~~~~~~~~~~~~~~~~~~~~~~

The :command:`gnt-cluster` command offers several options to run tests
or execute cluster-wide operations. For example::

  gnt-cluster command
  gnt-cluster copyfile
  gnt-cluster verify
  gnt-cluster verify-disks
  gnt-cluster getmaster
  gnt-cluster version

See the man page :manpage:`gnt-cluster` to learn more about their usage.
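
As an illustration, some of these subcommands can be wrapped as
follows. The helper names are hypothetical, and the way arguments are
passed to ``gnt-cluster command`` (a command to run on the nodes) and
``gnt-cluster copyfile`` (a file to distribute) is an assumption of
this sketch that should be confirmed against the man page.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# cluster_check: show the master, then verify the cluster and its
# disks, stopping at the first command that fails.
cluster_check() {
    run gnt-cluster getmaster &&
    run gnt-cluster verify &&
    run gnt-cluster verify-disks
}

# on_all_nodes CMD [ARGS...]: run a command cluster-wide.
on_all_nodes() {
    run gnt-cluster command "$@"
}

# sync_file FILE: distribute a file to the nodes.
sync_file() {
    run gnt-cluster copyfile "$1"
}
```

For example, ``on_all_nodes uptime`` followed by ``cluster_check``
gives a quick first impression of the state of the cluster.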

Removing a cluster entirely
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The usual method to clean up a cluster is to run ``gnt-cluster
destroy``; however, if the Ganeti installation is broken in any way
then this will not run.

It is possible in such a case to clean up manually most if not all
traces of a cluster installation by following these steps on all of
the nodes:

1. Shut down all instances. This depends on the virtualisation
   method used (Xen, KVM, etc.):

   - Xen: run ``xm list`` and ``xm destroy`` on all the non-Domain-0
     instances
   - KVM: kill all the KVM processes
   - chroot: kill all processes under the chroot mountpoints

2. If using DRBD, shut down all DRBD minors (which should by this
   time no longer be in use by instances); on each node, run
   ``drbdsetup /dev/drbdN down`` for each active DRBD minor.

3. If using LVM, clean up the Ganeti volume group; if only Ganeti
   created logical volumes (and you are not sharing the volume group
   with the OS, for example), then simply running ``lvremove -f
   xenvg`` (replace 'xenvg' with your volume group name) should do the
   required cleanup.

4. If using file-based storage, remove recursively all files and
   directories under your file-storage directory: ``rm -rf
   /srv/ganeti/file-storage/*`` replacing the path with the correct
   path for your cluster.

5. Stop the ganeti daemons (``/etc/init.d/ganeti stop``) and kill any
   that remain alive (``pgrep ganeti`` and ``pkill ganeti``).

6. Remove the ganeti state directory (``rm -rf /var/lib/ganeti/*``),
   replacing the path with the correct path for your installation.
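
The six steps above can be collected into a per-node script. What
follows is a sketch for a Xen node using DRBD, LVM and file-based
storage, not a supported tool: ``cleanup_node`` and ``run`` are
hypothetical helpers, and the lists of Xen domains and DRBD minors are
assumed to be gathered by the operator beforehand (for example from
``xm list``). Commands are only printed unless ``DRY_RUN=0`` is set.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to really execute them.
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: hypothetical helper, prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# cleanup_node VG FILE_STORAGE_DIR STATE_DIR "XEN_DOMAINS" "DRBD_MINORS"
cleanup_node() {
    # ${N:?} aborts on an empty argument, so an unset directory can
    # never turn "rm -rf DIR/*" into "rm -rf /*".
    vg=${1:?volume group required}
    storage=${2:?file storage directory required}
    state=${3:?state directory required}
    # 1. Destroy the listed (non-Domain-0) Xen instances.
    for dom in $4; do
        run xm destroy "$dom"
    done
    # 2. Shut down the listed DRBD minors.
    for minor in $5; do
        run drbdsetup "/dev/drbd$minor" down
    done
    # 3. Remove the logical volumes of the Ganeti volume group.
    run lvremove -f "$vg"
    # 4. Remove everything under the file-storage directory.
    run rm -rf "$storage"/*
    # 5. Stop the daemons and kill any that remain.
    run /etc/init.d/ganeti stop
    run pkill ganeti
    # 6. Remove the Ganeti state.
    run rm -rf "$state"/*
}
```

A call such as ``cleanup_node xenvg /srv/ganeti/file-storage
/var/lib/ganeti "instance1" "0 1"`` prints the full command sequence;
only switch off dry-run mode once that list looks right, as every step
is destructive.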

On the master node, remove the cluster from the master-netdev (usually
``xen-br0`` for bridged mode, otherwise ``eth0`` or similar), by
running ``ip a del $clusterip/32 dev xen-br0`` (use the correct
cluster IP and network device name).

At this point, the machines are ready for a cluster creation; in case
you want to remove Ganeti completely, you also need to remove the log
and data directories and undo some of the SSH changes:

- ``rm -rf /var/log/ganeti /srv/ganeti`` (replace with the correct
  paths)
- remove from ``/root/.ssh`` the keys that Ganeti added (check
  the ``authorized_keys`` and ``id_dsa`` files)
- regenerate the host's SSH keys (check the OpenSSH startup scripts)
- uninstall Ganeti

Otherwise, if you plan to re-create the cluster, you can just go ahead
and rerun ``gnt-cluster init``.

.. vim: set textwidth=72 :