Ganeti administrator's guide
============================

Documents Ganeti version |version|

.. contents::

Introduction
------------

Ganeti is virtualization cluster management software. You are expected
to be a system administrator familiar with your Linux distribution and
with the Xen or KVM virtualization environments before using it.

The various components of Ganeti all have man pages and interactive
help. This manual, though, will help you get familiar with the system
by explaining the most common operations, grouped by related use.

After a terminology glossary and a section on the prerequisites needed
to use this manual, the rest of this document is divided into three
main sections, which group different features of Ganeti:

- Instance Management
- High Availability Features
- Debugging Features

Ganeti terminology
~~~~~~~~~~~~~~~~~~

This section provides a small introduction to Ganeti terminology,
which may be useful when reading the rest of the document.

Cluster
  A set of machines (nodes) that cooperate to offer a coherent, highly
  available virtualization service.

Node
  A physical machine which is a member of a cluster. Nodes are the
  basic cluster infrastructure, and are not fault tolerant.

Master node
  The node which controls the cluster, and from which all Ganeti
  commands must be issued.

Instance
  A virtual machine which runs on a cluster. It can be a fault
  tolerant, highly available entity.

Pool
  A set of clusters sharing the same network.

Meta-Cluster
  Anything that concerns more than one cluster.

Prerequisites
~~~~~~~~~~~~~

You need to have your Ganeti cluster installed and configured before
you try any of the commands in this document. Please follow the
*Ganeti installation tutorial* for instructions on how to do that.

Managing Instances
------------------

Adding/Removing an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Adding a new virtual instance to your Ganeti cluster is really easy.
The command is::

  gnt-instance add \
    -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t DISK_TEMPLATE \
    INSTANCE_NAME

The instance name must be resolvable (e.g. exist in DNS) and should
usually resolve to an address in the same subnet as the cluster
itself. Options you can give to this command include:

- The disk size (``-s``) for a single-disk instance, or multiple
  ``--disk N:size=SIZE`` options for multi-disk instances

- The memory size (``-B memory``)

- The number of virtual CPUs (``-B vcpus``)

- Arguments for the NICs of the instance; by default, a single-NIC
  instance is created. The IP and/or bridge of the NIC can be changed
  via ``--nic 0:ip=IP,bridge=BRIDGE``

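Putting these options together, a sketch of a full invocation follows.
All names are illustrative placeholders (hostnames, OS name, sizes,
bridge), not defaults, and the combined ``-B`` key=value syntax is an
assumption to verify against your ``gnt-instance`` man page; the
leading ``echo`` makes this a dry run.

```shell
# Illustrative placeholders throughout; drop the leading 'echo' to
# actually submit the creation on a real cluster.
echo gnt-instance add \
  -n node1.example.com:node2.example.com \
  -o debootstrap -t drbd \
  -s 10G -B memory=512M,vcpus=2 \
  --nic 0:ip=192.0.2.10,bridge=xen-br0 \
  instance1.example.com
```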
There are four types of disk template you can choose from:

diskless
  The instance has no disks. Only used for special-purpose operating
  systems or for testing.

file
  The instance will use plain files as the backend for its disks. No
  redundancy is provided, and this is somewhat more difficult to
  configure for high performance.

plain
  The instance will use LVM devices as the backend for its disks. No
  redundancy is provided.

drbd
  .. note:: This is only valid for multi-node clusters using DRBD 8.0.x

  A mirror is set up between the local node and a remote one, which
  must be specified as the second value of the ``--node`` option. Use
  this option to obtain a highly available instance that can be failed
  over to the remote node should the primary one fail.

For example, if you want to create a highly available instance, use
the drbd disk template::

  gnt-instance add -n TARGET_NODE:SECONDARY_NODE -o OS_TYPE -t drbd \
    INSTANCE_NAME

To see which operating systems your cluster supports you can use the
command::

  gnt-os list

Removing an instance is even easier than creating one. This operation
is irreversible and destroys all the contents of your instance. Use
with care::

  gnt-instance remove INSTANCE_NAME

Starting/Stopping an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instances are automatically started at instance creation time. To
manually start one which is currently stopped you can run::

  gnt-instance startup INSTANCE_NAME

While the command to stop one is::

  gnt-instance shutdown INSTANCE_NAME

The command to see all the instances configured and their status is::

  gnt-instance list

Do not use the Xen commands to stop instances. If you run, for
example, ``xm shutdown`` or ``xm destroy`` on an instance, Ganeti will
automatically restart it (via the ``ganeti-watcher``).

Exporting/Importing an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can create a snapshot of an instance's disk and Ganeti
configuration, which you can then back up, or import into another
cluster. The way to export an instance is::

  gnt-backup export -n TARGET_NODE INSTANCE_NAME

The target node can be any node in the cluster with enough space under
``/srv/ganeti`` to hold the instance image. Use the *--noshutdown*
option to snapshot an instance without rebooting it. Any previous
snapshot of the same instance existing cluster-wide under
``/srv/ganeti`` will be removed by this operation; if you want to keep
previous snapshots, move them out of the Ganeti exports directory
first.

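For instance, a minimal sketch of moving a previous export aside
before taking a new one. The export path layout under
``/srv/ganeti/export`` and the instance name are assumptions; verify
where your cluster actually stores its exports.

```shell
# Hypothetical sketch: archive an existing export so the next
# 'gnt-backup export' does not delete it. Path layout and instance
# name are assumptions; adjust for your cluster.
instance=instance1.example.com
export_dir=/srv/ganeti/export
if [ -d "$export_dir/$instance" ]; then
  mv "$export_dir/$instance" "$export_dir/$instance.$(date +%Y%m%d)"
fi
```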
Importing an instance is similar to creating a new one. The command is::

  gnt-backup import -n TARGET_NODE -t DISK_TEMPLATE \
    --src-node=NODE --src-dir=DIR INSTANCE_NAME

Most of the options available for the command :command:`gnt-instance
add` are supported here too.

High availability features
--------------------------

.. note:: This section only applies to multi-node clusters

Failing over an instance
~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode you can at any time
fail it over to its secondary node, even if the primary has somehow
failed and is no longer up. Doing so is really easy; on the master
node you can just run::

  gnt-instance failover INSTANCE_NAME

That's it. After the command completes the secondary node is the new
primary, and vice versa.

Live migrating an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode, is currently
running, and both its nodes are running fine, you can migrate it over
to its secondary node, without downtime. On the master node you need
to run::

  gnt-instance migrate INSTANCE_NAME

Replacing an instance's disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So what if instead the secondary node for an instance has failed, or
you plan to remove a node from your cluster, and you have failed over
all its instances, but it is still the secondary for some? The
solution here is to replace the instance's disks, changing the
secondary node::

  gnt-instance replace-disks -n NODE INSTANCE_NAME

This process is a bit long, but involves no instance downtime; at the
end of it the instance has changed its secondary node, to which it can
be failed over if necessary.

Failing over the master node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is all good as long as the Ganeti master node is up. Should it go
down, or should you wish to decommission it, just run on any other
node the command::

  gnt-cluster masterfailover

and the node you ran it on is now the new master.

Adding/Removing nodes
~~~~~~~~~~~~~~~~~~~~~

And of course, now that you know how to move instances around, it's
easy to free up a node, after which you can remove it from the
cluster::

  gnt-node remove NODE_NAME

and maybe add a new one::

  gnt-node add --secondary-ip=ADDRESS NODE_NAME

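The drain-then-remove sequence above can be sketched as follows. All
hostnames and the instance-to-node mapping are hypothetical; on a real
cluster you would derive the instance lists from ``gnt-instance list``
output, and the ``echo`` prefixes make this a dry run.

```shell
# Dry-run sketch of draining node2 before removing it. All names are
# hypothetical; derive the real lists from 'gnt-instance list'.
# Drop the 'echo' prefixes to execute the commands.
node=node2.example.com
spare=node3.example.com
# Instances whose primary is $node: fail them over to their secondaries.
for inst in instance1.example.com instance2.example.com; do
  echo gnt-instance failover "$inst"
done
# Instances whose secondary is $node: move the secondary to $spare.
for inst in instance3.example.com; do
  echo gnt-instance replace-disks -n "$spare" "$inst"
done
echo gnt-node remove "$node"
```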
Debugging Features
------------------

At some point you might need to do some debugging operations on your
cluster or on your instances. This section will help you with the most
commonly used debugging functionality.

Accessing an instance's disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From an instance's primary node you have access to its disks. Never
ever mount the underlying logical volume manually on a fault-tolerant
instance, or you risk breaking the replication. The correct way to
access them is to run the command::

  gnt-instance activate-disks INSTANCE_NAME

and then access the device that gets created. After you've finished,
you can deactivate the disks with the ``deactivate-disks`` command,
which works in the same way.

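The command prints the devices it created; a hedged sketch of pulling
the device path out of one line of that output follows. The
``node:disk-name:device`` line format is an assumption that may vary
between Ganeti versions, and the sample line stands in for real
output.

```shell
# Hypothetical sketch: extract the block device from a line of
# 'gnt-instance activate-disks' output. The 'node:name:device' format
# is an assumption; check your Ganeti version. Inspect the device
# read-only; never mount a replicated disk manually.
sample="node1.example.com:disk/0:/dev/drbd0"
dev=$(printf '%s\n' "$sample" | awk -F: '{print $3}')
echo "$dev"
```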
Accessing an instance's console
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The command to access a running instance's console is::

  gnt-instance console INSTANCE_NAME

Use the console normally and then type ``^]`` when done, to exit.

Instance OS definitions debugging
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Should you have any problems with operating system support, the
command to run to see a complete status of all your nodes is::

   gnt-os diagnose

Cluster-wide debugging
~~~~~~~~~~~~~~~~~~~~~~

The :command:`gnt-cluster` command offers several options to run tests
or execute cluster-wide operations. For example::

  gnt-cluster command
  gnt-cluster copyfile
  gnt-cluster verify
  gnt-cluster verify-disks
  gnt-cluster getmaster
  gnt-cluster version

See the man page :manpage:`gnt-cluster` to learn more about their
usage.

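As an illustration, the first two take arguments: ``gnt-cluster
command`` runs a shell command on every node, and ``gnt-cluster
copyfile`` distributes a file to them. A dry-run sketch, with the file
path as a placeholder:

```shell
# Dry-run illustration; drop the 'echo' prefixes on a real cluster.
echo gnt-cluster command uptime        # run 'uptime' on every node
echo gnt-cluster copyfile /etc/hosts   # push a file to every node
```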
Removing a cluster entirely
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The usual method to clean up a cluster is to run ``gnt-cluster
destroy``; however, if the Ganeti installation is broken in any way,
this will not run.

In such a case it is possible to manually clean up most, if not all,
traces of a cluster installation by following these steps on all of
the nodes:

1. Shut down all instances. This depends on the virtualisation
   method used (Xen, KVM, etc.):

   - Xen: run ``xm list`` and ``xm destroy`` on all the non-Domain-0
     instances
   - KVM: kill all the KVM processes
   - chroot: kill all processes under the chroot mountpoints

2. If using DRBD, shut down all DRBD minors (which should by this
   time no longer be in use by instances); on each node, run
   ``drbdsetup /dev/drbdN down`` for each active DRBD minor.

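Step 2 can be sketched as a small loop. The list of active minors is
hard-coded here as an assumption; on a real node, read it from
``/proc/drbd`` first, and the ``echo`` makes this a dry run.

```shell
# Dry-run sketch for step 2. Minors 0-2 are assumed; read the real
# list from /proc/drbd on each node. Drop 'echo' to execute.
for minor in 0 1 2; do
  echo drbdsetup /dev/drbd"$minor" down
done
```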
3. If using LVM, cleanup the Ganeti volume group; if only Ganeti
321
   created logical volumes (and you are not sharing the volume group
322
   with the OS, for example), then simply running ``lvremove -f
323
   xenvg`` (replace 'xenvg' with your volume group name) should do the
324
   required cleanup.
325

    
326
4. If using file-based storage, remove recursively all files and
327
   directories under your file-storage directory: ``rm -rf
328
   /srv/ganeti/file-storage/*`` replacing the path with the correct
329
   path for your cluster.
330

    
331
5. Stop the ganeti daemons (``/etc/init.d/ganeti stop``) and kill any
332
   that remain alive (``pgrep ganeti`` and ``pkill ganeti``).
333

    
334
6. Remove the ganeti state directory (``rm -rf /var/lib/ganeti/*``),
335
   replacing the path with the correct path for your installation.
336

    
On the master node, remove the cluster IP address from the
master-netdev (usually ``xen-br0`` for bridged mode, otherwise
``eth0`` or similar), by running ``ip a del $clusterip/32 dev
xen-br0`` (use the correct cluster IP and network device name).

At this point, the machines are ready for a cluster creation; in case
you want to remove Ganeti completely, you need to also undo some of
the SSH changes and log directories:

- ``rm -rf /var/log/ganeti /srv/ganeti`` (replace with the correct
  paths)
- remove from ``/root/.ssh`` the keys that Ganeti added (check the
  ``authorized_keys`` and ``id_dsa`` files)
- regenerate the host's SSH keys (check the OpenSSH startup scripts)
- uninstall Ganeti

Otherwise, if you plan to re-create the cluster, you can just go ahead
and rerun ``gnt-cluster init``.