Ganeti administrator's guide
2
============================
3

    
4
Documents Ganeti version |version|
5

    
6
.. contents::
7

    
8
.. highlight:: shell-example
9

    
10
Introduction
11
------------
12

    
13
Ganeti is virtualization cluster management software. You are expected
to be a system administrator familiar with your Linux distribution and
with the Xen or KVM virtualization environments before using it.

The various components of Ganeti all have man pages and interactive
help. This manual, however, will help you get familiar with the system
by explaining the most common operations, grouped by related use.

After a terminology glossary and a section on the prerequisites needed
to use this manual, the rest of this document is divided into sections
for the different targets that a command affects: instances, nodes, etc.

    
25
.. _terminology-label:
26

    
27
Ganeti terminology
28
++++++++++++++++++
29

    
30
This section provides a small introduction to Ganeti terminology, which
31
might be useful when reading the rest of the document.
32

    
33
Cluster
34
~~~~~~~
35

    
36
A set of machines (nodes) that cooperate to offer a coherent, highly
37
available virtualization service under a single administration domain.
38

    
39
Node
40
~~~~
41

    
42
A physical machine which is a member of a cluster. Nodes are the basic
cluster infrastructure, and they don't need to be fault tolerant in
order to achieve high availability for instances.

Nodes can be added and removed (if they host no instances) at will from
the cluster. In an HA cluster, and only with HA instances, the loss of
any single node will not cause disk data loss for any instance; of
course, a node crash will cause the crash of its primary instances.
50

    
51
A node belonging to a cluster can be in one of the following roles at a
52
given time:
53

    
54
- *master* node, which is the node from which the cluster is controlled
55
- *master candidate* node, only nodes in this role have the full cluster
56
  configuration and knowledge, and only master candidates can become the
57
  master node
58
- *regular* node, which is the state in which most nodes will be on
59
  bigger clusters (>20 nodes)
60
- *drained* node, nodes in this state are functioning normally but they
  cannot receive new instances; the intention is that nodes in this role
  have some issue and they are being evacuated for hardware repairs
63
- *offline* node, in which there is a record in the cluster
64
  configuration about the node, but the daemons on the master node will
65
  not talk to this node; any instances declared as having an offline
66
  node as either primary or secondary will be flagged as an error in the
67
  cluster verify operation
68

    
69
Depending on the role, each node will run a set of daemons:
70

    
71
- the :command:`ganeti-noded` daemon, which controls the manipulation of
72
  this node's hardware resources; it runs on all nodes which are in a
73
  cluster
74
- the :command:`ganeti-confd` daemon (Ganeti 2.1+) which runs on all
75
  nodes, but is only functional on master candidate nodes; this daemon
76
  can be disabled at configuration time if you don't need its
77
  functionality
78
- the :command:`ganeti-rapi` daemon which runs on the master node and
79
  offers an HTTP-based API for the cluster
80
- the :command:`ganeti-masterd` daemon which runs on the master node and
81
  allows control of the cluster
82

    
83
Besides the node role, there are other node flags that influence its
behaviour:
85

    
86
- the *master_capable* flag denotes whether the node can ever become a
87
  master candidate; setting this to 'no' means that auto-promotion will
88
  never make this node a master candidate; this flag can be useful for a
89
  remote node that only runs local instances, and having it become a
90
  master is impractical due to networking or other constraints
91
- the *vm_capable* flag denotes whether the node can host instances or
92
  not; for example, one might use a non-vm_capable node just as a master
93
  candidate, for configuration backups; setting this flag to no
94
  disallows placement of instances on this node, deactivates hypervisor
95
  and related checks on it (e.g. bridge checks, LVM check, etc.), and
96
  removes it from cluster capacity computations
97

    
98

    
99
Instance
100
~~~~~~~~
101

    
102
A virtual machine which runs on a cluster. It can be a fault tolerant,
103
highly available entity.
104

    
105
An instance has various parameters, which are classified in three
106
categories: hypervisor related-parameters (called ``hvparams``), general
107
parameters (called ``beparams``) and per network-card parameters (called
108
``nicparams``). All these parameters can be modified either at instance
109
level or via defaults at cluster level.
110

    
111
Disk template
112
~~~~~~~~~~~~~
113

    
114
There are multiple options for the storage provided to an instance; while
115
the instance sees the same virtual drive in all cases, the node-level
116
configuration varies between them.
117

    
118
There are five disk templates you can choose from:
119

    
120
diskless
121
  The instance has no disks. Only used for special purpose operating
122
  systems or for testing.
123

    
124
file
125
  The instance will use plain files as backend for its disks. No
126
  redundancy is provided, and this is somewhat more difficult to
127
  configure for high performance. Note that for security reasons the
128
  file storage directory must be listed under
129
  ``/etc/ganeti/file-storage-paths``, and that file is not copied
130
  automatically to all nodes by Ganeti.
131

    
132
sharedfile
133
  The instance will use plain files as backend, but Ganeti assumes that
134
  those files will be available and in sync automatically on all nodes.
135
  This allows live migration and failover of instances using this
136
  method. As for ``file``, the file storage directory must be listed
  under ``/etc/ganeti/file-storage-paths``, or Ganeti will refuse to
  create instances under it.
139

    
140
plain
141
  The instance will use LVM devices as backend for its disks. No
142
  redundancy is provided.
143

    
144
drbd
145
  .. note:: This is only valid for multi-node clusters using DRBD 8.0+
146

    
147
  A mirror is set between the local node and a remote one, which must be
148
  specified with the second value of the --node option. Use this option
149
  to obtain a highly available instance that can be failed over to a
150
  remote node should the primary one fail.
151

    
152
  .. note:: Ganeti does not support DRBD stacked devices:
153
     DRBD stacked setup is not fully symmetric and as such it is
154
     not working with live migration.
155

    
156
rbd
157
  The instance will use Volumes inside a RADOS cluster as backend for its
158
  disks. It will access them using the RADOS block device (RBD).
159

    
160
ext
161
  The instance will use an external storage provider. See
162
  :manpage:`ganeti-extstorage-interface(7)` for how to implement one.
163

    
164

    
165
IAllocator
166
~~~~~~~~~~
167

    
168
A framework for using external (user-provided) scripts to compute the
169
placement of instances on the cluster nodes. This eliminates the need to
170
manually specify nodes in instance add, instance moves, node evacuate,
171
etc.
172

    
173
In order for Ganeti to be able to use these scripts, they must be placed
174
in the iallocator directory (usually ``lib/ganeti/iallocators`` under
175
the installation prefix, e.g. ``/usr/local``).
176

    
177
“Primary” and “secondary” concepts
178
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
179

    
180
An instance has a primary node and, depending on the disk configuration,
might also have a secondary node. The instance always runs on the
primary node and only uses its secondary node for disk replication.

Similarly, the terms primary and secondary instances, when talking about
a node, refer to the set of instances having the given node as primary
or secondary, respectively.
187

    
188
Tags
189
~~~~
190

    
191
Tags are short strings that can be attached either to the cluster itself,
192
or to nodes or instances. They are useful as a very simplistic
193
information store for helping with cluster administration, for example
194
by attaching owner information to each instance after it's created::
195

    
196
  $ gnt-instance add … %instance1%
197
  $ gnt-instance add-tags %instance1% %owner:user2%
198

    
199
Then, by listing each instance and its tags, this information can be
used for contacting the users of each instance.
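
For example, a minimal sketch of retrieving that information later (the
``owner:`` prefix is only a convention used in this example)::

  $ gnt-instance list -o name,tags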
201

    
202
Jobs and OpCodes
203
~~~~~~~~~~~~~~~~
204

    
205
While not directly visible to an end-user, it's useful to know that a
206
basic cluster operation (e.g. starting an instance) is represented
207
internally by Ganeti as an *OpCode* (abbreviation from operation
208
code). These OpCodes are executed as part of a *Job*. The OpCodes in a
209
single Job are processed serially by Ganeti, but different Jobs will be
210
processed (depending on resource availability) in parallel. They will
211
not be executed in the submission order, but depending on resource
212
availability, locks and (starting with Ganeti 2.3) priority. An earlier
213
job may have to wait for a lock while a newer job doesn't need any locks
214
and can be executed right away. Operations requiring a certain order
215
need to be submitted as a single job, or the client must submit one job
216
at a time and wait for it to finish before continuing.
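
For example, a client that needs strict ordering could submit each job
explicitly and wait for it (a sketch; ``--submit`` makes the command
print the job ID and return immediately)::

  $ gnt-instance shutdown --submit %instance1%
  $ gnt-job watch %JOB_ID%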
217

    
218
For example, shutting down the entire cluster can be done by running the
219
command ``gnt-instance shutdown --all``, which will submit for each
220
instance a separate job containing the “shutdown instance” OpCode.
221

    
222

    
223
Prerequisites
224
+++++++++++++
225

    
226
You need to have your Ganeti cluster installed and configured before you
227
try any of the commands in this document. Please follow the
228
:doc:`install` for instructions on how to do that.
229

    
230
Instance management
231
-------------------
232

    
233
Adding an instance
234
++++++++++++++++++
235

    
236
The add operation might seem complex due to the many parameters it
237
accepts, but once you have understood the (few) required parameters and
238
the customisation capabilities you will see it is an easy operation.
239

    
240
The add operation requires at minimum five parameters:
241

    
242
- the OS for the instance
243
- the disk template
244
- the disk count and size
245
- the node specification or alternatively the iallocator to use
246
- and finally the instance name
247

    
248
The OS for the instance must be visible in the output of the command
249
``gnt-os list`` and specifies which guest OS to install on the instance.
250

    
251
The disk template specifies what kind of storage to use as backend for
252
the (virtual) disks presented to the instance; note that for instances
253
with multiple virtual disks, they all must be of the same type.
254

    
255
The node(s) on which the instance will run can be given either manually,
256
via the ``-n`` option, or computed automatically by Ganeti, if you have
257
installed any iallocator script.
258

    
259
With the above parameters in mind, the command is::
260

    
261
  $ gnt-instance add \
262
    -n %TARGET_NODE%:%SECONDARY_NODE% \
263
    -o %OS_TYPE% \
264
    -t %DISK_TEMPLATE% -s %DISK_SIZE% \
265
    %INSTANCE_NAME%
266

    
267
The instance name must be resolvable (e.g. exist in DNS) and usually
268
points to an address in the same subnet as the cluster itself.
269

    
270
The above command has the minimum required options; other options you
271
can give include, among others:
272

    
273
- The maximum/minimum memory size (``-B maxmem``, ``-B minmem``)
274
  (``-B memory`` can be used to specify only one size)
275

    
276
- The number of virtual CPUs (``-B vcpus``)
277

    
278
- Arguments for the NICs of the instance; by default, a single-NIC
279
  instance is created. The IP and/or bridge of the NIC can be changed
280
  via ``--net 0:ip=IP,link=BRIDGE``
281

    
282
See :manpage:`gnt-instance(8)` for the detailed option list.
283

    
284
For example, if you want to create a highly available instance, with a
285
single disk of 50GB and the default memory size, having primary node
286
``node1`` and secondary node ``node3``, use the following command::
287

    
288
  $ gnt-instance add -n node1:node3 -o debootstrap -t drbd -s 50G \
289
    instance1
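
Building on this, a sketch that additionally sets the memory limits, the
number of virtual CPUs and the NIC link (sizes are in MiB; the values
and the ``xen-br0`` link are only examples)::

  $ gnt-instance add -n node1:node3 -o debootstrap -t drbd -s 50G \
    -B maxmem=1024,minmem=512,vcpus=2 --net 0:link=xen-br0 \
    instance2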
290

    
291
There is also a command for batch instance creation from a
specification file; see the ``batch-create`` operation in the
:manpage:`gnt-instance(8)` man page.
294

    
295
Regular instance operations
296
+++++++++++++++++++++++++++
297

    
298
Removal
299
~~~~~~~
300

    
301
Removing an instance is even easier than creating one. This operation is
302
irreversible and destroys all the contents of your instance. Use with
303
care::
304

    
305
  $ gnt-instance remove %INSTANCE_NAME%
306

    
307
.. _instance-startup-label:
308

    
309
Startup/shutdown
310
~~~~~~~~~~~~~~~~
311

    
312
Instances are automatically started at instance creation time. To
313
manually start one which is currently stopped you can run::
314

    
315
  $ gnt-instance startup %INSTANCE_NAME%
316

    
317
Ganeti will start an instance with up to its maximum instance memory. If
318
not enough memory is available Ganeti will use all the available memory
319
down to the instance minimum memory. If not even that amount of memory
320
is free Ganeti will refuse to start the instance.
321

    
322
Note that this will not work when an instance is in the permanently
stopped state ``offline``. In this case, you will first have to
put it back into online mode by running::
325

    
326
  $ gnt-instance modify --online %INSTANCE_NAME%
327

    
328
The command to stop the running instance is::
329

    
330
  $ gnt-instance shutdown %INSTANCE_NAME%
331

    
332
If you want to shut the instance down more permanently, so that it
333
does not require dynamically allocated resources (memory and vcpus),
334
after shutting down an instance, execute the following::
335

    
336
  $ gnt-instance modify --offline %INSTANCE_NAME%
337

    
338
.. warning:: Do not use the Xen or KVM commands directly to stop
339
   instances. If you run for example ``xm shutdown`` or ``xm destroy``
340
   on an instance Ganeti will automatically restart it (via
341
   the :manpage:`ganeti-watcher(8)` command, which is launched via cron).
342

    
343
Querying instances
344
~~~~~~~~~~~~~~~~~~
345

    
346
There are two ways to get information about instances: listing
347
instances, which produces tabular output containing a given set of fields
348
about each instance, and querying detailed information about a set of
349
instances.
350

    
351
The command to see all the instances configured and their status is::
352

    
353
  $ gnt-instance list
354

    
355
The command can return a custom set of information when using the ``-o``
356
option (as always, check the manpage for a detailed specification). Each
357
instance will be represented on a line, thus making it easy to parse
358
this output via the usual shell utilities (grep, sed, etc.).
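
For example, to list a custom set of fields (the fields chosen here are
just an illustration; the manpage documents the full list)::

  $ gnt-instance list -o name,pnode,snodes,status,oper_ram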
359

    
360
To get more detailed information about an instance, you can run::
361

    
362
  $ gnt-instance info %INSTANCE%
363

    
364
which will give a multi-line block of information about the instance,
its hardware resources (especially its disks and their redundancy
366
status), etc. This is harder to parse and is more expensive than the
367
list operation, but returns much more detailed information.
368

    
369
Changing an instance's runtime memory
370
+++++++++++++++++++++++++++++++++++++
371

    
372
Ganeti will always make sure an instance has a value between its maximum
373
and its minimum memory available as runtime memory. As of version 2.6
374
Ganeti will only choose a size different than the maximum size when
375
starting up, failing over, or migrating an instance on a node with less
376
than the maximum memory available. It won't resize other instances in
377
order to free up space for an instance.
378

    
379
If you find that you need more memory on a node, any instance can be
380
manually resized without downtime, with the command::
381

    
382
  $ gnt-instance modify -m %SIZE% %INSTANCE_NAME%
383

    
384
The same command can also be used to increase the memory available on an
385
instance, provided that enough free memory is available on its node, and
386
the specified size is not larger than the maximum memory size the
387
instance had when it was first booted (an instance will be unable to see
388
new memory above the maximum that was specified to the hypervisor at its
boot time; if it needs to grow further, a reboot becomes necessary).
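
As a sketch, the runtime memory and the static limits are changed with
different options (sizes are in MiB and purely illustrative)::

  # change only the current runtime memory
  $ gnt-instance modify -m 512 %INSTANCE_NAME%
  # change the static minimum/maximum limits as well
  $ gnt-instance modify -B minmem=512,maxmem=1024 %INSTANCE_NAME%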
390

    
391
Export/Import
392
+++++++++++++
393

    
394
You can create a snapshot of an instance disk and its Ganeti
395
configuration, which then you can backup, or import into another
396
cluster. The way to export an instance is::
397

    
398
  $ gnt-backup export -n %TARGET_NODE% %INSTANCE_NAME%
399

    
400

    
401
The target node can be any node in the cluster with enough space under
402
``/srv/ganeti`` to hold the instance image. Use the ``--noshutdown``
403
option to snapshot an instance without rebooting it. Note that Ganeti
404
only keeps one snapshot for an instance - any previous snapshot of the
405
same instance existing cluster-wide under ``/srv/ganeti`` will be
406
removed by this operation: if you want to keep them, you need to move
407
them out of the Ganeti exports directory.
408

    
409
Importing an instance is similar to creating a new one, but additionally
410
one must specify the location of the snapshot. The command is::
411

    
412
  $ gnt-backup import -n %TARGET_NODE% \
413
    --src-node=%NODE% --src-dir=%DIR% %INSTANCE_NAME%
414

    
415
By default, parameters will be read from the export information, but you
416
can of course pass them in via the command line - most of the options
417
available for the command :command:`gnt-instance add` are supported here
418
too.
419

    
420
Import of foreign instances
421
+++++++++++++++++++++++++++
422

    
423
It is possible to import a foreign instance whose disk data is already
stored as LVM volumes, without copying it, by using the disk adoption
mode.
426

    
427
For this, ensure that the original, non-managed instance is stopped,
428
then create a Ganeti instance in the usual way, except that instead of
429
passing the disk information you specify the current volumes::
430

    
431
  $ gnt-instance add -t plain -n %HOME_NODE% ... \
432
    --disk 0:adopt=%lv_name%[,vg=%vg_name%] %INSTANCE_NAME%
433

    
434
This will take over the given logical volumes, rename them to the Ganeti
standard (UUID-based), and start the instance directly without
installing the OS on them. If you configure the hypervisor similarly to
the non-managed configuration that the instance had, the transition
should be seamless for the instance. For more than one disk, just pass
another disk parameter (e.g. ``--disk 1:adopt=...``).
440

    
441
Instance kernel selection
442
+++++++++++++++++++++++++
443

    
444
The kernel that instances use to boot up can come either from the node,
or from the instances themselves, depending on the setup.
446

    
447
Xen-PVM
448
~~~~~~~
449

    
450
With Xen PVM, there are three options.
451

    
452
First, you can use a kernel from the node, by setting the hypervisor
453
parameters as such:
454

    
455
- ``kernel_path`` to a valid file on the node (and appropriately
456
  ``initrd_path``)
457
- ``kernel_args`` optionally set to a valid Linux setting (e.g. ``ro``)
458
- ``root_path`` to a valid setting (e.g. ``/dev/xvda1``)
459
- ``bootloader_path`` and ``bootloader_args`` to empty
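
As a sketch, such parameters can be set either cluster-wide or per
instance (the kernel path below is only an example)::

  # cluster-wide default for all Xen PVM instances
  $ gnt-cluster modify -H xen-pvm:kernel_path=/boot/vmlinuz-3-xenU
  # or for a single instance
  $ gnt-instance modify -H kernel_path=/boot/vmlinuz-3-xenU,root_path=/dev/xvda1 \
    %INSTANCE_NAME%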
460

    
461
Alternatively, you can delegate the kernel management to instances, and
462
use either ``pvgrub`` or the deprecated ``pygrub``. For this, you must
463
install the kernels and initrds in the instance and create a valid GRUB
464
v1 configuration file.
465

    
466
For ``pvgrub`` (new in version 2.4.2), you need to set:
467

    
468
- ``kernel_path`` to point to the ``pvgrub`` loader present on the node
469
  (e.g. ``/usr/lib/xen/boot/pv-grub-x86_32.gz``)
470
- ``kernel_args`` to the path to the GRUB config file, relative to the
471
  instance (e.g. ``(hd0,0)/grub/menu.lst``)
472
- ``root_path`` **must** be empty
473
- ``bootloader_path`` and ``bootloader_args`` to empty
474

    
475
While ``pygrub`` is deprecated, here is how you can configure it:
476

    
477
- ``bootloader_path`` to the pygrub binary (e.g. ``/usr/bin/pygrub``)
478
- the other settings are not important
479

    
480
More information can be found in the Xen wiki pages for `pvgrub
481
<http://wiki.xensource.com/xenwiki/PvGrub>`_ and `pygrub
482
<http://wiki.xensource.com/xenwiki/PyGrub>`_.
483

    
484
KVM
485
~~~
486

    
487
For KVM, the kernel can likewise be loaded either way.
488

    
489
For loading the kernels from the node, you need to set:
490

    
491
- ``kernel_path`` to a valid value
492
- ``initrd_path`` optionally set if you use an initrd
493
- ``kernel_args`` optionally set to a valid value (e.g. ``ro``)
494

    
495
If you want instead to have the instance boot from its disk (and execute
496
its bootloader), simply set the ``kernel_path`` parameter to an empty
497
string, and all the others will be ignored.
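
A sketch of both setups at the instance level (the kernel path is only
an example; the second command clears ``kernel_path`` by assigning it an
empty value)::

  # boot a node-provided kernel
  $ gnt-instance modify -H kernel_path=/boot/vmlinuz-kvmU,kernel_args=ro \
    %INSTANCE_NAME%
  # boot from the instance's own disk and bootloader
  $ gnt-instance modify -H kernel_path= %INSTANCE_NAME%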
498

    
499
Instance HA features
500
--------------------
501

    
502
.. note:: This section only applies to multi-node clusters
503

    
504
.. _instance-change-primary-label:
505

    
506
Changing the primary node
507
+++++++++++++++++++++++++
508

    
509
There are three ways to exchange an instance's primary and secondary
510
nodes; the right one to choose depends on how the instance has been
511
created and the status of its current primary node. See
512
:ref:`rest-redundancy-label` for information on changing the secondary
513
node. Note that it's only possible to change the primary node to the
514
secondary and vice-versa; a direct change of the primary node with a
515
third node, while keeping the current secondary is not possible in a
516
single step, only via multiple operations as detailed in
517
:ref:`instance-relocation-label`.
518

    
519
Failing over an instance
520
~~~~~~~~~~~~~~~~~~~~~~~~
521

    
522
If an instance is built in highly available mode, you can at any time
fail it over to its secondary node, even if the primary has somehow
failed and is not up anymore. Doing it is really easy; on the master
node you just run::
526

    
527
  $ gnt-instance failover %INSTANCE_NAME%
528

    
529
That's it. After the command completes the secondary node is now the
530
primary, and vice-versa.
531

    
532
The instance will be started with an amount of memory between its
533
``maxmem`` and its ``minmem`` value, depending on the free memory on its
534
target node, or the operation will fail if that's not possible. See
535
:ref:`instance-startup-label` for details.
536

    
537
If the instance's disk template is of type rbd, then you can specify
538
the target node (which can be any node) explicitly, or specify an
539
iallocator plugin. If you omit both, the default iallocator will be
540
used to determine the target node::
541

    
542
  $ gnt-instance failover -n %TARGET_NODE% %INSTANCE_NAME%
543

    
544
Live migrating an instance
545
~~~~~~~~~~~~~~~~~~~~~~~~~~
546

    
547
If an instance is built in highly available mode, is currently running,
and both its nodes are running fine, you can migrate it over to its
secondary node without downtime. On the master node you need to run::
550

    
551
  $ gnt-instance migrate %INSTANCE_NAME%
552

    
553
The current load on the instance and its memory size will influence how
554
long the migration will take. In any case, for both KVM and Xen
555
hypervisors, the migration will be transparent to the instance.
556

    
557
If the destination node has less memory than the instance's current
558
runtime memory, but at least the instance's minimum memory available,
Ganeti will automatically reduce the instance runtime memory before
560
migrating it, unless the ``--no-runtime-changes`` option is passed, in
561
which case the target node should have at least the instance's current
562
runtime memory free.
563

    
564
If the instance's disk template is of type rbd, then you can specify
565
the target node (which can be any node) explicitly, or specify an
566
iallocator plugin. If you omit both, the default iallocator will be
567
used to determine the target node::
568

    
569
   $ gnt-instance migrate -n %TARGET_NODE% %INSTANCE_NAME%
570

    
571
Moving an instance (offline)
572
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
573

    
574
If an instance has not been created as mirrored, then the only way to
575
change its primary node is to execute the move command::
576

    
577
  $ gnt-instance move -n %NEW_NODE% %INSTANCE%
578

    
579
This has a few prerequisites:
580

    
581
- the instance must be stopped
582
- its current primary node must be on-line and healthy
583
- the disks of the instance must not have any errors
584

    
585
Since this operation actually copies the data from the old node to the
586
new node, expect it to take time proportional to the size of the
instance's disks and the speed of both the nodes' I/O systems and their
networking.
588

    
589
Disk operations
590
+++++++++++++++
591

    
592
Disk failures are a common cause of errors in any server
593
deployment. Ganeti offers protection from single-node failure if your
594
instances were created in HA mode, and it also offers ways to restore
595
redundancy after a failure.
596

    
597
Preparing for disk operations
598
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
599

    
600
It is important to note that for Ganeti to be able to do any disk
601
operation, the Linux machines on top of which Ganeti runs must be
602
consistent; for LVM, this means that the LVM commands must not return
603
failures; it is common that after a complete disk failure, any LVM
604
command aborts with an error similar to::
605

    
606
  $ vgs
607
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
608
  /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
609
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
610
  Couldn't find device with uuid 't30jmN-4Rcf-Fr5e-CURS-pawt-z0jU-m1TgeJ'.
611
  Couldn't find all physical volumes for volume group xenvg.
612

    
613
Before restoring an instance's disks to a healthy status, you need to
fix the volume group used by Ganeti so that we can actually create and
615
manage the logical volumes. This is usually done in a multi-step
616
process:
617

    
618
#. first, if the disk is completely gone and LVM commands exit with
619
   “Couldn't find device with uuid…” then you need to run the command::
620

    
621
    $ vgreduce --removemissing %VOLUME_GROUP%
622

    
623
#. after the above command, the LVM commands should be executing
624
   normally (warnings are normal, but the commands will not fail
625
   completely).
626

    
627
#. if the failed disk is still visible in the output of the ``pvs``
628
   command, you need to deactivate it from allocations by running::
629

    
630
    $ pvchange -x n /dev/%DISK%
631

    
632
At this point, the volume group should be consistent and any bad
633
physical volumes should no longer be available for allocation.
634

    
635
Note that since version 2.1 Ganeti provides some commands to automate
these two operations; see :ref:`storage-units-label`.
637

    
638
.. _rest-redundancy-label:
639

    
640
Restoring redundancy for DRBD-based instances
641
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
642

    
643
A DRBD instance has two nodes, and the storage on one of them has
644
failed. Depending on which node (primary or secondary) has failed, you
645
have three options at hand:
646

    
647
- if the storage on the primary node has failed, you need to re-create
648
  the disks on it
649
- if the storage on the secondary node has failed, you can either
650
  re-create the disks on it or change the secondary and recreate
651
  redundancy on the new secondary node
652

    
653
Of course, at any point it's possible to force re-creation of disks even
654
though everything is already fine.
655

    
656
For all three cases, the ``replace-disks`` operation can be used::
657

    
658
  # re-create disks on the primary node
659
  $ gnt-instance replace-disks -p %INSTANCE_NAME%
660
  # re-create disks on the current secondary
661
  $ gnt-instance replace-disks -s %INSTANCE_NAME%
662
  # change the secondary node, via manual specification
663
  $ gnt-instance replace-disks -n %NODE% %INSTANCE_NAME%
664
  # change the secondary node, via an iallocator script
665
  $ gnt-instance replace-disks -I %SCRIPT% %INSTANCE_NAME%
666
  # since Ganeti 2.1: automatically fix the primary or secondary node
667
  $ gnt-instance replace-disks -a %INSTANCE_NAME%
668

    
669
Since the process involves copying all data from the working node to the
670
target node, it will take a while, depending on the instance's disk
671
size, node I/O system and network speed. But it is (barring any network
672
interruption) completely transparent for the instance.
673

    
674
Re-creating disks for non-redundant instances
675
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
676

    
677
.. versionadded:: 2.1
678

    
679
For non-redundant instances, there isn't a copy (except backups) to
680
re-create the disks. But it's possible to at least re-create empty
681
disks, after which a reinstall can be run, via the ``recreate-disks``
682
command::
683

    
684
  $ gnt-instance recreate-disks %INSTANCE%
685

    
686
Note that this will fail if the disks already exist. The instance can
687
be assigned to new nodes automatically by specifying an iallocator
688
through the ``--iallocator`` option.
689

    
690
Conversion of an instance's disk type
691
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
692

    
693
It is possible to convert between a non-redundant instance of type
694
``plain`` (LVM storage) and redundant ``drbd`` via the ``gnt-instance
695
modify`` command::
696

    
697
  # start with a non-redundant instance
698
  $ gnt-instance add -t plain ... %INSTANCE%
699

    
700
  # later convert it to redundant
701
  $ gnt-instance stop %INSTANCE%
702
  $ gnt-instance modify -t drbd -n %NEW_SECONDARY% %INSTANCE%
703
  $ gnt-instance start %INSTANCE%
704

    
705
  # and convert it back
706
  $ gnt-instance stop %INSTANCE%
707
  $ gnt-instance modify -t plain %INSTANCE%
708
  $ gnt-instance start %INSTANCE%
709

    
710
The conversion must be done while the instance is stopped, and
711
converting from plain to drbd template presents a small risk, especially
712
if the instance has multiple disks and/or if one node fails during the
713
conversion procedure. As such, it's recommended (as always) to make
714
sure that downtime for manual recovery is acceptable and that the
715
instance has up-to-date backups.
716

    
717
Debugging instances
718
+++++++++++++++++++
719

    
720
Accessing an instance's disks
721
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
722

    
723
From an instance's primary node you can have access to its disks. Never
724
ever mount the underlying logical volume manually on a fault tolerant
725
instance, or you will break replication and your data will be
726
inconsistent. The correct way to access an instance's disks is to run
727
(on the master node, as usual) the command::
728

    
729
  $ gnt-instance activate-disks %INSTANCE%
730

    
731
And then, *on the primary node of the instance*, access the device that
732
gets created. For example, you could mount the given disks, then edit
733
files on the filesystem, etc.
734

    
735
Note that with partitioned disks (as opposed to whole-disk filesystems),
736
you will need to use a tool like :manpage:`kpartx(8)`::
737

    
738
  # on node1
739
  $ gnt-instance activate-disks %instance1%
740
  node3:disk/0:…
741
  $ ssh node3
742
  # on node 3
743
  $ kpartx -l /dev/…
744
  $ kpartx -a /dev/…
745
  $ mount /dev/mapper/… /mnt/
746
  # edit files under mnt as desired
747
  $ umount /mnt/
748
  $ kpartx -d /dev/…
749
  $ exit
750
  # back to node 1
751

    
752
After you've finished you can deactivate them with the deactivate-disks
753
command, which works in the same way::
754

    
755
  $ gnt-instance deactivate-disks %INSTANCE%
756

    
757
Note that if any process started by you is still using the disks, the
758
above command will error out, and you **must** clean up and ensure that
759
the above command runs successfully before you start the instance,
760
otherwise the instance will suffer corruption.
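
For example, a quick way to check for leftover users of a mounted
filesystem before deactivating the disks (plain Linux tooling, not
Ganeti-specific)::

  $ fuser -vm /mnt/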
761

    
762
Accessing an instance's console
763
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
764

    
765
The command to access a running instance's console is::
766

    
767
  $ gnt-instance console %INSTANCE_NAME%
768

    
769
Use the console normally and then type ``^]`` when done, to exit.
770

    
771
Other instance operations
772
+++++++++++++++++++++++++
773

    
774
Reboot
775
~~~~~~
776

    
777
There is a wrapper command for rebooting instances::
778

    
779
  $ gnt-instance reboot %instance2%
780

    
781
By default, this does the equivalent of shutting down and then starting
782
the instance, but it accepts parameters to perform a soft-reboot (via
783
the hypervisor), a hard reboot (hypervisor shutdown and then startup) or
784
a full one (the default, which also de-configures and then reconfigures
the disks of the instance).
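
The reboot type can be selected with the ``--type`` option, for
example::

  $ gnt-instance reboot --type=soft %instance2%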
786

    
787
Instance OS definitions debugging
788
+++++++++++++++++++++++++++++++++
789

    
790
Should you have any problems with instance operating systems, the command
791
to see a complete status for all your nodes is::
792

    
793
   $ gnt-os diagnose
794

    
795
.. _instance-relocation-label:
796

    
797
Instance relocation
798
~~~~~~~~~~~~~~~~~~~
799

    
800
While it is not possible to move an instance from nodes ``(A, B)`` to
801
nodes ``(C, D)`` in a single move, it is possible to do so in a few
802
steps::
803

    
804
  # instance is located on A, B
805
  $ gnt-instance replace-disks -n %nodeC% %instance1%
806
  # instance has moved from (A, B) to (A, C)
807
  # we now flip the primary/secondary nodes
808
  $ gnt-instance migrate %instance1%
809
  # instance lives on (C, A)
810
  # we can then change A to D via:
811
  $ gnt-instance replace-disks -n %nodeD% %instance1%
812

    
813
This brings it into the final configuration of ``(C, D)``. Note that we
needed to do two replace-disks operations (two copies of the instance
815
disks), because we needed to get rid of both the original nodes (A and
816
B).
817

    
818
Node operations
819
---------------
820

    
821
There are far fewer node operations available than instance operations,
but they are equally important for maintaining a healthy cluster.
823

    
824
Add/readd
825
+++++++++
826

    
827
It is at any time possible to extend the cluster with one more node, by
828
using the node add operation::
829

    
830
  $ gnt-node add %NEW_NODE%
831

    
832
If the cluster has a replication network defined, then you need to pass
833
the ``-s REPLICATION_IP`` parameter to this command.
834

    
835
A variation of this command can be used to re-configure a node if its
836
Ganeti configuration is broken, for example if it has been reinstalled
837
by mistake::
838

    
839
  $ gnt-node add --readd %EXISTING_NODE%
840

    
841
This will reinitialise the node as if it had been newly added, while
keeping its existing configuration in the cluster (primary/secondary IP,
etc.); in other words, you won't need to use ``-s`` here.
844

    
845
Changing the node role
846
++++++++++++++++++++++
847

    
848
A node can be in different roles, as explained in the
849
:ref:`terminology-label` section. Promoting a node to the master role is
850
special, while the other roles are all handled via a single command.
851

    
852
Failing over the master node
853
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
854

    
855
If you want to promote a different node to the master role (for whatever
856
reason), run on any other master-candidate node the command::
857

    
858
  $ gnt-cluster master-failover
859

    
860
and the node you ran it on is now the new master. In case you try to run
861
this on a non master-candidate node, you will get an error telling you
862
which nodes are valid.
863

    
864
Changing between the other roles
865
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
866

    
867
The ``gnt-node modify`` command can be used to select a new role::
868

    
869
  # change to master candidate
870
  $ gnt-node modify -C yes %NODE%
871
  # change to drained status
872
  $ gnt-node modify -D yes %NODE%
873
  # change to offline status
874
  $ gnt-node modify -O yes %NODE%
875
  # change to regular mode (reset all flags)
876
  $ gnt-node modify -O no -D no -C no %NODE%
877

    
878
Note that the cluster requires that at any point in time, a certain
879
number of nodes are master candidates, so changing from master candidate
880
to other roles might fail. It is recommended to either force the
881
operation (via the ``--force`` option) or first change the number of
882
master candidates in the cluster - see :ref:`cluster-config-label`.
883

    
884
Evacuating nodes
885
++++++++++++++++
886

    
887
There are two steps to moving instances off a node:
888

    
889
- moving the primary instances (actually converting them into secondary
890
  instances)
891
- moving the secondary instances (including any instances converted in
892
  the step above)
893

    
894
Primary instance conversion
895
~~~~~~~~~~~~~~~~~~~~~~~~~~~
896

    
897
For this step, you can use either individual instance move
898
commands (as seen in :ref:`instance-change-primary-label`) or the bulk
899
per-node versions; these are::
900

    
901
  $ gnt-node migrate %NODE%
902
  $ gnt-node evacuate -s %NODE%
903

    
904
Note that the instance “move” command doesn't currently have a node
905
equivalent.
906

    
907
Both these commands, or the equivalent per-instance command, will make
908
this node the secondary node for the respective instances, whereas their
909
current secondary node will become primary. Note that it is not possible
to change, in one step, the primary node to a different third node while
keeping the same secondary node.
912

    
913
Secondary instance evacuation
914
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
915

    
916
For the evacuation of secondary instances, a command called
917
:command:`gnt-node evacuate` is provided and its syntax is::
918

    
919
  $ gnt-node evacuate -I %IALLOCATOR_SCRIPT% %NODE%
920
  $ gnt-node evacuate -n %DESTINATION_NODE% %NODE%
921

    
922
The first version will compute the new secondary for each instance in
923
turn using the given iallocator script, whereas the second one will
924
simply move all instances to DESTINATION_NODE.
925

    
926
Removal
927
+++++++
928

    
929
Once a node no longer has any instances (neither primary nor secondary),
930
it's easy to remove it from the cluster::
931

    
932
  $ gnt-node remove %NODE_NAME%
933

    
934
This will deconfigure the node, stop the Ganeti daemons on it and,
hopefully, leave it as it was before it joined the cluster.
936

    
937
Replication network changes
938
+++++++++++++++++++++++++++
939

    
940
The :command:`gnt-node modify -s` command can be used to change the
941
secondary IP of a node. This operation can only be performed if:
942

    
943
- No instance is active on the target node
944
- The new target IP is reachable from the master's secondary IP
945

    
946
Also, this operation does not allow changing a node from single-homed
(same primary and secondary IP) to multi-homed (separate replication
network) or vice versa, unless:
949

    
950
- The target node is the master node and ``--force`` is passed.
- The target cluster is single-homed and the new primary IP is a change
  to single-homed for a particular node.
- The target cluster is multi-homed and the new primary IP is a change
  to multi-homed for a particular node.
955

    
956
For example, to do a single-homed to multi-homed conversion::
957

    
958
  $ gnt-node modify --force -s %SECONDARY_IP% %MASTER_NAME%
959
  $ gnt-node modify -s %SECONDARY_IP% %NODE1_NAME%
960
  $ gnt-node modify -s %SECONDARY_IP% %NODE2_NAME%
961
  $ gnt-node modify -s %SECONDARY_IP% %NODE3_NAME%
962
  ...
963

    
964
The same commands can be used for a multi-homed to single-homed
conversion, except that in that case the secondary IP passed for each
node should be the same as its primary IP.
967

    
968
Storage handling
969
++++++++++++++++
970

    
971
When using LVM (either standalone or with DRBD), it can become tedious
972
to debug and fix it in case of errors. Furthermore, even file-based
973
storage can become complicated to handle manually on many hosts. Ganeti
974
provides a couple of commands to help with automation.
975

    
976
Logical volumes
977
~~~~~~~~~~~~~~~
978

    
979
This is a command specific to LVM handling. It allows listing the
980
logical volumes on a given node or on all nodes and their association to
981
instances via the ``volumes`` command::
982

    
983
  $ gnt-node volumes
984
  Node  PhysDev   VG    Name             Size Instance
985
  node1 /dev/sdb1 xenvg e61fbc97-….disk0 512M instance17
986
  node1 /dev/sdb1 xenvg ebd1a7d1-….disk0 512M instance19
987
  node2 /dev/sdb1 xenvg 0af08a3d-….disk0 512M instance20
988
  node2 /dev/sdb1 xenvg cc012285-….disk0 512M instance16
989
  node2 /dev/sdb1 xenvg f0fac192-….disk0 512M instance18
990

    
991
The above command maps each logical volume to a volume group and
992
underlying physical volume and (possibly) to an instance.
993

    
994
.. _storage-units-label:
995

    
996
Generalized storage handling
997
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
998

    
999
.. versionadded:: 2.1
1000

    
1001
Starting with Ganeti 2.1, a new storage framework has been implemented
1002
that tries to abstract the handling of the storage type the cluster
1003
uses.
1004

    
1005
First is listing the backend storage units and their space situation::
1006

    
1007
  $ gnt-node list-storage
1008
  Node  Name        Size Used   Free
1009
  node1 /dev/sda7 673.8G   0M 673.8G
1010
  node1 /dev/sdb1 698.6G 1.5G 697.1G
1011
  node2 /dev/sda7 673.8G   0M 673.8G
1012
  node2 /dev/sdb1 698.6G 1.0G 697.6G
1013

    
1014
The default is to list LVM physical volumes. It's also possible to list
1015
the LVM volume groups::
1016

    
1017
  $ gnt-node list-storage -t lvm-vg
1018
  Node  Name  Size
1019
  node1 xenvg 1.3T
1020
  node2 xenvg 1.3T
1021

    
1022
Next is repairing storage units, which is currently only implemented for
1023
volume groups and does the equivalent of ``vgreduce --removemissing``::
1024

    
1025
  $ gnt-node repair-storage %node2% lvm-vg xenvg
1026
  Sun Oct 25 22:21:45 2009 Repairing storage unit 'xenvg' on node2 ...
1027

    
1028
Last is the modification of volume properties, which is (again) only
1029
implemented for LVM physical volumes and allows toggling the
1030
``allocatable`` value::
1031

    
1032
  $ gnt-node modify-storage --allocatable=no %node2% lvm-pv /dev/%sdb1%
1033

    
1034
Use of the storage commands
1035
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1036

    
1037
All these commands are needed when recovering a node from a disk
1038
failure:
1039

    
1040
- first, we need to recover from complete LVM failure (due to missing
1041
  disk), by running the ``repair-storage`` command
1042
- second, we need to change allocation on any partially-broken disk
1043
  (i.e. LVM still sees it, but it has bad blocks) by running
1044
  ``modify-storage``
1045
- then we can evacuate the instances as needed
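
Putting these together, a typical recovery session might look like this
(node and device names are illustrative)::

  # recover the volume group after a completely missing disk
  $ gnt-node repair-storage %node2% lvm-vg xenvg
  # stop allocations on a partially-broken physical volume
  $ gnt-node modify-storage --allocatable=no %node2% lvm-pv /dev/%sdb1%
  # and then move the affected instances away
  $ gnt-node evacuate -s %node2%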
1046

    
1047

    
1048
Cluster operations
1049
------------------
1050

    
1051
Besides the cluster initialisation command (which is detailed in the
1052
:doc:`install` document) and the master failover command which is
1053
explained under node handling, there are a couple of other cluster
1054
operations available.
1055

    
1056
.. _cluster-config-label:
1057

    
1058
Standard operations
1059
+++++++++++++++++++
1060

    
1061
One of the few commands that can be run on any node (not only the
1062
master) is the ``getmaster`` command::
1063

    
1064
  # on node2
1065
  $ gnt-cluster getmaster
1066
  node1.example.com
1067

    
1068
It is possible to query and change global cluster parameters via the
1069
``info`` and ``modify`` commands::
1070

    
1071
  $ gnt-cluster info
1072
  Cluster name: cluster.example.com
1073
  Cluster UUID: 07805e6f-f0af-4310-95f1-572862ee939c
1074
  Creation time: 2009-09-25 05:04:15
1075
  Modification time: 2009-10-18 22:11:47
1076
  Master node: node1.example.com
1077
  Architecture (this node): 64bit (x86_64)
1078
1079
  Tags: foo
1080
  Default hypervisor: xen-pvm
1081
  Enabled hypervisors: xen-pvm
1082
  Hypervisor parameters:
1083
    - xen-pvm:
1084
        root_path: /dev/sda1
1085
1086
  Cluster parameters:
1087
    - candidate pool size: 10
1088
1089
  Default instance parameters:
1090
    - default:
1091
        memory: 128
1092
1093
  Default nic parameters:
1094
    - default:
1095
        link: xen-br0
1096
1097

    
1098
The various parameters above can be changed via the ``modify`` command
as follows:
1100

    
1101
- the hypervisor parameters can be changed via ``modify -H
1102
  xen-pvm:root_path=…``, and so on for other hypervisors/key/values
1103
- the "default instance parameters" are changeable via ``modify -B
1104
  parameter=value…`` syntax
1105
- the cluster parameters are changeable via separate options to the
1106
  modify command (e.g. ``--candidate-pool-size``, etc.)
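
For example, a sketch of such changes (the values are illustrative)::

  $ gnt-cluster modify -H xen-pvm:root_path=/dev/xvda2
  $ gnt-cluster modify -B vcpus=2
  $ gnt-cluster modify --candidate-pool-size=5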
1107

    
1108
For a detailed option list see the :manpage:`gnt-cluster(8)` man page.
1109

    
1110
The cluster version can be obtained via the ``version`` command::

1111
  $ gnt-cluster version
1112
  Software version: 2.1.0
1113
  Internode protocol: 20
1114
  Configuration format: 2010000
1115
  OS api version: 15
1116
  Export interface: 0
1117

    
1118
This is not very useful except when debugging Ganeti.
1119

    
1120
Global node commands
1121
++++++++++++++++++++
1122

    
1123
There are two commands provided for replicating files to all nodes of a
1124
cluster and for running commands on all the nodes::
1125

    
1126
  $ gnt-cluster copyfile %/path/to/file%
1127
  $ gnt-cluster command %ls -l /path/to/file%
1128

    
1129
These are simple wrappers over scp/ssh, and more advanced usage can be
obtained using :manpage:`dsh(1)` and similar commands. But they are
useful, for example, to update an OS script from the master node.
1132

    
1133
Cluster verification
1134
++++++++++++++++++++
1135

    
1136
There are three commands that relate to global cluster checks. The first
1137
one is ``verify``, which gives an overview of the cluster state,
1138
highlighting any issues. In normal operation, this command should return
1139
no ``ERROR`` messages::
1140

    
1141
  $ gnt-cluster verify
1142
  Sun Oct 25 23:08:58 2009 * Verifying global settings
1143
  Sun Oct 25 23:08:58 2009 * Gathering data (2 nodes)
1144
  Sun Oct 25 23:09:00 2009 * Verifying node status
1145
  Sun Oct 25 23:09:00 2009 * Verifying instance status
1146
  Sun Oct 25 23:09:00 2009 * Verifying orphan volumes
1147
  Sun Oct 25 23:09:00 2009 * Verifying remaining instances
1148
  Sun Oct 25 23:09:00 2009 * Verifying N+1 Memory redundancy
1149
  Sun Oct 25 23:09:00 2009 * Other Notes
1150
  Sun Oct 25 23:09:00 2009   - NOTICE: 5 non-redundant instance(s) found.
1151
  Sun Oct 25 23:09:00 2009 * Hooks Results
1152

    
1153
The second command is ``verify-disks``, which checks that the instances'
disks have the correct status based on the desired instance state
1155
(up/down)::
1156

    
1157
  $ gnt-cluster verify-disks
1158

    
1159
Note that this command will show no output when disks are healthy.
1160

    
1161
The last command is used to repair any discrepancies between Ganeti's
recorded disk size and the actual disk size (disk size information is
1163
needed for proper activation and growth of DRBD-based disks)::
1164

    
1165
  $ gnt-cluster repair-disk-sizes
1166
  Sun Oct 25 23:13:16 2009  - INFO: Disk 0 of instance instance1 has mismatched size, correcting: recorded 512, actual 2048
1167
  Sun Oct 25 23:13:17 2009  - WARNING: Invalid result from node node4, ignoring node results
1168

    
1169
The above shows one instance having a wrong disk size, and a node which
returned invalid data, and thus we ignored all primary instances of that
1171
node.
1172

    
1173
Configuration redistribution
1174
++++++++++++++++++++++++++++
1175

    
1176
If the verify command complains about file mismatches between the master
1177
and other nodes, due to some node problems or if you manually modified
1178
configuration files, you can force a push of the master configuration
1179
to all other nodes via the ``redist-conf`` command::
1180

    
1181
  $ gnt-cluster redist-conf
1182

    
1183
This command will be silent unless there are problems sending updates to
1184
the other nodes.
1185

    
1186

    
1187
Cluster renaming
1188
++++++++++++++++
1189

    
1190
It is possible to rename a cluster, or to change its IP address, via the
1191
``rename`` command. If only the IP has changed, you need to pass the
1192
current name and Ganeti will realise its IP has changed::
1193

    
1194
  $ gnt-cluster rename %cluster.example.com%
1195
  This will rename the cluster to 'cluster.example.com'. If
1196
  you are connected over the network to the cluster name, the operation
1197
  is very dangerous as the IP address will be removed from the node and
1198
  the change may not go through. Continue?
1199
  y/[n]/?: %y%
1200
  Failure: prerequisites not met for this operation:
1201
  Neither the name nor the IP address of the cluster has changed
1202

    
1203
In the above output, neither value has changed since the cluster
1204
initialisation, so the operation is not completed.
1205

    
1206
Queue operations
1207
++++++++++++++++
1208

    
1209
The job queue execution in Ganeti 2.0 and higher can be inspected,
1210
suspended and resumed via the ``queue`` command::
1211

    
1212
  $ gnt-cluster queue info
1213
  The drain flag is unset
1214
  $ gnt-cluster queue drain
1215
  $ gnt-instance stop %instance1%
1216
  Failed to submit job for instance1: Job queue is drained, refusing job
1217
  $ gnt-cluster queue info
1218
  The drain flag is set
1219
  $ gnt-cluster queue undrain
1220

    
1221
This is most useful if you have an active cluster and you need to
1222
upgrade the Ganeti software, or simply restart the software on any node:
1223

    
1224
#. suspend the queue via ``queue drain``
1225
#. wait until there are no more running jobs via ``gnt-job list``
1226
#. restart the master or another node, or upgrade the software
1227
#. resume the queue via ``queue undrain``
1228

    
1229
.. note:: this command only stores a local flag file, and if you
   fail over the master, it will have no effect on the new master.
1231

    
1232

    
1233
Watcher control
1234
+++++++++++++++
1235

    
1236
:manpage:`ganeti-watcher(8)` is a program, usually scheduled via
``cron``, that takes care of cluster maintenance operations (restarting
downed instances, activating down DRBD disks, etc.). However, during
maintenance and troubleshooting, this can get in your way; disabling it
by commenting out the cron job is not ideal, as this can be
forgotten. Thus there are some commands for automated control of the
watcher: ``pause``, ``info`` and ``continue``::
1243

    
1244
  $ gnt-cluster watcher info
1245
  The watcher is not paused.
1246
  $ gnt-cluster watcher pause %1h%
1247
  The watcher is paused until Mon Oct 26 00:30:37 2009.
1248
  $ gnt-cluster watcher info
1249
  The watcher is paused until Mon Oct 26 00:30:37 2009.
1250
  $ ganeti-watcher -d
1251
  2009-10-25 23:30:47,984:  pid=28867 ganeti-watcher:486 DEBUG Pause has been set, exiting
1252
  $ gnt-cluster watcher continue
1253
  The watcher is no longer paused.
1254
  $ ganeti-watcher -d
1255
  2009-10-25 23:31:04,789:  pid=28976 ganeti-watcher:345 DEBUG Archived 0 jobs, left 0
1256
  2009-10-25 23:31:05,884:  pid=28976 ganeti-watcher:280 DEBUG Got data from cluster, writing instance status file
1257
  2009-10-25 23:31:06,061:  pid=28976 ganeti-watcher:150 DEBUG Data didn't change, just touching status file
1258
  $ gnt-cluster watcher info
1259
  The watcher is not paused.
1260

    
1261
The exact details of the argument to the ``pause`` command are available
1262
in the manpage.
1263

    
1264
.. note:: this command only stores a local flag file, and if you
   fail over the master, it will have no effect on the new master.
1266

    
1267
Node auto-maintenance
1268
+++++++++++++++++++++
1269

    
1270
If the cluster parameter ``maintain_node_health`` is enabled (see the
1271
manpage for :command:`gnt-cluster`, the init and modify subcommands),
1272
then the following will happen automatically:
1273

    
1274
- the watcher will shut down any instances running on offline nodes
1275
- the watcher will deactivate any DRBD devices on offline nodes
1276

    
1277
In the future, more actions are planned, so only enable this parameter
1278
if the nodes are completely dedicated to Ganeti; otherwise it might be
1279
possible to lose data due to auto-maintenance actions.
1280

    
1281
Removing a cluster entirely
1282
+++++++++++++++++++++++++++
1283

    
1284
The usual method to clean up a cluster is to run ``gnt-cluster destroy``;
however, if the Ganeti installation is broken in any way then this will
not run.
1287

    
1288
It is possible in such a case to manually clean up most if not all traces
1289
of a cluster installation by following these steps on all of the nodes:
1290

    
1291
1. Shut down all instances. This depends on the virtualisation method
1292
   used (Xen, KVM, etc.):
1293

    
1294
  - Xen: run ``xm list`` and ``xm destroy`` on all the non-Domain-0
1295
    instances
1296
  - KVM: kill all the KVM processes
1297
  - chroot: kill all processes under the chroot mountpoints
1298

    
1299
2. If using DRBD, shut down all DRBD minors (which should by this time
   no longer be in use by instances); on each node, run ``drbdsetup
1301
   /dev/drbdN down`` for each active DRBD minor.
1302

    
1303
3. If using LVM, clean up the Ganeti volume group; if only Ganeti created
1304
   logical volumes (and you are not sharing the volume group with the
1305
   OS, for example), then simply running ``lvremove -f xenvg`` (replace
1306
   'xenvg' with your volume group name) should do the required cleanup.
1307

    
1308
4. If using file-based storage, remove recursively all files and
1309
   directories under your file-storage directory: ``rm -rf
1310
   /srv/ganeti/file-storage/*`` replacing the path with the correct path
1311
   for your cluster.
1312

    
1313
5. Stop the ganeti daemons (``/etc/init.d/ganeti stop``) and kill any
1314
   that remain alive (``pgrep ganeti`` and ``pkill ganeti``).
1315

    
1316
6. Remove the ganeti state directory (``rm -rf /var/lib/ganeti/*``),
1317
   replacing the path with the correct path for your installation.
1318

    
1319
7. If using RBD, run ``rbd unmap /dev/rbdN`` to unmap the RBD disks.
1320
   Then remove the RBD disk images used by Ganeti, identified by their
1321
   UUIDs (``rbd rm uuid.rbd.diskN``).
1322

    
1323
On the master node, remove the cluster from the master-netdev (usually
1324
``xen-br0`` for bridged mode, otherwise ``eth0`` or similar), by running
1325
``ip a del $clusterip/32 dev xen-br0`` (use the correct cluster IP and
1326
network device name).
1327

    
1328
At this point, the machines are ready for a cluster creation; in case
1329
you want to remove Ganeti completely, you need to also undo some of the
SSH changes and remove the log directories:

- ``rm -rf /var/log/ganeti /srv/ganeti`` (replace with the correct
  paths)
- remove from ``/root/.ssh`` the keys that Ganeti added (check the
  ``authorized_keys`` and ``id_dsa`` files)
- regenerate the host's SSH keys (check the OpenSSH startup scripts)
- uninstall Ganeti
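
A rough sketch of the log and host key cleanup; the ``ssh-keygen -A``
call is an assumption that works with reasonably recent OpenSSH
releases, and the ``authorized_keys``/``id_dsa`` cleanup still has to
be done by hand::

  $ rm -rf /var/log/ganeti /srv/ganeti
  $ rm -f /etc/ssh/ssh_host_*key*
  $ ssh-keygen -A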

Otherwise, if you plan to re-create the cluster, you can just go ahead
and rerun ``gnt-cluster init``.

Monitoring the cluster
----------------------

Starting with Ganeti 2.8, a monitoring daemon is available, providing
information about the status and the performance of the system.

The monitoring daemon runs on every node, listening on TCP port 1815.
Each instance of the daemon provides information related to the node it
is running on.
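
A quick way to check that the daemon answers is to query it over HTTP
from any machine that can reach the node; the ``/1/report/all`` path
used here is an assumption, so refer to the query format described
below for the authoritative interface::

  $ curl http://%node1%:1815/1/report/all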

.. include:: monitoring-query-format.rst

Tags handling
-------------

The tags handling (addition, removal, listing) is similar for all the
objects that support it (instances, nodes, and the cluster).

Limitations
+++++++++++

Note that the set of characters allowed in a tag and the maximum tag
length are restricted. Currently the maximum length is 128 characters,
there can be at most 4096 tags per object, and the allowed characters
are alphanumerics plus ``.+*/:@-``.

Operations
++++++++++

Tags can be added via ``add-tags``::

  $ gnt-instance add-tags %INSTANCE% %a% %b% %c%
  $ gnt-node add-tags %NODE% %a% %b% %c%
  $ gnt-cluster add-tags %a% %b% %c%

The above commands add three tags to an instance, to a node and to the
cluster. Note that the cluster command only takes tags as arguments,
whereas the node and instance commands first require the node and
instance name.

Tags can also be added from a file, via the ``--from=FILENAME``
argument. The file is expected to contain one tag per line.
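
For example (``/tmp/tags.txt`` being a hypothetical file with one tag
per line)::

  $ gnt-instance add-tags --from=/tmp/tags.txt %INSTANCE%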

Tags can also be removed via a syntax very similar to the add one::

  $ gnt-instance remove-tags %INSTANCE% %a% %b% %c%

And listed via::

  $ gnt-instance list-tags
  $ gnt-node list-tags
  $ gnt-cluster list-tags

Global tag search
+++++++++++++++++

It is also possible to execute a global search on all the tags defined
in the cluster configuration, via a cluster command::

  $ gnt-cluster search-tags %REGEXP%

The parameter expected is a regular expression (see
:manpage:`regex(7)`). This will return all tags that match the search,
together with the object they are defined in (the names being shown in
a hierarchical way)::

  $ gnt-cluster search-tags %o%
  /cluster foo
  /instances/instance1 owner:bar

Job operations
--------------

The various jobs submitted by the instance/node/cluster commands can be
examined, canceled and archived by various invocations of the
``gnt-job`` command.

First is the job list command::

  $ gnt-job list
  17771 success INSTANCE_QUERY_DATA
  17773 success CLUSTER_VERIFY_DISKS
  17775 success CLUSTER_REPAIR_DISK_SIZES
  17776 error   CLUSTER_RENAME(cluster.example.com)
  17780 success CLUSTER_REDIST_CONF
  17792 success INSTANCE_REBOOT(instance1.example.com)

More detailed information about a job can be found via the ``info``
command::

  $ gnt-job info %17776%
  Job ID: 17776
    Status: error
    Received:         2009-10-25 23:18:02.180569
    Processing start: 2009-10-25 23:18:02.200335 (delta 0.019766s)
    Processing end:   2009-10-25 23:18:02.279743 (delta 0.079408s)
    Total processing time: 0.099174 seconds
    Opcodes:
      OP_CLUSTER_RENAME
        Status: error
        Processing start: 2009-10-25 23:18:02.200335
        Processing end:   2009-10-25 23:18:02.252282
        Input fields:
          name: cluster.example.com
        Result:
          OpPrereqError
          [Neither the name nor the IP address of the cluster has changed]
        Execution log:

During the execution of a job, it's possible to follow its output,
similar to the log that one gets from the ``gnt-`` commands, via the
watch command::

  $ gnt-instance add --submit … %instance1%
  JobID: 17818
  $ gnt-job watch %17818%
  Output from job 17818 follows
  -----------------------------
  Mon Oct 26 00:22:48 2009  - INFO: Selected nodes for instance instance1 via iallocator dumb: node1, node2
  Mon Oct 26 00:22:49 2009 * creating instance disks...
  Mon Oct 26 00:22:52 2009 adding instance instance1 to cluster config
  Mon Oct 26 00:22:52 2009  - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 00:23:03 2009 creating os for instance instance1 on node node1
  Mon Oct 26 00:23:03 2009 * running the instance OS create scripts...
  Mon Oct 26 00:23:13 2009 * starting instance...
  $

This is useful if you need to follow a job's progress from multiple
terminals.

A job that has not yet started to run can be canceled::

  $ gnt-job cancel %17810%

But not one that has already started execution::

  $ gnt-job cancel %17805%
  Job 17805 is no longer waiting in the queue

There are two queues for jobs: the *current* and the *archive*
queue. Jobs are initially submitted to the current queue, and they stay
in that queue until they have finished execution (either successfully
or not). At that point, they can be moved into the archive queue using
e.g. ``gnt-job autoarchive all``. The ``ganeti-watcher`` script will do
this automatically 6 hours after a job is finished. The
``ganeti-cleaner`` script will then remove the archived jobs from the
archive directory after three weeks.

Note that ``gnt-job list`` only shows jobs in the current queue.
Archived jobs can be viewed using ``gnt-job info <id>``.
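
For example, finished jobs can be archived right away instead of
waiting for the watcher, and then still inspected by ID::

  $ gnt-job autoarchive all
  $ gnt-job info %17776%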

Special Ganeti deployments
--------------------------

Since Ganeti 2.4, it is possible to extend the Ganeti deployment with
two custom scenarios: Ganeti inside Ganeti and a multi-site model.

Running Ganeti under Ganeti
+++++++++++++++++++++++++++

It is sometimes useful to be able to use a Ganeti instance as a Ganeti
node (part of another cluster, usually). One example scenario is two
small clusters, where we want to have an additional master candidate
that holds the cluster configuration and can be used for helping with
the master voting process.

However, these Ganeti instances should not host instances themselves,
and should not be considered in the normal capacity planning,
evacuation strategies, etc. In order to accomplish this, mark these
nodes as non-``vm_capable``::

  $ gnt-node modify --vm-capable=no %node3%

The vm_capable status can be listed as usual via ``gnt-node list``::

  $ gnt-node list -oname,vm_capable
  Node  VMCapable
  node1 Y
  node2 Y
  node3 N

When this flag is disabled, the cluster will not perform any operations
that relate to instances on such nodes, e.g. hypervisor operations,
disk-related operations, etc. Basically, such nodes will just keep the
ssconf files and, if they are master candidates, the full
configuration.

Multi-site model
++++++++++++++++

If Ganeti is deployed in a multi-site model, with each site being a
node group (so that instances are not relocated across the WAN by
mistake), it is conceivable that either the WAN latency is high or that
some sites have a lower reliability than others. In this case, it
doesn't make sense to replicate the job information across all sites
(or even outside of a “central” node group), so it should be possible
to restrict which nodes can become master candidates via the
auto-promotion algorithm.

Ganeti 2.4 introduces for this purpose a new ``master_capable`` flag,
which (when unset) prevents nodes from being marked as master
candidates, either manually or automatically.

As usual, the node modify operation can change this flag::

  $ gnt-node modify --auto-promote --master-capable=no %node3%
  Fri Jan  7 06:23:07 2011  - INFO: Demoting from master candidate
  Fri Jan  7 06:23:08 2011  - INFO: Promoted nodes to master candidate role: node4
  Modified node node3
   - master_capable -> False
   - master_candidate -> False

And the node list operation will list this flag::

  $ gnt-node list -oname,master_capable %node1% %node2% %node3%
  Node  MasterCapable
  node1 Y
  node2 Y
  node3 N

Note that marking a node both not ``vm_capable`` and not
``master_capable`` makes the node practically unusable from Ganeti's
point of view. Hence these two flags should probably be used in a
complementary way: some nodes will be only master candidates
(master_capable but not vm_capable), and other nodes will only hold
instances (vm_capable but not master_capable).

Ganeti tools
------------

Besides the usual ``gnt-`` and ``ganeti-`` commands which are provided
and installed in ``$prefix/sbin`` at install time, there are a couple
of other tools installed which are seldom used but can be helpful in
some cases.

lvmstrap
++++++++

The ``lvmstrap`` tool, introduced in the :ref:`configure-lvm-label`
section, has two modes of operation:

- ``diskinfo`` shows the discovered disks on the system and their
  status
- ``create`` takes all not-in-use disks and creates a volume group out
  of them

.. warning:: The ``create`` argument to this command causes data loss!
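
The read-only mode can be run at any time to inspect what the tool
sees; for example (if ``lvmstrap`` is not on your ``PATH``, look for it
under the Ganeti tools directory of your installation)::

  $ lvmstrap diskinfo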

cfgupgrade
++++++++++

The ``cfgupgrade`` tool is used to upgrade between major (and minor)
Ganeti versions, and to roll back. Point-releases are usually
transparent for the admin.

More information about the upgrade procedure is listed on the wiki at
http://code.google.com/p/ganeti/wiki/UpgradeNotes.

There is also a script designed to upgrade from Ganeti 1.2 to 2.0,
called ``cfgupgrade12``.

cfgshell
++++++++

.. note:: This command is not actively maintained; make sure you back
   up your configuration before using it.

This can be used as an alternative to direct editing of the
main configuration file if Ganeti has a bug and prevents you, for
example, from removing an instance or a node from the configuration
file.

.. _burnin-label:

burnin
++++++

.. warning:: This command will erase existing instances if given as
   arguments!

This tool is used to exercise either the hardware of the machines or,
alternatively, the Ganeti software. It is safe to run on an existing
cluster **as long as you don't pass it existing instance names**.

The command will, by default, execute a comprehensive set of operations
against a list of instances, these being:

- creation
- disk replacement (for redundant instances)
- failover and migration (for redundant instances)
- move (for non-redundant instances)
- disk growth
- add disks, remove disks
- add NICs, remove NICs
- export and then import
- rename
- reboot
- shutdown/startup
- and finally removal of the test instances

Executing all these operations will test that the hardware performs
well: the creation, disk replace, disk add and disk growth will
exercise the storage and network; the migrate command will test the
memory of the systems. Depending on the passed options, it can also
test that the instance OS definitions properly execute the rename,
import and export operations.
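
A typical invocation looks like the following; the tool's path and the
OS name are assumptions depending on your installation prefix and the
OS definitions you have installed, and the instance names must not
already exist::

  $ /usr/lib/ganeti/tools/burnin -o %debootstrap% -p %instance1% %instance2% %instance3%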

sanitize-config
+++++++++++++++

This tool takes the Ganeti configuration and outputs a "sanitized"
version, by randomizing or clearing:

- DRBD secrets and cluster public key (always)
- host names (optional)
- IPs (optional)
- OS names (optional)
- LV names (optional, only useful for very old clusters which still
  have instances whose LVs are based on the instance name)

By default, all optional items are activated except the LV name
randomization. When passing ``--no-randomization``, which disables the
optional items (i.e. just the DRBD secrets and cluster public keys are
randomized), the resulting file can be used as a safety copy of the
cluster config: while not trivial, the layout of the cluster can be
recreated from it, and if the instance disks have not been lost, it
permits recovery from the loss of all master candidates.

move-instance
+++++++++++++

See :doc:`separate documentation for move-instance <move-instance>`.

users-setup
+++++++++++

Ganeti can either be run entirely as root, or with every daemon running
as its own specific user (if the parameters ``--with-user-prefix``
and/or ``--with-group-prefix`` have been specified at
``./configure``-time).

In case split users are activated, they are required to exist on the
system, and they need to belong to the proper groups in order for the
access permissions to files and programs to be correct.

The ``users-setup`` tool, when run, takes care of setting up the proper
users and groups.

The tool does not accept any parameters and requires root permissions
to run.
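
Running it is therefore just the following; the path is an assumption,
as the script is installed under the Ganeti tools directory of your
installation::

  $ /usr/lib/ganeti/tools/users-setup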

.. TODO: document cluster-merge tool

Other Ganeti projects
---------------------

Below is a list (which might not be up-to-date) of additional projects
that can be useful in a Ganeti deployment. They can be downloaded from
the project site (http://code.google.com/p/ganeti/) and the
repositories are also on the project git site (http://git.ganeti.org).

NBMA tools
++++++++++

The ``ganeti-nbma`` software is designed to allow instances to live on
a separate, virtual network from the nodes, and in an environment where
nodes are not guaranteed to be able to reach each other via
multicasting or broadcasting. For more information see the README in
the source archive.

ganeti-htools
+++++++++++++

Before Ganeti version 2.5, this was a standalone project; since that
version it is integrated into the Ganeti codebase (see
:doc:`install-quick` for instructions on how to enable it). If you run
an older Ganeti version, you will have to download and build it
separately.

For more information and installation instructions, see the README file
in the source archive.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: