Ganeti administrator's guide
============================

Documents Ganeti version |version|

.. contents::

.. highlight:: text

Introduction
------------

Ganeti is virtualization cluster management software. You are expected
to be a system administrator familiar with your Linux distribution and
the Xen or KVM virtualization environments before using it.

The various components of Ganeti all have man pages and interactive
help. This manual, though, will help you get familiar with the system
by explaining the most common operations, grouped by related use.

After a terminology glossary and a section on the prerequisites needed
to use this manual, the rest of this document is divided into sections
for the different targets that a command affects: instances, nodes, etc.

.. _terminology-label:

Ganeti terminology
++++++++++++++++++

This section provides a small introduction to Ganeti terminology, which
might be useful when reading the rest of the document.

Cluster
~~~~~~~

A set of machines (nodes) that cooperate to offer a coherent, highly
available virtualization service under a single administration domain.

Node
~~~~

A physical machine which is a member of a cluster. Nodes are the basic
cluster infrastructure, and they don't need to be fault tolerant in
order to achieve high availability for instances.

Nodes can be added and removed (if they host no instances) at will from
the cluster. In a HA cluster and only with HA instances, the loss of any
single node will not cause disk data loss for any instance; of course,
a node crash will cause the crash of its primary instances.

A node belonging to a cluster can be in one of the following roles at a
given time:

- *master* node, which is the node from which the cluster is controlled
- *master candidate* node, only nodes in this role have the full cluster
  configuration and knowledge, and only master candidates can become the
  master node
- *regular* node, which is the state in which most nodes will be on
  bigger clusters (>20 nodes)
- *drained* node, nodes in this state are functioning normally but they
  cannot receive new instances; the intention is that nodes in this role
  have some issue and they are being evacuated for hardware repairs
- *offline* node, in which there is a record in the cluster
  configuration about the node, but the daemons on the master node will
  not talk to this node; any instances declared as having an offline
  node as either primary or secondary will be flagged as an error in the
  cluster verify operation

Depending on the role, each node will run a set of daemons:

- the :command:`ganeti-noded` daemon, which controls the manipulation of
  this node's hardware resources; it runs on all nodes which are in a
  cluster
- the :command:`ganeti-confd` daemon (Ganeti 2.1+) which runs on all
  nodes, but is only functional on master candidate nodes
- the :command:`ganeti-rapi` daemon which runs on the master node and
  offers an HTTP-based API for the cluster
- the :command:`ganeti-masterd` daemon which runs on the master node and
  allows control of the cluster

Instance
~~~~~~~~

A virtual machine which runs on a cluster. It can be a fault tolerant,
highly available entity.

An instance has various parameters, which are classified in three
categories: hypervisor-related parameters (called ``hvparams``), general
parameters (called ``beparams``) and per-network-card parameters (called
``nicparams``). All these parameters can be modified either at instance
level or via defaults at cluster level.

Disk template
~~~~~~~~~~~~~

There are multiple options for the storage provided to an instance;
while the instance sees the same virtual drive in all cases, the
node-level configuration varies between them.

There are four disk templates you can choose from:

diskless
  The instance has no disks. Only used for special purpose operating
  systems or for testing.

file
  The instance will use plain files as backend for its disks. No
  redundancy is provided, and this is somewhat more difficult to
  configure for high performance.

plain
  The instance will use LVM devices as backend for its disks. No
  redundancy is provided.

drbd
  .. note:: This is only valid for multi-node clusters using DRBD 8.0+

  A mirror is set between the local node and a remote one, which must be
  specified with the second value of the --node option. Use this option
  to obtain a highly available instance that can be failed over to a
  remote node should the primary one fail.

IAllocator
~~~~~~~~~~

A framework for using external (user-provided) scripts to compute the
placement of instances on the cluster nodes. This eliminates the need to
manually specify nodes in instance add, instance moves, node evacuate,
etc.

In order for Ganeti to be able to use these scripts, they must be placed
in the iallocator directory (usually ``lib/ganeti/iallocators`` under
the installation prefix, e.g. ``/usr/local``).

“Primary” and “secondary” concepts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An instance has a primary node and, depending on the disk
configuration, might also have a secondary node. The instance always
runs on the primary node and only uses its secondary node for disk
replication.

Similarly, the terms primary and secondary instances, when talking
about a node, refer to the set of instances having the given node as
primary, respectively secondary.

Tags
~~~~

Tags are short strings that can be attached either to the cluster
itself, or to nodes or instances. They are useful as a very simplistic
information store for helping with cluster administration, for example
by attaching owner information to each instance after it's created::

  gnt-instance add … instance1
  gnt-instance add-tags instance1 owner:user2

And then by listing each instance and its tags, this information could
be used for contacting the users of each instance.
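
For example, assuming ``tags`` is among the output fields supported by
your version's ``-o`` option (and with made-up instance and owner
names), such a listing could look like::

  node1# gnt-instance list -o name,tags
  Instance  Tags
  instance1 owner:user2
  instance2 owner:user3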

Jobs and OpCodes
~~~~~~~~~~~~~~~~

While not directly visible to an end-user, it's useful to know that a
basic cluster operation (e.g. starting an instance) is represented
internally by Ganeti as an *OpCode* (abbreviation from operation
code). These OpCodes are executed as part of a *Job*. The OpCodes in a
single Job are processed serially by Ganeti, but different Jobs will be
processed (depending on resource availability) in parallel.

For example, shutting down the entire cluster can be done by running the
command ``gnt-instance shutdown --all``, which will submit for each
instance a separate job containing the “shutdown instance” OpCode.
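
As an illustration, after such a bulk shutdown the queue would contain
one job per instance; these can be inspected with the ``gnt-job``
commands described later in this document (the job IDs and instance
names below are made up, and the exact job summaries may differ)::

  node1# gnt-job list
  183 success INSTANCE_SHUTDOWN(instance1.example.com)
  184 success INSTANCE_SHUTDOWN(instance2.example.com)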


Prerequisites
+++++++++++++

You need to have your Ganeti cluster installed and configured before you
try any of the commands in this document. Please follow the
:doc:`install` document for instructions on how to do that.

Instance management
-------------------

Adding an instance
++++++++++++++++++

The add operation might seem complex due to the many parameters it
accepts, but once you have understood the (few) required parameters and
the customisation capabilities you will see it is an easy operation.

The add operation requires at minimum five parameters:

- the OS for the instance
- the disk template
- the disk count and size
- the node specification or alternatively the iallocator to use
- and finally the instance name

The OS for the instance must be visible in the output of the command
``gnt-os list`` and specifies which guest OS to install on the instance.

The disk template specifies what kind of storage to use as backend for
the (virtual) disks presented to the instance; note that for instances
with multiple virtual disks, they all must be of the same type.

The node(s) on which the instance will run can be given either manually,
via the ``-n`` option, or computed automatically by Ganeti, if you have
installed any iallocator script.

With the above parameters in mind, the command is::

  gnt-instance add \
    -n TARGET_NODE:SECONDARY_NODE \
    -o OS_TYPE \
    -t DISK_TEMPLATE -s DISK_SIZE \
    INSTANCE_NAME

The instance name must be resolvable (e.g. exist in DNS) and usually
points to an address in the same subnet as the cluster itself.

The above command has the minimum required options; other options you
can give include, among others:

- The memory size (``-B memory``)

- The number of virtual CPUs (``-B vcpus``)

- Arguments for the NICs of the instance; by default, a single-NIC
  instance is created. The IP and/or bridge of the NIC can be changed
  via ``--nic 0:ip=IP,bridge=BRIDGE``

See the manpage for gnt-instance for the detailed option list.
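
For example, a hypothetical invocation combining the options above
(the values shown are illustrative only) could look like::

  gnt-instance add \
    -n TARGET_NODE:SECONDARY_NODE \
    -o OS_TYPE \
    -t DISK_TEMPLATE -s DISK_SIZE \
    -B memory=512,vcpus=2 \
    --nic 0:ip=192.0.2.10,bridge=xen-br0 \
    INSTANCE_NAME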

For example if you want to create a highly available instance, with a
single disk of 50GB and the default memory size, having primary node
``node1`` and secondary node ``node3``, use the following command::

  gnt-instance add -n node1:node3 -o debootstrap -t drbd -s 50G \
    instance1

There is also a command for batch instance creation from a
specification file, see the ``batch-create`` operation in the
gnt-instance manual page.

Regular instance operations
+++++++++++++++++++++++++++

Removal
~~~~~~~

Removing an instance is even easier than creating one. This operation is
irreversible and destroys all the contents of your instance. Use with
care::

  gnt-instance remove INSTANCE_NAME

Startup/shutdown
~~~~~~~~~~~~~~~~

Instances are automatically started at instance creation time. To
manually start one which is currently stopped you can run::

  gnt-instance startup INSTANCE_NAME

While the command to stop one is::

  gnt-instance shutdown INSTANCE_NAME

.. warning:: Do not use the Xen or KVM commands directly to stop
   instances. If you run for example ``xm shutdown`` or ``xm destroy``
   on an instance, Ganeti will automatically restart it (via the
   :command:`ganeti-watcher` command which is launched via cron).

Querying instances
~~~~~~~~~~~~~~~~~~

There are two ways to get information about instances: listing
instances, which does a tabular output containing a given set of fields
about each instance, and querying detailed information about a set of
instances.

The command to see all the instances configured and their status is::

  gnt-instance list

The command can return a custom set of information when using the ``-o``
option (as always, check the manpage for a detailed specification). Each
instance will be represented on a line, thus making it easy to parse
this output via the usual shell utilities (grep, sed, etc.).
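
For example, assuming the field names ``status``, ``pnode`` and
``snodes`` are valid for your Ganeti version (the manpage has the
authoritative list), the placement of all instances can be shown with::

  gnt-instance list -o name,status,pnode,snodes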

To get more detailed information about an instance, you can run::

  gnt-instance info INSTANCE

which will give a multi-line block of information about the instance,
its hardware resources (especially its disks and their redundancy
status), etc. This is harder to parse and is more expensive than the
list operation, but returns much more detailed information.


Export/Import
+++++++++++++

You can create a snapshot of an instance disk and its Ganeti
configuration, which then you can backup, or import into another
cluster. The way to export an instance is::

  gnt-backup export -n TARGET_NODE INSTANCE_NAME


The target node can be any node in the cluster with enough space under
``/srv/ganeti`` to hold the instance image. Use the ``--noshutdown``
option to snapshot an instance without rebooting it. Note that Ganeti
only keeps one snapshot for an instance - any previous snapshot of the
same instance existing cluster-wide under ``/srv/ganeti`` will be
removed by this operation: if you want to keep them, you need to move
them out of the Ganeti exports directory.

Importing an instance is similar to creating a new one, but additionally
one must specify the location of the snapshot. The command is::

  gnt-backup import -n TARGET_NODE -t DISK_TEMPLATE \
    --src-node=NODE --src-dir=DIR INSTANCE_NAME

Most of the options available for the command :command:`gnt-instance
add` are supported here too.
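
As an illustration, a full export/import round trip could look roughly
like the following; the node and instance names are made up, and the
export directory shown is the usual location under ``/srv/ganeti``, so
verify the actual path on your installation::

  # on the source cluster
  gnt-backup export -n node1 instance1
  # on the destination cluster, after making the export available on
  # one of its nodes (e.g. via rsync)
  gnt-backup import -n nodeA -t plain \
    --src-node=nodeA --src-dir=/srv/ganeti/export/instance1.example.com \
    instance1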

Instance HA features
--------------------

.. note:: This section only applies to multi-node clusters

.. _instance-change-primary-label:

Changing the primary node
+++++++++++++++++++++++++

There are three ways to exchange an instance's primary and secondary
nodes; the right one to choose depends on how the instance has been
created and the status of its current primary node. See
:ref:`rest-redundancy-label` for information on changing the secondary
node. Note that it's only possible to change the primary node to the
secondary and vice-versa; a direct change of the primary node with a
third node, while keeping the current secondary, is not possible in a
single step, only via multiple operations as detailed in
:ref:`instance-relocation-label`.

Failing over an instance
~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode you can at any time
fail it over to its secondary node, even if the primary has somehow
failed and it's not up anymore. Doing it is really easy, on the master
node you can just run::

  gnt-instance failover INSTANCE_NAME

That's it. After the command completes the secondary node is now the
primary, and vice-versa.

Live migrating an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~

If an instance is built in highly available mode, it is currently
running and both its nodes are running fine, you can migrate it over to
its secondary node, without downtime. On the master node you need to
run::

  gnt-instance migrate INSTANCE_NAME

The current load on the instance and its memory size will influence how
long the migration will take. In any case, for both KVM and Xen
hypervisors, the migration will be transparent to the instance.

Moving an instance (offline)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If an instance has not been created as mirrored, then the only way to
change its primary node is to execute the move command::

  gnt-instance move -n NEW_NODE INSTANCE

This has a few prerequisites:

- the instance must be stopped
- its current primary node must be on-line and healthy
- the disks of the instance must not have any errors

Since this operation actually copies the data from the old node to the
new node, expect it to take a time proportional to the size of the
instance's disks and the speed of both the nodes' I/O system and their
networking.

Disk operations
+++++++++++++++

Disk failures are a common cause of errors in any server
deployment. Ganeti offers protection from single-node failure if your
instances were created in HA mode, and it also offers ways to restore
redundancy after a failure.

Preparing for disk operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is important to note that for Ganeti to be able to do any disk
operation, the Linux machines on top of which Ganeti runs must be
consistent; for LVM, this means that the LVM commands must not return
failures; it is common that after a complete disk failure, any LVM
command aborts with an error similar to::

  # vgs
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output
  error
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  Couldn't find device with uuid
  't30jmN-4Rcf-Fr5e-CURS-pawt-z0jU-m1TgeJ'.
  Couldn't find all physical volumes for volume group xenvg.

Before restoring an instance's disks to healthy status, it is necessary
to fix the volume group used by Ganeti so that we can actually create
and manage the logical volumes. This is usually done in a multi-step
process:

#. first, if the disk is completely gone and LVM commands exit with
   “Couldn't find device with uuid…” then you need to run the command::

    vgreduce --removemissing VOLUME_GROUP

#. after the above command, the LVM commands should be executing
   normally (warnings are normal, but the commands will not fail
   completely).

#. if the failed disk is still visible in the output of the ``pvs``
   command, you need to deactivate it from allocations by running::

    pvchange -x n /dev/DISK

At this point, the volume group should be consistent and any bad
physical volumes should no longer be available for allocation.

Note that since version 2.1 Ganeti provides some commands to automate
these two operations, see :ref:`storage-units-label`.

.. _rest-redundancy-label:

Restoring redundancy for DRBD-based instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A DRBD instance has two nodes, and the storage on one of them has
failed. Depending on which node (primary or secondary) has failed, you
have three options at hand:

- if the storage on the primary node has failed, you need to re-create
  the disks on it
- if the storage on the secondary node has failed, you can either
  re-create the disks on it or change the secondary and recreate
  redundancy on the new secondary node

Of course, at any point it's possible to force re-creation of disks even
though everything is already fine.

For all three cases, the ``replace-disks`` operation can be used::

  # re-create disks on the primary node
  gnt-instance replace-disks -p INSTANCE_NAME
  # re-create disks on the current secondary
  gnt-instance replace-disks -s INSTANCE_NAME
  # change the secondary node, via manual specification
  gnt-instance replace-disks -n NODE INSTANCE_NAME
  # change the secondary node, via an iallocator script
  gnt-instance replace-disks -I SCRIPT INSTANCE_NAME
  # since Ganeti 2.1: automatically fix the primary or secondary node
  gnt-instance replace-disks -a INSTANCE_NAME

Since the process involves copying all data from the working node to the
target node, it will take a while, depending on the instance's disk
size, node I/O system and network speed. But it is (barring any network
interruption) completely transparent for the instance.

Re-creating disks for non-redundant instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. versionadded:: 2.1

For non-redundant instances, there isn't a copy (except backups) from
which to re-create the disks. But it's possible to at least re-create
empty disks, after which a reinstall can be run, via the
``recreate-disks`` command::

  gnt-instance recreate-disks INSTANCE

Note that this will fail if the disks already exist.
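
For example, a possible recovery sequence for a plain (non-redundant)
instance whose disks were lost could be the following sketch; note that
the reinstall step wipes any remaining instance data, so only use it
when the data is indeed gone::

  gnt-instance recreate-disks instance1
  gnt-instance reinstall instance1
  gnt-instance startup instance1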

Debugging instances
+++++++++++++++++++

Accessing an instance's disks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From an instance's primary node you can have access to its disks. Never
ever mount the underlying logical volume manually on a fault tolerant
instance, or you will break replication and your data will be
inconsistent. The correct way to access an instance's disks is to run
(on the master node, as usual) the command::

  gnt-instance activate-disks INSTANCE

And then, *on the primary node of the instance*, access the device that
gets created. For example, you could mount the given disks, then edit
files on the filesystem, etc.

Note that with partitioned disks (as opposed to whole-disk filesystems),
you will need to use a tool like :manpage:`kpartx(8)`::

  node1# gnt-instance activate-disks instance1
  node1# ssh node3
  node3# kpartx -l /dev/…
  node3# kpartx -a /dev/…
  node3# mount /dev/mapper/… /mnt/
  # edit files under mnt as desired
  node3# umount /mnt/
  node3# kpartx -d /dev/…
  node3# exit
  node1#

After you've finished you can deactivate them with the deactivate-disks
command, which works in the same way::

  gnt-instance deactivate-disks INSTANCE

Note that if any process started by you is still using the disks, the
above command will error out, and you **must** clean up and ensure that
the above command runs successfully before you start the instance,
otherwise the instance will suffer corruption.

Accessing an instance's console
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The command to access a running instance's console is::

  gnt-instance console INSTANCE_NAME

Use the console normally and then type ``^]`` when done, to exit.

Other instance operations
+++++++++++++++++++++++++

Reboot
~~~~~~

There is a wrapper command for rebooting instances::

  gnt-instance reboot instance2

By default, this does the equivalent of shutting down and then starting
the instance, but it accepts parameters to perform a soft reboot (via
the hypervisor), a hard reboot (hypervisor shutdown and then startup) or
a full one (the default, which also de-configures and then configures
again the disks of the instance).
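
For example, the reboot type can be selected explicitly; the option used
below is ``--type``, but check the gnt-instance manpage for your
version::

  # soft reboot, via the hypervisor only
  gnt-instance reboot --type=soft instance2
  # hard reboot: hypervisor shutdown followed by startup
  gnt-instance reboot --type=hard instance2
  # full reboot (the default)
  gnt-instance reboot --type=full instance2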

Instance OS definitions debugging
+++++++++++++++++++++++++++++++++

Should you have any problems with instance operating systems, the
command to see a complete status for all your nodes is::

   gnt-os diagnose

.. _instance-relocation-label:

Instance relocation
~~~~~~~~~~~~~~~~~~~

While it is not possible to move an instance from nodes ``(A, B)`` to
nodes ``(C, D)`` in a single move, it is possible to do so in a few
steps::

  # instance is located on A, B
  node1# gnt-instance replace-disks -n nodeC instance1
  # instance has moved from (A, B) to (A, C)
  # we now flip the primary/secondary nodes
  node1# gnt-instance migrate instance1
  # instance lives on (C, A)
  # we can then change A to D via:
  node1# gnt-instance replace-disks -n nodeD instance1

Which brings it into the final configuration of ``(C, D)``. Note that we
needed to do two replace-disks operations (two copies of the instance
disks), because we needed to get rid of both the original nodes (A and
B).

Node operations
---------------

There are far fewer node operations available than for instances, but
they are equally important for maintaining a healthy cluster.

Add/readd
+++++++++

It is at any time possible to extend the cluster with one more node, by
using the node add operation::

  gnt-node add NEW_NODE

If the cluster has a replication network defined, then you need to pass
the ``-s REPLICATION_IP`` parameter to this command.

A variation of this command can be used to re-configure a node if its
Ganeti configuration is broken, for example if it has been reinstalled
by mistake::

  gnt-node add --readd EXISTING_NODE

This will reinitialise the node as if it had been newly added, but while
keeping its existing configuration in the cluster (primary/secondary IP,
etc.), in other words you won't need to use ``-s`` here.

Changing the node role
++++++++++++++++++++++

A node can be in different roles, as explained in the
:ref:`terminology-label` section. Promoting a node to the master role is
special, while the other roles are all handled via a single command.

Failing over the master node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you want to promote a different node to the master role (for whatever
reason), run on any other master-candidate node the command::

  gnt-cluster masterfailover

and the node you ran it on is now the new master. In case you try to run
this on a non master-candidate node, you will get an error telling you
which nodes are valid.

Changing between the other roles
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``gnt-node modify`` command can be used to select a new role::

  # change to master candidate
  gnt-node modify -C yes NODE
  # change to drained status
  gnt-node modify -D yes NODE
  # change to offline status
  gnt-node modify -O yes NODE
  # change to regular mode (reset all flags)
  gnt-node modify -O no -D no -C no NODE

Note that the cluster requires that at any point in time, a certain
number of nodes are master candidates, so changing from master candidate
to other roles might fail. It is recommended to either force the
operation (via the ``--force`` option) or first change the number of
master candidates in the cluster - see :ref:`cluster-config-label`.

Evacuating nodes
++++++++++++++++

There are two steps to moving instances off a node:

- moving the primary instances (actually converting them into secondary
  instances)
- moving the secondary instances (including any instances converted in
  the step above)

Primary instance conversion
~~~~~~~~~~~~~~~~~~~~~~~~~~~

For this step, you can use either individual instance move
commands (as seen in :ref:`instance-change-primary-label`) or the bulk
per-node versions; these are::

  gnt-node migrate NODE
  gnt-node evacuate NODE

Note that the instance “move” command doesn't currently have a node
equivalent.

Both these commands, or the equivalent per-instance command, will make
this node the secondary node for the respective instances, whereas their
current secondary node will become primary. Note that it is not possible
to change in one step the primary node to another node as primary, while
keeping the same secondary node.

Secondary instance evacuation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For the evacuation of secondary instances, a command called
:command:`gnt-node evacuate` is provided and its syntax is::

  gnt-node evacuate -I IALLOCATOR_SCRIPT NODE
  gnt-node evacuate -n DESTINATION_NODE NODE

The first version will compute the new secondary for each instance in
turn using the given iallocator script, whereas the second one will
simply move all instances to DESTINATION_NODE.

Removal
+++++++

Once a node no longer has any instances (neither primary nor secondary),
it's easy to remove it from the cluster::

  gnt-node remove NODE_NAME

This will deconfigure the node, stop the ganeti daemons on it and leave
it, hopefully, in the state it was in before it joined the cluster.

Storage handling
++++++++++++++++

When using LVM (either standalone or with DRBD), it can become tedious
to debug and fix it in case of errors. Furthermore, even file-based
storage can become complicated to handle manually on many hosts. Ganeti
provides a couple of commands to help with automation.

Logical volumes
~~~~~~~~~~~~~~~

This is a command specific to LVM handling. It allows listing the
logical volumes on a given node or on all nodes and their association to
instances via the ``volumes`` command::

  node1# gnt-node volumes
  Node  PhysDev   VG    Name             Size Instance
  node1 /dev/sdb1 xenvg e61fbc97-….disk0 512M instance17
  node1 /dev/sdb1 xenvg ebd1a7d1-….disk0 512M instance19
  node2 /dev/sdb1 xenvg 0af08a3d-….disk0 512M instance20
  node2 /dev/sdb1 xenvg cc012285-….disk0 512M instance16
  node2 /dev/sdb1 xenvg f0fac192-….disk0 512M instance18

The above command maps each logical volume to a volume group and
underlying physical volume and (possibly) to an instance.

.. _storage-units-label:

Generalized storage handling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. versionadded:: 2.1

Starting with Ganeti 2.1, a new storage framework has been implemented
that tries to abstract the handling of the storage type the cluster
uses.

First is listing the backend storage units and their space situation::

  node1# gnt-node list-storage
  Node  Name        Size Used   Free
  node1 /dev/sda7 673.8G   0M 673.8G
  node1 /dev/sdb1 698.6G 1.5G 697.1G
  node2 /dev/sda7 673.8G   0M 673.8G
  node2 /dev/sdb1 698.6G 1.0G 697.6G

The default is to list LVM physical volumes. It's also possible to list
the LVM volume groups::

  node1# gnt-node list-storage -t lvm-vg
  Node  Name  Size
  node1 xenvg 1.3T
  node2 xenvg 1.3T

Next is repairing storage units, which is currently only implemented for
volume groups and does the equivalent of ``vgreduce --removemissing``::

  node1# gnt-node repair-storage node2 lvm-vg xenvg
  Sun Oct 25 22:21:45 2009 Repairing storage unit 'xenvg' on node2 ...

Last is the modification of volume properties, which is (again) only
implemented for LVM physical volumes and allows toggling the
``allocatable`` value::

  node1# gnt-node modify-storage --allocatable=no node2 lvm-pv /dev/sdb1

Use of the storage commands
~~~~~~~~~~~~~~~~~~~~~~~~~~~

All these commands are needed when recovering a node from a disk
failure:

- first, we need to recover from complete LVM failure (due to missing
  disk), by running the ``repair-storage`` command
- second, we need to change allocation on any partially-broken disk
  (i.e. LVM still sees it, but it has bad blocks) by running
  ``modify-storage``
- then we can evacuate the instances as needed (a consolidated example
  follows)
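
Putting these together, a possible recovery session after ``/dev/sdb1``
failed on ``node2`` might look like the following sketch (node, device
and volume group names are examples only)::

  # drop the missing physical volume from the volume group
  node1# gnt-node repair-storage node2 lvm-vg xenvg
  # mark a still-visible but failing disk as non-allocatable
  node1# gnt-node modify-storage --allocatable=no node2 lvm-pv /dev/sdb1
  # then move the affected instances away, e.g. to a specific node
  node1# gnt-node evacuate -n node3 node2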


Cluster operations
------------------

Beside the cluster initialisation command (which is detailed in the
:doc:`install` document) and the master failover command which is
explained under node handling, there are a couple of other cluster
operations available.

.. _cluster-config-label:

Standard operations
+++++++++++++++++++

One of the few commands that can be run on any node (not only the
master) is the ``getmaster`` command::

  node2# gnt-cluster getmaster
  node1.example.com
  node2#

It is possible to query and change global cluster parameters via the
``info`` and ``modify`` commands::

  node1# gnt-cluster info
  Cluster name: cluster.example.com
  Cluster UUID: 07805e6f-f0af-4310-95f1-572862ee939c
  Creation time: 2009-09-25 05:04:15
  Modification time: 2009-10-18 22:11:47
  Master node: node1.example.com
  Architecture (this node): 64bit (x86_64)
  Tags: foo
  Default hypervisor: xen-pvm
  Enabled hypervisors: xen-pvm
  Hypervisor parameters:
    - xen-pvm:
        root_path: /dev/sda1
  Cluster parameters:
    - candidate pool size: 10
  Default instance parameters:
    - default:
        memory: 128
  Default nic parameters:
    - default:
        link: xen-br0

The various parameters above can be changed via the ``modify``
commands as follows:

- the hypervisor parameters can be changed via ``modify -H
  xen-pvm:root_path=…``, and so on for other hypervisors/key/values
- the "default instance parameters" are changeable via ``modify -B
  parameter=value…`` syntax
- the cluster parameters are changeable via separate options to the
  modify command (e.g. ``--candidate-pool-size``, etc.)

For a detailed option list see the :manpage:`gnt-cluster(8)` man page.
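
For example, the settings shown in the ``info`` output above could be
adjusted along the following lines (the values are illustrative only)::

  # change a hypervisor parameter
  gnt-cluster modify -H xen-pvm:root_path=/dev/xvda1
  # change a default instance (backend) parameter
  gnt-cluster modify -B memory=256
  # change a cluster parameter
  gnt-cluster modify --candidate-pool-size=10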

The cluster version can be obtained via the ``version`` command::

  node1# gnt-cluster version
  Software version: 2.1.0
  Internode protocol: 20
  Configuration format: 2010000
  OS api version: 15
  Export interface: 0

This is not very useful except when debugging Ganeti.

Global node commands
++++++++++++++++++++

There are two commands provided for replicating files to all nodes of a
cluster and for running commands on all the nodes::

  node1# gnt-cluster copyfile /path/to/file
  node1# gnt-cluster command ls -l /path/to/file

These are simple wrappers over scp/ssh and more advanced usage can be
obtained using :manpage:`dsh(1)` and similar commands. But they are
useful to update an OS script from the master node, for example.
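
For example, after fixing an OS definition on the master node, it could
be pushed out and then verified with something like the following (the
path is just an example; use the location of your actual OS
definitions)::

  node1# gnt-cluster copyfile /srv/ganeti/os/debootstrap/create
  node1# gnt-cluster command md5sum /srv/ganeti/os/debootstrap/create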

Cluster verification
++++++++++++++++++++

There are three commands that relate to global cluster checks. The first
one is ``verify`` which gives an overview on the cluster state,
highlighting any issues. In normal operation, this command should return
no ``ERROR`` messages::

  node1# gnt-cluster verify
  Sun Oct 25 23:08:58 2009 * Verifying global settings
  Sun Oct 25 23:08:58 2009 * Gathering data (2 nodes)
  Sun Oct 25 23:09:00 2009 * Verifying node status
  Sun Oct 25 23:09:00 2009 * Verifying instance status
  Sun Oct 25 23:09:00 2009 * Verifying orphan volumes
  Sun Oct 25 23:09:00 2009 * Verifying remaining instances
  Sun Oct 25 23:09:00 2009 * Verifying N+1 Memory redundancy
  Sun Oct 25 23:09:00 2009 * Other Notes
  Sun Oct 25 23:09:00 2009   - NOTICE: 5 non-redundant instance(s) found.
  Sun Oct 25 23:09:00 2009 * Hooks Results

The second command is ``verify-disks``, which checks that the instances'
disks have the correct status based on the desired instance state
(up/down)::

  node1# gnt-cluster verify-disks

Note that this command will show no output when disks are healthy.

The last command is used to repair any discrepancies between Ganeti's
recorded disk size and the actual disk size (disk size information is
needed for proper activation and growth of DRBD-based disks)::

  node1# gnt-cluster repair-disk-sizes
  Sun Oct 25 23:13:16 2009  - INFO: Disk 0 of instance instance1 has mismatched size, correcting: recorded 512, actual 2048
  Sun Oct 25 23:13:17 2009  - WARNING: Invalid result from node node4, ignoring node results

The above shows one instance with a wrong disk size, and a node which
returned invalid data, and thus we ignored all primary instances of that
node.

Configuration redistribution
++++++++++++++++++++++++++++

If the verify command complains about file mismatches between the master
and other nodes, due to some node problems or if you manually modified
configuration files, you can force a push of the master configuration
to all other nodes via the ``redist-conf`` command::

  node1# gnt-cluster redist-conf
  node1#

This command will be silent unless there are problems sending updates to
the other nodes.


Cluster renaming
++++++++++++++++

It is possible to rename a cluster, or to change its IP address, via the
``rename`` command. If only the IP has changed, you need to pass the
current name and Ganeti will realise its IP has changed::

  node1# gnt-cluster rename cluster.example.com
  This will rename the cluster to 'cluster.example.com'. If
  you are connected over the network to the cluster name, the operation
  is very dangerous as the IP address will be removed from the node and
  the change may not go through. Continue?
  y/[n]/?: y
  Failure: prerequisites not met for this operation:
  Neither the name nor the IP address of the cluster has changed

In the above output, neither value has changed since the cluster
initialisation so the operation is not completed.

Queue operations
++++++++++++++++

The job queue execution in Ganeti 2.0 and higher can be inspected,
suspended and resumed via the ``queue`` command::

  node1~# gnt-cluster queue info
  The drain flag is unset
  node1~# gnt-cluster queue drain
  node1~# gnt-instance stop instance1
  Failed to submit job for instance1: Job queue is drained, refusing job
  node1~# gnt-cluster queue info
  The drain flag is set
  node1~# gnt-cluster queue undrain

This is most useful if you have an active cluster and you need to
upgrade the Ganeti software, or simply restart the software on any node:

#. suspend the queue via ``queue drain``
#. wait until there are no more running jobs via ``gnt-job list``
#. restart the master or another node, or upgrade the software
#. resume the queue via ``queue undrain``

.. note:: this command only stores a local flag file, and if you
   failover the master, it will not have effect on the new master.


Watcher control
+++++++++++++++

The :manpage:`ganeti-watcher` is a program, usually scheduled via
``cron``, that takes care of cluster maintenance operations (restarting
downed instances, activating down DRBD disks, etc.). However, during
maintenance and troubleshooting, this can get in your way; disabling it
by commenting out the cron job is not a good idea, as this can be
forgotten. Thus there are some commands for automated control of the
watcher: ``pause``, ``info`` and ``continue``::

  node1~# gnt-cluster watcher info
  The watcher is not paused.
  node1~# gnt-cluster watcher pause 1h
  The watcher is paused until Mon Oct 26 00:30:37 2009.
  node1~# gnt-cluster watcher info
  The watcher is paused until Mon Oct 26 00:30:37 2009.
  node1~# ganeti-watcher -d
  2009-10-25 23:30:47,984:  pid=28867 ganeti-watcher:486 DEBUG Pause has been set, exiting
  node1~# gnt-cluster watcher continue
  The watcher is no longer paused.
  node1~# ganeti-watcher -d
  2009-10-25 23:31:04,789:  pid=28976 ganeti-watcher:345 DEBUG Archived 0 jobs, left 0
  2009-10-25 23:31:05,884:  pid=28976 ganeti-watcher:280 DEBUG Got data from cluster, writing instance status file
  2009-10-25 23:31:06,061:  pid=28976 ganeti-watcher:150 DEBUG Data didn't change, just touching status file
  node1~# gnt-cluster watcher info
  The watcher is not paused.
  node1~#

The exact details of the argument to the ``pause`` command are available
in the manpage.

.. note:: this command only stores a local flag file, and if you
   failover the master, it will not have effect on the new master.

Removing a cluster entirely
+++++++++++++++++++++++++++

The usual method to clean up a cluster is to run ``gnt-cluster
destroy``; however, if the Ganeti installation is broken in any way then
this will not run.

It is possible in such a case to manually clean up most if not all
traces of a cluster installation by following these steps on all of the
nodes:

1. Shutdown all instances. This depends on the virtualisation method
   used (Xen, KVM, etc.):

  - Xen: run ``xm list`` and ``xm destroy`` on all the non-Domain-0
    instances
  - KVM: kill all the KVM processes
  - chroot: kill all processes under the chroot mountpoints

2. If using DRBD, shutdown all DRBD minors (which should by this time be
   no longer in use by instances); on each node, run ``drbdsetup
   /dev/drbdN down`` for each active DRBD minor.

3. If using LVM, cleanup the Ganeti volume group; if only Ganeti created
   logical volumes (and you are not sharing the volume group with the
   OS, for example), then simply running ``lvremove -f xenvg`` (replace
   'xenvg' with your volume group name) should do the required cleanup.

4. If using file-based storage, remove recursively all files and
   directories under your file-storage directory: ``rm -rf
   /srv/ganeti/file-storage/*`` replacing the path with the correct path
   for your cluster.

5. Stop the ganeti daemons (``/etc/init.d/ganeti stop``) and kill any
   that remain alive (``pgrep ganeti`` and ``pkill ganeti``).

6. Remove the ganeti state directory (``rm -rf /var/lib/ganeti/*``),
   replacing the path with the correct path for your installation.
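
The per-node part of the above steps could be scripted roughly as
follows. This is only a sketch assuming Xen, DRBD, a volume group named
``xenvg`` and eight DRBD minors; adapt the names, paths and minor range
to your installation before running anything::

  #!/bin/sh
  # 1. destroy all non-Domain-0 Xen instances
  for dom in $(xm list | awk 'NR>1 {print $1}' | grep -v '^Domain-0$'); do
    xm destroy "$dom"
  done
  # 2. shut down the active DRBD minors
  for i in $(seq 0 7); do
    drbdsetup /dev/drbd$i down
  done
  # 3. remove the Ganeti-created logical volumes
  lvremove -f xenvg
  # 4. remove file-based storage, if used
  rm -rf /srv/ganeti/file-storage/*
  # 5. stop the daemons and kill any leftovers
  /etc/init.d/ganeti stop
  pkill ganeti
  # 6. remove the state directory
  rm -rf /var/lib/ganeti/*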

On the master node, remove the cluster from the master-netdev (usually
``xen-br0`` for bridged mode, otherwise ``eth0`` or similar), by running
``ip a del $clusterip/32 dev xen-br0`` (use the correct cluster ip and
network device name).

At this point, the machines are ready for a cluster creation; in case
you want to remove Ganeti completely, you need to also undo some of the
SSH changes and log directories:

- ``rm -rf /var/log/ganeti /srv/ganeti`` (replace with the correct
  paths)
- remove from ``/root/.ssh`` the keys that Ganeti added (check the
  ``authorized_keys`` and ``id_dsa`` files)
- regenerate the host's SSH keys (check the OpenSSH startup scripts)
- uninstall Ganeti

Otherwise, if you plan to re-create the cluster, you can just go ahead
and rerun ``gnt-cluster init``.

Tags handling
-------------

The tags handling (addition, removal, listing) is similar for all the
objects that support it (instances, nodes, and the cluster).

Limitations
+++++++++++

Note that the set of characters present in a tag and the maximum tag
length are restricted. Currently the maximum length is 128 characters,
there can be at most 4096 tags per object, and the set of allowed
characters consists of alphanumeric characters plus ``.+*/:-``.

Operations
++++++++++

Tags can be added via ``add-tags``::

  gnt-instance add-tags INSTANCE a b c
  gnt-node add-tags NODE a b c
  gnt-cluster add-tags a b c


The above commands add three tags to an instance, to a node and to the
cluster. Note that the cluster command only takes tags as arguments,
whereas the node and instance commands first require the node or
instance name.

Tags can also be added from a file, via the ``--from=FILENAME``
argument. The file is expected to contain one tag per line.
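
For example (the file name and tags are invented for illustration), a
set of tags prepared in a file can be applied in one go::

  node1# cat /tmp/my-tags
  owner:user2
  environment:production
  node1# gnt-instance add-tags --from=/tmp/my-tags instance1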

Tags can also be removed via a syntax very similar to the addition
one::

  gnt-instance remove-tags INSTANCE a b c

And listed via::

  gnt-instance list-tags
  gnt-node list-tags
  gnt-cluster list-tags

Global tag search
+++++++++++++++++

It is also possible to execute a global search on all the tags defined
in the cluster configuration, via a cluster command::

  gnt-cluster search-tags REGEXP

The parameter expected is a regular expression (see
:manpage:`regex(7)`). This will return all tags that match the search,
together with the object they are defined in (the names being shown in a
hierarchical kind of way)::

  node1# gnt-cluster search-tags o
  /cluster foo
  /instances/instance1 owner:bar


Job operations
--------------

The various jobs submitted by the instance/node/cluster commands can be
examined, canceled and archived by various invocations of the
``gnt-job`` command.

First is the job list command::

  node1# gnt-job list
  17771 success INSTANCE_QUERY_DATA
  17773 success CLUSTER_VERIFY_DISKS
  17775 success CLUSTER_REPAIR_DISK_SIZES
  17776 error   CLUSTER_RENAME(cluster.example.com)
  17780 success CLUSTER_REDIST_CONF
  17792 success INSTANCE_REBOOT(instance1.example.com)

More detailed information about a job can be found via the ``info``
command::

  node1# gnt-job info 17776
  Job ID: 17776
    Status: error
    Received:         2009-10-25 23:18:02.180569
    Processing start: 2009-10-25 23:18:02.200335 (delta 0.019766s)
    Processing end:   2009-10-25 23:18:02.279743 (delta 0.079408s)
    Total processing time: 0.099174 seconds
    Opcodes:
      OP_CLUSTER_RENAME
        Status: error
        Processing start: 2009-10-25 23:18:02.200335
        Processing end:   2009-10-25 23:18:02.252282
        Input fields:
          name: cluster.example.com
        Result:
          OpPrereqError
          [Neither the name nor the IP address of the cluster has changed]
        Execution log:

During the execution of a job, it's possible to follow the output of a
job, similar to the log that one gets from the ``gnt-`` commands, via
the watch command::

  node1# gnt-instance add --submit … instance1
  JobID: 17818
  node1# gnt-job watch 17818
  Output from job 17818 follows
  -----------------------------
  Mon Oct 26 00:22:48 2009  - INFO: Selected nodes for instance instance1 via iallocator dumb: node1, node2
  Mon Oct 26 00:22:49 2009 * creating instance disks...
  Mon Oct 26 00:22:52 2009 adding instance instance1 to cluster config
  Mon Oct 26 00:22:52 2009  - INFO: Waiting for instance instance1 to sync disks.
  Mon Oct 26 00:23:03 2009 creating os for instance xen-devi-18.fra.corp.google.com on node mpgntac4.fra.corp.google.com
  Mon Oct 26 00:23:03 2009 * running the instance OS create scripts...
  Mon Oct 26 00:23:13 2009 * starting instance...
  node1#

This is useful if you need to follow a job's progress from multiple
terminals.

A job that has not yet started to run can be canceled::

  node1# gnt-job cancel 17810

But not one that has already started execution::

  node1# gnt-job cancel 17805
  Job 17805 is no longer waiting in the queue

There are two queues for jobs: the *current* and the *archive*
queue. Jobs are initially submitted to the current queue, and they stay
in that queue until they have finished execution (either successfully or
not). At that point, they can be moved into the archive queue, and the
ganeti-watcher script will do this automatically after 6 hours. The
ganeti-cleaner script will remove the jobs from the archive directory
after three weeks.

Note that only jobs in the current queue can be viewed via the list and
info commands; Ganeti itself doesn't examine the archive directory. If
you need to see an older job, either move the file manually in the
top-level queue directory, or look at its contents (it's a
JSON-formatted file).
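
Finished jobs can also be archived on demand instead of waiting for the
watcher, using the ``archive`` subcommand of ``gnt-job`` (if available
in your version; see its manpage, the job ID below is just an example)::

  node1# gnt-job archive 17771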

Ganeti tools
------------

Beside the usual ``gnt-`` and ``ganeti-`` commands which are provided
and installed in ``$prefix/sbin`` at install time, there are a couple of
other tools installed which are used seldom but can be helpful in some
cases.

lvmstrap
++++++++

The ``lvmstrap`` tool, introduced in the :ref:`configure-lvm-label`
section, has two modes of operation:

- ``diskinfo`` shows the discovered disks on the system and their status
- ``create`` takes all not-in-use disks and creates a volume group out
  of them

.. warning:: The ``create`` argument to this command causes data-loss!

cfgupgrade
++++++++++

The ``cfgupgrade`` tool is used to upgrade between major (and minor)
Ganeti versions. Point-releases are usually transparent for the admin.

More information about the upgrade procedure is listed on the wiki at
http://code.google.com/p/ganeti/wiki/UpgradeNotes.

cfgshell
++++++++

.. note:: This command is not actively maintained; make sure you backup
   your configuration before using it

This can be used as an alternative to direct editing of the
main configuration file if Ganeti has a bug and prevents you, for
example, from removing an instance or a node from the configuration
file.

.. _burnin-label:

burnin
++++++

.. warning:: This command will erase existing instances if given as
   arguments!

This tool is used to exercise either the hardware of machines or
alternatively the Ganeti software. It is safe to run on an existing
cluster **as long as you don't pass it existing instance names**.

The command will, by default, execute a comprehensive set of operations
against a list of instances, these being:

- creation
- disk replacement (for redundant instances)
- failover and migration (for redundant instances)
- move (for non-redundant instances)
- disk growth
- add disks, remove disks
- add NICs, remove NICs
- export and then import
- rename
- reboot
- shutdown/startup
- and finally removal of the test instances

Executing all these operations will test that the hardware performs
well: the creation, disk replace, disk add and disk growth will exercise
the storage and network; the migrate command will test the memory of the
systems. Depending on the passed options, it can also test whether the
instance OS definitions properly execute the rename, import and export
operations.

Other Ganeti projects
---------------------

There are two other Ganeti-related projects that can be useful in a
Ganeti deployment. These can be downloaded from the project site
(http://code.google.com/p/ganeti/) and the repositories are also on the
project git site (http://git.ganeti.org).

NBMA tools
++++++++++

The ``ganeti-nbma`` software is designed to allow instances to live on a
separate, virtual network from the nodes, and in an environment where
nodes are not guaranteed to be able to reach each other via multicasting
or broadcasting. For more information see the README in the source
archive.

ganeti-htools
+++++++++++++

The ``ganeti-htools`` software consists of a set of tools:

- ``hail``: a more advanced iallocator script than Ganeti's builtin one
- ``hbal``: a tool for rebalancing the cluster, i.e. moving instances
  around in order to better use the resources on the nodes
- ``hspace``: a tool for estimating the available capacity of a cluster,
  so that capacity planning can be done efficiently

For more information and installation instructions, see the README file
in the source archive.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: