Revision 73225861 (doc/admin.rst)

.. contents::

.. highlight:: shell-example

Introduction
------------
...

information store for helping with cluster administration, for example
by attaching owner information to each instance after it's created::

  $ gnt-instance add … %instance1%
  $ gnt-instance add-tags %instance1% %owner:user2%

And then by listing each instance and its tags, this information could
be used for contacting the users of each instance.
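
The reverse lookup can be done with the tag listing commands introduced
later in this document; the instance name and tag shown below are
illustrative::

  $ gnt-instance list-tags %instance1%
  owner:user2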
...

While not directly visible by an end-user, it's useful to know that a
basic cluster operation (e.g. starting an instance) is represented
internally by Ganeti as an *OpCode* (abbreviation from operation
code). These OpCodes are executed as part of a *Job*. The OpCodes in a
single Job are processed serially by Ganeti, but different Jobs will be
processed (depending on resource availability) in parallel. They will
...

With the above parameters in mind, the command is::

  $ gnt-instance add \
    -n %TARGET_NODE%:%SECONDARY_NODE% \
    -o %OS_TYPE% \
    -t %DISK_TEMPLATE% -s %DISK_SIZE% \
    %INSTANCE_NAME%

The instance name must be resolvable (e.g. exist in DNS) and usually
points to an address in the same subnet as the cluster itself.
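
Since the name must resolve before the instance can be added, a quick
check with a standard resolver tool can save a failed job; the name and
address below are examples only::

  $ host %INSTANCE_NAME%
  instance1.example.com has address 192.0.2.50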
...

single disk of 50GB and the default memory size, having primary node
``node1`` and secondary node ``node3``, use the following command::

  $ gnt-instance add -n node1:node3 -o debootstrap -t drbd -s 50G \
    instance1

There is also a command for batch instance creation from a
...

irreversible and destroys all the contents of your instance. Use with
care::

  $ gnt-instance remove %INSTANCE_NAME%

.. _instance-startup-label:
...

Instances are automatically started at instance creation time. To
manually start one which is currently stopped you can run::

  $ gnt-instance startup %INSTANCE_NAME%

Ganeti will start an instance with up to its maximum instance memory. If
not enough memory is available Ganeti will use all the available memory
...

stopped state ``offline``. In this case, you will first have to
put it back to online mode by running::

  $ gnt-instance modify --online %INSTANCE_NAME%

The command to stop the running instance is::

  $ gnt-instance shutdown %INSTANCE_NAME%

If you want to shut the instance down more permanently, so that it
does not require dynamically allocated resources (memory and vcpus),
after shutting down an instance, execute the following::

  $ gnt-instance modify --offline %INSTANCE_NAME%

.. warning:: Do not use the Xen or KVM commands directly to stop
   instances. If you run for example ``xm shutdown`` or ``xm destroy``
...

The command to see all the instances configured and their status is::

  $ gnt-instance list

The command can return a custom set of information when using the ``-o``
option (as always, check the manpage for a detailed specification). Each
...

To get more detailed information about an instance, you can run::

  $ gnt-instance info %INSTANCE%

which will give a multi-line block of information about the instance,
its hardware resources (especially its disks and their redundancy
...

If you find that you need more memory on a node, any instance can be
manually resized without downtime, with the command::

  $ gnt-instance modify -m %SIZE% %INSTANCE_NAME%

The same command can also be used to increase the memory available on an
instance, provided that enough free memory is available on its node, and
...

configuration, which then you can backup, or import into another
cluster. The way to export an instance is::

  $ gnt-backup export -n %TARGET_NODE% %INSTANCE_NAME%

The target node can be any node in the cluster with enough space under
389 | 389 |
Importing an instance is similar to creating a new one, but additionally |
390 | 390 |
one must specify the location of the snapshot. The command is:: |
391 | 391 |
|
392 |
gnt-backup import -n TARGET_NODE \
|
|
393 |
--src-node=NODE --src-dir=DIR INSTANCE_NAME
|
|
392 |
$ gnt-backup import -n %TARGET_NODE% \
|
|
393 |
--src-node=%NODE% --src-dir=%DIR% %INSTANCE_NAME%
|
|
394 | 394 |
|
395 | 395 |
By default, parameters will be read from the export information, but you |
396 | 396 |
can of course pass them in via the command line - most of the options |
...

then create a Ganeti instance in the usual way, except that instead of
passing the disk information you specify the current volumes::

  $ gnt-instance add -t plain -n %HOME_NODE% ... \
    --disk 0:adopt=%lv_name%[,vg=%vg_name%] %INSTANCE_NAME%

This will take over the given logical volumes, rename them to the Ganeti
standard (UUID-based), and without installing the OS on them start
...

failed and it's not up anymore. Doing it is really easy, on the master
node you can just run::

  $ gnt-instance failover %INSTANCE_NAME%

That's it. After the command completes the secondary node is now the
primary, and vice-versa.
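
The swap can be verified by listing the instance's node fields
(``pnode`` and ``snodes`` are standard ``gnt-instance list`` output
fields; the node names shown are illustrative)::

  $ gnt-instance list -o name,pnode,snodes %INSTANCE_NAME%
  Instance  Primary_node Secondary_Nodes
  instance1 node3        node1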
...

iallocator plugin. If you omit both, the default iallocator will be
used to determine the target node::

  $ gnt-instance failover -n %TARGET_NODE% %INSTANCE_NAME%

Live migrating an instance
~~~~~~~~~~~~~~~~~~~~~~~~~~
...

both its nodes are running fine, you can migrate it over to its
secondary node, without downtime. On the master node you need to run::

  $ gnt-instance migrate %INSTANCE_NAME%

The current load on the instance and its memory size will influence how
long the migration will take. In any case, for both KVM and Xen
...

iallocator plugin. If you omit both, the default iallocator will be
used to determine the target node::

  $ gnt-instance migrate -n %TARGET_NODE% %INSTANCE_NAME%

Moving an instance (offline)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...

If an instance has not been created as mirrored, then the only way to
change its primary node is to execute the move command::

  $ gnt-instance move -n %NEW_NODE% %INSTANCE%

This has a few prerequisites:
...

is common that after a complete disk failure, any LVM command aborts
with an error similar to::

  $ vgs
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
  /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
  Couldn't find device with uuid 't30jmN-4Rcf-Fr5e-CURS-pawt-z0jU-m1TgeJ'.
  Couldn't find all physical volumes for volume group xenvg.

Before restoring an instance's disks to healthy status, it's needed to
...

#. first, if the disk is completely gone and LVM commands exit with
   “Couldn't find device with uuid…” then you need to run the command::

     $ vgreduce --removemissing %VOLUME_GROUP%

#. after the above command, the LVM commands should be executing
   normally (warnings are normal, but the commands will not fail
...

#. if the failed disk is still visible in the output of the ``pvs``
   command, you need to deactivate it from allocations by running::

     $ pvs -x n /dev/%DISK%

At this point, the volume group should be consistent and any bad
physical volumes should no longer be available for allocation.
...

For all three cases, the ``replace-disks`` operation can be used::

  # re-create disks on the primary node
  $ gnt-instance replace-disks -p %INSTANCE_NAME%
  # re-create disks on the current secondary
  $ gnt-instance replace-disks -s %INSTANCE_NAME%
  # change the secondary node, via manual specification
  $ gnt-instance replace-disks -n %NODE% %INSTANCE_NAME%
  # change the secondary node, via an iallocator script
  $ gnt-instance replace-disks -I %SCRIPT% %INSTANCE_NAME%
  # since Ganeti 2.1: automatically fix the primary or secondary node
  $ gnt-instance replace-disks -a %INSTANCE_NAME%

Since the process involves copying all data from the working node to the
target node, it will take a while, depending on the instance's disk
...

disks, after which a reinstall can be run, via the ``recreate-disks``
command::

  $ gnt-instance recreate-disks %INSTANCE%

Note that this will fail if the disks already exist.
...

modify`` command::

  # start with a non-redundant instance
  $ gnt-instance add -t plain ... %INSTANCE%

  # later convert it to redundant
  $ gnt-instance stop %INSTANCE%
  $ gnt-instance modify -t drbd -n %NEW_SECONDARY% %INSTANCE%
  $ gnt-instance start %INSTANCE%

  # and convert it back
  $ gnt-instance stop %INSTANCE%
  $ gnt-instance modify -t plain %INSTANCE%
  $ gnt-instance start %INSTANCE%

The conversion must be done while the instance is stopped, and
converting from plain to drbd template presents a small risk, especially
706 | 704 |
inconsistent. The correct way to access an instance's disks is to run |
707 | 705 |
(on the master node, as usual) the command:: |
708 | 706 |
|
709 |
gnt-instance activate-disks INSTANCE
|
|
707 |
$ gnt-instance activate-disks %INSTANCE%
|
|
710 | 708 |
|
711 | 709 |
And then, *on the primary node of the instance*, access the device that |
712 | 710 |
gets created. For example, you could mount the given disks, then edit |
...

Note that with partitioned disks (as opposed to whole-disk filesystems),
you will need to use a tool like :manpage:`kpartx(8)`::

  # on node1
  $ gnt-instance activate-disks %instance1%
  node3:disk/0:…
  $ ssh node3
  # on node 3
  $ kpartx -l /dev/…
  $ kpartx -a /dev/…
  $ mount /dev/mapper/… /mnt/
  # edit files under mnt as desired
  $ umount /mnt/
  $ kpartx -d /dev/…
  $ exit
  # back to node 1

After you've finished you can deactivate them with the deactivate-disks
command, which works in the same way::

  $ gnt-instance deactivate-disks %INSTANCE%

Note that if any process started by you is still using the disks, the
above command will error out, and you **must** cleanup and ensure that
...

The command to access a running instance's console is::

  $ gnt-instance console %INSTANCE_NAME%

Use the console normally and then type ``^]`` when done, to exit.
...

There is a wrapper command for rebooting instances::

  $ gnt-instance reboot %instance2%

By default, this does the equivalent of shutting down and then starting
the instance, but it accepts parameters to perform a soft-reboot (via
...

Should you have any problems with instance operating systems the command
to see a complete status for all your nodes is::

  $ gnt-os diagnose

.. _instance-relocation-label:
...

steps::

  # instance is located on A, B
  $ gnt-instance replace-disks -n %nodeC% %instance1%
  # instance has moved from (A, B) to (A, C)
  # we now flip the primary/secondary nodes
  $ gnt-instance migrate %instance1%
  # instance lives on (C, A)
  # we can then change A to D via:
  $ gnt-instance replace-disks -n %nodeD% %instance1%

Which brings it into the final configuration of ``(C, D)``. Note that we
needed to do two replace-disks operations (two copies of the instance
...

It is at any time possible to extend the cluster with one more node, by
using the node add operation::

  $ gnt-node add %NEW_NODE%

If the cluster has a replication network defined, then you need to pass
the ``-s REPLICATION_IP`` parameter to this option.
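
For example, on a cluster that uses a separate replication network, the
new node's address on that network is given via ``-s`` (the IP below is
a placeholder)::

  $ gnt-node add -s %192.0.2.10% %NEW_NODE%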
...

Ganeti configuration is broken, for example if it has been reinstalled
by mistake::

  $ gnt-node add --readd %EXISTING_NODE%

This will reinitialise the node as if it's been newly added, but while
keeping its existing configuration in the cluster (primary/secondary IP,
...

If you want to promote a different node to the master role (for whatever
reason), run on any other master-candidate node the command::

  $ gnt-cluster master-failover

and the node you ran it on is now the new master. In case you try to run
this on a non master-candidate node, you will get an error telling you
...

The ``gnt-node modify`` command can be used to select a new role::

  # change to master candidate
  $ gnt-node modify -C yes %NODE%
  # change to drained status
  $ gnt-node modify -D yes %NODE%
  # change to offline status
  $ gnt-node modify -O yes %NODE%
  # change to regular mode (reset all flags)
  $ gnt-node modify -O no -D no -C no %NODE%

Note that the cluster requires that at any point in time, a certain
number of nodes are master candidates, so changing from master candidate
...

commands (as seen in :ref:`instance-change-primary-label`) or the bulk
per-node versions; these are::

  $ gnt-node migrate %NODE%
  $ gnt-node evacuate -s %NODE%

Note that the instance “move” command doesn't currently have a node
equivalent.
...

For the evacuation of secondary instances, a command called
:command:`gnt-node evacuate` is provided and its syntax is::

  $ gnt-node evacuate -I %IALLOCATOR_SCRIPT% %NODE%
  $ gnt-node evacuate -n %DESTINATION_NODE% %NODE%

The first version will compute the new secondary for each instance in
turn using the given iallocator script, whereas the second one will
907 | 907 |
Once a node no longer has any instances (neither primary nor secondary), |
908 | 908 |
it's easy to remove it from the cluster:: |
909 | 909 |
|
910 |
gnt-node remove NODE_NAME
|
|
910 |
$ gnt-node remove %NODE_NAME%
|
|
911 | 911 |
|
912 | 912 |
This will deconfigure the node, stop the ganeti daemons on it and leave |
913 | 913 |
it hopefully like before it joined to the cluster. |
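
Afterwards the removed node should no longer show up in the node list
(illustrative output)::

  $ gnt-node list -oname
  Node
  node1
  node2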
...

logical volumes on a given node or on all nodes and their association to
instances via the ``volumes`` command::

  $ gnt-node volumes
  Node  PhysDev   VG    Name             Size Instance
  node1 /dev/sdb1 xenvg e61fbc97-….disk0 512M instance17
  node1 /dev/sdb1 xenvg ebd1a7d1-….disk0 512M instance19
951 | 951 |
|
952 | 952 |
First is listing the backend storage and their space situation:: |
953 | 953 |
|
954 |
node1# gnt-node list-storage
|
|
954 |
$ gnt-node list-storage
|
|
955 | 955 |
Node Name Size Used Free |
956 | 956 |
node1 /dev/sda7 673.8G 0M 673.8G |
957 | 957 |
node1 /dev/sdb1 698.6G 1.5G 697.1G |
... | ... | |
961 | 961 |
The default is to list LVM physical volumes. It's also possible to list |
962 | 962 |
the LVM volume groups:: |
963 | 963 |
|
964 |
node1# gnt-node list-storage -t lvm-vg
|
|
964 |
$ gnt-node list-storage -t lvm-vg
|
|
965 | 965 |
Node Name Size |
966 | 966 |
node1 xenvg 1.3T |
967 | 967 |
node2 xenvg 1.3T |
... | ... | |
969 | 969 |
Next is repairing storage units, which is currently only implemented for |
970 | 970 |
volume groups and does the equivalent of ``vgreduce --removemissing``:: |
971 | 971 |
|
972 |
node1# gnt-node repair-storage node2 lvm-vg xenvg
|
|
972 |
$ gnt-node repair-storage %node2% lvm-vg xenvg
|
|
973 | 973 |
Sun Oct 25 22:21:45 2009 Repairing storage unit 'xenvg' on node2 ... |
974 | 974 |
|
975 | 975 |
Last is the modification of volume properties, which is (again) only |
976 | 976 |
implemented for LVM physical volumes and allows toggling the |
977 | 977 |
``allocatable`` value:: |
978 | 978 |
|
979 |
node1# gnt-node modify-storage --allocatable=no node2 lvm-pv /dev/sdb1
|
|
979 |
$ gnt-node modify-storage --allocatable=no %node2% lvm-pv /dev/%sdb1%
|
|
980 | 980 |
|
981 | 981 |
Use of the storage commands |
982 | 982 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
...

One of the few commands that can be run on any node (not only the
master) is the ``getmaster`` command::

  # on node2
  $ gnt-cluster getmaster
  node1.example.com

It is possible to query and change global cluster parameters via the
``info`` and ``modify`` commands::

  $ gnt-cluster info
  Cluster name: cluster.example.com
  Cluster UUID: 07805e6f-f0af-4310-95f1-572862ee939c
  Creation time: 2009-09-25 05:04:15
1055 | 1055 |
For detailed option list see the :manpage:`gnt-cluster(8)` man page. |
1056 | 1056 |
|
1057 | 1057 |
The cluster version can be obtained via the ``version`` command:: |
1058 |
node1# gnt-cluster version
|
|
1058 |
$ gnt-cluster version
|
|
1059 | 1059 |
Software version: 2.1.0 |
1060 | 1060 |
Internode protocol: 20 |
1061 | 1061 |
Configuration format: 2010000 |
... | ... | |
1070 | 1070 |
There are two commands provided for replicating files to all nodes of a |
1071 | 1071 |
cluster and for running commands on all the nodes:: |
1072 | 1072 |
|
1073 |
node1# gnt-cluster copyfile /path/to/file
|
|
1074 |
node1# gnt-cluster command ls -l /path/to/file
|
|
1073 |
$ gnt-cluster copyfile %/path/to/file%
|
|
1074 |
$ gnt-cluster command %ls -l /path/to/file%
|
|
1075 | 1075 |
|
1076 | 1076 |
These are simple wrappers over scp/ssh and more advanced usage can be |
1077 | 1077 |
obtained using :manpage:`dsh(1)` and similar commands. But they are |
... | ... | |
1085 | 1085 |
highlighting any issues. In normal operation, this command should return |
1086 | 1086 |
no ``ERROR`` messages:: |
1087 | 1087 |
|
1088 |
node1# gnt-cluster verify
|
|
1088 |
$ gnt-cluster verify
|
|
1089 | 1089 |
Sun Oct 25 23:08:58 2009 * Verifying global settings |
1090 | 1090 |
Sun Oct 25 23:08:58 2009 * Gathering data (2 nodes) |
1091 | 1091 |
Sun Oct 25 23:09:00 2009 * Verifying node status |
... | ... | |
1101 | 1101 |
disks have the correct status based on the desired instance state |
1102 | 1102 |
(up/down):: |
1103 | 1103 |
|
1104 |
node1# gnt-cluster verify-disks
|
|
1104 |
$ gnt-cluster verify-disks
|
|
1105 | 1105 |
|
1106 | 1106 |
Note that this command will show no output when disks are healthy. |
1107 | 1107 |
|
... | ... | |
1109 | 1109 |
recorded disk size and the actual disk size (disk size information is |
1110 | 1110 |
needed for proper activation and growth of DRBD-based disks):: |
1111 | 1111 |
|
1112 |
node1# gnt-cluster repair-disk-sizes
|
|
1112 |
$ gnt-cluster repair-disk-sizes
|
|
1113 | 1113 |
Sun Oct 25 23:13:16 2009 - INFO: Disk 0 of instance instance1 has mismatched size, correcting: recorded 512, actual 2048 |
1114 | 1114 |
Sun Oct 25 23:13:17 2009 - WARNING: Invalid result from node node4, ignoring node results |
1115 | 1115 |
|
... | ... | |
1125 | 1125 |
configuration files, you can force an push of the master configuration |
1126 | 1126 |
to all other nodes via the ``redist-conf`` command:: |
1127 | 1127 |
|
1128 |
node1# gnt-cluster redist-conf |
|
1129 |
node1# |
|
1128 |
$ gnt-cluster redist-conf |
|
1130 | 1129 |
|
1131 | 1130 |
This command will be silent unless there are problems sending updates to |
1132 | 1131 |
the other nodes. |
... | ... | |
1139 | 1138 |
``rename`` command. If only the IP has changed, you need to pass the |
1140 | 1139 |
current name and Ganeti will realise its IP has changed:: |
1141 | 1140 |
|
1142 |
node1# gnt-cluster rename cluster.example.com
|
|
1141 |
$ gnt-cluster rename %cluster.example.com%
|
|
1143 | 1142 |
This will rename the cluster to 'cluster.example.com'. If |
1144 | 1143 |
you are connected over the network to the cluster name, the operation |
1145 | 1144 |
is very dangerous as the IP address will be removed from the node and |
1146 | 1145 |
the change may not go through. Continue? |
1147 |
y/[n]/?: y
|
|
1146 |
y/[n]/?: %y%
|
|
1148 | 1147 |
Failure: prerequisites not met for this operation: |
1149 | 1148 |
Neither the name nor the IP address of the cluster has changed |
1150 | 1149 |
|
...

The job queue execution in Ganeti 2.0 and higher can be inspected,
suspended and resumed via the ``queue`` command::

  $ gnt-cluster queue info
  The drain flag is unset
  $ gnt-cluster queue drain
  $ gnt-instance stop %instance1%
  Failed to submit job for instance1: Job queue is drained, refusing job
  $ gnt-cluster queue info
  The drain flag is set
  $ gnt-cluster queue undrain

This is most useful if you have an active cluster and you need to
upgrade the Ganeti software, or simply restart the software on any node:
...

forgotten. Thus there are some commands for automated control of the
watcher: ``pause``, ``info`` and ``continue``::

  $ gnt-cluster watcher info
  The watcher is not paused.
  $ gnt-cluster watcher pause %1h%
  The watcher is paused until Mon Oct 26 00:30:37 2009.
  $ gnt-cluster watcher info
  The watcher is paused until Mon Oct 26 00:30:37 2009.
  $ ganeti-watcher -d
  2009-10-25 23:30:47,984: pid=28867 ganeti-watcher:486 DEBUG Pause has been set, exiting
  $ gnt-cluster watcher continue
  The watcher is no longer paused.
  $ ganeti-watcher -d
  2009-10-25 23:31:04,789: pid=28976 ganeti-watcher:345 DEBUG Archived 0 jobs, left 0
  2009-10-25 23:31:05,884: pid=28976 ganeti-watcher:280 DEBUG Got data from cluster, writing instance status file
  2009-10-25 23:31:06,061: pid=28976 ganeti-watcher:150 DEBUG Data didn't change, just touching status file
  $ gnt-cluster watcher info
  The watcher is not paused.

The exact details of the argument to the ``pause`` command are available
in the manpage.
...

Tags can be added via ``add-tags``::

  $ gnt-instance add-tags %INSTANCE% %a% %b% %c%
  $ gnt-node add-tags %NODE% %a% %b% %c%
  $ gnt-cluster add-tags %a% %b% %c%

The above commands add three tags to an instance, to a node and to the
...

Tags can also be removed via a syntax very similar to the add one::

  $ gnt-instance remove-tags %INSTANCE% %a% %b% %c%

And listed via::

  $ gnt-instance list-tags
  $ gnt-node list-tags
  $ gnt-cluster list-tags

Global tag search
+++++++++++++++++
...

It is also possible to execute a global search on all the tags defined
in the cluster configuration, via a cluster command::

  $ gnt-cluster search-tags %REGEXP%

The parameter expected is a regular expression (see
:manpage:`regex(7)`). This will return all tags that match the search,
together with the object they are defined in (the names being shown in a
hierarchical kind of way)::

  $ gnt-cluster search-tags %o%
  /cluster foo
  /instances/instance1 owner:bar
...

First is the job list command::

  $ gnt-job list
  17771 success INSTANCE_QUERY_DATA
  17773 success CLUSTER_VERIFY_DISKS
  17775 success CLUSTER_REPAIR_DISK_SIZES
...

More detailed information about a job can be found via the ``info``
command::

  $ gnt-job info %17776%
  Job ID: 17776
  Status: error
  Received: 2009-10-25 23:18:02.180569
1391 | 1389 |
job, similar to the log that one get from the ``gnt-`` commands, via the |
1392 | 1390 |
watch command:: |
1393 | 1391 |
|
1394 |
node1# gnt-instance add --submit … instance1
|
|
1392 |
$ gnt-instance add --submit … %instance1%
|
|
1395 | 1393 |
JobID: 17818 |
1396 |
node1# gnt-job watch 17818
|
|
1394 |
$ gnt-job watch %17818%
|
|
1397 | 1395 |
Output from job 17818 follows |
1398 | 1396 |
----------------------------- |
1399 | 1397 |
Mon Oct 26 00:22:48 2009 - INFO: Selected nodes for instance instance1 via iallocator dumb: node1, node2 |
  ...
  Mon Oct 26 00:23:03 2009 creating os for instance instance1 on node node1
  Mon Oct 26 00:23:03 2009 * running the instance OS create scripts...
  Mon Oct 26 00:23:13 2009 * starting instance...
  $

This is useful if you need to follow a job's progress from multiple
terminals.

A job that has not yet started to run can be canceled::

  $ gnt-job cancel %17810%

But not one that has already started execution::

  $ gnt-job cancel %17805%
  Job 17805 is no longer waiting in the queue

There are two queues for jobs: the *current* and the *archive*
...

strategies, etc. In order to accomplish this, mark these nodes as
non-``vm_capable``::

  $ gnt-node modify --vm-capable=no %node3%

The vm_capable status can be listed as usual via ``gnt-node list``::

  $ gnt-node list -oname,vm_capable
  Node  VMCapable
  node1 Y
  node2 Y
1482 | 1480 |
|
1483 | 1481 |
As usual, the node modify operation can change this flag:: |
1484 | 1482 |
|
1485 |
node1# gnt-node modify --auto-promote --master-capable=no node3
|
|
1483 |
$ gnt-node modify --auto-promote --master-capable=no %node3%
|
|
1486 | 1484 |
Fri Jan 7 06:23:07 2011 - INFO: Demoting from master candidate |
1487 | 1485 |
Fri Jan 7 06:23:08 2011 - INFO: Promoted nodes to master candidate role: node4 |
1488 | 1486 |
Modified node node3 |
...

And the node list operation will list this flag::

  $ gnt-node list -oname,master_capable %node1% %node2% %node3%
  Node  MasterCapable
  node1 Y
  node2 Y