   1 gnt-cluster(8) Ganeti | Version @GANETI_VERSION@
   2 ================================================
   3
   4 Name
   5 ----
   6
   7 gnt-cluster - Ganeti administration, cluster-wide
   8
   9 Synopsis
  10 --------
  11
  12 **gnt-cluster** {command} [arguments...]
  13
  14 DESCRIPTION
  15 -----------
  16
  17 The **gnt-cluster** command is used for cluster-wide administration in the
  18 Ganeti system.
  19
  20 COMMANDS
  21 --------
  22
  23 ADD-TAGS
  24 ~~~~~~~~
  25
  26 **add-tags** [--from *file*] {*tag*...}
  27
  28 Add tags to the cluster. If any of the tags contains invalid
  29 characters, the entire operation will abort.
  30
  31 If the ``--from`` option is given, the list of tags will be
  32 extended with the contents of that file (each line becomes a tag).
  33 In this case, there is no need to pass tags on the command line
  34 (if you do, both sources will be used). A file name of - will be
  35 interpreted as stdin.
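
For example, assuming a hypothetical file ``tags.txt`` with one tag per
line, the tags could be added from that file, or piped in via stdin
(the tag name ``maintenance`` is only an illustration)::

    # gnt-cluster add-tags --from tags.txt
    # echo maintenance | gnt-cluster add-tags --from -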
  36
  37 COMMAND
  38 ~~~~~~~
  39
  40 **command** [-n *node*] {*command*}
  41
  42 Executes a command on the cluster nodes. If the option ``-n`` is not
  43 given, the command will be executed on all nodes; otherwise it will be
  44 executed only on the node(s) specified. Use the option multiple
  45 times for running it on multiple nodes, like::
  46
  47     # gnt-cluster command -n node1.example.com -n node2.example.com date
  48
  49 The command is executed serially on the selected nodes. If the
  50 master node is present in the list, the command will be executed
  51 last on the master. Regarding the other nodes, the execution order
  52 is somewhat alphabetic, so that node2.example.com will be earlier
  53 than node10.example.com but after node1.example.com.
  54
  55 So given the node names node1, node2, node3, node10, node11, with
  56 node3 being the master, the order will be: node1, node2, node10,
  57 node11, node3.
  58
  59 The command is constructed by concatenating all other command line
  60 arguments. For example, to list the contents of the /etc directory
  61 on all nodes, run::
  62
  63     # gnt-cluster command ls -l /etc
  64
  65 and the command which will be executed will be ``ls -l /etc``.
  66
  67 COPYFILE
  68 ~~~~~~~~
  69
  70 **copyfile** [--use-replication-network] [-n *node*] {*file*}
  71
  72 Copies a file to all or to some nodes. The argument specifies the
  73 source file (on the current system), the ``-n`` argument specifies
  74 the target node, or nodes if the option is given multiple times. If
  75 ``-n`` is not given at all, the file will be copied to all nodes.
  76 Passing the ``--use-replication-network`` option will cause the
  77 copy to be done over the replication network (only matters if the
  78 primary/secondary IPs are different). Example::
  79
  80     # gnt-cluster copyfile -n node1.example.com -n node2.example.com /tmp/test
  81
  82 This will copy the file /tmp/test from the current node to the two
  83 named nodes.
  84
  85 DESTROY
  86 ~~~~~~~
  87
  88 **destroy** {--yes-do-it}
  89
  90 Remove all configuration files related to the cluster, so that a
  91 **gnt-cluster init** can be done again afterwards.
  92
  93 Since this is a dangerous command, you are required to pass the
  94 argument ``--yes-do-it``.
  95
  96 EPO
  97 ~~~
  98
  99 **epo** [--on] [--groups|--all] [--power-delay] *arguments*
 100
 101 Performs an emergency power-off on nodes given as arguments. If
 102 ``--groups`` is given, arguments are node groups. If ``--all`` is
 103 provided, the whole cluster will be shut down.
 104
 105 The ``--on`` flag recovers the cluster after an emergency power-off.
 106 When powering on the cluster you can use ``--power-delay`` to define the
 107 time in seconds (fractions allowed) waited between powering on
 108 individual nodes.
 109
 110 Please note that the master node will not be turned down or up
 111 automatically. It will just be left in a state where you can manually
 112 perform the shutdown of that one node. If the master is in the list of
 113 affected nodes and this is not a complete cluster emergency power-off
 114 (e.g. using ``--all``), you're required to do a master failover to
 115 another node not affected.
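
As a sketch, a complete emergency power-off and the later recovery
might look like the following. Combining ``--on`` with ``--all`` is an
assumption based on the synopsis above::

    # gnt-cluster epo --all
    # gnt-cluster epo --on --all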
 116
 117 GETMASTER
 118 ~~~~~~~~~
 119
 120 **getmaster**
 121
 122 Displays the current master node.
 123
 124 INFO
 125 ~~~~
 126
 127 **info** [--roman]
 128
 129 Shows runtime cluster information: cluster name, architecture (32
 130 or 64 bit), master node, node list and instance list.
 131
 132 When the ``--roman`` option is passed, **gnt-cluster info** will try to
 133 print its integer fields in a Latin-friendly way. This allows further
 134 diffusion of Ganeti among ancient cultures.
 135
 136 INIT
 137 ~~~~
 138
 139 | **init**
 140 | [-s *secondary\_ip*]
 141 | [--vg-name *vg-name*]
 142 | [--master-netdev *interface-name*]
 143 | [-m *mac-prefix*]
 144 | [--no-lvm-storage]
 145 | [--no-etc-hosts]
 146 | [--no-ssh-init]
 147 | [--file-storage-dir *dir*]
 148 | [--enabled-hypervisors *hypervisors*]
 149 | [-t *hypervisor name*]
 150 | [--hypervisor-parameters *hypervisor*:*hv-param*=*value*[,*hv-param*=*value*...]]
 151 | [--backend-parameters *be-param*=*value* [,*be-param*=*value*...]]
 152 | [--nic-parameters *nic-param*=*value* [,*nic-param*=*value*...]]
 153 | [--maintain-node-health {yes \| no}]
 154 | [--uid-pool *user-id pool definition*]
 155 | [-I *default instance allocator*]
 156 | [--primary-ip-version *version*]
 157 | [--prealloc-wipe-disks {yes \| no}]
 158 | [--node-parameters *ndparams*]
 159 | {*clustername*}
 160
 161 This command is only run once initially on the first node of the
 162 cluster. It will initialize the cluster configuration, set up the
 163 SSH keys, start the daemons on the master node, etc. in order to have
 164 a working one-node cluster.
 165
 166 Note that the *clustername* is not any random name. It has to be
 167 resolvable to an IP address using DNS, and it is best if you give the
 168 fully-qualified domain name. This hostname must resolve to an IP
 169 address reserved exclusively for this purpose, i.e. not already in
 170 use.
 171
 172 The cluster can run in two modes: single-homed or dual-homed. In the
 173 first case, all traffic (both public traffic, inter-node traffic
 174 and data replication traffic) goes over the same interface. In the
 175 dual-homed case, the data replication traffic goes over the second
 176 network. The ``-s`` option here marks the cluster as dual-homed and
 177 its parameter represents this node's address on the second network.
 178 If you initialise the cluster with ``-s``, all nodes added must
 179 have a secondary IP as well.
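
As a sketch, a dual-homed cluster might be initialised as follows,
where both the secondary address ``192.0.2.10`` and the cluster name
are placeholders for illustration::

    # gnt-cluster init -s 192.0.2.10 cluster.example.com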
 180
 181 Note that for Ganeti it doesn't matter if the secondary network is
 182 actually a separate physical network, or is done using tunneling,
 183 etc. For performance reasons, it's recommended to use a separate
 184 network, of course.
 185
 186 The ``--vg-name`` option will let you specify a volume group
 187 different than "xenvg" for Ganeti to use when creating instance
 188 disks. This volume group must have the same name on all nodes. Once
 189 the cluster is initialized this can be altered by using the
 190 **modify** command. If you don't want to use LVM storage at all, use
 191 the ``--no-lvm-storage`` option. Once the cluster is initialized
 192 you can change this setup with the **modify** command.
 193
 194 The ``--master-netdev`` option is useful for specifying a different
 195 interface on which the master will activate its IP address. It's
 196 important that all nodes have this interface because you'll need it
 197 for a master failover.
 198
 199 The ``-m`` option will let you specify a three byte prefix under
 200 which the virtual MAC addresses of your instances will be
 201 generated. The prefix must be specified in the format XX:XX:XX and
 202 the default is aa:00:00.
 203
 204 The ``--no-lvm-storage`` option allows you to initialize the
 205 cluster without lvm support. This means that only instances using
 206 files as storage backend will be possible to create. Once the
 207 cluster is initialized you can change this setup with the
 208 **modify** command.
 209
 210 The ``--no-etc-hosts`` option allows you to initialize the cluster
 211 without modifying the /etc/hosts file.
 212
 213 The ``--no-ssh-init`` option allows you to initialize the cluster
 214 without creating or distributing SSH key pairs.
 215
 216 The ``--file-storage-dir`` option allows you to set the directory to
 217 use for storing the instance disk files when using file storage as
 218 backend for instance disks.
 219
 220 The ``--prealloc-wipe-disks`` option sets a cluster-wide configuration
 221 value for wiping disks prior to allocation. This increases security
 222 at the instance level, as the instance can't access untouched data from
 223 its underlying storage.
 224
 225 The ``--enabled-hypervisors`` option allows you to set the list of
 226 hypervisors that will be enabled for this cluster. Instance
 227 hypervisors can only be chosen from the list of enabled
 228 hypervisors, and the first entry of this list will be used by
 229 default. Currently, the following hypervisors are available:
 230
 231 xen-pvm
 232     Xen PVM hypervisor
 233
 234 xen-hvm
 235     Xen HVM hypervisor
 236
 237 kvm
 238     Linux KVM hypervisor
 239
 240 chroot
 241     a simple chroot manager that starts chroot based on a script at the
 242     root of the filesystem holding the chroot
 243
 244 fake
 245     fake hypervisor for development/testing
 246
 247
 248 Either a single hypervisor name or a comma-separated list of
 249 hypervisor names can be specified. If this option is not specified,
 250 only the xen-pvm hypervisor is enabled by default.
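
For instance, to enable both KVM and Xen PVM, with KVM as the default
because it is the first entry, one could run something like the
following (the cluster name is a placeholder)::

    # gnt-cluster init --enabled-hypervisors kvm,xen-pvm cluster.example.com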
 255
 256 The ``--hypervisor-parameters`` option allows you to set default
 257 hypervisor specific parameters for the cluster. The format of this
 258 option is the name of the hypervisor, followed by a colon and a
 259 comma-separated list of key=value pairs. The keys available for
 260 each hypervisor are detailed in the gnt-instance(8) man page, in
 261 the **add** command, plus the following parameters which are only
 262 configurable globally (at cluster level):
 263
 264 migration\_port
 265     Valid for the Xen PVM and KVM hypervisors.
 266
 267     This option specifies the TCP port to use for live-migration. For
 268     Xen, the same port should be configured on all nodes in the
 269     ``/etc/xen/xend-config.sxp`` file, under the key
 270     "xend-relocation-port".
 271
 272 migration\_bandwidth
 273     Valid for the KVM hypervisor.
 274
 275     This option specifies the maximum bandwidth that KVM will use for
 276     instance live migrations. The value is in MiB/s.
 277
 278     This option is only effective with kvm versions >= 78 and qemu-kvm
 279     versions >= 0.10.0.
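
As an illustration, both cluster-level KVM parameters could be set in
one call using the format described above. The values below are
arbitrary examples, not recommendations::

    # gnt-cluster modify --hypervisor-parameters kvm:migration_port=8102,migration_bandwidth=32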
 280
 281
 282 The ``--backend-parameters`` option allows you to set the default
 283 backend parameters for the cluster. The parameter format is a
 284 comma-separated list of key=value pairs with the following
 285 supported keys:
 286
 287 vcpus
 288     Number of VCPUs to set for an instance by default, must be an
 289     integer, will be set to 1 if not specified.
 290
 291 memory
 292     Amount of memory to allocate for an instance by default, can be
 293     either an integer or an integer followed by a unit (M for mebibytes
 294     and G for gibibytes are supported), will be set to 128M if not
 295     specified.
 296
 297 auto\_balance
 298     Value of the auto\_balance flag for instances to use by default,
 299     will be set to true if not specified.
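
For example, to have new instances default to two VCPUs and 512 MiB of
memory (illustrative values only)::

    # gnt-cluster modify --backend-parameters vcpus=2,memory=512M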
 300
 301
 302 The ``--nic-parameters`` option allows you to set the default nic
 303 parameters for the cluster. The parameter format is a
 304 comma-separated list of key=value pairs with the following
 305 supported keys:
 306
 307 mode
 308     The default nic mode, 'routed' or 'bridged'.
 309
 310 link
 311     In bridged mode the default NIC bridge. In routed mode it
 312     represents a hypervisor-vif-script dependent value to allow
 313     different instance groups. For example under the KVM default
 314     network script it is interpreted as a routing table number or
 315     name.
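
As a sketch, the following would make bridged mode the default,
assuming a bridge named ``br0`` exists on the nodes (the bridge name
is hypothetical)::

    # gnt-cluster modify --nic-parameters mode=bridged,link=br0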
 316
 317
 318 The option ``--maintain-node-health`` allows you to enable or disable
 319 automatic maintenance actions on nodes. Currently these include
 320 automatic shutdown of instances and deactivation of DRBD devices on
 321 offline nodes; in the future it might be extended to automatic
 322 removal of unknown LVM volumes, etc.
 323
 324 The ``--uid-pool`` option initializes the user-id pool. The
 325 *user-id pool definition* can contain a list of user-ids and/or a
 326 list of user-id ranges. The parameter format is a comma-separated
 327 list of numeric user-ids or user-id ranges. The ranges are defined
 328 by a lower and higher boundary, separated by a dash. The boundaries
 329 are inclusive. If the ``--uid-pool`` option is not supplied, the
 330 user-id pool is initialized to an empty list. An empty list means
 331 that the user-id pool feature is disabled.
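
For instance, a pool consisting of the range 4000 to 4019 (inclusive)
plus the single user-id 5000 could be written as follows. The ids and
the cluster name are made up for illustration::

    # gnt-cluster init --uid-pool 4000-4019,5000 cluster.example.com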
 332
 333 The ``-I (--default-iallocator)`` option specifies the default
 334 instance allocator. The instance allocator will be used for
 335 operations like instance creation, instance and node migration,
 336 etc. when no manual override is specified. If this option is not
 337 specified, the default instance allocator will be blank, which
 338 means that relevant operations will require the administrator to
 339 manually specify either an instance allocator, or a set of nodes.
 340 The default iallocator can be changed later using the **modify**
 341 command.
 342
 343 The ``--primary-ip-version`` option specifies the IP version used
 344 for the primary address. Possible values are 4 and 6 for IPv4 and
 345 IPv6, respectively. This option is used when resolving node names
 346 and the cluster name.
 347
 348 The ``--node-parameters`` option allows you to set default node
 349 parameters for the cluster. Please see **ganeti**(7) for more
 350 information about supported key=value pairs.
 351
 352 LIST-TAGS
 353 ~~~~~~~~~
 354
 355 **list-tags**
 356
 357 List the tags of the cluster.
 358
 359 MASTER-FAILOVER
 360 ~~~~~~~~~~~~~~~
 361
 362 **master-failover** [--no-voting]
 363
 364 Fail over the master role to the current node.
 365
 366 The ``--no-voting`` option skips the remote node agreement checks.
 367 This is dangerous, but necessary in some cases (for example failing
 368 over the master role in a 2 node cluster with the original master
 369 down). If the original master then comes up, it won't be able to
 370 start its master daemon because it won't have enough votes, and
 371 neither will the new master, if the master daemon ever needs a restart.
 372 You can pass ``--no-voting`` to **ganeti-masterd** on the new
 373 master to solve this problem, and run **gnt-cluster redist-conf**
 374 to make sure the cluster is consistent again.
 375
 376 MASTER-PING
 377 ~~~~~~~~~~~
 378
 379 **master-ping**
 380
 381 Checks if the master daemon is alive.
 382
 383 If the master daemon is alive and can respond to a basic query (the
 384 equivalent of **gnt-cluster info**), then the exit code of the
 385 command will be 0. If the master daemon is not alive (either due to
 386 a crash or because this is not the master node), the exit code will
 387 be 1.
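
Since the result is reported through the exit code, the command lends
itself to use in scripts. A minimal shell sketch (the messages are
arbitrary)::

    if gnt-cluster master-ping; then
        echo "master daemon is alive"
    else
        echo "master daemon is not reachable from this node"
    fi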
 388
 389 MODIFY
 390 ~~~~~~
 391
 392 | **modify**
 393 | [--vg-name *vg-name*]
 394 | [--no-lvm-storage]
 395 | [--enabled-hypervisors *hypervisors*]
 396 | [--hypervisor-parameters *hypervisor*:*hv-param*=*value*[,*hv-param*=*value*...]]
 397 | [--backend-parameters *be-param*=*value* [,*be-param*=*value*...]]
 398 | [--nic-parameters *nic-param*=*value* [,*nic-param*=*value*...]]
 399 | [--uid-pool *user-id pool definition*]
 400 | [--add-uids *user-id pool definition*]
 401 | [--remove-uids *user-id pool definition*]
 402 | [-C *candidate\_pool\_size*]
 403 | [--maintain-node-health {yes \| no}]
 404 | [--prealloc-wipe-disks {yes \| no}]
 405 | [-I *default instance allocator*]
 406 | [--reserved-lvs=*NAMES*]
 407 | [--node-parameters *ndparams*]
 408 | [--master-netdev *interface-name*]
 409
 410 Modify the options for the cluster.
 411
 412 The ``--vg-name``, ``--no-lvm-storage``, ``--enabled-hypervisors``,
 413 ``--hypervisor-parameters``, ``--backend-parameters``,
 414 ``--nic-parameters``, ``--maintain-node-health``,
 415 ``--prealloc-wipe-disks``, ``--uid-pool``, ``--node-parameters``,
 416 ``--master-netdev`` options are described in the **init** command.
 417
 418 The ``-C`` option specifies the ``candidate_pool_size`` cluster
 419 parameter. This is the number of nodes that the master will try to
 420 keep as master\_candidates. For more details about this role and
 421 other node roles, see ganeti(7). If you increase the size, the
 422 master will automatically promote as many nodes as required and
 423 possible to reach the intended number.
 424
 425 The ``--add-uids`` and ``--remove-uids`` options can be used to
 426 modify the user-id pool by adding/removing a list of user-ids or
 427 user-id ranges.
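
For example, assuming the same list syntax as for ``--uid-pool`` (the
ids are chosen arbitrarily)::

    # gnt-cluster modify --add-uids 4020-4029
    # gnt-cluster modify --remove-uids 4000,4005-4009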
 428
 429 The option ``--reserved-lvs`` specifies a list (comma-separated) of
 430 logical volume group names (regular expressions) that will be
 431 ignored by the cluster verify operation. This is useful if the
 432 volume group used for Ganeti is shared with the system for other
 433 uses. Note that it's not recommended to create and mark as ignored
 434 logical volume names which match Ganeti's own name format (starting
 435 with UUID and then .diskN), as this option only skips the
 436 verification, but not the actual use of the names given.
 437
 438 To remove all reserved logical volumes, pass in an empty argument
 439 to the option, as in ``--reserved-lvs=`` or ``--reserved-lvs ''``.
 440
 441 The ``-I`` option is described in the **init** command. To clear the
 442 default iallocator, just pass an empty string ('').
 443
 444 QUEUE
 445 ~~~~~
 446
 447 **queue** {drain | undrain | info}
 448
 449 Change job queue properties.
 450
 451 The ``drain`` option sets the drain flag on the job queue. No new
 452 jobs will be accepted, but jobs already in the queue will be
 453 processed.
 454
 455 The ``undrain`` option will unset the drain flag on the job queue. New
 456 jobs will be accepted.
 457
 458 The ``info`` option shows the properties of the job queue.
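
One possible use, sketched here, is to drain the queue before
maintenance, check its state, and re-enable it afterwards::

    # gnt-cluster queue drain
    # gnt-cluster queue info
    # gnt-cluster queue undrain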
 459
 460 WATCHER
 461 ~~~~~~~
 462
 463 **watcher** {pause *duration* | continue | info}
 464
 465 Make the watcher pause or let it continue.
 466
 467 The ``pause`` option causes the watcher to pause for *duration*
 468 seconds.
 469
 470 The ``continue`` option will let the watcher continue.
 471
 472 The ``info`` option shows whether the watcher is currently paused.
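
For example, to pause the watcher for one hour (3600 seconds), check
its state, and then resume it early::

    # gnt-cluster watcher pause 3600
    # gnt-cluster watcher info
    # gnt-cluster watcher continue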
 473
 474 REDIST-CONF
 475 ~~~~~~~~~~~
 476
 477 **redist-conf** [--submit]
 478
 479 This command forces a full push of configuration files from the
 480 master node to the other nodes in the cluster. This is normally not
 481 needed, but can be run if the **verify** command complains about
 482 configuration mismatches.
 483
 484 The ``--submit`` option is used to send the job to the master
 485 daemon but not wait for its completion. The job ID will be shown so
 486 that it can be examined via **gnt-job info**.
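
For example (the job ID printed by the first command is left as a
placeholder here)::

    # gnt-cluster redist-conf --submit
    # gnt-job info <job-id>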
 487
 488 REMOVE-TAGS
 489 ~~~~~~~~~~~
 490
 491 **remove-tags** [--from *file*] {*tag*...}
 492
 493 Remove tags from the cluster. If any of the tags do not exist
 494 on the cluster, the entire operation will abort.
 495
 496 If the ``--from`` option is given, the list of tags to be removed will
 497 be extended with the contents of that file (each line becomes a tag).
 498 In this case, there is no need to pass tags on the command line (if
 499 you do, tags from both sources will be removed). A file name of - will
 500 be interpreted as stdin.
 501
 502 RENAME
 503 ~~~~~~
 504
 505 **rename** [-f] {*name*}
 506
 507 Renames the cluster and in the process updates the master IP
 508 address to the one the new name resolves to. At least one of either
 509 the name or the IP address must be different, otherwise the
 510 operation will be aborted.
 511
 512 Note that since this command can be dangerous (especially when run
 513 over SSH), the command will require confirmation unless run with
 514 the ``-f`` option.
 515
 516 RENEW-CRYPTO
 517 ~~~~~~~~~~~~
 518
 519 | **renew-crypto** [-f]
 520 | [--new-cluster-certificate] [--new-confd-hmac-key]
 521 | [--new-rapi-certificate] [--rapi-certificate *rapi-cert*]
 522 | [--new-cluster-domain-secret] [--cluster-domain-secret *filename*]
 523
 524 This command will stop all Ganeti daemons in the cluster and start
 525 them again once the new certificates and keys are replicated. The
 526 options ``--new-cluster-certificate`` and ``--new-confd-hmac-key``
 527 can be used to regenerate the cluster-internal SSL certificate and
 528 the HMAC key used by ganeti-confd(8), respectively.
 529
 530 To generate a new self-signed RAPI certificate (used by
 531 ganeti-rapi(8)) specify ``--new-rapi-certificate``. If you want to
 532 use your own certificate, e.g. one signed by a certificate
 533 authority (CA), pass its filename to ``--rapi-certificate``.
 534
 535 ``--new-cluster-domain-secret`` generates a new, random cluster
 536 domain secret. ``--cluster-domain-secret`` reads the secret from a
 537 file. The cluster domain secret is used to sign information
 538 exchanged between separate clusters via a third party.
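
As a sketch, the following regenerates the cluster certificate and the
HMAC key in one run, and then installs an externally signed RAPI
certificate. The certificate path is a hypothetical placeholder::

    # gnt-cluster renew-crypto --new-cluster-certificate --new-confd-hmac-key
    # gnt-cluster renew-crypto --rapi-certificate /path/to/rapi-cert.pem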
 539
 540 REPAIR-DISK-SIZES
 541 ~~~~~~~~~~~~~~~~~
 542
 543 **repair-disk-sizes** [instance...]
 544
 545 This command checks that the recorded size of the given instance's
 546 disks matches the actual size and updates any mismatches found.
 547 This is needed if the Ganeti configuration is no longer consistent
 548 with reality, as it will impact some disk operations. If no
 549 arguments are given, all instances will be checked.
 550
 551 Note that only active disks can be checked by this command; in case
 552 a disk cannot be activated it's advised to use
 553 **gnt-instance activate-disks --ignore-size ...** to force
 554 activation without regard to the current size.
 555
 556 When all the disk sizes are consistent, the command will return no
 557 output. Otherwise it will log details about the inconsistencies in
 558 the configuration.
 559
 560 SEARCH-TAGS
 561 ~~~~~~~~~~~
 562
 563 **search-tags** {*pattern*}
 564
 565 Searches the tags on all objects in the cluster (the cluster
 566 itself, the nodes and the instances) for a given pattern. The
 567 pattern is interpreted as a regular expression and a search will be
 568 done on it (i.e. the given pattern is not anchored to the beginning
 569 of the string; if you want that, prefix the pattern with ^).
 570
 571 If no tags are matching the pattern, the exit code of the command
 572 will be one. If there is at least one match, the exit code will be
 573 zero. Each match is listed on one line, the object and the tag
 574 separated by a space. The cluster will be listed as /cluster, a
 575 node will be listed as /nodes/*name*, and an instance as
 576 /instances/*name*. Example:
 577
 578 ::
 579
 580     # gnt-cluster search-tags time
 581     /cluster ctime:2007-09-01
 582     /nodes/node1.example.com mtime:2007-10-04
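
To match only at the start of a tag, anchor the pattern with ``^``.
Assuming the cluster carries only the tags shown in the example above,
an anchored search would list just the node tag::

    # gnt-cluster search-tags '^mtime'
    /nodes/node1.example.com mtime:2007-10-04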
 583
 584 VERIFY
 585 ~~~~~~
 586
 587 **verify** [--no-nplus1-mem]
 588
 589 Verify correctness of cluster configuration. This is safe with
 590 respect to running instances, and incurs no downtime of the
 591 instances.
 592
 593 If the ``--no-nplus1-mem`` option is given, Ganeti won't check
 594 whether, if it loses a node, it can restart all the instances on
 595 their secondaries (and report an error otherwise).
 596
 597 VERIFY-DISKS
 598 ~~~~~~~~~~~~
 599
 600 **verify-disks**
 601
 602 The command checks which instances have degraded DRBD disks and
 603 activates the disks of those instances.
 604
 605 This command is run from the **ganeti-watcher** tool, which also
 606 has a different, complementary algorithm for doing this check.
 607 Together, these two should ensure that DRBD disks are kept
 608 consistent.
 609
 610 VERSION
 611 ~~~~~~~
 612
 613 **version**
 614
 615 Show the cluster version.