4 Documents Ganeti version |version|
8 .. highlight:: shell-example
This document serves as a more example-oriented guide to Ganeti; while
the administration guide takes a conceptual approach, here you will
find a step-by-step example of managing instances and the cluster.
Our simulated example cluster will have three machines, named
``node1``, ``node2``, ``node3``. Note that in real life machines will
usually have FQDNs but here we use short names for brevity. We will use
a secondary network for replication data, ``192.0.2.0/24``, with each
node's address on it ending in the node's index (so node1 uses
``192.0.2.1``). The cluster name will be ``example-cluster``. All nodes
have the same simulated hardware configuration: two 750GB disks, 32GB
of memory and 4 CPUs.
25 On this cluster, we will create up to seven instances, named
26 ``instance1`` to ``instance7``.
32 Follow the :doc:`install` document and prepare the nodes. Then it's time
33 to initialise the cluster::
35 $ gnt-cluster init -s %192.0.2.1% --enabled-hypervisors=xen-pvm %example-cluster%
The creation was fine. Let's check that the one node we have is
functioning correctly::
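
  $ gnt-node list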
42 Node DTotal DFree MTotal MNode MFree Pinst Sinst
43 node1 1.3T 1.3T 32.0G 1.0G 30.5G 0 0
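  $ gnt-cluster verify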
45 Mon Oct 26 02:08:51 2009 * Verifying global settings
46 Mon Oct 26 02:08:51 2009 * Gathering data (1 nodes)
47 Mon Oct 26 02:08:52 2009 * Verifying node status
48 Mon Oct 26 02:08:52 2009 * Verifying instance status
49 Mon Oct 26 02:08:52 2009 * Verifying orphan volumes
50 Mon Oct 26 02:08:52 2009 * Verifying remaining instances
51 Mon Oct 26 02:08:52 2009 * Verifying N+1 Memory redundancy
52 Mon Oct 26 02:08:52 2009 * Other Notes
53 Mon Oct 26 02:08:52 2009 * Hooks Results
56 Since this proceeded correctly, let's add the other two nodes::
58 $ gnt-node add -s %192.0.2.2% %node2%
60 Performing this operation is going to replace the ssh daemon keypair
61 on the target machine (node2) with the ones of the current one
62 and grant full intra-cluster ssh root access to/from it
64 Unable to verify hostkey of host xen-devi-5.fra.corp.google.com:
65 f7:…. Do you want to accept it?
67 Mon Oct 26 02:11:53 2009 Authentication to node2 via public key failed, trying password
69 Mon Oct 26 02:11:54 2009 - INFO: Node will be a master candidate
70 $ gnt-node add -s %192.0.2.3% %node3%
72 Performing this operation is going to replace the ssh daemon keypair
73 on the target machine (node3) with the ones of the current one
74 and grant full intra-cluster ssh root access to/from it
77 Mon Oct 26 02:12:43 2009 - INFO: Node will be a master candidate
79 Checking the cluster status again::
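
  $ gnt-node list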
82 Node DTotal DFree MTotal MNode MFree Pinst Sinst
83 node1 1.3T 1.3T 32.0G 1.0G 30.5G 0 0
84 node2 1.3T 1.3T 32.0G 1.0G 30.5G 0 0
85 node3 1.3T 1.3T 32.0G 1.0G 30.5G 0 0
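  $ gnt-cluster verify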
87 Mon Oct 26 02:15:14 2009 * Verifying global settings
88 Mon Oct 26 02:15:14 2009 * Gathering data (3 nodes)
89 Mon Oct 26 02:15:16 2009 * Verifying node status
90 Mon Oct 26 02:15:16 2009 * Verifying instance status
91 Mon Oct 26 02:15:16 2009 * Verifying orphan volumes
92 Mon Oct 26 02:15:16 2009 * Verifying remaining instances
93 Mon Oct 26 02:15:16 2009 * Verifying N+1 Memory redundancy
94 Mon Oct 26 02:15:16 2009 * Other Notes
95 Mon Oct 26 02:15:16 2009 * Hooks Results
98 And let's check that we have a valid OS::
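
  $ gnt-os list
  Name
  debootstrap
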
Now that the cluster is created, it is time to check that the hardware
works correctly, that the hypervisor can actually create instances,
etc. This is done with the *burnin* tool (using the debootstrap OS
definition), as described in the admin guide. Similar output lines are
replaced with ``…`` in the log below::

113 $ /usr/lib/ganeti/tools/burnin -o debootstrap -p instance{1..5}
114 - Testing global parameters
123 * Submitted job ID(s) 157, 158, 159, 160, 161
124 waiting for job 157 for instance1
126 waiting for job 161 for instance5
127 - Replacing disks on the same nodes
129 run replace_on_secondary
130 run replace_on_primary
133 run replace_on_secondary
134 run replace_on_primary
135 * Submitted job ID(s) 162, 163, 164, 165, 166
136 waiting for job 162 for instance1
138 - Changing the secondary node
140 run replace_new_secondary node3
142 run replace_new_secondary node1
145 run replace_new_secondary node1
146 * Submitted job ID(s) 167, 168, 169, 170, 171
147 waiting for job 167 for instance1
151 increase disk/0 by 128 MB
154 increase disk/0 by 128 MB
155 * Submitted job ID(s) 173, 174, 175, 176, 177
156 waiting for job 173 for instance1
158 - Failing over instances
162 * Submitted job ID(s) 179, 180, 181, 182, 183
163 waiting for job 179 for instance1
165 - Migrating instances
167 migration and migration cleanup
170 migration and migration cleanup
171 * Submitted job ID(s) 184, 185, 186, 187, 188
172 waiting for job 184 for instance1
174 - Exporting and re-importing instances
178 import from node3 to node1, node2
184 import from node1 to node2, node3
186 * Submitted job ID(s) 196, 197, 198, 199, 200
187 waiting for job 196 for instance1
189 - Reinstalling instances
191 reinstall without passing the OS
192 reinstall specifying the OS
195 reinstall without passing the OS
196 reinstall specifying the OS
197 * Submitted job ID(s) 203, 204, 205, 206, 207
198 waiting for job 203 for instance1
200 - Rebooting instances
202 reboot with type 'hard'
203 reboot with type 'soft'
204 reboot with type 'full'
207 reboot with type 'hard'
208 reboot with type 'soft'
209 reboot with type 'full'
210 * Submitted job ID(s) 208, 209, 210, 211, 212
211 waiting for job 208 for instance1
213 - Adding and removing disks
221 * Submitted job ID(s) 213, 214, 215, 216, 217
222 waiting for job 213 for instance1
224 - Adding and removing NICs
232 * Submitted job ID(s) 218, 219, 220, 221, 222
233 waiting for job 218 for instance1
235 - Activating/deactivating disks
237 activate disks when online
238 activate disks when offline
239 deactivate disks (when offline)
242 activate disks when online
243 activate disks when offline
244 deactivate disks (when offline)
245 * Submitted job ID(s) 223, 224, 225, 226, 227
246 waiting for job 223 for instance1
248 - Stopping and starting instances
252 * Submitted job ID(s) 230, 231, 232, 233, 234
253 waiting for job 230 for instance1
259 * Submitted job ID(s) 235, 236, 237, 238, 239
260 waiting for job 235 for instance1
The output above shows which operations the burn-in performs. Ideally,
the burn-in should proceed successfully through all the steps and end
cleanly, without throwing errors.
At this point, Ganeti and the hardware seem to be functioning
correctly, so we'll follow up with creating the instances manually::

  $ gnt-instance add -t drbd -o debootstrap -s %256m% %instance1%
278 Mon Oct 26 04:06:52 2009 - INFO: Selected nodes for instance instance1 via iallocator hail: node2, node3
279 Mon Oct 26 04:06:53 2009 * creating instance disks...
280 Mon Oct 26 04:06:57 2009 adding instance instance1 to cluster config
281 Mon Oct 26 04:06:57 2009 - INFO: Waiting for instance instance1 to sync disks.
282 Mon Oct 26 04:06:57 2009 - INFO: - device disk/0: 20.00\% done, 4 estimated seconds remaining
283 Mon Oct 26 04:07:01 2009 - INFO: Instance instance1's disks are in sync.
284 Mon Oct 26 04:07:01 2009 creating os for instance instance1 on node node2
285 Mon Oct 26 04:07:01 2009 * running the instance OS create scripts...
286 Mon Oct 26 04:07:14 2009 * starting instance...
287 $ gnt-instance add -t drbd -o debootstrap -s %256m% -n %node1%:%node2% %instance2%
288 Mon Oct 26 04:11:37 2009 * creating instance disks...
289 Mon Oct 26 04:11:40 2009 adding instance instance2 to cluster config
290 Mon Oct 26 04:11:41 2009 - INFO: Waiting for instance instance2 to sync disks.
291 Mon Oct 26 04:11:41 2009 - INFO: - device disk/0: 35.40\% done, 1 estimated seconds remaining
292 Mon Oct 26 04:11:42 2009 - INFO: - device disk/0: 58.50\% done, 1 estimated seconds remaining
293 Mon Oct 26 04:11:43 2009 - INFO: - device disk/0: 86.20\% done, 0 estimated seconds remaining
294 Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 92.40\% done, 0 estimated seconds remaining
295 Mon Oct 26 04:11:44 2009 - INFO: - device disk/0: 97.00\% done, 0 estimated seconds remaining
296 Mon Oct 26 04:11:44 2009 - INFO: Instance instance2's disks are in sync.
297 Mon Oct 26 04:11:44 2009 creating os for instance instance2 on node node1
298 Mon Oct 26 04:11:44 2009 * running the instance OS create scripts...
299 Mon Oct 26 04:11:57 2009 * starting instance...
302 The above shows one instance created via an iallocator script, and one
303 being created with manual node assignment. The other three instances
304 were also created and now it's time to check them::
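
  $ gnt-instance list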
307 Instance Hypervisor OS Primary_node Status Memory
308 instance1 xen-pvm debootstrap node2 running 128M
309 instance2 xen-pvm debootstrap node1 running 128M
310 instance3 xen-pvm debootstrap node1 running 128M
311 instance4 xen-pvm debootstrap node3 running 128M
312 instance5 xen-pvm debootstrap node2 running 128M
317 Accessing an instance's console is easy::
319 $ gnt-instance console %instance2%
320 [ 0.000000] Bootdata ok (command line is root=/dev/sda1 ro)
321 [ 0.000000] Linux version 2.6…
322 [ 0.000000] BIOS-provided physical RAM map:
323 [ 0.000000] Xen: 0000000000000000 - 0000000008800000 (usable)
324 [13138176.018071] Built 1 zonelists. Total pages: 34816
325 [13138176.018074] Kernel command line: root=/dev/sda1 ro
326 [13138176.018694] Initializing CPU#0
328 Checking file systems...fsck 1.41.3 (12-Oct-2008)
330 Setting kernel variables (/etc/sysctl.conf)...done.
331 Mounting local filesystems...done.
332 Activating swapfile swap...done.
333 Setting up networking....
334 Configuring network interfaces...done.
335 Setting console screen modes and fonts.
336 INIT: Entering runlevel: 2
337 Starting enhanced syslogd: rsyslogd.
338 Starting periodic command scheduler: crond.
340 Debian GNU/Linux 5.0 instance2 tty1
At this moment you can log in to the instance and, after configuring
the network (and doing this on all instances), we can check their
connectivity::

348 $ fping %instance{1..5}%
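  instance1 is alive
  instance2 is alive
  instance3 is alive
  instance4 is alive
  instance5 is alive
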
359 Removing unwanted instances is also easy::
361 $ gnt-instance remove %instance5%
362 This will remove the volumes of the instance instance5 (including
363 mirrors), thus removing all the data of the instance. Continue?
368 Recovering from hardware failures
369 ---------------------------------
371 Recovering from node failure
372 ++++++++++++++++++++++++++++
374 We are now left with four instances. Assume that at this point, node3,
375 which has one primary and one secondary instance, crashes::
377 $ gnt-node info %node3%
379 primary ip: 198.51.100.1
380 secondary ip: 192.0.2.3
381 master candidate: True
384 primary for instances:
386 secondary for instances:
At this point, the primary instance of that node (instance4) is down,
but the secondary instance (instance1) is not affected, except that it
has lost disk redundancy::

395 $ fping %instance{1,4}%
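  instance1 is alive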
397 instance4 is unreachable
400 If we try to check the status of instance4 via the instance info
401 command, it fails because it tries to contact node3 which is down::
403 $ gnt-instance info %instance4%
404 Failure: command execution error:
405 Error checking node node3: Connection failed (113: No route to host)
So we need to mark node3 as being *offline*, and thus Ganeti won't talk
to it anymore::

411 $ gnt-node modify -O yes -f %node3%
412 Mon Oct 26 04:34:12 2009 - WARNING: Not enough master candidates (desired 10, new value will be 2)
413 Mon Oct 26 04:34:15 2009 - WARNING: Communication failure to node node3: Connection failed (113: No route to host)
416 - master_candidate -> auto-demotion due to offline
419 And now we can failover the instance::
421 $ gnt-instance failover %instance4%
422 Failover will happen to image instance4. This requires a shutdown of
423 the instance. Continue?
425 Mon Oct 26 04:35:34 2009 * checking disk consistency between source and target
426 Failure: command execution error:
427 Disk disk/0 is degraded on target node, aborting failover.
428 $ gnt-instance failover --ignore-consistency %instance4%
429 Failover will happen to image instance4. This requires a shutdown of
430 the instance. Continue?
432 Mon Oct 26 04:35:47 2009 * checking disk consistency between source and target
433 Mon Oct 26 04:35:47 2009 * shutting down instance on source node
434 Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown instance instance4 on node node3. Proceeding anyway. Please make sure node node3 is down. Error details: Node is marked offline
435 Mon Oct 26 04:35:47 2009 * deactivating the instance's disks on source node
436 Mon Oct 26 04:35:47 2009 - WARNING: Could not shutdown block device disk/0 on node node3: Node is marked offline
437 Mon Oct 26 04:35:47 2009 * activating the instance's disks on target node
438 Mon Oct 26 04:35:47 2009 - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline
439 Mon Oct 26 04:35:48 2009 * starting the instance on the target node
Note that in our first attempt, Ganeti refused to do the failover since
it wasn't sure what the status of the instance's disks was. We pass the
``--ignore-consistency`` flag and then the failover can proceed::
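
  $ gnt-instance list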
447 Instance Hypervisor OS Primary_node Status Memory
448 instance1 xen-pvm debootstrap node2 running 128M
449 instance2 xen-pvm debootstrap node1 running 128M
450 instance3 xen-pvm debootstrap node1 running 128M
451 instance4 xen-pvm debootstrap node1 running 128M
But at this point, both instance1 and instance4 are without disk
redundancy::

457 $ gnt-instance info %instance1%
458 Instance name: instance1
459 UUID: 45173e82-d1fa-417c-8758-7d582ab7eef4
461 Creation time: 2009-10-26 04:06:57
462 Modification time: 2009-10-26 04:07:14
463 State: configured to be up, actual state is up
467 Operating system: debootstrap
468 Allocated network port: None
470 - root_path: default (/dev/sda1)
471 - kernel_args: default (ro)
472 - use_bootloader: default (False)
473 - bootloader_args: default ()
474 - bootloader_path: default ()
475 - kernel_path: default (/boot/vmlinuz-2.6-xenU)
476 - initrd_path: default ()
482 - nic/0: MAC: aa:00:00:78:da:63, IP: None, mode: bridged, link: xen-br0
484 - disk/0: drbd8, size 256M
486 nodeA: node2, minor=0
487 nodeB: node3, minor=0
489 auth key: 8e950e3cec6854b0181fbc3a6058657701f2d458
490 on primary: /dev/drbd0 (147:0) in sync, status *DEGRADED*
492 - child 0: lvm, size 256M
493 logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data
494 on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_data (254:0)
495 - child 1: lvm, size 128M
496 logical_id: xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta
497 on primary: /dev/xenvg/22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta (254:1)
The output is similar for instance4. In order to recover this, we need
to run the node evacuate command, which will change the instances'
secondary node from the current one to a new one (in this case, we only
have two working nodes, so all instances will end up on nodes one and
two)::

504 $ gnt-node evacuate -I hail %node3%
505 Relocate instance(s) 'instance1','instance4' from node
506 node3 using iallocator hail?
508 Mon Oct 26 05:05:39 2009 - INFO: Selected new secondary for instance 'instance1': node1
509 Mon Oct 26 05:05:40 2009 - INFO: Selected new secondary for instance 'instance4': node2
510 Mon Oct 26 05:05:40 2009 Replacing disk(s) 0 for instance1
511 Mon Oct 26 05:05:40 2009 STEP 1/6 Check device existence
512 Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 on node2
513 Mon Oct 26 05:05:40 2009 - INFO: Checking volume groups
514 Mon Oct 26 05:05:40 2009 STEP 2/6 Check peer consistency
515 Mon Oct 26 05:05:40 2009 - INFO: Checking disk/0 consistency on node node2
516 Mon Oct 26 05:05:40 2009 STEP 3/6 Allocate new storage
517 Mon Oct 26 05:05:40 2009 - INFO: Adding new local storage on node1 for disk/0
518 Mon Oct 26 05:05:41 2009 STEP 4/6 Changing drbd configuration
519 Mon Oct 26 05:05:41 2009 - INFO: activating a new drbd on node1 for disk/0
520 Mon Oct 26 05:05:42 2009 - INFO: Shutting down drbd for disk/0 on old node
521 Mon Oct 26 05:05:42 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
522 Mon Oct 26 05:05:42 2009 Hint: Please cleanup this device manually as soon as possible
523 Mon Oct 26 05:05:42 2009 - INFO: Detaching primary drbds from the network (=> standalone)
524 Mon Oct 26 05:05:42 2009 - INFO: Updating instance configuration
525 Mon Oct 26 05:05:45 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
526 Mon Oct 26 05:05:46 2009 STEP 5/6 Sync devices
527 Mon Oct 26 05:05:46 2009 - INFO: Waiting for instance instance1 to sync disks.
528 Mon Oct 26 05:05:46 2009 - INFO: - device disk/0: 13.90\% done, 7 estimated seconds remaining
529 Mon Oct 26 05:05:53 2009 - INFO: Instance instance1's disks are in sync.
530 Mon Oct 26 05:05:53 2009 STEP 6/6 Removing old storage
531 Mon Oct 26 05:05:53 2009 - INFO: Remove logical volumes for 0
532 Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
533 Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
534 Mon Oct 26 05:05:53 2009 - WARNING: Can't remove old LV: Node is marked offline
535 Mon Oct 26 05:05:53 2009 Hint: remove unused LVs manually
536 Mon Oct 26 05:05:53 2009 Replacing disk(s) 0 for instance4
537 Mon Oct 26 05:05:53 2009 STEP 1/6 Check device existence
538 Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 on node1
539 Mon Oct 26 05:05:53 2009 - INFO: Checking volume groups
540 Mon Oct 26 05:05:53 2009 STEP 2/6 Check peer consistency
541 Mon Oct 26 05:05:53 2009 - INFO: Checking disk/0 consistency on node node1
542 Mon Oct 26 05:05:54 2009 STEP 3/6 Allocate new storage
543 Mon Oct 26 05:05:54 2009 - INFO: Adding new local storage on node2 for disk/0
544 Mon Oct 26 05:05:54 2009 STEP 4/6 Changing drbd configuration
545 Mon Oct 26 05:05:54 2009 - INFO: activating a new drbd on node2 for disk/0
546 Mon Oct 26 05:05:55 2009 - INFO: Shutting down drbd for disk/0 on old node
547 Mon Oct 26 05:05:55 2009 - WARNING: Failed to shutdown drbd for disk/0 on oldnode: Node is marked offline
548 Mon Oct 26 05:05:55 2009 Hint: Please cleanup this device manually as soon as possible
549 Mon Oct 26 05:05:55 2009 - INFO: Detaching primary drbds from the network (=> standalone)
550 Mon Oct 26 05:05:55 2009 - INFO: Updating instance configuration
551 Mon Oct 26 05:05:55 2009 - INFO: Attaching primary drbds to new secondary (standalone => connected)
552 Mon Oct 26 05:05:56 2009 STEP 5/6 Sync devices
553 Mon Oct 26 05:05:56 2009 - INFO: Waiting for instance instance4 to sync disks.
554 Mon Oct 26 05:05:56 2009 - INFO: - device disk/0: 12.40\% done, 8 estimated seconds remaining
555 Mon Oct 26 05:06:04 2009 - INFO: Instance instance4's disks are in sync.
556 Mon Oct 26 05:06:04 2009 STEP 6/6 Removing old storage
557 Mon Oct 26 05:06:04 2009 - INFO: Remove logical volumes for 0
558 Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
559 Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
560 Mon Oct 26 05:06:04 2009 - WARNING: Can't remove old LV: Node is marked offline
561 Mon Oct 26 05:06:04 2009 Hint: remove unused LVs manually
564 And now node3 is completely free of instances and can be repaired::
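
  $ gnt-node list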
567 Node DTotal DFree MTotal MNode MFree Pinst Sinst
568 node1 1.3T 1.3T 32.0G 1.0G 30.2G 3 1
569 node2 1.3T 1.3T 32.0G 1.0G 30.4G 1 3
572 Re-adding a node to the cluster
573 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
575 Let's say node3 has been repaired and is now ready to be
576 reused. Re-adding it is simple::
578 $ gnt-node add --readd %node3%
579 The authenticity of host 'node3 (198.51.100.1)' can't be established.
580 RSA key fingerprint is 9f:2e:5a:2e:e0:bd:00:09:e4:5c:32:f2:27:57:7a:f4.
581 Are you sure you want to continue connecting (yes/no)? yes
582 Mon Oct 26 05:27:39 2009 - INFO: Readding a node, the offline/drained flags were reset
583 Mon Oct 26 05:27:39 2009 - INFO: Node will be a master candidate
585 And it is now working again::
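
  $ gnt-node list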
588 Node DTotal DFree MTotal MNode MFree Pinst Sinst
589 node1 1.3T 1.3T 32.0G 1.0G 30.2G 3 1
590 node2 1.3T 1.3T 32.0G 1.0G 30.4G 1 3
591 node3 1.3T 1.3T 32.0G 1.0G 30.4G 0 0
.. note:: If Ganeti has been built with the htools
   component enabled, you can shuffle the instances around to make
   better use of the nodes.
A disk failure is simpler than a full node failure. First, a single
disk failure should not cause data loss for any redundant instance;
only the performance of some instances might be reduced due to more
network traffic.
Let's take the cluster status in the above listing, and check what
volumes are in use by the instances on node2::

608 $ gnt-node volumes -o phys,instance %node2%
620 You can see that all instances on node2 have logical volumes on
621 ``/dev/sdb1``. Let's simulate a disk failure on that disk::
625 $ echo offline > /sys/block/sdb/device/state
627 /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
628 /dev/sdb1: read failed after 0 of 4096 at 750153695232: Input/output error
629 /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
630 Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
631 Couldn't find all physical volumes for volume group xenvg.
632 /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
633 /dev/sdb1: read failed after 0 of 4096 at 0: Input/output error
634 Couldn't find device with uuid '954bJA-mNL0-7ydj-sdpW-nc2C-ZrCi-zFp91c'.
635 Couldn't find all physical volumes for volume group xenvg.
636 Volume group xenvg not found
639 At this point, the node is broken and if we are to examine
640 instance2 we get (simplified output shown)::
642 $ gnt-instance info %instance2%
643 Instance name: instance2
644 State: configured to be up, actual state is up
649 - disk/0: drbd8, size 256M
650 on primary: /dev/drbd0 (147:0) in sync, status ok
651 on secondary: /dev/drbd1 (147:1) in sync, status *DEGRADED* *MISSING DISK*
This instance has a secondary only on node2. Let's verify a primary
instance of node2::

656 $ gnt-instance info %instance1%
657 Instance name: instance1
658 State: configured to be up, actual state is up
663 - disk/0: drbd8, size 256M
664 on primary: /dev/drbd0 (147:0) in sync, status *DEGRADED* *MISSING DISK*
665 on secondary: /dev/drbd3 (147:3) in sync, status ok
666 $ gnt-instance console %instance1%
668 Debian GNU/Linux 5.0 instance1 tty1
670 instance1 login: root
671 Last login: Tue Oct 27 01:24:09 UTC 2009 on tty1
672 instance1:~# date > test
674 instance1:~# cat test
675 Tue Oct 27 01:25:20 UTC 2009
676 instance1:~# dmesg|tail
677 [5439785.235448] NET: Registered protocol family 15
678 [5439785.235489] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
679 [5439785.235495] All bugs added by David S. Miller <davem@redhat.com>
680 [5439785.235517] XENBUS: Device with no driver: device/console/0
681 [5439785.236576] kjournald starting. Commit interval 5 seconds
682 [5439785.236588] EXT3-fs: mounted filesystem with ordered data mode.
683 [5439785.236625] VFS: Mounted root (ext3 filesystem) readonly.
684 [5439785.236663] Freeing unused kernel memory: 172k freed
685 [5439787.533779] EXT3 FS on sda1, internal journal
686 [5440655.065431] eth0: no IPv6 routers present
As you can see, the instance is running fine and doesn't see any disk
issues. It is now time to fix node2 and re-establish redundancy for the
involved instances.
.. note:: For Ganeti 2.0 we need to manually fix the volume group on
   node2 by running ``vgreduce --removemissing xenvg``.

::

698 $ gnt-node repair-storage %node2% lvm-vg %xenvg%
699 Mon Oct 26 18:14:03 2009 Repairing storage unit 'xenvg' on node2 ...
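  $ ssh %node2% vgs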
701 VG #PV #LV #SN Attr VSize VFree
702 xenvg 1 8 0 wz--n- 673.84G 673.84G
This has removed the 'bad' disk from the volume group, which is now
left with only one PV. We can now replace the disks for the involved
instances::

709 $ for i in %instance{1..4}%; do gnt-instance replace-disks -a $i; done
710 Mon Oct 26 18:15:38 2009 Replacing disk(s) 0 for instance1
711 Mon Oct 26 18:15:38 2009 STEP 1/6 Check device existence
712 Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node1
713 Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 on node2
714 Mon Oct 26 18:15:38 2009 - INFO: Checking volume groups
715 Mon Oct 26 18:15:38 2009 STEP 2/6 Check peer consistency
716 Mon Oct 26 18:15:38 2009 - INFO: Checking disk/0 consistency on node node1
717 Mon Oct 26 18:15:39 2009 STEP 3/6 Allocate new storage
718 Mon Oct 26 18:15:39 2009 - INFO: Adding storage on node2 for disk/0
719 Mon Oct 26 18:15:39 2009 STEP 4/6 Changing drbd configuration
720 Mon Oct 26 18:15:39 2009 - INFO: Detaching disk/0 drbd from local storage
721 Mon Oct 26 18:15:40 2009 - INFO: Renaming the old LVs on the target node
722 Mon Oct 26 18:15:40 2009 - INFO: Renaming the new LVs on the target node
723 Mon Oct 26 18:15:40 2009 - INFO: Adding new mirror component on node2
724 Mon Oct 26 18:15:41 2009 STEP 5/6 Sync devices
725 Mon Oct 26 18:15:41 2009 - INFO: Waiting for instance instance1 to sync disks.
726 Mon Oct 26 18:15:41 2009 - INFO: - device disk/0: 12.40\% done, 9 estimated seconds remaining
727 Mon Oct 26 18:15:50 2009 - INFO: Instance instance1's disks are in sync.
728 Mon Oct 26 18:15:50 2009 STEP 6/6 Removing old storage
729 Mon Oct 26 18:15:50 2009 - INFO: Remove logical volumes for disk/0
730 Mon Oct 26 18:15:52 2009 Replacing disk(s) 0 for instance2
731 Mon Oct 26 18:15:52 2009 STEP 1/6 Check device existence
733 Mon Oct 26 18:16:01 2009 STEP 6/6 Removing old storage
734 Mon Oct 26 18:16:01 2009 - INFO: Remove logical volumes for disk/0
735 Mon Oct 26 18:16:02 2009 Replacing disk(s) 0 for instance3
736 Mon Oct 26 18:16:02 2009 STEP 1/6 Check device existence
738 Mon Oct 26 18:16:09 2009 STEP 6/6 Removing old storage
739 Mon Oct 26 18:16:09 2009 - INFO: Remove logical volumes for disk/0
740 Mon Oct 26 18:16:10 2009 Replacing disk(s) 0 for instance4
741 Mon Oct 26 18:16:10 2009 STEP 1/6 Check device existence
743 Mon Oct 26 18:16:18 2009 STEP 6/6 Removing old storage
744 Mon Oct 26 18:16:18 2009 - INFO: Remove logical volumes for disk/0
At this point, all instances should be healthy again.
.. note:: Ganeti 2.0 doesn't have the ``-a`` option to replace-disks, so
   there you have to run the loop twice, once over primary instances
   with argument ``-p`` and once over secondary instances with argument
   ``-s``, but otherwise the operations are similar::
754 $ gnt-instance replace-disks -p instance1
756 $ for i in %instance{2..4}%; do gnt-instance replace-disks -s $i; done
758 Common cluster problems
759 -----------------------
761 There are a number of small issues that might appear on a cluster that
762 can be solved easily as long as the issue is properly identified. For
763 this exercise we will consider the case of node3, which was broken
764 previously and re-added to the cluster without reinstallation. Running
765 cluster verify on the cluster reports::
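
  $ gnt-cluster verify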
768 Mon Oct 26 18:30:08 2009 * Verifying global settings
769 Mon Oct 26 18:30:08 2009 * Gathering data (3 nodes)
770 Mon Oct 26 18:30:10 2009 * Verifying node status
771 Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 0 is in use
772 Mon Oct 26 18:30:10 2009 - ERROR: node node3: unallocated drbd minor 1 is in use
773 Mon Oct 26 18:30:10 2009 * Verifying instance status
774 Mon Oct 26 18:30:10 2009 - ERROR: instance instance4: instance should not run on node node3
775 Mon Oct 26 18:30:10 2009 * Verifying orphan volumes
776 Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_data is unknown
777 Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data is unknown
778 Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta is unknown
779 Mon Oct 26 18:30:10 2009 - ERROR: node node3: volume 22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta is unknown
780 Mon Oct 26 18:30:10 2009 * Verifying remaining instances
781 Mon Oct 26 18:30:10 2009 * Verifying N+1 Memory redundancy
782 Mon Oct 26 18:30:10 2009 * Other Notes
783 Mon Oct 26 18:30:10 2009 * Hooks Results
As you can see, *instance4* has a copy running on node3, because we
forced the failover when node3 failed. This case is dangerous, as the
two copies will have the same IP and MAC address, wreaking havoc on the
network environment and on anyone who tries to use the instance.
Ganeti doesn't directly handle this case. It is recommended to log on
to node3 and run::

797 $ xm destroy %instance4%
799 Unallocated DRBD minors
800 +++++++++++++++++++++++
There are still unallocated DRBD minors on node3. Again, these are not
handled by Ganeti directly and need to be cleaned up on the node itself
via DRBD commands::

807 $ drbdsetup /dev/drbd%0% down
808 $ drbdsetup /dev/drbd%1% down
At this point, the only remaining problem should be the so-called
*orphan* volumes. This can also happen in the case of an aborted
disk-replace, or a similar situation where Ganeti was not able to
recover automatically. Here you need to remove them manually on node3
via LVM commands::

822 Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data"? [y/n]: %y%
823 Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_data" successfully removed
824 Do you really want to remove active logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta"? [y/n]: %y%
825 Logical volume "22459cf8-117d-4bea-a1aa-791667d07800.disk0_meta" successfully removed
826 Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data"? [y/n]: %y%
827 Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_data" successfully removed
828 Do you really want to remove active logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta"? [y/n]: %y%
829 Logical volume "1aaf4716-e57f-4101-a8d6-03af5da9dc50.disk0_meta" successfully removed
832 At this point cluster verify shouldn't complain anymore::
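
  $ gnt-cluster verify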
835 Mon Oct 26 18:37:51 2009 * Verifying global settings
836 Mon Oct 26 18:37:51 2009 * Gathering data (3 nodes)
837 Mon Oct 26 18:37:53 2009 * Verifying node status
838 Mon Oct 26 18:37:53 2009 * Verifying instance status
839 Mon Oct 26 18:37:53 2009 * Verifying orphan volumes
840 Mon Oct 26 18:37:53 2009 * Verifying remaining instances
841 Mon Oct 26 18:37:53 2009 * Verifying N+1 Memory redundancy
842 Mon Oct 26 18:37:53 2009 * Other Notes
843 Mon Oct 26 18:37:53 2009 * Hooks Results
Since redundant instances in Ganeti have a primary/secondary model,
each node needs to set aside enough memory so that, if one of its peer
nodes fails, all the instances that have the failed node as primary and
this node as secondary can be failed over to it. More specifically, if
instance2 has node1 as primary and node2 as secondary (and node1 and
node2 do not have any other instances in this layout), then node2 must
have enough free memory so that if node1 fails, we can failover
instance2 without any other operations (thus reducing the downtime
window). Let's increase the memory of the current instances to 4G, and
add three new instances, two on node2:node3 with 8GB of RAM and one on
node1:node2, with 12GB of RAM (numbers chosen so that we run out of
memory)::

861 $ gnt-instance modify -B memory=%4G% %instance1%
862 Modified instance instance1
865 Please don't forget that these parameters take effect only at the next start of the instance.
866 $ gnt-instance modify …
868 $ gnt-instance add -t drbd -n %node2%:%node3% -s %512m% -B memory=%8G% -o %debootstrap% %instance5%
870 $ gnt-instance add -t drbd -n %node2%:%node3% -s %512m% -B memory=%8G% -o %debootstrap% %instance6%
872 $ gnt-instance add -t drbd -n %node1%:%node2% -s %512m% -B memory=%8G% -o %debootstrap% %instance7%
873 $ gnt-instance reboot --all
874 The reboot will operate on 7 instances.
875 Do you want to continue?
885 Submitted jobs 677, 678, 679, 680, 681, 682, 683
886 Waiting for job 677 for instance1...
887 Waiting for job 678 for instance2...
888 Waiting for job 679 for instance3...
889 Waiting for job 680 for instance4...
890 Waiting for job 681 for instance5...
891 Waiting for job 682 for instance6...
892 Waiting for job 683 for instance7...
We rebooted the instances for the memory changes to take effect. Now
the cluster looks like this::
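
  $ gnt-node list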
899 Node DTotal DFree MTotal MNode MFree Pinst Sinst
900 node1 1.3T 1.3T 32.0G 1.0G 6.5G 4 1
901 node2 1.3T 1.3T 32.0G 1.0G 10.5G 3 4
902 node3 1.3T 1.3T 32.0G 1.0G 30.5G 0 2
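  $ gnt-cluster verify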
904 Mon Oct 26 18:59:36 2009 * Verifying global settings
905 Mon Oct 26 18:59:36 2009 * Gathering data (3 nodes)
906 Mon Oct 26 18:59:37 2009 * Verifying node status
907 Mon Oct 26 18:59:37 2009 * Verifying instance status
908 Mon Oct 26 18:59:37 2009 * Verifying orphan volumes
909 Mon Oct 26 18:59:37 2009 * Verifying remaining instances
910 Mon Oct 26 18:59:37 2009 * Verifying N+1 Memory redundancy
911 Mon Oct 26 18:59:37 2009 - ERROR: node node2: not enough memory to accommodate instance failovers should node node1 fail
912 Mon Oct 26 18:59:37 2009 * Other Notes
913 Mon Oct 26 18:59:37 2009 * Hooks Results
916 The cluster verify error above shows that if node1 fails, node2 will not
917 have enough memory to failover all primary instances on node1 to it. To
918 solve this, you have a number of options:
920 - try to manually move instances around (but this can become complicated
921 for any non-trivial cluster)
922 - try to reduce the minimum memory of some instances on the source node
923 of the N+1 failure (in the example above ``node1``): this will allow
  it to start and be failed over/migrated with less than its maximum
  memory
926 - try to reduce the runtime/maximum memory of some instances on the
927 destination node of the N+1 failure (in the example above ``node2``)
928 to create additional available node memory (check the :doc:`admin`
929 guide for what Ganeti will and won't automatically do in regards to
930 instance runtime memory modification)
- if Ganeti has been built with the htools package enabled, you can run
  the ``hbal`` tool, which will try to compute an automated cluster
  rebalancing solution that complies with the N+1 rule (see the sketch
  below)
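
For example, a minimal ``hbal`` session could look like the following
(a sketch; the exact moves proposed depend on the cluster state)::

  $ hbal -L       # print the current cluster score and the proposed moves
  $ hbal -L -X    # submit and execute the corresponding jobs
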
In case a node has problems with the network (usually the secondary
network, as problems with the primary network will render the node
unusable for Ganeti commands), it will show up in cluster verify as::
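
  $ gnt-cluster verify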
943 Mon Oct 26 19:07:19 2009 * Verifying global settings
944 Mon Oct 26 19:07:19 2009 * Gathering data (3 nodes)
945 Mon Oct 26 19:07:23 2009 * Verifying node status
946 Mon Oct 26 19:07:23 2009 - ERROR: node node1: tcp communication with node 'node3': failure using the secondary interface(s)
947 Mon Oct 26 19:07:23 2009 - ERROR: node node2: tcp communication with node 'node3': failure using the secondary interface(s)
948 Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node1': failure using the secondary interface(s)
949 Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node2': failure using the secondary interface(s)
950 Mon Oct 26 19:07:23 2009 - ERROR: node node3: tcp communication with node 'node3': failure using the secondary interface(s)
951 Mon Oct 26 19:07:23 2009 * Verifying instance status
952 Mon Oct 26 19:07:23 2009 * Verifying orphan volumes
953 Mon Oct 26 19:07:23 2009 * Verifying remaining instances
954 Mon Oct 26 19:07:23 2009 * Verifying N+1 Memory redundancy
955 Mon Oct 26 19:07:23 2009 * Other Notes
956 Mon Oct 26 19:07:23 2009 * Hooks Results
This shows that both node1 and node2 have problems contacting node3
over the secondary network, and node3 has problems contacting them.
From this output it can be deduced that, since node1 and node2 can
communicate between themselves, node3 is the one having problems, and
you need to investigate its network settings/connection.
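
For example, you can check the secondary addresses directly from node1
(a sketch using the addresses of this walkthrough)::

  $ ping -c 3 192.0.2.3
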
968 Since live migration can sometimes fail and leave the instance in an
969 inconsistent state, Ganeti provides a ``--cleanup`` argument to the
970 migrate command that does:
972 - check on which node the instance is actually running (has the
973 command failed before or after the actual migration?)
974 - reconfigure the DRBD disks accordingly
976 It is always safe to run this command as long as the instance has good
data on its primary node (i.e. not showing as degraded). If so, you
can run::

980 $ gnt-instance migrate --cleanup %instance1%
981 Instance instance1 will be recovered from a failed migration. Note
982 that the migration procedure (including cleanup) is **experimental**
  in this version. This might impact the instance if anything goes
  wrong. Continue?
986 Mon Oct 26 19:13:49 2009 Migrating instance instance1
987 Mon Oct 26 19:13:49 2009 * checking where the instance actually runs (if this hangs, the hypervisor might be in a bad state)
988 Mon Oct 26 19:13:49 2009 * instance confirmed to be running on its primary node (node2)
989 Mon Oct 26 19:13:49 2009 * switching node node1 to secondary mode
990 Mon Oct 26 19:13:50 2009 * wait until resync is done
991 Mon Oct 26 19:13:50 2009 * changing into standalone mode
992 Mon Oct 26 19:13:50 2009 * changing disks into single-master mode
993 Mon Oct 26 19:13:50 2009 * wait until resync is done
994 Mon Oct 26 19:13:51 2009 * done
997 In use disks at instance shutdown
998 +++++++++++++++++++++++++++++++++
If you see something like the following when trying to shut down or
deactivate disks for an instance::

1003 $ gnt-instance shutdown %instance1%
1004 Mon Oct 26 19:16:23 2009 - WARNING: Could not shutdown block device disk/0 on node node2: drbd0: can't shutdown drbd device: /dev/drbd0: State change failed: (-12) Device is held open by someone\n
1006 It most likely means something is holding open the underlying DRBD
1007 device. This can be bad if the instance is not running, as it might mean
1008 that there was concurrent access from both the node and the instance to
1009 the disks, but not always (e.g. you could only have had the partitions
1010 activated via ``kpartx``).
To troubleshoot this issue you need to follow standard Linux practices,
and pay attention to the hypervisor being used (a short sketch of these
checks follows the list):
1015 - check if (in the above example) ``/dev/drbd0`` on node2 is being
1016 mounted somewhere (``cat /proc/mounts``)
1017 - check if the device is not being used by device mapper itself:
1018 ``dmsetup ls`` and look for entries of the form ``drbd0pX``, and if so
1019 remove them with either ``kpartx -d`` or ``dmsetup remove``
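
As a sketch, on node2 these checks could look like this (using the
``/dev/drbd0`` device from the warning above)::

  $ grep drbd0 /proc/mounts      # is the device mounted somewhere?
  $ dmsetup ls | grep drbd0      # any drbd0pX device-mapper entries?
  $ kpartx -d /dev/drbd0         # if so, remove the partition mappings
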
For Xen, check whether the hypervisor itself is using the disks::

1023 $ xenstore-ls /local/domain/%0%/backend/vbd|grep -e "domain =" -e physical-device
1024 domain = "instance2"
1025 physical-device = "93:0"
1026 domain = "instance3"
1027 physical-device = "93:1"
1028 domain = "instance4"
1029 physical-device = "93:2"
You can see in the above output that the node exports three disks to
three instances. The ``physical-device`` key is in major:minor format
in hexadecimal, and ``0x93`` represents DRBD's major number. Thus we
can see from the above that instance2 has /dev/drbd0, instance3
/dev/drbd1, and instance4 /dev/drbd2.
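
To double-check such a mapping, you can convert the hexadecimal pair to
decimal and compare it with the device nodes (a sketch for the first
entry above)::

  $ echo $((0x93)):$((0x0))
  147:0
  $ ls -l /dev/drbd0    # should show major 147, minor 0
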
1038 LUXI version mismatch
1039 +++++++++++++++++++++
1041 LUXI is the protocol used for communication between clients and the
1042 master daemon. Starting in Ganeti 2.3, the peers exchange their version
1043 in each message. When they don't match, an error is raised::
1045 $ gnt-node modify -O yes %node3%
1046 Unhandled Ganeti error: LUXI version mismatch, server 2020000, request 2030000
Usually this means that the server and client are from different Ganeti
versions, or that they import their libraries from different paths
(e.g. an older version installed in another place). You can print the
1051 import path for Ganeti's modules using the following command (note that
1052 depending on your setup you might have to use an explicit version in the
1053 Python command, e.g. ``python2.6``)::
1055 python -c 'import ganeti; print ganeti.__file__'
1057 .. vim: set textwidth=72 :