Upgrade notes
=============

.. highlight:: shell-example

This document details the steps needed to upgrade a cluster to newer versions
of Ganeti.

As a general rule the node daemons need to be restarted after each software
upgrade; if using the provided example init.d script, this means running the
following command on all nodes::

    $ /etc/init.d/ganeti restart


2.1 and above
-------------

Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1)
should not need manual intervention. As a safety measure, upgrades between
minor releases (e.g. 2.1.3 to 2.2.0) require the ``cfgupgrade`` command for
changing the configuration version. The steps below are necessary to upgrade
between minor releases.

To run commands on all nodes, the `distributed shell (dsh)
<http://www.netfort.gr.jp/~dancer/software/dsh.html.en>`_ can be used, e.g.
``dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.
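
Where ``dsh`` is not available, a plain ``ssh`` loop can serve the same
purpose. The following is only a sketch: ``run_on_nodes`` and the ``RUNNER``
variable are hypothetical names, not part of Ganeti, and it assumes that the
node list file contains one node name per line and that the current user can
log in to every node via ``ssh``:

.. code-block:: bash

    # Hypothetical helper (not shipped with Ganeti): run a command on every
    # node listed in a file, one node name per line.
    # Set RUNNER=echo to print the commands instead of running them via ssh.
    run_on_nodes() {
        local node_file="$1"
        shift
        local node
        # The "|| [ -n ... ]" part handles a last line without a newline
        while IFS= read -r node || [ -n "$node" ]; do
            # Skip empty lines in the node list
            [ -n "$node" ] || continue
            echo "== $node =="
            # </dev/null keeps ssh from swallowing the rest of the node list
            ${RUNNER:-ssh} "$node" "$@" < /dev/null
        done < "$node_file"
    }

With the node list used in the ``dsh`` example, this could be called as
``run_on_nodes /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``,
which visits the nodes one at a time rather than in parallel.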

#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Install the new Ganeti version on all nodes
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose

   (``cfgupgrade`` supports a number of parameters, run it with
   ``--help`` for more information)

#. Upgrade the directory permissions on all nodes::

    $ /usr/lib/ganeti/ensure-dirs --full-run

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. If you use file storage, check that the
   ``/etc/ganeti/file-storage-paths`` file is correct on all nodes. For
   security reasons it's not copied automatically, but it can be copied
   manually via::

    $ gnt-cluster copyfile /etc/ganeti/file-storage-paths

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
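
The first step above relies on reading the ``gnt-job list`` output by eye. As
a rough sketch, the check can be scripted by filtering on the job status. The
helper names below are hypothetical, and the sketch assumes that finished jobs
are reported with the status ``success``, ``error`` or ``canceled``:

.. code-block:: bash

    # Hypothetical helpers (not shipped with Ganeti).
    # Reads "ID STATUS" lines on stdin and prints the IDs of all jobs whose
    # status is not one of the assumed final states.
    unfinished_jobs() {
        awk 'NF >= 2 && $2 != "success" && $2 != "error" && $2 != "canceled" {
                 print $1
             }'
    }

    # Succeeds only if stdin lists no unfinished jobs; otherwise reports
    # the offending job IDs on stderr and fails.
    queue_is_idle() {
        local pending
        pending=$(unfinished_jobs)
        if [ -n "$pending" ]; then
            echo "Unfinished jobs:" $pending >&2
            return 1
        fi
    }

On the master node the input could come from
``gnt-job list --no-headers -o id,status``, piped into ``queue_is_idle``
before the watcher is paused.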

Reverting an upgrade
~~~~~~~~~~~~~~~~~~~~

Going back between revisions (e.g. 2.1.1 to 2.1.0) requires no manual
intervention, just as for upgrades.

Starting from version 2.8, ``cfgupgrade`` supports the ``--downgrade``
option to bring the configuration back to the previous stable version.
This is useful if you upgrade Ganeti and after some time you run into
problems with the new version. You can downgrade the configuration
without losing the changes made since the upgrade. Any feature not
supported by the old version will be removed from the configuration,
but you get a warning about it. If there is a new feature whose default
value you haven't changed, you don't have to worry about it, as it will
get the same value when you upgrade again.

The procedure is similar to upgrading, but note that you have to
revert the configuration **before** installing the old version.

#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

   You may want to copy all the messages about features that have been
   removed during the downgrade, in case you want to restore them when
   upgrading again.

#. Install the old Ganeti version on all nodes
#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
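
To keep the messages about removed features, as suggested in the
``cfgupgrade`` step above, the output of the downgrade can be saved to a
file. A minimal sketch follows; ``run_logged`` and the log file location are
hypothetical choices, not something Ganeti provides:

.. code-block:: bash

    # Hypothetical helper (not shipped with Ganeti): run a command, save its
    # combined stdout/stderr to a log file (overwriting it), print the saved
    # output afterwards and return the exit status of the command itself.
    run_logged() {
        local log_file="$1"
        shift
        local status
        if "$@" > "$log_file" 2>&1; then
            status=0
        else
            status=$?
        fi
        cat "$log_file"
        return "$status"
    }

It could then be used like this::

    $ run_logged /root/cfgupgrade-downgrade.log \
        /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

Note that the output only appears once the command has finished, which is a
simplification made to keep the exit status handling straightforward.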


2.0 releases
------------

2.0.3 to 2.0.4
~~~~~~~~~~~~~~

No changes are needed except restarting the daemon, but rolling back to 2.0.3
might require editing the configuration.

If you're using Xen-HVM instances, please double-check the network
configuration (``nic_type`` parameter) as the defaults might have changed:
2.0.4 adds any missing configuration items and depending on the version of the
software the cluster has been installed with, some new keys might have been
added.

2.0.1 to 2.0.2/2.0.3
~~~~~~~~~~~~~~~~~~~~

Between 2.0.1 and 2.0.2 there have been some changes in the handling of block
devices, which can cause some issues. Version 2.0.3 was then released, adding
two new options/commands to fix this issue.

If you use DRBD-type instances and see problems in instance start or
activate-disks with messages from DRBD about "lower device too small" or
similar, it is recommended to:

#. Run ``gnt-instance activate-disks --ignore-size $instance`` for each
   of the affected instances
#. Then run ``gnt-cluster repair-disk-sizes`` which will check that
   instances have the correct disk sizes
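
The two steps above can be combined into a small loop. This is only a sketch:
``fix_drbd_sizes`` is a hypothetical helper name, and the names of the
affected instances have to be supplied by the administrator:

.. code-block:: bash

    # Hypothetical helper (not shipped with Ganeti): activate the disks of
    # each given instance while ignoring size mismatches, then let the
    # cluster check and repair the recorded disk sizes.
    fix_drbd_sizes() {
        local instance
        for instance in "$@"; do
            # Stop at the first failure so that it can be investigated
            gnt-instance activate-disks --ignore-size "$instance" || return 1
        done
        gnt-cluster repair-disk-sizes
    }

A call such as ``fix_drbd_sizes instance1 instance2`` (with placeholder
instance names) would process both instances before the final repair step.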

1.2 to 2.0
----------

Prerequisites:

- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no
  ``remote_raid1`` disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don't have the needed
  upgrade tool)

In the steps below, replace :file:`/var/lib` with ``$libdir`` if Ganeti was not
installed with this prefix (e.g. :file:`/usr/local/var`). The same applies to
:file:`/usr/lib`.

Execution (all steps are required in the order given):

#. Make a backup of the current configuration, for safety::

    $ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup

#. Stop all instances::

    $ gnt-instance stop --all

#. Make sure no DRBD devices are in use; the following command should show no
   active minors::

    $ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf

#. Stop the node daemons and rapi daemon on all nodes (note: you should be
   logged in via the master node name, not the cluster name, as the command
   below will remove the cluster ip from the master node)::

    $ gnt-cluster command /etc/init.d/ganeti stop

#. Install the new software on all nodes, either from packaging (if available)
   or from sources; the master daemon will not start and will give error
   messages about a wrong configuration file, which is normal
#. Upgrade the configuration file::

    $ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade12 -v

#. Make sure ``ganeti-noded`` is running on all nodes (and start it if
   not)
#. Start the master daemon::

    $ ganeti-masterd

#. Check that a simple node-list works::

    $ gnt-node list

#. Redistribute updated configuration to all nodes::

    $ gnt-cluster redist-conf
    $ gnt-cluster copyfile /var/lib/ganeti/known_hosts

#. Optional: if needed, install RAPI-specific certificates under
   :file:`/var/lib/ganeti/rapi.pem` and run::

    $ gnt-cluster copyfile /var/lib/ganeti/rapi.pem

#. Run a cluster verify; this should show no problems::

    $ gnt-cluster verify

#. Remove some obsolete files::

    $ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
    $ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor

#. Update the xen pvm (if this was a pvm cluster) setting for 1.2
   compatibility::

    $ gnt-cluster modify -H xen-pvm:root_path=/dev/sda

#. Depending on your setup, you might also want to reset the initrd parameter::

    $ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU

#. Reset the instance autobalance setting to default::

    $ for i in $(gnt-instance list -o name --no-headers); do \
       gnt-instance modify -B auto_balance=default $i; \
      done

#. Optional: start the RAPI daemon::

    $ ganeti-rapi

#. Restart instances::

    $ gnt-instance start --force-multiple --all

At this point, ``gnt-cluster verify`` should show no errors and the migration
is complete.
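
The DRBD check from the third step can also be run locally on a single node
against :file:`/proc/drbd`. The sketch below assumes the DRBD 8.x status
format, in which every minor is listed on a line with a ``cs:`` (connection
state) field; ``no_active_drbd`` is a hypothetical name:

.. code-block:: bash

    # Hypothetical helper (not shipped with Ganeti): succeed only if the
    # given DRBD status file (default: /proc/drbd) lists no minor whose
    # connection state is something other than "Unconfigured".
    # A missing status file yields no matching lines and so counts as
    # "no minors in use".
    no_active_drbd() {
        local status_file="${1:-/proc/drbd}"
        local active
        # Keep only the lines of minors that are not unconfigured
        active=$(grep 'cs:' "$status_file" | grep -v 'cs:Unconf' || true)
        if [ -n "$active" ]; then
            echo "DRBD minors still in use:" >&2
            echo "$active" >&2
            return 1
        fi
    }

Running ``no_active_drbd`` on each node before stopping the daemons gives an
exit status that a script can act on, instead of output that has to be read
manually.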

1.2 releases
------------

1.2.4 to any other higher 1.2 version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

No changes needed. Rollback will usually require manual editing of the
configuration file.

1.2.3 to 1.2.4
~~~~~~~~~~~~~~

No changes needed. Note that going back from 1.2.4 to 1.2.3 will require manual
editing of the configuration file (since we added some new HVM-related
attributes).

1.2.2 to 1.2.3
~~~~~~~~~~~~~~

No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format
change for the DRBD metadata, so in theory this might be **risky**. It is
advised to have (good) backups before doing the upgrade.

1.2.1 to 1.2.2
~~~~~~~~~~~~~~

No changes needed.

1.2.0 to 1.2.1
~~~~~~~~~~~~~~

No changes needed. Only some bugfixes and new additions that don't affect
existing clusters.

1.2.0 beta 3 to 1.2.0
~~~~~~~~~~~~~~~~~~~~~

No changes needed.

1.2.0 beta 2 to beta 3
~~~~~~~~~~~~~~~~~~~~~~

No changes needed. A new version of the debian-etch-instance OS (0.3) has been
released, but upgrading it is not required.

1.2.0 beta 1 to beta 2
~~~~~~~~~~~~~~~~~~~~~~

Beta 2 switched the config file format to JSON. Steps to upgrade:

#. Stop the daemons (``/etc/init.d/ganeti stop``) on all nodes
#. Disable the cron job (default is :file:`/etc/cron.d/ganeti`)
#. Install the new version
#. Make a backup copy of the config file
#. Upgrade the config file using the following command::

    $ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data

#. Start the daemons and run ``gnt-cluster info``, ``gnt-node list`` and
   ``gnt-instance list`` to check if the upgrade process finished successfully

The OS definition also needs to be upgraded. There is a new version of the
debian-etch-instance OS (0.2) that goes along with beta 2.
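
The backup copy mentioned in the fourth step of the list above can be as
simple as a timestamped copy next to the original file. A small sketch, where
``backup_file`` and the suffix format are hypothetical choices:

.. code-block:: bash

    # Hypothetical helper (not shipped with Ganeti): copy a file to
    # "<name>.<timestamp>.bak", preserving mode and timestamps, and print
    # the name of the copy on success.
    backup_file() {
        local src="$1"
        local dest
        dest="$src.$(date +%Y%m%d-%H%M%S).bak"
        cp -p "$src" "$dest" && echo "$dest"
    }

For the config file used in the upgrade command above, the call would be
``backup_file /var/lib/ganeti/config.data``.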

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: