Upgrade notes
=============

.. highlight:: shell-example

This document details the steps needed to upgrade a cluster to newer versions
of Ganeti.

As a general rule the node daemons need to be restarted after each software
upgrade; if using the provided example init.d script, this means running the
following command on all nodes::

    $ /etc/init.d/ganeti restart

    
2.11 and above
--------------

Starting with 2.10, Ganeti supports versions installed in parallel and
automated upgrades. The default configuration for 2.11 and higher is to
install the new version in parallel, without changing the running version.
If both versions, the installed one and the one to upgrade to, are 2.10 or
higher, the actual switch of the live version can be carried out by running
the following command on the master node::

   $ gnt-cluster upgrade --to 2.11

This will carry out the steps described below in the section on upgrades from
2.1 and above. Downgrades to the previous minor version can be done in the same
way, by specifying the lower version in the ``--to`` argument.
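
As a rough illustration of the version condition above, the following sketch
checks that both versions are 2.10 or higher before suggesting
``gnt-cluster upgrade``. The functions ``version_at_least_2_10`` and
``can_auto_upgrade`` are hypothetical helpers, not part of Ganeti, and plain
``major.minor[.revision]`` version strings are assumed.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ganeti): succeeds if a plain
# "major.minor[.revision]" version string is 2.10 or higher.
version_at_least_2_10() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second component
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 10 ]; }
}

# Succeeds if both the installed version ($1) and the target version ($2)
# are 2.10 or higher, the condition stated above for the automated switch.
can_auto_upgrade() {
    version_at_least_2_10 "$1" && version_at_least_2_10 "$2"
}

if can_auto_upgrade 2.10 2.11; then
    echo "automated switch possible: gnt-cluster upgrade --to 2.11"
fi
```

If the check fails, the manual procedure for 2.1 and above applies instead.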

    
2.11
----

When upgrading to 2.11, first apply the instructions of ``2.11 and
above``. 2.11 comes with the new feature of enhanced RPC security
through client certificates. This feature needs to be enabled after the
upgrade by running::

   $ gnt-cluster renew-crypto --new-node-certificates

    
2.1 and above
-------------

Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1)
should not need manual intervention. As a safety measure, upgrades between
minor releases (e.g. 2.1.3 to 2.2.0) require the ``cfgupgrade`` command for
changing the configuration version. The steps below are necessary to upgrade
between minor releases.
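
To make the distinction between revisions and minor releases concrete, here
is a minimal sketch that compares the ``major.minor`` part of two version
strings. ``minor_release`` and ``needs_cfgupgrade`` are hypothetical helper
names used only for illustration; they are not Ganeti tools.

```shell
#!/bin/sh
# Hypothetical helper: print the "major.minor" part of a version such as 2.1.3.
minor_release() {
    major=${1%%.*}
    rest=${1#*.}
    echo "$major.${rest%%.*}"
}

# Succeeds when the two versions belong to different minor releases, i.e.
# when the cfgupgrade step described in this section is expected to apply.
needs_cfgupgrade() {
    [ "$(minor_release "$1")" != "$(minor_release "$2")" ]
}

needs_cfgupgrade 2.1.0 2.1.1 || echo "2.1.0 to 2.1.1: revision change only"
needs_cfgupgrade 2.1.3 2.2.0 && echo "2.1.3 to 2.2.0: minor release, run cfgupgrade"
```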

    
To run commands on all nodes, the `distributed shell (dsh)
<http://www.netfort.gr.jp/~dancer/software/dsh.html.en>`_ can be used, e.g.
``dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Install the new Ganeti version on all nodes
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose

   (``cfgupgrade`` supports a number of parameters, run it with
   ``--help`` for more information)

#. Upgrade the directory permissions on all nodes::

    $ /usr/lib/ganeti/ensure-dirs --full-run

#. Create the (missing) required users and make users part of the required
   groups on all nodes::

    $ /usr/lib/ganeti/tools/users-setup

   This will ask for confirmation. To execute directly, add the ``--yes-do-it``
   option.

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. If you use file storage, check that ``/etc/ganeti/file-storage-paths``
   is correct on all nodes. For security reasons it's not copied
   automatically, but it can be copied manually via::

    $ gnt-cluster copyfile /etc/ganeti/file-storage-paths

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
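
The steps above can be sketched as a small script that prints, in order,
which command runs where. This is only an illustration under assumptions:
the helper names (``on_master``, ``on_all_nodes``, ``upgrade_before_install``,
``upgrade_after_install``) and the ``DRY_RUN`` variable are hypothetical, the
``dsh`` invocation simply mirrors the example given earlier, and the
conditional file-storage step is left out. Installing the new version remains
a manual step between the two phases.

```shell
#!/bin/sh
# Sketch only: by default (DRY_RUN=1) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
NODES_FILE=/var/lib/ganeti/ssconf_online_nodes   # as in the dsh example above

# Run (or print) a command on the master node, where this script is started.
on_master() {
    if [ "$DRY_RUN" = 1 ]; then echo "master: $*"; else "$@"; fi
}

# Run (or print) a command on all online nodes, using dsh as in the example.
on_all_nodes() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "all nodes: $*"
    else
        dsh -M -F 8 -f "$NODES_FILE" "$@"
    fi
}

# Phase 1: everything up to the backup. Check the job list output by hand.
upgrade_before_install() {
    on_master gnt-job list &&
    on_master gnt-cluster watcher pause 1h &&
    on_all_nodes /etc/init.d/ganeti stop &&
    on_master tar czf "/var/lib/ganeti-$(date +%FT%T).tar.gz" -C /var/lib ganeti
}

# Phase 2: run only after the new Ganeti version is installed on all nodes.
# The file-storage-paths check from the list above is not covered here.
upgrade_after_install() {
    on_master /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run &&
    on_master /usr/lib/ganeti/tools/cfgupgrade --verbose &&
    on_all_nodes /usr/lib/ganeti/ensure-dirs --full-run &&
    on_all_nodes /usr/lib/ganeti/tools/users-setup --yes-do-it &&
    on_all_nodes /etc/init.d/ganeti restart &&
    on_master gnt-cluster redist-conf &&
    on_all_nodes /etc/init.d/ganeti restart &&
    on_master gnt-cluster watcher continue &&
    on_master gnt-cluster verify
}
```

Calling ``upgrade_before_install`` and later ``upgrade_after_install`` with the
default ``DRY_RUN=1`` only prints the plan, which makes it easy to review the
order before running anything for real.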

    
Reverting an upgrade
~~~~~~~~~~~~~~~~~~~~

Going back between revisions (e.g. 2.1.1 to 2.1.0) requires no manual
intervention, just as for upgrades.

Starting from version 2.8, ``cfgupgrade`` supports the ``--downgrade``
option to bring the configuration back to the previous stable version.
This is useful if you upgrade Ganeti and after some time you run into
problems with the new version. You can downgrade the configuration
without losing the changes made since the upgrade. Any feature not
supported by the old version will be removed from the configuration, of
course, but you get a warning about it. If there is any new feature and
you haven't changed it from its default value, you don't have to worry
about it, as it will get the same value when you upgrade again.

Automatic downgrades
....................

From version 2.11 onwards, downgrades can be done by using the
``gnt-cluster upgrade`` command::

    $ gnt-cluster upgrade --to 2.10

Manual downgrades
.................

The procedure is similar to upgrading, but note that you have to
revert the configuration **before** installing the old version.

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

   You may want to copy all the messages about features that have been
   removed during the downgrade, in case you want to restore them when
   upgrading again.

#. Install the old Ganeti version on all nodes

   NB: in Ganeti 2.8, the ``cmdlib.py`` file was split into a series of files
   contained in the ``cmdlib`` directory. If Ganeti is installed from sources
   and not from a package, while downgrading Ganeti to a pre-2.8
   version it is important to remember to remove the ``cmdlib`` directory
   from the directory containing the Ganeti python files (which usually is
   ``${PREFIX}/lib/python${VERSION}/dist-packages/ganeti``).
   A simpler upgrade/downgrade procedure will be made available in future
   versions of Ganeti.

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
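
One possible way to keep the messages about removed features, as suggested in
the ``cfgupgrade`` step above, is to save the command output to a file while
still showing it on the terminal. The ``run_and_log`` helper and the log file
path below are hypothetical and only meant as a sketch.

```shell
#!/bin/sh
# Hypothetical helper: run a command, show its output and save a copy.
# Note that the exit status reported is that of tee, not of the command.
run_and_log() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Intended use on the master node (shown as a comment, not executed here):
#   run_and_log /root/cfgupgrade-downgrade.log \
#       /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade
```

The saved file can then be consulted when upgrading again, to re-apply any
settings that were dropped during the downgrade.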

    
Specific tasks for 2.11 to 2.10 downgrade
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

After running ``cfgupgrade``, the ``client.pem`` and
``ssconf_master_candidates_certs`` files need to be removed
from Ganeti's data directory on all nodes. While this step is
not necessary for 2.10 to run cleanly, leaving them will cause
problems when upgrading again after the downgrade.
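
A minimal sketch of this cleanup, to be run on each node, could look as
follows. The function name ``remove_2_11_cert_files`` is hypothetical, and the
default data directory :file:`/var/lib/ganeti` is an assumption based on the
paths used elsewhere in this document; pass a different directory if Ganeti
was installed with another prefix.

```shell
#!/bin/sh
# Remove the two 2.11-specific files from a Ganeti data directory.
# Assumption: the data directory is /var/lib/ganeti unless given as $1.
remove_2_11_cert_files() {
    data_dir=${1:-/var/lib/ganeti}
    rm -f "$data_dir/client.pem" \
          "$data_dir/ssconf_master_candidates_certs"
}

# Example call on a node (commented out so nothing is deleted by accident):
#   remove_2_11_cert_files /var/lib/ganeti
```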

    
2.0 releases
------------

2.0.3 to 2.0.4
~~~~~~~~~~~~~~

No changes needed except restarting the daemon; but rollback to 2.0.3 might
require configuration editing.

If you're using Xen-HVM instances, please double-check the network
configuration (``nic_type`` parameter) as the defaults might have changed:
2.0.4 adds any missing configuration items and depending on the version of the
software the cluster has been installed with, some new keys might have been
added.

2.0.1 to 2.0.2/2.0.3
~~~~~~~~~~~~~~~~~~~~

Between 2.0.1 and 2.0.2 there have been some changes in the handling of block
devices, which can cause some issues. 2.0.3 was then released, which adds two
new options/commands to fix this issue.

If you use DRBD-type instances and see problems in instance start or
activate-disks with messages from DRBD about "lower device too small" or
similar, it is recommended to:

#. Run ``gnt-instance activate-disks --ignore-size $instance`` for each
   of the affected instances
#. Then run ``gnt-cluster repair-disk-sizes`` which will check that
   instances have the correct disk sizes
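
The two steps above can be sketched as a loop over the affected instances.
The ``fix_drbd_sizes`` and ``run`` helpers and the ``DRY_RUN`` variable are
hypothetical; with the default ``DRY_RUN=1`` the commands are only printed.
The instance names passed in are whatever instances show the problem.

```shell
#!/bin/sh
# Sketch only: print the commands by default instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Activate disks with --ignore-size for each given instance, then let
# Ganeti check and repair the recorded disk sizes once at the end.
fix_drbd_sizes() {
    for instance in "$@"; do
        run gnt-instance activate-disks --ignore-size "$instance" || return 1
    done
    run gnt-cluster repair-disk-sizes
}
```

For example, ``fix_drbd_sizes instance1 instance2`` prints two
``activate-disks`` commands followed by a single ``repair-disk-sizes`` command.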

    
1.2 to 2.0
----------

Prerequisites:

- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no
  ``remote_raid1`` disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don't have the needed
  upgrade tool)

In the steps below, replace :file:`/var/lib` with ``$libdir`` if Ganeti was not
installed with this prefix (e.g. :file:`/usr/local/var`). The same applies to
:file:`/usr/lib`.

Execution (all steps are required in the order given):

    
#. Make a backup of the current configuration, for safety::

    $ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup

#. Stop all instances::

    $ gnt-instance stop --all

#. Make sure no DRBD devices are in use; the following command should show no
   active minors::

    $ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf

#. Stop the node daemons and rapi daemon on all nodes (note: you should be
   logged in via the master node name, not the cluster name, as the command
   below will remove the cluster IP from the master node)::

    $ gnt-cluster command /etc/init.d/ganeti stop

#. Install the new software on all nodes, either from packaging (if available)
   or from sources; the master daemon will not start but will give error
   messages about a wrong configuration file, which is normal
#. Upgrade the configuration file::

    $ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade12 -v

#. Make sure ``ganeti-noded`` is running on all nodes (and start it if
   not)
#. Start the master daemon::

    $ ganeti-masterd

#. Check that a simple node-list works::

    $ gnt-node list

#. Redistribute updated configuration to all nodes::

    $ gnt-cluster redist-conf
    $ gnt-cluster copyfile /var/lib/ganeti/known_hosts

#. Optional: if needed, install RAPI-specific certificates under
   :file:`/var/lib/ganeti/rapi.pem` and run::

    $ gnt-cluster copyfile /var/lib/ganeti/rapi.pem

#. Run a cluster verify; this should show no problems::

    $ gnt-cluster verify

#. Remove some obsolete files::

    $ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
    $ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor

#. Update the xen pvm (if this was a pvm cluster) setting for 1.2
   compatibility::

    $ gnt-cluster modify -H xen-pvm:root_path=/dev/sda

#. Depending on your setup, you might also want to reset the initrd parameter::

    $ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU

#. Reset the instance autobalance setting to default::

    $ for i in $(gnt-instance list -o name --no-headers); do \
       gnt-instance modify -B auto_balance=default $i; \
      done

#. Optional: start the RAPI daemon::

    $ ganeti-rapi

#. Restart instances::

    $ gnt-instance start --force-multiple --all

At this point, ``gnt-cluster verify`` should show no errors and the migration
is complete.
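
The DRBD check in the procedure above relies on reading the command output by
eye. As a sketch, the same filter can be wrapped in a function that reports
the result through its exit status. ``drbd_idle`` is a hypothetical helper
that inspects a single status file on the local node (by default
:file:`/proc/drbd`), whereas the command in the list queries all nodes through
``gnt-cluster command``.

```shell
#!/bin/sh
# Succeeds when the given DRBD status file (default: /proc/drbd) lists no
# device whose connection state is anything other than "Unconfigured".
# A missing file is treated like "no active minors".
drbd_idle() {
    status_file=${1:-/proc/drbd}
    ! grep cs: "$status_file" 2>/dev/null | grep -v cs:Unconf >/dev/null
}

# Example use on one node:
#   drbd_idle || echo "DRBD minors still active, do not continue"
```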

    
1.2 releases
------------

1.2.4 to any other higher 1.2 version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

No changes needed. Rollback will usually require a manual edit of the
configuration file.

1.2.3 to 1.2.4
~~~~~~~~~~~~~~

No changes needed. Note that going back from 1.2.4 to 1.2.3 will require a
manual edit of the configuration file (since we added some new HVM-related
attributes).

1.2.2 to 1.2.3
~~~~~~~~~~~~~~

No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format
change for the DRBD metadata, so in theory this might be **risky**. It is
advised to have (good) backups before doing the upgrade.

1.2.1 to 1.2.2
~~~~~~~~~~~~~~

No changes needed.

1.2.0 to 1.2.1
~~~~~~~~~~~~~~

No changes needed. Only some bugfixes and new additions that don't affect
existing clusters.

1.2.0 beta 3 to 1.2.0
~~~~~~~~~~~~~~~~~~~~~

No changes needed.

1.2.0 beta 2 to beta 3
~~~~~~~~~~~~~~~~~~~~~~

No changes needed. A new version of the debian-etch-instance OS (0.3) has been
released, but upgrading it is not required.

1.2.0 beta 1 to beta 2
~~~~~~~~~~~~~~~~~~~~~~

Beta 2 switched the config file format to JSON. Steps to upgrade:

#. Stop the daemons (``/etc/init.d/ganeti stop``) on all nodes
#. Disable the cron job (default is :file:`/etc/cron.d/ganeti`)
#. Install the new version
#. Make a backup copy of the config file
#. Upgrade the config file using the following command::

    $ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data

#. Start the daemons and run ``gnt-cluster info``, ``gnt-node list`` and
   ``gnt-instance list`` to check if the upgrade process finished successfully

The OS definition also needs to be upgraded. There is a new version of the
debian-etch-instance OS (0.2) that goes along with beta 2.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: