Upgrade notes
=============

.. highlight:: shell-example

This document details the steps needed to upgrade a cluster to newer versions
of Ganeti.

As a general rule the node daemons need to be restarted after each software
upgrade; if using the provided example init.d script, this means running the
following command on all nodes::

    $ /etc/init.d/ganeti restart

    
2.1 and above
-------------

Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1)
should not need manual intervention. As a safety measure, upgrades between
minor releases (e.g. 2.1.3 to 2.2.0) require the ``cfgupgrade`` command for
changing the configuration version. The steps below describe how to upgrade
between minor releases.

To run commands on all nodes, the `distributed shell (dsh)
<http://www.netfort.gr.jp/~dancer/software/dsh.html.en>`_ can be used, e.g.
``dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.
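
If ``dsh`` is not available, a plain ``ssh`` loop over the same node list
is one possible alternative. The following is a minimal sketch and not part
of Ganeti: the function name ``run_on_all_nodes`` and the ``RUNNER``
override are hypothetical, and it assumes passwordless SSH access from the
current host to every node listed in the file.

.. code-block:: bash

    #!/bin/bash
    # Sketch: run one command on every node listed in a node file (one
    # node name per line, as in /var/lib/ganeti/ssconf_online_nodes).
    # Assumption: RUNNER is a hypothetical override hook; it defaults to
    # ssh and can be pointed at a stub for a dry run.

    run_on_all_nodes() {
        local node_file=$1
        shift
        local runner=${RUNNER:-ssh}
        local node failed=0
        while IFS= read -r node; do
            # Skip empty lines in the node file
            [ -n "$node" ] || continue
            # </dev/null keeps ssh from consuming the rest of the node list
            if ! "$runner" "$node" "$@" </dev/null; then
                echo "command failed on $node" >&2
                failed=1
            fi
        done < "$node_file"
        # Non-zero if the command failed on at least one node
        return "$failed"
    }

After sourcing the function, a restart on all nodes could look like this::

    $ run_on_all_nodes /var/lib/ganeti/ssconf_online_nodes /etc/init.d/ganeti restart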

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Install the new Ganeti version on all nodes
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose

   (``cfgupgrade`` supports a number of parameters; run it with
   ``--help`` for more information)

#. Upgrade the directory permissions on all nodes::

    $ /usr/lib/ganeti/ensure-dirs --full-run

#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. If you use file storage, check that ``/etc/ganeti/file-storage-paths``
   is correct on all nodes. For security reasons it's not copied
   automatically, but it can be copied manually via::

    $ gnt-cluster copyfile /etc/ganeti/file-storage-paths

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
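
The first step above, checking that no jobs are running, can be scripted
instead of read by eye. The following is a rough sketch under stated
assumptions: the function name ``count_unfinished_jobs`` is hypothetical,
and the set of statuses treated as unfinished (``queued``, ``waiting``,
``running``, ``canceling``) is an assumption that should be checked against
the installed Ganeti version.

.. code-block:: bash

    #!/bin/bash
    # Sketch: count jobs that have not finished yet.
    # Reads one job status per line on stdin and prints the count.
    # Assumption: "queued", "waiting", "running" and "canceling" are the
    # statuses of unfinished jobs; everything else counts as finished.

    count_unfinished_jobs() {
        local status count=0
        while IFS= read -r status; do
            # Drop any padding whitespace around the status column
            status=${status//[[:space:]]/}
            case "$status" in
                queued|waiting|running|canceling) count=$((count + 1)) ;;
            esac
        done
        echo "$count"
    }

Assuming ``gnt-job list`` accepts ``--no-headers`` and ``-o status`` in the
same way ``gnt-instance list`` is used elsewhere in this document (see
``gnt-job list --help``), the function could be fed like this; a result of
``0`` means it is safe to continue::

    $ gnt-job list --no-headers -o status | count_unfinished_jobs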

    
Reverting an upgrade
~~~~~~~~~~~~~~~~~~~~

Going back between revisions (e.g. 2.1.1 to 2.1.0) requires no manual
intervention, just as with upgrades.

Starting from version 2.8, ``cfgupgrade`` supports the ``--downgrade``
option to bring the configuration back to the previous stable version.
This is useful if you upgrade Ganeti and after some time you run into
problems with the new version. You can downgrade the configuration
without losing the changes made since the upgrade. Any feature not
supported by the old version will be removed from the configuration,
but you get a warning about it. If a new feature was left at its
default value, you don't have to worry about it, as it will get the
same value when you upgrade again.

The procedure is similar to upgrading, but please note that you have to
revert the configuration **before** installing the old version.

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

   You may want to copy all the messages about features that have been
   removed during the downgrade, in case you want to restore them when
   upgrading again.

#. Install the old Ganeti version on all nodes
#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
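
To keep the messages about removed features mentioned in the
``cfgupgrade`` step above, the output can be saved to a file while it is
still shown on the terminal. This is only a sketch: the function name
``run_and_log`` and the log file path in the usage example are
hypothetical, and it relies on bash's ``pipefail`` option.

.. code-block:: bash

    #!/bin/bash
    # Sketch: run a command, show its output, and append a copy to a log
    # file. Both stdout and stderr are captured, since warnings may be
    # written to either stream.

    run_and_log() {
        local log_file=$1
        shift
        # pipefail makes the pipeline report the command's exit status
        # instead of tee's; the subshell keeps the option from leaking
        # into the calling shell.
        (
            set -o pipefail
            "$@" 2>&1 | tee -a "$log_file"
        )
    }

A possible invocation, with a hypothetical log location::

    $ run_and_log /root/cfgupgrade-downgrade.log /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade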

    
2.0 releases
------------

2.0.3 to 2.0.4
~~~~~~~~~~~~~~

No changes needed except restarting the daemon; but rollback to 2.0.3 might
require configuration editing.

If you're using Xen-HVM instances, please double-check the network
configuration (``nic_type`` parameter) as the defaults might have changed:
2.0.4 adds any missing configuration items and depending on the version of the
software the cluster has been installed with, some new keys might have been
added.

2.0.1 to 2.0.2/2.0.3
~~~~~~~~~~~~~~~~~~~~

Between 2.0.1 and 2.0.2 there have been some changes in the handling of block
devices, which can cause some issues. 2.0.3 was then released, which adds two
new options/commands to fix this issue.

If you use DRBD-type instances and see problems in instance start or
activate-disks with messages from DRBD about "lower device too small" or
similar, it is recommended to:

#. Run ``gnt-instance activate-disks --ignore-size $instance`` for each
   of the affected instances
#. Then run ``gnt-cluster repair-disk-sizes`` which will check that
   instances have the correct disk sizes
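
When several instances are affected, the two steps above can be wrapped in
a small function. This is a sketch only: the function name
``fix_drbd_sizes`` and the ``GNT_INSTANCE`` and ``GNT_CLUSTER`` override
variables are hypothetical and exist so the loop can be tried with stubs
before it touches a real cluster.

.. code-block:: bash

    #!/bin/bash
    # Sketch: run activate-disks with --ignore-size for each affected
    # instance, then run repair-disk-sizes once for the cluster.
    # Assumption: GNT_INSTANCE and GNT_CLUSTER are hypothetical override
    # hooks that default to the real commands.

    fix_drbd_sizes() {
        local gnt_instance=${GNT_INSTANCE:-gnt-instance}
        local gnt_cluster=${GNT_CLUSTER:-gnt-cluster}
        local instance
        if [ "$#" -eq 0 ]; then
            echo "usage: fix_drbd_sizes INSTANCE..." >&2
            return 2
        fi
        for instance in "$@"; do
            # Stop at the first failure so it can be inspected by hand
            "$gnt_instance" activate-disks --ignore-size "$instance" || return 1
        done
        "$gnt_cluster" repair-disk-sizes
    }

The instance names are passed as arguments (the names here are
placeholders)::

    $ fix_drbd_sizes instance1.example.com instance2.example.com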

    
1.2 to 2.0
----------

Prerequisites:

- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no
  ``remote_raid1`` disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don't have the needed
  upgrade tool)

In the steps below, replace :file:`/var/lib` with ``$libdir`` if Ganeti was not
installed with this prefix (e.g. :file:`/usr/local/var`). The same applies to
:file:`/usr/lib`.

Execution (all steps are required in the order given):

    
#. Make a backup of the current configuration, for safety::

    $ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup

#. Stop all instances::

    $ gnt-instance stop --all

#. Make sure no DRBD devices are in use; the following command should show
   no active minors::

    $ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf

#. Stop the node daemons and rapi daemon on all nodes (note: you should be
   logged in via the master node name, not the cluster name, as the command
   below will remove the cluster IP from the master node)::

    $ gnt-cluster command /etc/init.d/ganeti stop

#. Install the new software on all nodes, either from packaging (if available)
   or from sources; the master daemon will not start and will report errors
   about a wrong configuration file, which is expected
#. Upgrade the configuration file::

    $ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade12 -v

#. Make sure ``ganeti-noded`` is running on all nodes (and start it if
   not)
#. Start the master daemon::

    $ ganeti-masterd

#. Check that a simple node-list works::

    $ gnt-node list

#. Redistribute updated configuration to all nodes::

    $ gnt-cluster redist-conf
    $ gnt-cluster copyfile /var/lib/ganeti/known_hosts

#. Optional: if needed, install RAPI-specific certificates under
   :file:`/var/lib/ganeti/rapi.pem` and run::

    $ gnt-cluster copyfile /var/lib/ganeti/rapi.pem

#. Run a cluster verify; this should show no problems::

    $ gnt-cluster verify

#. Remove some obsolete files::

    $ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
    $ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor

#. Update the Xen PVM setting (if this was a PVM cluster) for 1.2
   compatibility::

    $ gnt-cluster modify -H xen-pvm:root_path=/dev/sda

#. Depending on your setup, you might also want to reset the initrd parameter::

    $ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU

#. Reset the instance autobalance setting to default::

    $ for i in $(gnt-instance list -o name --no-headers); do \
       gnt-instance modify -B auto_balance=default $i; \
      done

#. Optional: start the RAPI daemon::

    $ ganeti-rapi

#. Restart instances::

    $ gnt-instance start --force-multiple --all

At this point, ``gnt-cluster verify`` should show no errors and the migration
is complete.
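
The DRBD check in the steps above relies on reading the command output by
eye. For a single node, the same filter can be turned into a check with an
exit status, as sketched below. The function name ``active_drbd_minors``
is hypothetical, and the sketch assumes that every configured minor shows
up in :file:`/proc/drbd` as a line containing ``cs:``.

.. code-block:: bash

    #!/bin/bash
    # Sketch: list DRBD minors that are still configured.
    # Reads /proc/drbd-style text on stdin and prints every line whose
    # connection state ("cs:") is not Unconfigured, mirroring
    # "grep cs: | grep -v cs:Unconf". Returns 0 only if nothing is active.

    active_drbd_minors() {
        local active
        # "|| true": grep exits non-zero when nothing matches, which is
        # the good case here
        active=$(grep 'cs:' | grep -v 'cs:Unconf') || true
        if [ -n "$active" ]; then
            echo "$active"
            return 1
        fi
        return 0
    }

On each node this could be used as follows; any printed line is a minor
that is still in use::

    $ active_drbd_minors < /proc/drbd && echo "no active minors"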

    
1.2 releases
------------

1.2.4 to any other higher 1.2 version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

No changes needed. Rollback will usually require manual editing of the
configuration file.

1.2.3 to 1.2.4
~~~~~~~~~~~~~~

No changes needed. Note that going back from 1.2.4 to 1.2.3 will require manual
editing of the configuration file (since some new HVM-related attributes were
added).

1.2.2 to 1.2.3
~~~~~~~~~~~~~~

No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format
change for the DRBD metadata, so in theory this might be **risky**. It is
advised to have (good) backups before doing the upgrade.

1.2.1 to 1.2.2
~~~~~~~~~~~~~~

No changes needed.

1.2.0 to 1.2.1
~~~~~~~~~~~~~~

No changes needed. Only some bugfixes and new additions that don't affect
existing clusters.

1.2.0 beta 3 to 1.2.0
~~~~~~~~~~~~~~~~~~~~~

No changes needed.

1.2.0 beta 2 to beta 3
~~~~~~~~~~~~~~~~~~~~~~

No changes needed. A new version of the debian-etch-instance OS (0.3) has been
released, but upgrading it is not required.

1.2.0 beta 1 to beta 2
~~~~~~~~~~~~~~~~~~~~~~

Beta 2 switched the config file format to JSON. Steps to upgrade:

#. Stop the daemons (``/etc/init.d/ganeti stop``) on all nodes
#. Disable the cron job (default is :file:`/etc/cron.d/ganeti`)
#. Install the new version
#. Make a backup copy of the config file
#. Upgrade the config file using the following command::

    $ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data

#. Start the daemons and run ``gnt-cluster info``, ``gnt-node list`` and
   ``gnt-instance list`` to check if the upgrade process finished successfully

The OS definition also needs to be upgraded. There is a new version of the
debian-etch-instance OS (0.2) that goes along with beta 2.
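
The backup step in the beta 2 procedure above does not prescribe a method.
One way to do it is sketched below: copy the file next to the original
with a timestamp and verify the copy before running the upgrade tool. The
function name ``backup_file`` and the ``.backup-<timestamp>`` suffix are
hypothetical choices, not Ganeti conventions.

.. code-block:: bash

    #!/bin/bash
    # Sketch: copy a file to a timestamped backup next to it and verify
    # the copy byte for byte. Prints the backup path on success.
    # Assumption: the ".backup-<timestamp>" suffix is an arbitrary
    # naming choice.

    backup_file() {
        local src=$1
        local dest
        dest="$src.backup-$(date +%Y%m%d-%H%M%S)"
        # -p preserves mode, ownership and timestamps where possible
        cp -p "$src" "$dest" || return 1
        # Refuse to report success if the copy differs from the original
        cmp -s "$src" "$dest" || return 1
        echo "$dest"
    }

For the config file used in the upgrade command above::

    $ backup_file /var/lib/ganeti/config.data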

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: