Upgrade notes
=============

.. highlight:: shell-example

This document details the steps needed to upgrade a cluster to newer versions
of Ganeti.

    
As a general rule the node daemons need to be restarted after each software
upgrade; if using the provided example init.d script, this means running the
following command on all nodes::

    $ /etc/init.d/ganeti restart

    
2.1 and above
-------------

Starting with Ganeti 2.0, upgrades between revisions (e.g. 2.1.0 to 2.1.1)
should not need manual intervention. As a safety measure, upgrades between
minor releases (e.g. 2.1.3 to 2.2.0) require the ``cfgupgrade`` command to
change the configuration version. The steps below are necessary to upgrade
between minor releases.

    
To run commands on all nodes, the `distributed shell (dsh)
<http://www.netfort.gr.jp/~dancer/software/dsh.html.en>`_ can be used, e.g.
``dsh -M -F 8 -f /var/lib/ganeti/ssconf_online_nodes gnt-cluster --version``.
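
As a rough sketch of how the ``dsh`` invocation above could be reused for the
per-node steps in this document, the shell function below wraps it. The name
``on_all_nodes`` and the ``NODE_LIST`` variable are made up for this example
and are not part of Ganeti; the ``dsh`` options are simply the ones from the
example above.

.. code-block:: sh

    # Hypothetical helper, not shipped with Ganeti.
    # NODE_LIST defaults to the ssconf file used in the example above.
    NODE_LIST="${NODE_LIST:-/var/lib/ganeti/ssconf_online_nodes}"

    on_all_nodes() {
        # Run the given command line on every node listed in $NODE_LIST,
        # using the same dsh options as in the example above.
        dsh -M -F 8 -f "$NODE_LIST" "$@"
    }

With this in place, ``on_all_nodes /etc/init.d/ganeti restart`` would run the
restart command on every node listed in the file.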

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

    
#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

    
#. Install new Ganeti version on all nodes
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose

   (``cfgupgrade`` supports a number of parameters, run it with
   ``--help`` for more information)

    
#. Upgrade the directory permissions on all nodes::

    $ /usr/lib/ganeti/ensure-dirs --full-run

#. Create the (missing) required users and make users part of the required
   groups on all nodes::

    $ /usr/lib/ganeti/tools/users-setup

    
#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

    
#. If you use file storage, check that ``/etc/ganeti/file-storage-paths``
   is correct on all nodes. For security reasons it's not copied
   automatically, but it can be copied manually via::

    $ gnt-cluster copyfile /etc/ganeti/file-storage-paths

    
#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
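
The first step of the list above relies on reading the output of
``gnt-job list`` by eye. A minimal sketch of automating that check follows.
It assumes that ``gnt-job list`` accepts ``--no-headers`` and ``-o id,status``
and that finished jobs report one of the statuses ``success``, ``error`` or
``canceled``; please verify both assumptions against the ``gnt-job`` manual
page of the installed version. The function name ``no_active_jobs`` is made
up for this example.

.. code-block:: sh

    no_active_jobs() {
        # Print every job whose status is not a final one (assumed to be
        # success, error or canceled) and return non-zero if any exists.
        gnt-job list --no-headers -o id,status |
        awk '$2 != "success" && $2 != "error" && $2 != "canceled" {
                 print "job " $1 " is " $2
                 active = 1
             }
             END { exit active }'
    }

It could be used as a guard before pausing the watcher, for instance
``no_active_jobs || exit 1`` at the top of a site-specific upgrade script.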

    
Reverting an upgrade
~~~~~~~~~~~~~~~~~~~~

Going back between revisions (e.g. 2.1.1 to 2.1.0) requires no manual
intervention, just as for upgrades.

    
Starting from version 2.8, ``cfgupgrade`` supports the ``--downgrade``
option to bring the configuration back to the previous stable version.
This is useful if you upgrade Ganeti and after some time you run into
problems with the new version. You can downgrade the configuration
without losing the changes made since the upgrade. Any feature not
supported by the old version will be removed from the configuration, of
course, but you get a warning about it. If there is any new feature and
you haven't changed it from its default value, you don't have to worry
about it, as it will get the same value when you upgrade again.

    
The procedure is similar to upgrading, but please note that you have to
revert the configuration **before** installing the old version.

    
#. Ensure no jobs are running (master node only)::

    $ gnt-job list

#. Pause the watcher for an hour (master node only)::

    $ gnt-cluster watcher pause 1h

    
#. Stop all daemons on all nodes::

    $ /etc/init.d/ganeti stop

#. Back up the old configuration (master node only)::

    $ tar czf /var/lib/ganeti-$(date +\%FT\%T).tar.gz -C /var/lib ganeti

    
#. Run cfgupgrade on the master node::

    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade --verbose --downgrade

   You may want to copy all the messages about features that have been
   removed during the downgrade, in case you want to restore them when
   upgrading again.

    
#. Install the old Ganeti version on all nodes
#. Restart daemons on all nodes::

    $ /etc/init.d/ganeti restart

#. Re-distribute configuration (master node only)::

    $ gnt-cluster redist-conf

#. Restart daemons again on all nodes::

    $ /etc/init.d/ganeti restart

    
#. Enable the watcher again (master node only)::

    $ gnt-cluster watcher continue

#. Verify cluster (master node only)::

    $ gnt-cluster verify
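
The ``cfgupgrade`` step above suggests keeping the messages about removed
features. One possible way to do that, sketched here under the assumption
that a plain log file is good enough, is to pass the command through ``tee``.
The function name ``run_logged`` and the log file path used in the usage note
are made up for this example.

.. code-block:: sh

    run_logged() {
        # Usage: run_logged LOGFILE COMMAND [ARGS...]
        # Run COMMAND, show its output on the terminal and append both
        # stdout and stderr to LOGFILE.
        _logfile=$1
        shift
        "$@" 2>&1 | tee -a "$_logfile"
    }

For the downgrade this might look like ``run_logged
/root/cfgupgrade-downgrade.log /usr/lib/ganeti/tools/cfgupgrade --verbose
--downgrade``. Note that the exit status seen by the caller is that of
``tee``, not of the wrapped command, so the logged output still has to be
checked for errors.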

    
2.0 releases
------------

2.0.3 to 2.0.4
~~~~~~~~~~~~~~

No changes needed except restarting the daemon; but rollback to 2.0.3 might
require configuration editing.

If you're using Xen-HVM instances, please double-check the network
configuration (``nic_type`` parameter) as the defaults might have changed:
2.0.4 adds any missing configuration items and depending on the version of the
software the cluster has been installed with, some new keys might have been
added.

    
2.0.1 to 2.0.2/2.0.3
~~~~~~~~~~~~~~~~~~~~

Between 2.0.1 and 2.0.2 there have been some changes in the handling of block
devices, which can cause some issues. 2.0.3 was then released which adds two
new options/commands to fix this issue.

    
If you use DRBD-type instances and see problems in instance start or
activate-disks with messages from DRBD about "lower device too small" or
similar, it is recommended to:

#. Run ``gnt-instance activate-disks --ignore-size $instance`` for each
   of the affected instances
#. Then run ``gnt-cluster repair-disk-sizes`` which will check that
   instances have the correct disk sizes
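
When more than a handful of instances are affected, the two steps above can
be wrapped in a small shell function. The sketch below only uses the two
commands named above; the function name ``repair_drbd_sizes`` is made up for
this example, and the list of affected instances still has to be determined
by hand.

.. code-block:: sh

    repair_drbd_sizes() {
        # Usage: repair_drbd_sizes INSTANCE [INSTANCE...]
        # Re-activate the disks of each named instance while ignoring the
        # recorded size, stopping at the first failure, then let Ganeti
        # check the recorded disk sizes.
        for instance in "$@"; do
            gnt-instance activate-disks --ignore-size "$instance" || return 1
        done
        gnt-cluster repair-disk-sizes
    }

A call such as ``repair_drbd_sizes instance1.example.com
instance2.example.com`` (hypothetical instance names) would then run both
steps in order.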

    
1.2 to 2.0
----------

Prerequisites:

- Ganeti 1.2.7 is currently installed
- All instances have been migrated from DRBD 0.7 to DRBD 8.x (i.e. no
  ``remote_raid1`` disk template)
- Upgrade to Ganeti 2.0.0~rc2 or later (~rc1 and earlier don't have the needed
  upgrade tool)

    
In the steps below, replace :file:`/var/lib` with ``$libdir`` if Ganeti was not
installed with this prefix (e.g. :file:`/usr/local/var`). The same applies to
:file:`/usr/lib`.

    
Execution (all steps are required in the order given):

#. Make a backup of the current configuration, for safety::

    $ cp -a /var/lib/ganeti /var/lib/ganeti-1.2.backup

#. Stop all instances::

    $ gnt-instance stop --all

    
#. Make sure no DRBD devices are in use; the following command should show no
   active minors::

    $ gnt-cluster command grep cs: /proc/drbd | grep -v cs:Unconf

#. Stop the node daemons and rapi daemon on all nodes (note: you should be
   logged in via the master node name, not the cluster name, as the command
   below will remove the cluster ip from the master node)::

    $ gnt-cluster command /etc/init.d/ganeti stop

    
#. Install the new software on all nodes, either from packaging (if available)
   or from sources; the master daemon will not start but give error messages
   about a wrong configuration file, which is normal
#. Upgrade the configuration file::

    $ /usr/lib/ganeti/tools/cfgupgrade12 -v --dry-run
    $ /usr/lib/ganeti/tools/cfgupgrade12 -v

#. Make sure ``ganeti-noded`` is running on all nodes (and start it if
   not)
#. Start the master daemon::

    $ ganeti-masterd

#. Check that a simple node-list works::

    $ gnt-node list

    
#. Redistribute updated configuration to all nodes::

    $ gnt-cluster redist-conf
    $ gnt-cluster copyfile /var/lib/ganeti/known_hosts

#. Optional: if needed, install RAPI-specific certificates under
   :file:`/var/lib/ganeti/rapi.pem` and run::

    $ gnt-cluster copyfile /var/lib/ganeti/rapi.pem

    
#. Run a cluster verify; this should show no problems::

    $ gnt-cluster verify

#. Remove some obsolete files::

    $ gnt-cluster command rm /var/lib/ganeti/ssconf_node_pass
    $ gnt-cluster command rm /var/lib/ganeti/ssconf_hypervisor

    
#. Update the xen pvm (if this was a pvm cluster) setting for 1.2
   compatibility::

    $ gnt-cluster modify -H xen-pvm:root_path=/dev/sda

#. Depending on your setup, you might also want to reset the initrd parameter::

    $ gnt-cluster modify -H xen-pvm:initrd_path=/boot/initrd-2.6-xenU

#. Reset the instance autobalance setting to default::

    $ for i in $(gnt-instance list -o name --no-headers); do \
       gnt-instance modify -B auto_balance=default $i; \
      done

    
#. Optional: start the RAPI daemon::

    $ ganeti-rapi

#. Restart instances::

    $ gnt-instance start --force-multiple --all

At this point, ``gnt-cluster verify`` should show no errors and the migration
is complete.
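
The DRBD check in the procedure above expects a human to read the command
output. A minimal sketch of turning it into a scriptable test follows. It
uses the same ``gnt-cluster command`` pipeline, with one extra ``grep cs:``
on the collected output to drop any framing lines that ``gnt-cluster
command`` may print around each node's result (an assumption about its output
format). The function name ``drbd_in_use`` is made up for this example.

.. code-block:: sh

    drbd_in_use() {
        # Print every DRBD minor that is not in the "Unconfigured" state
        # and succeed if at least one such minor was found.
        gnt-cluster command grep cs: /proc/drbd | grep cs: | grep -v cs:Unconf
    }

A wrapper script could then refuse to continue with something like
``if drbd_in_use; then echo "DRBD still active" >&2; exit 1; fi``. Because
the function only looks at the output, a failing ``gnt-cluster command`` call
would look the same as "no active minors", so the manual check remains the
safer option.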

    
1.2 releases
------------

1.2.4 to any other higher 1.2 version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

No changes needed. Rollback will usually require manual editing of the
configuration file.

1.2.3 to 1.2.4
~~~~~~~~~~~~~~

No changes needed. Note that going back from 1.2.4 to 1.2.3 will require manual
editing of the configuration file (since we added some new HVM-related
attributes).

    
1.2.2 to 1.2.3
~~~~~~~~~~~~~~

No changes needed. Note that the drbd7-to-8 upgrade tool does a disk format
change for the DRBD metadata, so in theory this might be **risky**. It is
advised to have (good) backups before doing the upgrade.

1.2.1 to 1.2.2
~~~~~~~~~~~~~~

No changes needed.

    
1.2.0 to 1.2.1
~~~~~~~~~~~~~~

No changes needed. Only some bugfixes and new additions that don't affect
existing clusters.

1.2.0 beta 3 to 1.2.0
~~~~~~~~~~~~~~~~~~~~~

No changes needed.

1.2.0 beta 2 to beta 3
~~~~~~~~~~~~~~~~~~~~~~

No changes needed. A new version of the debian-etch-instance OS (0.3) has been
released, but upgrading it is not required.

    
1.2.0 beta 1 to beta 2
~~~~~~~~~~~~~~~~~~~~~~

Beta 2 switched the config file format to JSON. Steps to upgrade:

#. Stop the daemons (``/etc/init.d/ganeti stop``) on all nodes
#. Disable the cron job (default is :file:`/etc/cron.d/ganeti`)
#. Install the new version
#. Make a backup copy of the config file
#. Upgrade the config file using the following command::

    $ /usr/share/ganeti/cfgupgrade --verbose /var/lib/ganeti/config.data

#. Start the daemons and run ``gnt-cluster info``, ``gnt-node list`` and
   ``gnt-instance list`` to check if the upgrade process finished successfully
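
The backup step in the list above does not prescribe how the copy is made.
One possible sketch is shown below; the function name ``backup_file`` and the
``.bak-<timestamp>`` naming scheme are made up for this example, and the
config file path in the usage note is the one passed to ``cfgupgrade`` above.

.. code-block:: sh

    backup_file() {
        # Usage: backup_file PATH
        # Copy PATH to PATH.bak-<timestamp>, preserving mode and
        # timestamps, and print the name of the copy on success.
        copy="$1.bak-$(date +%Y%m%d%H%M%S)"
        cp -p "$1" "$copy" && echo "$copy"
    }

For the upgrade described here this would be called as ``backup_file
/var/lib/ganeti/config.data`` before running ``cfgupgrade``.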

    
The OS definition also needs to be upgraded. There is a new version of the
debian-etch-instance OS (0.2) that goes along with beta 2.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: