========================================
Automatized Upgrade Procedure for Ganeti
========================================

.. contents:: :depth: 4

This is a design document detailing the proposed changes to the
upgrade process, in order to allow it to be more automatic.


Current state and shortcomings
==============================

Ganeti requires the same version of Ganeti to run on all nodes of a
cluster, and this requirement is unlikely to go away in the
foreseeable future. Also, the configuration may change between minor
versions (and in the past has proven to do so). Together, these facts
require a fairly involved manual upgrade process: draining the queue,
stopping ganeti, changing the binaries, upgrading the configuration,
starting ganeti, distributing the configuration, and undraining the
queue.


Proposed changes
================

While we will not remove the requirement of the same Ganeti
version running on all nodes, the transition from one version
to the other will be made more automatic. It will be possible
to install new binaries ahead of time, and the actual switch
between versions will be a single command.

Path changes to allow multiple versions installed
-------------------------------------------------

Currently, Ganeti installs to ``${PREFIX}/bin``, ``${PREFIX}/sbin``,
and so on, as well as to ``${pythondir}/ganeti``.

These paths will be changed in the following way.

- The python package will be installed to ``${pythondir}/ganeti-${VERSION}``.
  Here ``${VERSION}`` is the fully qualified version number, consisting of
  major, minor, revision, and suffix. All python executables will be changed
  to import the correct version of the ganeti package.

- All other files will be installed to the corresponding path under
  ``${PREFIX}/opt/ganeti-${VERSION}`` instead of under ``${PREFIX}``
  directly. Symbolic links to these files will be added under ``${PREFIX}/bin``,
  ``${PREFIX}/sbin``, and so on.

Since only each version itself has authoritative knowledge of which
files belong to it, each version provides two executables, ``install``
and ``uninstall``, that add and remove the symbolic links,
respectively. Both executables will be idempotent and only touch
symbolic links that are outside the directory for their version of
Ganeti and point into this directory. In particular, an ``uninstall``
of one version will not interfere with an ``install`` of a different
version.
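
The link handling described above can be sketched as follows. This is
only an illustration under stated assumptions: the function signatures,
the explicit list of relative paths, and the choice to fail when a link
location is already occupied by something else are hypothetical and not
part of this design; the real ``install`` and ``uninstall`` executables
may work differently.

```python
import os


def install(version_dir, prefix, relpaths):
    """Create prefix/<rel> -> version_dir/<rel> links (hypothetical helper)."""
    for rel in relpaths:
        target = os.path.join(version_dir, rel)
        link = os.path.join(prefix, rel)
        # Idempotence: a link that already points at our file is left alone.
        if os.path.islink(link) and os.readlink(link) == target:
            continue
        os.makedirs(os.path.dirname(link), exist_ok=True)
        # Assumption: anything else at this location is a conflict, so
        # os.symlink is allowed to raise FileExistsError.
        os.symlink(target, link)


def uninstall(version_dir, prefix, relpaths):
    """Remove only links that point into version_dir (hypothetical helper)."""
    root = os.path.join(version_dir, "")  # version_dir with a trailing separator
    for rel in relpaths:
        link = os.path.join(prefix, rel)
        # Links owned by another version, regular files, and missing
        # paths are never touched.
        if os.path.islink(link) and os.readlink(link).startswith(root):
            os.unlink(link)
```

Because ``uninstall`` checks where each link points before removing it,
running it for one version after another version has already been
installed leaves the newer links in place.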

gnt-upgrade
-----------

The actual upgrade process will be done by a new binary,
``gnt-upgrade``. It will take precisely one argument, the version to
upgrade (or downgrade) to, given as full string with major, minor,
revision, and suffix. To be compatible with current configuration upgrade
and downgrade procedures, the new version must be of the same major
version and either an equal or higher minor version, or precisely the
previous minor version.
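
The version constraint can be expressed as a small predicate. The sketch
below assumes versions have already been parsed into ``(major, minor)``
tuples; the function name and this representation are hypothetical, and
parsing of the full version string is deliberately left out.

```python
def is_allowed_target(current, target):
    """Return True if switching from current to target is permitted.

    Both arguments are (major, minor) tuples (an assumption of this sketch).
    """
    cur_major, cur_minor = current
    new_major, new_minor = target
    if new_major != cur_major:
        return False
    # Equal or higher minor version, or exactly the previous minor version.
    return new_minor >= cur_minor - 1
```

With this rule, moving from minor version 10 to 12 is allowed (a
jump-upgrade), as is moving from 10 to 9, but moving from 10 to 8 is
rejected.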

When executed, ``gnt-upgrade`` will perform the following actions.

- It verifies that the version to change to is installed on all nodes
  of the cluster that are not marked as offline. If this is not the
  case it aborts with an error. This initial testing is an
  optimization to allow for early feedback.

- An intent-to-upgrade file is created that contains the current
  version of ganeti, the version to change to, and the process ID of
  the ``gnt-upgrade`` process. The latter is not used automatically,
  but allows manual detection if the upgrade process died
  unintentionally. The intent-to-upgrade file is persisted to disk
  before continuing.

- The Ganeti job queue is drained, and the executable waits until there
  are no more jobs in the queue. Once :doc:`design-optables` is
  implemented, for upgrades, and only for upgrades, all jobs are paused
  instead (in the sense that the currently running opcode continues,
  but the next opcode is not started), and the upgrade continues once
  all jobs are fully paused.

- All ganeti daemons on the master node are stopped.

- It is verified again that all nodes at this moment not marked as
  offline have the new version installed. If this is not the case,
  then all changes so far (stopping ganeti daemons and draining the
  queue) are undone and failure is reported. This second verification
  is necessary, as the set of online nodes might have changed during
  the draining period.

- All ganeti daemons on all remaining (non-offline) nodes are stopped.

- A backup of all Ganeti-related status information is created for
  manual rollbacks. While the normal way of rolling back after an
  upgrade should be calling ``gnt-upgrade`` from the newer version
  with the older version as argument, a full backup provides an
  additional safety net, especially for jump-upgrades (skipping
  intermediate minor versions).

- If the action is a downgrade to the previous minor version, the
  configuration is downgraded now, using ``cfgupgrade --downgrade``.

- The current version of ganeti is deactivated on all nodes, using the
  ``uninstall`` executable described earlier.

- The new version of ganeti is activated on all nodes, using the
  ``install`` executable described earlier.

- If the action is an upgrade to a higher minor version, the configuration
  is upgraded now, using ``cfgupgrade``.

- All daemons are started on all nodes.

- ``ensure-dirs --full-run`` is run on all nodes.

- ``gnt-cluster redist-conf`` is run on the master node.

- All daemons are restarted on all nodes.

- The Ganeti job queue is undrained.

- The intent-to-upgrade file is removed.

- ``gnt-cluster verify`` is run and the result reported.
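
One way to satisfy the requirement that the intent-to-upgrade file is
persisted to disk before the procedure continues is sketched below. The
JSON layout, the field names, the temporary-file suffix, and both helper
names are assumptions made for illustration; this design does not
prescribe a file format. The sketch also assumes a POSIX system, where a
directory can be opened and synced.

```python
import json
import os


def write_intent_file(path, current_version, target_version):
    """Durably record the intended version switch (hypothetical format)."""
    record = {
        "current": current_version,
        "target": target_version,
        "pid": os.getpid(),  # only meant for manual inspection
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(record, handle)
        handle.flush()
        os.fsync(handle.fileno())  # force the file contents to disk
    os.replace(tmp_path, path)  # atomic rename: never a half-written file
    # Sync the containing directory so the rename itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_intent_file(path):
    """Return the recorded intent, or None if no file exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
```

Writing to a temporary name and renaming it means that a reader sees
either no file or a complete one, which keeps a later resume decision
simple.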


Considerations on unintended reboots of the master node
=======================================================

During the upgrade procedure, the only ganeti process still running is
the one instance of ``gnt-upgrade``. This process is also responsible
for eventually removing the queue drain. Therefore, we have to provide
means to resume this process, if it dies unintentionally. The process
itself will handle SIGTERM gracefully by either undoing all changes
done so far, or by ignoring the signal altogether and continuing to
the end; the choice between these behaviors depends on whether the
change of the configuration has already started (in which case it goes
through to the end), or not (in which case the actions done so far are
rolled back).
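
The two SIGTERM behaviors can be illustrated with a handler that consults
a single flag. The class, attribute, and function names below are
hypothetical, and the sketch assumes that the main upgrade loop checks
``rollback_requested`` between steps and performs the actual rollback
itself.

```python
import signal


class UpgradeState:
    """Tracks how far the upgrade has progressed (hypothetical helper)."""

    def __init__(self):
        # Set by the upgrade procedure right before it changes the config.
        self.config_change_started = False
        # Checked by the upgrade procedure between steps.
        self.rollback_requested = False

    def handle_sigterm(self, signum, frame):
        if not self.config_change_started:
            # Still safe to undo: ask the main loop to roll back.
            self.rollback_requested = True
        # Otherwise the signal is ignored and the upgrade runs to the end.


def install_sigterm_handler(state):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, state.handle_sigterm)
```

Keeping the handler down to setting a flag avoids doing real work inside
a signal handler; the decision to roll back is then acted upon at a well
defined point in the procedure.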

To achieve this, ``gnt-upgrade`` will support a ``--resume``
option. It is recommended to have ``gnt-upgrade --resume`` as an
at-reboot task in the crontab. If started with this option,
``gnt-upgrade`` does not accept any arguments. It first verifies that
it is running on the master node, using the same requirement as for
starting the master daemon, i.e., confirmed by a majority of all
nodes. If it is not the master node, it will remove any possibly
existing intent-to-upgrade file and exit. If it is running on the
master node, it will check for the existence of an intent-to-upgrade
file. If no such file is found, it will simply exit. If found, it will
resume at the appropriate stage.

- If the configuration file still is at the initial version,
  ``gnt-upgrade`` is resumed at the step immediately following the
  writing of the intent-to-upgrade file. It should be noted that
  all steps before changing the configuration are idempotent, so
  redoing them does not do any harm.

- If the configuration is already at the new version, all daemons on
  all nodes are stopped (as they might have been started again due
  to a reboot) and then it is resumed at the step immediately
  following the configuration change. All actions following the
  configuration change can be repeated without bringing the cluster
  into a worse state.
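
The decision made by ``gnt-upgrade --resume`` can be summarized as a pure
function. Everything named here is hypothetical: the action constants,
the dictionary keys of the intent record, and the direct comparison of
the configuration version with the recorded versions are simplifications
for illustration. The final error case is not covered by this design and
is an assumption of the sketch.

```python
REMOVE_INTENT_AND_EXIT = "remove-intent-file-and-exit"
EXIT = "exit"
RESUME_AFTER_INTENT = "resume-after-writing-intent-file"
STOP_DAEMONS_AND_RESUME = "stop-daemons-then-resume-after-config-change"


def resume_action(is_master, intent, config_version):
    """Pick what a resumed run should do.

    intent is None if no intent-to-upgrade file exists, otherwise a dict
    with the keys "current" and "target" (hypothetical layout).
    """
    if not is_master:
        return REMOVE_INTENT_AND_EXIT
    if intent is None:
        return EXIT
    if config_version == intent["current"]:
        # Nothing irreversible happened yet; earlier steps are idempotent.
        return RESUME_AFTER_INTENT
    if config_version == intent["target"]:
        # Daemons may have come back after a reboot, so stop them first.
        return STOP_DAEMONS_AND_RESUME
    raise ValueError("configuration matches neither recorded version")
```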


Caveats
=======

Since ``gnt-upgrade`` drains the queue and undrains it later, any
information about a previous drain gets lost. This problem will
disappear once :doc:`design-optables` is implemented, as the
undrain will then be restricted to the filters added by ``gnt-upgrade``.


Requirement of opcode backwards compatibility
=============================================

Since for upgrades we only pause jobs and do not fully drain the
queue, we need to be able to transform the job queue into a queue for
the new version. The way this is achieved is by keeping the
serialization format backwards compatible. This is in line with
current practice that opcodes do not change between versions, and at
most new fields are added. Whenever we add a new field to an opcode,
we will make sure that the deserialization function will provide a
default value if the field is not present.
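
The default-filling behavior can be sketched as follows. The opcode name
``OP_EXAMPLE``, its fields, the ``OP_ID`` key, and the defaults table are
all hypothetical and do not describe Ganeti's actual opcode classes; the
sketch only shows how a job serialized by an older version stays loadable
after a field has been added.

```python
# Hypothetical table: every known field of an opcode with its default.
OPCODE_FIELD_DEFAULTS = {
    "OP_EXAMPLE": {
        "name": None,
        "force": False,
        "timeout": 60,  # imagined as a field added in the newer version
    },
}


def deserialize_opcode(data):
    """Return (op_id, fields), filling in defaults for absent fields."""
    op_id = data["OP_ID"]
    fields = dict(OPCODE_FIELD_DEFAULTS[op_id])  # start from the defaults
    for key, value in data.items():
        if key != "OP_ID":
            fields[key] = value  # serialized values override defaults
    return op_id, fields
```

An opcode written before ``timeout`` existed is deserialized with
``timeout`` set to its default, while values that are present in the
serialized form are kept unchanged.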