========================================
Automatized Upgrade Procedure for Ganeti
========================================

.. contents:: :depth: 4

This is a design document detailing the proposed changes to the
upgrade process, in order to allow it to be more automatic.


Current state and shortcomings
==============================

Ganeti requires the same version of Ganeti to be run on all
nodes of a cluster, and this requirement is unlikely to go away in the
foreseeable future. Also, the configuration may change between minor
versions (and in the past has proven to do so). This requires a quite
involved manual upgrade process of draining the queue, stopping
ganeti, changing the binaries, upgrading the configuration, starting
ganeti, distributing the configuration, and undraining the queue.


Proposed changes
================

While we will not remove the requirement of the same Ganeti
version running on all nodes, the transition from one version
to the other will be made more automatic. It will be possible
to install new binaries ahead of time, and the actual switch
between versions will be a single command.

Path changes to allow multiple versions installed
-------------------------------------------------

Currently, Ganeti installs to ``${PREFIX}/bin``, ``${PREFIX}/sbin``,
and so on, as well as to ``${pythondir}/ganeti``.

These paths will be changed in the following way.

- The python package will be installed to ``${pythondir}/ganeti-${VERSION}``.
  Here ``${VERSION}`` is the fully qualified version number, consisting of
  major, minor, revision, and suffix. All python executables will be changed
  to import the correct version of the ganeti package.

- All other files will be installed to the corresponding path under
  ``${libdir}/ganeti-${VERSION}`` instead of under ``${PREFIX}``
  directly, where ``${libdir}`` defaults to ``${PREFIX}/lib``. Symbolic links
  to these files will be added under ``${PREFIX}/bin``,
  ``${PREFIX}/sbin``, and so on.

As only each version itself has the authoritative knowledge of which
files belong to it, each version provides two executables ``install``
and ``uninstall`` that add and remove the symbolic links,
respectively. Both executables will be idempotent and only touch
symbolic links that are outside the directory for their version of
Ganeti and point into this directory. In particular, an ``uninstall``
of one version will not interfere with an ``install`` of a different
version.
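
The behaviour required of ``install`` and ``uninstall`` can be
illustrated by the following minimal Python sketch. It is not the
proposed implementation: the function names, the ``SUBDIRS`` tuple, the
use of absolute link targets, and the choice to skip (rather than
replace) paths that are already occupied are assumptions made for
illustration only.

```python
import os

SUBDIRS = ("bin", "sbin")  # hypothetical subset of the linked directories


def _points_into(link, version_dir):
    """Return True if link is a symlink whose target lies in version_dir."""
    if not os.path.islink(link):
        return False
    return os.readlink(link).startswith(os.path.join(version_dir, ""))


def install(version_dir, prefix):
    """Add links under prefix for this version; return the paths skipped."""
    skipped = []
    for sub in SUBDIRS:
        src_dir = os.path.join(version_dir, sub)
        if not os.path.isdir(src_dir):
            continue
        dst_dir = os.path.join(prefix, sub)
        os.makedirs(dst_dir, exist_ok=True)
        for name in sorted(os.listdir(src_dir)):
            link = os.path.join(dst_dir, name)
            if _points_into(link, version_dir):
                continue  # already linked by an earlier run: idempotent
            if os.path.lexists(link):
                skipped.append(link)  # not ours: leave it untouched
                continue
            os.symlink(os.path.join(src_dir, name), link)
    return skipped


def uninstall(version_dir, prefix):
    """Remove only those links under prefix that point into version_dir."""
    for sub in SUBDIRS:
        dst_dir = os.path.join(prefix, sub)
        if not os.path.isdir(dst_dir):
            continue
        for name in os.listdir(dst_dir):
            link = os.path.join(dst_dir, name)
            if _points_into(link, version_dir):
                os.unlink(link)
```

Because ``uninstall`` checks the link target before removing anything,
running it for one version leaves the links of any other version in
place, and running either function twice has no further effect.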

gnt-upgrade
-----------

The actual upgrade process will be done by a new binary,
``gnt-upgrade``. It will take precisely one argument, the version to
upgrade (or downgrade) to, given as full string with major, minor, revision,
and suffix. To be compatible with current configuration upgrade and downgrade
procedures, the new version must be of the same major version and
either an equal or higher minor version, or precisely the previous
minor version.
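
The version constraint can be stated compactly in code. The sketch below
assumes, purely for illustration, that a full version string has the
shape ``MAJOR.MINOR.REVISION`` followed by an optional suffix; the
regular expression and both function names are hypothetical and not part
of the design.

```python
import re

# Assumed syntax: MAJOR.MINOR.REVISION followed by an optional suffix.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def parse_version(text):
    """Split a full version string into (major, minor, revision, suffix)."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a full version string: %r" % (text,))
    major, minor, revision, suffix = match.groups()
    return int(major), int(minor), int(revision), suffix


def is_allowed_target(current, target):
    """Same major version, and minor either >= current or exactly one less."""
    cur_major, cur_minor = parse_version(current)[:2]
    new_major, new_minor = parse_version(target)[:2]
    if new_major != cur_major:
        return False
    return new_minor >= cur_minor or new_minor == cur_minor - 1
```

Under these assumptions, going from ``2.10.3`` to ``2.13.0`` or to
``2.9.5`` would be accepted, while ``2.8.0`` and ``3.0.0`` would be
rejected.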

When executed, ``gnt-upgrade`` will perform the following actions.

- It verifies that the version to change to is installed on all nodes
  of the cluster that are not marked as offline. If this is not the
  case it aborts with an error. This initial testing is an
  optimization to allow for early feedback.

- An intent-to-upgrade file is created that contains the current
  version of ganeti, the version to change to, and the process ID of
  the ``gnt-upgrade`` process. The latter is not used automatically,
  but allows manual detection if the upgrade process died
  unintentionally. The intent-to-upgrade file is persisted to disk
  before continuing.

- The Ganeti job queue is drained, and the executable waits till there
  are no more jobs in the queue. Once :doc:`design-optables` is
  implemented, for upgrades, and only for upgrades, all jobs are paused
  instead (in the sense that the currently running opcode continues,
  but the next opcode is not started) and the upgrade continues once
  all jobs are fully paused.

- All ganeti daemons on the master node are stopped.

- It is verified again that all nodes at this moment not marked as
  offline have the new version installed. If this is not the case,
  then all changes so far (stopping ganeti daemons and draining the
  queue) are undone and failure is reported. This second verification
  is necessary, as the set of online nodes might have changed during
  the draining period.

- All ganeti daemons on all remaining (non-offline) nodes are stopped.

- A backup of all Ganeti-related status information is created for
  manual rollbacks. While the normal way of rolling back after an
  upgrade should be calling ``gnt-upgrade`` from the newer version
  with the older version as argument, a full backup provides an
  additional safety net, especially for jump-upgrades (skipping
  intermediate minor versions).

- If the action is a downgrade to the previous minor version, the
  configuration is downgraded now, using ``cfgupgrade --downgrade``.

- The current version of ganeti is deactivated on all nodes, using the
  ``uninstall`` executable described earlier.

- The new version of ganeti is activated on all nodes, using the
  ``install`` executable described earlier.

- If the action is an upgrade to a higher minor version, the configuration
  is upgraded now, using ``cfgupgrade``.

- All daemons are started on all nodes.

- ``ensure-dirs --full-run`` is run on all nodes.

- ``gnt-cluster redist-conf`` is run on the master node.

- All daemons are restarted on all nodes.

- The Ganeti job queue is undrained.

- The intent-to-upgrade file is removed.

- ``gnt-cluster verify`` is run and the result reported.
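
The requirement that the intent-to-upgrade file be persisted to disk
before the queue is drained could be met along the following lines. This
is only a sketch for a POSIX system: the one-line file format, the
``.tmp`` suffix, and both function names are assumptions, as the design
does not fix any of them.

```python
import os


def write_intent_file(path, current_version, target_version, pid=None):
    """Durably record the upgrade intent before any further step is taken."""
    if pid is None:
        pid = os.getpid()
    tmp_path = path + ".tmp"  # hypothetical temporary name
    with open(tmp_path, "w") as handle:
        handle.write("%s %s %d\n" % (current_version, target_version, pid))
        handle.flush()
        os.fsync(handle.fileno())  # force the file contents to disk
    os.replace(tmp_path, path)  # atomic rename on POSIX file systems
    # Sync the containing directory so that the rename itself is durable.
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_intent_file(path):
    """Return (current, target, pid), or None if no upgrade is pending."""
    try:
        with open(path) as handle:
            current, target, pid = handle.read().split()
    except FileNotFoundError:
        return None
    return current, target, int(pid)
```

Writing to a temporary file and renaming it means a reader sees either
no intent file or a complete one, never a partially written file.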


Considerations on unintended reboots of the master node
=======================================================

During the upgrade procedure, the only ganeti process still running is
the one instance of ``gnt-upgrade``. This process is also responsible
for eventually removing the queue drain. Therefore, we have to provide
means to resume this process, if it dies unintentionally. The process
itself will handle SIGTERM gracefully by either undoing all changes
done so far, or by ignoring the signal altogether and continuing to
the end; the choice between these behaviors depends on whether change
of the configuration has already started (in which case it goes
through to the end), or not (in which case the actions done so far are
rolled back).
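
One way to obtain this two-phase SIGTERM behaviour is sketched below.
All names (``UpgradeState``, ``UpgradeAborted``, ``run_upgrade``) and the
representation of steps as plain callables are hypothetical; the sketch
also glosses over details such as a signal arriving during the rollback
itself.

```python
import signal


class UpgradeAborted(Exception):
    """Raised by the SIGTERM handler while a rollback is still possible."""


class UpgradeState:
    def __init__(self):
        self.config_change_started = False

    def handle_sigterm(self, signum, frame):
        if self.config_change_started:
            return  # past the point of no return: ignore and finish
        raise UpgradeAborted()


def run_upgrade(state, steps_before, steps_after, rollback):
    """Run both phases; return False if the upgrade was rolled back."""
    previous = signal.signal(signal.SIGTERM, state.handle_sigterm)
    try:
        try:
            for step in steps_before:
                step()
            # From here on the configuration is about to change.
            state.config_change_started = True
        except UpgradeAborted:
            rollback()
            return False
        for step in steps_after:
            step()
        return True
    finally:
        signal.signal(signal.SIGTERM, previous)  # restore the old handler
```

The handler raises an exception only in the first phase, so that the
main flow can undo its work; in the second phase it returns without
doing anything, which amounts to ignoring the signal.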

To achieve this, ``gnt-upgrade`` will support a ``--resume``
option. It is recommended to have ``gnt-upgrade --resume`` as an
at-reboot task in the crontab. If started with this option,
``gnt-upgrade`` does not accept any arguments. It first verifies that
it is running on the master node, using the same requirement as for
starting the master daemon, i.e., confirmed by a majority of all
nodes. If it is not the master node, it will remove any possibly
existing intent-to-upgrade file and exit. If it is running on the
master node, it will check for the existence of an intent-to-upgrade
file. If no such file is found, it will simply exit. If found, it will
resume at the appropriate stage.

- If the configuration file still is at the initial version,
  ``gnt-upgrade`` is resumed at the step immediately following the
  writing of the intent-to-upgrade file. It should be noted that
  all steps before changing the configuration are idempotent, so
  redoing them does not do any harm.

- If the configuration is already at the new version, all daemons on
  all nodes are stopped (as they might have been started again due
  to a reboot) and then ``gnt-upgrade`` is resumed at the step
  immediately following the configuration change. All actions following
  the configuration change can be repeated without bringing the cluster
  into a worse state.
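
The decision taken by ``gnt-upgrade --resume`` can be summarised as a
small pure function. The sketch below is illustrative only: the
``ResumeAction`` names are invented here, and it assumes that the
configuration version can be compared directly with the two versions
recorded in the intent-to-upgrade file.

```python
import enum


class ResumeAction(enum.Enum):
    REMOVE_INTENT_AND_EXIT = "not the master: remove intent file and exit"
    EXIT = "no intent file: nothing to resume"
    REDO_FROM_INTENT = "redo the steps following the intent file"
    STOP_DAEMONS_AND_FINISH = "stop daemons, continue after config change"


def choose_resume_action(is_master, intent, config_version):
    """Pick the resume stage.

    intent is the (initial_version, target_version) pair read from the
    intent-to-upgrade file, or None if no such file exists.
    """
    if not is_master:
        return ResumeAction.REMOVE_INTENT_AND_EXIT
    if intent is None:
        return ResumeAction.EXIT
    initial_version, target_version = intent
    if config_version == initial_version:
        return ResumeAction.REDO_FROM_INTENT
    if config_version == target_version:
        return ResumeAction.STOP_DAEMONS_AND_FINISH
    raise ValueError("configuration version %r matches neither %r nor %r"
                     % (config_version, initial_version, target_version))
```

Keeping the decision free of side effects makes each of the four cases
easy to check in isolation before wiring it to the actual steps.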


Caveats
=======

Since ``gnt-upgrade`` drains the queue and undrains it later, any
information about a previous drain gets lost. This problem will
disappear once :doc:`design-optables` is implemented, as the
undrain will then be restricted to filters created by ``gnt-upgrade``.


Requirement of opcode backwards compatibility
=============================================

Since for upgrades we only pause jobs and do not fully drain the
queue, we need to be able to transform the job queue into a queue for
the new version. The way this is achieved is by keeping the
serialization format backwards compatible. This is in line with
current practice that opcodes do not change between versions, and at
most new fields are added. Whenever we add a new field to an opcode,
we will make sure that the deserialization function will provide a
default value if the field is not present.
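
The effect of such defaults can be illustrated with a toy
deserialization function. The opcode name, its fields, and the JSON
layout used below are made up for the example and do not describe the
actual Ganeti opcode format.

```python
import json

# Hypothetical schema: opcode name -> {field name: default when absent}.
OPCODE_FIELDS = {
    "OP_EXAMPLE": {
        "instance_name": None,
        "timeout": 60,
        "new_flag": False,  # field added in the newer version
    },
}


def load_opcode(serialized):
    """Deserialize an opcode, filling in defaults for missing fields."""
    data = json.loads(serialized)
    op_id = data["OP_ID"]
    opcode = {"OP_ID": op_id}
    for name, default in OPCODE_FIELDS[op_id].items():
        opcode[name] = data.get(name, default)
    return opcode
```

An opcode serialized by the older version, which does not know about
``new_flag``, still loads; the missing field simply takes its default.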