========================================
Automatized Upgrade Procedure for Ganeti
========================================

.. contents:: :depth: 4

This is a design document detailing the proposed changes to the
upgrade process, in order to allow it to be more automatic.


Current state and shortcomings
==============================

Ganeti requires the same version of Ganeti to run on all nodes of a
cluster, and this requirement is unlikely to go away in the
foreseeable future. Also, the configuration may change between minor
versions (and in the past has done so). This requires a quite
involved manual upgrade process: draining the queue, stopping
Ganeti, changing the binaries, upgrading the configuration, starting
Ganeti, distributing the configuration, and undraining the queue.


Proposed changes
================

While we will not remove the requirement of the same Ganeti
version running on all nodes, the transition from one version
to the other will be made more automatic. It will be possible
to install new binaries ahead of time, and the actual switch
between versions will be a single command.

Since the file layout is changing anyway, we install the Python
code, which is architecture independent, under ``${PREFIX}/share``,
in a way that properly separates the Ganeti libraries of the
various versions.

Path changes to allow multiple versions installed
-------------------------------------------------

Currently, Ganeti installs to ``${PREFIX}/bin``, ``${PREFIX}/sbin``,
and so on, as well as to ``${pythondir}/ganeti``.

These paths will be changed in the following way.

- The Python package will be installed to
  ``${PREFIX}/share/ganeti/${VERSION}/ganeti``.
  Here ``${VERSION}`` is, depending on configure options, either the fully
  qualified version number, consisting of major, minor, revision, and suffix,
  or just a major.minor pair. All Python executables will be installed under
  ``${PREFIX}/share/ganeti/${VERSION}`` so that they see their respective
  Ganeti library. ``${PREFIX}/share/ganeti/default`` is a symbolic link to
  ``${sysconfdir}/ganeti/share`` which, in turn, is a symbolic link to
  ``${PREFIX}/share/ganeti/${VERSION}``. For all Python executables (like
  ``gnt-cluster``, ``gnt-node``, etc.) symbolic links going through
  ``${PREFIX}/share/ganeti/default`` are added under ``${PREFIX}/sbin``.

- All other files will be installed to the corresponding path under
  ``${libdir}/ganeti/${VERSION}`` instead of under ``${PREFIX}``
  directly, where ``${libdir}`` defaults to ``${PREFIX}/lib``.
  ``${libdir}/ganeti/default`` will be a symlink to ``${sysconfdir}/ganeti/lib``
  which, in turn, is a symlink to ``${libdir}/ganeti/${VERSION}``.
  Symbolic links to the files installed under ``${libdir}/ganeti/${VERSION}``
  will be added under ``${PREFIX}/bin``, ``${PREFIX}/sbin``, and so on. These
  symbolic links will go through ``${libdir}/ganeti/default`` so that the
  version can easily be changed by updating the symbolic link in
  ``${sysconfdir}``.

The set of links for Ganeti binaries might change between versions.
However, as the file structure under ``${libdir}/ganeti/${VERSION}`` reflects
that of ``/``, two links of different versions will never conflict. Similarly,
the symbolic links for the Python executables will never conflict, as they
always point to a file with the same basename directly under
``${PREFIX}/share/ganeti/default``. Therefore, each version will make sure that
enough symbolic links are present in ``${PREFIX}/bin``, ``${PREFIX}/sbin`` and
so on, even though some might be dangling if a different version of Ganeti is
currently active.

The extra indirection through ``${sysconfdir}`` allows installations that choose
to have ``${sysconfdir}`` and ``${localstatedir}`` outside ``${PREFIX}`` to
mount ``${PREFIX}`` read-only. The latter is important for systems that choose
``/usr`` as ``${PREFIX}`` and follow the Filesystem Hierarchy Standard.
For example, choosing ``/usr`` as ``${PREFIX}`` and ``/etc`` as ``${sysconfdir}``,
the layout for version 2.10 will look as follows.
::

   /
   |
   +-- etc
   |   |
   |   +-- ganeti
   |       |
   |       +-- lib -> /usr/lib/ganeti/2.10
   |       |
   |       +-- share -> /usr/share/ganeti/2.10
   |
   +-- usr
       |
       +-- bin
       |   |
       |   +-- harep -> /usr/lib/ganeti/default/usr/bin/harep
       |   |
       |   ...
       |
       +-- sbin
       |   |
       |   +-- gnt-cluster -> /usr/share/ganeti/default/gnt-cluster
       |   |
       |   ...
       |
       +-- ...
       |
       +-- lib
       |   |
       |   +-- ganeti
       |       |
       |       +-- default -> /etc/ganeti/lib
       |       |
       |       +-- 2.10
       |           |
       |           +-- usr
       |               |
       |               +-- bin
       |               |   |
       |               |   +-- htools
       |               |   |
       |               |   +-- harep -> htools
       |               |   |
       |               |   ...
       |               ...
       |
       +-- share
           |
           +-- ganeti
               |
               +-- default -> /etc/ganeti/share
               |
               +-- 2.10
                   |
                   +-- gnt-cluster
                   |
                   +-- gnt-node
                   |
                   +-- ...
                   |
                   +-- ganeti
                       |
                       +-- backend.py
                       |
                       +-- ...
                       |
                       +-- cmdlib
                       |   |
                       |   ...
                       ...
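
As a rough illustration of how the two-level indirection lets a single link
update switch versions, the following Python sketch builds a miniature copy of
this layout in a scratch directory. The helper names ``build_demo_layout`` and
``switch_version`` are hypothetical and not part of Ganeti, and the sketch
assumes a POSIX system where renaming over an existing symbolic link is atomic.

```python
import os
import tempfile


def build_demo_layout(root, versions):
    """Create a tiny stand-in for the proposed layout below `root`.

    Hypothetical helper for illustration only; returns the paths needed
    by switch_version() plus the sbin entry point for gnt-cluster.
    """
    etc = os.path.join(root, "etc")
    libdir = os.path.join(root, "usr", "lib")
    sharedir = os.path.join(root, "usr", "share")
    sbin = os.path.join(root, "usr", "sbin")
    os.makedirs(os.path.join(etc, "ganeti"))
    os.makedirs(sbin)
    for version in versions:
        os.makedirs(os.path.join(libdir, "ganeti", version))
        vdir = os.path.join(sharedir, "ganeti", version)
        os.makedirs(vdir)
        # Stand-in executable whose content is just its version.
        with open(os.path.join(vdir, "gnt-cluster"), "w") as handle:
            handle.write(version)
    # "default" links point into sysconfdir and never change afterwards.
    os.symlink(os.path.join(etc, "ganeti", "lib"),
               os.path.join(libdir, "ganeti", "default"))
    os.symlink(os.path.join(etc, "ganeti", "share"),
               os.path.join(sharedir, "ganeti", "default"))
    entry = os.path.join(sbin, "gnt-cluster")
    os.symlink(os.path.join(sharedir, "ganeti", "default", "gnt-cluster"),
               entry)
    return etc, libdir, sharedir, entry


def switch_version(sysconfdir, libdir, sharedir, version):
    """Repoint the two sysconfdir links at the directories of `version`."""
    targets = {
        "lib": os.path.join(libdir, "ganeti", version),
        "share": os.path.join(sharedir, "ganeti", version),
    }
    for name, target in targets.items():
        link = os.path.join(sysconfdir, "ganeti", name)
        tmp = link + ".new"
        if os.path.lexists(tmp):
            os.remove(tmp)
        os.symlink(target, tmp)
        # rename() replaces the link itself, not the directory it points to.
        os.replace(tmp, link)


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()
    etc, libdir, sharedir, entry = build_demo_layout(demo_root, ["2.10", "2.11"])
    for version in ("2.10", "2.11"):
        switch_version(etc, libdir, sharedir, version)
        with open(entry) as handle:
            print(version, "->", handle.read())
```

Only the links under ``${sysconfdir}`` are rewritten in this sketch; everything
below ``${PREFIX}`` stays untouched, which is what allows ``${PREFIX}`` to be
mounted read-only.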


gnt-upgrade
-----------

The actual upgrade process will be done by a new binary,
``gnt-upgrade``. It will take precisely one argument, the version to
upgrade (or downgrade) to, given as a full string with major, minor,
revision, and suffix. To be compatible with current configuration upgrade
and downgrade procedures, the new version must be of the same major
version and either an equal or higher minor version, or precisely the
previous minor version.
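
The version constraint can be sketched as a small predicate. This is only an
illustration: the function names are hypothetical, and the assumed string
format (``MAJOR.MINOR.REVISION`` followed by an optional free-form suffix) is a
simplification, not necessarily the exact format Ganeti uses.

```python
import re

# Assumed format: three dot-separated numbers plus an optional suffix.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def parse_version(text):
    """Split a full version string into (major, minor, revision, suffix)."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a full version string: %r" % (text,))
    major, minor, revision, suffix = match.groups()
    return int(major), int(minor), int(revision), suffix


def is_allowed_target(current, target):
    """Return True if switching from `current` to `target` is permitted.

    Same major version, and the minor version is either equal, higher,
    or exactly one lower than the current minor version.
    """
    cur_major, cur_minor = parse_version(current)[:2]
    new_major, new_minor = parse_version(target)[:2]
    if new_major != cur_major:
        return False
    return new_minor >= cur_minor or new_minor == cur_minor - 1
```

Under these assumptions, ``is_allowed_target("2.10.0", "2.12.1")`` and
``is_allowed_target("2.10.0", "2.9.3")`` are accepted, while a target of
``2.8.0`` or ``3.0.0`` is rejected.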

When executed, ``gnt-upgrade`` will perform the following actions.

- It verifies that the version to change to is installed on all nodes
  of the cluster that are not marked as offline. If this is not the
  case, it aborts with an error. This initial test is an
  optimization to allow for early feedback.

- An intent-to-upgrade file is created that contains the current
  version of Ganeti, the version to change to, and the process ID of
  the ``gnt-upgrade`` process. The latter is not used automatically,
  but allows manual detection of whether the upgrade process died
  unintentionally. The intent-to-upgrade file is persisted to disk
  before continuing.

- The Ganeti job queue is drained, and the executable waits until there
  are no more jobs in the queue. Once :doc:`design-optables` is
  implemented, for upgrades, and only for upgrades, all jobs are paused
  instead (in the sense that the currently running opcode continues,
  but the next opcode is not started), and the upgrade continues once
  all jobs are fully paused.

- All Ganeti daemons on the master node are stopped.

- It is verified again that all nodes not marked as offline at this
  moment have the new version installed. If this is not the case,
  then all changes so far (stopping Ganeti daemons and draining the
  queue) are undone and failure is reported. This second verification
  is necessary, as the set of online nodes might have changed during
  the draining period.

- All Ganeti daemons on all remaining (non-offline) nodes are stopped.

- A backup of all Ganeti-related status information is created for
  manual rollbacks. While the normal way of rolling back after an
  upgrade should be calling ``gnt-upgrade`` from the newer version
  with the older version as argument, a full backup provides an
  additional safety net, especially for jump-upgrades (skipping
  intermediate minor versions).

- If the action is a downgrade to the previous minor version, the
  configuration is downgraded now, using ``cfgupgrade --downgrade``.

- The ``${sysconfdir}/ganeti/lib`` and ``${sysconfdir}/ganeti/share``
  symbolic links are updated.

- If the action is an upgrade to a higher minor version, the configuration
  is upgraded now, using ``cfgupgrade``.

- All daemons are started on all nodes.

- ``ensure-dirs --full-run`` is run on all nodes.

- ``gnt-cluster redist-conf`` is run on the master node.

- All daemons are restarted on all nodes.

- The Ganeti job queue is undrained.

- The intent-to-upgrade file is removed.

- ``gnt-cluster verify`` is run and the result reported.
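
The requirement that the intent-to-upgrade file be persisted to disk before
continuing could be met along the following lines. This is a sketch under
stated assumptions: the JSON format, the field names ``from``, ``to`` and
``pid``, and the function names are all hypothetical, as the design does not
fix a file format.

```python
import json
import os
import tempfile


def write_intent_file(path, current, target):
    """Durably record the intended version change before acting on it."""
    record = {"from": current, "to": target, "pid": os.getpid()}
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so readers never see a partial file.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(record, handle)
            handle.flush()
            os.fsync(handle.fileno())  # force the file content to disk
        os.replace(tmp, path)          # atomically move it into place
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    # Sync the directory as well, so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_intent_file(path):
    """Return the recorded intent as a dictionary."""
    with open(path) as handle:
        return json.load(handle)
```

Writing to a temporary file and renaming it means that a crash leaves either no
intent file or a complete one, which keeps the later resume logic simple.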


Considerations on unintended reboots of the master node
=======================================================

During the upgrade procedure, the only Ganeti process still running is
the one instance of ``gnt-upgrade``. This process is also responsible
for eventually removing the queue drain. Therefore, we have to provide
means to resume this process if it dies unintentionally. The process
itself will handle SIGTERM gracefully by either undoing all changes
done so far, or by ignoring the signal altogether and continuing to
the end; the choice between these behaviors depends on whether the
change of the configuration has already started (in which case it goes
through to the end), or not (in which case the actions done so far are
rolled back).
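
One possible shape for this SIGTERM policy is sketched below. The class and
function names are hypothetical; the sketch only records the decision in a
flag, on the assumption that the main upgrade loop checks
``rollback_requested`` between steps and performs the actual rollback.

```python
import signal


class UpgradeState:
    """Mutable state shared between the upgrade steps and the handler."""

    def __init__(self):
        self.config_change_started = False
        self.rollback_requested = False


def make_sigterm_handler(state):
    """Build a handler implementing the two SIGTERM behaviours."""

    def handler(signum, frame):
        if state.config_change_started:
            # Past the point of no return: ignore the signal and let the
            # upgrade run through to the end.
            return
        # Configuration untouched: ask the main loop to undo its changes.
        state.rollback_requested = True

    return handler


def install_sigterm_handler(state):
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, make_sigterm_handler(state))
```

The step that changes the configuration would set ``config_change_started``
before touching anything, so that a signal arriving afterwards is ignored.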

To achieve this, ``gnt-upgrade`` will support a ``--resume``
option. It is recommended to have ``gnt-upgrade --resume`` as an
at-reboot task in the crontab. If started with this option,
``gnt-upgrade`` does not accept any arguments. It first verifies that
it is running on the master node, using the same requirement as for
starting the master daemon, i.e., confirmed by a majority of all
nodes. If it is not the master node, it will remove any possibly
existing intent-to-upgrade file and exit. If it is running on the
master node, it will check for the existence of an intent-to-upgrade
file. If no such file is found, it will simply exit. If found, it will
resume at the appropriate stage.

- If the configuration file is still at the initial version,
  ``gnt-upgrade`` is resumed at the step immediately following the
  writing of the intent-to-upgrade file. It should be noted that
  all steps before changing the configuration are idempotent, so
  redoing them does not do any harm.

- If the configuration is already at the new version, all daemons on
  all nodes are stopped (as they might have been started again due
  to a reboot), and then ``gnt-upgrade`` is resumed at the step
  immediately following the configuration change. All actions following
  the configuration change can be repeated without bringing the cluster
  into a worse state.
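
The decision taken by ``gnt-upgrade --resume`` can be summarised as a pure
function. The names and return values below are hypothetical, the intent record
is assumed to be a dictionary with ``from`` and ``to`` entries, and the sketch
assumes that the two versions can be told apart by the configuration version.

```python
REMOVE_INTENT_AND_EXIT = "remove-intent-file-and-exit"
EXIT = "exit"
RESUME_AFTER_INTENT = "resume-after-intent-file-written"
STOP_AND_RESUME_AFTER_CONFIG = "stop-daemons-and-resume-after-config-change"


def resume_action(is_master, intent, config_version):
    """Decide what a resumed upgrade should do.

    `is_master` is the outcome of the majority-confirmed master check,
    `intent` is None if no intent-to-upgrade file exists, and
    `config_version` is the version the configuration is currently at.
    """
    if not is_master:
        return REMOVE_INTENT_AND_EXIT
    if intent is None:
        return EXIT
    if config_version == intent["from"]:
        # Configuration untouched: the earlier steps are idempotent.
        return RESUME_AFTER_INTENT
    if config_version == intent["to"]:
        # Daemons may have come back after a reboot; stop them first.
        return STOP_AND_RESUME_AFTER_CONFIG
    # The design does not describe this case; the sketch refuses to guess.
    raise ValueError("configuration matches neither version in the intent file")
```

Keeping the decision free of side effects makes each of the four outcomes easy
to exercise in isolation.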


Caveats
=======

Since ``gnt-upgrade`` drains the queue and undrains it later, any
information about a previous drain gets lost. This problem will
disappear once :doc:`design-optables` is implemented, as the
undrain will then be restricted to the filters added by ``gnt-upgrade``.


Requirement of opcode backwards compatibility
==============================================

Since for upgrades we only pause jobs and do not fully drain the
queue, we need to be able to transform the job queue into a queue for
the new version. This is achieved by keeping the
serialization format backwards compatible. This is in line with
current practice that opcodes do not change between versions, and at
most new fields are added. Whenever we add a new field to an opcode,
we will make sure that the deserialization function provides a
default value if the field is not present.
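
A minimal sketch of such a deserialization function follows. ``OpExample`` and
its fields are invented for illustration and do not correspond to a real Ganeti
opcode; the point is only that a field added in a later version carries a
default, so jobs serialized by the older version still load.

```python
_REQUIRED = object()  # marker for fields that have no default


class OpExample:
    """Hypothetical opcode used only to illustrate defaulting."""

    # (field name, default) pairs; "timeout" stands for a newly added field.
    FIELDS = (
        ("instance_name", _REQUIRED),
        ("timeout", 60),
    )

    def __init__(self, **kwargs):
        for name, _default in self.FIELDS:
            setattr(self, name, kwargs[name])

    @classmethod
    def from_dict(cls, data):
        """Build an opcode from its serialized form, filling in defaults."""
        values = {}
        for name, default in cls.FIELDS:
            if name in data:
                values[name] = data[name]
            elif default is _REQUIRED:
                raise ValueError("missing required field %r" % (name,))
            else:
                values[name] = default
        return cls(**values)
```

With this scheme, a job written before ``timeout`` existed deserializes with
the default value, while a job written by the new version keeps its explicit
value.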