========================================
Automatized Upgrade Procedure for Ganeti
========================================

.. contents:: :depth: 4

This is a design document detailing the proposed changes to the
upgrade process, in order to allow it to be more automatic.


Current state and shortcomings
==============================

Ganeti requires the same version of Ganeti to be run on all
nodes of a cluster, and this requirement is unlikely to go away in the
foreseeable future. Also, the configuration may change between minor
versions (and in the past has proven to do so). This requires a quite
involved manual upgrade process of draining the queue, stopping
ganeti, changing the binaries, upgrading the configuration, starting
ganeti, distributing the configuration, and undraining the queue.


Proposed changes
================

While we will not remove the requirement of the same Ganeti
version running on all nodes, the transition from one version
to the other will be made more automatic. It will be possible
to install new binaries ahead of time, and the actual switch
between versions will be a single command.

While changing the file layout anyway, we install the Python
code, which is architecture independent, under ``${PREFIX}/share``,
in a way that properly separates the Ganeti libraries of the
various versions.

Path changes to allow multiple versions installed
-------------------------------------------------

Currently, Ganeti installs to ``${PREFIX}/bin``, ``${PREFIX}/sbin``,
and so on, as well as to ``${pythondir}/ganeti``.

These paths will be changed in the following way.

- The Python package will be installed to
  ``${PREFIX}/share/ganeti/${VERSION}/ganeti``.
  Here ``${VERSION}`` is, depending on configure options, either the fully
  qualified version number, consisting of major, minor, revision, and suffix,
  or just a major.minor pair. All Python executables will be installed under
  ``${PREFIX}/share/ganeti/${VERSION}`` so that they see their respective
  Ganeti library. ``${PREFIX}/share/ganeti/default`` is a symbolic link to
  ``${sysconfdir}/ganeti/share`` which, in turn, is a symbolic link to
  ``${PREFIX}/share/ganeti/${VERSION}``. For all Python executables (like
  ``gnt-cluster``, ``gnt-node``, etc.) symbolic links going through
  ``${PREFIX}/share/ganeti/default`` are added under ``${PREFIX}/sbin``.

- All other files will be installed to the corresponding path under
  ``${libdir}/ganeti/${VERSION}`` instead of under ``${PREFIX}``
  directly, where ``${libdir}`` defaults to ``${PREFIX}/lib``.
  ``${libdir}/ganeti/default`` will be a symlink to ``${sysconfdir}/ganeti/lib``
  which, in turn, is a symlink to ``${libdir}/ganeti/${VERSION}``.
  Symbolic links to the files installed under ``${libdir}/ganeti/${VERSION}``
  will be added under ``${PREFIX}/bin``, ``${PREFIX}/sbin``, and so on. These
  symbolic links will go through ``${libdir}/ganeti/default`` so that the
  version can easily be changed by updating the symbolic link in
  ``${sysconfdir}``.

The set of links for ganeti binaries might change between versions.
However, as the file structure under ``${libdir}/ganeti/${VERSION}`` reflects
that of ``/``, two links of different versions will never conflict. Similarly,
the symbolic links for the Python executables will never conflict, as they
always point to a file with the same basename directly under
``${PREFIX}/share/ganeti/default``. Therefore, each version will make sure that
enough symbolic links are present in ``${PREFIX}/bin``, ``${PREFIX}/sbin`` and
so on, even though some might be dangling if a different version of ganeti is
currently active.

The extra indirection through ``${sysconfdir}`` allows installations that choose
to have ``${sysconfdir}`` and ``${localstatedir}`` outside ``${PREFIX}`` to
mount ``${PREFIX}`` read-only. The latter is important for systems that choose
``/usr`` as ``${PREFIX}`` and are following the Filesystem Hierarchy Standard.
For example, choosing ``/usr`` as ``${PREFIX}`` and ``/etc`` as ``${sysconfdir}``,
the layout for version 2.10 will look as follows::

   /
   |
   +-- etc
   |   |
   |   +-- ganeti
   |         |
   |         +-- lib -> /usr/lib/ganeti/2.10
   |         |
   |         +-- share -> /usr/share/ganeti/2.10
   +-- usr
        |
        +-- bin
        |   |
        |   +-- harep -> /usr/lib/ganeti/default/usr/bin/harep
        |   |
        |   ...
        |
        +-- sbin
        |   |
        |   +-- gnt-cluster -> /usr/share/ganeti/default/gnt-cluster
        |   |
        |   ...
        |
        +-- ...
        |
        +-- lib
        |   |
        |   +-- ganeti
        |       |
        |       +-- default -> /etc/ganeti/lib
        |       |
        |       +-- 2.10
        |           |
        |           +-- usr
        |               |
        |               +-- bin
        |               |    |
        |               |    +-- htools
        |               |    |
        |               |    +-- harep -> htools
        |               |    |
        |               |    ...
        |               ...
        |
        +-- share
             |
             +-- ganeti
                 |
                 +-- default -> /etc/ganeti/share
                 |
                 +-- 2.10
                     |
                     +-- gnt-cluster
                     |
                     +-- gnt-node
                     |
                     +-- ...
                     |
                     +-- ganeti
                          |
                          +-- backend.py
                          |
                          +-- ...
                          |
                          +-- cmdlib
                          |   |
                          |   ...
                          ...
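
Switching the active version in this layout comes down to re-pointing the two
links under ``${sysconfdir}/ganeti``. The following Python fragment is only a
minimal sketch of that idea, not the actual Ganeti implementation; the helper
names ``point_link`` and ``switch_version`` and the temporary ``.new`` suffix
are assumptions made for illustration.

```python
import os


def point_link(link_path, target):
    """Atomically create or re-point the symlink at link_path to target."""
    tmp_path = link_path + ".new"  # hypothetical temporary name
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # leftover from an interrupted earlier run
    os.symlink(target, tmp_path)
    # Renaming over the old link replaces it in one step, so a reader
    # never observes a missing link.
    os.replace(tmp_path, link_path)


def switch_version(sysconfdir, libdir, sharedir, version):
    """Re-point sysconfdir/ganeti/{lib,share} to the given version."""
    conf = os.path.join(sysconfdir, "ganeti")
    point_link(os.path.join(conf, "lib"),
               os.path.join(libdir, "ganeti", version))
    point_link(os.path.join(conf, "share"),
               os.path.join(sharedir, "ganeti", version))
```

With the example layout above, ``switch_version("/etc", "/usr/lib",
"/usr/share", "2.10")`` would make ``/usr/lib/ganeti/default`` and
``/usr/share/ganeti/default`` resolve to the 2.10 trees without writing
anything under ``/usr``.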


gnt-cluster upgrade
-------------------

The actual upgrade process will be done by a new command ``upgrade`` to
``gnt-cluster``. Its option ``--to`` takes precisely one argument, the version
to upgrade (or downgrade) to, given as a full string with major, minor,
revision, and suffix. To be compatible with current configuration upgrade and
downgrade procedures, the new version must be of the same major version and
either an equal or higher minor version, or precisely the previous
minor version.
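
The constraint on the target version can be written as a small predicate. The
sketch below is illustrative only: the function names are hypothetical, and
the assumed string format (``major.minor.revision`` followed by an arbitrary
suffix) is a guess at what a "full string" looks like, not something this
document specifies.

```python
import re

# Assumed format: "<major>.<minor>.<revision><suffix>", e.g. "2.10.1~rc1".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def parse_version(text):
    """Split a full version string into (major, minor, revision, suffix)."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a full version string: %r" % (text,))
    major, minor, revision, suffix = match.groups()
    return int(major), int(minor), int(revision), suffix


def is_allowed_target(current, target):
    """Same major version, and minor equal, higher, or exactly one lower."""
    cur_major, cur_minor, _, _ = parse_version(current)
    new_major, new_minor, _, _ = parse_version(target)
    if new_major != cur_major:
        return False
    return new_minor >= cur_minor or new_minor == cur_minor - 1
```

Under these assumptions, going from ``2.10.1`` to ``2.12.0`` or back to
``2.9.3`` is accepted, while ``2.8.0`` and ``3.0.0`` are rejected.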

When executed, ``gnt-cluster upgrade --to=<version>`` will perform the
following actions.

- It verifies that the version to change to is installed on all nodes
  of the cluster that are not marked as offline. If this is not the
  case it aborts with an error. This initial testing is an
  optimization to allow for early feedback.

- An intent-to-upgrade file is created that contains the current
  version of ganeti, the version to change to, and the process ID of
  the ``gnt-cluster upgrade`` process. The latter is not used automatically,
  but allows manual detection if the upgrade process died
  unintentionally. The intent-to-upgrade file is persisted to disk
  before continuing.

- The Ganeti job queue is drained, and the executable waits until there
  are no more jobs in the queue. Once :doc:`design-optables` is
  implemented, for upgrades, and only for upgrades, all jobs are paused
  instead (in the sense that the currently running opcode continues,
  but the next opcode is not started) and the upgrade continues once all
  jobs are fully paused.

- All ganeti daemons on the master node are stopped.

- It is verified again that all nodes at this moment not marked as
  offline have the new version installed. If this is not the case,
  then all changes so far (stopping ganeti daemons and draining the
  queue) are undone and failure is reported. This second verification
  is necessary, as the set of online nodes might have changed during
  the draining period.

- All ganeti daemons on all remaining (non-offline) nodes are stopped.

- A backup of all Ganeti-related status information is created for
  manual rollbacks. While the normal way of rolling back after an
  upgrade should be calling ``gnt-cluster upgrade`` from the newer version
  with the older version as argument, a full backup provides an
  additional safety net, especially for jump-upgrades (skipping
  intermediate minor versions).

- If the action is a downgrade to the previous minor version, the
  configuration is downgraded now, using ``cfgupgrade --downgrade``.

- The ``${sysconfdir}/ganeti/lib`` and ``${sysconfdir}/ganeti/share``
  symbolic links are updated.

- If the action is an upgrade to a higher minor version, the configuration
  is upgraded now, using ``cfgupgrade``.

- All daemons are started on all nodes.

- ``ensure-dirs --full-run`` is run on all nodes.

- ``gnt-cluster redist-conf`` is run on the master node.

- All daemons are restarted on all nodes.

- The Ganeti job queue is undrained.

- The intent-to-upgrade file is removed.

- ``gnt-cluster verify`` is run and the result reported.
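
The requirement that the intent-to-upgrade file is persisted to disk before
continuing can be met by writing a temporary file, syncing it, renaming it
into place, and syncing the containing directory. The following is a minimal
sketch under stated assumptions: the JSON encoding, the field names, and the
functions ``write_intent`` and ``read_intent`` are all hypothetical, since
this document does not fix a file format or location.

```python
import json
import os


def write_intent(path, current_version, target_version, pid=None):
    """Durably record the intended version change at path."""
    record = {
        "current": current_version,
        "target": target_version,
        "pid": os.getpid() if pid is None else pid,
    }
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(record, handle)
        handle.flush()
        os.fsync(handle.fileno())  # file contents reach the disk
    os.replace(tmp_path, path)  # the file appears complete or not at all
    # Sync the directory as well, so the rename itself survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_intent(path):
    """Return the recorded intent as a dict, or None if no file exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
```

Writing through a temporary name means a resumed upgrade never reads a
half-written file: it sees either no intent at all or a complete record.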


Considerations on unintended reboots of the master node
=======================================================

During the upgrade procedure, the only ganeti process still running is
the one instance of ``gnt-cluster upgrade``. This process is also responsible
for eventually removing the queue drain. Therefore, we have to provide
means to resume this process, if it dies unintentionally. The process
itself will handle SIGTERM gracefully by either undoing all changes
done so far, or by ignoring the signal altogether and continuing to
the end; the choice between these behaviors depends on whether change
of the configuration has already started (in which case it goes
through to the end), or not (in which case the actions done so far are
rolled back).
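
One way to express this two-phase SIGTERM policy in Python is a handler that
consults a flag set just before the configuration change begins. This is a
sketch only; the class ``UpgradeSignalPolicy`` and the exception
``RollbackRequested`` are hypothetical names, and the actual rollback code is
left to the caller that catches the exception.

```python
import os
import signal


class RollbackRequested(Exception):
    """Raised from the handler to unwind into the caller's rollback code."""


class UpgradeSignalPolicy:
    def __init__(self):
        # The upgrade code sets this to True right before it starts
        # changing the configuration.
        self.config_change_started = False
        self.ignored_signals = 0

    def handle_sigterm(self, signum, frame):
        if self.config_change_started:
            # Past the point of no return: ignore and run to the end.
            self.ignored_signals += 1
            return
        # Nothing irreversible has happened yet: ask for a rollback.
        raise RollbackRequested()

    def install(self):
        signal.signal(signal.SIGTERM, self.handle_sigterm)
```

The upgrade driver would call ``install()`` at startup, wrap the early steps
in ``try``/``except RollbackRequested`` to undo them, and set
``config_change_started`` immediately before the configuration is touched.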

To allow resuming, ``gnt-cluster upgrade`` will support a ``--resume``
option. It is recommended to have ``gnt-cluster upgrade --resume`` as an
at-reboot task in the crontab. The ``gnt-cluster upgrade --resume`` command
first verifies that it is running on the master node, using the same
requirement as for starting the master daemon, i.e., confirmed by a majority
of all nodes. If it is not the master node, it will remove any possibly
existing intent-to-upgrade file and exit. If it is running on the
master node, it will check for the existence of an intent-to-upgrade
file. If no such file is found, it will simply exit. If found, it will
resume at the appropriate stage.

- If the configuration file still is at the initial version,
  ``gnt-cluster upgrade`` is resumed at the step immediately following the
  writing of the intent-to-upgrade file. It should be noted that
  all steps before changing the configuration are idempotent, so
  redoing them does not do any harm.

- If the configuration is already at the new version, all daemons on
  all nodes are stopped (as they might have been started again due
  to a reboot) and then it is resumed at the step immediately
  following the configuration change. All actions following the
  configuration change can be repeated without bringing the cluster
  into a worse state.
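
The decisions taken by ``--resume`` can be summarized as a pure function of
three inputs. The sketch below is illustrative: the function name, the
returned action labels, and the representation of the intent as a dict with
``"current"`` and ``"target"`` keys are assumptions, as is comparing the
configuration version directly against those strings.

```python
def resume_action(is_master, intent, config_version):
    """Decide what ``--resume`` should do.

    is_master:      True if a majority of nodes confirms this node as master.
    intent:         dict with "current" and "target" versions, or None if
                    no intent-to-upgrade file exists.
    config_version: the version the configuration is currently at.
    """
    if not is_master:
        return "remove-intent-and-exit"
    if intent is None:
        return "exit"
    if config_version == intent["current"]:
        # Steps before the configuration change are idempotent.
        return "resume-after-intent-written"
    if config_version == intent["target"]:
        # Daemons may have come back after a reboot; stop them first.
        return "stop-daemons-and-resume-after-config-change"
    # Not covered by the design text; treated as an error in this sketch.
    raise ValueError("configuration matches neither recorded version")
```

Keeping the decision separate from the actions makes each branch of the
procedure easy to test without a running cluster.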


Caveats
=======

Since ``gnt-cluster upgrade`` drains the queue and undrains it later, any
information about a previous drain gets lost. This problem will
disappear once :doc:`design-optables` is implemented, as the
undrain will then be restricted to the filters set by ``gnt-cluster upgrade``.


Requirement of job queue update
===============================

Since for upgrades we only pause jobs and do not fully drain the
queue, we need to be able to transform the job queue into a queue for
the new version. The preferred way to obtain this is to keep the
serialization format backwards compatible, i.e., only adding new
opcodes and new optional fields.

However, even with soft drain, no job is running at the moment ``cfgupgrade``
is running. So, if we change the queue representation, including the
representation of individual opcodes in any way, ``cfgupgrade`` will also
modify the queue accordingly. In a jobs-as-processes world, pausing a job
will be implemented in such a way that the corresponding process stops after
finishing the current opcode, and a new process is created if and when the
job is unpaused again.