===================
Multi-version tests
===================

.. contents:: :depth: 4

This is a design document describing how tests which use multiple
versions of Ganeti can be introduced into the current build
infrastructure.

Desired improvements
====================

The testing of Ganeti is currently done using two different
approaches - unit tests and QA. While the former are useful for
ensuring that the individual parts of the system work as expected, most
errors are discovered only when all the components of Ganeti interact
during QA.

However useful otherwise, the QA has so far failed to provide support
for testing upgrades and version compatibility, as it was limited to
using only one version of Ganeti. While these properties can be tested
manually for every release, a systematic approach is preferable, and
none can exist with this restriction in place. To lift it, the buildbot
scripts and QA utilities must be extended to allow diverse
multi-version checks to be specified and run.

Required use cases
==================

There are two classes of multi-version tests that are interesting in
Ganeti, and this chapter provides an example from each to highlight what
should be accounted for in the design.

Compatibility tests
-------------------

One interface Ganeti exposes to clients interested in interacting with
it is the RAPI. Its stability has always been a design principle
followed during implementation, but whether this principle held true in
practice was never asserted through tests.

An automatic test of RAPI compatibility would have to take a diverse set
of RAPI requests and perform them on two clusters of different versions,
one of which would be the reference version. If the clusters are
identically configured, all of the commands that execute successfully on
the reference version should succeed on the newer version as well.

To achieve this, two versions of Ganeti can be run separately on a
cleanly set up cluster. As there is no guarantee that the versions can
coexist, they have to be deployed separately. A proxy placed between the
client and Ganeti records all the requests and responses. Using this
data, a testing utility can decide whether the newer version is
compatible, and provide additional information to assist with debugging.

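As a minimal sketch of the comparison step, assuming the proxy dumps
each exchange as a ``((method, path), status)`` pair (the function name
and log format here are illustrative only), the testing utility could
flag regressions along these lines:

```python
# Hypothetical sketch: compare request/response logs recorded by the
# proxy on the reference cluster and on the newer cluster.

def find_incompatibilities(reference_log, candidate_log):
    """Return requests that succeeded on the reference version but
    failed (or were missing) on the candidate version.

    Each log is a list of ((method, path), status) pairs.
    """
    candidate = {(method, path): status
                 for (method, path), status in candidate_log}
    problems = []
    for (method, path), ref_status in reference_log:
        cand_status = candidate.get((method, path))
        # A request that succeeded on the reference version must not
        # fail or disappear on the newer version.
        if ref_status < 400 and (cand_status is None or cand_status >= 400):
            problems.append((method, path, ref_status, cand_status))
    return problems
```

Each reported tuple carries both status codes, which is the "additional
information to assist with debugging" mentioned above.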
Upgrade / downgrade tests
-------------------------

An upgrade / downgrade test serves to examine whether the state of the
cluster is unchanged after its configuration has been upgraded or
downgraded to another version of Ganeti.

The test works with two consecutive versions of Ganeti, both installed
on the same machine. It examines whether the configuration data and
instances survive the downgrade and upgrade procedures. This is done by
creating a cluster with the newer version, downgrading it to the older
one, and upgrading it to the newer one again. After every step, the
integrity of the cluster is checked by running various operations and
ensuring everything still works.

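The cycle described above could be driven by a small harness along the
following lines; the exact ``gnt-cluster`` invocations and version
numbers are illustrative, and the command runner is injected so the
sequence can be exercised without a real cluster:

```python
# Sketch of the downgrade/upgrade cycle. The commands shown are
# assumptions about the shape of the test, not the actual QA code.

def upgrade_downgrade_test(run, new="2.11", old="2.10"):
    """Run the downgrade/upgrade cycle, verifying after each step.

    'run' executes a single command, e.g. subprocess.check_call on a
    real master node, or a recording fake in tests.
    """
    steps = [
        ["gnt-cluster", "init", "--no-etc-hosts", "cluster.example.com"],
        ["gnt-cluster", "verify"],                # new version works
        ["gnt-cluster", "upgrade", "--to", old],  # downgrade
        ["gnt-cluster", "verify"],                # old version works
        ["gnt-cluster", "upgrade", "--to", new],  # upgrade again
        ["gnt-cluster", "verify"],                # state survived the cycle
    ]
    for step in steps:
        run(step)
    return steps
```

In a real run, the ``verify`` steps would be accompanied by instance
operations checking that configuration data and instances survived.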
Design and implementation
=========================

Although the previous examples have not been selected to show use cases
as diverse as possible, they still show a number of dissimilarities:

- Parallel installation vs sequential deployments
- Comparing with a reference version vs comparing consecutive versions
- Examining result dumps vs trying a sequence of operations

With even the first two real use cases demonstrating such diversity, it
does not make sense to design multi-version test classes. Instead, the
programmability of buildbot's configuration files can be leveraged to
implement each test as a separate builder with a custom sequence of
steps. The individual steps, such as checking out a given or the
previous version, or installing and removing Ganeti, will be provided
as utility functions for any test writer to use.

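As a sketch of how such a builder might be declared in a buildbot
``master.cfg`` (0.8.x-era API; the repository URL, script paths,
builder and slave names are hypothetical placeholders, not the actual
Ganeti buildbot setup):

```python
# Illustrative master.cfg fragment: one builder per multi-version test,
# with utility steps supplied as ordinary build steps.
from buildbot.config import BuilderConfig
from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import ShellCommand
from buildbot.steps.source.git import Git

c = BuildmasterConfig = {"builders": []}

f = BuildFactory()
# Check out the branch that triggered the build.
f.addStep(Git(repourl="git://git.ganeti.org/ganeti.git"))
# Utility step: determine and check out the previous version
# (hypothetical helper script).
f.addStep(ShellCommand(name="checkout-previous",
                       command=["./qa/checkout-previous.sh"]))
# Run the upgrade / downgrade QA against the two checkouts.
f.addStep(ShellCommand(name="upgrade-downgrade-qa",
                       command=["./qa/run-upgrade-downgrade.sh"]))

c["builders"].append(
    BuilderConfig(name="upgrade-downgrade",
                  slavenames=["qa-slave"],
                  factory=f))
```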
Current state
-------------

An upgrade / downgrade test has been part of the QA suite since commit
aa104b5e. The test and the corresponding buildbot changes are a very
good first step, both by showing that multi-version tests can be done,
and by providing utilities needed for builds of multiple branches.
Previously, the same folder was used as the base directory of any
build; now a directory structure better suited to multiple builds is in
place.

The builder running the test has one flaw: regardless of the branch
submitted, it compares versions 2.10 and 2.11 (the current master).
This behaviour is different from that of the other builders, which may
restrict the branches a test can be performed on, but do not
differentiate between them otherwise. While additional builders for
different version pairs could be added, this is not a good long-term
solution.

The test can be improved by making it compare the current and the
previous version. As the buildbot has no notion of what a previous
version is, additional utilities to handle this logic will have to be
introduced.

Planned changes
---------------

The upgrade / downgrade test should be generalized to work for any
version that can be downgraded from and upgraded to automatically,
meaning versions from 2.11 onwards. This is made challenging by the
fact that the previous version has to be checked out by reading the
version of the currently checked out code, identifying the previous
version, and then making yet another checkout.

The major and minor version can be read from a Ganeti repository in
multiple ways. The two are present as constants defined in source
files, but due to refactorings moving constants from the Python to the
Haskell side, their position varies across versions. A more reliable
way of fetching them is to examine the NEWS file, as it obeys strict
formatting restrictions.

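A sketch of such a version check, assuming ``Version X.Y.Z`` release
headers as used in the NEWS file (the exact regex is an assumption):

```python
import re

# The NEWS file lists releases newest-first, so the first "Version
# X.Y.Z" header belongs to the currently checked out code.
_VERSION_RE = re.compile(r"^Version (\d+)\.(\d+)\.\d+", re.MULTILINE)

def read_version(news_text):
    """Return the (major, minor) version from the contents of NEWS."""
    m = _VERSION_RE.search(news_text)
    if m is None:
        raise ValueError("no version header found in NEWS")
    return int(m.group(1)), int(m.group(2))
```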
With the version found, a script acting as a previous-version lookup
table can be invoked. This script can be constructed dynamically upon
buildbot startup and specified as a build step. The checkout following
it then proceeds as expected.

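The lookup table itself could be as simple as a generated mapping; the
entries below are illustrative, standing in for whatever branch list
the buildbot derives the table from at startup:

```python
# Hypothetical previous-version lookup, emitted by the buildbot at
# startup from its list of known branches.
_PREVIOUS_VERSION = {
    (2, 11): (2, 10),
    (2, 10): (2, 9),
    (2, 9): (2, 8),
}

def previous_version(major, minor):
    """Map a (major, minor) version to its predecessor, or None if the
    version has no automatically reachable predecessor."""
    return _PREVIOUS_VERSION.get((major, minor))
```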
The RAPI compatibility test should be added as a separate builder
afterwards. As the test requires additional comparison and proxy logic,
it will be enabled only from version 2.11 onwards, comparing each
version to 2.6 - the reference version for the RAPI. Details on the
design of this test will be given in a separate document.

Potential issues
================

While there are many advantages to having a single builder representing
a multi-version test and working on every branch, there is at least one
disadvantage: the need to define a base or reference version, which is
the only version that can be used to trigger the test, and the only one
on which code changes can be tried.

If an error is detected while running a test, and the issue lies with a
version other than the one used to invoke the test, the fix would have
to make it into the repository before the test could be tried again.

For simple tests, the issue might be mitigated by running them locally.
However, the multi-version tests are more likely to be complicated than
not, and it could be difficult to reproduce a test by hand.

The situation can be made simpler by requiring that any multi-version
test use only versions lower than the reference version. As errors are
more likely to be found in new rather than old code, this would at
least reduce the number of troublesome cases.