Revision f67cca80

b/NEWS
7 7

  
8 8
*(unreleased)*
9 9

  
10
- Deprecated ``admin_up`` instance field. Instead, ``admin_state`` is
11
  introduced, with 3 possible values -- ``up``, ``down`` and
12
  ``offline``.
13
- Replaced ``--disks`` option of ``gnt-instance replace-disks`` with a
14
  more flexible ``--disk`` option. Now disk size and mode can be changed
15
  upon recreation.
16
- Removed deprecated ``QueryLocks`` LUXI request. Use
17
  ``Query(what=QR_LOCK, ...)`` instead.
18
- The LUXI requests :pyeval:`luxi.REQ_QUERY_JOBS`,
19
  :pyeval:`luxi.REQ_QUERY_INSTANCES`, :pyeval:`luxi.REQ_QUERY_NODES`,
20
  :pyeval:`luxi.REQ_QUERY_GROUPS`, :pyeval:`luxi.REQ_QUERY_EXPORTS` and
21
  :pyeval:`luxi.REQ_QUERY_TAGS` are deprecated and will be removed in a
22
  future version. :pyeval:`luxi.REQ_QUERY` should be used instead.
23
- RAPI client: ``CertificateError`` now derives from ``GanetiApiError``
24
- Deprecation warnings due to PyCrypto/paramiko import in
25
  ``tools/setup-ssh`` have been silenced, as they are usually safe;
26
  please make sure to run an up-to-date paramiko version
27
- The QA scripts now depend on Python 2.5 or above
10
New features
11
~~~~~~~~~~~~
12

  
13
Instance run status
14
+++++++++++++++++++
15

  
16
The current ``admin_up`` field, which used to denote whether an instance
17
should be running or not, has been removed. Instead, ``admin_state`` is
18
introduced, with 3 possible values -- ``up``, ``down`` and ``offline``.
19

  
20
The rationale behind this is that an instance being “down” can have
21
different meanings:
22

  
23
- it could be down during a reboot
24
- it could be temporarily down for a reinstall
25
- or it could be down because it is deprecated and kept just for its
26
  disk
27

  
28
The previous Boolean state was making it difficult to do capacity
29
calculations: should Ganeti reserve memory for a down instance? Now, the
30
tri-state field makes it clear:
31

  
32
- in ``up`` and ``down`` state, all resources are reserved for the
33
  instance, and it can be brought up at any time if it is down
34
- in ``offline`` state, only disk space is reserved for it, but not
35
  memory or CPUs
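
The reservation rule above can be sketched in a few lines of Python. This is only an illustration: the function name ``reserved_resources`` and the resource labels are hypothetical and are not taken from Ganeti's code.

```python
# Illustrative sketch only; names are hypothetical, not Ganeti's API.
ADMIN_STATES = ("up", "down", "offline")


def reserved_resources(admin_state):
    """Return the kinds of resources reserved for an instance."""
    if admin_state not in ADMIN_STATES:
        raise ValueError("unknown admin_state: %r" % (admin_state,))
    if admin_state == "offline":
        # Offline instances keep only their disk space.
        return {"disk"}
    # "up" and "down" both keep everything reserved, so a down
    # instance can be started at any time.
    return {"disk", "memory", "cpu"}
```

A capacity calculation could then skip memory and CPU accounting for any instance whose state maps to ``{"disk"}``.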
36

  
37
The field can have an extra use: since the transition between ``up`` and
38
``down`` and vice versa is done via ``gnt-instance start/stop``, but
39
transition between ``offline`` and ``down`` is done via ``gnt-instance
40
modify``, it is possible to give different rights to users. For
41
example, owners of an instance could be allowed to start/stop it, but
42
not transition it out of the offline state.
43

  
44
Instance policies and specs
45
+++++++++++++++++++++++++++
46

  
47
In previous Ganeti versions, an instance creation request was not
48
limited in minimum size, and was limited in maximum size only by the cluster
49
resources. As such, any policy could be implemented only in third-party
50
clients (RAPI clients, or shell wrappers over ``gnt-*``
51
tools). Furthermore, calculating cluster capacity via ``hspace`` again
52
required external input with regard to instance sizes.
53

  
54
In order to improve these workflows and to allow, for example, better
55
per-node group differentiation, we introduced instance specs, which
56
allow declaring:
57

  
58
- minimum instance disk size, disk count, memory size, cpu count
59
- maximum values for the above metrics
60
- and “standard” values (used in ``hspace`` to calculate the standard
61
  sized instances)
62

  
63
The minimum/maximum values can also be customised at node-group level,
64
for example allowing more powerful hardware to support bigger instance
65
memory sizes.
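
As a rough sketch of how a client might use such specs, the following Python checks a creation request against minimum/maximum bounds, with node-group overrides layered on top of cluster values. The dictionary layout, the key names and all numbers are assumptions made for this example; Ganeti's actual configuration format may differ.

```python
# Hypothetical data layout and values, for illustration only.
CLUSTER_SPECS = {
    "min": {"disk_size": 1024, "disk_count": 1, "memory": 128, "cpu_count": 1},
    "max": {"disk_size": 1048576, "disk_count": 16, "memory": 32768,
            "cpu_count": 8},
}


def effective_specs(cluster_specs, group_overrides):
    """Overlay node-group min/max overrides on the cluster-level specs."""
    merged = {}
    for bound in ("min", "max"):
        merged[bound] = dict(cluster_specs[bound])
        merged[bound].update(group_overrides.get(bound, {}))
    return merged


def violations(request, specs):
    """Return the sorted names of metrics that fall outside [min, max]."""
    return sorted(
        name for name, value in request.items()
        if not specs["min"][name] <= value <= specs["max"][name]
    )


request = {"disk_size": 20480, "disk_count": 1, "memory": 65536,
           "cpu_count": 4}
# A node group with more powerful hardware raises the memory ceiling.
big_group = effective_specs(CLUSTER_SPECS, {"max": {"memory": 131072}})
```

With these made-up values, the request exceeds the cluster-wide memory maximum but fits within the overridden node-group policy.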
66

  
67
Besides the instance specs, there are a few other settings belonging to
68
the instance policy framework. It is now possible to customise, per
69
cluster and node-group:
70

  
71
- the list of allowed disk templates
72
- the maximum ratio of VCPUs per PCPUs (to control CPU oversubscription)
73
- the maximum ratio of instances to spindles (see below for more
74
  information) for local storage
75

  
76
All these together should allow all tools that talk to Ganeti to know
77
what the ranges of allowed values for instances are, and the
78
over-subscription that is allowed.
79

  
80
For the VCPU/PCPU ratio, we already have the VCPU configuration from the
81
instance configuration, and the physical CPU configuration from the
82
node. For the spindle ratios, however, we didn't previously track these
83
values, so new parameters have been added:
84

  
85
- a new node parameter ``spindle_count``, defaults to 1, customisable at
86
  node group or node level
87
- a new backend parameter (for instances), ``spindle_use``, defaults to 1
88

  
89
Note that spindles in this context don't need to mean actual
90
mechanical hard-drives; it's just a relative number for both the node
91
I/O capacity and instance I/O consumption.
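
To make the two ratios concrete, here is a minimal Python sketch. It assumes that each ratio is the sum of the per-instance values on a node divided by the node's capacity, which the text implies but does not spell out; the function names are hypothetical.

```python
# Illustrative sketch; the exact formulas Ganeti uses are an assumption.
def vcpu_ratio(instance_vcpus, node_pcpus):
    """Total VCPUs of the instances on a node per physical CPU."""
    return sum(instance_vcpus) / float(node_pcpus)


def spindle_ratio(instance_spindle_use, node_spindle_count):
    """Total ``spindle_use`` of the instances per node ``spindle_count``."""
    return sum(instance_spindle_use) / float(node_spindle_count)


def within_policy(ratio, max_ratio):
    """True if the computed ratio does not exceed the policy maximum."""
    return ratio <= max_ratio
```

For example, three instances with 2, 2 and 4 VCPUs on a node with 4 physical CPUs give a VCPU/PCPU ratio of 2.0.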
92

  
93
Instance migration behaviour
94
++++++++++++++++++++++++++++
95

  
96
While live migration is in general preferable to failover, it is
97
possible that for some workloads it is actually worse, due to the
98
variable time of the “suspend” phase during live migration.
99

  
100
To allow the tools to work consistently over such instances (without
101
having to hard-code instance names), a new backend parameter
102
``always_failover`` has been added to control the migration/failover
103
behaviour. When set to True, all migration requests for an instance will
104
instead fall back to failover.
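
The fallback can be pictured with a short Python sketch. The function name and the plain-dictionary representation of backend parameters are hypothetical; only the ``always_failover`` parameter name comes from the text above.

```python
# Illustrative sketch; not Ganeti's actual implementation.
def effective_operation(requested, beparams):
    """Map a requested operation to the one that will be performed."""
    if requested == "migrate" and beparams.get("always_failover", False):
        # Migration requests are turned into failovers for this instance.
        return "failover"
    return requested
```

A tool can therefore issue the same migration request for every instance and let the per-instance parameter decide what happens.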
105

  
106
Instance memory ballooning
107
++++++++++++++++++++++++++
108

  
109
Initial support for memory ballooning has been added. The memory for an
110
instance is no longer fixed (backend parameter ``memory``), but instead
111
can vary between minimum and maximum values (backend parameters
112
``minmem`` and ``maxmem``). Currently we only change an instance's
113
memory when:
114

  
115
- live migrating or failing over an instance and the target node
116
  doesn't have enough memory
117
- user requests changing the memory via ``gnt-instance modify
118
  --runtime-memory``
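
One plausible reading of the migration case is sketched below in Python. The policy shown (shrink to whatever fits on the target, but never below ``minmem``) is an assumption made for illustration, and the function name is hypothetical; the text does not describe the exact rule Ganeti applies.

```python
# Illustrative sketch under the assumptions stated above.
def memory_on_target(current, minmem, maxmem, target_free):
    """Pick the runtime memory (in MiB) for an instance moving to a node.

    Returns None if not even ``minmem`` fits on the target node.
    """
    if not minmem <= current <= maxmem:
        raise ValueError("current memory outside [minmem, maxmem]")
    if target_free >= current:
        # Enough room: keep the current runtime memory.
        return current
    if target_free >= minmem:
        # Balloon down to what the target node can offer.
        return target_free
    return None
```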
119

  
120
Instance CPU pinning
121
++++++++++++++++++++
122

  
123
In order to control the use of specific CPUs by instances, support for
124
controlling CPU pinning has been added for the Xen, HVM and LXC
125
hypervisors. This is controlled by a new hypervisor parameter
126
``cpu_mask``; details about possible values for this are in the
127
:manpage:`gnt-instance(8)`. Note that use of the most specific (precise
128
VCPU-to-CPU mapping) form will work well only when all nodes in your
129
cluster have the same number of CPUs.
130

  
131
Disk parameters
132
+++++++++++++++
133

  
134
Another area in which Ganeti was not customisable was the parameters
135
used for storage configuration, e.g. how many stripes to use for LVM,
136
DRBD resync configuration, etc.
137

  
138
To improve this area, we've added disk parameters, which are
139
customisable at cluster and node group level, and which allow you to
140
specify various parameters for disks (DRBD has the most parameters
141
currently), for example:
142

  
143
- DRBD resync algorithm and parameters (e.g. speed)
144
- the default VG for meta-data volumes for DRBD
145
- number of stripes for LVM (plain disk template)
146
- the RBD pool
147

  
148
These parameters can be modified via ``gnt-cluster modify -D …`` and
149
``gnt-group modify -D …``, and are used at either instance creation (in
150
case of LVM stripes, for example) or at disk “activation” time
151
(e.g. resync speed).
152

  
153
Rados block device support
154
++++++++++++++++++++++++++
155

  
156
A Rados (http://ceph.com/wiki/Rbd) storage backend has been added,
157
denoted by the ``rbd`` disk template type. This is considered
158
experimental; feedback is welcome. For details on configuring it, see
159
the :doc:`install` document and the :manpage:`gnt-cluster(8)` man page.
160

  
161
Master IP setup
162
+++++++++++++++
163

  
164
The existing master IP functionality works well only in simple setups (a
165
single network shared by all nodes); however, if nodes belong to
166
different networks, then the ``/32`` setup and lack of routing
167
information is not enough.
168

  
169
To allow the master IP to function well in more complex cases, the
170
system was reworked as follows:
171

  
172
- a master IP netmask setting has been added
173
- the master IP activation/turn-down code was moved from the node daemon
174
  to a separate script
175
- whether to run the Ganeti-supplied master IP script or a user-supplied
176
  one is a ``gnt-cluster init`` setting
177

  
178
Details about the location of the standard and custom setup scripts are
179
in the man page :manpage:`gnt-cluster(8)`; for information about the
180
setup script protocol, look at the Ganeti-supplied script.
181

  
182
SPICE support
183
+++++++++++++
184

  
185
The `SPICE <http://www.linux-kvm.org/page/SPICE>`_ support has been
186
improved.
187

  
188
It is now possible to use TLS-protected connections, and when renewing
189
or changing the cluster certificates (via ``gnt-cluster renew-crypto``),
190
it is now possible to specify SPICE or SPICE CA certificates. Also, it
191
is possible to configure a password for SPICE sessions via the
192
hypervisor parameter ``spice_password_file``.
193

  
194
There are also new parameters to control the compression and streaming
195
options (e.g. ``spice_image_compression``, ``spice_streaming_video``,
196
etc.). For details, see the man page :manpage:`gnt-instance(8)` and look
197
for the spice parameters.
198

  
199
Lastly, it is now possible to see the SPICE connection information via
200
``gnt-instance console``.
201

  
202
OVF converter
203
+++++++++++++
204

  
205
A new tool (``tools/ovfconverter``) has been added that supports
206
conversion between Ganeti and the `Open Virtualization Format
207
<http://en.wikipedia.org/wiki/Open_Virtualization_Format>`_ (both to and
208
from).
209

  
210
This relies on the ``qemu-img`` tool to convert the disk formats, so the
211
actual compatibility with other virtualization solutions depends on it.
212

  
213
Confd daemon changes
214
++++++++++++++++++++
215

  
216
The configuration query daemon (``ganeti-confd``) is now optional, and
217
has been rewritten in Haskell; whether to use the daemon at all, and whether to use the
218
Python (default) or the Haskell version is selectable at configure time
219
via the ``--enable-confd`` parameter, which can take one of the
220
``haskell``, ``python`` or ``no`` values. If not used, disabling the
221
daemon will result in a smaller footprint; for larger systems, we
222
welcome feedback on the Haskell version which might become the default
223
in future versions.
224

  
225

  
226
User interface changes
227
~~~~~~~~~~~~~~~~~~~~~~
228

  
229
We have replaced the ``--disks`` option of ``gnt-instance
230
replace-disks`` with a more flexible ``--disk`` option, which allows
231
adding and removing disks at arbitrary indices (Issue 188). Furthermore,
232
disk size and mode can be changed upon recreation (via ``gnt-instance
233
recreate-disks``, which accepts the same ``--disk`` option).
234

  
235
As many people are used to a ``show`` command, we have added that as an
236
alias to ``info`` on all ``gnt-*`` commands.
237

  
238
The ``gnt-instance grow-disk`` command has a new mode in which it can
239
accept the target size of the disk, instead of the delta; this can be
240
safer, since two runs in absolute mode will be idempotent, and
241
sometimes it's also easier to specify the desired size directly.
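
The idempotency argument can be shown with a small Python sketch. The two helper functions are hypothetical models of the delta and absolute modes, not Ganeti code, and sizes are in arbitrary units. How the real command treats a target equal to the current size is not stated here; the sketch simply treats it as a no-op.

```python
# Illustrative model of the two modes; not Ganeti's implementation.
def grow_by_delta(size, delta):
    """Relative mode: every run adds ``delta`` again."""
    return size + delta


def grow_to_absolute(size, target):
    """Absolute mode: the result depends only on ``target``."""
    if target < size:
        raise ValueError("cannot shrink a disk")
    return target


# Accidentally running the same command twice:
twice_delta = grow_by_delta(grow_by_delta(10240, 2048), 2048)
twice_absolute = grow_to_absolute(grow_to_absolute(10240, 12288), 12288)
```

Repeating the relative command grows the disk twice, while repeating the absolute command leaves it at the requested size.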
242

  
243
API changes
244
~~~~~~~~~~~
245

  
246
RAPI coverage has improved, with (for example) new resources for
247
recreate-disks, node power-cycle, etc.
248

  
249
Compatibility
250
~~~~~~~~~~~~~
251

  
252
There is partial support for ``xl`` in the Xen hypervisor; feedback is
253
welcome.
254

  
255
Python 2.7 is better supported, and after Ganeti 2.6 we will investigate
256
whether to still support Python 2.4 or move to Python 2.6 as minimum
257
required version.
258

  
259
Internal changes
260
~~~~~~~~~~~~~~~~
261

  
262
The deprecated ``QueryLocks`` LUXI request has been removed. Use
263
``Query(what=QR_LOCK, ...)`` instead.
264

  
265
The LUXI requests :pyeval:`luxi.REQ_QUERY_JOBS`,
266
:pyeval:`luxi.REQ_QUERY_INSTANCES`, :pyeval:`luxi.REQ_QUERY_NODES`,
267
:pyeval:`luxi.REQ_QUERY_GROUPS`, :pyeval:`luxi.REQ_QUERY_EXPORTS` and
268
:pyeval:`luxi.REQ_QUERY_TAGS` are deprecated and will be removed in a
269
future version. :pyeval:`luxi.REQ_QUERY` should be used instead.
270

  
271
RAPI client: ``CertificateError`` now derives from
272
``GanetiApiError``. This should make it easier to handle Ganeti
273
errors.
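
The practical effect is that a single ``except`` clause now covers certificate problems as well. The classes below are stand-ins defined only for this sketch (they share names with the RAPI client's classes but are not imported from it), and ``call_rapi`` is a hypothetical helper.

```python
# Stand-in classes for illustration; not the real RAPI client module.
class GanetiApiError(Exception):
    """Stand-in for the RAPI client's base error class."""


class CertificateError(GanetiApiError):
    """Stand-in: now a subclass of GanetiApiError."""


def call_rapi(operation):
    """Run a client call, handling every Ganeti API error in one place."""
    try:
        return operation()
    except GanetiApiError as err:
        # Also catches CertificateError, thanks to the new hierarchy.
        return "RAPI error: %s" % err


def failing_call():
    raise CertificateError("bad cert")
```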
274

  
275
Deprecation warnings due to PyCrypto/paramiko import in
276
``tools/setup-ssh`` have been silenced, as they are usually safe; please
277
make sure to run an up-to-date paramiko version, if you use this tool.
278

  
279
The QA scripts now depend on Python 2.5 or above (the main code base
280
still works with Python 2.4).
281

  
282
The configuration file (``config.data``) is now written without
283
indentation for performance reasons; if you want to edit it, it can be
284
re-formatted via ``tools/fmtjson``.
285

  
286
A number of bugs have been fixed in the cluster merge tool.
287

  
288
``x509`` certificate verification (used in import-export) has been
289
changed to allow the same clock skew as permitted by the cluster
290
verification. This will remove some rare but hard-to-diagnose errors in
291
import-export.
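
A validity check that tolerates clock skew could look roughly like the Python below. Times are plain seconds since an arbitrary epoch, and the function name, the return strings and the ``max_skew`` parameter are hypothetical; the skew value Ganeti actually allows is not given here.

```python
# Illustrative sketch; not Ganeti's verification code.
def cert_time_status(not_before, not_after, now, max_skew):
    """Classify a certificate's validity window, allowing for clock skew."""
    if now + max_skew < not_before:
        # Even a clock running ``max_skew`` behind would see it as too early.
        return "not yet valid"
    if now - max_skew > not_after:
        return "expired"
    return "ok"
```

Without the tolerance, a certificate freshly generated on a node whose clock runs slightly ahead would be rejected as not yet valid on its peer.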
292

  
28 293

  
29 294

  
30 295
Version 2.5.1
