===============================
Ganeti OS installation redesign
===============================

.. contents:: :depth: 3

    
This is a design document detailing a new OS installation procedure that is
more secure, provides more features, and is easier to use for many common
tasks than the current one.

    
Current state and shortcomings
==============================

As of Ganeti 2.10, each instance is associated with an OS definition. An OS
definition is a set of scripts (``create``, ``export``, ``import``, ``rename``)
that are executed with root privileges on the primary host of the instance to
perform all the OS-related functionality (setting up an operating system inside
the disks of the instance being created, exporting/importing the instance,
renaming it).

These scripts receive through environment variables a fixed set of parameters
related to the instance (such as the hypervisor, the name of the instance, the
number of disks, and their location) and a set of user defined parameters.
These parameters are also written in the configuration file of Ganeti, to allow
future reinstalls of the instance, and in various log files, namely:

    
* node daemon log file: contains DEBUG strings of the ``/os_validate``,
  ``/instance_os_add`` and ``/instance_start`` RPC calls.

* master daemon log file: DEBUG strings related to the same RPC calls are stored
  here as well.

* commands log: the CLI commands that create a new instance, including their
  parameters, are logged here.

* RAPI log: the RAPI commands that create a new instance, including their
  parameters, are logged here.

* job logs: the job files stored in the job queue, or in its archive, contain
  the parameters.

    
The current situation presents a number of shortcomings:

* Having the installation scripts run as root on the nodes doesn't allow
  user-defined OS scripts, as they would pose a huge security issue.
  Furthermore, even a script without malicious intentions might end up
  disrupting a node because of a bug in it.

* Ganeti cannot be used to create instances starting from user provided disk
  images: even in the (hypothetical) case where the scripts are completely
  secure and run not by root but by an unprivileged user with only the power to
  mount arbitrary files as disk images, this is a security issue. It has been
  proven that a carefully crafted file system might exploit kernel
  vulnerabilities to gain control of the system. Therefore, directly mounting
  images on the Ganeti nodes is not an option.

* There is no way to inject files into an existing disk image. A common use case
  is for the system administrator to provide a standard image of the system, to
  be later personalized with the network configuration, private keys identifying
  the machine, ssh keys of the users and so on. A possible workaround would be
  for the scripts to mount the image (only if this is trusted!) and to receive
  the configurations and ssh keys as user defined OS parameters. Unfortunately,
  this is also not an option for security sensitive material (such as the ssh
  keys) because the OS parameters are stored in many places on the system, as
  already described above.

* Most other virtualization software simply works with instance images, not with
  installation scripts. This difference makes the interaction of Ganeti with
  other software difficult.

    
Proposed changes
================

In order to fix the shortcomings of the current state, we plan to introduce the
following changes.

    
OS parameters categories
++++++++++++++++++++++++

OS parameters will be divided into three categories:

* ``public``: the current behavior. The parameter is logged and stored freely.

* ``private``: the parameter is saved inside the Ganeti configuration (to allow
  for instance reinstall) but it is not shown in logs, job logs, or passed back
  via RAPI.

* ``secret``: the parameter is not saved inside the Ganeti configuration.
  Reinstalls are impossible unless the data is passed again. The parameter will
  not appear in any log file. When a functionality is performed jointly by
  multiple daemons (such as MasterD and LuxiD), currently Ganeti sometimes
  serializes jobs on disk and later reloads them. Secret parameters will not be
  serialized to disk. They will be passed around as part of the LUXI calls
  exchanged by the daemons, and only kept in memory, in order to reduce their
  accessibility as much as possible. In case of failure of the master node,
  these parameters will be lost and cannot be recovered because they are not
  serialized. As a result, the job cannot be taken over by the new master.  This
  is an expected and accepted side effect of jobs with secret parameters: if
  they fail, they'll have to be restarted manually.
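
The three categories can be illustrated with a minimal Python sketch. It is
not Ganeti code: the helper names ``for_logging`` and ``for_config`` and the
``<redacted>`` placeholder are hypothetical, and whether the *name* of a
non-public parameter may still be logged is an assumption made here for
brevity.

.. code-block:: python

  import json

  PUBLIC, PRIVATE, SECRET = "public", "private", "secret"
  REDACTED = "<redacted>"  # hypothetical placeholder, not defined by the design


  def for_logging(params):
      """Return a copy that is safe to log: only public values are shown."""
      return {
          name: (value if visibility == PUBLIC else REDACTED)
          for name, (value, visibility) in params.items()
      }


  def for_config(params):
      """Return what may be saved in the configuration: secrets are dropped."""
      return {
          name: (value, visibility)
          for name, (value, visibility) in params.items()
          if visibility != SECRET
      }


  # Illustrative parameters; names and values are made up.
  params = {
      "dhcp": ("yes", PUBLIC),
      "root_password": ("hunter2", PRIVATE),
      "disk_key": ("s3cr3t", SECRET),
  }
  print(json.dumps(for_logging(params), sort_keys=True))
  print(sorted(for_config(params)))

In this sketch a reinstall could reuse ``dhcp`` and ``root_password`` from the
saved configuration, while ``disk_key`` would have to be supplied again.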

    
Metadata
++++++++

In order to allow metadata to be sent inside the instance, a communication
mechanism between the instance and the host will be created.  This mechanism
will be bidirectional (e.g.: to allow the setup process going on inside the
instance to communicate its progress to the host). Each instance will have
access exclusively to its own metadata, and it will only be able to communicate
with its host over this channel.  This is the approach followed by the
``cloud-init`` tool and more details will be provided in the `Communication
mechanism and metadata service`_ section.

    
Installation procedure
++++++++++++++++++++++

A new installation procedure will be introduced, with which it will be possible
to use an installation medium and run the OS scripts in an optional virtualized
environment and with an optional personalization package.  There will be two
sets of parameters, namely, installation parameters, which are used mainly for
installs and reinstalls, and execution parameters, which are used in all the
other runs that are not part of an installation procedure.

    
The installation parameters will make it possible, e.g., to attach an
installation floppy/cdrom/network, change the boot device order, or specify a
disk image to be used.  Through this set of parameters, the administrator will
have to provide the hypervisor a location for an installation medium for the
instance (e.g., a boot disk, a network image, etc).  This medium will carry out
the installation of the instance onto the instance's disks and will then be
responsible for getting the parameters for configuring the instance, such as
network interfaces, IP address, and hostname.  These parameters are taken from
the metadata.  The installation parameters will be stored in the configuration
of Ganeti and used in future reinstalls, but not during normal execution.

The instance is reinstalled using the same installation parameters as the
first installation.  However, it will be the administrator's responsibility to
ensure that any installation medium is still available at the proper location
when a reinstall occurs.

    
The parameter ``--os-parameters`` can still be used to specify the OS
parameters.  However, without OS scripts, Ganeti cannot do more than a syntactic
check to validate the supplied OS parameters string.  As a result, this string
will be directly passed to the instance as part of the metadata.  If the
installation procedure is running inside a virtualized environment, then Ganeti
will take these parameters from the metadata and pass them to the OS scripts as
environment variables.
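
As an illustration of the last step, the following Python sketch turns OS
parameters read from the metadata into environment variables. The ``OSP_``
prefix with upper-cased names, the ``os_params`` key, and the sample payload
are assumptions made for this example; this design does not fix them.

.. code-block:: python

  import json


  def os_params_to_env(os_params, prefix="OSP_"):
      """Map OS parameter names to environment variable names and values."""
      env = {}
      for name, value in os_params.items():
          # Assumed naming convention: prefix plus the upper-cased name.
          env[prefix + name.upper()] = str(value)
      return env


  # Hypothetical metadata payload as it might be seen inside the appliance.
  metadata = json.loads(
      '{"os_params": {"dhcp": "yes", "mirror": "http://example.org/debian"}}'
  )
  env = os_params_to_env(metadata["os_params"])
  print(env)

The resulting dictionary could be merged into ``os.environ`` before launching
an OS script, so that the script sees the same kind of variables whether it
runs on the host or inside the virtualized environment.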

    
* Use a disk image:

  Currently, it is already possible to specify an installation medium, such as
  a cdrom, but not a disk image.  Therefore, a new parameter ``--os-image`` will
  be used to specify the location of a disk image which will be dumped to the
  instance's first disk before the instance is started.  The location of the
  image can be a URL and, if this is the case, Ganeti will download this image.

* Run OS scripts:

  The parameter ``--os-type`` (short version: ``-o``) is currently used to
  specify the OS scripts.  This parameter will still be used to specify the OS
  scripts with the difference that these OS scripts may optionally run inside a
  virtualized environment for safety reasons, depending on whether they are
  trusted or not.  For more details on trusted and untrusted OS scripts, refer
  to the `Installation process in a virtualized environment`_ section.

    
* Personalization package

  As part of the instance creation command, it will be possible to indicate a
  URL for a "personalization package", which is an archive containing a set of
  files meant to be overlaid on top of the OS file system at the end of the
  setup process and before the VM is started for the first time in normal mode.
  Ganeti will provide a mechanism for receiving and unpacking this archive
  whether the installation is being performed inside the virtualized environment
  or not.

  The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or
  ``.tgz``) and contain the files according to the directory structure that will
  be recreated on the installation disk.  Files contained in this archive will
  overwrite files with the same path created during the installation procedure
  (if any).  The URL of the "personalization package" will have to specify an
  extension to identify the file format (in order to allow for more formats to
  be supported in the future).  The URL will be stored as part of the
  configuration of the instance (therefore, the URL should not contain
  confidential information, but the files available there can).

  It is up to the system administrator to ensure that a package is actually
  available at that URL at install and reinstall time.  The contents of the
  package are allowed to change.  E.g.: a system administrator might create a
  package containing the private keys of the instance being created.  When the
  instance is reinstalled, a new package with new keys can be made available
  there, thus allowing instance reinstall without the need to store keys.  A
  username and a password can be specified together with the URL.  If the URL is
  an HTTP(S) URL, they will be used as basic access authentication credentials
  to access that URL.  The username and password will not be saved in the
  config, and will have to be provided again in case a reinstall is requested.

  The downloaded personalization package will not be stored locally on the node
  for longer than it is needed while unpacking it and adding its files to the
  instance being created.  The personalization package will be overlaid on top
  of the instance filesystem after the scripts that created it have been
  executed.  In order for the files in the package to be automatically overlaid
  on top of the instance filesystem, it is required that the appliance is
  actually able to mount the instance's disks.  As a result, this will not work
  for every filesystem.

    
* Combine a disk image, OS scripts, and a personalization package

  It will be possible to combine a disk image, OS scripts, and a personalization
  package, either with or without a virtualized environment.  The one exception
  is untrusted OS scripts, which always require the virtualized environment.
  At least one of an installation medium or OS scripts must be specified.

  The disk image of the actual virtual appliance, which bootstraps the virtual
  environment used in the installation procedure, will be read only, so that a
  pristine copy of the appliance can be started every time a new instance needs
  to be created and to further increase security.  The data the instance needs
  to write at runtime will only be stored in RAM and disappear as soon as the
  instance is stopped.

  The parameter ``--enable-safe-install=yes|no`` will be used to give the
  administrator control over whether to use a virtualized environment for the
  installation procedure.  By default, a virtualized environment will be used.
  Note that some feature combinations, such as using untrusted scripts, will
  require the virtualized environment.  In this case, Ganeti will not allow
  disabling the virtualized environment.
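
The overlay step of the personalization package described above can be
sketched in Python with the standard ``tarfile`` module. This is only an
illustration under stated assumptions: the function name ``overlay_package``
is hypothetical, the instance filesystem is assumed to be already mounted at
``target_root``, and the path check is deliberately simple (it does not guard
against symlink tricks inside the archive).

.. code-block:: python

  import io
  import os
  import tarfile
  import tempfile


  def overlay_package(archive_path, target_root):
      """Unpack a .tar.gz/.tgz archive over target_root, overwriting files."""
      if not archive_path.endswith((".tar.gz", ".tgz")):
          raise ValueError("unsupported package format: %s" % archive_path)
      root = os.path.realpath(target_root)
      with tarfile.open(archive_path, "r:gz") as archive:
          for member in archive.getmembers():
              dest = os.path.realpath(os.path.join(root, member.name))
              # Refuse entries that would land outside the instance filesystem.
              if os.path.commonpath([root, dest]) != root:
                  raise ValueError("unsafe path in package: %s" % member.name)
          archive.extractall(root)


  # Demonstration on a temporary directory standing in for the mounted disk.
  with tempfile.TemporaryDirectory() as tmp:
      target = os.path.join(tmp, "rootfs")
      os.makedirs(os.path.join(target, "etc"))
      with open(os.path.join(target, "etc", "hostname"), "w") as handle:
          handle.write("installer-default\n")

      # Build a tiny package that contains a replacement for etc/hostname.
      package = os.path.join(tmp, "personalization.tar.gz")
      data = b"instance1.example.com\n"
      with tarfile.open(package, "w:gz") as archive:
          info = tarfile.TarInfo("etc/hostname")
          info.size = len(data)
          archive.addfile(info, io.BytesIO(data))

      overlay_package(package, target)
      with open(os.path.join(target, "etc", "hostname")) as handle:
          result = handle.read()

  print(result)

The file created by the "installation" is replaced by the one shipped in the
package, which matches the overwrite rule stated for the archive contents.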

    
Implementation
==============

The implementation of this design will happen as an ordered sequence of steps,
of increasing impact on the system and, in some cases, dependent on each other:

#. Private and secret instance parameters
#. Communication mechanism between host and instance
#. Metadata service
#. Personalization package (inside a virtualized environment)
#. Instance creation via a disk image
#. Instance creation inside a virtualized environment

Some of these steps need to be specified in more detail than what is already
written in the `Proposed changes`_ section. Extra details will be provided in
the following subsections.

    
Communication mechanism and metadata service
++++++++++++++++++++++++++++++++++++++++++++

The communication mechanism and the metadata service are described together
because they are deeply tied. The communication mechanism will be made more
generic because it can be used for other purposes in the future (like allowing
instances to explicitly send commands to Ganeti, or to let Ganeti control a
helper instance, like the one hereby introduced for performing OS installs
inside a safe environment).

The communication mechanism will be enabled automatically during an installation
procedure that requires a virtualized environment, but for backwards
compatibility it will be disabled when the instance is running normally, unless
it is explicitly requested. Specifically, a new parameter
``--communication=yes|no`` (short version: ``-C``) will be added to
``gnt-instance add`` and ``gnt-instance modify``. It will determine whether the
instance has a communication channel set to interact with the host and receive
metadata. The value of this parameter will be saved as part of the configuration
of the instance.

    
When the communication mechanism is enabled, Ganeti will create a new network
interface inside the instance. This additional network interface will be the
last one in the instance, after all the user defined ones. On the host side,
this interface will only be accessible to the host itself, and not routed
outside the machine.
On this network interface, the instance will use the IP address
169.254.169.253 with netmask 255.255.255.0.
The host will be on the same network, with the IP address 169.254.169.254.
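
The addressing above can be checked quickly with Python's standard
``ipaddress`` module. The sketch only uses the addresses and netmask stated
in the text; the variable names are illustrative.

.. code-block:: python

  import ipaddress

  NETMASK = "255.255.255.0"
  INSTANCE_IP = ipaddress.ip_address("169.254.169.253")
  HOST_IP = ipaddress.ip_address("169.254.169.254")

  # strict=False derives the network from a host address plus a netmask.
  network = ipaddress.ip_network("%s/%s" % (INSTANCE_IP, NETMASK), strict=False)

  print(network)                    # 169.254.169.0/24
  print(HOST_IP in network)         # True
  print(INSTANCE_IP.is_link_local)  # True

Both endpoints sit in the same link-local /24, so the instance needs no
gateway to reach the host on this interface.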

    
The way to create this interface depends on the specific hypervisor being used.
In KVM, it is possible to create a network interface inside the instance without
having a corresponding interface created on the host. Using a command like::

  kvm -net nic -net \
  user,restrict=on,net=169.254.169.0/24,host=169.254.169.253,\
  guestfwd=tcp:169.254.169.254:80-tcp:127.0.0.1:8080

a network interface will be created inside the VM, part of the 169.254.169.0/24
network, where the VM will have IP address .253 and port 8080 on the host will
be reachable from the VM at 169.254.169.254, port 80.
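
Because the ``user,...`` option string must be passed as a single argument
without embedded whitespace, it can be convenient to assemble it
programmatically. The following Python sketch rebuilds exactly the arguments
of the command shown above; the function name ``metadata_nic_args`` and its
parameter names are hypothetical.

.. code-block:: python

  import shlex


  def metadata_nic_args(network="169.254.169.0/24",
                        host="169.254.169.253",
                        guest_target="169.254.169.254:80",
                        host_target="127.0.0.1:8080"):
      """Build the ``-net`` arguments used in the command shown above."""
      user_opts = ",".join([
          "user",
          "restrict=on",
          "net=%s" % network,
          "host=%s" % host,
          "guestfwd=tcp:%s-tcp:%s" % (guest_target, host_target),
      ])
      return ["-net", "nic", "-net", user_opts]


  # Print the full command line as a shell-safe string.
  print(shlex.join(["kvm"] + metadata_nic_args()))

Passing the list form to a process launcher avoids any shell line-continuation
issue altogether.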

    
In Xen, unfortunately, such a capability is not present, and an actual network
interface has to be created on the host (using the ``vif`` parameter of the Xen
configuration file). Each instance will have its corresponding ``vif`` network
interface on the host. These interfaces will not be connected to each other in
any way, and Ganeti will not configure them to allow traffic to be forwarded
beyond the host machine. The ``vif-route`` script of Xen might be helpful in
implementing this.
It will be the system administrator's responsibility to ensure that the extra
firewalling and routing rules specified on the host don't accidentally allow
such forwarding.

The instance will be able to connect to 169.254.169.254:80, and issue GET
requests to an HTTP server that will provide the instance metadata.

The choice of this IP address and port for accessing the metadata was made for
compatibility with OpenStack's and Amazon EC2's ways of providing metadata to
the instance. The metadata will be provided by a single daemon, which will
determine what instance the request comes from and reply with the metadata
specific to that instance.

    
Where possible, the metadata will be provided in a way compatible with Amazon
EC2, at::

  http://169.254.169.254/<version>/meta-data/*

If some metadata are Ganeti-specific and don't fit this structure, they will be
provided at::

  http://169.254.169.254/ganeti/<version>/meta_data.json

``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to indicate
the most recent available protocol version.
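
A client inside the instance could build these URLs as in the following Python
sketch. The helper names are hypothetical, ``hostname`` is used only as an
example item under ``meta-data/``, and the version check simply enforces the
two forms described above.

.. code-block:: python

  import datetime
  import re

  BASE_URL = "http://169.254.169.254"


  def check_version(version):
      """Accept ``latest`` or a valid date in YYYY-MM-DD format."""
      if version == "latest":
          return version
      if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
          raise ValueError("bad version: %r" % version)
      # strptime raises ValueError for impossible dates such as 2013-13-45.
      datetime.datetime.strptime(version, "%Y-%m-%d")
      return version


  def ec2_metadata_url(version, item):
      """URL of one EC2-compatible metadata item."""
      return "%s/%s/meta-data/%s" % (BASE_URL, check_version(version), item)


  def ganeti_metadata_url(version):
      """URL of the Ganeti-specific metadata document."""
      return "%s/ganeti/%s/meta_data.json" % (BASE_URL, check_version(version))


  print(ec2_metadata_url("2013-10-15", "hostname"))
  print(ganeti_metadata_url("latest"))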

    
If needed in the future, this structure also allows us to support OpenStack's
metadata at::

  http://169.254.169.254/openstack/<version>/meta_data.json

A bi-directional, pipe-like communication channel will be provided. The instance
will be able to receive data from the host by a GET request at::

  http://169.254.169.254/ganeti/<version>/read

and to send data to the host by a POST request at::

  http://169.254.169.254/ganeti/<version>/write

As in a pipe, once the data are read, they will not be in the buffer anymore, so
subsequent GET requests to ``read`` will not return the same data twice.
Unlike a pipe, though, it will not be possible to perform blocking I/O
operations.
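
The read-once, non-blocking behaviour can be modelled with a small in-memory
buffer. The Python sketch below is not the daemon's implementation; the class
name ``PipeBuffer`` is hypothetical, and one such buffer per direction is
assumed.

.. code-block:: python

  class PipeBuffer:
      """In-memory, non-blocking buffer with pipe-like read semantics."""

      def __init__(self):
          self._chunks = []

      def write(self, data):
          """Queue data for the other end of the channel."""
          self._chunks.append(bytes(data))

      def read(self):
          """Return and discard everything queued so far; never blocks."""
          data = b"".join(self._chunks)
          self._chunks = []
          return data


  # Host-to-instance direction: the host queues data, the instance's GET on
  # ``read`` drains it.
  host_to_instance = PipeBuffer()
  host_to_instance.write(b"start-install\n")
  first = host_to_instance.read()
  second = host_to_instance.read()
  print(first, second)

The second read returns an empty result immediately instead of waiting, which
is the difference from a real pipe noted above.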

    
The OS parameters will be accessible through a GET request at::

  http://169.254.169.254/ganeti/<version>/os/parameters.json

as a JSON serialized dictionary having the parameter name as the key, and the
pair ``(<value>, <visibility>)`` as the value, where ``<value>`` is the
user-provided value of the parameter, and ``<visibility>`` is either ``public``,
``private`` or ``secret``.
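
A consumer of this document might parse it as in the following Python sketch.
Since JSON has no tuple type, the pair is assumed here to be serialized as a
two-element array; the sample body and the function name
``parse_os_parameters`` are made up for illustration.

.. code-block:: python

  import json

  # Hypothetical response body of .../os/parameters.json
  body = '{"dhcp": ["yes", "public"], "root_password": ["hunter2", "secret"]}'

  VISIBILITIES = {"public", "private", "secret"}


  def parse_os_parameters(text):
      """Return {name: (value, visibility)}, rejecting unknown visibilities."""
      parsed = {}
      for name, (value, visibility) in json.loads(text).items():
          if visibility not in VISIBILITIES:
              raise ValueError("unknown visibility %r for %r" % (visibility, name))
          parsed[name] = (value, visibility)
      return parsed


  params = parse_os_parameters(body)
  # Keep only the values that may be shown freely.
  public_only = {n: v for n, (v, vis) in params.items() if vis == "public"}
  print(public_only)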

    
The installation scripts to be run inside the virtualized environment will be
available at::

  http://169.254.169.254/ganeti/<version>/os/scripts/<script_name>

where ``<script_name>`` is the name of the script.

    
Rationale
---------

The choice of using a network interface for instance-host communication, as
opposed to VirtIO, XenBus or other methods, is motivated by the desire for a
generic, hypervisor-independent way of creating a communication channel that
doesn't require unusual (para)virtualization drivers.
At the same time, a network interface was preferred over solutions involving
virtual floppy or USB devices because the latter tend to be detected and
configured by the guest operating systems, sometimes even in prominent positions
in the user interface, whereas it is fairly common to have an unconfigured
network interface in a system, usually without any negative side effects.

    
Installation process in a virtualized environment
+++++++++++++++++++++++++++++++++++++++++++++++++

In the new OS installation scenario, we distinguish between trusted and
untrusted code.

The trusted installation code maintains the behavior of the current one and
requires no modifications, with the scripts running on the node the instance is
being created on. The untrusted code is stored in a subdirectory of the OS
definition called ``untrusted``.  This directory contains scripts that are
equivalent to the already existing ones (``create``, ``export``, ``import``,
``rename``) but that will be run inside a virtualized environment, to protect
the host from malicious tampering.

The ``untrusted`` code is meant to either be untrusted itself, or to be trusted
code running operations that might be dangerous (such as mounting a
user-provided image).

By default, all new OS definitions will have to be explicitly marked as trusted
by the cluster administrator (with a new ``gnt-os modify`` command) before they
can run code on the host. Otherwise, only the untrusted part of the code will be
allowed to run, inside the virtual appliance. For backwards compatibility
reasons, when upgrading an existing cluster, all the installed OSes will be
marked as trusted, so that they can keep running with no changes.

    
In order to allow for the highest flexibility, if both a trusted and an
untrusted script are provided for the same operation (e.g. ``create``), both of
them will be executed at the same time, one on the host, and one inside the
installation appliance. They will be allowed to communicate with each other
through the already described communication mechanism, in order to orchestrate
their execution (e.g.: the untrusted code might execute the installation, while
the trusted one receives status updates from it and delivers them to a user
interface).

The cluster administrator will have an option to completely disable scripts
running on the host, leaving only the ones running in the VM.
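
The rules above on where scripts may run can be summarised in a short Python
sketch. It is one possible reading of this section, not Ganeti code: the
function ``plan_scripts`` and its argument names are hypothetical.

.. code-block:: python

  def plan_scripts(has_trusted, has_untrusted, os_is_trusted,
                   host_scripts_disabled=False):
      """Return where the scripts for one operation (e.g. create) would run."""
      # Host execution needs a trusted script, an OS marked as trusted by the
      # administrator, and host scripts not being disabled cluster-wide.
      run_on_host = has_trusted and os_is_trusted and not host_scripts_disabled
      # Scripts from the ``untrusted`` directory only run in the appliance.
      run_in_appliance = has_untrusted
      if not (run_on_host or run_in_appliance):
          raise ValueError("no script is allowed to run for this operation")
      return {"host": run_on_host, "appliance": run_in_appliance}


  print(plan_scripts(True, True, os_is_trusted=True))
  print(plan_scripts(True, True, os_is_trusted=False))
  print(plan_scripts(True, True, os_is_trusted=True, host_scripts_disabled=True))

When both entries are true, the two scripts run at the same time and
coordinate over the communication mechanism, as described above.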

    
Ganeti will provide a script to be run at install time that can be used to
create the virtualized environment that will perform the OS installation of new
instances.
This script will build a basic, debootstrapped Debian system including software
that will read the metadata, set up the environment variables and launch the
installation scripts inside the virtualized environment. The script will also
provide hooks for personalization.

It will also be possible to use other self-made virtualized environments, as
long as they connect to Ganeti over the described communication mechanism and
they know how to read and use the provided metadata to create a new instance.

    
While performing an installation in the virtualized environment, a
configurable timeout will be used to detect possible problems with the
installation process, and to kill the virtualized environment. The timeout will
be optional and set on a cluster basis by the administrator. If set, it will be
the total time allowed to set up an instance inside the appliance. It is mainly
meant as a safety measure to prevent an instance taken over by malicious scripts
from remaining available for a long time.
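
The timeout behaviour can be sketched in Python with ``subprocess.run``, which
kills the child process when its ``timeout`` expires. Ordinary Python
processes stand in for the virtualized environment here; the function name
``run_with_timeout`` is hypothetical, and ``None`` models the case where the
administrator has not set a timeout.

.. code-block:: python

  import subprocess
  import sys


  def run_with_timeout(command, timeout_seconds=None):
      """Run command; return its exit code, or None if killed on timeout."""
      try:
          return subprocess.run(command, timeout=timeout_seconds).returncode
      except subprocess.TimeoutExpired:
          # subprocess.run() kills the child before raising TimeoutExpired.
          return None


  # Stand-ins for the appliance: one process that hangs, one that exits at once.
  slow = [sys.executable, "-c", "import time; time.sleep(30)"]
  fast = [sys.executable, "-c", "pass"]

  killed = run_with_timeout(slow, timeout_seconds=1)
  finished = run_with_timeout(fast, timeout_seconds=30)
  print(killed, finished)

A real implementation would stop the appliance through the hypervisor rather
than by killing a local process, but the control flow would be similar.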

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: