===============================
Ganeti OS installation redesign
===============================

.. contents:: :depth: 3

This is a design document detailing a new OS installation procedure, which is
more secure, able to provide more features and easier to use for many common
tasks w.r.t. the current one.

    
Current state and shortcomings
==============================

As of Ganeti 2.10, each instance is associated with an OS definition. An OS
definition is a set of scripts (i.e., ``create``, ``export``, ``import``,
``rename``) that are executed with root privileges on the primary host of the
instance.  These scripts are responsible for performing all the OS-related
tasks, namely, creating an instance, setting up an operating system on the
instance's disks, exporting/importing the instance, and renaming the instance.

These scripts receive, through environment variables, a fixed set of instance
parameters (such as the hypervisor, the name of the instance, the number of
disks and their location) and a set of user defined parameters.  Both the
instance and user defined parameters are written in the configuration file of
Ganeti, to allow future reinstalls of the instance, and in various log files,
namely:

* node daemon log file: contains DEBUG strings of the ``/os_validate``,
  ``/instance_os_add`` and ``/instance_start`` RPC calls.

* master daemon log file: DEBUG strings related to the same RPC calls are stored
  here as well.

* commands log: the CLI commands that create a new instance, including their
  parameters, are logged here.

* RAPI log: the RAPI commands that create a new instance, including their
  parameters, are logged here.

* job logs: the job files stored in the job queue, or in its archive, contain
  the parameters.

    
The current situation presents a number of shortcomings:

* Having the installation scripts run as root on the nodes does not allow
  user-defined OS scripts, as they would pose a huge security risk.
  Furthermore, even a script without malicious intentions might end up
  disrupting a node because of a bug.

* Ganeti cannot be used to create instances starting from user provided disk
  images: even in the (hypothetical) case in which the scripts are completely
  secure and run not by root but by an unprivileged user with only the power to
  mount arbitrary files as disk images, this is still a security issue. It has
  been proven that a carefully crafted file system might exploit kernel
  vulnerabilities to gain control of the system. Therefore, directly mounting
  images on the Ganeti nodes is not an option.

* There is no way to inject files into an existing disk image. A common use case
  is for the system administrator to provide a standard image of the system, to
  be later personalized with the network configuration, private keys identifying
  the machine, ssh keys of the users, and so on. A possible workaround would be
  for the scripts to mount the image (only if this is trusted!) and to receive
  the configurations and ssh keys as user defined OS parameters. Unfortunately,
  this is also not an option for security sensitive material (such as the ssh
  keys) because the OS parameters are stored in many places on the system, as
  already described above.

* Most other virtualization software allows only instance images, but no
  installation scripts. This difference makes the interaction between Ganeti and
  other software difficult.

    
Proposed changes
================

In order to fix the shortcomings of the current state, we plan to introduce the
following changes.

OS parameter categories
+++++++++++++++++++++++

Change the OS parameters to have three categories:

* ``public``: the current behavior. The parameter is logged and stored freely.

* ``private``: the parameter is saved inside the Ganeti configuration (to allow
  for instance reinstall) but it is not shown in logs, job logs, or passed back
  via RAPI.

* ``secret``: the parameter is not saved inside the Ganeti configuration.
  Reinstalls are impossible unless the data is passed again. The parameter will
  not appear in any log file. When a functionality is performed jointly by
  multiple daemons (such as MasterD and LuxiD), currently Ganeti sometimes
  serializes jobs on disk and later reloads them. Secret parameters will not be
  serialized to disk. They will be passed around as part of the LUXI calls
  exchanged by the daemons, and only kept in memory, in order to reduce their
  accessibility as much as possible. In case of failure of the master node,
  these parameters will be lost and cannot be recovered because they are not
  serialized. As a result, the job cannot be taken over by the new master.  This
  is an expected and accepted side effect of jobs with secret parameters: if
  they fail, they'll have to be restarted manually.
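
As a rough illustration of the three categories, the following Python sketch
shows how a set of parameters might be filtered before being logged or written
to the configuration.  The names ``REDACTED``, ``for_log`` and ``for_config``
are hypothetical and do not correspond to existing Ganeti code; parameters are
assumed to be held as ``(value, visibility)`` pairs.

.. code-block:: python

  # Hypothetical sketch; none of these names are part of Ganeti.
  REDACTED = "<redacted>"


  def for_log(params):
      """Return a copy that is safe to log: only public values are shown."""
      return {
          name: (value if visibility == "public" else REDACTED)
          for name, (value, visibility) in params.items()
      }


  def for_config(params):
      """Return the parameters that may be saved in the configuration.

      Public and private parameters are kept to allow reinstalls; secret
      parameters are dropped and must be supplied again.
      """
      return {
          name: (value, visibility)
          for name, (value, visibility) in params.items()
          if visibility != "secret"
      }


  # Example values are made up for illustration.
  params = {
      "dist": ("wheezy", "public"),
      "root_key": ("example-private-key", "private"),
      "password": ("example-password", "secret"),
  }

With these sample values, ``for_log(params)`` exposes only ``dist``, while
``for_config(params)`` keeps ``dist`` and ``root_key`` and drops ``password``.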

    
Metadata
++++++++

In order to allow metadata to be sent inside the instance, a communication
mechanism between the instance and the host will be created.  This mechanism
will be bidirectional (e.g.: to allow the setup process going on inside the
instance to communicate its progress to the host). Each instance will have
access exclusively to its own metadata, and it will be only able to communicate
with its host over this channel.  This is the approach followed by the
``cloud-init`` tool and more details will be provided in the `Communication
mechanism`_ and `Metadata service`_ sections.

    
Installation procedure
++++++++++++++++++++++

A new installation procedure will be introduced.  There will be two sets of
parameters, namely, installation parameters, which are used mainly for installs
and reinstalls, and execution parameters, which are used in all the other runs
that are not part of an installation procedure.  Also, it will be possible to
use an installation medium and/or run the OS scripts in an optional virtualized
environment, and optionally use a personalization package.  This section details
all of these options.

The set of installation parameters will allow, for example, to attach an
installation floppy/cdrom/network, change the boot device order, or specify a
disk image to be used.  Through this set of parameters, the administrator will
have to provide the hypervisor a location for an installation medium for the
instance (e.g., a boot disk, a network image, etc.).  This medium will carry out
the installation of the instance onto the instance's disks and will then be
responsible for getting the parameters for configuring the instance, such as
network interfaces, IP address, and hostname.  These parameters are taken from
the metadata.  The installation parameters will be stored in the configuration
of Ganeti and used in future reinstalls, but not during normal execution.

The instance is reinstalled using the same installation parameters from the
first installation.  However, it will be the administrator's responsibility to
ensure that the installation medium is still available at the proper location
when a reinstall occurs.

The parameter ``--os-parameters`` can still be used to specify the OS
parameters.  However, without OS scripts, Ganeti cannot do more than a syntactic
check to validate the supplied OS parameter string.  As a result, this string
will be passed directly to the instance as part of the metadata.  If OS scripts
are used and the installation procedure is running inside a virtualized
environment, Ganeti will take these parameters from the metadata and pass them
to the OS scripts as environment variables.

    
Ganeti allows the following installation options:

* Use a disk image:

  Currently, it is already possible to specify an installation medium, such as,
  a cdrom, but not a disk image.  Therefore, a new parameter ``--os-image`` will
  be used to specify the location of a disk image which will be dumped to the
  instance's first disk before the instance is started.  The location of the
  image can be a URL and, if this is the case, Ganeti will download this image.

* Run OS scripts:

  The parameter ``--os-type`` (short version: ``-o``), is currently used to
  specify the OS scripts.  This parameter will still be used to specify the OS
  scripts with the difference that these scripts may optionally run inside a
  virtualized environment for safety reasons, depending on whether they are
  trusted or not.  For more details on trusted and untrusted OS scripts, refer
  to the `Installation process in a virtualized environment`_ section.  Note
  that this parameter will become optional thus allowing a user to create an
  instance specifying only, for example, a disk image or a cdrom image to boot
  from.

    
* Personalization package

  As part of the instance creation command, it will be possible to indicate a
  URL for a "personalization package", which is an archive containing a set of
  files meant to be overlaid on top of the OS file system at the end of the
  setup process and before the VM is started for the first time in normal mode.
  Ganeti will provide a mechanism for receiving and unpacking this archive,
  independently of whether the installation is being performed inside the
  virtualized environment or not.

  The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or
  ``.tgz``) and contain the files according to the directory structure that will
  be recreated on the installation disk.  Files contained in this archive will
  overwrite files with the same path created during the installation procedure
  (if any).  The URL of the "personalization package" will have to specify an
  extension to identify the file format (in order to allow for more formats to
  be supported in the future).  The URL will be stored as part of the
  configuration of the instance (therefore, the URL should not contain
  confidential information, but the files available there can).

  It is up to the system administrator to ensure that a package is actually
  available at that URL at install and reinstall time.  The contents of the
  package are allowed to change.  E.g.: a system administrator might create a
  package containing the private keys of the instance being created.  When the
  instance is reinstalled, a new package with new keys can be made available
  there, thus allowing instance reinstall without the need to store keys.  A
  username and a password can be specified together with the URL.  If the URL is
  an HTTP(S) URL, they will be used as basic access authentication credentials
  to access that URL.  The username and password will not be saved in the
  config, and will have to be provided again in case a reinstall is requested.

  The downloaded personalization package will not be stored locally on the node
  for longer than it is needed while unpacking it and adding its files to the
  instance being created.  The personalization package will be overlaid on top
  of the instance filesystem after the scripts that created it have been
  executed.  In order for the files in the package to be automatically overlaid
  on top of the instance filesystem, it is required that the appliance is
  actually able to mount the instance's disks.  As a result, this will not work
  for every filesystem.

    
* Combine a disk image, OS scripts, and a personalization package

  It will be possible to combine a disk image, OS scripts, and a
  personalization package, either with or without a virtualized environment
  (see the exception below). At least an installation medium or OS scripts
  should be specified.

  The disk image of the actual virtual appliance, which bootstraps the virtual
  environment used in the installation procedure, will be read only, so that a
  pristine copy of the appliance can be started every time a new instance needs
  to be created and to further increase security.  The data the instance needs
  to write at runtime will only be stored in RAM and disappear as soon as the
  instance is stopped.

  The parameter ``--enable-safe-install=yes|no`` will be used to give the
  administrator control over whether to use a virtualized environment for the
  installation procedure.  By default, a virtualized environment will be used.
  Note that some feature combinations, such as using untrusted scripts, will
  require the virtualized environment.  In this case, Ganeti will not allow
  disabling the virtualized environment.
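
The handling of the personalization package described above can be sketched in
Python as follows.  This is only an illustration under stated assumptions: the
function names ``is_supported_package`` and ``overlay_package`` are
hypothetical, the archive is assumed to be already downloaded into memory, and
the target directory stands for the mounted instance filesystem.

.. code-block:: python

  import io
  import os
  import tarfile

  # Extensions named in this design; more formats may be added later.
  SUPPORTED_EXTENSIONS = (".tar.gz", ".tgz")


  def is_supported_package(url):
      """Check the extension of the URL path, ignoring any query string."""
      path = url.split("?", 1)[0]
      return path.endswith(SUPPORTED_EXTENSIONS)


  def overlay_package(archive_bytes, target_root):
      """Unpack a TAR-GZIP archive on top of ``target_root``.

      Files already present at the same path are overwritten.  Members whose
      name would resolve outside ``target_root`` are rejected.  This check
      covers member names only; links inside the archive are not vetted.
      """
      root = os.path.realpath(target_root)
      with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
          for member in tar.getmembers():
              dest = os.path.realpath(os.path.join(root, member.name))
              if dest != root and not dest.startswith(root + os.sep):
                  raise ValueError("unsafe path in archive: %s" % member.name)
          tar.extractall(root)

Keeping the archive in memory, rather than writing it to the node's disk, is
one possible way to honour the requirement that the package is not stored
locally for longer than needed.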

    
Implementation
==============

The implementation of this design will happen as an ordered sequence of steps,
of increasing impact on the system and, in some cases, dependent on each other:

#. Private and secret instance parameters
#. Communication mechanism between host and instance
#. Metadata service
#. Personalization package (inside a virtualization environment)
#. Instance creation via a disk image
#. Instance creation inside a virtualized environment

Some of these steps need to be more deeply specified w.r.t. what is already
written in the `Proposed changes`_ Section. Extra details will be provided in
the following subsections.

    
Communication mechanism
+++++++++++++++++++++++

The communication mechanism will be an exclusive, generic, bidirectional
communication channel between Ganeti hosts and guests.

exclusive
  The communication mechanism allows communication between a guest and its host,
  but it does not allow a guest to communicate with other guests or reach the
  outside world.

generic
  The communication mechanism allows a guest to reach any service on the host,
  not just the metadata service.  Examples of valid communication include, but
  are not limited to, access to the metadata service, send commands to Ganeti,
  request changes to parameters, such as, those related to the distribution
  upgrades, and let Ganeti control a helper instance, such as, the one for
  performing OS installs inside a safe environment.

bidirectional
  The communication mechanism allows communication to be initiated from either
  party, namely, from a host to a guest or guest to host.

    
Note that Ganeti will allow communication with any service (e.g., daemon) running
on the host and, as a result, Ganeti will not be responsible for ensuring that
only the metadata service is reachable.  It is the responsibility of each system
administrator to ensure that the extra firewalling and routing rules specified
on the host provide the necessary protection on a given Ganeti installation and,
at the same time, do not accidentally override the behaviour hereby described
which makes the communication between the host and the guest exclusive, generic,
and bidirectional, unless intended.

The communication mechanism will be enabled automatically during an installation
procedure that requires a virtualized environment, but, for backwards
compatibility, it will be disabled when the instance is running normally, unless
explicitly requested.  Specifically, a new parameter ``--communication=yes|no``
(short version: ``-C``) will be added to ``gnt-instance add`` and ``gnt-instance
modify``.  This parameter will determine whether the communication mechanism is
enabled for a particular instance.  The value of this parameter will be saved as
part of the instance's configuration.

    
The communication mechanism will be implemented through network interfaces on
the host and the guest, and Ganeti will be responsible for the host side,
namely, creating a TAP interface for each guest and configuring these interfaces
to have name ``gnt.com.%d``, where ``%d`` is a unique number within the host
(e.g., ``gnt.com.0`` and ``gnt.com.1``), IP address ``169.254.169.254``, and
netmask ``255.255.255.255``.  The interface's name allows DHCP servers to
recognize which interfaces are part of the communication mechanism.

This network interface will be connected to the guest's last network interface,
which is meant to be used exclusively for the communication mechanism and is
defined after all the user-defined interfaces.  The last interface was chosen
(as opposed to the first one, for example) because the first interface is
generally understood as the main gateway out, and also because it minimizes the
impact on existing systems, for example, in a scenario where the system
administrator has a running cluster and wants to enable the communication
mechanism for already existing instances, which might have been created with
older versions of Ganeti.  Further, DBus should assist in keeping the guest
network interfaces more stable.

On the guest side, each instance will have its own MAC address and IP address.
Both the guest's MAC address and IP address must be unique within a single
cluster.  An IP is unique within a single cluster, and not within a single host,
in order to minimize disruption of connectivity, for example, during live
migration, in particular since an instance is not aware when it changes host.
Unfortunately, a side-effect of this decision is that a cluster can have at
most as many instances (with communication enabled) as fit in a ``/16``
network.  If necessary to overcome this limit, it should be possible to allow
different networks to be configured link-local only.
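
To make the cluster-wide uniqueness constraint concrete, the following Python
sketch picks a free guest address.  It assumes that guest addresses are taken
from the link-local network ``169.254.0.0/16`` and that the host address
``169.254.169.254`` is never handed out; the function name
``allocate_guest_ip`` is hypothetical.

.. code-block:: python

  import ipaddress

  # Assumption: guests share one link-local /16 across the whole cluster.
  LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
  HOST_ADDRESS = ipaddress.ip_address("169.254.169.254")


  def allocate_guest_ip(in_use):
      """Return the first free address in the /16, skipping the host's."""
      taken = {ipaddress.ip_address(addr) for addr in in_use}
      for candidate in LINK_LOCAL.hosts():
          if candidate != HOST_ADDRESS and candidate not in taken:
              return str(candidate)
      raise RuntimeError("no free communication address left in the cluster")

Because ``in_use`` has to list the addresses of all instances in the cluster,
not only those on one node, an instance keeps its address when it migrates.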

    
The guest will use the DHCP protocol on its last network interface to contact a
DHCP server running on the host and thus determine its IP address.  The DHCP
server is configured, started, and stopped, by Ganeti and it will be listening
exclusively on the TAP network interfaces of the guests in order not to
interfere with a potential DHCP server running on the same host.  Furthermore,
the DHCP server will only recognize MAC and IP address pairs that have been
approved by Ganeti.

The TAP network interfaces created for each guest share the same IP address.
Therefore, it will be necessary to extend the routing table with rules specific
to each guest.  This can be achieved with the following command, which takes the
guest's unique IP address and its TAP interface::

  route add -host <ip> dev <ifname>

This rule has the additional advantage of preventing guests from trying to lease
IP addresses from the DHCP server other than the one that has been assigned to
them by Ganeti.  The guest could lie about its MAC address to the DHCP server
and try to steal another guest's IP address; however, this routing rule will
block traffic (i.e., IP packets carrying the wrong IP) from the DHCP server to
the malicious guest.  Similarly, the guest could lie about its IP address (i.e.,
simply assign a predefined IP address, perhaps from another guest); however,
replies from the host will not be routed to the malicious guest.
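
The per-guest host setup can be sketched in Python as below.  The helper names
``tap_name`` and ``host_route_command`` are hypothetical; the interface naming
scheme and the ``route`` invocation are the ones given in this section.

.. code-block:: python

  # Values taken from this design; the helpers themselves are illustrative.
  HOST_IP = "169.254.169.254"
  HOST_NETMASK = "255.255.255.255"


  def tap_name(index):
      """Name of the host-side TAP interface for the given unique number."""
      return "gnt.com.%d" % index


  def host_route_command(guest_ip, ifname):
      """Build the argument list for the guest-specific host route."""
      return ["route", "add", "-host", guest_ip, "dev", ifname]

For example, ``host_route_command("169.254.0.1", tap_name(0))`` yields the
arguments for ``route add -host 169.254.0.1 dev gnt.com.0``.  Building the
command as a list allows it to be executed without going through a shell.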

    
This routing rule ensures that the communication channel is exclusive but, as
mentioned before, it will not prevent guests from accessing any service on the
host.  It is the system administrator's responsibility to employ the necessary
``iptables`` rules.  In order to achieve this, Ganeti will provide ``ifup``
hooks associated with the guest network interfaces which will give system
administrators the opportunity to customize their own ``iptables``, if
necessary.  Ganeti will also provide examples of such hooks.  However, these are
meant to be personalized to each Ganeti installation and not to be taken as
production ready scripts.

For KVM, an instance will be started with a unique MAC address and the file
descriptor for the TAP network interface meant to be used by the communication
mechanism.  Ganeti will be responsible for generating a unique MAC address for
the guest, opening the TAP interface, and passing its file descriptor to KVM::

  kvm -net nic,macaddr=<mac> -net tap,fd=<tap-fd> ...
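
A minimal Python sketch of how these KVM arguments might be assembled is shown
below.  The function ``kvm_network_args`` is a hypothetical helper; generating
the MAC address and opening the TAP device are assumed to happen elsewhere.

.. code-block:: python

  def kvm_network_args(mac, tap_fd):
      """Return the KVM arguments for the communication interface.

      ``mac`` is the guest's unique MAC address and ``tap_fd`` the already
      opened file descriptor of its TAP interface.
      """
      return ["-net", "nic,macaddr=%s" % mac, "-net", "tap,fd=%d" % tap_fd]

If KVM is launched through Python's ``subprocess`` module, the descriptor also
has to be inherited by the child process, for instance by listing it in the
``pass_fds`` argument of ``subprocess.Popen``.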

    
For Xen, a network interface will be created on the host (using the ``vif``
parameter of the Xen configuration file).  Each instance will have its
corresponding ``vif`` network interface on the host.  The ``vif-route`` script
of Xen might be helpful in implementing this.

    
dnsmasq
+++++++

The previous section describes the communication mechanism and explains the role
of the DHCP server.  Note that any DHCP server can be used in the implementation
of the communication mechanism.  However, the DHCP server employed should not
violate the properties described in the previous section, which state that the
communication mechanism should be exclusive, generic, and bidirectional, unless
this is intentional.

In our experiments, we have used dnsmasq.  In this section, we describe how to
properly configure dnsmasq to work on a given Ganeti installation.  This is
particularly important if, in this Ganeti installation, dnsmasq will share the
node with one or more DHCP servers running in parallel.

First, it is important to become familiar with the operational modes of dnsmasq,
which are well explained in the `FAQ
<http://www.thekelleys.org.uk/dnsmasq/docs/FAQ>`_ under the question ``What are
these strange "bind-interface" and "bind-dynamic" options?``.  The rest of this
section assumes the reader is familiar with these operational modes.

    
bind-dynamic
  dnsmasq SHOULD be configured in the ``bind-dynamic`` mode (if supported) in
  order to allow other DHCP servers to run on the same node.  In this mode,
  dnsmasq can listen on the TAP interfaces for the communication mechanism by
  listening on the TAP interfaces that match the pattern ``gnt.com.*`` (e.g.,
  ``interface=gnt.com.*``).  For extra safety, interfaces matching the pattern
  ``eth*`` and the name ``lo`` should be configured such that dnsmasq will
  always ignore them (e.g., ``except-interface=eth*`` and
  ``except-interface=lo``).

bind-interfaces
  dnsmasq MAY be configured in the ``bind-interfaces`` mode (if supported) in
  order to allow other DHCP servers to run on the same node.  Unfortunately,
  because dnsmasq cannot dynamically adjust to TAP interfaces that are created
  and destroyed by the system, dnsmasq must be restarted with a new
  configuration file each time an instance is created or destroyed.

  Also, the interfaces cannot be patterns, such as, ``gnt.com.*``.  Instead, the
  interfaces must be explicitly specified, for example,
  ``interface=gnt.com.0,gnt.com.1``.  Moreover, dnsmasq cannot bind to the TAP
  interfaces if they all have the same IPv4 address.  As a result, it is
  necessary to configure these TAP interfaces to enable IPv6 and an IPv6 address
  must be assigned to them.

wildcard
  dnsmasq CANNOT be configured in the ``wildcard`` mode if there is
  (at least) another DHCP server running on the same node.
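
The two usable modes can be summarized by the configuration fragments they
require.  The Python sketch below generates those fragments from the options
quoted above; the function names are hypothetical, and a real configuration
would also need the DHCP range and the approved MAC and IP address pairs,
which are left out here.

.. code-block:: python

  def dnsmasq_bind_dynamic_config():
      """Options for the ``bind-dynamic`` mode; no restart on TAP changes."""
      lines = [
          "bind-dynamic",
          "interface=gnt.com.*",
          # Extra safety: never serve the regular interfaces.
          "except-interface=eth*",
          "except-interface=lo",
      ]
      return "\n".join(lines) + "\n"


  def dnsmasq_bind_interfaces_config(tap_names):
      """Options for the ``bind-interfaces`` mode.

      Patterns are not allowed in this mode, so every TAP interface is
      listed explicitly and the file must be regenerated (and dnsmasq
      restarted) whenever an instance is created or destroyed.
      """
      lines = ["bind-interfaces", "interface=" + ",".join(tap_names)]
      return "\n".join(lines) + "\n"

The difference in the function signatures mirrors the operational difference:
only the ``bind-interfaces`` variant depends on the current set of instances.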

    
Metadata service
++++++++++++++++

An instance will be able to reach the metadata service on
``169.254.169.254:80`` in order to, for example, retrieve its metadata.  This
IP address and port were chosen for compatibility with the OpenStack and Amazon
EC2 metadata service.  The metadata service will be provided by a single
daemon, which will determine the source instance for a given request and reply
with the metadata pertaining to that instance.

Where possible, the metadata will be provided in a way compatible with Amazon
EC2, at::

  http://169.254.169.254/<version>/meta-data/*

Ganeti-specific metadata, that does not fit this structure, will be provided
at::

  http://169.254.169.254/ganeti/<version>/meta_data.json

where ``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to
indicate the most recent available protocol version.
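
As an illustration of the URL scheme, the following Python sketch validates
the ``<version>`` component and builds a Ganeti-specific URL.  The helper
names ``check_version`` and ``ganeti_url`` are hypothetical.

.. code-block:: python

  import datetime
  import re

  BASE_URL = "http://169.254.169.254"


  def check_version(version):
      """Accept ``latest`` or a date in YYYY-MM-DD format."""
      if version == "latest":
          return version
      if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
          raise ValueError("malformed version: %r" % version)
      # Raises ValueError for impossible dates such as month 13.
      datetime.datetime.strptime(version, "%Y-%m-%d")
      return version


  def ganeti_url(version, resource):
      """Return the URL of a Ganeti-specific resource."""
      return "%s/ganeti/%s/%s" % (BASE_URL, check_version(version), resource)

For instance, ``ganeti_url("latest", "meta_data.json")`` gives
``http://169.254.169.254/ganeti/latest/meta_data.json``.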

    
If needed in the future, this structure also allows us to support OpenStack's
metadata at::

  http://169.254.169.254/openstack/<version>/meta_data.json

A bi-directional, pipe-like communication channel will also be provided.  The
instance will be able to receive data from the host by a GET request at::

  http://169.254.169.254/ganeti/<version>/read

and to send data to the host by a POST request at::

  http://169.254.169.254/ganeti/<version>/write

As in a pipe, once the data are read, they will not be in the buffer anymore, so
subsequent GET requests to ``read`` will not return the same data.  However,
unlike a pipe, it will not be possible to perform blocking I/O operations.
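
The intended semantics of one direction of this channel can be modelled with
a small Python class.  This is only a sketch of the behaviour described above,
with a hypothetical class name; it says nothing about how the metadata daemon
will actually store the data.

.. code-block:: python

  class DrainOnReadBuffer:
      """Non-blocking buffer whose contents are consumed when read."""

      def __init__(self):
          self._chunks = []

      def write(self, data):
          """Queue data for the other side (e.g. the host writing to a guest)."""
          self._chunks.append(bytes(data))

      def read(self):
          """Return everything queued so far and empty the buffer.

          Returns ``b""`` immediately when nothing is pending, because
          blocking reads are not supported.
          """
          data = b"".join(self._chunks)
          self._chunks = []
          return data

A full channel would use two such buffers per instance, one drained by the
guest's GET requests to ``read`` and one filled by its POST requests to
``write``.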

    
The OS parameters will be accessible through a GET request at::

  http://169.254.169.254/ganeti/<version>/os/parameters.json

as a JSON serialized dictionary having the parameter name as the key, and the
pair ``(<value>, <visibility>)`` as the value, where ``<value>`` is the
user-provided value of the parameter, and ``<visibility>`` is either ``public``,
``private`` or ``secret``.
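
A guest-side consumer of this document could look like the Python sketch
below.  It assumes that each pair is serialized as a two-element JSON array,
since JSON has no tuple type, and that parameters are exported to the OS
scripts with an ``OSP_`` prefix; both points, the sample document and the
function names are assumptions made for illustration.

.. code-block:: python

  import json

  VISIBILITIES = ("public", "private", "secret")

  # Made-up sample of what ``os/parameters.json`` might contain.
  SAMPLE = '{"dist": ["wheezy", "public"], "root_password": ["s3cret", "secret"]}'


  def parse_os_parameters(text):
      """Parse the JSON document into ``{name: (value, visibility)}``."""
      params = {}
      for name, (value, visibility) in json.loads(text).items():
          if visibility not in VISIBILITIES:
              raise ValueError("unknown visibility for %s: %r"
                               % (name, visibility))
          params[name] = (value, visibility)
      return params


  def as_environment(params, prefix="OSP_"):
      """Turn the parameters into environment variables for the OS scripts."""
      return {prefix + name.upper(): str(value)
              for name, (value, _visibility) in params.items()}

With the sample document, ``as_environment(parse_os_parameters(SAMPLE))``
contains ``OSP_DIST`` set to ``wheezy``.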

    
The installation scripts to be run inside the virtualized environment will be
available at::

  http://169.254.169.254/ganeti/<version>/os/scripts/<script_name>

where ``<script_name>`` is the name of the script.

    
Rationale
---------

The choice of using a network interface for instance-host communication, as
opposed to VirtIO, XenBus or other methods, is due to the desire to have a
generic, hypervisor-independent way of creating a communication channel that
doesn't require unusual (para)virtualization drivers.
At the same time, a network interface was preferred over solutions involving
virtual floppy or USB devices because the latter tend to be detected and
configured by the guest operating systems, sometimes even in prominent positions
in the user interface, whereas it is fairly common to have an unconfigured
network interface in a system, usually without any negative side effects.

    
Installation process in a virtualized environment
+++++++++++++++++++++++++++++++++++++++++++++++++

In the new OS installation scenario, we distinguish between trusted and
untrusted code.

The trusted installation code maintains the behavior of the current one and
requires no modifications, with the scripts running on the node the instance is
being created on. The untrusted code is stored in a subdirectory of the OS
definition called ``untrusted``.  This directory contains scripts that are
equivalent to the already existing ones (``create``, ``export``, ``import``,
``rename``) but that will be run inside a virtualized environment, to protect
the host from malicious tampering.

The ``untrusted`` code is meant to either be untrusted itself, or to be trusted
code running operations that might be dangerous (such as mounting a
user-provided image).

By default, all new OS definitions will have to be explicitly marked as trusted
by the cluster administrator (with a new ``gnt-os modify`` command) before they
can run code on the host. Otherwise, only the untrusted part of the code will be
allowed to run, inside the virtual appliance. For backwards compatibility
reasons, when upgrading an existing cluster, all the installed OSes will be
marked as trusted, so that they can keep running with no changes.

In order to allow for the highest flexibility, if both a trusted and an
untrusted script are provided for the same operation (e.g., ``create``), both of
them will be executed at the same time, one on the host, and one inside the
installation appliance. They will be allowed to communicate with each other
through the already described communication mechanism, in order to orchestrate
their execution (e.g.: the untrusted code might execute the installation, while
the trusted one receives status updates from it and delivers them to a user
interface).

The cluster administrator will have an option to completely disable scripts
running on the host, leaving only the ones running in the VM.
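
The rules above can be condensed into a small decision function.  The Python
sketch below is one possible reading of them; the function ``plan_scripts``
and its arguments are hypothetical names.

.. code-block:: python

  def plan_scripts(os_trusted, has_trusted_script, has_untrusted_script,
                   host_scripts_disabled=False):
      """Return where the scripts for one operation would run.

      The result is a list containing ``"host"`` and/or ``"appliance"``.
      """
      locations = []
      # Host-side code runs only for OS definitions marked as trusted, and
      # only if the administrator has not disabled host scripts entirely.
      if has_trusted_script and os_trusted and not host_scripts_disabled:
          locations.append("host")
      # Code from the ``untrusted`` subdirectory always runs in the appliance.
      if has_untrusted_script:
          locations.append("appliance")
      return locations

When both locations are returned, the two scripts run at the same time and
coordinate through the communication mechanism.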

    
Ganeti will provide a script to be run at install time that can be used to
create the virtualized environment that will perform the OS installation of new
instances.
This script will build a debootstrapped basic Debian system including software
that will read the metadata, set up the environment variables and launch the
installation scripts inside the virtualized environment. The script will also
provide hooks for personalization.

It will also be possible to use other self-made virtualized environments, as
long as they connect to Ganeti over the described communication mechanism and
they know how to read and use the provided metadata to create a new instance.

While performing an installation in the virtualized environment, a customizable
timeout will be used to detect possible problems with the installation process,
and to kill the virtualized environment. The timeout will be optional and set on
a cluster basis by the administrator. If set, it will be the total time allowed
to set up an instance inside the appliance. It is mainly meant as a safety
measure to prevent an instance taken over by malicious scripts from remaining
available for a long time.
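
The timeout behaviour can be illustrated with the Python sketch below, in
which an ordinary child process stands in for the installation appliance.
The function ``run_with_timeout`` is hypothetical and the use of a plain
process is an assumption made only to keep the example self-contained.

.. code-block:: python

  import subprocess


  def run_with_timeout(argv, timeout=None):
      """Run ``argv`` and kill it if it exceeds ``timeout`` seconds.

      ``timeout=None`` means that no limit is configured.  Returns the exit
      code, or ``None`` if the process had to be killed.
      """
      try:
          return subprocess.run(argv, timeout=timeout).returncode
      except subprocess.TimeoutExpired:
          # subprocess.run() has already killed and reaped the child here.
          return None

A result of ``None`` would correspond to a failed installation whose appliance
was stopped by the safety measure.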

    
Alternatives to design and implementation
=========================================

This section lists alternatives to design and implementation, which came up
during the development of this design document, that will not be implemented.
Please read carefully through the limitations and security concerns of each of
these alternatives.

    
Port forwarding in KVM
++++++++++++++++++++++

The communication mechanism could have been implemented in KVM using guest port
forwarding, as opposed to network interfaces.  There are two alternatives in
KVM's guest port forwarding, namely, creating a forwarding device, such as, a
TCP/IP connection, or executing a command.  However, we have determined that
both of these options are not viable.

A TCP/IP forwarding device can be created through the following KVM invocation::

  kvm -net nic -net \
    user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,
    guestfwd=tcp:169.254.169.254:80-tcp:127.0.0.1:8080 ...

This invocation even has the advantage that it can block undesired traffic
(i.e., traffic that is not explicitly specified in the arguments) and it can
remap ports, which would have allowed the metadata service daemon to run on port
8080 instead of 80.  However, in this scheme, KVM opens the TCP connection only
once, when it is started, and, if the connection breaks, KVM will not
reestablish the connection.  Furthermore, opening the TCP connection only once
interferes with the HTTP protocol, which needs to dynamically establish and
close connections.

The alternative to the TCP/IP forwarding device is to execute a command.  The
KVM invocation for this is, for example, the following::

  kvm -net nic -net \
    "user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,
    guestfwd=tcp:169.254.169.254:80-netcat 127.0.0.1 8080" ...

The advantage of this approach is that the command is executed each time the
guest initiates a connection.  This is the ideal situation; however, it is only
supported in KVM 1.2 and above, and, therefore, not viable because we want to
provide support for at least KVM version 1.0, which is the version provided by
Ubuntu LTS.

    
Alternatives to the DHCP server
+++++++++++++++++++++++++++++++

There are alternatives to using the DHCP server, for example, by assigning a
fixed IP address to guests, such as, the IP address ``169.254.169.253``.
However, this introduces a routing problem, namely, how to route incoming
packets from the same source IP to the host.  This problem can be overcome in a
number of ways.

The first solution is to use NAT to translate the incoming guest IP address, for
example, ``169.254.169.253``, to a unique IP address, for example,
``169.254.0.1``.  Given that NAT through ``ip rule`` is deprecated, users can
resort to ``iptables``.  Note that this has not yet been tested.

Another option, which has been tested, but only in a prototype, is to connect
the TAP network interfaces of the guests to a bridge.  The bridge takes the
configuration from the TAP network interfaces, namely, IP address
``169.254.169.254`` and netmask ``255.255.255.255``, thus leaving those
interfaces without an IP address.  Note that in this setting, guests will be
able to reach each other, therefore, if necessary, additional ``iptables`` rules
can be put in place to prevent it.