===============================
Ganeti OS installation redesign
===============================

.. contents:: :depth: 3

This is a design document detailing a new OS installation procedure that is
more secure, provides more features, and makes many common tasks easier than
the current one.

Current state and shortcomings
==============================

As of Ganeti 2.10, each instance is associated with an OS definition. An OS
definition is a set of scripts (i.e., ``create``, ``export``, ``import``,
``rename``) that are executed with root privileges on the primary host of the
instance.  These scripts are responsible for performing all the OS-related
tasks, namely, creating an instance, setting up an operating system on the
instance's disks, exporting/importing the instance, and renaming the instance.

These scripts receive, through environment variables, a fixed set of instance
parameters (such as the hypervisor, the name of the instance, the number of
disks and their location) and a set of user-defined parameters.  Both the
instance and user-defined parameters are written to the Ganeti configuration
file, to allow future reinstalls of the instance, and to various log files,
namely:

* node daemon log file: contains DEBUG strings of the ``/os_validate``,
  ``/instance_os_add`` and ``/instance_start`` RPC calls.

* master daemon log file: DEBUG strings related to the same RPC calls are stored
  here as well.

* commands log: the CLI commands that create a new instance, including their
  parameters, are logged here.

* RAPI log: the RAPI commands that create a new instance, including their
  parameters, are logged here.

* job logs: the job files stored in the job queue, or in its archive, contain
  the parameters.

The current situation presents a number of shortcomings:

* Having the installation scripts run as root on the nodes does not allow
  user-defined OS scripts, as they would pose a huge security risk.
  Furthermore, even a script without malicious intentions might end up
  disrupting a node because of a bug.

* Ganeti cannot be used to create instances starting from user-provided disk
  images: even in the (hypothetical) case in which the scripts are completely
  secure and run not by root but by an unprivileged user with only the power to
  mount arbitrary files as disk images, this is still a security issue. It has
  been shown that a carefully crafted file system might exploit kernel
  vulnerabilities to gain control of the system. Therefore, directly mounting
  images on the Ganeti nodes is not an option.

* There is no way to inject files into an existing disk image. A common use case
  is for the system administrator to provide a standard image of the system, to
  be later personalized with the network configuration, private keys identifying
  the machine, SSH keys of the users, and so on. A possible workaround would be
  for the scripts to mount the image (only if it is trusted!) and to receive
  the configurations and SSH keys as user-defined OS parameters. Unfortunately,
  this is also not an option for security-sensitive material (such as the SSH
  keys) because the OS parameters are stored in many places on the system, as
  already described above.

* Most other virtualization software allows only instance images, not
  installation scripts. This difference makes the interaction between Ganeti and
  other software difficult.

Proposed changes
================

In order to fix the shortcomings of the current state, we plan to introduce the
following changes.

OS parameter categories
+++++++++++++++++++++++

Change the OS parameters to have three categories:

* ``public``: the current behavior. The parameter is logged and stored freely.

* ``private``: the parameter is saved inside the Ganeti configuration (to allow
  for instance reinstall) but it is not shown in logs, job logs, or passed back
  via RAPI.

* ``secret``: the parameter is not saved inside the Ganeti configuration.
  Reinstalls are impossible unless the data is passed again. The parameter will
  not appear in any log file. When a task is performed jointly by multiple
  daemons (such as MasterD and LuxiD), Ganeti currently sometimes serializes
  jobs on disk and later reloads them. Secret parameters will not be
  serialized to disk. They will be passed around as part of the LUXI calls
  exchanged by the daemons, and only kept in memory, in order to reduce their
  accessibility as much as possible. In case of failure of the master node,
  these parameters will be lost and cannot be recovered because they are not
  serialized. As a result, the job cannot be taken over by the new master.  This
  is an expected and accepted side effect of jobs with secret parameters: if
  they fail, they'll have to be restarted manually.
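
The three categories above could be modelled roughly as follows.  This is only
a minimal Python sketch, not the actual Ganeti implementation: the names
``Visibility``, ``OsParam``, ``to_config`` and ``to_log`` are hypothetical and
were chosen for illustration.

.. code-block:: python

  import json
  from dataclasses import dataclass
  from enum import Enum


  class Visibility(Enum):
      PUBLIC = "public"
      PRIVATE = "private"
      SECRET = "secret"


  @dataclass(frozen=True)
  class OsParam:
      name: str
      value: str
      visibility: Visibility = Visibility.PUBLIC

      def for_log(self):
          # Only public values may appear in logs, job files or RAPI replies.
          if self.visibility is Visibility.PUBLIC:
              return self.value
          return "<redacted>"


  def to_config(params):
      # Public and private values are persisted so that reinstalls work;
      # secret values are never written to the configuration.
      return {p.name: p.value for p in params
              if p.visibility is not Visibility.SECRET}


  def to_log(params):
      # Render all parameters for a log line, hiding non-public values.
      return json.dumps({p.name: p.for_log() for p in params}, sort_keys=True)

In this sketch a reinstall can rebuild public and private parameters from the
output of ``to_config``, while secret parameters have to be supplied again by
the caller.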

Metadata
++++++++

In order to allow metadata to be sent inside the instance, a communication
mechanism between the instance and the host will be created.  This mechanism
will be bidirectional (e.g., to allow the setup process going on inside the
instance to communicate its progress to the host). Each instance will have
access exclusively to its own metadata, and it will only be able to communicate
with its host over this channel.  This is the approach followed by the
``cloud-init`` tool and more details will be provided in the `Communication
mechanism`_ and `Metadata service`_ sections.

Installation procedure
++++++++++++++++++++++

A new installation procedure will be introduced.  There will be two sets of
parameters, namely, installation parameters, which are used mainly for installs
and reinstalls, and execution parameters, which are used in all the other runs
that are not part of an installation procedure.  Also, it will be possible to
use an installation medium and/or run the OS scripts in an optional virtualized
environment, and optionally use a personalization package.  This section details
all of these options.

The set of installation parameters will allow the administrator, for example, to
attach an installation floppy/cdrom/network, change the boot device order, or
specify a disk image to be used.  Through this set of parameters, the
administrator will have to provide the hypervisor with a location for an
installation medium for the instance (e.g., a boot disk or a network image).
This medium will carry out the installation of the instance onto the instance's
disks and will then be responsible for getting the parameters for configuring
the instance, such as network interfaces, IP address, and hostname.  These
parameters are taken from the metadata.  The installation parameters will be
stored in the configuration of Ganeti and used in future reinstalls, but not
during normal execution.

The instance is reinstalled using the same installation parameters as the
first installation.  However, it will be the administrator's responsibility to
ensure that the installation medium is still available at the proper location
when a reinstall occurs.

The parameter ``--os-parameters`` can still be used to specify the OS
parameters.  However, without OS scripts, Ganeti cannot do more than a syntactic
check to validate the supplied OS parameter string.  As a result, this string
will be passed directly to the instance as part of the metadata.  If OS scripts
are used and the installation procedure is running inside a virtualized
environment, Ganeti will take these parameters from the metadata and pass them
to the OS scripts as environment variables.
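
As an illustration of the last step, the following Python sketch turns a
dictionary of OS parameters into environment variables for an OS script.  The
``OSP_`` prefix and the upper-casing follow the convention of the existing OS
interface.  The function name and the assumption that the parameters arrive as
a plain dictionary read from the metadata are hypothetical.

.. code-block:: python

  def os_script_environment(os_params, base_env=None):
      """Return the environment for an OS script run.

      os_params maps parameter names to values, as they might be read back
      from the instance's metadata (hypothetical input format).
      """
      env = dict(base_env or {})
      for name, value in os_params.items():
          # Each parameter is exported in its own upper-cased variable.
          env["OSP_" + name.upper()] = str(value)
      return env

The resulting dictionary could then be passed as the ``env`` argument of
``subprocess.run`` when the script is started.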

Ganeti allows the following installation options:

* Use a disk image:

  Currently, it is already possible to specify an installation medium, such as
  a cdrom, but not a disk image.  Therefore, a new parameter ``--os-image`` will
  be used to specify the location of a disk image which will be dumped to the
  instance's first disk before the instance is started.  The location of the
  image can be a URL and, if this is the case, Ganeti will download this image.

* Run OS scripts:

  The parameter ``--os-type`` (short version: ``-o``) is currently used to
  specify the OS scripts.  This parameter will still be used to specify the OS
  scripts, with the difference that these scripts may optionally run inside a
  virtualized environment for safety reasons, depending on whether they are
  trusted or not.  For more details on trusted and untrusted OS scripts, refer
  to the `Installation process in a virtualized environment`_ section.  Note
  that this parameter will become optional, thus allowing a user to create an
  instance specifying only, for example, a disk image or a cdrom image to boot
  from.

* Personalization package

  As part of the instance creation command, it will be possible to indicate a
  URL for a "personalization package", which is an archive containing a set of
  files meant to be overlaid on top of the OS file system at the end of the
  setup process and before the VM is started for the first time in normal mode.
  Ganeti will provide a mechanism for receiving and unpacking this archive,
  independently of whether the installation is being performed inside the
  virtualized environment or not.

  The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or
  ``.tgz``) and contain the files according to the directory structure that will
  be recreated on the installation disk.  Files contained in this archive will
  overwrite files with the same path created during the installation procedure
  (if any).  The URL of the "personalization package" will have to specify an
  extension to identify the file format (in order to allow for more formats to
  be supported in the future).  The URL will be stored as part of the
  configuration of the instance (therefore, the URL should not contain
  confidential information, but the files available there can).

  It is up to the system administrator to ensure that a package is actually
  available at that URL at install and reinstall time.  The contents of the
  package are allowed to change.  For example, a system administrator might
  create a package containing the private keys of the instance being created.
  When the instance is reinstalled, a new package with new keys can be made
  available there, thus allowing instance reinstall without the need to store
  keys.  A username and a password can be specified together with the URL.  If
  the URL is an HTTP(S) URL, they will be used as basic access authentication
  credentials to access that URL.  The username and password will not be saved
  in the config, and will have to be provided again in case a reinstall is
  requested.

  The downloaded personalization package will not be stored locally on the node
  for longer than is needed to unpack it and add its files to the instance
  being created.  The personalization package will be overlaid on top of the
  instance filesystem after the scripts that created it have been executed.  In
  order for the files in the package to be automatically overlaid on top of the
  instance filesystem, it is required that the appliance is actually able to
  mount the instance's disks.  As a result, this will not work for every
  filesystem.

* Combine a disk image, OS scripts, and a personalization package

  It will be possible to combine a disk image, OS scripts, and a
  personalization package, with or without a virtualized environment (see the
  exception below). At least one of an installation medium or OS scripts should
  be specified.

  The disk image of the actual virtual appliance, which bootstraps the virtual
  environment used in the installation procedure, will be read only, so that a
  pristine copy of the appliance can be started every time a new instance needs
  to be created and to further increase security.  The data the instance needs
  to write at runtime will only be stored in RAM and disappear as soon as the
  instance is stopped.

  The parameter ``--enable-safe-install=yes|no`` will be used to give the
  administrator control over whether to use a virtualized environment for the
  installation procedure.  By default, a virtualized environment will be used.
  Note that some feature combinations, such as using untrusted scripts, will
  require the virtualized environment.  In this case, Ganeti will not allow
  disabling the virtualized environment.
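
The overlay step of the personalization package described above can be
sketched in Python as follows.  This is an illustration under assumptions, not
the planned implementation: the function names are hypothetical, the instance
file system is assumed to be already mounted at ``target_root``, and the
restriction to plain files and directories is an extra safety measure that this
document does not mandate.

.. code-block:: python

  import os
  import tarfile

  # Extensions named in this document; more formats may be added later.
  SUPPORTED_EXTENSIONS = (".tar.gz", ".tgz")


  def check_package_url(url):
      """Reject URLs whose extension does not identify a supported format."""
      path = url.split("?", 1)[0].split("#", 1)[0]
      if not path.endswith(SUPPORTED_EXTENSIONS):
          raise ValueError("unsupported package format: %s" % url)


  def overlay_package(archive_path, target_root):
      """Unpack a TAR-GZIP package on top of a mounted file system."""
      root = os.path.realpath(target_root)
      with tarfile.open(archive_path, "r:gz") as archive:
          members = archive.getmembers()
          for member in members:
              # Assumption: only plain files and directories are accepted.
              if not (member.isfile() or member.isdir()):
                  raise ValueError("unsupported entry: %s" % member.name)
              # Refuse entries that would land outside the target root.
              dest = os.path.realpath(os.path.join(root, member.name))
              if os.path.commonpath([root, dest]) != root:
                  raise ValueError("entry escapes target: %s" % member.name)
          # Files with the same path as existing ones overwrite them.
          archive.extractall(root, members=members)

Validating every entry before extracting anything keeps a malformed archive
from leaving a half-applied overlay on the instance's disk.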

Implementation
==============

The implementation of this design will happen as an ordered sequence of steps,
of increasing impact on the system and, in some cases, dependent on each other:

#. Private and secret instance parameters
#. Communication mechanism between host and instance
#. Metadata service
#. Personalization package (inside a virtualized environment)
#. Instance creation via a disk image
#. Instance creation inside a virtualized environment

Some of these steps need to be specified in more detail than in the
`Proposed changes`_ section. Extra details will be provided in the following
subsections.

Communication mechanism
+++++++++++++++++++++++

The communication mechanism will be an exclusive, generic, bidirectional
communication channel between Ganeti hosts and guests.

exclusive
  The communication mechanism allows communication between a guest and its host,
  but it does not allow a guest to communicate with other guests or reach the
  outside world.

generic
  The communication mechanism allows a guest to reach any service on the host,
  not just the metadata service.  Examples of valid communication include, but
  are not limited to, accessing the metadata service, sending commands to
  Ganeti, requesting changes to parameters, such as those related to
  distribution upgrades, and letting Ganeti control a helper instance, such as
  the one for performing OS installs inside a safe environment.

bidirectional
  The communication mechanism allows communication to be initiated from either
  party, namely, from a host to a guest or from a guest to a host.

Note that Ganeti will allow communication with any service (e.g., daemon) running
on the host and, as a result, Ganeti will not be responsible for ensuring that
only the metadata service is reachable.  It is the responsibility of each system
administrator to ensure that the extra firewalling and routing rules specified
on the host provide the necessary protection on a given Ganeti installation and,
at the same time, do not accidentally override the behaviour described here,
which makes the communication between the host and the guest exclusive, generic,
and bidirectional, unless intended.

The communication mechanism will be enabled automatically during an installation
procedure that requires a virtualized environment, but, for backwards
compatibility, it will be disabled when the instance is running normally, unless
explicitly requested.  Specifically, a new parameter ``--communication=yes|no``
(short version: ``-C``) will be added to ``gnt-instance add`` and ``gnt-instance
modify``.  This parameter will determine whether the communication mechanism is
enabled for a particular instance.  The value of this parameter will be saved as
part of the instance's configuration.

The communication mechanism will be implemented through network interfaces on
the host and the guest, and Ganeti will be responsible for the host side,
namely, creating a TAP interface for each guest and configuring these interfaces
to have the name ``gnt.com.%d``, where ``%d`` is a unique number within the host
(e.g., ``gnt.com.0`` and ``gnt.com.1``), IP address ``169.254.169.254``, and
netmask ``255.255.255.255``.  The interface's name allows DHCP servers to
recognize which interfaces are part of the communication mechanism.
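
A possible way to pick the ``%d`` part of the interface name is sketched below
in Python.  The helper names are hypothetical, and choosing the lowest free
number is an assumption; this document only requires the number to be unique
within the host.  The length check reflects the Linux limit on interface names
(``IFNAMSIZ``, 16 bytes including the terminator).

.. code-block:: python

  COMM_IF_PREFIX = "gnt.com."
  IFNAMSIZ = 16  # Linux limit, including the trailing NUL byte


  def next_comm_ifname(existing_names):
      """Return the first unused ``gnt.com.%d`` name on this host."""
      used = set(existing_names)
      index = 0
      while COMM_IF_PREFIX + str(index) in used:
          index += 1
      name = COMM_IF_PREFIX + str(index)
      if len(name) >= IFNAMSIZ:
          raise ValueError("interface name too long: %s" % name)
      return name


  def is_comm_interface(name):
      """Tell whether a name belongs to the communication mechanism."""
      suffix = name[len(COMM_IF_PREFIX):]
      return (name.startswith(COMM_IF_PREFIX)
              and suffix.isascii() and suffix.isdigit())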

This network interface will be connected to the guest's last network interface,
which is meant to be used exclusively for the communication mechanism and is
defined after all the user-defined interfaces.  The last interface was chosen
(as opposed to the first one, for example) because the first interface is
generally understood to be the main gateway out, and also because it minimizes
the impact on existing systems, for example, in a scenario where the system
administrator has a running cluster and wants to enable the communication
mechanism for already existing instances, which might have been created with
older versions of Ganeti.  Further, DBus should assist in keeping the guest
network interfaces more stable.

On the guest side, each instance will have its own MAC address and IP address.
Both the guest's MAC address and IP address must be unique within a single
cluster.  An IP is unique within a single cluster, and not within a single host,
in order to minimize disruption of connectivity, for example, during live
migration, in particular since an instance is not aware when it changes host.
Unfortunately, a side effect of this decision is that the number of instances
with communication enabled in a cluster is limited to the number of addresses
in a ``/16`` network.  If it becomes necessary to overcome this limit, it
should be possible to allow different networks to be configured link-local
only.
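
The size of this limit can be made concrete with a short Python sketch.  It
assumes that guest addresses are drawn from the link-local network
``169.254.0.0/16``, which contains the host-side address ``169.254.169.254``.
This document only states that the limit is a ``/16`` network, so the exact
range and the helper names are assumptions.

.. code-block:: python

  import ipaddress

  # Assumed address pool for the communication mechanism.
  LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")
  # Address used by the host side of every TAP interface.
  HOST_ADDRESS = ipaddress.ip_address("169.254.169.254")


  def guest_addresses():
      """Yield candidate guest addresses, skipping the host-side address."""
      for addr in LINK_LOCAL.hosts():
          if addr != HOST_ADDRESS:
              yield addr


  def allocate(in_use):
      """Return the first address that no instance in the cluster uses."""
      for addr in guest_addresses():
          if addr not in in_use:
              return addr
      raise RuntimeError("no free address left in %s" % LINK_LOCAL)

Under these assumptions, a cluster can hold at most 65533 instances with
communication enabled: the 65536 addresses of the network minus the network
address, the broadcast address, and the host-side address.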

The guest will use the DHCP protocol on its last network interface to contact a
DHCP server running on the host and thus determine its IP address.  The DHCP
server is configured, started, and stopped by Ganeti, and it will listen
exclusively on the TAP network interfaces of the guests in order not to
interfere with a potential DHCP server running on the same host.  Furthermore,
the DHCP server will only recognize MAC and IP address pairs that have been
approved by Ganeti.

The TAP network interfaces created for each guest share the same IP address.
Therefore, it will be necessary to extend the routing table with rules specific
to each guest.  This can be achieved with the following command, which takes the
guest's unique IP address and its TAP interface::

  route add -host <ip> dev <ifname>

This rule has the additional advantage of preventing guests from trying to lease
IP addresses from the DHCP server other than the one that has been assigned to
them by Ganeti.  The guest could lie about its MAC address to the DHCP server
and try to steal another guest's IP address; however, this routing rule will
block traffic (i.e., IP packets carrying the wrong IP) from the DHCP server to
the malicious guest.  Similarly, the guest could lie about its IP address (i.e.,
simply assign a predefined IP address, perhaps from another guest); however,
replies from the host will not be routed to the malicious guest.
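
As a sketch of how the host side might assemble the routing command shown
above, the following Python function validates the guest address and returns
the argument list.  The function name is hypothetical, and actually running the
command (for example with ``subprocess.run(cmd, check=True)``) requires root
privileges on the node.

.. code-block:: python

  import ipaddress


  def route_add_command(guest_ip, ifname):
      """Build the per-guest host route command from this section."""
      # Raises ValueError if guest_ip is not a valid IPv4 address.
      address = ipaddress.IPv4Address(guest_ip)
      if not ifname:
          raise ValueError("missing TAP interface name")
      return ["route", "add", "-host", str(address), "dev", ifname]

Returning a list instead of a shell string avoids quoting issues when the
command is later executed.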

This routing rule ensures that the communication channel is exclusive but, as
mentioned before, it will not prevent guests from accessing any service on the
host.  It is the system administrator's responsibility to employ the necessary
``iptables`` rules.  In order to achieve this, Ganeti will provide ``ifup``
hooks associated with the guest network interfaces, which will give system
administrators the opportunity to customize their own ``iptables``, if
necessary.  Ganeti will also provide examples of such hooks.  However, these are
meant to be personalized for each Ganeti installation and not to be taken as
production-ready scripts.

For KVM, an instance will be started with a unique MAC address and the file
descriptor for the TAP network interface meant to be used by the communication
mechanism.  Ganeti will be responsible for generating a unique MAC address for
the guest, opening the TAP interface, and passing its file descriptor to KVM::

  kvm -net nic,macaddr=<mac> -net tap,fd=<tap-fd> ...

For Xen, a network interface will be created on the host (using the ``vif``
parameter of the Xen configuration file).  Each instance will have its
corresponding ``vif`` network interface on the host.  The ``vif-route`` script
of Xen might be helpful in implementing this.

dnsmasq
+++++++

The previous section describes the communication mechanism and explains the role
of the DHCP server.  Note that any DHCP server can be used in the implementation
of the communication mechanism.  However, the DHCP server employed should not
violate the properties described in the previous section, which state that the
communication mechanism should be exclusive, generic, and bidirectional, unless
this is intentional.

In our experiments, we have used dnsmasq.  In this section, we describe how to
properly configure dnsmasq to work on a given Ganeti installation.  This is
particularly important if, in this Ganeti installation, dnsmasq will share the
node with one or more DHCP servers running in parallel.

First, it is important to become familiar with the operational modes of dnsmasq,
which are well explained in the `FAQ
<http://www.thekelleys.org.uk/dnsmasq/docs/FAQ>`_ under the question ``What are
these strange "bind-interface" and "bind-dynamic" options?``.  The rest of this
section assumes the reader is familiar with these operational modes.

bind-dynamic
  dnsmasq SHOULD be configured in the ``bind-dynamic`` mode (if supported) in
  order to allow other DHCP servers to run on the same node.  In this mode,
  dnsmasq can serve the communication mechanism by listening on the TAP
  interfaces that match the pattern ``gnt.com.*`` (e.g.,
  ``interface=gnt.com.*``).  For extra safety, interfaces matching the pattern
  ``eth*`` and the name ``lo`` should be configured such that dnsmasq will
  always ignore them (e.g., ``except-interface=eth*`` and
  ``except-interface=lo``).
394 1ab752c8 Jose A. Lopes
395 1ab752c8 Jose A. Lopes
bind-interfaces
396 1ab752c8 Jose A. Lopes
  dnsmasq MAY be configured in the ``bind-interfaces`` mode (if supported) in
397 1ab752c8 Jose A. Lopes
  order to allow other DHCP servers to run on the same node.  Unfortunately,
398 1ab752c8 Jose A. Lopes
  because dnsmasq cannot dynamically adjust to TAP interfaces that are created
399 1ab752c8 Jose A. Lopes
  and destroyed by the system, dnsmasq must be restarted with a new
400 1ab752c8 Jose A. Lopes
  configuration file each time an instance is created or destroyed.
401 1ab752c8 Jose A. Lopes
402 1ab752c8 Jose A. Lopes
  Also, the interfaces cannot be patterns, such as, ``gnt.com.*``.  Instead, the
403 1ab752c8 Jose A. Lopes
  interfaces must be explictly specified, for example,
404 1ab752c8 Jose A. Lopes
  ``interface=gnt.com.0,gnt.com.1``.  Moreover, dnsmasq cannot bind to the TAP
405 1ab752c8 Jose A. Lopes
  interfaces if they have all the same IPv4 address.  As a result, it is
406 1ab752c8 Jose A. Lopes
  necessary to configure these TAP interfaces to enable IPv6 and an IPv6 address
407 1ab752c8 Jose A. Lopes
  must be assigned to them.
408 1ab752c8 Jose A. Lopes
409 1ab752c8 Jose A. Lopes
wildcard
410 1ab752c8 Jose A. Lopes
  dnsmasq CANNOT be configured in the ``wildcard`` mode if there is
411 1ab752c8 Jose A. Lopes
  (at least) another DHCP server running on the same node.
412 1ab752c8 Jose A. Lopes
413 1a7c1456 Jose A. Lopes
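Because the ``bind-interfaces`` mode requires a new configuration file each time
an instance is created or destroyed, that file could be generated from the
current list of TAP interfaces.  The following is a minimal sketch, not part of
Ganeti: the function name ``render_dnsmasq_conf`` is hypothetical, and only the
interface-binding options are emitted (DHCP ranges and other settings are left
out).  It writes one ``interface=`` line per TAP interface and rejects patterns.

```python
def render_dnsmasq_conf(tap_interfaces):
    """Return dnsmasq configuration text for the ``bind-interfaces`` mode.

    ``tap_interfaces`` is an iterable of explicit interface names, such as
    ``gnt.com.0``.  Patterns are rejected because, as described above, they
    cannot be used in this mode.
    """
    names = list(tap_interfaces)
    if not names:
        raise ValueError("at least one TAP interface is required")

    lines = ["bind-interfaces"]
    for name in names:
        if "*" in name:
            raise ValueError("patterns are not allowed here: %r" % name)
        # One "interface=" line per explicitly named TAP interface.
        lines.append("interface=%s" % name)
    return "\n".join(lines) + "\n"
```

The generated text would have to be written to the configuration file that
dnsmasq reads, followed by a restart of dnsmasq.
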
Metadata service
++++++++++++++++

An instance will be able to reach the metadata service at
``169.254.169.254:80`` in order to, for example, retrieve its metadata.  This IP
address and port were chosen for compatibility with the OpenStack and Amazon EC2
metadata services.  The metadata service will be provided by a single daemon,
which will determine the source instance for a given request and reply with the
metadata pertaining to that instance.

Where possible, the metadata will be provided in a way compatible with Amazon
EC2, at::

  http://169.254.169.254/<version>/meta-data/*

Ganeti-specific metadata that does not fit this structure will be provided
at::

  http://169.254.169.254/ganeti/<version>/meta_data.json

where ``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to
indicate the most recent available protocol version.

If needed in the future, this structure also allows us to support OpenStack's
metadata at::

  http://169.254.169.254/openstack/<version>/meta_data.json

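As an illustration of the URL scheme above, a client inside the instance could
build the Ganeti-specific metadata URL as sketched below.  The helper names
``is_valid_version`` and ``ganeti_metadata_url`` are hypothetical; only the
address, the path layout and the two accepted forms of ``<version>`` come from
this document.

```python
import re
from datetime import datetime

METADATA_BASE = "http://169.254.169.254"


def is_valid_version(version):
    """Return True for ``latest`` or a real calendar date in YYYY-MM-DD form."""
    if version == "latest":
        return True
    # strptime alone would also accept unpadded dates such as "2013-1-1".
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        return False
    try:
        datetime.strptime(version, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def ganeti_metadata_url(version="latest"):
    """Build the URL of the Ganeti-specific metadata document."""
    if not is_valid_version(version):
        raise ValueError("invalid metadata version: %r" % version)
    return "%s/ganeti/%s/meta_data.json" % (METADATA_BASE, version)
```
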
A bidirectional, pipe-like communication channel will also be provided.  The
instance will be able to receive data from the host through a GET request at::

  http://169.254.169.254/ganeti/<version>/read

and to send data to the host through a POST request at::

  http://169.254.169.254/ganeti/<version>/write

As in a pipe, once the data are read, they will not be in the buffer anymore, so
subsequent GET requests to ``read`` will not return the same data.  However,
unlike a pipe, it will not be possible to perform blocking I/O operations.

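The buffer semantics described above can be modelled in a few lines.  This is
only an in-memory sketch of one direction of the channel, not the daemon's
implementation: the class name ``PipeBuffer`` is hypothetical, and returning an
empty byte string when nothing is buffered is an assumption, since this document
does not specify what an empty ``read`` returns.

```python
class PipeBuffer:
    """In-memory model of one direction of the pipe-like channel."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        """Append data to the buffer, as a POST to ``write`` would."""
        self._chunks.append(bytes(data))

    def read(self):
        """Return all buffered data and discard it, as a GET to ``read`` would.

        This never blocks: with nothing buffered it returns ``b""``
        (an assumption of this sketch).
        """
        data = b"".join(self._chunks)
        self._chunks = []
        return data
```

A second ``read()`` issued right after the first returns no data, which mirrors
the statement that subsequent GET requests will not return the same data.
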
The OS parameters will be accessible through a GET request at::

  http://169.254.169.254/ganeti/<version>/os/parameters.json

as a JSON-serialized dictionary having the parameter name as the key, and the
pair ``(<value>, <visibility>)`` as the value, where ``<value>`` is the
user-provided value of the parameter, and ``<visibility>`` is either ``public``,
``private`` or ``secret``.

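A consumer of ``parameters.json`` could decode the dictionary as sketched
below.  The sketch assumes that each pair is serialized as a two-element JSON
array, which this document does not state explicitly; the function names and
the parameter names used in the usage note are hypothetical.

```python
import json

VISIBILITIES = ("public", "private", "secret")


def parse_os_parameters(text):
    """Decode the OS parameters into {name: (value, visibility)}."""
    params = {}
    for name, pair in json.loads(text).items():
        # Assumption: the (<value>, <visibility>) pair is a two-element array.
        value, visibility = pair
        if visibility not in VISIBILITIES:
            raise ValueError("unknown visibility for %r: %r" % (name, visibility))
        params[name] = (value, visibility)
    return params


def select_by_visibility(params, visibility):
    """Return {name: value} for the parameters with the given visibility."""
    return {name: value
            for name, (value, vis) in params.items()
            if vis == visibility}
```

For example, given the text
``{"dhcp": ["yes", "public"], "root_password": ["x", "secret"]}``,
``select_by_visibility(parse_os_parameters(text), "public")`` contains only the
``dhcp`` parameter.
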
The installation scripts to be run inside the virtualized environment will be
available at::

  http://169.254.169.254/ganeti/<version>/os/scripts/<script_name>

where ``<script_name>`` is the name of the script.

Rationale
---------

The choice of using a network interface for instance-host communication, as
opposed to VirtIO, XenBus or other methods, is motivated by the desire for a
generic, hypervisor-independent way of creating a communication channel that
doesn't require unusual (para)virtualization drivers.
At the same time, a network interface was preferred over solutions involving
virtual floppy or USB devices because the latter tend to be detected and
configured by the guest operating systems, sometimes even in prominent positions
in the user interface, whereas it is fairly common to have an unconfigured
network interface in a system, usually without any negative side effects.

Installation process in a virtualized environment
+++++++++++++++++++++++++++++++++++++++++++++++++

In the new OS installation scenario, we distinguish between trusted and
untrusted code.

The trusted installation code maintains the behavior of the current one and
requires no modifications, with the scripts running on the node the instance is
being created on. The untrusted code is stored in a subdirectory of the OS
definition called ``untrusted``.  This directory contains scripts that are
equivalent to the already existing ones (``create``, ``export``, ``import``,
``rename``) but that will be run inside a virtualized environment, to protect
the host from malicious tampering.

The ``untrusted`` code is meant to either be untrusted itself, or to be trusted
code running operations that might be dangerous (such as mounting a
user-provided image).

By default, all new OS definitions will have to be explicitly marked as trusted
by the cluster administrator (with a new ``gnt-os modify`` command) before they
can run code on the host. Otherwise, only the untrusted part of the code will be
allowed to run, inside the virtual appliance. For backwards compatibility
reasons, when upgrading an existing cluster, all the installed OSes will be
marked as trusted, so that they can keep running with no changes.

In order to allow for the highest flexibility, if both a trusted and an
untrusted script are provided for the same operation (e.g., ``create``), both of
them will be executed at the same time, one on the host, and one inside the
installation appliance. They will be allowed to communicate with each other
through the already described communication mechanism, in order to orchestrate
their execution (e.g., the untrusted code might execute the installation, while
the trusted one receives status updates from it and delivers them to a user
interface).

The cluster administrator will have an option to completely disable scripts
running on the host, leaving only the ones running in the VM.

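The rules above can be summarized as a small decision function.  This is a
sketch under stated assumptions rather than Ganeti code: the function
``plan_scripts`` and its parameters are hypothetical, and it only decides which
scripts would run and where, without launching them concurrently.

```python
def plan_scripts(operation, os_trusted, has_trusted, has_untrusted,
                 host_scripts_disabled=False):
    """Return a list of (location, script path) pairs for an operation.

    ``location`` is ``"host"`` for trusted code and ``"appliance"`` for code
    from the ``untrusted`` subdirectory of the OS definition.
    """
    plan = []
    # Trusted code runs on the host only if the OS definition was marked as
    # trusted and the administrator has not disabled host scripts.
    if has_trusted and os_trusted and not host_scripts_disabled:
        plan.append(("host", operation))
    # Untrusted code always runs inside the virtualized environment.
    if has_untrusted:
        plan.append(("appliance", "untrusted/" + operation))
    return plan
```

When both entries are returned, the two scripts are meant to run at the same
time and coordinate through the communication mechanism.
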
Ganeti will provide a script to be run at install time that can be used to
create the virtualized environment that will perform the OS installation of new
instances.
This script will build a debootstrapped basic Debian system including software
that will read the metadata, set up the environment variables and launch the
installation scripts inside the virtualized environment. The script will also
provide hooks for personalization.

It will also be possible to use other self-made virtualized environments, as
long as they connect to Ganeti over the described communication mechanism and
they know how to read and use the provided metadata to create a new instance.

While performing an installation in the virtualized environment, a customizable
timeout will be used to detect possible problems with the installation process,
and to kill the virtualized environment. The timeout will be optional and set on
a per-cluster basis by the administrator. If set, it will be the total time
allowed to set up an instance inside the appliance. It is mainly meant as a
safety measure to prevent an instance taken over by malicious scripts from being
available for a long time.

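The timeout behavior can be illustrated with an ordinary child process standing
in for the virtualized environment.  This is only a sketch: the function
``run_with_timeout`` is hypothetical, and how Ganeti actually supervises and
kills the appliance is not specified in this section.

```python
import subprocess
import sys


def run_with_timeout(command, timeout=None):
    """Run ``command`` and kill it if it exceeds ``timeout`` seconds.

    ``timeout=None`` means no limit, matching the optional timeout.
    Returns True if the command finished in time, False if it was killed.
    """
    process = subprocess.Popen(command)
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The total allowed time is over: stop the environment.
        process.kill()
        process.wait()
        return False
    return True


if __name__ == "__main__":
    # A child that sleeps longer than the allowed time is killed.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(slow, timeout=0.2))
```
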
Alternatives to design and implementation
=========================================

This section lists design and implementation alternatives that came up during
the development of this design document and that will not be implemented.
Please read carefully through the limitations and security concerns of each of
these alternatives.

Port forwarding in KVM
++++++++++++++++++++++

The communication mechanism could have been implemented in KVM using guest port
forwarding, as opposed to network interfaces.  There are two alternatives in
KVM's guest port forwarding, namely, creating a forwarding device, such as a
TCP/IP connection, or executing a command.  However, we have determined that
neither of these options is viable.

A TCP/IP forwarding device can be created through the following KVM invocation::

  kvm -net nic \
    -net user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,guestfwd=tcp:169.254.169.254:80-tcp:127.0.0.1:8080 ...

This invocation even has the advantage that it can block undesired traffic
(i.e., traffic that is not explicitly specified in the arguments) and it can
remap ports, which would have allowed the metadata service daemon to run on port
8080 instead of 80.  However, in this scheme, KVM opens the TCP connection only
once, when it is started, and, if the connection breaks, KVM will not
reestablish the connection.  Furthermore, opening the TCP connection only once
interferes with the HTTP protocol, which needs to dynamically establish and
close connections.

The alternative to the TCP/IP forwarding device is to execute a command.  The
KVM invocation for this is, for example, the following::

  kvm -net nic \
    -net "user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,guestfwd=tcp:169.254.169.254:80-netcat 127.0.0.1 8080" ...

The advantage of this approach is that the command is executed each time the
guest initiates a connection.  This is the ideal situation.  However, it is only
supported in KVM 1.2 and above and is therefore not viable, because we want to
provide support for at least KVM version 1.0, which is the version provided by
Ubuntu LTS.

Alternatives to the DHCP server
+++++++++++++++++++++++++++++++

There are alternatives to using the DHCP server, for example, assigning a fixed
IP address to guests, such as ``169.254.169.253``.  However, this introduces a
routing problem, namely, how to route packets that reach the host from several
guests sharing the same source IP address.  This problem can be overcome in a
number of ways.

The first solution is to use NAT to translate the incoming guest IP address, for
example, ``169.254.169.253``, to a unique IP address, for example,
``169.254.0.1``.  Given that NAT through ``ip rule`` is deprecated, users can
resort to ``iptables``.  Note that this has not yet been tested.

Another option, which has been tested, but only in a prototype, is to connect
the TAP network interfaces of the guests to a bridge.  The bridge takes the
configuration from the TAP network interfaces, namely, IP address
``169.254.169.254`` and netmask ``255.255.255.255``, thus leaving those
interfaces without an IP address.  Note that in this setting, guests will be
able to reach each other; if necessary, additional ``iptables`` rules can be
put in place to prevent this.