Revision 8388e9ff doc/design-2.2.rst

b/doc/design-2.2.rst
Feature changes
---------------

KVM Security
~~~~~~~~~~~~

Current state and shortcomings
++++++++++++++++++++++++++++++

  
Currently all kvm processes run as root. Taking ownership of the
hypervisor process, from inside a virtual machine, would mean a full
compromise of the whole Ganeti cluster, knowledge of all Ganeti
authentication secrets, full access to all running instances, and the
option of subverting other basic services on the cluster (e.g. ssh).

  
Proposed changes
++++++++++++++++

We would like to reduce the attack surface available if a hypervisor
is compromised. We can do so by adding features to Ganeti that limit
what a broken-into hypervisor can do to subvert the node, in the
absence of a local privilege escalation attack.

  
Dropping privileges in kvm to a single user (easy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By passing the ``-runas`` option to kvm, we can make it drop privileges.
The user can be chosen by a hypervisor parameter, so that each instance
can have its own user, but by default they will all run under the same
one. It should be very easy to implement, and can easily be backported
to 2.1.X.
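As a rough illustration of the paragraph above, the sketch below appends
``-runas`` to a kvm command line. The function name ``add_runas`` and the
default user ``ganeti-kvm`` are assumptions made for this example only;
they are not names defined by Ganeti.

```python
def add_runas(kvm_cmd, user=None, default_user="ganeti-kvm"):
    """Return a copy of kvm_cmd with the -runas option appended.

    `user` stands for a per-instance hypervisor parameter (hypothetical);
    when it is unset, every instance falls back to the same default user.
    """
    return list(kvm_cmd) + ["-runas", user or default_user]


# Example: one instance with its own user, one using the shared default.
base_cmd = ["kvm", "-m", "512"]
own_user_cmd = add_runas(base_cmd, user="inst1-kvm")
shared_user_cmd = add_runas(base_cmd)
```

The command is built as an argument list rather than a shell string, so
the user name is never interpreted by a shell.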

  
This mode protects the Ganeti cluster from a subverted hypervisor, but
doesn't protect the instances from each other, unless care is taken
to specify a different user for each. This would prevent the worst
attacks, including:

- logging in to other nodes
- administering the Ganeti cluster
- subverting other services

But the following would remain an option:

- terminating other VMs (but not starting them again, as that requires
  root privileges to set up networking) (unless different users are
  used)
- tracing other VMs, and probably subverting them and accessing their
  data (unless different users are used)
- sending network traffic from the node
- reading unprotected data on the node filesystem

  
Running kvm in a chroot (slightly harder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By passing the ``-chroot`` option to kvm, we can restrict the kvm
process in its own (possibly empty) root directory. We need to set this
area up so that the instance disks and control sockets are accessible,
so it would require slightly more work at the Ganeti level.
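A minimal sketch of the Ganeti-side work, assuming one chroot directory
per instance under a common base directory. The helper name
``add_chroot`` and the directory layout are hypothetical; how disks and
control sockets are made reachable from inside the chroot is left out,
since the design does not specify it.

```python
import os


def add_chroot(kvm_cmd, chroot_base, instance_name):
    """Create a per-instance chroot directory and append -chroot.

    The directory is created empty; anything the kvm process needs to
    open after startup would have to be placed there separately.
    """
    chroot_dir = os.path.join(chroot_base, instance_name)
    os.makedirs(chroot_dir, mode=0o755, exist_ok=True)
    return list(kvm_cmd) + ["-chroot", chroot_dir]
```

Keeping one directory per instance means a compromised hypervisor cannot
see files prepared for another instance's chroot.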

  
An attacker who breaks out of the hypervisor into a chroot would have:

- far fewer options to find a local privilege escalation vector
- no way to write local data, if the chroot is set up correctly
- no way to read filesystem data on the host

It would still be possible though to:

- terminate other VMs
- trace other VMs, and possibly subvert them (if a tracer can be
  installed in the chroot)
- send network traffic from the node

  
Running kvm with a pool of users (slightly harder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If, rather than passing a single user as a hypervisor parameter, we have
a pool of usable ones, we can dynamically choose a free one to use and
thus guarantee that each machine will be separate from the others,
without putting the burden of this on the cluster administrator.

This would make interference between machines impossible, and can
still be combined with the chroot benefits.
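One possible way to pick a free user from the pool is sketched below.
It claims a user by atomically creating a lock file named after it. The
function names, the lock directory and the lock-file approach itself are
assumptions for illustration; the design does not say how the pool
should be tracked.

```python
import os


def acquire_user(pool, lock_dir):
    """Claim the first user in `pool` that has no lock file yet."""
    os.makedirs(lock_dir, exist_ok=True)
    for user in pool:
        path = os.path.join(lock_dir, user)
        try:
            # O_EXCL makes creation fail if another process got there first.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue
        os.close(fd)
        return user
    raise RuntimeError("no free user left in the pool")


def release_user(user, lock_dir):
    """Give a user back to the pool once its instance has stopped."""
    os.unlink(os.path.join(lock_dir, user))
```

An instance start would call ``acquire_user`` and pass the result to
``-runas``; the matching ``release_user`` call belongs in the instance
shutdown path.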

  
Running iptables rules to limit network interaction (easy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These don't need to be handled by Ganeti, but we can ship examples. If
the users used to run VMs are blocked from sending some or all network
traffic, it becomes impossible for a broken-into hypervisor to send
arbitrary data on the node network, which is especially useful when the
instance and the node network are separated (using ganeti-nbma or a
separate set of network interfaces), or when a separate replication
network is maintained. We need to experiment to see how much restriction
we can properly apply without limiting the instances' legitimate
traffic.
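The kind of example rules that could be shipped is sketched below, using
the iptables ``owner`` match on the ``OUTPUT`` chain to drop packets
generated by a given user. The helper name ``owner_drop_rules`` and the
optional allow-list are assumptions for this example; which
destinations, if any, must stay reachable is exactly what the
experiments mentioned above would have to determine.

```python
def owner_drop_rules(user, allowed_dests=()):
    """Build iptables commands restricting traffic sent by `user`.

    Returns a list of argument lists. Nothing is executed here, so the
    rules can be reviewed before being applied by an administrator.
    """
    match = ["-m", "owner", "--uid-owner", user]
    rules = [
        ["iptables", "-A", "OUTPUT"] + match + ["-d", dest, "-j", "ACCEPT"]
        for dest in allowed_dests
    ]
    # Final rule: drop everything else this user tries to send.
    rules.append(["iptables", "-A", "OUTPUT"] + match + ["-j", "DROP"])
    return rules


# Print the rules in a form that can be pasted into a shell script.
for rule in owner_drop_rules("kvm1", allowed_dests=["127.0.0.1"]):
    print(" ".join(rule))
```

The ACCEPT rules are emitted before the DROP rule because iptables
evaluates a chain in order and stops at the first matching terminating
target.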

  
Running kvm inside a container (even harder)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Recent Linux kernels support different process namespaces, used
together with control groups to build containers. PIDs, users,
filesystems and even network interfaces can be separated. If we can set
up Ganeti to run kvm in a separate container we could insulate all the
host processes from being even visible if the hypervisor gets broken
into. Most probably separating the network namespace would require one
extra hop in the host, through a veth interface, thus reducing
performance, so we may want to avoid that, and just rely on iptables.
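To make the trade-off concrete, the sketch below wraps a kvm command in
the util-linux ``unshare`` tool, leaving the network namespace shared by
default as suggested above. This is only one assumed way of entering new
namespaces, chosen for brevity; the design does not name a tool, and the
function name ``wrap_in_namespaces`` is hypothetical.

```python
def wrap_in_namespaces(kvm_cmd, separate_network=False):
    """Prefix kvm_cmd with an `unshare` invocation.

    A new PID namespace (with its own /proc mount) hides host processes
    from the hypervisor. The network namespace is only separated on
    request, since that would need an extra veth hop on the host.
    """
    wrapper = ["unshare", "--pid", "--fork", "--mount-proc"]
    if separate_network:
        wrapper.append("--net")
    return wrapper + ["--"] + list(kvm_cmd)
```

Running the resulting command requires sufficient privileges on the
node, so the wrapper would have to be applied before kvm drops
privileges.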

  
Implementation plan
+++++++++++++++++++

We will first implement dropping privileges for kvm processes as a
single user, and most probably backport it to 2.1. Then we'll ship
example iptables rules to show how the user can be limited in its
network activities. After that we'll implement chroot restriction for
kvm processes, and extend the user limitation to use a user pool.

Finally we'll look into namespaces and containers, although that might
slip after the 2.2 release.

  
External interface changes
--------------------------

  
