Revision bced76fd

b/Makefile.am

 docinput = \
 	doc/admin.rst \
+	doc/cluster-keys-replacement.rst \
 	doc/cluster-merge.rst \
 	doc/conf.py \
 	doc/css/style.css \
 ...
 	doc/design-daemons.rst \
 	doc/design-device-uuid-name.rst \
 	doc/design-draft.rst \
+	doc/design-file-based-storage.rst \
 	doc/design-glusterfs-ganeti-support.rst \
 	doc/design-hotplug.rst \
 	doc/design-hroller.rst \
b/doc/admin.rst

 Otherwise, if you plan to re-create the cluster, you can just go ahead
 and rerun ``gnt-cluster init``.

+Replacing the SSH and SSL keys
+++++++++++++++++++++++++++++++
+
+Ganeti uses both SSL and SSH keys, and actively modifies the SSH keys on
+the nodes.  As a result, in order to replace these keys, a few extra
+steps need to be followed: :doc:`cluster-keys-replacement`
+
 Monitoring the cluster
 ----------------------
b/doc/cluster-keys-replacement.rst

========================
Cluster Keys Replacement
========================

Ganeti uses both SSL and SSH keys, and actively modifies the SSH keys
on the nodes.  As a result, in order to replace these keys, a few extra
steps need to be followed.

For an example of when this could be needed, see the thread at
`Regenerating SSL and SSH keys after the security bug in Debian's
OpenSSL
<http://groups.google.com/group/ganeti/browse_thread/thread/30cc95102dc2123e>`_.

Ganeti uses OpenSSL for encryption on the RPC layer and SSH for
executing commands. The SSL certificate is automatically generated
when the cluster is initialized, and it is copied to newly added nodes
automatically, together with the master's SSH host key.

Note that the paths below may vary depending on your distribution. In
general, modifications should be done on the master node and then
distributed to all nodes of the cluster (possibly using a pendrive, but
don't forget to use ``shred`` to remove the files securely afterwards).
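For instance, a key file copied onto removable media can be removed securely with ``shred`` once it has been distributed (a sketch; the pendrive path is hypothetical and must be adjusted to your mountpoint):

```shell
# Hypothetical path of the key file copied onto the pendrive.
# -u overwrites the file with random data and then removes it.
shred -u /media/pendrive/server.pem
```
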

Replacing SSL keys
==================

The cluster SSL key is stored in ``/var/lib/ganeti/server.pem``.

Run the following command to generate a new key::

  gnt-cluster renew-crypto --new-cluster-certificate

  # Older versions, which don't have this command, can instead use:
  chmod 0600 /var/lib/ganeti/server.pem &&
  openssl req -new -newkey rsa:1024 -days 1825 -nodes \
   -x509 -keyout /var/lib/ganeti/server.pem \
   -out /var/lib/ganeti/server.pem -batch &&
  chmod 0400 /var/lib/ganeti/server.pem &&
  /etc/init.d/ganeti restart

  gnt-cluster copyfile /var/lib/ganeti/server.pem

  gnt-cluster command /etc/init.d/ganeti restart
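To confirm that the regenerated certificate is in place before distributing it, its validity period and subject can be inspected (a sketch; assumes the OpenSSL command-line tool is installed, and the path may vary by distribution):

```shell
# Path used by Ganeti; may vary depending on your distribution.
CERT=/var/lib/ganeti/server.pem
# Print the notBefore/notAfter dates and the subject of the new certificate.
openssl x509 -in "$CERT" -noout -dates -subject
```
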

Replacing SSH keys
==================

There are two sets of SSH keys in the cluster: the host keys (both DSA
and RSA, though Ganeti only uses the RSA one) and root's DSA key
(Ganeti uses DSA for historical reasons; in the future RSA will be
used).

host keys
+++++++++

These are the files named ``/etc/ssh/ssh_host_*``. You need to
recreate them manually; it's possible that the startup script of
OpenSSH will generate them if they don't exist, or that the package
system regenerates them.

Also make sure to copy the master's SSH host keys to all other nodes.
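One way to recreate the host keys manually is sketched below; it assumes a reasonably recent OpenSSH, where ``ssh-keygen -A`` regenerates every missing host key in the default location, and a Debian-style init script path:

```shell
# Remove the compromised host keys, then recreate all missing ones.
rm /etc/ssh/ssh_host_*
ssh-keygen -A
# Restart the daemon so it picks up the new keys (path may vary).
/etc/init.d/ssh restart
```
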

cluster public key file
+++++++++++++++++++++++

The new public RSA host key created in the previous step must be added
in two places:

#. the known hosts file, ``/var/lib/ganeti/known_hosts``
#. the cluster configuration file, ``/var/lib/ganeti/config.data``

Edit these two files and update them with the SSH host key generated in
the previous step (take it from ``/etc/ssh/ssh_host_rsa_key.pub``).

For the ``config.data`` file, look for an entry named
``rsahostkeypub`` and replace its value with the contents of
the ``.pub`` file. For the ``known_hosts`` file, you need to replace
the old key with the new one on each line (for each host).
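That per-line replacement can be scripted; a sketch (GNU ``sed``), assuming every ``ssh-rsa`` entry in ``known_hosts`` should carry the new key:

```shell
# Take the base64 key material (second field) from the new public key.
NEW=$(awk '{print $2}' /etc/ssh/ssh_host_rsa_key.pub)
# Swap the key material after every "ssh-rsa" marker, one entry per host line.
sed -i "s|ssh-rsa [A-Za-z0-9+/=]*|ssh-rsa ${NEW}|g" /var/lib/ganeti/known_hosts
```
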

root's key
++++++++++

These are the files named ``~root/.ssh/id_dsa*``.

Run this command to rebuild them::

  ssh-keygen -t dsa -f ~root/.ssh/id_dsa -q -N ""

root's ``authorized_keys``
++++++++++++++++++++++++++

This is the file named ``~root/.ssh/authorized_keys``.

Edit the file and update it with the newly generated root key, from the
``id_dsa.pub`` file generated in the previous step.
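A sketch of that update, assuming the old public key should simply be swapped for the new one (``old_id_dsa.pub`` is a hypothetical backup of the previous key; other entries in the file are kept):

```shell
AK=~root/.ssh/authorized_keys
# Drop the line carrying the old key (old_id_dsa.pub is a hypothetical
# backup of the previous public key), then append the new one.
grep -v -F -f ~root/.ssh/old_id_dsa.pub "$AK" > "$AK.new"
cat ~root/.ssh/id_dsa.pub >> "$AK.new"
mv "$AK.new" "$AK"
```
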

Finish
======

In the end, the files mentioned above should be identical for all
nodes in the cluster. Also do not forget to run ``gnt-cluster verify``.
b/doc/design-file-based-storage.rst

==================
File-based Storage
==================

This page describes the proposed file-based storage for the 2.0 version
of Ganeti. The project consists of extending Ganeti to support a
filesystem image as a Virtual Block Device (VBD) in Dom0 as the primary
storage for a VM.

Objective
=========

Goals:

* file-based storage for virtual machines running in a Xen-based
  Ganeti cluster

* failover of file-based virtual machines between cluster nodes

* export/import of file-based virtual machines

* reuse of existing image files

* allow Ganeti to initialize the cluster without checking for a volume
  group (e.g. xenvg)

Non-goals:

* any kind of data mirroring between clusters for file-based instances
  (this should be achieved by using shared storage)

* special support for live migration

* encryption of VBDs

* compression of VBDs

Background
==========

Ganeti is a virtual server management software tool built on top of the
Xen VM monitor and other Open Source software.

Since Ganeti currently supports only block devices as storage backend
for virtual machines, the wish came up to provide a file-based backend.
This file-based option makes it possible to store the VBDs
on basically every filesystem and therefore allows deploying external
data storage (e.g. SAN, NAS, etc.) in clusters.

Overview
========

Introduction
++++++++++++

Xen (like other hypervisors) provides the possibility to use a file as
the primary storage for a VM. One file represents one VBD.

Advantages/Disadvantages
++++++++++++++++++++++++

Advantages of file-backed VBDs:

* support of sparse allocation

* easy from a management/backup point of view (e.g. you can just copy
  the files around)

* external storage (e.g. SAN, NAS) can be used to store VMs

Disadvantages of file-backed VBDs:

* possible performance loss for I/O-intensive workloads

* using sparse files requires care to ensure the sparseness is
  preserved when copying, and there is no header in which metadata
  relating back to the VM can be stored

Xen-related specifications
++++++++++++++++++++++++++

Driver
~~~~~~

There are several ways to realize the required functionality with an
underlying Xen hypervisor.

1) loopback driver
^^^^^^^^^^^^^^^^^^

Advantages:

* available in most precompiled kernels
* stable, since it has been in the kernel tree for a long time
* easy to set up

Disadvantages:

* buffers writes very aggressively, which can affect guest filesystem
  correctness in the event of a host crash

* can even cause out-of-memory kernel crashes in Dom0 under heavy
  write load

* substantial slowdowns under heavy I/O workloads

* the default number of supported loop devices is only 8

* doesn't support QCOW files

2) ``blktap`` driver
^^^^^^^^^^^^^^^^^^^^

Advantages:

* higher performance than the loopback driver

* more scalable

* better safety properties for VBD data

* the Xen team strongly encourages its use

* already in the Xen tree

* supports QCOW files

* asynchronous driver (i.e. high performance)

Disadvantages:

* not enabled in most precompiled kernels

* stable, but not as well tested as the loopback driver

3) ``ublkback`` driver
^^^^^^^^^^^^^^^^^^^^^^

The Xen Roadmap states "Work is well under way to implement a
``ublkback`` driver that supports all of the various qemu file format
plugins".

Furthermore, the Roadmap includes the following:

  "... A special high-performance qcow plugin is also under
  development, that supports better metadata caching, asynchronous IO,
  and allows request reordering with appropriate safety barriers to
  enforce correctness. It remains both forward and backward compatible
  with existing qcow disk images, but makes adjustments to qemu's
  default allocation policy when creating new disks such as to
  optimize performance."

File types
~~~~~~~~~~

Raw disk image file
^^^^^^^^^^^^^^^^^^^

Advantages:

* resizing supported
* sparse file (filesystem dependent)
* simple and easily exportable

Disadvantages:

* underlying filesystem needs to support sparse files (most
  filesystems do, though)

QCOW disk image file
^^^^^^^^^^^^^^^^^^^^

Advantages:

* smaller file size, even on filesystems which don't support holes
  (i.e. sparse files)

* snapshot support, where the image only represents changes made to an
  underlying disk image

* optional zlib-based compression

* optional AES encryption

Disadvantages:

* resizing not supported yet (it's on the way)

VMDK disk image file
^^^^^^^^^^^^^^^^^^^^

This file format is directly based on the qemu vmdk driver, which is
synchronous and thus slow.

Detailed Design
===============

Terminology
+++++++++++

* **VBD** (Virtual Block Device): Persistent storage available to a
  virtual machine, providing the abstraction of an actual block
  storage device. VBDs may be actual block devices, filesystem images,
  or remote/network storage.

* **Dom0** (Domain 0): The first domain to be started on a Xen
  machine.  Domain 0 is responsible for managing the system.

* **VM** (Virtual Machine): The environment in which a hosted
  operating system runs, providing the abstraction of a dedicated
  machine. A VM may be identical to the underlying hardware (as in
  full virtualization), or it may differ (as in paravirtualization).
  In the case of Xen the domU (unprivileged domain) instance is meant.

* **QCOW**: QEMU (a processor emulator) image format.

Implementation
++++++++++++++

Managing file-based instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The option for file-based storage will be added to the ``gnt-instance``
utility.

Add Instance
^^^^^^^^^^^^

Example::

  gnt-instance add -t file:[path=[,driver=loop[,reuse[,...]]]] \
  --disk 0:size=5G --disk 1:size=10G -n node -o debian-etch instance2

This will create a file-based instance with e.g. the following files:

* ``/sda`` -> 5GB
* ``/sdb`` -> 10GB

The default directory where files will be stored is
``/srv/ganeti/file-storage/``. This can be changed by setting the
``<path>`` option, which denotes the full path to the directory
where the files are stored. The file type will be "raw" for the first
release of Ganeti 2.0; however, the code will be extensible to more
file types, since Ganeti will store information about the file type of
each image file. Internally Ganeti will keep track of the used driver,
the file type and the full path to the file for every VBD, for example
``"logical_id" : [FD_LOOP, FT_RAW, "/instance1/sda"]``. If the
``--reuse`` flag is set, Ganeti checks for existing files in the
corresponding directory (e.g. ``/xen/instance2/``). If one or more
files in this directory are present and correctly named (the naming
conventions will be defined in Ganeti version 2.0), Ganeti will set up
a VM with these. If no file can be found or the names are invalid, the
operation will be aborted.

Remove instance
^^^^^^^^^^^^^^^

Instance removal will differ from the current implementation only by
deleting the VBD files instead of the corresponding block device (e.g.
a logical volume).

Starting/Stopping Instance
^^^^^^^^^^^^^^^^^^^^^^^^^^

Here nothing has to be changed, as the Xen tools don't differentiate
between file-based and blockdevice-based instances in this case.

Export/Import instance
^^^^^^^^^^^^^^^^^^^^^^

Provided "dump/restore" is used in the "export" and "import" guest-os
scripts, no modifications are needed when file-based instances are
exported/imported. If any other backup tool (which requires access to
the mounted file system) is used, then the image file can be temporarily
mounted. This can be done in different ways:

Mount a raw image file via the loopback driver::

  mount -o loop /srv/ganeti/file-storage/instance1/sda1 /mnt/disk

Mount a raw image file via the blkfront driver (the Dom0 kernel needs
this module to do the following operation)::

  xm block-attach 0 tap:aio:/srv/ganeti/file-storage/instance1/sda1 /dev/xvda1 w 0

  mount /dev/xvda1 /mnt/disk

Mount a qcow image file via the blkfront driver (the Dom0 kernel needs
this module to do the following operation)::

  xm block-attach 0 tap:qcow:/srv/ganeti/file-storage/instance1/sda1 /dev/xvda1 w 0

  mount /dev/xvda1 /mnt/disk

High availability features with file-based instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Failing over an instance
^^^^^^^^^^^^^^^^^^^^^^^^

Failover is done in the same way as with block device backends. The
instance gets stopped on the primary node and started on the secondary.
The roles of primary and secondary get swapped. Note: if a failover is
done, Ganeti will assume that the location (i.e. directory) of the
corresponding VBD(s) is the same on the source and destination node.
In case one or more corresponding files are not present on the
destination node, Ganeti will abort the operation.

Replacing instance disks
^^^^^^^^^^^^^^^^^^^^^^^^

Since there is no data mirroring for file-backed VMs there is no such
operation.

Evacuation of a node
^^^^^^^^^^^^^^^^^^^^

Since there is no data mirroring for file-backed VMs there is no such
operation.

Live migration
^^^^^^^^^^^^^^

Live migration is possible using file-backed VBDs. However, the
administrator has to make sure that the corresponding files are exactly
the same on the source and destination node.

Xen Setup
+++++++++

File creation
~~~~~~~~~~~~~

Creating a raw file is simple. For example, a sparse file of 2
Gigabytes can be created as follows; the "seek" option instructs "dd"
to create a sparse file::

  dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1

Creation of QCOW image files can be done with the "qemu-img" utility (in
Debian it comes with the "qemu" package).
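As a quick sanity check that the ``dd`` invocation above really produced a sparse file, its apparent size can be compared with its actual disk usage:

```shell
# Create the sparse 2 GiB image as described above.
dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1
# Apparent size is 2 GiB plus one 1k block ...
ls -lh vm1disk
# ... but only a few KiB are actually allocated on disk.
du -h vm1disk
```
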

Config file
~~~~~~~~~~~

The Xen config file will have the following modification if one chooses
the file-based disk template.

1) loopback driver and raw file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

  disk = ['file:</path/to/file>,sda1,w']

2) blktap driver and raw file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

  disk = ['tap:aio:</path/to/file>,sda1,w']

3) blktap driver and qcow file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

::

  disk = ['tap:qcow:</path/to/file>,sda1,w']

Other hypervisors
+++++++++++++++++

Other hypervisors mostly have different ways to make storage available
to their virtual instances/machines. This is beyond the scope of this
document.
b/doc/index.rst

 A description of the locking strategy and, in particular, lock order
 dependencies is presented in :doc:`locking`.

-Build dependencies and other useful development-related information are provided
-in the :doc:`devnotes`.
+Build dependencies and other useful development-related information
+are provided in the :doc:`devnotes`.

 All the features implemented in Ganeti are described in a design document before
 being actually implemented. Designs can be implemented in a released version, or
 ...

    admin.rst
    cluster-merge.rst
+   cluster-keys-replacement.rst
    design-autorepair.rst
    design-bulk-create.rst
    design-chained-jobs.rst
    design-cmdlib-unittests.rst
    design-cpu-pinning.rst
    design-device-uuid-name.rst
+   design-file-based-storage.rst
    design-hroller.rst
    design-hotplug.rst
    design-linuxha.rst
