Revision e7e2552e
b/doc/design-shared-storage.rst | ||
---|---|---|
6 | 6 |
2.3 storage model. |
7 | 7 |
|
8 | 8 |
.. contents:: :depth: 4 |
9 |
.. highlight:: shell-example |
|
9 | 10 |
|
10 | 11 |
Objective |
11 | 12 |
========= |
... | ... | |
64 | 65 |
filesystems. |
65 | 66 |
- Introduction of shared block device disk template with device |
66 | 67 |
adoption. |
68 |
- Introduction of an External Storage Interface. |
|
67 | 69 |
|
68 | 70 |
Additionally, mid- to long-term goals include: |
69 | 71 |
|
70 | 72 |
- Support for external “storage pools”. |
71 |
- Introduction of an interface for communicating with external scripts, |
|
72 |
providing methods for the various stages of a block device's and |
|
73 |
instance's life-cycle. In order to provide storage provisioning |
|
74 |
capabilities for various SAN appliances, external helpers in the form |
|
75 |
of a “storage driver” will be possibly introduced as well. |
|
76 | 73 |
|
77 | 74 |
Refactoring of all code referring to constants.DTS_NET_MIRROR |
78 | 75 |
============================================================= |
... | ... | |
159 | 156 |
- The device will be available with the same path under all nodes in the |
160 | 157 |
node group. |
161 | 158 |
|
159 |
Introduction of an External Storage Interface |
|
160 |
============================================== |
|
161 |
Overview |
|
162 |
-------- |
|
163 |
|
|
164 |
To extend the shared block storage template and give Ganeti the ability |
|
165 |
to control and manipulate external storage (provisioning, removal, |
|
166 |
growing, etc.) we need a more generic approach. The generic method for |
|
167 |
supporting external shared storage in Ganeti will be to have an |
|
168 |
ExtStorage provider for each external shared storage hardware type. The |
|
169 |
ExtStorage provider will be a set of files (executable scripts and text |
|
170 |
files), contained inside a directory which will be named after the |
|
171 |
provider. This directory must be present across all nodes of a nodegroup |
|
172 |
(Ganeti doesn't replicate it), in order for the provider to be usable by |
|
173 |
Ganeti for this nodegroup (valid). The external shared storage hardware |
|
174 |
should also be accessible by all nodes of this nodegroup. |
|
175 |
|
|
176 |
An “ExtStorage provider” will have to provide the following methods: |
|
177 |
|
|
178 |
- Create a disk |
|
179 |
- Remove a disk |
|
180 |
- Grow a disk |
|
181 |
- Attach a disk to a given node |
|
182 |
- Detach a disk from a given node |
|
183 |
- Verify its supported parameters |
|
184 |
|
|
185 |
The proposed ExtStorage interface borrows heavily from the OS |
|
186 |
interface and follows a one-script-per-function approach. An ExtStorage |
|
187 |
provider is expected to provide the following scripts: |
|
188 |
|
|
189 |
- ``create`` |
|
190 |
- ``remove`` |
|
191 |
- ``grow`` |
|
192 |
- ``attach`` |
|
193 |
- ``detach`` |
|
194 |
- ``verify`` |
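As a rough illustration of the one-script-per-function layout, the sketch
below checks a provider directory for the six scripts listed above. The
function name ``missing_scripts`` and the constant ``REQUIRED_SCRIPTS`` are
hypothetical and not part of Ganeti; the check itself (a regular file with
an execute bit) is an assumption about what a usable provider would need.

```python
import os

# The six scripts named in this design; the constant name is hypothetical.
REQUIRED_SCRIPTS = ("create", "remove", "grow", "attach", "detach", "verify")


def missing_scripts(provider_dir):
    """Return the required scripts that are absent or not executable."""
    missing = []
    for name in REQUIRED_SCRIPTS:
        path = os.path.join(provider_dir, name)
        # Assumption: a usable script is a regular, executable file.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

An empty result would mean the directory contains every expected script; a
diagnose-style tool could report the returned names otherwise.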
|
195 |
|
|
196 |
All scripts will be called with no arguments and get their input via |
|
197 |
environment variables. A common set of variables will be exported for |
|
198 |
all scripts, and some scripts may receive additional ones. |
|
199 |
|
|
200 |
``VOL_NAME`` |
|
201 |
The name of the volume. The name is unique within Ganeti, which |
|
202 |
uses it to refer to a specific volume inside the external storage. |
|
203 |
``VOL_SIZE`` |
|
204 |
The volume's size in mebibytes. |
|
205 |
``VOL_NEW_SIZE`` |
|
206 |
Available only to the `grow` script. It declares the |
|
207 |
new size of the volume after grow (in mebibytes). |
|
208 |
``EXTP_name`` |
|
209 |
ExtStorage parameter, where `name` is the parameter in |
|
210 |
upper-case (same as OS interface's ``OSP_*`` parameters). |
|
211 |
|
|
212 |
All scripts except `attach` should return 0 on success and non-zero on |
|
213 |
error, accompanied by an appropriate error message on stderr. The |
|
214 |
`attach` script should return a string on stdout on success, which is |
|
215 |
the block device's full path, after it has been successfully attached to |
|
216 |
the host node. On error it should return non-zero. |
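The calling convention described above could be exercised roughly as
follows. This is a minimal sketch, not Ganeti code: the helper name
``run_extstorage_script`` is hypothetical, and treating any non-zero exit
status as a failure for every script (including ``attach``) is an
assumption drawn from the paragraph above.

```python
import os
import subprocess


def run_extstorage_script(provider_dir, action, env):
    """Run one provider script and return its stdout with whitespace stripped.

    Raises RuntimeError carrying the script's stderr on a non-zero exit.
    """
    script = os.path.join(provider_dir, action)
    # Scripts take no arguments; all input goes through the environment.
    result = subprocess.run(
        [script],
        env=dict(os.environ, **env),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        raise RuntimeError("%s failed: %s" % (action, result.stderr.strip()))
    return result.stdout.strip()
```

For ``attach``, the returned string would be the block device path printed
by the script; for the other scripts the output can simply be ignored.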
|
217 |
|
|
218 |
Implementation |
|
219 |
-------------- |
|
220 |
|
|
221 |
To support the ExtStorage interface, we will introduce a new disk |
|
222 |
template called `ext`. This template will implement the existing Ganeti |
|
223 |
disk interface in `lib/bdev.py` (create, remove, attach, assemble, |
|
224 |
shutdown, grow), and will simultaneously pass control to the external |
|
225 |
scripts to actually handle the above actions. The `ext` disk template |
|
226 |
will act as a translation layer between the current Ganeti disk |
|
227 |
interface and the ExtStorage providers. |
|
228 |
|
|
229 |
We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER = |
|
230 |
provider`, which will be used at the command line to select the desired |
|
231 |
ExtStorage provider. This parameter will be valid only for template |
|
232 |
`ext`, e.g.:: |
|
233 |
|
|
234 |
$ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1 |
|
235 |
|
|
236 |
The ExtStorage interface will allow different disks to be created by |
|
237 |
different providers, e.g.:: |
|
238 |
|
|
239 |
$ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1 \ |
|
240 |
--disk=1:size=1G,provider=sample_provider2 \ |
|
241 |
--disk=2:size=3G,provider=sample_provider1 |
|
242 |
|
|
243 |
Finally, the ExtStorage interface will support passing of parameters to |
|
244 |
the ExtStorage provider. This will also be done per disk, from the |
|
245 |
command line:: |
|
246 |
|
|
247 |
$ gnt-instance add -t ext --disk=0:size=1G,provider=sample_provider1,\ |
|
248 |
param1=value1,param2=value2 |
|
249 |
|
|
250 |
The above parameters will be exported to the ExtStorage provider's |
|
251 |
scripts as the following environment variables: |
|
252 |
|
|
253 |
- `EXTP_PARAM1 = str(value1)` |
|
254 |
- `EXTP_PARAM2 = str(value2)` |
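The mapping from per-disk parameters to environment variables might look
like the sketch below. The function name ``build_extstorage_env`` and its
signature are hypothetical; only the variable names (``VOL_NAME``,
``VOL_SIZE``, ``VOL_NEW_SIZE``, ``EXTP_*``) and the upper-casing rule come
from this design.

```python
def build_extstorage_env(vol_name, vol_size, params, new_size=None):
    """Return the environment variables for one provider script call."""
    env = {
        "VOL_NAME": vol_name,
        "VOL_SIZE": str(vol_size),  # size in mebibytes
    }
    if new_size is not None:
        # Only the grow script is expected to receive this variable.
        env["VOL_NEW_SIZE"] = str(new_size)
    for name, value in params.items():
        # Parameter names are upper-cased and prefixed, values stringified.
        env["EXTP_" + name.upper()] = str(value)
    return env


env = build_extstorage_env(
    "vol-0", 1024, {"param1": "value1", "param2": "value2"})
```

Here ``env`` would carry ``EXTP_PARAM1`` and ``EXTP_PARAM2`` alongside the
common volume variables, and no ``VOL_NEW_SIZE`` since no grow was
requested.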
|
255 |
|
|
256 |
We will also introduce a new Ganeti client called `gnt-storage` which |
|
257 |
will be used to diagnose ExtStorage providers and show information about |
|
258 |
them, similarly to the way `gnt-os diagnose` and `gnt-os info` handle OS |
|
259 |
definitions. |
|
260 |
|
|
162 | 261 |
Long-term shared storage goals |
163 | 262 |
============================== |
263 |
|
|
164 | 264 |
Storage pool handling |
165 | 265 |
--------------------- |
166 | 266 |
|
167 | 267 |
A new cluster configuration attribute will be introduced, named |
168 | 268 |
“storage_pools”, modeled as a dictionary mapping storage pools to |
169 |
external storage drivers (see below), e.g.::
|
|
269 |
external storage providers (see below), e.g.::
|
|
170 | 270 |
|
171 | 271 |
{ |
172 | 272 |
"nas1": "foostore", |
... | ... | |
180 | 280 |
storage pools will be performed by implementing new options to the |
181 | 281 |
`gnt-cluster` command:: |
182 | 282 |
|
183 |
gnt-cluster modify --add-pool nas1 foostore |
|
184 |
gnt-cluster modify --remove-pool nas1 # There may be no instances using
|
|
185 |
# the pool to remove it |
|
283 |
$ gnt-cluster modify --add-pool nas1 foostore
|
|
284 |
$ gnt-cluster modify --remove-pool nas1 # There must be no instances using
|
|
285 |
# the pool to remove it
|
|
186 | 286 |
|
187 | 287 |
Furthermore, the storage pools will be used to indicate the availability |
188 | 288 |
of storage pools to different node groups, thus specifying the |
189 | 289 |
instances' “mobility domain”. |
190 | 290 |
|
191 |
New disk templates will also be necessary to facilitate the use of external |
|
192 |
storage. The proposed addition is a whole template namespace created by |
|
193 |
prefixing the pool names with a fixed string, e.g. “ext:”, forming names |
|
194 |
like “ext:nas1”, “ext:foo”. |
|
195 |
|
|
196 |
Interface to the external storage drivers |
|
197 |
----------------------------------------- |
|
198 |
|
|
199 |
In addition to external storage pools, a new interface will be |
|
200 |
introduced to allow external scripts to provision and manipulate shared |
|
201 |
storage. |
|
202 |
|
|
203 |
In order to provide storage provisioning and manipulation (e.g. growing, |
|
204 |
renaming) capabilities, each instance's disk template can possibly be |
|
205 |
associated with an external “storage driver” which, based on the |
|
206 |
instance's configuration and tags, will perform all supported storage |
|
207 |
operations using auxiliary means (e.g. XML-RPC, ssh, etc.). |
|
208 |
|
|
209 |
A “storage driver” will have to provide the following methods: |
|
210 |
|
|
211 |
- Create a disk |
|
212 |
- Remove a disk |
|
213 |
- Rename a disk |
|
214 |
- Resize a disk |
|
215 |
- Attach a disk to a given node |
|
216 |
- Detach a disk from a given node |
|
217 |
|
|
218 |
The proposed storage driver architecture borrows heavily from the OS |
|
219 |
interface and follows a one-script-per-function approach. A storage |
|
220 |
driver is expected to provide the following scripts: |
|
221 |
|
|
222 |
- `create` |
|
223 |
- `resize` |
|
224 |
- `rename` |
|
225 |
- `remove` |
|
226 |
- `attach` |
|
227 |
- `detach` |
|
228 |
|
|
229 |
These executables will be called once for each disk with no arguments |
|
230 |
and all required information will be passed through environment |
|
231 |
variables. The following environment variables will always be present on |
|
232 |
each invocation: |
|
233 |
|
|
234 |
- `INSTANCE_NAME`: The instance's name |
|
235 |
- `INSTANCE_UUID`: The instance's UUID |
|
236 |
- `INSTANCE_TAGS`: The instance's tags |
|
237 |
- `DISK_INDEX`: The current disk index. |
|
238 |
- `LOGICAL_ID`: The disk's logical id (if existing) |
|
239 |
- `POOL`: The storage pool the instance belongs to. |
|
240 |
|
|
241 |
Additional variables may be available in a per-script context (see |
|
242 |
below). |
|
243 |
|
|
244 |
Of particular importance is the disk's logical ID, which will act as |
|
245 |
glue between Ganeti and the external storage drivers; there are two |
|
246 |
possible ways of using a disk's logical ID in a storage driver: |
|
247 |
|
|
248 |
1. Simply use it as a unique identifier (e.g. UUID) and keep a separate, |
|
249 |
external database linking it to the actual storage. |
|
250 |
2. Encode all useful storage information in the logical ID and have the |
|
251 |
driver decode it at runtime. |
|
252 |
|
|
253 |
All scripts should return 0 on success and non-zero on error accompanied by |
|
254 |
an appropriate error message on stderr. Furthermore, the following |
|
255 |
special cases are defined: |
|
256 |
|
|
257 |
1. `create` In case of success, a string representing the disk's logical |
|
258 |
id must be returned on stdout, which will be saved in the instance's |
|
259 |
configuration and can be later used by the other scripts of the same |
|
260 |
storage driver. The logical id may be based on instance name, |
|
261 |
instance uuid and/or disk index. |
|
262 |
|
|
263 |
Additional environment variables present: |
|
264 |
- `DISK_SIZE`: The requested disk size in MiB |
|
265 |
|
|
266 |
2. `resize` In case of success, output the new disk size. |
|
267 |
|
|
268 |
Additional environment variables present: |
|
269 |
- `DISK_SIZE`: The requested disk size in MiB |
|
270 |
|
|
271 |
3. `rename` On success, a new logical id should be returned, which will |
|
272 |
replace the old one. This script is meant to rename the instance's |
|
273 |
backing store and update the disk's logical ID in case one of them is |
|
274 |
bound to the instance name. |
|
291 |
The pool in which to put the new instance's disk will be defined at |
|
292 |
the command line during `instance add`. This will become possible by |
|
293 |
replacing the IDISK_PROVIDER parameter with a new one, called `IDISK_POOL |
|
294 |
= pool`. The cmdlib logic will then look at the cluster-level mapping |
|
295 |
dictionary to determine the ExtStorage provider for the given pool. |
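The lookup described above amounts to a dictionary access on the
cluster-level mapping. The sketch below assumes the ``storage_pools``
dictionary shape shown earlier; the function name ``provider_for_pool``
and the choice of ``ValueError`` for an unknown pool are hypothetical.

```python
def provider_for_pool(storage_pools, pool):
    """Return the ExtStorage provider mapped to the given pool name."""
    try:
        return storage_pools[pool]
    except KeyError:
        # Assumption: an unknown pool is reported as a user-facing error.
        raise ValueError("Unknown storage pool: %r" % pool)


storage_pools = {"nas1": "foostore"}
provider = provider_for_pool(storage_pools, "nas1")
```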
|
275 | 296 |
|
276 |
Additional environment variables present:
|
|
277 |
- `NEW_INSTANCE_NAME`: The instance's new name.
|
|
297 |
gnt-storage
|
|
298 |
-----------
|
|
278 | 299 |
|
300 |
The ``gnt-storage`` client can be extended to support pool management |
|
301 |
(creation/modification/deletion of pools, connection/disconnection of |
|
302 |
pools to nodegroups, etc.). It can also be extended to diagnose and |
|
303 |
provide information for internal disk templates too, such as lvm and |
|
304 |
drbd. |
|
279 | 305 |
|
280 | 306 |
.. vim: set textwidth=72 : |