Revision b284f504
doc/design-storagetypes.rst

==============================================================================
Management of storage types and disk templates, incl. storage space reporting
==============================================================================

.. contents:: :depth: 4

Background
==========

Currently, there is no consistent management of different variants of storage
in Ganeti. One direct consequence is that storage space reporting is currently
broken for all storage that is not based on lvm technology. This design looks
at the root causes and proposes a way to fix it.

Proposed changes
================

We propose to streamline the handling of different storage types and disk
templates. Currently, there is no consistent implementation for dis/enabling
of disk templates and/or storage types.

Our idea is to introduce a list of enabled disk templates, which can be used
by instances in the cluster. Based on this list, we want to provide storage
reporting mechanisms for the available disk templates. Since some disk
templates share the same underlying storage technology (for example ``drbd``
and ``plain`` are based on ``lvm``), we map disk templates to storage types
and implement storage space reporting for each storage type.

Configuration changes
---------------------

Add a new attribute "enabled_disk_templates" (type: list of strings) to the
cluster config which holds the enabled disk templates, for example "drbd",
"file", or "ext". This attribute represents the list of disk templates that
are enabled cluster-wide for usage by the instances. It will not be possible
to create instances with a disk template that is not enabled, nor will it be
possible to remove a disk template from the list while there are still
instances using it.
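
The two checks implied above could look roughly as follows (a minimal sketch;
the function names are illustrative, not the final implementation)::

  def check_instance_creation(enabled_disk_templates, disk_template):
      # Instance creation must fail for a template that is not enabled.
      if disk_template not in enabled_disk_templates:
          raise ValueError("disk template '%s' is not enabled" % disk_template)

  def check_template_removal(new_enabled_templates, instances):
      # A disk template may only be removed from the list if no instance
      # still uses it.
      still_used = set(i.disk_template for i in instances) \
                   - set(new_enabled_templates)
      if still_used:
          raise ValueError("disk templates still in use: %s"
                           % ", ".join(sorted(still_used)))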

The list of enabled disk templates can contain any non-empty subset of
the currently implemented disk templates: ``blockdev``, ``diskless``,
``drbd``, ``ext``, ``file``, ``plain``, ``rbd``, and ``sharedfile``. See
``DISK_TEMPLATES`` in ``constants.py``.
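
For illustration only (the authoritative definition is the one in
``constants.py``), the set could be expressed as::

  DISK_TEMPLATES = frozenset([
    "blockdev", "diskless", "drbd", "ext",
    "file", "plain", "rbd", "sharedfile",
  ])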

Note that the above-mentioned list of enabled disk templates is just a
"mechanism" parameter that defines which disk templates the cluster can use.
Further restrictions on what is allowed can be expressed in the ipolicy,
which is not covered in this design doc. Note that it is possible to force an
instance to use a disk template that is not allowed by the ipolicy; this is
not possible if the template is not enabled by the cluster.

FIXME: In what way should verification between the enabled disk templates in
the cluster and in the ipolicy take place?

We consider the first disk template in the list to be the default template
for instance creation and storage reporting. This will remove the need to
specify the disk template with ``-t`` on instance creation.

Currently, cluster-wide dis/enabling of disk templates is not implemented
consistently. ``lvm``-based disk templates are enabled by specifying a volume
group name on cluster initialization and can only be disabled by explicitly
using the option ``--no-lvm-storage``. This will be replaced by
adding/removing ``drbd`` and ``plain`` from the set of enabled disk
templates.

Until now, file storage and shared file storage could be dis/enabled at
``./configure`` time. This will also be replaced by adding/removing the
respective disk templates from the set of enabled disk templates.

There is currently no way to dis/enable the disk templates ``diskless``,
``blockdev``, ``ext``, and ``rbd``. By introducing the set of enabled disk
templates, we will require these disk templates to be explicitly enabled in
order to be used. The idea is that the administrator of the cluster can
tailor the cluster configuration to what is actually needed in the cluster.
There is hope that this will lead to cleaner code, better performance, and
fewer bugs.

When upgrading the configuration from a version that did not have the list of
enabled disk templates, we have to decide which disk templates are enabled
based on the current configuration of the cluster. We propose the following
update logic to be implemented in the online update of the config in the
``Cluster`` class in ``objects.py`` (see the sketch after this list):

- If a ``volume_group_name`` exists, then enable ``drbd`` and ``plain``.
  (TODO: can we narrow that down further?)
- If ``file`` or ``sharedfile`` was enabled at configure time, add the
  respective disk template to the list of enabled disk templates.
- For the disk templates ``diskless``, ``blockdev``, ``ext``, and ``rbd``, we
  inspect the current cluster configuration regarding whether or not there
  are instances that use one of those disk templates. We will add only those
  that are currently in use.

The order in which the list of enabled disk templates is built up will be
determined by a preference order based on when in the history of Ganeti the
disk templates were introduced (thus being a heuristic for which are used
more than others).
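
A sketch of this upgrade logic, under the assumption that the configure-time
file storage flags and the list of instances are available to the upgrade
code (all names are illustrative)::

  def upgrade_enabled_disk_templates(cluster, instances,
                                     file_enabled, sharedfile_enabled):
      # Skip clusters whose config already carries the new attribute.
      if getattr(cluster, "enabled_disk_templates", None):
          return
      enabled = []
      if cluster.volume_group_name:
          # lvm-based templates; the first entry becomes the default
          enabled.extend(["drbd", "plain"])
      if file_enabled:
          enabled.append("file")
      if sharedfile_enabled:
          enabled.append("sharedfile")
      # Enable the remaining templates only if they are actually in use.
      in_use = set(inst.disk_template for inst in instances)
      for template in ["diskless", "blockdev", "ext", "rbd"]:
          if template in in_use:
              enabled.append(template)
      cluster.enabled_disk_templates = enabled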

The list of enabled disk templates can be specified on cluster initialization
with ``gnt-cluster init`` using the optional parameter
``--enabled-disk-templates``. If it is not set, it will default to a set of
enabled disk templates consisting of ``drbd`` and ``plain``. The list can be
shrunk or extended by ``gnt-cluster modify`` using the same parameter.
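
For example (illustrative invocations; the cluster name is hypothetical)::

  > gnt-cluster init --enabled-disk-templates=drbd,plain mycluster
  > gnt-cluster modify --enabled-disk-templates=drbd,plain,file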

Storage reporting
-----------------

The storage reporting in ``gnt-node list`` will be the first user of the
newly introduced list of enabled disk templates. Currently, storage reporting
works only for lvm-based storage. We want to extend that and report storage
for the enabled disk templates. By default, ``gnt-node list`` will only
report on storage of the default disk template (the first in the list of
enabled disk templates). One can explicitly ask for storage reporting on the
other enabled disk templates with the ``-o`` option.

Some of the currently implemented disk templates share the same base storage
technology. Since the storage reporting is based on the underlying technology
rather than on the user-facing disk templates, we introduce storage types to
represent the underlying technology. There will be a mapping from disk
templates to storage types, which will be used by the storage reporting
backend to pick the right method for estimating the storage for the different
disk templates.

The proposed storage types are ``blockdev``, ``diskless``, ``ext``, ``file``,
``lvm-pv``, ``lvm-vg``, and ``rados``.

The mapping from disk templates to storage types will be: ``drbd`` and
``plain`` to ``lvm-vg``, ``file`` and ``sharedfile`` to ``file``, and all
others to their obvious counterparts.
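
Expressed as a mapping (a sketch following the rules above; the constant name
is illustrative)::

  DISK_TEMPLATE_TO_STORAGE_TYPE = {
    "drbd": "lvm-vg",
    "plain": "lvm-vg",
    "file": "file",
    "sharedfile": "file",
    "blockdev": "blockdev",
    "diskless": "diskless",
    "ext": "ext",
    "rbd": "rados",
  }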

Note that there is no disk template mapping to ``lvm-pv``, because this
storage type is currently only used to enable the user to mark it as
(un)allocatable (see ``man gnt-node``). It is not possible to create an
instance directly on a storage unit of type ``lvm-pv``, therefore it is not
included in the mapping.

The storage reporting for file storage will report space on the file storage
directory, which is currently limited to one directory. In the future, once
we support more directories, or per-nodegroup directories, this can be
changed.

For now, we will implement storage reporting only for non-shared storage,
that is, the disk templates ``file``, ``plain``, and ``drbd``. For the disk
template ``diskless``, there is obviously nothing to report. When
implementing storage reporting for ``file``, we can also use it for
``sharedfile``, since it uses the same file system mechanisms to determine
the free space. In the future, we can optimize storage reporting for shared
storage by not querying all nodes that use a common shared file system for
the same space information.

In the future, we will extend storage reporting to shared storage types like
``rados`` and ``ext``. Note that it will not make sense to query each node
for storage reporting on a storage unit that is used by several nodes.

We will not implement storage reporting for the ``blockdev`` disk template,
because block devices are always adopted after being provided by the system
administrator, thus coming from outside Ganeti. There is no point in storage
reporting for block devices, because Ganeti will never try to allocate
storage inside a block device.

RPC changes
-----------

The noded RPC call that reports node storage space will be changed to accept
a list of <disktemplate>,<key> string tuples. For each of them, it will
report the free amount of storage space found on storage <key> as known by
the requested disk template. Depending on the disk template, the key would be
a volume group name for lvm-based disk templates, a directory name for file
and shared file storage, and a rados pool name for rados storage.
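
The request and reply could be shaped roughly like this (a sketch; the exact
encoding is not fixed by this design, and the volume group, directory, and
pool names are merely examples)::

  request = [
    ("drbd", "xenvg"),                     # key: volume group name
    ("file", "/srv/ganeti/file-storage"),  # key: storage directory
    ("rbd", "rbd"),                        # key: rados pool name
  ]

  # free space in MiB per requested storage unit (assumed unit)
  reply = {
    ("drbd", "xenvg"): 102400,
    ("file", "/srv/ganeti/file-storage"): 51200,
    ("rbd", "rbd"): 204800,
  }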

Masterd will know, through the mapping of disk templates to storage types,
which storage type uses which mechanism for storage calculation, and will
invoke only the needed ones.

Note that for file and sharedfile storage, the node knows which directories
are allowed and, for security reasons, won't allow any other directory to be
queried. The actual path still needs to be passed to distinguish the two, as
the storage type will be the same for both.

These calculations will be implemented in the node storage system (currently
``lib/storage.py``), but querying will still happen through the ``node info``
call, to avoid requiring an extra RPC each time.
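
For the file-based storage types, the free-space computation boils down to a
file system query on the configured directory; a minimal sketch (the function
name is illustrative)::

  import os

  def get_file_storage_space(path):
      """Return (free, total) space in MiB of the file system behind path."""
      st = os.statvfs(path)
      mib = 1024 * 1024
      free = (st.f_bavail * st.f_frsize) // mib
      total = (st.f_blocks * st.f_frsize) // mib
      return free, total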

Ganeti reporting
----------------

``gnt-node list`` can be queried for the different disk templates, if they
are enabled. By default, it will just report information about the default
disk template. Examples::

  > gnt-node list
  Node    DTotal DFree MTotal MNode MFree Pinst Sinst
  mynode1   3.6T  3.6T  64.0G 1023M 62.2G     1     0
  mynode2   3.6T  3.6T  64.0G 1023M 62.0G     2     1
  mynode3   3.6T  3.6T  64.0G 1023M 62.3G     0     2

  > gnt-node list -o dtotal/drbd,dfree/file
  Node    DTotal (drbd, myvg) DFree (file, mydir)
  mynode1               3.6T                    -
  mynode2               3.6T                    -

Note that for drbd, we only report the space of the vg, and only if it was
not renamed to something different from the default volume group name. With
this design, there is also no possibility to ask about the meta volume group.
We restrict the design here to make the transition to storage pools easier
(as it is an interim state only). It is the administrator's responsibility to
ensure that there is enough space for the meta volume group.

When storage pools are implemented, we will switch from referencing the disk
template to referencing the storage pool name. For that, the pool names need
to be unique across all storage types. For drbd, we will use the default
'drbd' storage pool and possibly a second lvm-based storage pool for the
metavg. It will be possible to rename storage pools (thus also the default
lvm storage pool). There will be new functionality to ask which storage pools
are available and of what type. Storage pools will have a storage pool type,
which is one of the disk templates. There can be more than one storage pool
based on the same disk template, therefore we will then start referencing the
storage pool name instead of the disk template.

``gnt-cluster info`` will report which disk templates are enabled, i.e.
which ones are supported according to the cluster configuration. Example
output::

  > gnt-cluster info
  [...]
  Cluster parameters:
    - [...]
    - enabled disk templates: plain, drbd, sharedfile, rados
    - [...]

``gnt-node list-storage`` will not be affected by any changes, since this
design is restricted only to free storage reporting for non-shared storage
types.

Allocator changes
-----------------

The iallocator protocol doesn't need to change: since we know which disk
template an instance has, we'll pass only the "free" value for that disk
template to the iallocator when asking for an allocation to be made. Note
that for DRBD we currently ignore the case where vg and metavg are different
and only consider the main volume group. Fixing this is outside the scope of
this design.
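
Picking the right value is then a simple lookup through the disk template to
storage type mapping (a sketch; the per-node report format is assumed to be
the one produced by the backend described above)::

  def free_space_for_template(space_report, disk_template,
                              template_to_storage_type):
      # space_report: storage type -> free space, as reported per node
      return space_report[template_to_storage_type[disk_template]]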

With this design, we ensure forward compatibility with respect to storage
pools. For now, we'll report space for all available disk templates that are
based on non-shared storage types; in the future, for all available storage
pools.

Rebalancing changes
-------------------

Hbal will not need changes, as it already handles this; we do not foresee any
changes to it.

Space reporting changes
-----------------------

By default, hspace will report space assuming that allocation happens on the
default disk template of the cluster/nodegroup. An option will be added to
manually specify a different disk template.

Interactions with Partitioned Ganeti
------------------------------------

The design for :doc:`Partitioned Ganeti <design-partitioned>` also deals with
reporting free space. Partitioned Ganeti has a different way to report free
space for LVM on nodes where the ``exclusive_storage`` flag is set. That
doesn't interact directly with this design, as the specifics of how the free
space is computed are not in the scope of this design. But the ``node info``
call contains the value of the ``exclusive_storage`` flag, which is currently
only meaningful for the LVM storage type. Additional flags like the
``exclusive_storage`` flag for lvm might be useful for other disk templates /
storage types as well. We therefore extend the RPC call from
<disktemplate>,<key> to <disktemplate>,<key>,<params>, to include any
disk-template-specific (or storage-type-specific) parameters in the RPC call,
as sketched below.
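
For illustration, an extended request could then look like this (the names
and values are merely examples)::

  request = [
    ("plain", "xenvg", {"exclusive_storage": True}),
    ("file", "/srv/ganeti/file-storage", {}),
  ]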

The reporting of free spindles, also part of Partitioned Ganeti, is not
covered by this design doc, as those are seen as a separate resource.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: