============================
Storage free space reporting
============================

    
.. contents:: :depth: 4

    
Background
==========

    
Currently, space reporting is broken for all storage types except drbd and
lvm (plain). This design looks at the root causes and proposes a way to
fix it.

    
Proposed changes
================

    
The changes below will streamline Ganeti to properly support
interaction with different storage types.

    
Configuration changes
---------------------

    
Add a new attribute "enabled_storage_methods" (type: list of strings) to the
cluster config which holds the enabled storage types, for example, "plain",
"drbd", or "ext". We consider the first entry in the list to be the default
method.
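
As a minimal sketch of the default selection described above, assuming
the attribute is available as a plain Python list (the ``cluster_config``
dictionary and the helper name are hypothetical and not existing Ganeti
code):

.. code-block:: python

    def default_storage_method(enabled_storage_methods):
        """Return the default storage method: the first list entry."""
        if not enabled_storage_methods:
            raise ValueError("enabled_storage_methods must not be empty")
        return enabled_storage_methods[0]


    # Hypothetical cluster configuration fragment, for illustration only.
    cluster_config = {"enabled_storage_methods": ["plain", "drbd", "ext"]}
    default = default_storage_method(
        cluster_config["enabled_storage_methods"])

Rejecting an empty list here is a design choice of the sketch; the text
above does not specify how an empty list would be handled.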

    
For file storage, we'll report the storage space on the file storage dir,
which is currently limited to one directory. If support for more
directories, or for per-nodegroup directories, is added in the future, this
can be changed.

    
Note that the above-mentioned enabled_storage_methods are just "mechanism"
parameters that define which storage methods the cluster can use. Further
filtering of what is allowed can go in the ipolicy, but these changes are
not covered in this design doc.

    
Since the ipolicy currently has a list of enabled storage types, we'll
use that to decide which storage type is the default, and to select it
automatically for new instance creations and for reporting.

    
Enabling/disabling of storage types at ``./configure`` time will
eventually be removed.

    
RPC changes
-----------

    
The noded RPC call that reports node storage space will be changed to
accept a list of <method>,<key> string tuples. For each of them, it will
report the amount of free storage space found on storage <key> as known
by the requested storage type method. Example methods are ``lvm``,
``filesystem``, or ``rados``; the key would be a volume group name in
the case of lvm, a directory name for the filesystem, and a rados pool name
for rados storage.
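
The request handling described above could be sketched as follows. This
is only an illustration under stated assumptions: the function names and
the dispatch table are hypothetical, the ``lvm`` reporter is a
placeholder, and unsupported methods simply yield ``None``:

.. code-block:: python

    import shutil


    def _free_filesystem(path):
        """Return the free bytes on the filesystem that holds ``path``."""
        return shutil.disk_usage(path).free


    def _free_lvm(vg_name):
        """Placeholder: a real reporter would query the volume group."""
        raise NotImplementedError("LVM reporting is not part of this sketch")


    # Hypothetical dispatch table from method name to reporter function.
    _REPORTERS = {"filesystem": _free_filesystem, "lvm": _free_lvm}


    def report_free_space(requests):
        """Answer a list of (method, key) tuples.

        Returns (method, key, free) tuples; ``free`` is None when the
        method is unknown or the value cannot be determined.
        """
        results = []
        for method, key in requests:
            reporter = _REPORTERS.get(method)
            free = None
            if reporter is not None:
                try:
                    free = reporter(key)
                except (NotImplementedError, OSError):
                    free = None
            results.append((method, key, free))
        return results

For example, ``report_free_space([("filesystem", "."), ("rados",
"mypool")])`` returns a byte count for the current directory and ``None``
for the rados pool.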

    
For now, we will implement storage reporting only for non-shared storage,
that is, ``filesystem`` and ``lvm``. For shared storage methods like ``rados``
and ``ext``, we will not implement a free space calculation, because it does
not make sense to query each node for the free space of a commonly used
storage.

    
Masterd will know (through a constant map) which storage type uses which
method for storage calculation (e.g. ``plain`` and ``drbd`` use ``lvm``,
``file`` uses ``filesystem``, etc.) and query the one needed (or all of the
needed ones).
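
A minimal sketch of such a constant map, listing only the entries named
above (the constant and function names are hypothetical):

.. code-block:: python

    # Hypothetical constant map from storage type to reporting method.
    # Only the pairs named in the text are listed here.
    STORAGE_TYPE_TO_METHOD = {
        "plain": "lvm",
        "drbd": "lvm",
        "file": "filesystem",
    }


    def methods_to_query(storage_types):
        """Return the distinct reporting methods, in first-seen order.

        Storage types without an entry in the map are skipped.
        """
        methods = []
        for storage_type in storage_types:
            method = STORAGE_TYPE_TO_METHOD.get(storage_type)
            if method is not None and method not in methods:
                methods.append(method)
        return methods

With this map, ``plain`` and ``drbd`` collapse into a single ``lvm``
query, so each method is queried at most once per node.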

    
Note that for file and sharedfile the node knows which directories are
allowed and, for security reasons, won't allow any other directory to be
queried. The actual path still needs to be passed to distinguish the
two, as the method will be the same for both.
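
The directory restriction could be enforced along these lines. This is a
sketch only: the helper name and the example paths are hypothetical, and
it assumes that a query must name an allowed directory exactly, after
path normalization:

.. code-block:: python

    import os.path


    def is_allowed_storage_dir(path, allowed_dirs):
        """Return True if ``path`` names one of ``allowed_dirs``.

        Both sides are normalized with realpath, so "..", redundant
        slashes and symlinks cannot be used to reach another directory.
        """
        allowed = {os.path.realpath(d) for d in allowed_dirs}
        return os.path.realpath(path) in allowed


    # Hypothetical directories for file and sharedfile storage.
    ALLOWED = ["/srv/ganeti/file-storage", "/srv/ganeti/shared-file-storage"]

A request for ``/srv/ganeti/file-storage/../../../etc`` is normalized to
``/etc`` and rejected, while either of the two listed directories is
accepted.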

    
These calculations will be implemented in the node storage system
(currently lib/storage.py) but querying will still happen through the
``node info`` call, to avoid requiring an extra RPC each time.

    
Ganeti reporting
----------------

    
``gnt-node list`` can be queried for the different storage methods, if they
are enabled. By default, it will just report information about the default
storage method. Examples::

    
  > gnt-node list
  Node                       DTotal DFree MTotal MNode MFree Pinst Sinst
  mynode1                      3.6T  3.6T  64.0G 1023M 62.2G     1     0
  mynode2                      3.6T  3.6T  64.0G 1023M 62.0G     2     1
  mynode3                      3.6T  3.6T  64.0G 1023M 62.3G     0     2

    
  > gnt-node list -o dtotal/lvm,dfree/rados
  Node      DTotal (Lvm, myvg) DFree (Rados, myrados)
  mynode1                 3.6T                      -
  mynode2                 3.6T                      -

    
Note that for drbd, we only report the space of the vg, and only if it was
not renamed to something different from the default volume group name. With
this design, there is also no possibility to ask about the meta volume group.
We restrict the design here to make the transition to storage pools easier
(as it is an interim state only). It is the administrator's responsibility to
ensure that there is enough space for the meta volume group.

    
When storage pools are implemented, we will switch from referencing the
storage method to referencing the storage pool name. For that, of course, the
pool names need to be unique across all storage methods. For drbd, we will use
the default 'lvm' storage pool and possibly a second lvm-based storage pool
for the metavg. It will be possible to rename storage pools (thus also the
default lvm storage pool). There will be new functionality to ask about what
storage pools are available and of what type.

    
``gnt-cluster info`` will report which storage methods are enabled, i.e.
which ones are supported according to the cluster configuration. Example
output::

    
  > gnt-cluster info
  [...]
  Cluster parameters:
    - [...]
    - enabled storage methods: plain (default), drbd, lvm, rados
    - [...]

    
``gnt-node list-storage`` will not be affected by any changes, since this design
describes only free storage reporting for non-shared storage methods.

    
Allocator changes
-----------------

    
The iallocator protocol doesn't need to change: since we know which
storage type an instance has, we'll pass only the "free" value for that
storage type to the iallocator when asking for an allocation to be
made. Note that for DRBD we currently ignore the case when vg and metavg
are different, and we only consider the main VG. Fixing this is outside
the scope of this design.

    
With this design, we ensure forward-compatibility with respect to storage
pools. For now, we'll report space for all available (non-shared) storage
types; in the future, for all available storage pools.

    
Rebalancing changes
-------------------

    
Hbal already handles this, so we do not foresee any changes being
needed to it.

    
Space reporting changes
-----------------------

    
By default, hspace will report on the assumption that the allocation
happens on the default storage for the cluster/nodegroup. An option will
be added to manually specify a different storage.

    
Interactions with Partitioned Ganeti
------------------------------------

    
The design for :doc:`Partitioned Ganeti <design-partitioned>` also deals
with reporting free space. Partitioned Ganeti has a different way to
report free space for LVM on nodes where the ``exclusive_storage`` flag
is set. That doesn't interact directly with this design, as the specifics
of how the free space is computed are not in the scope of this design.
But the ``node info`` call contains the value of the
``exclusive_storage`` flag, which is currently only meaningful for the
LVM back-end. Additional flags like the ``exclusive_storage`` flag
for lvm might be useful for other storage types as well. We therefore
extend the RPC call with <method>,<key> to <method>,<key>,<params> to
include any storage-method specific parameters in the RPC call.
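
One way to picture the extended request format is shown below. This is a
sketch, not the actual RPC encoding: the ``StorageRequest`` record, the
helper name and the example directory are hypothetical, and only the
``exclusive_storage`` flag comes from the text above:

.. code-block:: python

    from collections import namedtuple

    # Hypothetical record for one <method>,<key>,<params> entry.
    StorageRequest = namedtuple("StorageRequest",
                                ["method", "key", "params"])


    def make_requests(pairs, params_by_method):
        """Extend (method, key) pairs to (method, key, params) triples.

        Methods without specific parameters get an empty dictionary.
        """
        return [
            StorageRequest(method, key,
                           dict(params_by_method.get(method, {})))
            for method, key in pairs
        ]


    requests = make_requests(
        [("lvm", "myvg"), ("filesystem", "/srv/ganeti/file-storage")],
        {"lvm": {"exclusive_storage": True}},
    )

Keeping the parameters in a per-method dictionary lets other storage
methods add their own flags later without changing the shape of the
call.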

    
The reporting of free spindles, also part of Partitioned Ganeti, is not
addressed by this design doc, as spindles are seen as a separate resource.

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: