============================
Storage free space reporting
============================

.. contents:: :depth: 4

Background
==========

Currently, space reporting is broken for all storage types except drbd
and lvm (plain). This design looks at the root causes and proposes a way
to fix it.

Proposed changes
================

The changes below will streamline Ganeti to properly support
interaction with different storage types.

Configuration changes
---------------------

Add a new attribute "enabled_storage_methods" (type: list of strings) to the
cluster config. It holds the enabled storage types, for example "plain",
"drbd", or "ext". We consider the first entry of the list to be the default
method.

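As a minimal sketch of the "first entry is the default" rule, assuming
the attribute is available as a plain Python list (the helper name
``default_storage_method`` is hypothetical and not part of Ganeti):

.. code-block:: python

  def default_storage_method(enabled_storage_methods):
      """Return the default storage method: the first entry of the list."""
      if not enabled_storage_methods:
          # Assumption: an empty list is treated as a configuration error.
          raise ValueError("no storage methods enabled")
      return enabled_storage_methods[0]
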
For file storage, we'll report the storage space on the file storage dir,
which is currently limited to one directory. In the future, if we support
more directories, or per-nodegroup directories, this can be changed.

Note that the abovementioned enabled_storage_methods are just "mechanism"
parameters that define which storage methods the cluster can use. Further
filtering of what's allowed can go in the ipolicy, but these changes are
not covered in this design doc.

Since the ipolicy currently has a list of enabled storage types, we'll
use that to decide which storage type is the default, and to select it
automatically for new instance creation and for reporting.

Enabling/disabling of storage types at ``./configure`` time will
eventually be removed.

RPC changes
-----------

The noded RPC call that reports node storage space will be changed to
accept a list of ``<method>,<key>`` string tuples. For each of them, it will
report the amount of free storage space found on storage ``<key>``, as known
by the requested storage method. Example methods are ``lvm``,
``filesystem``, or ``rados``. The key would be a volume group name in the
case of lvm, a directory name for the filesystem, and a rados pool name
for rados storage.

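The request/response shape described above could look roughly like the
following sketch. The function names, the MiB unit and the use of
``None`` for methods without a handler are assumptions made for
illustration. Only the ``filesystem`` method is filled in, using
``os.statvfs``:

.. code-block:: python

  import os

  def _filesystem_free_mib(path):
      """Free space available to unprivileged users on 'path', in MiB."""
      st = os.statvfs(path)
      return st.f_bavail * st.f_frsize // (1024 * 1024)

  # Hypothetical dispatch table; handlers for lvm or rados are omitted.
  _FREE_SPACE_FN = {
      "filesystem": _filesystem_free_mib,
  }

  def report_free_space(requests):
      """Answer a list of (method, key) tuples with (method, key, free)."""
      results = []
      for method, key in requests:
          fn = _FREE_SPACE_FN.get(method)
          # Assumption: methods without a handler report None.
          free = fn(key) if fn is not None else None
          results.append((method, key, free))
      return results
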
For now, we will implement only the storage reporting for non-shared storage,
that is ``filesystem`` and ``lvm``. For shared storage methods like ``rados``
and ``ext`` we will not implement a free space calculation, because it does
not make sense to query each node for the free space of a commonly used
storage.

Masterd will know (through a constant map) which storage type uses which
method for storage calculation (e.g. ``plain`` and ``drbd`` use ``lvm``,
``file`` uses ``filesystem``) and will query the one needed (or all of the
needed ones).

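Such a constant map could be sketched as follows. The dictionary and
helper names are hypothetical. The entries for ``plain``, ``drbd`` and
``file`` come from the text above, while the ``sharedfile`` entry is an
assumption based on it sharing its method with ``file``:

.. code-block:: python

  # Hypothetical constant map: storage type -> space reporting method.
  STORAGE_TYPE_TO_METHOD = {
      "plain": "lvm",
      "drbd": "lvm",
      "file": "filesystem",
      "sharedfile": "filesystem",  # assumption, see lead-in
  }

  def methods_to_query(storage_types):
      """Return the distinct reporting methods needed for the given types.

      Storage types without an entry in the map are skipped.
      """
      methods = set()
      for storage_type in storage_types:
          method = STORAGE_TYPE_TO_METHOD.get(storage_type)
          if method is not None:
              methods.add(method)
      return sorted(methods)
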
Note that for file and sharedfile the node knows which directories are
allowed and won't allow any other directory to be queried for security
reasons. The actual path still needs to be passed to distinguish the
two, as the method will be the same for both.

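A node-side check of this kind might look like the sketch below. The
function name is hypothetical, and treating subdirectories of an allowed
directory as allowed is an assumption, not something this design
specifies:

.. code-block:: python

  import os

  def is_allowed_dir(path, allowed_dirs):
      """Check whether 'path' is one of, or lies below, the allowed dirs."""
      real = os.path.realpath(path)
      for allowed in allowed_dirs:
          base = os.path.realpath(allowed)
          # commonpath equals 'base' only if 'real' is 'base' or below it.
          if os.path.commonpath([real, base]) == base:
              return True
      return False

Resolving the path first means that a request such as
``<allowed dir>/../../etc`` is rejected rather than matched by prefix.
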
These calculations will be implemented in the node storage system
(currently ``lib/storage.py``) but querying will still happen through the
``node info`` call, to avoid requiring an extra RPC each time.

Ganeti reporting
----------------

``gnt-node list`` can be queried for the different storage methods, if they
are enabled. By default, it will just report information about the default
storage method. Examples::

  > gnt-node list
  Node     DTotal  DFree MTotal  MNode  MFree Pinst Sinst
  mynode1    3.6T   3.6T  64.0G  1023M  62.2G     1     0
  mynode2    3.6T   3.6T  64.0G  1023M  62.0G     2     1
  mynode3    3.6T   3.6T  64.0G  1023M  62.3G     0     2

  > gnt-node list -o dtotal/lvm,dfree/rados
  Node     DTotal (Lvm, myvg) DFree (Rados, myrados)
  mynode1                3.6T                      -
  mynode2                3.6T                      -

Note that for drbd, we only report the space of the vg, and only if it was
not renamed to something different from the default volume group name. With
this design, there is also no possibility to ask about the meta volume group.
We restrict the design here to make the transition to storage pools easier
(as it is an interim state only). It is the administrator's responsibility to
ensure that there is enough space for the meta volume group.

When storage pools are implemented, we will switch from referencing the
storage method to referencing the storage pool name. For that, of course, the
pool names need to be unique across all storage methods. For drbd, we will use
the default 'lvm' storage pool and possibly a second lvm-based storage pool
for the metavg. It will be possible to rename storage pools (thus also the
default lvm storage pool). There will be new functionality to ask which
storage pools are available and of what type.

``gnt-cluster info`` will report which storage methods are enabled, i.e.
which ones are supported according to the cluster configuration. Example
output::

  > gnt-cluster info
  [...]
  Cluster parameters:
    - [...]
    - enabled storage methods: plain (default), drbd, lvm, rados
    - [...]

``gnt-node list-storage`` will not be affected by any changes, since this design
describes only free storage reporting for non-shared storage methods.

Allocator changes
-----------------

The iallocator protocol doesn't need to change: since we know which
storage type an instance has, we'll pass only the "free" value for that
storage type to the iallocator when asking for an allocation to be
made. Note that for DRBD we currently ignore the case where vg and
metavg are different, and we only consider the main VG. Fixing this is
outside the scope of this design.

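The selection of the single "free" value could be sketched like this.
The function name and the shape of both dictionaries are assumptions
made for illustration, not the actual iallocator input format:

.. code-block:: python

  def free_space_for_allocation(disk_template, free_by_method,
                                type_to_method):
      """Pick the one "free" value matching an instance's storage type.

      Returns None if no reporting method is known for the storage type.
      """
      method = type_to_method.get(disk_template)
      if method is None:
          return None
      return free_by_method.get(method)

For a ``drbd`` instance this returns the ``lvm`` value, which matches
the note above that only the main VG is considered.
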
With this design, we ensure forward-compatibility with respect to storage
pools. For now, we'll report space for all available (non-shared) storage
types; in the future, we'll report it for all available storage pools.

Rebalancing changes
-------------------

Hbal already handles this, so we don't foresee any changes being needed
to it.

Space reporting changes
-----------------------

By default, hspace will report space assuming that the allocation
happens on the default storage for the cluster/nodegroup. An option will
be added to manually specify a different storage.

Interactions with Partitioned Ganeti
------------------------------------

The design for :doc:`Partitioned Ganeti <design-partitioned>` also deals
with reporting free space. Partitioned Ganeti has a different way to
report free space for LVM on nodes where the ``exclusive_storage`` flag
is set. That doesn't interact directly with this design, as the specifics
of how the free space is computed are not in the scope of this design.
But the ``node info`` call contains the value of the
``exclusive_storage`` flag, which is currently only meaningful for the
LVM back-end. Additional flags like the ``exclusive_storage`` flag
for lvm might be useful for other storage types as well. We therefore
extend the RPC call from ``<method>,<key>`` to ``<method>,<key>,<params>``
to include any storage-method specific parameters in the RPC call.

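One way to picture the extended tuple is the sketch below. The
``StorageRequest`` type and the ``normalize_request`` helper are
hypothetical, and accepting the old two-element form with empty
parameters is an assumption, not something this design requires:

.. code-block:: python

  from typing import Any, Dict, NamedTuple

  class StorageRequest(NamedTuple):
      method: str             # e.g. "lvm" or "filesystem"
      key: str                # e.g. a volume group or directory name
      params: Dict[str, Any]  # storage-method specific parameters

  def normalize_request(req):
      """Turn a (method, key) or (method, key, params) tuple into a
      StorageRequest, defaulting to empty params."""
      if len(req) == 2:
          method, key = req
          return StorageRequest(method, key, {})
      method, key, params = req
      return StorageRequest(method, key, dict(params))

With this shape, the LVM back-end could receive
``("lvm", "myvg", {"exclusive_storage": True})`` while other methods
simply get an empty parameter dictionary.
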
The reporting of free spindles, also part of Partitioned Ganeti, is not
addressed by this design doc, as spindles are seen as a separate resource.

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: