==================
Network management
==================

.. contents:: :depth: 4

This is a design document detailing the implementation of network resource
management in Ganeti.

Current state and shortcomings
==============================

Currently Ganeti supports two configuration modes for instance NICs:
routed and bridged mode. The ``ip`` NIC parameter, which is mandatory
for routed NICs and optional for bridged ones, holds the given NIC's IP
address and may be filled either manually, or via a DNS lookup for the
instance's hostname.

This approach presents some shortcomings:

a) It relies on external systems to perform network resource
   management. Although large organizations may already have IP pool
   management software in place, this is not usually the case with
   stand-alone deployments. For smaller installations it makes sense to
   allocate a pool of IP addresses to Ganeti and let it transparently
   assign these IPs to instances as appropriate.

b) The NIC network information is incomplete, lacking netmask and
   gateway. Operating system providers could for example use the
   complete network information to fully configure an instance's
   network parameters upon its creation.

   Furthermore, having full network configuration information would
   enable Ganeti nodes to become more self-contained and be able to
   infer system configuration (e.g. /etc/network/interfaces content)
   from Ganeti configuration. This should make configuration of
   newly-added nodes a lot easier and less dependent on external
   tools/procedures.

c) Instance placement must explicitly take network availability in
   different node groups into account; the same ``link`` is implicitly
   expected to connect to the same network across the whole cluster,
   which may not always be the case with large clusters with multiple
   node groups.

Proposed changes
----------------

In order to deal with the above shortcomings, we propose to extend
Ganeti with high-level network management logic, which consists of a new
NIC slot called ``network``, a new ``Network`` configuration object
(cluster level) and logic to perform IP address pool management, i.e.
maintain a set of available and occupied IP addresses.

Configuration changes
+++++++++++++++++++++

We propose the introduction of a new high-level Network object,
containing (at least) the following data:

- Symbolic name
- UUID
- Network in CIDR notation (IPv4 + IPv6)
- Default gateway, if one exists (IPv4 + IPv6)
- IP pool management data (reservations)
- Default NIC connectivity mode (bridged, routed). This is the
  functional equivalent of the current NIC ``mode``.
- Default host interface (e.g. br0). This is the functional equivalent
  of the current NIC ``link``.
- Tags

Each network can be connected to any number of node groups. When a
network is connected to a node group, we define the corresponding
connectivity mode (bridged or routed) and the host interface (e.g.
br100 or routing_table_200). This is achieved by adding a ``networks``
slot to the NodeGroup object and using the networks' UUIDs as keys. The
value for each key is a dictionary containing the network's ``mode`` and
``link`` (netparams). Every NIC assigned to the network will eventually
inherit the network's netparams as its nicparams.

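The relationship between the Network object and the ``networks`` slot of
a node group can be sketched as follows. This is only an illustration of
the data layout described above, not Ganeti's actual object definitions:
the class names, field names, defaults and the ``connect``/``nic_params``
helpers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from uuid import uuid4


@dataclass
class Network:
    """Cluster-level network object (hypothetical field names)."""
    name: str
    network: str                       # IPv4 network in CIDR notation
    gateway: Optional[str] = None
    network6: Optional[str] = None     # IPv6 network in CIDR notation
    gateway6: Optional[str] = None
    mode: str = "bridged"              # default connectivity mode
    link: str = "br0"                  # default host interface
    tags: List[str] = field(default_factory=list)
    uuid: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class NodeGroup:
    """Node group with the proposed ``networks`` slot."""
    name: str
    # Maps network UUID -> netparams ({"mode": ..., "link": ...}).
    networks: Dict[str, Dict[str, str]] = field(default_factory=dict)


def connect(group, net, mode=None, link=None):
    """Connect a network to a node group, recording its netparams."""
    group.networks[net.uuid] = {
        "mode": mode or net.mode,
        "link": link or net.link,
    }


def nic_params(group, net):
    """Return the nicparams a NIC inherits from the network on this group."""
    return dict(group.networks[net.uuid])
```

With this layout the same network can map to ``br100`` in one node group
and to a routing table in another, while NICs only ever refer to the
network itself.
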
IP pool management
++++++++++++++++++

A new helper library is introduced, wrapping around Network objects to
give IP pool management capabilities. A network's pool is defined by two
bitfields, each as long as the network size (one bit per address):

``reservations``
  This field holds all IP addresses reserved by Ganeti instances, as
  well as cluster IP addresses (node addresses + cluster master)

``external reservations``
  This field holds all IP addresses that are manually reserved by the
  administrator, because some other equipment is using them outside the
  scope of Ganeti.

The bitfields are implemented using the python-bitarray package for
space efficiency and their binary value stored base64-encoded for JSON
compatibility. This approach gives relatively compact representations
even for large IPv4 networks (e.g. /20).

Ganeti-owned IP addresses (node + master IPs) are reserved automatically
if the cluster's data network itself is placed under pool management.

Helper ConfigWriter methods provide free IP address generation and
reservation, using a TemporaryReservationManager.

It should be noted that IP pool management is performed only for IPv4
networks, as they are expected to be densely populated. IPv6 networks
can use different approaches, e.g. sequential address assignment or
EUI-64 addresses.

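The two-bitfield pool described above can be sketched as follows. To
keep the example self-contained it uses plain lists of booleans and the
standard ``ipaddress`` and ``base64`` modules instead of
python-bitarray; the ``AddressPool`` class and the ``encode_bits`` and
``decode_bits`` helpers are hypothetical names, and the bit ordering used
for serialization is an assumption.

```python
import base64
import ipaddress


class AddressPool:
    """IPv4 pool backed by two bitfields, one bit per address."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        size = self.network.num_addresses
        self.reservations = [False] * size       # instances, nodes, master
        self.ext_reservations = [False] * size   # reserved by the admin

    def _index(self, address):
        addr = ipaddress.ip_address(address)
        if addr not in self.network:
            raise ValueError("%s is not in %s" % (address, self.network))
        return int(addr) - int(self.network.network_address)

    def is_reserved(self, address):
        idx = self._index(address)
        return self.reservations[idx] or self.ext_reservations[idx]

    def reserve(self, address, external=False):
        idx = self._index(address)
        if self.reservations[idx] or self.ext_reservations[idx]:
            raise ValueError("%s is already reserved" % address)
        bits = self.ext_reservations if external else self.reservations
        bits[idx] = True

    def release(self, address, external=False):
        bits = self.ext_reservations if external else self.reservations
        bits[self._index(address)] = False

    def first_free(self):
        """Return the lowest address that is set in neither bitfield."""
        for idx in range(self.network.num_addresses):
            if not (self.reservations[idx] or self.ext_reservations[idx]):
                return str(self.network.network_address + idx)
        raise ValueError("pool %s is exhausted" % self.network)


def encode_bits(bits):
    """Pack booleans into bytes (first bit is the MSB) and base64-encode."""
    packed = bytearray((len(bits) + 7) // 8)
    for idx, bit in enumerate(bits):
        if bit:
            packed[idx // 8] |= 0x80 >> (idx % 8)
    return base64.b64encode(bytes(packed)).decode("ascii")


def decode_bits(text, size):
    """Inverse of encode_bits for a bitfield of ``size`` bits."""
    packed = base64.b64decode(text)
    return [bool(packed[idx // 8] & (0x80 >> (idx % 8)))
            for idx in range(size)]
```

A /20 network has 4096 addresses, so each bitfield packs into 512 bytes
before base64 encoding, which illustrates why the representation stays
compact.
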
New NIC parameter: network
++++++++++++++++++++++++++

In order to be able to use the new network facility while maintaining
compatibility with the current networking model, a new NIC parameter is
introduced, called ``network`` to reflect the fact that the given NIC
belongs to the given network and its configuration is managed by Ganeti
itself. To keep backwards compatibility, existing code is executed if
the ``network`` value is 'none' or omitted during NIC creation. If we
want our NIC to be assigned to a network, then only the ip (optional)
and the network parameters should be passed. Mode and link are inherited
from the network-nodegroup mapping configuration (netparams). This
provides the desired abstraction between the VM's network and the
node-specific underlying infrastructure.

We also introduce a new ``ip`` address value, ``constants.NIC_IP_POOL``,
that specifies that a given NIC's IP address should be obtained using
the IP address pool of the specified network. This value is only valid
for NICs belonging to a network. A NIC's IP address can also be
specified manually, as long as it is contained in the network the NIC
is connected to.

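The rules above for filling in a NIC's parameters might look roughly
like the following sketch. The literal value given to ``NIC_IP_POOL``,
the ``resolve_nic`` function, the shape of the ``networks`` mapping and
the ``allocate_ip`` callback are all assumptions made for illustration;
they are not taken from the Ganeti code base.

```python
import ipaddress

# Assumed literal; the design only names the constant constants.NIC_IP_POOL.
NIC_IP_POOL = "pool"


def resolve_nic(nic, networks, allocate_ip):
    """Fill in a NIC's parameters according to its ``network`` value.

    ``networks`` is a hypothetical mapping of network name to a dict with
    "cidr", "mode" and "link", as seen from the instance's node group.
    ``allocate_ip`` is a callback returning a free IP for a network name.
    """
    name = nic.get("network")
    if name is None or name == "none":
        # Backwards-compatible path: the NIC keeps its own mode/link/ip.
        if nic.get("ip") == NIC_IP_POOL:
            raise ValueError("pool IPs require a NIC that belongs to a network")
        return dict(nic)

    if name not in networks:
        raise ValueError("network %s is not connected to the node group" % name)
    params = networks[name]

    ip = nic.get("ip")
    if ip == NIC_IP_POOL:
        ip = allocate_ip(name)
    elif ip is not None:
        # A manually specified IP must lie inside the network.
        if ipaddress.ip_address(ip) not in ipaddress.ip_network(params["cidr"]):
            raise ValueError("%s is outside network %s" % (ip, name))

    # Mode and link are inherited from the netparams, never passed by hand.
    return {"network": name, "mode": params["mode"],
            "link": params["link"], "ip": ip}
```
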
Hooks
+++++

Introduce new hooks concerning network operations:

``OP_NETWORK_ADD``
  Add a network to Ganeti

  :directory: network-add
  :pre-execution: master node
  :post-execution: master node

``OP_NETWORK_REMOVE``
  Remove a network from Ganeti

  :directory: network-remove
  :pre-execution: master node
  :post-execution: master node

``OP_NETWORK_SET_PARAMS``
  Modify a network

  :directory: network-modify
  :pre-execution: master node
  :post-execution: master node

For connect/disconnect operations use existing:

``OP_GROUP_SET_PARAMS``
  Modify a nodegroup

  :directory: group-modify
  :pre-execution: master node
  :post-execution: master node

Hook variables
^^^^^^^^^^^^^^

During instance related operations:

``INSTANCE_NICn_NETWORK``
  The friendly name of the network

During network related operations:

``NETWORK_NAME``
  The friendly name of the network

``NETWORK_SUBNET``
  The IP range of the network

``NETWORK_GATEWAY``
  The gateway of the network

During nodegroup related operations:

``GROUP_NETWORK``
  The friendly name of the network

``GROUP_NETWORK_MODE``
  The mode (bridged or routed) of the netparams

``GROUP_NETWORK_LINK``
  The link of the netparams

Backend changes
+++++++++++++++

To keep the hypervisor-visible changes to a minimum, and maintain
compatibility with the existing network configuration scripts, the
instance's hypervisor configuration will have host-level mode and link
replaced by the *connectivity mode* and *host interface* (netparams) of
the given network on the current node group.

Network configuration scripts detect if a NIC is assigned to a Network
by the presence of the new environment variable:

Network configuration script variables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``NETWORK``
  The friendly name of the network

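A network configuration script could test for the new variable along
these lines. This is a minimal sketch: only ``NETWORK`` comes from this
design, while the ``MODE`` and ``LINK`` variable names and the
``describe_nic`` helper are assumptions about the script's environment.

```python
import os


def describe_nic(env=None):
    """Summarize how a NIC is configured, based on the script environment."""
    env = os.environ if env is None else env
    info = {
        "mode": env.get("MODE"),   # assumed variable name
        "link": env.get("LINK"),   # assumed variable name
    }
    network = env.get("NETWORK")
    if network:
        # Variable present: the NIC belongs to a Ganeti-managed network and
        # mode/link were taken from that network's netparams.
        info["managed"] = True
        info["network"] = network
    else:
        # Variable absent: legacy NIC configured directly via mode/link.
        info["managed"] = False
        info["network"] = None
    return info
```
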
Conflicting IPs
+++++++++++++++

To ensure IP uniqueness inside a nodegroup, we introduce the term
``conflicting ips``. Conflicting IPs occur:

(a) when creating a networkless NIC with an IP contained in a network
    already connected to the instance's nodegroup;
(b) when connecting/disconnecting a network to/from a nodegroup while
    instances with IPs inside the network's range still exist.

Conflicting IPs produce prereq errors.

Handling of conflicting IPs with the ``--force`` option:

- For case (a), reserve the IP and assign the NIC to the network.
- For case (b), during connect do the same as in (a); during disconnect,
  release the IP and reset the NIC's ``network`` parameter to None.

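The conflict check for the connect case could be sketched as below. The
NIC dictionaries, the function names and the use of ``ValueError`` in
place of a prereq error are assumptions made for the example; the IP
reservation that accompanies a forced connect is left out.

```python
import ipaddress


def find_conflicts(nics, cidr):
    """Return networkless NICs whose IP falls inside the given network."""
    net = ipaddress.ip_network(cidr)
    conflicts = []
    for nic in nics:
        ip = nic.get("ip")
        if nic.get("network") is None and ip is not None:
            if ipaddress.ip_address(ip) in net:
                conflicts.append(nic)
    return conflicts


def connect_network(nics, name, cidr, force=False):
    """Connect a network to a node group whose instances own ``nics``.

    Without ``force`` any conflicting IP aborts the operation. With
    ``force`` the conflicting NICs are assigned to the network (the
    matching pool reservation is omitted from this sketch).
    """
    conflicts = find_conflicts(nics, cidr)
    if conflicts and not force:
        raise ValueError("conflicting IPs: %s"
                         % ", ".join(nic["ip"] for nic in conflicts))
    for nic in conflicts:
        nic["network"] = name
    return conflicts
```
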
Userland interface
++++++++++++++++++

A new client script is introduced, ``gnt-network``, which handles
network-related configuration in Ganeti.

Network addition/deletion
^^^^^^^^^^^^^^^^^^^^^^^^^
::

  gnt-network add --network=192.168.100.0/28 --gateway=192.168.100.1 \
                  --network6=2001:db8:2ffc::/64 --gateway6=2001:db8:2ffc::1 \
                  --add-reserved-ips=192.168.100.10,192.168.100.11 net100
    (Checks for already existing name and valid IP values)
  gnt-network remove network_name
    (Checks if not connected to any nodegroup)

Network modification
^^^^^^^^^^^^^^^^^^^^
::

  gnt-network modify --gateway=192.168.100.5 net100
    (Changes the gateway only if ip is available)
  gnt-network modify --add-reserved-ips=192.168.100.11 net100
    (Adds externally reserved ip)
  gnt-network modify --remove-reserved-ips=192.168.100.11 net100
    (Removes externally reserved ip)

Assignment to node groups
^^^^^^^^^^^^^^^^^^^^^^^^^
::

  gnt-network connect net100 nodegroup1 bridged br100
    (Checks for existing bridge among nodegroup)
  gnt-network connect net100 nodegroup2 routed rt_table
    (Checks for conflicting IPs)
  gnt-network disconnect net101 nodegroup1
    (Checks for conflicting IPs)

Network listing
^^^^^^^^^^^^^^^
::

  gnt-network list

  Network  Subnet            Gateway        NodeGroups  GroupList
  net100   192.168.100.0/28  192.168.100.1  1           default(bridged, br100)
  net101   192.168.101.0/28  192.168.101.1  1           default(routed, rt_tab)

Network information
^^^^^^^^^^^^^^^^^^^
::

  gnt-network info testnet1

  Network name: testnet1
    subnet: 192.168.100.0/28
    gateway: 192.168.100.1
    size: 16
    free: 10 (62.50%)
    usage map:
      0 XXXXX..........X 63
      (X) used (.) free
    externally reserved IPs:
      192.168.100.0, 192.168.100.1, 192.168.100.15
    connected to node groups:
      default(bridged, br100)
    used by 3 instances:
      test1 : 0:192.168.100.4
      test2 : 0:192.168.100.2
      test3 : 0:192.168.100.3

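The ``size``, ``free`` and ``usage map`` values in the listing above
follow directly from the two reservation bitfields. The sketch below
reproduces them for the example network; the ``usage_summary`` helper is
a hypothetical name, and the bitfields are plain lists of booleans rather
than bitarray objects.

```python
def usage_summary(reservations, ext_reservations):
    """Derive size, free count, free percentage and usage map of a pool."""
    # An address is in use if it is set in either bitfield.
    used = [a or b for a, b in zip(reservations, ext_reservations)]
    size = len(used)
    free = used.count(False)
    usage_map = "".join("X" if bit else "." for bit in used)
    return size, free, 100.0 * free / size, usage_map


# 192.168.100.0/28: .0, .1 and .15 are externally reserved,
# .2, .3 and .4 are held by instances.
external = [idx in (0, 1, 15) for idx in range(16)]
instances = [idx in (2, 3, 4) for idx in range(16)]

size, free, percent, usage_map = usage_summary(instances, external)
print("size: %d" % size)
print("free: %d (%.2f%%)" % (free, percent))
print("usage map: %s" % usage_map)
```
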
IAllocator changes
++++++++++++++++++

The IAllocator protocol can be made network-aware, i.e. also consider
network availability for node group selection. Networks, as well as
future shared storage pools, can be seen as constraints used to rule out
the placement on certain node groups.

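Treating networks as a placement constraint could amount to something
like the following filter. This is only a sketch of the idea: the
``candidate_groups`` function and its input format are assumptions and
do not describe the IAllocator protocol itself.

```python
def candidate_groups(group_networks, required_networks):
    """Return the node groups connected to every required network.

    ``group_networks`` maps a node group name to the networks connected
    to it; ``required_networks`` lists the networks of the instance's NICs.
    """
    required = set(required_networks)
    return [name for name, connected in group_networks.items()
            if required <= set(connected)]
```

Node groups that lack any of the instance's networks are ruled out
before the usual resource-based selection takes place.
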
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: