Revision eec83a95

b/Makefile.am
285 285
	doc/design-impexp2.rst \
286 286
	doc/design-lu-generated-jobs.rst \
287 287
	doc/design-multi-reloc.rst \
288
	doc/design-network.rst \
288 289
	doc/cluster-merge.rst \
289 290
	doc/design-shared-storage.rst \
290 291
	doc/devnotes.rst \
b/doc/design-network.rst
1
==================
2
Network management
3
==================
4

  
5
.. contents:: :depth: 4
6

  
7
This is a design document detailing the implementation of network resource
8
management in Ganeti.
9

  
10
Current state and shortcomings
11
==============================
12

  
13
Currently Ganeti supports two configuration modes for instance NICs:
14
routed and bridged mode. The ``ip`` NIC parameter, which is mandatory
15
for routed NICs and optional for bridged ones, holds the given NIC's IP
16
address and may be filled either manually, or via a DNS lookup for the
17
instance's hostname.
18

  
19
This approach presents some shortcomings:
20

  
21
a) It relies on external systems to perform network resource
22
   management. Although large organizations may already have IP pool
23
   management software in place, this is not usually the case with
24
   stand-alone deployments. For smaller installations it makes sense to
25
   allocate a pool of IP addresses to Ganeti and let it transparently
26
   assign these IPs to instances as appropriate.
27

  
28
b) The NIC network information is incomplete, lacking netmask and
29
   gateway.  Operating system providers could for example use the
30
   complete network information to fully configure an instance's
31
   network parameters upon its creation.
32

  
33
   Furthermore, having full network configuration information would
34
   enable Ganeti nodes to become more self-contained and be able to
35
   infer system configuration (e.g. /etc/network/interfaces content)
36
   from Ganeti configuration. This should make configuration of
37
   newly-added nodes a lot easier and less dependent on external
38
   tools/procedures.
39

  
40
c) Instance placement must explicitly take network availability in
41
   different node groups into account; the same ``link`` is implicitly
42
   expected to connect to the same network across the whole cluster,
43
   which may not always be the case with large clusters with multiple
44
   node groups.
45

  
46

  
47
Proposed changes
48
----------------
49

  
50
In order to deal with the above shortcomings, we propose to extend
51
Ganeti with high-level network management logic, which consists of a new
52
NIC mode called ``managed``, a new "Network" configuration object and
53
logic to perform IP address pool management, i.e. maintain a set of
54
available and occupied IP addresses.
55

  
56
Configuration changes
57
+++++++++++++++++++++
58

  
59
We propose the introduction of a new high-level Network object,
60
containing (at least) the following data:
61

  
62
- Symbolic name
63
- UUID
64
- Network in CIDR notation (IPv4 + IPv6)
65
- Default gateway, if one exists (IPv4 + IPv6)
66
- IP pool management data (reservations)
67
- Default NIC connectivity mode (bridged, routed). This is the
68
  functional equivalent of the current NIC ``mode``.
69
- Default host interface (e.g. br0). This is the functional equivalent
70
  of the current NIC ``link``.
71
- Tags
72

  
73
Each network will be connected to any number of node groups, possibly
74
overriding connectivity mode and host interface for each node group.
75
This is achieved by adding a ``networks`` slot to the NodeGroup object
76
and using the networks' UUIDs as keys.
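
As an illustration only, the Network object and the per-group ``networks`` slot could be modelled roughly as follows. The class layout, the field names and the ``effective_connectivity`` helper are assumptions made for this sketch, not the actual Ganeti objects; the only points taken from the design are that a node group keys its connections by network UUID and may override the connectivity mode and host interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
from uuid import uuid4


@dataclass
class Network:
    """Hypothetical shape of the proposed Network object."""
    name: str
    network: str                      # IPv4 network in CIDR notation
    gateway: Optional[str] = None
    network6: Optional[str] = None    # IPv6 network in CIDR notation
    gateway6: Optional[str] = None
    mode: str = "bridged"             # default NIC connectivity mode
    link: str = "br0"                 # default host interface
    tags: List[str] = field(default_factory=list)
    uuid: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class NodeGroup:
    """Hypothetical node group with the proposed ``networks`` slot."""
    name: str
    # network UUID -> per-group overrides (optional "mode" and "link" keys)
    networks: Dict[str, Dict[str, str]] = field(default_factory=dict)


def effective_connectivity(group: NodeGroup,
                           net: Network) -> Optional[Tuple[str, str]]:
    """Return (mode, link) for net on group, or None if not connected."""
    overrides = group.networks.get(net.uuid)
    if overrides is None:
        return None
    # Fall back to the network-wide defaults when the group has no override.
    return (overrides.get("mode", net.mode), overrides.get("link", net.link))
```

A group connected without overrides inherits the network defaults, while a group connected with ``{"link": "br1"}`` keeps the default mode and only replaces the host interface.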
77

  
78
IP pool management
79
++++++++++++++++++
80

  
81
A new helper library is introduced, wrapping around Network objects to
82
give IP pool management capabilities. A network's pool is defined by two
83
bitfields, each containing one bit per address in the network:
84

  
85
``reservations``
86
  This field holds all IP addresses reserved by Ganeti instances, as
87
  well as cluster IP addresses (node addresses + cluster master)
88

  
89
``external reservations``
90
  This field holds all IP addresses that are manually reserved by the
91
  administrator, because some other equipment is using them outside the
92
  scope of Ganeti.
93

  
94
The bitfields are implemented using the python-bitarray package for
95
space efficiency and their binary value stored base64-encoded for JSON
96
compatibility. This approach gives relatively compact representations
97
even for large IPv4 networks (e.g. /20).
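
The storage scheme described above can be sketched as follows. To keep the example dependency-free it uses a plain ``bytearray`` from the standard library instead of python-bitarray, so it only illustrates the idea (one bit per address, base64 for JSON storage); the ``AddressPool`` class and its method names are hypothetical.

```python
import base64
import ipaddress


class AddressPool:
    """One bit per address in the network; a set bit means 'reserved'."""

    def __init__(self, cidr, encoded=None):
        self.network = ipaddress.ip_network(cidr)
        nbytes = (self.network.num_addresses + 7) // 8
        if encoded is None:
            self.bits = bytearray(nbytes)          # all addresses free
        else:
            self.bits = bytearray(base64.b64decode(encoded))
        if len(self.bits) != nbytes:
            raise ValueError("bitfield does not match network size")

    def _index(self, ip):
        addr = ipaddress.ip_address(ip)
        if addr not in self.network:
            raise ValueError("%s is not in %s" % (ip, self.network))
        # Offset of the address from the start of the network.
        return int(addr) - int(self.network.network_address)

    def reserve(self, ip):
        idx = self._index(ip)
        self.bits[idx // 8] |= 1 << (idx % 8)

    def release(self, ip):
        idx = self._index(ip)
        self.bits[idx // 8] &= ~(1 << (idx % 8)) & 0xFF

    def is_reserved(self, ip):
        idx = self._index(ip)
        return bool(self.bits[idx // 8] & (1 << (idx % 8)))

    def serialize(self):
        """Return the bitfield as a base64 string suitable for JSON."""
        return base64.b64encode(bytes(self.bits)).decode("ascii")
```

With this layout a /20 network needs 4096 bits, i.e. 512 bytes, which encode to 684 base64 characters.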
98

  
99
Ganeti-owned IP addresses (node + master IPs) are reserved automatically
100
if the cluster's data network itself is placed under pool management.
101

  
102
Helper ConfigWriter methods provide free IP address generation and
103
reservation, using a TemporaryReservationManager.
104

  
105
It should be noted that IP pool management is performed only for IPv4
106
networks, as they are expected to be densely populated. IPv6 networks
107
can use different approaches, e.g. sequential address assignment or
108
EUI-64 addresses.
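
For reference, the EUI-64 approach derives the interface identifier from the NIC's MAC address, so no pool is needed. The sketch below follows the standard modified EUI-64 construction (flip the universal/local bit, insert ``ff:fe`` in the middle); the function name is made up for this example and nothing here is prescribed by the design.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Build an IPv6 address from a /64 prefix and a 48-bit MAC address."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("EUI-64 addresses require a /64 prefix")
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a MAC address with six octets")
    octets[0] ^= 0x02                              # flip universal/local bit
    iid = octets[:3] + [0xFF, 0xFE] + octets[3:]   # 64-bit interface id
    return net.network_address + int.from_bytes(bytes(iid), "big")
```

For example, the prefix ``2001:db8:2ffc::/64`` and the MAC ``aa:00:00:12:34:56`` give the interface identifier ``a800:00ff:fe12:3456``.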
109

  
110
Managed NIC mode
111
++++++++++++++++
112

  
113
In order to be able to use the new network facility while maintaining
114
compatibility with the current networking model, a new network mode is
115
introduced, called ``managed`` to reflect the fact that the given NIC's
116
network configuration is managed by Ganeti itself. A managed mode NIC
117
accepts the network it is connected to in its ``link`` argument.
118
Userspace tools can refer to networks using their symbolic names;
119
internally, however, the link argument stores the network's UUID.
120

  
121
We also introduce a new ``ip`` address value, ``constants.NIC_IP_POOL``,
122
that specifies that a given NIC's IP address should be obtained using
123
the IP address pool of the specified network. This value is only valid
124
for managed-mode NICs, where it is also used as a default instead of
125
``constants.VALUE_AUTO``. A managed-mode NIC's IP address can also be
126
specified manually, as long as it is compatible with the network the NIC
127
is connected to.
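
A minimal sketch of how the ``ip`` value of a managed-mode NIC might be resolved is shown below. The string value given to ``NIC_IP_POOL``, the function name, and the choice to hand out the lowest free host address are all assumptions for illustration; the design only states that the pool value triggers allocation and that a manual address must be compatible with the network.

```python
import ipaddress

# Hypothetical value; the design only names the constant.
NIC_IP_POOL = "pool"


def resolve_managed_nic_ip(requested, cidr, reserved):
    """Return the IPv4 address a managed-mode NIC should use.

    requested: NIC_IP_POOL or a literal address
    cidr:      the network in CIDR notation
    reserved:  set of addresses (as strings) that are already taken
    """
    net = ipaddress.ip_network(cidr)
    if requested == NIC_IP_POOL:
        # Assumption: pick the lowest free host address.
        for host in net.hosts():
            if str(host) not in reserved:
                return str(host)
        raise ValueError("no free address left in %s" % net)
    addr = ipaddress.ip_address(requested)
    if addr not in net:
        raise ValueError("%s is not part of %s" % (addr, net))
    if str(addr) in reserved:
        raise ValueError("%s is already reserved" % addr)
    return str(addr)
```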
128

  
129

  
130
Hooks
131
+++++
132

  
133
``OP_NETWORK_ADD``
134
  Add a network to Ganeti
135

  
136
  :directory: network-add
137
  :pre-execution: master node
138
  :post-execution: master node
139

  
140
``OP_NETWORK_CONNECT``
141
  Connect a network to a node group. This hook can be used to e.g.
142
  configure network interfaces on the group's nodes.
143

  
144
  :directory: network-connect
145
  :pre-execution: master node, all nodes in the connected group
146
  :post-execution: master node, all nodes in the connected group
147

  
148
``OP_NETWORK_DISCONNECT``
149
  Disconnect a network from a node group. This hook can be used to e.g.
150
  deconfigure network interfaces on the group's nodes.
151

  
152
  :directory: network-disconnect
153
  :pre-execution: master node, all nodes in the connected group
154
  :post-execution: master node, all nodes in the connected group
155

  
156
``OP_NETWORK_REMOVE``
157
  Remove a network from Ganeti
158

  
159
  :directory: network-remove
160
  :pre-execution: master node, all nodes
161
  :post-execution: master node, all nodes
162

  
163
Hook variables
164
^^^^^^^^^^^^^^
165

  
166
``INSTANCE_NICn_MANAGED``
167
  Non-zero if NIC n is a managed-mode NIC
168

  
169
``INSTANCE_NICn_NETWORK``
170
  The friendly name of the network
171

  
172
``INSTANCE_NICn_NETWORK_UUID``
173
  The network's UUID
174

  
175
``INSTANCE_NICn_NETWORK_TAGS``
176
  The network's tags
177

  
178
``INSTANCE_NICn_NETWORK_IPV4_CIDR``, ``INSTANCE_NICn_NETWORK_IPV6_CIDR``
179
  The subnet in CIDR notation
180

  
181
``INSTANCE_NICn_NETWORK_IPV4_GATEWAY``, ``INSTANCE_NICn_NETWORK_IPV6_GATEWAY``
182
  The subnet's default gateway
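
To make the naming scheme concrete, the following sketch builds the per-NIC hook variables listed above from a network description. The input dictionary layout, the use of ``"1"``/``"0"`` for the managed flag and the space-separated tag list are assumptions; only the variable names come from this document.

```python
def nic_hook_env(index, net):
    """Return the hook variables for NIC number ``index``.

    ``net`` is None for a non-managed NIC, otherwise a dict with the
    (hypothetical) keys name, uuid, tags, ipv4_cidr, ipv4_gateway,
    ipv6_cidr and ipv6_gateway.
    """
    prefix = "INSTANCE_NIC%d_" % index
    if net is None:
        return {prefix + "MANAGED": "0"}
    env = {
        prefix + "MANAGED": "1",
        prefix + "NETWORK": net["name"],
        prefix + "NETWORK_UUID": net["uuid"],
        prefix + "NETWORK_TAGS": " ".join(net.get("tags", [])),
    }
    for family in ("IPV4", "IPV6"):
        for item in ("CIDR", "GATEWAY"):
            value = net.get("%s_%s" % (family.lower(), item.lower()))
            if value:
                # Unset values are simply left out of the environment.
                env["%sNETWORK_%s_%s" % (prefix, family, item)] = value
    return env
```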
183

  
184

  
185
Backend changes
186
+++++++++++++++
187

  
188
In order to keep the hypervisor-visible changes to a minimum, and
189
maintain compatibility with the existing network configuration scripts,
190
the instance's hypervisor configuration will have host-level link and
191
mode replaced by the *connectivity mode* and *host interface* of the
192
given network on the current node group.
193

  
194
The managed mode can be detected by the presence of new environment
195
variables in network configuration scripts:
196

  
197
Network configuration script variables
198
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
199

  
200
``MANAGED``
201
  Non-zero if NIC is a managed-mode NIC
202

  
203
``NETWORK``
204
  The friendly name of the network
205

  
206
``NETWORK_UUID``
207
  The network's UUID
208

  
209
``NETWORK_TAGS``
210
  The network's tags
211

  
212
``NETWORK_IPV4_CIDR``, ``NETWORK_IPV6_CIDR``
213
  The subnet in CIDR notation
214

  
215
``NETWORK_IPV4_GATEWAY``, ``NETWORK_IPV6_GATEWAY``
216
  The subnet's default gateway
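
A network configuration script could detect managed mode along these lines. This is a sketch in Python rather than an actual Ganeti script; it assumes that an unset, empty or ``0`` value of ``MANAGED`` means the NIC is not managed, and it compares variable names case-insensitively as a precaution.

```python
NETWORK_VARS = (
    "NETWORK", "NETWORK_UUID", "NETWORK_TAGS",
    "NETWORK_IPV4_CIDR", "NETWORK_IPV6_CIDR",
    "NETWORK_IPV4_GATEWAY", "NETWORK_IPV6_GATEWAY",
)


def managed_network(environ):
    """Return the network variables of a managed NIC, or None."""
    env = {key.upper(): value for key, value in environ.items()}
    flag = env.get("MANAGED", "0").strip()
    if flag in ("", "0"):
        return None          # legacy bridged/routed NIC
    # Only report the variables that are actually present.
    return {name: env[name] for name in NETWORK_VARS if name in env}
```

In a real script the function would be called with ``os.environ``; scripts that do not know about managed mode can ignore the new variables and keep working with the mode and link they already receive.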
217

  
218
Userland interface
219
++++++++++++++++++
220

  
221
A new client script is introduced, ``gnt-network``, which handles
222
network-related configuration in Ganeti.
223

  
224
Network addition/deletion
225
^^^^^^^^^^^^^^^^^^^^^^^^^
226
::
227

  
228
 gnt-network add --cidr=192.0.2.0/24 --gateway=192.0.2.1 \
229
                --cidr6=2001:db8:2ffc::/64 --gateway6=2001:db8:2ffc::1 \
230
                --nic_connectivity=bridged --host_interface=br0 public
231
 gnt-network remove public (only allowed if no instances are using the network)
232

  
233
Manual IP address reservation
234
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
235
::
236

  
237
 gnt-network reserve-ips public 192.0.2.2 192.0.2.10-192.0.2.20
238
 gnt-network release-ips public 192.0.2.3
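
The ``reserve-ips`` example accepts both single addresses and ``first-last`` ranges. One possible way to expand such arguments is sketched below; the function name and the error handling are assumptions, not the actual client code.

```python
import ipaddress


def expand_ip_args(args):
    """Expand single addresses and 'first-last' ranges into a list."""
    result = []
    for arg in args:
        first, sep, last = arg.partition("-")
        start = ipaddress.IPv4Address(first)
        end = ipaddress.IPv4Address(last) if sep else start
        if end < start:
            raise ValueError("invalid range: %s" % arg)
        # Ranges are inclusive on both ends.
        result.extend(str(ipaddress.IPv4Address(value))
                      for value in range(int(start), int(end) + 1))
    return result
```

Applied to the arguments above, ``["192.0.2.2", "192.0.2.10-192.0.2.20"]`` expands to twelve addresses.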
239

  
240

  
241
Network modification
242
^^^^^^^^^^^^^^^^^^^^
243
::
244

  
245
 gnt-network modify --cidr=192.0.2.0/25 public (only allowed if all current reservations fit in the new network)
246
 gnt-network modify --gateway=192.0.2.126 public
247
 gnt-network modify --host_interface=test --nic_connectivity=routed public (issues warning about instances that need to be rebooted)
248
 gnt-network rename public public2
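
The precondition for changing the CIDR (all current reservations must fit in the new network) amounts to a simple membership check. The helper below is a hypothetical sketch of that check, not the code Ganeti uses.

```python
import ipaddress


def reservations_outside(new_cidr, reserved_ips):
    """Return the reserved addresses that do not fit in new_cidr."""
    new_net = ipaddress.ip_network(new_cidr)
    outside = [ip for ip in reserved_ips
               if ipaddress.ip_address(ip) not in new_net]
    # Sort numerically so that the report is stable and readable.
    return sorted(outside, key=ipaddress.ip_address)
```

The modification would be refused whenever the returned list is non-empty.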
249

  
250

  
251
Assignment to node groups
252
^^^^^^^^^^^^^^^^^^^^^^^^^
253
::
254

  
255
 gnt-network connect public nodegroup1
256
 gnt-network connect --host_interface=br1 public nodegroup2
257
 gnt-network disconnect public nodegroup1 (only permitted if no instances are currently using this network in the group)
258

  
259
Tagging
260
^^^^^^^
261
::
262

  
263
 gnt-network add-tags public foo bar:baz
264

  
265
Network listing
266
^^^^^^^^^^^^^^^
267
::
268

  
269
 gnt-network list
270
  Name		IPv4 Network	IPv4 Gateway	      IPv6 Network	       IPv6 Gateway		Connected to
271
  public	 192.0.2.0/24	192.0.2.1	2001:db8:dead:beef::/64	   2001:db8:dead:beef::1       nodegroup1:br0
272
  private	 10.0.1.0/24	   -			 -				-
273

  
274
Network information
275
^^^^^^^^^^^^^^^^^^^
276
::
277

  
278
 gnt-network info public
279
  Name: public
280
  IPv4 Network: 192.0.2.0/24
281
  IPv4 Gateway: 192.0.2.1
282
  IPv6 Network: 2001:db8:dead:beef::/64
283
  IPv6 Gateway: 2001:db8:dead:beef::1
284
  Total IPv4 count: 256
285
  Free address count: 201 (78% free)
286
  IPv4 pool status: XXX.........XXXXXXXXXXXXXX...XX.............
287
                    XXX..........XXX...........................X
288
                    ....XXX..........XXX.....................XXX
289
                                            X: occupied  .: free
290
  Externally reserved IPv4 addresses:
291
    192.0.2.3, 192.0.2.22
292
  Connected to node groups:
293
   default (link br0), other_group (link br1)
294
  Used by 22 instances:
295
   inst1
296
   inst2
297
   inst32
298
   ..
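
The pool status map and the free-address summary in the listing above could be produced from the reservation bits roughly as follows. The function names, the line width of 44 characters and the rounding down of the percentage are choices made for this sketch only.

```python
def render_pool(reserved_flags, width=44):
    """Render one character per address: X = occupied, . = free."""
    chars = "".join("X" if flag else "." for flag in reserved_flags)
    return [chars[i:i + width] for i in range(0, len(chars), width)]


def free_summary(reserved_flags):
    """Return a 'count (N% free)' string, rounding the percentage down."""
    total = len(reserved_flags)
    if total == 0:
        return "0 (0% free)"
    free = total - sum(1 for flag in reserved_flags if flag)
    return "%d (%d%% free)" % (free, 100 * free // total)
```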
299

  
300

  
301
IAllocator changes
302
++++++++++++++++++
303

  
304
The IAllocator protocol can be made network-aware, i.e. also consider
305
network availability for node group selection. Networks, as well as
306
future shared storage pools, can be seen as constraints used to rule out
307
the placement on certain node groups.
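
Treating networks as placement constraints could look like the sketch below, which rules out node groups that are not connected to every network an instance needs. The input format and the function name are assumptions; the IAllocator protocol itself is not specified here.

```python
def candidate_groups(groups, required_networks):
    """Return the names of node groups connected to all required networks.

    groups:            dict mapping group name -> iterable of network UUIDs
    required_networks: iterable of network UUIDs used by the instance's NICs
    """
    required = set(required_networks)
    return [name for name, connected in sorted(groups.items())
            if required <= set(connected)]
```

An allocator would then choose nodes only from the returned groups.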
308

  
309
.. vim: set textwidth=72 :
310
.. Local Variables:
311
.. mode: rst
312
.. fill-column: 72
313
.. End:
b/doc/index.rst
22 22
   design-htools-2.3.rst
23 23
   design-2.4.rst
24 24
   design-draft.rst
25
   design-network.rst
25 26
   cluster-merge.rst
26 27
   design-shared-storage.rst
27 28
   locking.rst
