.. snf-network documentation master file, created by
   sphinx-quickstart on Wed Feb 12 20:00:16 2014.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to snf-network's documentation!
=======================================

snf-network is a set of scripts that handle the network configuration of an
instance inside a Ganeti cluster. It takes advantage of the variables that
Ganeti exports to the scripts' execution environment and issues all the
necessary commands to ensure network connectivity to the instance, based on
the requested setup.

Environment
-----------

Ganeti supports `IP pool management
<http://docs.ganeti.org/ganeti/master/html/design-network.html>`_
so that end users can put instances inside networks and get all information
related to the network in scripts. Specifically, the following variables are
exported:

* IP
* MAC
* MODE
* LINK

are specific to each NIC, whereas:

* NETWORK_SUBNET
* NETWORK_GATEWAY
* NETWORK_MAC_PREFIX
* NETWORK_TAGS
* NETWORK_SUBNET6
* NETWORK_GATEWAY6

are inherited from the network in which a NIC resides (optional).
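
For illustration, a script invoked for a single NIC might therefore see an
environment like the following sketch (all values are hypothetical):

.. code-block:: console

  IP=192.0.2.10
  MAC=aa:55:66:1a:ae:82
  MODE=bridged
  LINK=prv0
  NETWORK_SUBNET=192.0.2.0/24
  NETWORK_GATEWAY=192.0.2.1
  NETWORK_MAC_PREFIX=aa:55:66
  NETWORK_TAGS=private-filtered
  NETWORK_SUBNET6=2001:db8::/64
  NETWORK_GATEWAY6=2001:db8::1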

Scripts
-------

The scripts can be divided into two categories:

1. The scripts that are invoked explicitly by Ganeti upon NIC creation.

2. The scripts that are invoked by the Ganeti Hooks Manager before or after
   an opcode execution.

The first group gets the exact info of the NIC that is about to be
configured, whereas the latter gets the info of the whole instance. The big
difference is that the instance configuration (from the master's perspective)
might vary or be totally different from the one currently running, since some
modifications can take place without hotplug.


kvm-ifup-custom
^^^^^^^^^^^^^^^

Upon instance startup and NIC hotplug, Ganeti creates TAP devices that
correspond to the instance's NICs. It then invokes its `kvm-ifup` script with
the TAP name as first argument and an environment that includes the NIC's and
the corresponding network's info. This script searches for a user-provided
one under `/etc/ganeti/kvm-ifup-custom` and executes it instead.
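
A minimal sketch of such a custom script, assuming a bridged setup where
``LINK`` holds the bridge name (the logic shown is illustrative, not
snf-network's actual implementation):

.. code-block:: sh

  #!/bin/sh
  # /etc/ganeti/kvm-ifup-custom -- illustrative sketch
  # $1 is the TAP device; MODE and LINK come from the environment.
  TAP="$1"

  case "$MODE" in
  bridged)
      # Bring the TAP up and attach it to the configured bridge
      ip link set "$TAP" up
      ip link set "$TAP" master "$LINK"
      ;;
  esac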


kvm-ifdown-custom
^^^^^^^^^^^^^^^^^

In order to clean up or modify the node's setup or the configuration of an
external component, Ganeti invokes the `kvm-ifdown` script upon instance
shutdown, upon successful instance migration (on the source node), and upon
NIC hot-unplug. The script gets the TAP name as first argument and a boolean
second argument indicating whether we want local cleanup only (in case of
instance migration) or to totally unconfigure the interface along with e.g.
any DNS entries (in case of NIC hot-unplug). This script searches for a
user-provided one under `/etc/ganeti/kvm-ifdown-custom` and executes it
instead.
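
A corresponding sketch for the ifdown side; the exact string representation
of the boolean second argument is an assumption here:

.. code-block:: sh

  #!/bin/sh
  # /etc/ganeti/kvm-ifdown-custom -- illustrative sketch
  # $1 is the TAP device; $2 tells us whether to do local cleanup only
  # (the "True" spelling below is assumed, not taken from Ganeti docs).
  TAP="$1"
  LOCAL_CLEANUP_ONLY="$2"

  ip link set "$TAP" down

  if [ "$LOCAL_CLEANUP_ONLY" != "True" ]; then
      # Totally unconfigure the interface, e.g. drop its DNS entries here.
      :
  fi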


vif-custom
^^^^^^^^^^

Ganeti provides a hypervisor parameter that defines the script to be executed
per NIC upon instance startup: `vif-script`. Ganeti ships `vif-ganeti` as an
example script, which executes `/etc/xen/scripts/vif-custom` if found.


snf-network-hook
^^^^^^^^^^^^^^^^

This hook gets all static info related to an instance from environment
variables and issues any commands needed. It was used to fix the node's setup
upon migration when the ifdown script was not yet supported; nowadays it does
nothing.


snf-network-dnshook
^^^^^^^^^^^^^^^^^^^

This hook updates an external `DDNS <https://wiki.debian.org/DDNS>`_ setup via
``nsupdate``. Since we add/remove entries in the ifup/ifdown scripts, we use
this hook only during instance remove/shutdown/rename. It does not rely on the
exported environment; instead, it first queries the DNS server to obtain the
current entries and then invokes the necessary commands to remove them (along
with the relevant reverse entries).
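
Such a removal via ``nsupdate`` might look like the following sketch (server,
key file and names are hypothetical):

.. code-block:: console

  $ nsupdate -k /etc/ganeti/ddns.key <<EOF
  server ns.example.com
  update delete instance1.example.com A
  update delete instance1.example.com AAAA
  send
  update delete 10.2.0.192.in-addr.arpa PTR
  send
  EOF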


Supported Setups
----------------

Currently, since NICs in Ganeti are not taggable objects, we use the
network's and the instance's tags to customize each NIC's configuration. A
NIC inherits the tags of the network it is attached to (if any), and further
customization can be achieved with instance tags of the form
``<tag prefix>:<nic uuid or name>:<tag>``. The following subsections describe
all supported tags and the underlying setup each one produces.
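
For example, tagging could be done as follows (the network and instance names
are hypothetical, and ``<tag prefix>`` is deployment-specific):

.. code-block:: console

  # Tag a whole network; every NIC attached to it inherits the tag
  $ gnt-network add-tags net1 ip-less-routed

  # Further customize a single NIC of one instance
  $ gnt-instance add-tags instance1 <tag prefix>:<nic uuid or name>:<tag>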


ip-less-routed
^^^^^^^^^^^^^^

This setup has the following characteristics:

* An external gateway on the same collision domain as all nodes, on some
  interface (e.g. eth1, eth0.200), is needed.
* Each node acts as a router for the hosted VMs.
* The node itself does not have an IP inside the routed network.
* The node does proxy ARP for IPv4 networks.
* The node does proxy NDP for IPv6 networks, while RS and NS are served
  locally (with RA and NA) by
  `nfdhcpd <http://www.synnefo.org/docs/nfdhcpd/latest/index.html>`_,
  since the VMs are not on the same link with the router.

Let's analyze a simple ping from an instance to an external IP using this
setup. We assume the following:

* ``IP`` is the instance's IP
* ``GW_IP`` is the external router's IP
* ``NODE_IP`` is the node's IP
* ``ARP_IP`` is a dummy IP inside the network, needed for proxy ARP

* ``MAC`` is the instance's MAC
* ``TAP_MAC`` is the tap's MAC
* ``DEV_MAC`` is the MAC of the host's DEV
* ``GW_MAC`` is the external router's MAC

* ``DEV`` is the node's device from which the router is visible
* ``TAP`` is the host interface connected with the instance's eth0

151

    
152
1) The VM wants to know the GW_MAC. Since the traffic is routed we do proxy ARP.
153

    
154
 - ARP, Request who-has GW_IP tell IP
155
 - ARP, Reply GW_IP is-at TAP_MAC ``echo 1 > /proc/sys/net/conf/TAP/proxy_arp``
156
 - So `arp -na` insided the VM shows: ``(GW_IP) at TAP_MAC [ether] on eth0``
157

    
158
2) The host wants to know the GW_MAC. Since the node does **not** have an IP
159
   inside the network we use the dummy one specified above.
160

    
161
 - ARP, Request who-has GW_IP tell ARP_IP (Created by DEV)
162
   ``arptables -I OUTPUT -o DEV --opcode 1 -j mangle --mangle-ip-s ARP_IP``
163
 - ARP, Reply GW_IP is-at GW_MAC
164

    
165
3) The host wants to know MAC so that it can proxy it.
166

    
167
 - We simulate here that the VM sees **only** GW on the link.
168
 - ARP, Request who-has IP tell GW_IP (Created by TAP)
169
   ``arptables -I OUTPUT -o TAP --opcode 1 -j mangle --mangle-ip-s GW_IP``
170
 - So `arp -na` inside the host shows:
171
   ``(GW_IP) at GW_MAC [ether] on DEV, (IP) at MAC on TAP``
172

    
173
4) GW wants to know who does proxy for IP.
174

    
175
 - ARP, Request who-has IP tell GW_IP
176
 - ARP, Reply IP is-at DEV_MAC (Created by host's DEV)
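
Putting the pieces together, the proxy ARP part of the node's setup boils
down to these commands (using the placeholders defined above):

.. code-block:: console

  # Answer the VM's ARP requests for GW_IP on TAP (step 1)
  echo 1 > /proc/sys/net/ipv4/conf/TAP/proxy_arp

  # Use the dummy ARP_IP as source when ARPing via DEV (step 2)
  arptables -I OUTPUT -o DEV --opcode 1 -j mangle --mangle-ip-s ARP_IP

  # Pretend to be the GW when ARPing for the VM's IP via TAP (step 3)
  arptables -I OUTPUT -o TAP --opcode 1 -j mangle --mangle-ip-s GW_IP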


With the above we have a working proxy ARP configuration. The rest is done
via simple L3 routing. Let's assume the following:

* ``TABLE`` is the extra routing table
* ``SUBNET`` is the IPv4 subnet where the VM's IP resides

1) Outgoing traffic:

 - Traffic coming out of TAP is routed via TABLE:
   ``ip rule add dev TAP table TABLE``
 - TABLE states that the default route is GW_IP via DEV:
   ``ip route add default via GW_IP dev DEV table TABLE``

2) Incoming traffic:

 - The packet arrives at the router.
 - The router knows from proxy ARP that IP is at DEV_MAC.
 - The router sends an Ethernet frame with target MAC DEV_MAC.
 - The host receives the packet on the DEV interface.
 - Traffic coming in from DEV is routed via TABLE:
   ``ip rule add dev DEV table TABLE``
 - Traffic targeting IP is routed to TAP:
   ``ip route add IP dev TAP table TABLE``

3) Host-to-VM traffic:

 - Impossible if the VM resides on the host itself.
 - Otherwise there is a route for it: ``ip route add SUBNET dev DEV``
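
Collected in one place, the routing part of a node's setup is a sketch like
the following (IP forwarding is not mentioned above, but the node needs it in
order to route at all):

.. code-block:: console

  # Route traffic entering from TAP or DEV via the extra table
  ip rule add dev TAP table TABLE
  ip rule add dev DEV table TABLE

  # Default route towards the external gateway, plus a host route to the VM
  ip route add default via GW_IP dev DEV table TABLE
  ip route add IP dev TAP table TABLE

  # Enable forwarding on the node
  echo 1 > /proc/sys/net/ipv4/ip_forward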

The IPv6 setup is quite similar, but instead of proxy ARP we have proxy NDP,
and the RS and NS coming from TAP are served by nfdhcpd. RAs contain the
network's prefix and have the M flag unset, so that the VM obtains its IPv6
address via SLAAC, and the O flag set, so that it obtains static info
(nameservers, domain search list) via DHCPv6 (also served by nfdhcpd).

Again, the only thing the VM sees on its link is TAP, which is supposed to be
the router. The host does proxy NDP for the VM's address:
``ip -6 neigh add proxy EUI64 dev DEV``.

When an interface comes up on a host, we should invalidate all entries
related to its IP on the other nodes and on the router. For proxy ARP we do
``arpsend -U -c 1 -i IP DEV`` and for proxy NDP we do ``ndsend EUI64 DEV``.
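
The IPv6 counterpart, collected as a sketch (the ``proxy_ndp`` sysctl is not
mentioned above, but the kernel needs it to answer NS on behalf of the VM):

.. code-block:: console

  # Proxy NDP for the VM's address on DEV
  echo 1 > /proc/sys/net/ipv6/conf/DEV/proxy_ndp
  ip -6 neigh add proxy EUI64 dev DEV

  # Invalidate stale neighbour entries after ifup
  arpsend -U -c 1 -i IP DEV
  ndsend EUI64 DEV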


private-filtered
^^^^^^^^^^^^^^^^

In order to provide L2 isolation among several VMs we can use ebtables on a
**single** bridge. The infrastructure must provide a physical VLAN or a
separate interface shared among all nodes in the cluster. All virtual
interfaces will be bridged on a common bridge (e.g. ``prv0``) and filtering
will be done via ebtables and the MAC prefix. The concept is that all
interfaces on the same L2 should have the same MAC prefix. MAC prefix
uniqueness is guaranteed by Synnefo and passed to Ganeti as a network option.
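
Creating such a network might look like this sketch (network name, subnet and
prefix are hypothetical):

.. code-block:: console

  $ gnt-network add --network 192.0.2.0/24 --mac-prefix aa:55:66 net1
  $ gnt-network add-tags net1 private-filtered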

To ensure isolation we should only allow traffic coming from the tap to have
a specific source MAC, and at the same time only allow traffic going to the
tap to have a source MAC within the same MAC prefix. Applying those rules
only in the FORWARD chain does not guarantee isolation, because packets whose
target MAC is a `multicast address
<http://en.wikipedia.org/wiki/Multicast_address>`_ go through the INPUT and
OUTPUT chains. To sum up, the following ebtables rules are applied:

.. code-block:: console

  # Create new chains
  ebtables -t filter -N FROMTAP5
  ebtables -t filter -N TOTAP5

  # Filter multicast traffic from VM
  ebtables -t filter -A INPUT -i tap5 -j FROMTAP5

  # Filter multicast traffic to VM
  ebtables -t filter -A OUTPUT -o tap5 -j TOTAP5

  # Filter traffic from VM
  ebtables -t filter -A FORWARD -i tap5 -j FROMTAP5
  # Filter traffic to VM
  ebtables -t filter -A FORWARD -o tap5 -j TOTAP5

  # Allow only specific src MAC for outgoing traffic
  ebtables -t filter -A FROMTAP5 -s ! aa:55:66:1a:ae:82 -j DROP
  # Allow only specific src MAC prefix for incoming traffic
  ebtables -t filter -A TOTAP5 -s ! aa:55:60:00:00:00/ff:ff:f0:00:00:00 -j DROP


dns
^^^

snf-network can update an external `DDNS <https://wiki.debian.org/DDNS>`_
server. If the `dns` network tag is found, the `ifup` and `ifdown` scripts
will use `nsupdate` to add/remove entries related to the interface being
managed.
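
The additions mirror the removals shown for `snf-network-dnshook`; a sketch
with hypothetical server, key file, names and addresses:

.. code-block:: console

  $ nsupdate -k /etc/ganeti/ddns.key <<EOF
  server ns.example.com
  update add instance1.example.com 300 A 192.0.2.10
  send
  update add 10.2.0.192.in-addr.arpa 300 PTR instance1.example.com
  send
  EOF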


Contents:

.. toctree::
   :maxdepth: 2


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`