Revision c0446a46

b/doc/design-2.1.rst
Feature changes
---------------

  
Ganeti Confd
~~~~~~~~~~~~

  
Current State and shortcomings
++++++++++++++++++++++++++++++
In Ganeti 2.0 all nodes are equal, but some are more equal than others. In
particular they are divided between "master", "master candidates" and "normal".
(Moreover they can be offline or drained, but this is not important for the
current discussion.) In general the whole configuration is only replicated to
master candidates, and some partial information is spread to all nodes via
ssconf.

  
This change was made so that the most frequent Ganeti operations didn't need to
contact all nodes, and so that clusters could become bigger. If we want more
information to be available on all nodes, we need either to add more ssconf
values, which counterbalances that change, or to talk with the master node,
which is not designed to happen now and requires the master to be available.

  
Information such as the instance->primary_node mapping will be needed on all
nodes, and we also want services external to the cluster to be able to query
it. This information must be available at all times, so we can't query it
through RAPI: RAPI is only available on the master, which would make it a
single point of failure.

  

  
Proposed changes
++++++++++++++++

  
In order to allow fast and highly available read-only access to some
configuration values, we'll create a new ganeti-confd daemon, which will run on
master candidates. This daemon will talk via UDP, and authenticate messages
using HMAC with a cluster-wide shared key.
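
As a rough illustration of the message authentication described above, the
following is a minimal sketch and not the actual ganeti-confd wire format. The
JSON envelope, the field names ``msg`` and ``hmac``, the choice of SHA-256 and
the function names are all assumptions made for the example; the design only
states that messages are authenticated using HMAC with a cluster-wide shared
key.

```python
import hashlib
import hmac
import json


def sign_message(key, payload):
    """Serialize payload and wrap it with an HMAC computed over it.

    The envelope layout (JSON with "msg" and "hmac") is hypothetical.
    """
    body = json.dumps(payload, sort_keys=True)
    digest = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return json.dumps({"msg": body, "hmac": digest}).encode("utf-8")


def verify_message(key, datagram):
    """Return the payload if the HMAC matches, otherwise raise ValueError."""
    try:
        envelope = json.loads(datagram.decode("utf-8"))
        body = envelope["msg"]
        received = envelope["hmac"]
    except (ValueError, KeyError, TypeError) as err:
        raise ValueError("malformed datagram") from err
    if not isinstance(body, str) or not isinstance(received, str):
        raise ValueError("malformed datagram")
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison, so the check does not leak how many
    # characters of the digest matched.
    if not hmac.compare_digest(expected.encode("ascii"),
                               received.encode("utf-8")):
        raise ValueError("HMAC mismatch")
    return json.loads(body)
```

In this sketch the bytes returned by ``sign_message`` are what would be sent
as a single UDP datagram, and the receiver would drop any datagram for which
``verify_message`` raises.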

  
An interested client can query a value by making a request to a subset of the
cluster master candidates. It will then wait to get a few responses, and use
the one with the highest configuration serial number (which will always be
included in the answer). If some candidates are stale, or we are in the middle
of a configuration update, various master candidates may return different
values, and picking the highest serial number should make sure the most recent
information is used.
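
The selection rule above could be sketched as follows. This is only an
illustration: the ``Answer`` tuple, its field names, the ``pick_freshest``
function and the ``min_answers`` threshold are hypothetical, since the design
only says that the client waits for "a few" responses and uses the one with
the highest serial number.

```python
from typing import Iterable, NamedTuple, Optional


class Answer(NamedTuple):
    serial: int    # configuration serial number reported by the candidate
    value: object  # the value that was queried
    source: str    # name of the master candidate that replied


def pick_freshest(answers: Iterable[Answer],
                  min_answers: int = 2) -> Optional[Answer]:
    """Return the answer with the highest serial number.

    Returns None when fewer than min_answers replies were collected, so
    the caller can decide to keep waiting or to query more candidates.
    """
    collected = list(answers)
    if len(collected) < min_answers:
        return None
    return max(collected, key=lambda answer: answer.serial)
```

For example, if three candidates reply with serial numbers 41, 42 and 40, the
client would use the value from the reply carrying serial 42, even if the
stale replies arrived first.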

  
In order to prevent replay attacks, queries will contain the current unix
timestamp according to the client, and the server will verify that its own
timestamp is in the same 5-minute range (this requires synchronized clocks,
which is a good idea anyway). Queries will also contain a "salt" which they
expect the answers to be sent with, and clients are supposed to accept only
answers which contain a salt generated by them.
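
The two checks above can be sketched like this. All names and the dictionary
layout are hypothetical, and we assume here that "the same 5 minutes range"
means the two timestamps differ by at most 300 seconds; the design does not
pin down the exact rule.

```python
import secrets
import time

MAX_CLOCK_SKEW = 5 * 60  # seconds; assumed reading of the 5-minute range


def new_query(query_type, now=None):
    """Build a query carrying the client's timestamp and a fresh salt."""
    timestamp = int(time.time() if now is None else now)
    return {
        "type": query_type,
        "timestamp": timestamp,
        "salt": secrets.token_hex(16),  # random, so answers can't be reused
    }


def server_accepts(query, now=None):
    """Server side: reject queries whose timestamp is too far from ours."""
    now = time.time() if now is None else now
    return abs(now - query["timestamp"]) <= MAX_CLOCK_SKEW


def make_answer(query, value, serial):
    """Server side: echo the salt back together with the answer."""
    return {"salt": query["salt"], "serial": serial, "value": value}


def client_accepts(query, answer):
    """Client side: only accept answers carrying the salt we generated."""
    return answer.get("salt") == query["salt"]
```

A captured query replayed more than five minutes later fails
``server_accepts``, and an old answer replayed to a client fails
``client_accepts`` because it carries the salt of an earlier query.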

  
The configuration daemon will be able to answer simple queries such as:

- master candidates list
- master node
- offline nodes
- instance list
- instance primary nodes
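
To make the query types listed above more concrete, here is a small sketch of
how a daemon could answer them from an in-memory copy of the configuration.
The layout of ``EXAMPLE_CONFIG``, the query type names and the
``answer_query`` function are invented for this example and do not reflect
the real Ganeti configuration format.

```python
# Hypothetical, simplified view of the cluster configuration.
EXAMPLE_CONFIG = {
    "serial": 12,
    "master": "node1",
    "nodes": {
        "node1": {"master_candidate": True, "offline": False},
        "node2": {"master_candidate": True, "offline": False},
        "node3": {"master_candidate": False, "offline": True},
    },
    "instances": {
        "web1": {"primary_node": "node2"},
        "db1": {"primary_node": "node1"},
    },
}


def answer_query(config, query_type):
    """Answer one simple query, always including the config serial number."""
    nodes = config["nodes"]
    instances = config["instances"]
    handlers = {
        "master_candidates": lambda: sorted(
            name for name, node in nodes.items() if node["master_candidate"]),
        "master_node": lambda: config["master"],
        "offline_nodes": lambda: sorted(
            name for name, node in nodes.items() if node["offline"]),
        "instance_list": lambda: sorted(instances),
        "instance_primary_nodes": lambda: {
            name: inst["primary_node"] for name, inst in instances.items()},
    }
    if query_type not in handlers:
        raise ValueError("unknown query type: %r" % (query_type,))
    return {"serial": config["serial"], "value": handlers[query_type]()}
```

Every answer carries the serial number of the configuration it was computed
from, which is what lets a client compare replies from different master
candidates.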

  

  
Redistribute Config
~~~~~~~~~~~~~~~~~~~

  
