Revision c0446a46 doc/design-2.1.rst
b/doc/design-2.1.rst | ||
---|---|---|
40 | 40 |
Feature changes |
41 | 41 |
--------------- |
42 | 42 |
|
43 |
Ganeti Confd |
|
44 |
~~~~~~~~~~~~ |
|
45 |
|
|
46 |
Current State and shortcomings |
|
47 |
++++++++++++++++++++++++++++++ |
|
48 |
In Ganeti 2.0 all nodes are equal, but some are more equal than others. In |
|
49 |
particular they are divided between "master", "master candidates" and "normal". |
|
50 |
(Moreover they can be offline or drained, but this is not important for the |
|
51 |
current discussion). In general the whole configuration is only replicated to |
|
52 |
master candidates, and some partial information is spread to all nodes via |
|
53 |
ssconf. |
|
54 |
|
|
55 |
This change was made so that the most frequent Ganeti operations didn't need to |
|
56 |
contact all nodes, and so clusters could become bigger. If we want more |
|
57 |
information to be available on all nodes, we need to add more ssconf values, |
|
58 |
which counteracts the change, or to talk to the master node, which |
|
59 |
is not something the current design supports, and requires the master to be available. |
|
60 |
|
|
61 |
Information such as the instance->primary_node mapping will be needed on all |
|
62 |
nodes, and we also want to make sure services external to the cluster can query |
|
63 |
this information as well. This information must be available at all times, so |
|
64 |
we can't query it through RAPI, which would be a single point of failure, as |
|
65 |
it's only available on the master. |
|
66 |
|
|
67 |
|
|
68 |
Proposed changes |
|
69 |
++++++++++++++++ |
|
70 |
|
|
71 |
In order to allow fast and highly available read-only access to some |
|
72 |
configuration values, we'll create a new ganeti-confd daemon, which will run on |
|
73 |
master candidates. This daemon will talk via UDP, and authenticate messages |
|
74 |
using HMAC with a cluster-wide shared key. |
|
75 |
|
|
76 |
An interested client can query a value by making a request to a subset of the |
|
77 |
cluster master candidates. It will then wait to get a few responses, and use |
|
78 |
the one with the highest configuration serial number (which will always be |
|
79 |
included in the answer). If some candidates are stale, or we are in the middle |
|
80 |
of a configuration update, various master candidates may return different |
|
81 |
values, and this should make sure the most recent information is used. |
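The selection rule above can be sketched as follows. The ``Answer`` type and the ``min_answers`` threshold are hypothetical names introduced for this example; the design only says that the client waits for "a few" responses and uses the one with the highest configuration serial number.

```python
from typing import Iterable, NamedTuple, Optional


class Answer(NamedTuple):
    """One reply from a master candidate (hypothetical structure)."""
    serial: int    # configuration serial number included in the answer
    value: object  # the queried value as seen by that candidate


def pick_freshest(answers: Iterable[Answer],
                  min_answers: int = 2) -> Optional[Answer]:
    """Return the answer with the highest serial, or None if too few arrived."""
    collected = list(answers)
    if len(collected) < min_answers:
        # Not enough replies to be reasonably sure we saw a current candidate.
        return None
    return max(collected, key=lambda a: a.serial)
```

For example, given replies with serials 10, 12 and 11, the reply with serial 12 is used even if the two stale candidates agree with each other.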
|
82 |
|
|
83 |
In order to prevent replay attacks, queries will contain the current Unix |
|
84 |
timestamp according to the client, and the server will verify that its |
|
85 |
timestamp is within the same 5-minute range (this requires synchronized clocks, |
|
86 |
which is a good idea anyway). Queries will also contain a "salt" which they |
|
87 |
expect the answers to be sent with, and clients are supposed to accept only |
|
88 |
answers which contain the salt they generated. |
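The two checks above (timestamp on the server, salt on the client) might look like the sketch below. The field names, the salt length and the reading of the 5-minute range as a maximum absolute difference of 300 seconds are assumptions for illustration only.

```python
import hmac
import secrets
import time

# Assumption: the 5-minute range is read as +/- 300 seconds of clock skew.
MAX_CLOCK_SKEW = 5 * 60


def make_query(what, now=None):
    """Build a query carrying the client's timestamp and a fresh random salt."""
    timestamp = int(time.time() if now is None else now)
    return {"query": what, "timestamp": timestamp,
            "salt": secrets.token_hex(16)}


def server_accepts(query, now=None):
    """Server side: reject queries whose timestamp is too far from our clock."""
    server_now = time.time() if now is None else now
    return abs(server_now - query["timestamp"]) <= MAX_CLOCK_SKEW


def client_accepts(query, answer):
    """Client side: accept only answers echoing the salt we generated."""
    return hmac.compare_digest(answer.get("salt", ""), query["salt"])
```

The timestamp bounds how long a captured query stays usable, while the salt ties each answer to one specific query, so an old answer cannot be replayed against a new request.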
|
89 |
|
|
90 |
The configuration daemon will be able to answer simple queries such as: |
|
91 |
- master candidates list |
|
92 |
- master node |
|
93 |
- offline nodes |
|
94 |
- instance list |
|
95 |
- instance primary nodes |
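To make the query types listed above concrete, here is a small sketch of a dispatcher over an in-memory configuration. The ``Query`` enum, the ``answer_query`` function and the configuration layout are all hypothetical; they are not the actual confd request format.

```python
from enum import Enum


class Query(Enum):
    """Query types from the list above (names are illustrative)."""
    MASTER_CANDIDATES = "master_candidates"
    MASTER = "master"
    OFFLINE_NODES = "offline_nodes"
    INSTANCE_LIST = "instance_list"
    INSTANCE_PRIMARY = "instance_primary"


def answer_query(config, query, argument=None):
    """Answer one simple read-only query from a dict-based config."""
    nodes = config["nodes"]
    if query is Query.MASTER_CANDIDATES:
        return sorted(n for n, f in nodes.items() if f["master_candidate"])
    if query is Query.MASTER:
        return config["master"]
    if query is Query.OFFLINE_NODES:
        return sorted(n for n, f in nodes.items() if f["offline"])
    if query is Query.INSTANCE_LIST:
        return sorted(config["instances"])
    if query is Query.INSTANCE_PRIMARY:
        # argument is the instance name; None means unknown instance.
        return config["instances"].get(argument)
    raise ValueError("unsupported query: %r" % (query,))


# Hypothetical configuration: instances map to their primary node.
EXAMPLE_CONFIG = {
    "master": "node1",
    "nodes": {
        "node1": {"master_candidate": True, "offline": False},
        "node2": {"master_candidate": True, "offline": False},
        "node3": {"master_candidate": False, "offline": True},
    },
    "instances": {"inst1": "node2", "inst2": "node1"},
}
```

With this layout, ``answer_query(EXAMPLE_CONFIG, Query.INSTANCE_PRIMARY, "inst1")`` looks up the instance->primary_node mapping mentioned earlier in the section.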
|
96 |
|
|
97 |
|
|
43 | 98 |
Redistribute Config |
44 | 99 |
~~~~~~~~~~~~~~~~~~~ |
45 | 100 |
|