Design for adding a node to a cluster
=====================================

.. contents:: :depth: 3


Current state and shortcomings
------------------------------

Before a node can be added to a cluster, its SSH daemon must be
re-configured to use the cluster-wide SSH host key. Ganeti 2.3.0 changed
the way this is done by moving all related code to a separate script,
``tools/setup-ssh``, using Paramiko. Before that, all such configuration
was done from ``lib/bootstrap.py`` using the system's own SSH client and
a shell script given to said client through parameters.

Both solutions controlled all actions on the connecting machine; the
newly added node was merely executing commands. This implies and
requires a tight coupling and equality between nodes (e.g. paths to
files being the same). Most of the logic and error handling is also done
on the connecting machine.

Once a node's SSH daemon has been configured, more than 25 files need to
be copied using ``scp`` before the node daemon can be started. No
verification is done before files are copied. Once the node daemon is
started, an opcode is submitted to the master daemon, which then copies
more files, such as the configuration and job queue for master
candidates, using RPC. This process is somewhat fragile and requires
initiating many SSH connections.

Proposed changes
----------------

SSH
~~~

The main goal is to move more logic to the newly added node. Instead of
having a relatively large script executed on the master node, most of it
is moved over to the added node.

A new script named ``prepare-node-join`` is added. It receives a JSON
data structure (defined :ref:`below <prepare-node-join-json>`) on its
standard input. Once the data has been successfully decoded, it proceeds
to configure the local node's SSH daemon and root's SSH settings, after
which the SSH daemon is restarted.

All the master node has to do to add a new node is to gather all
required data, build the data structure, and invoke the script on the
node to be added. This will enable us to once again use the system's own
SSH client and to drop the dependency on Paramiko for Ganeti itself
(``ganeti-listrunner`` is going to continue using Paramiko).
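
The decoding step on the node side could look roughly like the
following. This is a minimal sketch, not the actual
``prepare-node-join`` code: the names ``JoinError`` and
``load_join_data`` are hypothetical, and only the ``cluster_name`` key
(which the structure definition marks as required) is checked here.

```python
import io
import json


class JoinError(Exception):
    """Raised when the input data cannot be used (hypothetical name)."""


def load_join_data(stream):
    """Decode the JSON object passed on standard input."""
    try:
        data = json.load(stream)
    except ValueError as err:
        # json.JSONDecodeError is a subclass of ValueError
        raise JoinError("Invalid JSON input: %s" % err)
    if not isinstance(data, dict):
        raise JoinError("Expected a JSON object at the top level")
    if not isinstance(data.get("cluster_name"), str):
        raise JoinError("Missing or non-string 'cluster_name'")
    return data


# Stand-in for sys.stdin so the sketch runs without an SSH session
example = io.StringIO('{"cluster_name": "cluster.example.com"}')
join_data = load_join_data(example)
```

In a real script the stream would be ``sys.stdin``, and nothing on the
node would be changed until decoding has succeeded.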

Eventually ``setup-ssh`` can be removed.


Node daemon
~~~~~~~~~~~

Similar to SSH setup changes, the process of copying files and starting
the node daemon will be moved into a dedicated program. On its standard
input it will receive a standardized JSON structure (defined :ref:`below
<node-daemon-setup-json>`). Once the input data has been successfully
decoded and the received values have been verified for sanity, the
program proceeds to write the values to files and then starts the node
daemon (``ganeti-noded``).

To add a new node to the cluster, the master node will have to gather
all values, build the data structure, and then invoke the newly added
``node-daemon-setup`` program via SSH. In this way only a single SSH
connection is needed and the values can be verified before being written
to files.
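
On the master side, this amounts to serializing one object and piping
it to a single ``ssh`` invocation. The sketch below is an illustration
under assumptions: the function names, the ``root@`` login, the
``BatchMode`` option and the bare remote program name are hypothetical
choices, not something this design prescribes.

```python
import json
import subprocess


def build_setup_data(cluster_name, certificate_pem, ssconf,
                     start_daemon=True):
    """Collect the values into the structure sent to the new node."""
    return {
        "cluster_name": cluster_name,
        "node_daemon_certificate": certificate_pem,
        "ssconf": ssconf,
        "start_node_daemon": start_daemon,
    }


def build_ssh_command(hostname, remote_program="node-daemon-setup"):
    """Build the argument list for the system's own SSH client."""
    # The remote program's real installation path is an assumption
    return ["ssh", "-o", "BatchMode=yes", "root@" + hostname,
            remote_program]


def run_node_daemon_setup(hostname, data):
    """Use one SSH connection, feeding the JSON data on standard input."""
    result = subprocess.run(build_ssh_command(hostname),
                            input=json.dumps(data).encode("utf-8"),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    return result.returncode == 0


# Building the pieces does not open a connection
data = build_setup_data("cluster.example.com",
                        "-----BEGIN CERTIFICATE-----...",
                        {"cluster_name": "cluster.example.com"})
command = build_ssh_command("node4.example.com")
```

Because everything travels over standard input, no separate ``scp``
calls are needed for the values contained in the structure.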

If the program exits successfully, the node is ready to be added to the
master daemon's configuration. The node daemon will be running, but
``OpNodeAdd`` needs to be run before it becomes a full node. The opcode
will copy more files, such as the :doc:`RAPI certificate <rapi>`.


Data structures
---------------

.. _prepare-node-join-json:

JSON structure for SSH setup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The data is given in an object containing the keys described below.
Unless specified otherwise, all entries are optional.

``cluster_name``
  Required string with the cluster name. If a local cluster name is
  found, the join process is aborted unless the passed cluster name
  matches the local name.
``node_daemon_certificate``
  Public part of cluster's node daemon certificate in PEM format. If a
  local node certificate and key are found, the join process is aborted
  unless this passed public part can be verified with the local key.
``ssh_host_key``
  List containing public and private parts of SSH host key. See below
  for definition.
``ssh_root_key``
  List containing public and private parts of root's key for SSH
  authorization. See below for definition.
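
The rule for ``cluster_name`` can be expressed in a few lines. This is
only a sketch: ``JoinError`` and ``verify_cluster_name`` are
hypothetical names, and how the local name is looked up is left out.

```python
class JoinError(Exception):
    """Raised when the join must be aborted (hypothetical name)."""


def verify_cluster_name(passed_name, local_name=None):
    """Abort unless the passed name matches an existing local name.

    ``local_name`` is None when no local cluster name was found.
    """
    if not passed_name:
        raise JoinError("Cluster name is required")
    if local_name is not None and local_name != passed_name:
        raise JoinError("Local cluster name %r does not match %r" %
                        (local_name, passed_name))
    return passed_name


# A fresh node has no local name, so any passed name is accepted
verify_cluster_name("cluster.example.com")
# A node that already has a local name must be given the same one
verify_cluster_name("cluster.example.com",
                    local_name="cluster.example.com")
```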

Each entry in a list of SSH keys is a tuple with three values. The first
describes the key variant (``rsa`` or ``dsa``). The second and third are
the private and public part of the key. Example:

.. highlight:: javascript

::

  [
    ["rsa", "-----BEGIN RSA PRIVATE KEY-----...", "ssh-rsa AAAA..."],
    ["dsa", "-----BEGIN DSA PRIVATE KEY-----...", "ssh-dss AAAA..."]
  ]
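
A receiving script would likely want to check this shape before writing
any key material. The following is a sketch under assumptions:
``check_ssh_keys`` is a hypothetical helper, and it treats each entry as
a three-element JSON array in the order described above.

```python
import json

VALID_KEY_TYPES = frozenset(["rsa", "dsa"])


def check_ssh_keys(keys):
    """Return a list of (kind, private, public) tuples or raise ValueError."""
    result = []
    for entry in keys:
        if not (isinstance(entry, (list, tuple)) and len(entry) == 3):
            raise ValueError("Each key entry needs exactly three values")
        (kind, private, public) = entry
        if not all(isinstance(value, str) for value in entry):
            raise ValueError("Key type and key parts must be strings")
        if kind not in VALID_KEY_TYPES:
            raise ValueError("Unknown key type %r" % (kind,))
        result.append((kind, private, public))
    return result


raw = ('[["rsa", "-----BEGIN RSA PRIVATE KEY-----...",'
       ' "ssh-rsa AAAA..."]]')
keys = check_ssh_keys(json.loads(raw))
```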


.. _node-daemon-setup-json:

JSON structure for node daemon setup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The data is given in an object containing the keys described below.
Unless specified otherwise, all entries are optional.

``cluster_name``
  Required string with the cluster name. If a local cluster name is
  found, the join process is aborted unless the passed cluster name
  matches the local name. The cluster name is also included in the
  dictionary given via the ``ssconf`` entry.
``node_daemon_certificate``
  Public and private part of cluster's node daemon certificate in PEM
  format. If a local node certificate is found, the process is aborted
  unless it matches.
``ssconf``
  Dictionary with ssconf names and their values. Both are strings.
  Example:

  .. highlight:: javascript

  ::

    {
      "cluster_name": "cluster.example.com",
      "master_ip": "192.168.2.1",
      "master_netdev": "br0",
      # …
    }

``start_node_daemon``
  Boolean denoting whether the node daemon should be started (or
  restarted if it was running for some reason).
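
The sanity checks on this structure could be sketched as follows. The
helper name ``verify_setup_data`` is hypothetical, and treating a
missing ``start_node_daemon`` entry as ``False`` is an assumption made
for this example only.

```python
def verify_setup_data(data):
    """Check value types before anything is written to files."""
    cluster_name = data.get("cluster_name")
    if not isinstance(cluster_name, str) or not cluster_name:
        raise ValueError("'cluster_name' is required and must be a string")

    ssconf = data.get("ssconf", {})
    if not isinstance(ssconf, dict):
        raise ValueError("'ssconf' must be a dictionary")
    for (name, value) in ssconf.items():
        if not (isinstance(name, str) and isinstance(value, str)):
            raise ValueError("ssconf names and values must be strings")

    # Assumption: an absent entry means "do not start the daemon"
    start = data.get("start_node_daemon", False)
    if not isinstance(start, bool):
        raise ValueError("'start_node_daemon' must be a boolean")

    return (cluster_name, ssconf, start)


checked = verify_setup_data({
    "cluster_name": "cluster.example.com",
    "ssconf": {"master_ip": "192.168.2.1", "master_netdev": "br0"},
    "start_node_daemon": True,
})
```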

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: