=====================
Ganeti Cluster Merger
=====================

Current situation
=================

Currently there's no easy way to merge two or more clusters, yet
merging is needed in order to optimize resources. The goal of this
design doc is to come up with an easy-to-use solution which allows you
to merge two or more clusters together.

Initial contact
===============

As the design of Ganeti is based on an autonomous system, Ganeti by
itself has no way to reach nodes outside of its cluster. To overcome
this we have to prepare the cluster before we can go ahead with the
actual merge: we have to replace at least the ssh keys on the affected
nodes before we can do any operation with ``gnt-`` commands.

To make this an automated process we'll ask the user to provide us with
the root password of every cluster we have to merge. We use the password
to grab the current ``id_dsa`` key and then rely on that ssh key for all
further communication until the cluster is fully merged.

Cluster merge
=============

After initial contact we do the cluster merge:

1. Grab the list of nodes
2. On all nodes add our own ``id_dsa.pub`` key to ``authorized_keys``
3. Stop all instances running on the merging cluster
4. Disable ``ganeti-watcher``, as it would otherwise try to restart the
   Ganeti daemons
5. Stop all Ganeti daemons on all merging nodes
6. Grab the ``config.data`` from the master of the merging cluster
7. Stop local ``ganeti-masterd``
8. Merge the config:

   1. Open our own cluster ``config.data``
   2. Open cluster ``config.data`` of the merging cluster
   3. Grab all nodes of the merging cluster
   4. Set ``master_candidate`` to false on all merging nodes
   5. Add the nodes to our own cluster ``config.data``
   6. Grab all the instances on the merging cluster
   7. Adjust the port if the instance has a DRBD disk layout:

      1. In ``logical_id`` (index 2)
      2. In ``physical_id`` (index 1 and 3)

   8. Add the instances to our own cluster ``config.data``

9. Start ``ganeti-masterd`` with ``--no-voting`` ``--yes-do-it``
10. ``gnt-node add --readd`` on all merging nodes
11. ``gnt-cluster redist-conf``
12. Restart ``ganeti-masterd`` normally
13. Enable ``ganeti-watcher`` again
14. Start all merging instances again
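
The config merge in step 8 can be sketched without any Ganeti imports
by modelling ``config.data`` as plain dictionaries. This is only an
illustration: the dictionary layout, the names ``merge_config`` and
``allocate_port``, the ``"drbd8"`` type string and the duplicate-name
check are assumptions, not part of the design. Only the
``master_candidate`` reset and the tuple indices are taken from the
steps above.

```python
def merge_config(mine, other, allocate_port):
    """Merge the nodes and instances of `other` into `mine`.

    Both configs are plain dicts of the (hypothetical) form
    {"nodes": {name: {...}}, "instances": {name: {"disks": [...]}}}.
    `allocate_port` is a callable returning a port unused in `mine`.
    """
    for name, node in other["nodes"].items():
        if name in mine["nodes"]:
            # Assumption: clashing names are treated as a hard error.
            raise ValueError("node %r exists in both clusters" % name)
        # Step 8.4: merging nodes must not remain master candidates.
        mine["nodes"][name] = dict(node, master_candidate=False)

    for name, inst in other["instances"].items():
        if name in mine["instances"]:
            raise ValueError("instance %r exists in both clusters" % name)
        disks = []
        for disk in inst["disks"]:
            disk = dict(disk)  # copy, so `other` stays untouched
            if disk["dev_type"] == "drbd8":
                # Step 8.7: patch a freshly allocated port into both ids.
                port = allocate_port()
                logical = list(disk["logical_id"])
                logical[2] = port
                physical = list(disk["physical_id"])
                physical[1] = physical[3] = port
                disk["logical_id"] = tuple(logical)
                disk["physical_id"] = tuple(physical)
            disks.append(disk)
        mine["instances"][name] = dict(inst, disks=disks)
    return mine
```

Working on copies keeps the merging cluster's config unmodified, which
leaves the original data available should the merge be aborted.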

Rollback
========

Until we actually (re)add any nodes we can abort and roll back the merge
at any point. Once the config has been merged, though, we have to
restore the backup copy of ``config.data`` (from another master
candidate node). For security reasons it's also a good idea to undo the
``id_dsa.pub`` distribution by going to every affected node and removing
the ``id_dsa.pub`` key again. Finally, we have to keep in mind that the
Ganeti daemons and the instances need to be started again.
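
Adding our key during the merge and removing it again during rollback
are inverse operations on ``authorized_keys``. A minimal sketch of both,
working on the file content as a string, might look as follows. The
function names are hypothetical, and how the content is read from and
written back to each node is deliberately left out.

```python
def add_key(authorized_keys, pubkey):
    """Return the file content with `pubkey` appended unless present."""
    pubkey = pubkey.strip()
    lines = authorized_keys.splitlines()
    if pubkey not in (line.strip() for line in lines):
        lines.append(pubkey)
    return "\n".join(lines) + "\n"


def remove_key(authorized_keys, pubkey):
    """Return the file content without any line equal to `pubkey`."""
    pubkey = pubkey.strip()
    kept = [line for line in authorized_keys.splitlines()
            if line.strip() != pubkey]
    return "\n".join(kept) + "\n" if kept else ""
```

Because ``add_key`` does not duplicate an existing entry, a rollback
based on this sketch should only remove keys that the merge itself
added; otherwise it would also drop a key that was already authorized
before the merge.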

Verification
============

Last but not least we should verify that the merge was successful. To do
so we run ``gnt-cluster verify``, which checks that the cluster as a
whole is in a healthy state. Additionally, it's possible to compare the
list of instances/nodes with a list made prior to the merge to make sure
we didn't lose any data/instance/node.
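
The before/after comparison can be sketched as a simple set difference.
The function name is hypothetical. The name lists could come from, for
example, ``gnt-node list`` and ``gnt-instance list`` run on every
cluster before the merge and on the merged cluster afterwards.

```python
def lost_in_merge(before, after):
    """Return the sorted names present before the merge but missing after."""
    return sorted(set(before) - set(after))


# Illustrative data only: names collected before and after a merge.
nodes_before = ["node1.example.com", "node2.example.com",
                "node3.example.com"]
nodes_after = ["node1.example.com", "node3.example.com"]
print(lost_in_merge(nodes_before, nodes_after))
```

An empty result for both the node and the instance lists indicates that
nothing recorded before the merge has gone missing.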

Appendix
========

cluster-merge.py
----------------

Used to merge the cluster config. This is a proof of concept (POC) and
might differ from the actual production code.

::

  #!/usr/bin/python

  import sys
  from ganeti import config
  from ganeti import constants

  # Our own cluster config and the config.data of the merging cluster
  # (passed as the first command line argument)
  c_mine = config.ConfigWriter(offline=True)
  c_other = config.ConfigWriter(sys.argv[1])

  # Counter used to pass a distinct id to every Add* call
  fake_id = 0
  for node in c_other.GetNodeList():
    node_info = c_other.GetNodeInfo(node)
    # Merging nodes must not stay master candidates
    node_info.master_candidate = False
    c_mine.AddNode(node_info, str(fake_id))
    fake_id += 1

  for instance in c_other.GetInstanceList():
    instance_info = c_other.GetInstanceInfo(instance)
    for dsk in instance_info.disks:
      if dsk.dev_type in constants.LDS_DRBD:
        # Allocate a port in our own cluster and patch it into both ids
        port = c_mine.AllocatePort()
        logical_id = list(dsk.logical_id)
        logical_id[2] = port
        dsk.logical_id = tuple(logical_id)
        physical_id = list(dsk.physical_id)
        physical_id[1] = physical_id[3] = port
        dsk.physical_id = tuple(physical_id)
    c_mine.AddInstance(instance_info, str(fake_id))
    fake_id += 1

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: