===================================
Moving instances across node groups
===================================

This design document explains the changes needed in Ganeti to perform
instance moves across node groups. Reader familiarity with the following
existing documents is advised:

- :doc:`Current IAllocator specification <iallocator>`
- :doc:`Shared storage model in 2.3+ <design-shared-storage>`

Motivation and design proposal
==============================

At the moment, moving instances away from their primary or secondary
nodes with the ``relocate`` and ``multi-evacuate`` IAllocator calls
restricts target nodes to those in the same node group. This ensures a
mobility domain is never crossed, and allows normal operation of each
node group to be confined within itself.

It is desirable, however, to have a way of moving instances across node
groups so that, for example, it is possible to move a set of instances
to another group for policy reasons, or completely empty a given group
to perform maintenance operations.

To implement this, we propose the addition of new IAllocator calls to
compute inter-group instance moves and group-aware node evacuation,
taking into account mobility domains as appropriate. The interface
proposed below should be enough to cover the use cases mentioned above.

With the implementation of this design proposal, the previous
``multi-evacuate`` mode will be deprecated.

.. _multi-reloc-detailed-design:

Detailed design
===============

All requests honor the groups' ``alloc_policy`` attribute.

Changing instance's groups
--------------------------

Takes a list of instances and a list of node group UUIDs; the instances
will be moved away from their current group to any of the groups in the
target list. All instances need to have their primary node in the same
group, and that group must not be one of the target groups. If the
target group list is empty, the request is simply "change group" and the
instances are placed in any group but their original one.

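The rules for choosing candidate groups can be sketched as follows. This
is only an illustration of the constraints described above, not Ganeti
or IAllocator code: the function name ``resolve_target_groups`` and its
arguments are hypothetical, and the ``alloc_policy`` handling is left
out.

.. code-block:: python

  def resolve_target_groups(instance_groups, all_groups, target_groups):
      """Return the set of candidate group UUIDs for a change-group request.

      instance_groups: dict mapping instance name to the UUID of the
        group holding its primary node (hypothetical input format)
      all_groups: iterable of every node group UUID in the cluster
      target_groups: requested target group UUIDs, possibly empty
      """
      source_groups = set(instance_groups.values())
      # All instances must have their primary node in one and the same group.
      if len(source_groups) != 1:
          raise ValueError("instances must share exactly one primary group")
      (source,) = source_groups
      # The current group must not be listed as a target.
      if source in target_groups:
          raise ValueError("source group must not be a target group")
      # An empty target list means "any group but the original one".
      if not target_groups:
          return set(all_groups) - {source}
      return set(target_groups)

With groups ``g1``, ``g2`` and ``g3`` and all instances in ``g1``, an
empty target list yields ``g2`` and ``g3`` as candidates.
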
Node evacuation
---------------

Evacuates instances off their primary nodes. The evacuation mode
can be given as ``primary-only``, ``secondary-only`` or
``all``. The call is given a list of instances whose primary nodes need
to be in the same node group. The returned nodes need to be in the same
group as the original primary node.

.. _multi-reloc-result:

Result
------

In all storage models, an inter-group move can be modeled as a sequence
of **replace secondary**, **migration** and **failover** operations
(when shared storage is used, they will all be failover or migration
operations within the corresponding mobility domain).

The result of the operations described above must contain two lists of
instances and a list of jobsets.

The two lists of instances describe which instances could be
moved/migrated and which couldn't for some reason ("unsuccessful"). The
union of the two lists must be equal to the set of instances given in
the original request.

The list of jobsets contained in the result describes how to actually
execute the operation. Each jobset contains lists of serialized opcodes.
Example::

  [
    [
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst1.example.com"
      },
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst2.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_REPLACE_DISKS",
        "instance_name": "inst2.example.com",
        "mode": "replace_new_secondary",
        "remote_node": "node4.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_FAILOVER",
        "instance_name": "inst8.example.com"
      }
    ]
  ]

Accepted opcodes:

- ``OP_INSTANCE_FAILOVER``
- ``OP_INSTANCE_MIGRATE``
- ``OP_INSTANCE_REPLACE_DISKS``

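A consumer of such a result could check the two rules stated above (the
instance lists cover the request, and only accepted opcodes appear)
along these lines. This is a minimal sketch: ``check_result`` and its
argument names are hypothetical, and it assumes each jobset is a list of
opcode dictionaries shaped like the ones in the example.

.. code-block:: python

  ACCEPTED_OPCODES = frozenset([
      "OP_INSTANCE_FAILOVER",
      "OP_INSTANCE_MIGRATE",
      "OP_INSTANCE_REPLACE_DISKS",
  ])


  def check_result(requested, moved, failed, jobsets):
      """Raise ValueError if the result breaks one of the stated rules."""
      # The union of both instance lists must equal the requested set.
      if set(moved) | set(failed) != set(requested):
          raise ValueError("moved and failed lists do not cover the request")
      # Every serialized opcode must be one of the accepted kinds.
      for jobset in jobsets:
          for opcode in jobset:
              op_id = opcode.get("OP_ID")
              if op_id not in ACCEPTED_OPCODES:
                  raise ValueError("opcode not accepted: %r" % (op_id,))
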
Starting with the first set, Ganeti will submit all jobs of a set at the
same time, enabling execution in parallel. Upon completion of all jobs
in a set, the process is repeated for the next one. Ganeti is at liberty
to abort the execution after any jobset. In such a case the user is
notified and can restart the operation.

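The execution order described above can be illustrated with a small
sketch that uses a thread pool in place of Ganeti's job queue. The
names ``run_jobsets``, ``run_job`` and ``should_abort`` are hypothetical
and stand in for whatever submits a job and decides to stop; this is
not how Ganeti itself is implemented.

.. code-block:: python

  from concurrent.futures import ThreadPoolExecutor


  def run_jobsets(jobsets, run_job, should_abort=lambda: False):
      """Run jobsets in order and return how many were completed.

      All jobs of one set are started together; the next set starts
      only once every job of the current set has finished.
      """
      completed = 0
      for jobset in jobsets:
          workers = max(1, len(jobset))
          with ThreadPoolExecutor(max_workers=workers) as pool:
              # list() waits for all jobs and re-raises any job error.
              list(pool.map(run_job, jobset))
          completed += 1
          # Execution may be aborted after any jobset.
          if should_abort():
              break
      return completed

The return value tells the caller how many jobsets finished, which is
what would be needed to notify the user after an abort.
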
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: