===================================
Moving instances across node groups
===================================

This design document explains the changes needed in Ganeti to perform
instance moves across node groups. Reader familiarity with the following
existing documents is advised:

- :doc:`Current IAllocator specification <iallocator>`
- :doc:`Shared storage model in 2.3+ <design-shared-storage>`

Motivation and design proposal
==============================

At the moment, moving instances away from their primary or secondary
nodes with the ``relocate`` and ``multi-evacuate`` IAllocator calls
restricts target nodes to those in the same node group. This ensures a
mobility domain is never crossed, and allows normal operation of each
node group to be confined within itself.

It is desirable, however, to have a way of moving instances across node
groups so that, for example, it is possible to move a set of instances
to another group for policy reasons, or completely empty a given group
to perform maintenance operations.

To implement this, we propose a new ``multi-relocate`` IAllocator call
that will be able to compute inter-group instance moves, taking into
account mobility domains as appropriate. The interface proposed below
should be enough to cover the use cases mentioned above.

.. _multi-reloc-detailed-design:

Detailed design
===============

We introduce a new ``multi-relocate`` IAllocator call whose input will
be a list of instances to move, and a "mode of operation" that will
determine what groups will be candidates to receive the new instances.

The mode of operation will be one of:

- *Stay in group*: the instances will be moved off their current nodes,
  but will stay in the same group; this is what the ``relocate`` call
  does, but here it can act on multiple instances. (Typically, the
  source nodes will be marked as drained, to avoid just exchanging
  instances among them.)

- *Change group*: this mode accepts one extra parameter, a list of node
  group UUIDs; the instances will be moved away from their current
  group, to any of the groups in this list. If the list is empty, the
  request is, simply, "change group": the instances are placed in any
  group but their original one.

- *Any*: for each instance, any group is valid, including its current
  one.

In all modes, the groups' ``alloc_policy`` attribute will be honored.

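The three modes can be read as a filter over the cluster's node groups.
The following is only a minimal sketch of that reading, not the
allocator's actual logic: the ``RelocMode`` names, the
``candidate_groups`` helper and the shape of ``all_groups`` are
hypothetical, and it assumes that honoring ``alloc_policy`` means
skipping ``unallocable`` groups and trying ``preferred`` groups before
``last_resort`` ones.

```python
from enum import Enum


class RelocMode(Enum):
    """Hypothetical names for the three modes of operation."""
    KEEP_GROUP = "stay in group"
    CHANGE_GROUP = "change group"
    ANY_GROUP = "any"


# Assumed ordering of alloc_policy values; groups whose policy is not
# listed here (for example "unallocable") never receive instances.
POLICY_RANK = {"preferred": 0, "last_resort": 1}


def candidate_groups(mode, current_group, all_groups, target_groups=()):
    """Return the group UUIDs that may receive an instance.

    all_groups maps group UUID -> alloc_policy string (assumed shape).
    target_groups is only consulted in CHANGE_GROUP mode.
    """
    if mode is RelocMode.KEEP_GROUP:
        pool = [current_group]
    elif mode is RelocMode.CHANGE_GROUP:
        # An empty list means "any group but the original one".
        pool = [g for g in (target_groups or all_groups)
                if g != current_group]
    elif mode is RelocMode.ANY_GROUP:
        pool = list(all_groups)
    else:
        raise ValueError("unknown mode: %r" % (mode,))

    # Drop unknown or unallocable groups, then order by policy.
    allowed = [g for g in pool if all_groups.get(g) in POLICY_RANK]
    return sorted(allowed, key=lambda g: POLICY_RANK[all_groups[g]])
```

For example, with groups ``g1`` (preferred, current), ``g2``
(last_resort) and ``g3`` (unallocable), *change group* with an empty
list leaves only ``g2`` as a candidate under these assumptions.
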
.. _multi-reloc-result:

Result
------

In all storage models, an inter-group move can be modeled as a sequence
of **replace secondary**, **migration** and **failover** operations
(when shared storage is used, they will all be failover or migration
operations within the corresponding mobility domain).

The result is expected to be a list of jobsets. Each jobset contains
lists of serialized opcodes. Example::

  [
    [
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst1.example.com"
      },
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst2.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_REPLACE_DISKS",
        "instance_name": "inst2.example.com",
        "mode": "replace_new_secondary",
        "remote_node": "node4.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_FAILOVER",
        "instance_name": "inst8.example.com"
      }
    ]
  ]

Accepted opcodes:

- ``OP_INSTANCE_FAILOVER``
- ``OP_INSTANCE_MIGRATE``
- ``OP_INSTANCE_REPLACE_DISKS``

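A consumer of the allocator's answer could check it against the list of
accepted opcodes before submitting anything. The sketch below is purely
illustrative: ``validate_result`` is a hypothetical helper, not part of
Ganeti, and it assumes each jobset entry is a single serialized opcode
(a dictionary carrying ``OP_ID``), as in the example above.

```python
import json

ACCEPTED_OPCODES = frozenset([
    "OP_INSTANCE_FAILOVER",
    "OP_INSTANCE_MIGRATE",
    "OP_INSTANCE_REPLACE_DISKS",
])


def validate_result(result):
    """Return result unchanged, or raise ValueError if it is malformed.

    Assumes the shape shown in the example: a list of jobsets, each
    jobset being a list of dictionaries with an "OP_ID" key.
    """
    if not isinstance(result, list):
        raise ValueError("result must be a list of jobsets")
    for idx, jobset in enumerate(result):
        if not isinstance(jobset, list):
            raise ValueError("jobset %d is not a list" % idx)
        for op in jobset:
            if not isinstance(op, dict) or "OP_ID" not in op:
                raise ValueError("jobset %d: opcode lacks OP_ID" % idx)
            if op["OP_ID"] not in ACCEPTED_OPCODES:
                raise ValueError("jobset %d: opcode %s not accepted"
                                 % (idx, op["OP_ID"]))
    return result


# Usage: parse the allocator's JSON answer, then validate it.
raw = """
[
  [{"OP_ID": "OP_INSTANCE_MIGRATE",
    "instance_name": "inst1.example.com"}],
  [{"OP_ID": "OP_INSTANCE_FAILOVER",
    "instance_name": "inst8.example.com"}]
]
"""
jobsets = validate_result(json.loads(raw))
```
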
Starting with the first set, Ganeti will submit all jobs of a set at the
same time, enabling execution in parallel. Upon completion of all jobs
in a set, the process is repeated for the next one. Ganeti is at liberty
to abort the execution of the relocation after any jobset. In such a
case the user is notified and can restart the relocation.

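The execution order described above can be sketched as a loop over
jobsets with a barrier after each one. This is an illustration only:
``run_jobsets`` and the ``submit_job`` callable are hypothetical
stand-ins for Ganeti's job submission, threads stand in for parallel
jobs, and aborting as soon as a jobset contains a failed job is one
assumed policy among those the text allows.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobsets(jobsets, submit_job):
    """Run jobsets in order; return how many completed successfully.

    submit_job is a hypothetical callable that executes one job and
    returns True on success. All jobs of a jobset run in parallel; the
    next jobset starts only once every job of the current one is done.
    """
    completed = 0
    for jobset in jobsets:
        if jobset:
            with ThreadPoolExecutor(max_workers=len(jobset)) as pool:
                # Leaving the "with" block waits for all jobs in the set.
                results = list(pool.map(submit_job, jobset))
            if not all(results):
                # Assumed policy: abort after a jobset with a failure.
                break
        completed += 1
    return completed
```

A caller can compare the return value with ``len(jobsets)`` to decide
whether the user needs to be told that the relocation was aborted and
has to be restarted.
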
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: