====================================
Moving instances across node groups
====================================

    
This design document explains the changes needed in Ganeti to perform
instance moves across node groups. Reader familiarity with the following
existing documents is advised:

- :doc:`Current IAllocator specification <iallocator>`
- :doc:`Shared storage model in 2.3+ <design-shared-storage>`

    
Motivation and design proposal
==============================

At the moment, moving instances away from their primary or secondary
nodes with the ``relocate`` and ``multi-evacuate`` IAllocator calls
restricts target nodes to those in the same node group. This ensures a
mobility domain is never crossed, and allows normal operation of each
node group to be confined within itself.

It is desirable, however, to have a way of moving instances across node
groups so that, for example, it is possible to move a set of instances
to another group for policy reasons, or completely empty a given group
to perform maintenance operations.

To implement this, we propose a new ``multi-relocate`` IAllocator call
that will be able to compute inter-group instance moves, taking into
account mobility domains as appropriate. The interface proposed below
should be enough to cover the use cases mentioned above.

    
.. _multi-reloc-detailed-design:

Detailed design
===============

    
We introduce a new ``multi-relocate`` IAllocator call whose input will
be a list of instances to move, and a "mode of operation" that will
determine which groups will be candidates to receive the moved
instances.

The mode of operation will be one of:

- *Stay in group*: the instances will be moved off their current nodes,
  but will stay in the same group; this is what the ``relocate`` call
  does, but here it can act on multiple instances. (Typically, the
  source nodes will be marked as drained, to avoid just exchanging
  instances among them.)

- *Change group*: this mode accepts one extra parameter, a list of node
  group UUIDs; the instances will be moved away from their current
  group, to any of the groups in this list. If the list is empty, the
  request is, simply, "change group": the instances are placed in any
  group but their original one.

- *Any*: for each instance, any group is valid, including its current
  one.

In all modes, the groups' ``alloc_policy`` attribute will be honored.
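
The three modes can be illustrated with a small sketch of how candidate
groups might be selected for one instance. This is not part of the
proposed interface: the function name, the mode strings and the
``policies`` mapping are hypothetical, and the sketch assumes that
honoring ``alloc_policy`` means skipping groups marked ``unallocable``
and trying ``preferred`` groups before ``last_resort`` ones.

```python
# Assumed alloc_policy values; only PREFERRED and UNALLOCABLE are consulted.
PREFERRED = "preferred"
UNALLOCABLE = "unallocable"


def candidate_groups(mode, current_group, policies, target_groups=()):
    """Return candidate group UUIDs for one instance, best first.

    mode          -- hypothetical mode string: "stay-in-group",
                     "change-group" or "any"
    current_group -- UUID of the group the instance lives in now
    policies      -- hypothetical mapping of group UUID -> alloc_policy
    target_groups -- extra parameter of the "change-group" mode
    """
    if mode == "stay-in-group":
        wanted = [current_group]
    elif mode == "change-group":
        # An empty list means "any group but the original one".
        pool = target_groups or policies.keys()
        wanted = [g for g in pool if g != current_group]
    elif mode == "any":
        wanted = list(policies)
    else:
        raise ValueError("unknown mode: %r" % (mode,))

    # Assumption: unknown and unallocable groups are never candidates.
    allowed = [g for g in wanted
               if g in policies and policies[g] != UNALLOCABLE]
    # Stable sort: preferred groups first, request order otherwise kept.
    return sorted(allowed, key=lambda g: policies[g] != PREFERRED)
```

With an empty ``target_groups`` list, the *Change group* mode falls back
to every known group except the current one, which matches the
description above.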

    
.. _multi-reloc-result:

Result
------

In all storage models, an inter-group move can be modeled as a sequence
of **replace secondary**, **migration** and **failover** operations
(when shared storage is used, they will all be failover or migration
operations within the corresponding mobility domain).

The result is expected to be a list of jobsets. Each jobset contains
lists of serialized opcodes. Example::

    
  [
    [
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst1.example.com"
      },
      { "OP_ID": "OP_INSTANCE_MIGRATE",
        "instance_name": "inst2.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_REPLACE_DISKS",
        "instance_name": "inst2.example.com",
        "mode": "replace_new_secondary",
        "remote_node": "node4.example.com"
      }
    ],
    [
      { "OP_ID": "OP_INSTANCE_FAILOVER",
        "instance_name": "inst8.example.com"
      }
    ]
  ]

    
Accepted opcodes:

- ``OP_INSTANCE_FAILOVER``
- ``OP_INSTANCE_MIGRATE``
- ``OP_INSTANCE_REPLACE_DISKS``
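
A consumer of the result might want to check it before submitting
anything. The following is a minimal sketch of such a check, not
Ganeti's actual code: ``check_result`` is a hypothetical helper, and it
assumes the shape shown in the example above, where every entry of a
jobset is a single serialized opcode (a JSON object with an ``OP_ID``
key).

```python
import json

ACCEPTED_OPCODES = frozenset([
    "OP_INSTANCE_FAILOVER",
    "OP_INSTANCE_MIGRATE",
    "OP_INSTANCE_REPLACE_DISKS",
])


def check_result(jobsets):
    """Raise ValueError if the result uses an unexpected shape or opcode.

    Hypothetical helper: jobsets is the decoded result, a list of
    jobsets, each of which is a list of opcode dictionaries.
    """
    if not isinstance(jobsets, list):
        raise ValueError("result must be a list of jobsets")
    for set_idx, jobset in enumerate(jobsets):
        if not isinstance(jobset, list):
            raise ValueError("jobset %d is not a list" % set_idx)
        for op in jobset:
            if not isinstance(op, dict) or "OP_ID" not in op:
                raise ValueError("jobset %d: malformed opcode" % set_idx)
            if op["OP_ID"] not in ACCEPTED_OPCODES:
                raise ValueError("jobset %d: opcode %s not accepted"
                                 % (set_idx, op["OP_ID"]))


# Usage with a result received as JSON text.
result = json.loads("""
[
  [{"OP_ID": "OP_INSTANCE_MIGRATE", "instance_name": "inst1.example.com"}],
  [{"OP_ID": "OP_INSTANCE_FAILOVER", "instance_name": "inst8.example.com"}]
]
""")
check_result(result)  # passes silently
```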

    
Starting with the first set, Ganeti will submit all jobs of a set at the
same time, enabling execution in parallel. Upon completion of all jobs
in a set, the process is repeated for the next one. Ganeti is at liberty
to abort the execution of the relocation after any jobset. In such a
case the user is notified and can restart the relocation.
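
The execution order described above can be sketched as follows. This is
an illustration only, not how Ganeti's job queue is implemented:
``run_jobsets``, ``run_job`` and ``should_abort`` are hypothetical
names, and threads merely stand in for jobs that are submitted together
and may run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobsets(jobsets, run_job, should_abort=lambda results: False):
    """Run jobsets in order and return how many were executed.

    run_job      -- hypothetical callable that executes one job and
                    returns its result
    should_abort -- hypothetical callable that inspects the results of
                    a finished jobset and returns True to stop
    """
    executed = 0
    for jobset in jobsets:
        # All jobs of a set are submitted at the same time.
        workers = max(1, len(jobset))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Leaving the "with" block waits for the whole set.
            results = list(pool.map(run_job, jobset))
        executed += 1
        # Execution may only be aborted between jobsets.
        if should_abort(results):
            break
    return executed
```

A caller could compare the return value with ``len(jobsets)`` to decide
whether the user has to be notified that the relocation was aborted.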

    
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: