Revision f5d723fe doc/design-2.3.rst
To manage node groups and the nodes belonging to them, the following new
commands and flags will be introduced::

  gnt-group add <group>                     # add a new node group
  gnt-group del <group>                     # delete an empty node group
  gnt-group list                            # list node groups
  gnt-group rename <oldname> <newname>      # rename a node group
  gnt-node {list,info} -g <group>           # list only nodes belonging to a node group
  gnt-node modify -g <group>                # assign a node to a node group

Node group attributes
+++++++++++++++++++++

In clusters with more than one node group, it may be desirable to
establish local policies regarding which groups should be preferred when
performing allocation of new instances, or inter-group instance
migrations.

To help with this, we will provide an ``alloc_policy`` attribute for
node groups. This attribute will be honored by iallocator plugins when
making automatic decisions regarding instance placement.

The ``alloc_policy`` attribute can have the following values:

- unallocable: the node group should not be a candidate for instance
  allocations, and the operation should fail if only groups in this
  state could be found that would satisfy the requirements.

- last_resort: the node group should not be used for instance
  allocations, unless this would be the only way to have the operation
  succeed.

- preferred: the node group can be used freely for allocation of
  instances (this is the default state for newly created node
  groups). Note that prioritization among groups in this state will be
  deferred to the iallocator plugin that's being used.

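As an illustration, an iallocator plugin honoring ``alloc_policy`` might
filter candidate groups along these lines. This is only a sketch: the
function name and the dictionary layout are hypothetical, not the actual
Ganeti iallocator protocol.

```python
# Hypothetical sketch of alloc_policy handling in an iallocator plugin.
# The data layout is illustrative, not the real Ganeti interface.

def pick_candidate_groups(groups):
    """Return the groups to consider for allocation.

    ``groups`` is a list of dicts, each with a "name" and an
    "alloc_policy" key ("preferred", "last_resort" or "unallocable").
    """
    preferred = [g for g in groups if g["alloc_policy"] == "preferred"]
    if preferred:
        return preferred
    # Fall back to last_resort groups only when no preferred group exists.
    last_resort = [g for g in groups if g["alloc_policy"] == "last_resort"]
    if last_resort:
        return last_resort
    # Only unallocable groups remain: the allocation must fail.
    raise ValueError("no node group with a usable alloc_policy")
```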
Node group operations
+++++++++++++++++++++

One operation at the node group level will be initially provided::

  gnt-group drain <group>

The purpose of this operation is to migrate all instances in a given
node group to other groups in the cluster, e.g. to reclaim capacity if
there are enough free resources in other node groups that share a
storage pool with the evacuated group.

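Conceptually, draining a group is a loop over its instances, moving each
one to some other group. The following is a rough sketch of that loop
only; the helper callables are hypothetical stand-ins, not Ganeti code.

```python
# Rough sketch of the drain logic described above. The helpers passed in
# (list_instances, find_target_group, move_instance) are hypothetical
# stand-ins for the real instance-evacuation machinery.

def drain_group(group, all_groups, list_instances, find_target_group,
                move_instance):
    """Move every instance out of ``group`` into another group."""
    targets = [g for g in all_groups if g != group]
    moved = []
    for inst in list_instances(group):
        target = find_target_group(inst, targets)
        move_instance(inst, target)
        moved.append((inst, target))
    return moved
```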
Instance level changes
++++++++++++++++++++++

With the introduction of node groups, instances will be required to live
in only one group at a time; this is mostly important for DRBD
instances, which will not be allowed to have their primary and secondary
nodes in different node groups. To support this, we envision the
following changes:

- The iallocator interface will be augmented, and node groups exposed,
  so that plugins will be able to make a decision regarding the group
  in which to place a new instance. By default, all node groups will
  be considered, but it will be possible to include a list of groups
  in the creation job, in which case the plugin will limit itself to
  considering those; in both cases, the ``alloc_policy`` attribute
  will be honored.
- If, on the other hand, a primary and a secondary node are specified
  for a new instance, they will be required to be in the same node
  group.
- Moving an instance between groups can only happen via an explicit
  operation, which for example in the case of DRBD will work by
  internally performing a replace-disks, a migration, and a second
  replace-disks. It will be possible to clean up an interrupted
  group-move operation.
- Cluster verify will signal an error if an instance has nodes
  belonging to different groups. Additionally, changing the group of a
  given node will initially only be allowed if the node is empty, as a
  straightforward mechanism to avoid creating such a situation.
- Inter-group instance migration will have the same operation modes as
  new instance allocation, defined above: letting an iallocator plugin
  decide the target group, possibly restricting the set of node groups
  to consider, or specifying a target primary and secondary nodes. In
  both cases, the target group or nodes must be able to accept the
  instance network- and storage-wise; the operation will fail
  otherwise, though in the future we may be able to allow some
  parameters to be changed together with the move (in the meantime, an
  import/export will be required in this scenario).

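The replace-disks/migrate/replace-disks sequence for a DRBD group move
can be illustrated as follows. This is only an illustration of the step
ordering; the function name and node names are hypothetical, not the
Ganeti job API.

```python
# Illustrative sketch of the DRBD inter-group move sequence described
# above. Each step records the sub-operation and the resulting
# (primary, secondary) node pair; names are hypothetical.

def drbd_group_move_steps(primary, secondary, new_primary, new_secondary):
    """Return the ordered sub-operations for moving a DRBD instance
    from (primary, secondary) to (new_primary, new_secondary)."""
    return [
        # 1. Replace the secondary with a node in the target group.
        ("replace-disks", (primary, new_primary)),
        # 2. Migrate: the target-group node becomes the primary.
        ("migrate", (new_primary, primary)),
        # 3. Replace the remaining old node with the target secondary.
        ("replace-disks", (new_primary, new_secondary)),
    ]
```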
Internal changes
++++++++++++++++

We expect the following changes for cluster management:

- Frequent multinode operations, such as os-diagnose or cluster-verify,
  will act on one group at a time, which will have to be specified in
  all cases, except for clusters with just one group. Command line
  tools will also have a way to easily target all groups, by
  generating one job per group.
- Groups will have a human-readable name, but will internally always
  be referenced by a UUID, which will be immutable; for example, nodes
  will contain the UUID of the group they belong to. This is done
  to simplify referencing while keeping it easy to handle renames and
  movements. If we see that this works well, we'll transition other
  config objects (instances, nodes) to the same model.
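The UUID-based referencing scheme can be sketched as follows; the class
names here are illustrative, not Ganeti's actual configuration objects.
The point is that renaming a group never invalidates the references
nodes hold to it.

```python
# Minimal sketch of UUID-based group referencing as described above:
# nodes store the group's immutable UUID, so renaming a group never
# touches the node objects. Class names are illustrative.
import uuid

class NodeGroup:
    def __init__(self, name):
        self.uuid = str(uuid.uuid4())  # immutable internal identifier
        self.name = name               # human-readable, may change

class Node:
    def __init__(self, name, group):
        self.name = name
        self.group_uuid = group.uuid   # reference by UUID, not by name
```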
...
specification, the total available count is the count for the given
entry, plus the sum of counts for higher specifications.


Node flags
----------