Revision f5d723fe doc/design-2.3.rst

b/doc/design-2.3.rst
51 51
To manage node groups and the nodes belonging to them, the following new
52 52
commands and flags will be introduced::
53 53

  
54
  gnt-node group-add <group> # add a new node group
55
  gnt-node group-del <group> # delete an empty group
56
  gnt-node group-list # list node groups
57
  gnt-node group-rename <oldname> <newname> # rename a group
58
  gnt-node list/info -g <group> # list only nodes belonging to a group
59
  gnt-node add -g <group> # add a node to a certain group
60
  gnt-node modify -g <group> # move a node to a new group
54
  gnt-group add <group> # add a new node group
55
  gnt-group del <group> # delete an empty node group
56
  gnt-group list # list node groups
57
  gnt-group rename <oldname> <newname> # rename a node group
58
  gnt-node {list,info} -g <group> # list only nodes belonging to a node group
59
  gnt-node modify -g <group> # assign a node to a node group
60

  
61
Node group attributes
62
+++++++++++++++++++++
63

  
64
In clusters with more than one node group, it may be desirable to
65
establish local policies regarding which groups should be preferred when
66
performing allocation of new instances, or inter-group instance migrations.
67

  
68
To help with this, we will provide an ``alloc_policy`` attribute for
69
node groups. This attribute will be honored by iallocator plugins when
70
making automatic decisions regarding instance placement.
71

  
72
The ``alloc_policy`` attribute can have the following values:
73

  
74
- unallocable: the node group should not be a candidate for instance
75
  allocations, and the operation should fail if only groups in this
76
  state could be found that would satisfy the requirements.
77

  
78
- last_resort: the node group should not be used for instance
79
  allocations, unless this would be the only way to have the operation
80
  succeed.
81

  
82
- preferred: the node group can be used freely for allocation of
83
  instances (this is the default state for newly created node
84
  groups). Note that prioritization among groups in this state will be
85
  deferred to the iallocator plugin that's being used.
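
As an illustration only, the three values above could be honored by a
plugin roughly as in the following Python sketch. The function name
``candidate_groups``, its arguments and the data layout are assumptions
made for this example; they are not part of the iallocator interface.

```python
# Hypothetical helper: names and data layout are illustrative only.
PREFERRED = "preferred"
LAST_RESORT = "last_resort"
UNALLOCABLE = "unallocable"


def candidate_groups(groups, fits):
    """Return the groups to try for an allocation, honoring alloc_policy.

    groups: dict mapping group name -> alloc_policy string
    fits:   callable(group name) -> bool, True if the group could satisfy
            the request (the resource checks themselves are out of scope)
    """
    usable = [name for name in sorted(groups) if fits(name)]

    preferred = [g for g in usable if groups[g] == PREFERRED]
    if preferred:
        # Prioritization among preferred groups is left to the plugin.
        return preferred

    last_resort = [g for g in usable if groups[g] == LAST_RESORT]
    if last_resort:
        # Used only because no preferred group can take the instance.
        return last_resort

    # Only unallocable groups (or none at all) fit: the operation fails.
    raise ValueError("no allocable node group satisfies the request")
```

With one group in each state, the sketch returns only the preferred
group, falls back to the ``last_resort`` group when the preferred one
does not fit, and raises an error when only the ``unallocable`` group
would fit.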
86

  
87
Node group operations
88
+++++++++++++++++++++
89

  
90
One operation at the node group level will be initially provided::
91

  
92
  gnt-group drain <group>
93

  
94
The purpose of this operation is to migrate all instances in a given
95
node group to other groups in the cluster, e.g. to reclaim capacity if
96
there are enough free resources in other node groups that share a
97
storage pool with the evacuated group.
61 98

  
62 99
Instance level changes
63 100
++++++++++++++++++++++
64 101

  
65
Instances will be able to live in only one group at a time. This is
66
mostly important for DRBD instances, in which case both their primary
67
and secondary nodes will need to be in the same group. To support this
68
we envision the following changes:
69

  
70
  - The cluster will have a default group, which will initially be
71
  - Instance allocation will happen to the cluster's default group
72
    (which will be changeable via ``gnt-cluster modify`` or RAPI) unless
73
    a group is explicitly specified in the creation job (with -g or via
74
    RAPI). Iallocator will be only passed the nodes belonging to that
102
With the introduction of node groups, instances will be required to live
103
in only one group at a time; this is mostly important for DRBD
104
instances, which will not be allowed to have their primary and secondary
105
nodes in different node groups. To support this, we envision the
106
following changes:
107

  
108
  - The iallocator interface will be augmented, and node groups exposed,
109
    so that plugins will be able to make a decision regarding the group
110
    in which to place a new instance. By default, all node groups will
111
    be considered, but it will be possible to include a list of groups
112
    in the creation job, in which case the plugin will limit itself to
113
    considering those; in both cases, the ``alloc_policy`` attribute
114
    will be honored.
115
  - If, on the other hand, primary and secondary nodes are specified
116
    for a new instance, they will be required to be on the same node
75 117
    group.
76 118
  - Moving an instance between groups can only happen via an explicit
77 119
    operation, which for example in the case of DRBD will work by
78 120
    performing internally a replace-disks, a migration, and a second
79 121
    replace-disks. It will be possible to clean up an interrupted
80 122
    group-move operation.
81
  - Cluster verify will signal an error if an instance has been left
82
    mid-transition between groups.
83
  - Inter-group instance migration/failover will check that the target
84
    group will be able to accept the instance network/storage wise, and
85
    fail otherwise. In the future we may be able to make some parameter
86
    changed during the move, but in the first version we expect an
87
    import/export if this is not possible.
88
  - From an allocation point of view, inter-group movements will be
89
    shown to a iallocator as a new allocation over the target group.
90
    Only in a future version we may add allocator extensions to decide
91
    which group the instance should be in. In the meantime we expect
92
    Ganeti administrators to either put instances on different groups by
93
    filling all groups first, or to have their own strategy based on the
94
    instance needs.
123
  - Cluster verify will signal an error if an instance has nodes
124
    belonging to different groups. Additionally, changing the group of a
125
    given node will be initially only allowed if the node is empty, as a
126
    straightforward mechanism to avoid creating such a situation.
127
  - Inter-group instance migration will have the same operation modes as
128
    new instance allocation, defined above: letting an iallocator plugin
129
    decide the target group, possibly restricting the set of node groups
130
    to consider, or specifying target primary and secondary nodes. In
131
    both cases, the target group or nodes must be able to accept the
132
    instance network- and storage-wise; the operation will fail
133
    otherwise, though in the future we may be able to allow some
134
    parameter to be changed together with the move (in the meantime, an
135
    import/export will be required in this scenario).
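
The consistency rules in the list above (an instance's nodes must share
one group, and a node may change group only while it is empty) can be
sketched as follows. This is a minimal Python sketch under assumed data
structures; the function names and the dictionaries are hypothetical and
do not reflect Ganeti's actual configuration objects.

```python
# Hypothetical data layout; not Ganeti's configuration format.
def instances_spanning_groups(node_group, instance_nodes):
    """Return names of instances whose nodes are in more than one group.

    node_group:     dict mapping node name -> group UUID
    instance_nodes: dict mapping instance name -> list of node names
    """
    bad = []
    for inst in sorted(instance_nodes):
        groups = {node_group[n] for n in instance_nodes[inst]}
        if len(groups) > 1:
            bad.append(inst)
    return bad


def can_change_group(node, instance_nodes):
    """A node may change group only if no instance uses it."""
    return all(node not in nodes for nodes in instance_nodes.values())
```

A verify step could report every instance returned by the first
function as an error, while the second function is the kind of guard
that would keep such instances from appearing in the first place.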
95 136

  
96 137
Internal changes
97 138
++++++++++++++++
98 139

  
99 140
We expect the following changes for cluster management:
100 141

  
101
  - Frequent multinode operations, such as os-diagnose or cluster-verify
102
    will act on one group at a time. The default group will be used if none
103
    is passed. Command line tools will have a way to easily target all
104
    groups, by generating one job per group.
142
  - Frequent multinode operations, such as os-diagnose or cluster-verify,
143
    will act on one group at a time, which will have to be specified in
144
    all cases, except for clusters with just one group. Command line
145
    tools will also have a way to easily target all groups, by
146
    generating one job per group.
105 147
  - Groups will have a human-readable name, but will internally always
106
    be referenced by a UUID, which will be immutable. For example the
107
    cluster object will contain the UUID of the default group, each node
108
    will contain the UUID of the group it belongs to, etc. This is done
148
    be referenced by a UUID, which will be immutable; for example, nodes
149
    will contain the UUID of the group they belong to. This is done
109 150
    to simplify referencing while keeping it easy to handle renames and
110 151
    movements. If we see that this works well, we'll transition other
111 152
    config objects (instances, nodes) to the same model.
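
To illustrate why referencing groups by UUID keeps renames cheap, here
is a small Python sketch. The ``Config`` class and its field names are
assumptions made for this example and are not the actual configuration
objects.

```python
import uuid


# Illustrative in-memory model; class and field names are assumptions.
class Config:
    def __init__(self):
        self.groups = {}      # group UUID -> human-readable name
        self.node_group = {}  # node name -> group UUID

    def add_group(self, name):
        group_uuid = str(uuid.uuid4())
        self.groups[group_uuid] = name
        return group_uuid

    def rename_group(self, group_uuid, new_name):
        # Only the name changes; the UUID and all references stay valid.
        self.groups[group_uuid] = new_name

    def group_name_of(self, node):
        return self.groups[self.node_group[node]]
```

Because nodes store only the UUID, renaming a group touches a single
entry and no node record needs to be rewritten.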
......
483 524
specification, the total available count is the count for the given
484 525
entry, plus the sum of counts for higher specifications.
485 526

  
486
Also note that the node group information is provided just
487
informationally, not for allocation decisions.
488

  
489 527

  
490 528
Node flags
491 529
----------
