HROLLER(1) Ganeti | Version @GANETI_VERSION@
============================================

NAME
----

hroller \- Cluster rolling maintenance scheduler for Ganeti

SYNOPSIS
--------

**hroller** {backend options...} [algorithm options...] [reporting options...]

**hroller** \--version


Backend options:

{ **-m** *cluster* | **-L[** *path* **]** | **-t** *data-file* |
**-I** *path* }

**[ --force ]**

Algorithm options:

**[ -G *name* ]**
**[ -O *name...* ]**
**[ --node-tags** *tag,..* **]**
**[ --skip-non-redundant ]**

**[ --offline-maintenance ]**
**[ --ignore-non-redundant ]**

Reporting options:

**[ -v... | -q ]**
**[ -S *file* ]**
**[ --one-step-only ]**
**[ --print-moves ]**

DESCRIPTION
-----------

hroller is a cluster maintenance reboot scheduler. It calculates
which sets of nodes can be rebooted at the same time without
rebooting both the primary and the secondary node of any instance
simultaneously.

For backends that support identifying the master node (currently
RAPI and LUXI), the master node is scheduled as the last node
in the last reboot group. Apart from this restriction, larger reboot
groups are put first.

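The ordering rule above can be illustrated with a minimal Python
sketch. It is not hroller's implementation; the function name
``order_reboot_groups`` and the representation of groups as lists of
node names are assumptions made for illustration only.

```python
def order_reboot_groups(groups, master=None):
    """Order reboot groups: larger groups first, the master's group last.

    groups: list of lists of node names (hypothetical representation).
    master: name of the master node, or None if it is unknown.
    """
    # Groups that do not contain the master are sorted by size, largest
    # first. sorted() is stable, so equally sized groups keep their order.
    rest = [list(g) for g in groups if master is None or master not in g]
    ordered = sorted(rest, key=len, reverse=True)

    # The group containing the master goes last, with the master itself
    # as the last node of that group.
    if master is not None:
        for g in groups:
            if master in g:
                ordered.append([n for n in g if n != master] + [master])
    return ordered
```

If the master is unknown, the sketch falls back to sorting by size
only.
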
ALGORITHM FOR CALCULATING OFFLINE REBOOT GROUPS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

hroller views the nodes as vertices of an undirected graph
with two kinds of edges. First, there is an edge between the primary
and the secondary node of every instance. Second, two nodes are connected
by an edge if they are the primary nodes of two instances that have the
same secondary node. It then colors the graph using a few different
heuristics and returns the smallest set of colors found. Nodes with the
same color can then simultaneously migrate all instances off to their
respective secondary nodes, and it is safe to reboot them simultaneously.

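The graph construction described above can be sketched in Python as
follows. This is an illustration under stated assumptions, not
hroller's code: instances are given as hypothetical
``(primary, secondary)`` pairs, non-redundant instances are left out,
and two simple greedy vertex orders stand in for the heuristics that
hroller actually uses.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(nodes, instances, offline=False):
    """Return an adjacency dict for the node graph.

    instances: iterable of (primary, secondary) node-name pairs.
    offline: if True, only primary-secondary edges are added.
    """
    adj = {n: set() for n in nodes}
    primaries_of = defaultdict(set)  # secondary node -> its primaries
    for pri, sec in instances:
        adj[pri].add(sec)
        adj[sec].add(pri)
        primaries_of[sec].add(pri)
    if not offline:
        # Primaries sharing a secondary must not be in the same group.
        for primaries in primaries_of.values():
            for a, b in combinations(sorted(primaries), 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def greedy_color(adj, order):
    """Give each node the smallest color not used by its neighbors."""
    color = {}
    for n in order:
        used = {color[m] for m in adj[n] if m in color}
        c = 0
        while c in used:
            c += 1
        color[n] = c
    return color


def reboot_groups(adj):
    """Try two vertex orders and keep the coloring with fewest colors."""
    orders = [
        sorted(adj),                                    # by name
        sorted(adj, key=lambda n: (-len(adj[n]), n)),   # by degree
    ]
    best = min((greedy_color(adj, o) for o in orders),
               key=lambda col: len(set(col.values())))
    groups = defaultdict(list)
    for n in sorted(best):
        groups[best[n]].append(n)
    return [groups[c] for c in sorted(groups)]
```

Passing ``offline=True`` to ``build_graph`` drops the second kind of
edge, which usually allows larger groups.
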
OPTIONS
-------

For a description of the standard options check **htools**\(7) and
**hbal**\(1).

\--force
  Do not fail, even if the master node cannot be determined.

\--node-tags *tag,...*
  Restrict to nodes having at least one of the given tags.

\--full-evacuation
  Also plan moving secondaries out of the nodes to be rebooted. For
  each instance the move is at most a migrate (if it was primary
  on that node) followed by a replace secondary.

\--skip-non-redundant
  Restrict to nodes not hosting any non-redundant instance.

\--offline-maintenance
  Pretend that all instances are shut down before the reboots are carried
  out. That is, only edges from the primary to the secondary node of an
  instance are considered.

\--ignore-non-redundant
  Pretend that the non-redundant instances do not exist, and only take
  instances with primary and secondary node into account.

\--one-step-only
  Restrict to the first reboot group. Output the group one node per line.

\--print-moves
  After each group, list for each affected instance a node
  to which it can be evacuated. The moves are computed under the
  assumption that after each reboot group, all instances are moved back
  to their initial position.


BUGS
----

If instances are online, the tool should refuse to plan offline rolling
maintenance unless explicitly requested.

End-to-end shelltests should be provided.

EXAMPLES
--------

Online Rolling reboots, using tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selecting by tags and getting output for one step only can be used for
planning the next maintenance step.
::

  $ hroller --node-tags needsreboot --one-step-only -L
  'First Reboot Group'
  node1.example.com
  node3.example.com

Typically these nodes would be drained and migrated.
::

  $ GROUP=`hroller --node-tags needsreboot --one-step-only --no-headers -L`
  $ for node in $GROUP; do gnt-node modify -D yes $node; done
  $ for node in $GROUP; do gnt-node migrate -f --submit $node; done

After maintenance, the tags would be removed and the nodes undrained.


Offline Rolling node reboot output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If all instances are shut down, usually larger node groups can be found.
::

  $ hroller --offline-maintenance -L
  'Node Reboot Groups'
  node1.example.com,node3.example.com,node5.example.com
  node8.example.com,node6.example.com,node2.example.com
  node7.example.com,node4.example.com

Rolling reboots with non-redundant instances
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, hroller plans capacity to move the non-redundant instances
out of the nodes to be rebooted. If requested, appropriate locations for
the non-redundant instances can be shown. The assumption is that instances
are moved back to their original node after each reboot; these back moves
are not part of the output.
::

  $ hroller --print-moves -L
  'Node Reboot Groups'
  node-01-002,node-01-003
    inst-20 node-01-001
    inst-21 node-01-000
    inst-30 node-01-005
    inst-31 node-01-004
  node-01-004,node-01-005
    inst-40 node-01-001
    inst-41 node-01-000
    inst-50 node-01-003
    inst-51 node-01-002
  node-01-001,node-01-000
    inst-00 node-01-002
    inst-01 node-01-003
    inst-10 node-01-005
    inst-11 node-01-004



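For scripting, output such as the above could be turned into a data
structure with a small Python helper. The sketch below is hypothetical
and not part of Ganeti. It assumes exactly the layout shown in the
example: a quoted header line, group lines holding a comma-separated
node list without spaces, and move lines holding an instance name and
a target node separated by whitespace.

```python
def parse_print_moves(text):
    """Parse `hroller --print-moves` output into (nodes, moves) pairs.

    Returns a list with one entry per reboot group; each entry is a
    tuple of (list of node names, dict mapping instance -> target node).
    """
    plan = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("'"):
            continue  # skip blank lines and the quoted header
        fields = line.split()
        if len(fields) == 2:
            # Move line: "<instance> <target node>"
            if not plan:
                raise ValueError("move line before any reboot group")
            instance, target = fields
            plan[-1][1][instance] = target
        else:
            # Group line: comma-separated node names
            plan.append((line.split(","), {}))
    return plan
```

Each entry of the returned list then describes one reboot group
together with the evacuation targets planned for it.
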
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: