HROLLER(1) Ganeti | Version @GANETI_VERSION@
============================================

NAME
----

hroller \- Cluster rolling maintenance scheduler for Ganeti

SYNOPSIS
--------

**hroller** {backend options...} [algorithm options...] [reporting options...]

**hroller** \--version


Backend options:

{ **-m** *cluster* | **-L[** *path* **]** | **-t** *data-file* |
**-I** *path* }

**[ --force ]**

Algorithm options:

**[ -G *name* ]**
**[ -O *name...* ]**
**[ --node-tags** *tag,...* **]**
**[ --skip-non-redundant ]**

**[ --offline-maintenance ]**
**[ --ignore-non-redundant ]**

Reporting options:

**[ -v... | -q ]**
**[ -S *file* ]**
**[ --one-step-only ]**

DESCRIPTION
-----------

hroller is a cluster maintenance reboot scheduler. It calculates which
sets of nodes can be rebooted at the same time without rebooting both
the primary and the secondary node of an instance simultaneously.

For backends that support identifying the master node (currently
RAPI and LUXI), the master node is scheduled as the last node
in the last reboot group. Apart from this restriction, larger reboot
groups are put first.
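
The ordering rule above can be illustrated with a minimal Python
sketch. It is not hroller's implementation; the function name
``order_groups`` and the representation of reboot groups as lists of
node names are assumptions made for this example only.

```python
def order_groups(groups, master=None):
    """Order reboot groups: larger groups first, master node last.

    `groups` is a list of lists of node names (hypothetical
    representation); `master` is the master node name, or None if the
    backend cannot identify it.
    """
    # Groups without the master node are sorted by decreasing size.
    plain = [list(g) for g in groups if master not in g]
    plain.sort(key=len, reverse=True)

    # The group holding the master goes last, with the master at its end.
    with_master = [list(g) for g in groups if master in g]
    for group in with_master:
        group.remove(master)
        group.append(master)

    return plain + with_master
```

Calling ``order_groups(groups, master="node1.example.com")`` would keep
the master node's group at the end regardless of its size, which is the
one exception to the larger-groups-first rule.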

ALGORITHM FOR CALCULATING OFFLINE REBOOT GROUPS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

hroller views the nodes as vertices of an undirected graph
with two kinds of edges. Firstly, there are edges from the primary
to the secondary node of every instance. Secondly, two nodes are connected
by an edge if they are the primary nodes of two instances that have the
same secondary node. It then colors the graph using a few different
heuristics and returns the coloring that uses the fewest colors. Nodes
with the same color can then simultaneously migrate all instances off to
their respective secondary nodes, and it is safe to reboot them
simultaneously.
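
The graph construction and coloring described above can be sketched in
Python as follows. This is an illustrative sketch, not hroller's code:
it applies a single greedy heuristic (highest degree first), whereas
hroller tries several heuristics and keeps the best result. The
function names and the ``(primary, secondary)`` tuple representation of
instances are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(nodes, instances, offline=False):
    """Build the adjacency sets of the node graph.

    `instances` is an iterable of (primary, secondary) node-name pairs.
    With offline=True only primary-secondary edges are added, mirroring
    the description of --offline-maintenance.
    """
    adj = {n: set() for n in nodes}
    by_secondary = defaultdict(set)
    for primary, secondary in instances:
        # First kind of edge: primary <-> secondary of an instance.
        adj[primary].add(secondary)
        adj[secondary].add(primary)
        by_secondary[secondary].add(primary)
    if not offline:
        # Second kind of edge: primaries sharing the same secondary.
        for primaries in by_secondary.values():
            for a, b in combinations(sorted(primaries), 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def greedy_coloring(adj):
    """Color vertices greedily, highest degree first (one heuristic)."""
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    color = {}
    for node in order:
        used = {color[m] for m in adj[node] if m in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color


def reboot_groups(adj):
    """Group nodes by color; each group can be rebooted together."""
    groups = defaultdict(list)
    for node, c in sorted(greedy_coloring(adj).items()):
        groups[c].append(node)
    return [groups[c] for c in sorted(groups)]
```

A greedy coloring is not guaranteed to use the minimum number of
colors, which is one reason to run several heuristics and keep the
smallest result.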

OPTIONS
-------

For a description of the standard options check **htools**\(7) and
**hbal**\(1).

\--force
  Do not fail, even if the master node cannot be determined.

\--node-tags *tag,...*
  Restrict to nodes having at least one of the given tags.

\--skip-non-redundant
  Restrict to nodes not hosting any non-redundant instance.

\--offline-maintenance
  Pretend that all instances are shut down before the reboots are
  carried out. That is, only edges from the primary to the secondary
  node of an instance are considered.

\--ignore-non-redundant
  Pretend that the non-redundant instances do not exist, and only take
  instances with primary and secondary node into account.

\--one-step-only
  Restrict to the first reboot group. Output the group one node per line.


BUGS
----

Offline nodes should be ignored.

If instances are online, the tool should refuse to do offline rolling
maintenances unless explicitly requested.

End-to-end shelltests should be provided.

Online rolling maintenances (where instances need not be shut down, but
are migrated from node to node) are not supported yet. By design, hroller
should support them both with and without secondary node replacement.

EXAMPLES
--------

Online Rolling reboots, using tags
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Selecting by tags and getting output for one step only can be used for
planning the next maintenance step.
::

   $ hroller --node-tags needsreboot --one-step-only -L
   'First Reboot Group'
    node1.example.com
    node3.example.com

Typically these nodes would be drained and migrated.
::

   $ GROUP=`hroller --node-tags needsreboot --one-step-only --no-headers -L`
   $ for node in $GROUP; do gnt-node modify -D yes $node; done
   $ for node in $GROUP; do gnt-node migrate -f --submit $node; done

After maintenance, the tags would be removed and the nodes undrained.


Offline Rolling node reboot output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If all instances are shut down, usually larger node groups can be found.
::

   $ hroller --offline-maintenance -L
   'Node Reboot Groups'
    node1.example.com,node3.example.com,node5.example.com
    node8.example.com,node6.example.com,node2.example.com
    node7.example.com,node4.example.com

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: