.TH HBAL 1 2009-03-23 htools "Ganeti H-tools"
.SH NAME
hbal \- Cluster balancer for Ganeti

.SH SYNOPSIS
.B hbal
.B "[-C]"
.B "[-p]"
.B "[-o]"
.B "[-v... | -q]"
.BI "[-l " limit "]"
.BI "[-O " name... "]"
.BI "[-e " score "]"
.BI "[-m " cluster "]"
.BI "[-n " nodes-file "]"
.BI "[-i " instances-file "]"

.B hbal
.B --version

.SH DESCRIPTION
hbal is a cluster balancer that looks at the current state of the
cluster (nodes with their total and free disk, memory, etc.) and
instance placement and computes a series of steps designed to bring
the cluster into a better state.

The algorithm to do so is designed to be stable (i.e. it will give you
the same results when restarting it from the middle of the solution)
and reasonably fast. It is not, however, designed to be a perfect
algorithm: it is possible to make it go into a corner from which it
can find no improvement, because it only looks one "step" ahead.

By default, the program will show the solution incrementally as it is
computed, in a somewhat cryptic format; for getting the actual Ganeti
command list, use the \fB-C\fR option.

.SS ALGORITHM
The program works in independent steps; at each step, we compute the
best instance move that lowers the cluster score.

The possible move types for an instance are combinations of
failover/migrate and replace-disks such that we change one of the
instance nodes, and the other one remains (but possibly with changed
role, e.g. from primary it becomes secondary). The list is:
.RS 4
.TP 3
\(em
failover (f)
.TP
\(em
replace secondary (r)
.TP
\(em
replace primary, a composite move (f, r, f)
.TP
\(em
failover and replace secondary, also composite (f, r)
.TP
\(em
replace secondary and failover, also composite (r, f)
.RE

We don't do the only remaining possibility of replacing both nodes
(r,f,r,f or the equivalent f,r,f,r) since this move needs an
exhaustive search over both candidate primary and secondary nodes, and
is O(n*n) in the number of nodes. Furthermore, it doesn't seem to
give better scores but will result in more disk replacements.
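.PP
As a minimal sketch (not the program's actual code), the five move
types listed above can be modelled in Python as short sequences of two
elementary actions. The function names apply_actions and
candidate_moves are hypothetical; the "f" and "r:NODE" notation
follows the action column shown in the EXAMPLE section.
.PP
.nf
```python
def apply_actions(primary, secondary, actions):
    """Apply elementary actions to a (primary, secondary) node pair.

    "f" swaps the two roles (failover/migrate); "r:NODE" replaces the
    secondary node with NODE (replace-disks).
    """
    for action in actions:
        if action == "f":
            primary, secondary = secondary, primary
        elif action.startswith("r:"):
            secondary = action[2:]
        else:
            raise ValueError("unknown action: " + action)
    return primary, secondary


def candidate_moves(new_node):
    """Return the five move types for one candidate target node."""
    replace = "r:" + new_node
    return {
        "failover": ["f"],
        "replace secondary": [replace],
        "replace primary": ["f", replace, "f"],
        "failover and replace secondary": ["f", replace],
        "replace secondary and failover": [replace, "f"],
    }


# Step 1 of the EXAMPLE section: instance14 on node1:node10, target node16.
for name, actions in candidate_moves("node16").items():
    print(name, apply_actions("node1", "node10", actions))
```
.fi
.PP
Each composite move changes exactly one of the two nodes, which is why
a search over a single candidate node per move is enough.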
.SS CLUSTER SCORING

As said before, the algorithm tries to minimise the cluster score at
each step. Currently this score is computed as a sum of the following
components:
.RS 4
.TP 3
\(em
coefficient of variation of the percent of free memory
.TP
\(em
coefficient of variation of the percent of reserved memory
.TP
\(em
coefficient of variation of the percent of free disk
.TP
\(em
percentage of nodes failing N+1 check
.TP
\(em
percentage of instances living (either as primary or secondary) on
offline nodes
.RE

The free memory and free disk values help ensure that all nodes are
somewhat balanced in their resource usage. The reserved memory helps
to ensure that nodes are somewhat balanced in holding secondary
instances, and that no node keeps too much memory reserved for
N+1. And finally, the N+1 percentage helps guide the algorithm towards
eliminating N+1 failures, if possible.

Except for the N+1 failures and offline instances percentage, we use
the coefficient of variation since this brings the values into the same
unit so to speak, and with a restricted domain of values (between zero
and one). The percentage of N+1 failures, while also in this numeric
range, doesn't actually have the same meaning, but it has been shown to
work well.

The other alternative, using for N+1 checks the coefficient of
variation of (N+1 fail=1, N+1 pass=0) across nodes, could hint the
algorithm to make more N+1 failures if most nodes are N+1 fail
already. Since this (making N+1 failures) is not allowed by other
rules of the algorithm, the N+1 checks would simply not work
anymore in this case.

The offline instances percentage (meaning the percentage of instances
living on offline nodes) will cause the algorithm to actively move
instances away from offline nodes. This, coupled with the restriction
on placement given by offline nodes, will cause evacuation of such
nodes.

On a perfectly balanced cluster (all nodes the same size, all
instances the same size and spread across the nodes equally), all
values would be zero. This doesn't happen too often in practice :)
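.PP
The score described in this section could be computed roughly as in
the Python sketch below. This is an illustration under stated
assumptions, not the program's implementation: the dictionary keys
reuse the column names of the \fB-p\fR output, the coefficient of
variation is taken as population standard deviation divided by the
mean, and a node is assumed to fail N+1 when its free memory is below
its reserved memory.
.PP
.nf
```python
from statistics import mean, pstdev


def coeff_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if undefined)."""
    if not values:
        return 0.0
    avg = mean(values)
    if avg == 0:
        return 0.0
    return pstdev(values) / avg


def cluster_score(nodes, instances):
    """Sum the five components listed above; lower is better."""
    online = [n for n in nodes if not n["offline"]]
    offline = {n["name"] for n in nodes if n["offline"]}
    if not online:
        return 0.0
    free_mem = [n["f_mem"] / n["t_mem"] for n in online]
    res_mem = [n["r_mem"] / n["t_mem"] for n in online]
    free_dsk = [n["f_dsk"] / n["t_dsk"] for n in online]
    # Assumption: N+1 fails when free memory is below reserved memory.
    n1_fail = sum(1 for n in online if n["f_mem"] < n["r_mem"]) / len(online)
    # Fraction of instances with either node offline.
    stranded = sum(1 for i in instances
                   if i["pri"] in offline or i["sec"] in offline)
    off_pct = stranded / len(instances) if instances else 0.0
    return (coeff_of_variation(free_mem) + coeff_of_variation(res_mem)
            + coeff_of_variation(free_dsk) + n1_fail + off_pct)
```
.fi
.PP
Offline nodes are left out of the three variation terms and only
contribute through the last component, matching the description of the
\fB-O\fR option.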
.SS OFFLINE INSTANCES

Since current Ganeti versions do not report the memory used by offline
(down) instances, ignoring the run status of instances will cause
wrong calculations. For this reason, the algorithm subtracts the
memory size of down instances from the free node memory of their
primary node, in effect simulating the startup of such instances.
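.PP
A small Python sketch of this adjustment follows. The function name
simulate_startup and the data layout (a mapping of node name to free
memory, and instance records with "pri", "mem" and "running" keys) are
assumptions made for illustration only.
.PP
.nf
```python
def simulate_startup(free_mem, instances):
    """Return per-node free memory with down instances counted as running."""
    adjusted = dict(free_mem)  # do not modify the caller's data
    for inst in instances:
        if not inst["running"]:
            # Reserve the memory the instance would use once started
            # on its primary node.
            adjusted[inst["pri"]] -= inst["mem"]
    return adjusted


free = {"node1": 7280, "node2": 7280}
instances = [
    {"name": "instance1", "pri": "node1", "mem": 1024, "running": False},
    {"name": "instance2", "pri": "node2", "mem": 2048, "running": True},
]
print(simulate_startup(free, instances))
```
.fi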
.SS OTHER POSSIBLE METRICS

It would be desirable to add more metrics to the algorithm, especially
dynamically-computed metrics, such as:
.RS 4
.TP 3
\(em
CPU usage of instances, combined with VCPU versus PCPU count
.TP
\(em
Disk IO usage
.TP
\(em
Network IO
.RE

.SH OPTIONS
The options that can be passed to the program are as follows:
.TP
.B -C, --print-commands
Print the command list at the end of the run. Without this, the
program will only show a shorter, but cryptic output.
.TP
.B -p, --print-nodes
Prints the before and after node status, in a format designed to allow
the user to understand the node's most important parameters.

The node list will contain the following information:
.RS
.TP
.B F
a character denoting the status of the node, with '-' meaning an
offline node, '*' meaning N+1 failure and blank meaning a good node
.TP
.B Name
the node name
.TP
.B t_mem
the total node memory
.TP
.B n_mem
the memory used by the node itself
.TP
.B i_mem
the memory used by instances
.TP
.B x_mem
amount of memory which seems to be in use but for which it cannot be
determined why or by which instance; usually this means that the
hypervisor has some overhead or that there are other reporting errors
.TP
.B f_mem
the free node memory
.TP
.B r_mem
the reserved node memory, which is the amount of free memory needed
for N+1 compliance
.TP
.B t_dsk
total disk
.TP
.B f_dsk
free disk
.TP
.B pri
number of primary instances
.TP
.B sec
number of secondary instances
.TP
.B p_fmem
percent of free memory
.TP
.B p_fdsk
percent of free disk
.RE

.TP
.B -o, --oneline
Only shows a one-line output from the program, designed for the case
when one wants to look at multiple clusters at once and check their
status.

The line will contain four fields:
.RS
.RS 4
.TP 3
\(em
initial cluster score
.TP
\(em
number of steps in the solution
.TP
\(em
final cluster score
.TP
\(em
improvement in the cluster score
.RE
.RE

.TP
.BI "-O " name
This option (which can be given multiple times) will mark nodes as
being \fIoffline\fR. This means a couple of things:
.RS
.RS 4
.TP 3
\(em
instances won't be placed on these nodes, not even temporarily;
e.g. the \fIreplace primary\fR move is not available if the secondary
node is offline, since this move requires a failover.
.TP
\(em
these nodes will not be included in the score calculation (except for
the percentage of instances on offline nodes)
.RE
.RE

.TP
.BI "-e " score ", --min-score=" score
This parameter denotes the minimum score we are happy with and alters
the computation in two ways:
.RS
.RS 4
.TP 3
\(em
if the initial cluster score is lower than this value, then we
don't enter the algorithm at all, and exit with success
.TP
\(em
during the iterative process, if we reach a score lower than this
value, we exit the algorithm
.RE
The default value of the parameter is currently \fI1e-9\fR (chosen
empirically).
.RE

.TP
.BI "-n " nodefile ", --nodes=" nodefile
The name of the file holding node information (if not collecting via
RAPI), instead of the default \fInodes\fR file (but see below how to
customize the default value via the environment).

.TP
.BI "-i " instancefile ", --instances=" instancefile
The name of the file holding instance information (if not collecting
via RAPI), instead of the default \fIinstances\fR file (but see below
how to customize the default value via the environment).

.TP
.BI "-m " cluster
Collect data not from files but directly from the
.I cluster
given as an argument via RAPI. This works for both Ganeti 1.2 and
Ganeti 2.0.

.TP
.BI "-l " N ", --max-length=" N
Restrict the solution to this length. This can be used for example to
automate the execution of the balancing.

.TP
.B -v, --verbose
Increase the output verbosity. Each usage of this option will increase
the verbosity (currently more than 2 doesn't make sense) from the
default of one.

.TP
.B -q, --quiet
Decrease the output verbosity. Each usage of this option will decrease
the verbosity (less than zero doesn't make sense) from the default of
one.

.TP
.B -V, --version
Just show the program version and exit.
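.PP
How the \fB-e\fR and \fB-l\fR options bound the step-by-step search
can be sketched in Python as follows. This is a generic greedy loop
written for illustration, not the program's code: the names balance,
toy_score and toy_moves are hypothetical, and the toy model (per-node
loads, moves of one unit) only stands in for real cluster data.
.PP
.nf
```python
def balance(state, score, candidate_moves, min_score=1e-9, max_length=None):
    """Greedy, one-step-ahead balancing loop.

    score(state) returns the cluster score; candidate_moves(state)
    yields (move, new_state) pairs. Returns the list of (move, score)
    steps and the final state.
    """
    steps = []
    current = score(state)
    # A score already below min_score means there is nothing to do.
    while current >= min_score:
        if max_length is not None and len(steps) >= max_length:
            break
        best = None
        for move, new_state in candidate_moves(state):
            new_score = score(new_state)
            if new_score < current and (best is None or new_score < best[0]):
                best = (new_score, move, new_state)
        if best is None:
            break  # no single move improves the score
        current, move, state = best
        steps.append((move, current))
    return steps, state


# Toy model: the state is a tuple of per-node loads, a move shifts one unit.
def toy_score(state):
    return max(state) - min(state)


def toy_moves(state):
    for src in range(len(state)):
        for dst in range(len(state)):
            if src != dst and state[src] > 0:
                new = list(state)
                new[src] -= 1
                new[dst] += 1
                yield (src, dst), tuple(new)


steps, final = balance((4, 0), toy_score, toy_moves)
print(len(steps), final)
```
.fi
.PP
Because only the best single move is considered at each step, the loop
can stop in a state that a longer look-ahead would still improve.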
.SH EXIT STATUS

The exit status of the command will be zero, unless for some reason
the algorithm fatally failed (e.g. wrong node or instance data).
.SH ENVIRONMENT

If the variables \fBHTOOLS_NODES\fR and \fBHTOOLS_INSTANCES\fR are
present in the environment, they will override the default names for
the nodes and instances files. These will of course have no effect
when RAPI is used.
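.PP
The lookup order implied here can be sketched in Python. The function
name data_file_names is hypothetical, and the precedence shown
(command-line option first, then environment variable, then built-in
default) is an assumption drawn from the option descriptions above.
.PP
.nf
```python
import os


def data_file_names(nodes_opt=None, instances_opt=None, environ=None):
    """Resolve the nodes and instances file names for file-based input."""
    env = os.environ if environ is None else environ
    # -n / -i win; otherwise the environment overrides the defaults.
    nodes = nodes_opt or env.get("HTOOLS_NODES", "nodes")
    instances = instances_opt or env.get("HTOOLS_INSTANCES", "instances")
    return nodes, instances


print(data_file_names(environ={"HTOOLS_NODES": "cluster1.nodes"}))
```
.fi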
.SH BUGS

The program does not check its input data for consistency, and aborts
with cryptic error messages in this case.

The algorithm is not perfect.

The algorithm doesn't deal with non-\fBdrbd\fR instances, and chokes
on input data which has such instances.

The output format is not easily scriptable, and the program should
feed moves directly into Ganeti (either via RAPI or via a gnt-debug
input file).

.SH EXAMPLE
Note that these examples are not for the latest version (they don't have
full node data).

.SS Default output

With the default options, the program shows each individual step and
the improvements it brings in cluster score:

.in +4n
.nf
.RB "$" " hbal"
Loaded 20 nodes, 80 instances
Cluster is not N+1 happy, continuing but no guarantee that the cluster will end N+1 happy.
Initial score: 0.52329131
Trying to minimize the CV...
    1. instance14  node1:node10  => node16:node10 0.42109120 a=f r:node16 f
    2. instance54  node4:node15  => node16:node15 0.31904594 a=f r:node16 f
    3. instance4   node5:node2   => node2:node16  0.26611015 a=f r:node16
    4. instance48  node18:node20 => node2:node18  0.21361717 a=r:node2 f
    5. instance93  node19:node18 => node16:node19 0.16166425 a=r:node16 f
    6. instance89  node3:node20  => node2:node3   0.11005629 a=r:node2 f
    7. instance5   node6:node2   => node16:node6  0.05841589 a=r:node16 f
    8. instance94  node7:node20  => node20:node16 0.00658759 a=f r:node16
    9. instance44  node20:node2  => node2:node15  0.00438740 a=f r:node15
   10. instance62  node14:node18 => node14:node16 0.00390087 a=r:node16
   11. instance13  node11:node14 => node11:node16 0.00361787 a=r:node16
   12. instance19  node10:node11 => node10:node7  0.00336636 a=r:node7
   13. instance43  node12:node13 => node12:node1  0.00305681 a=r:node1
   14. instance1   node1:node2   => node1:node4   0.00263124 a=r:node4
   15. instance58  node19:node20 => node19:node17 0.00252594 a=r:node17
Cluster score improved from 0.52329131 to 0.00252594
.fi
.in

In the above output, we can see:
.RS 4
.TP 3
\(em
the input data (here from files) shows a cluster with 20 nodes and
80 instances
.TP
\(em
the cluster is not initially N+1 compliant
.TP
\(em
the initial score is 0.52329131
.RE

The step list follows, showing the instance, its initial
primary/secondary nodes, the new primary/secondary nodes, the resulting
cluster score, and the actions taken in this step (with 'f' denoting
failover/migrate and 'r' denoting replace secondary).

Finally, the program shows the improvement in cluster score.
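.PP
Since the step lines have a regular layout, they can be split into
fields with a few lines of Python. The sketch below is based only on
the sample output shown above (the format may differ between
versions), and the function name parse_step is hypothetical.
.PP
.nf
```python
def parse_step(line):
    """Split one step line of the default output into its fields."""
    head, actions = line.split(" a=", 1)
    fields = head.split()
    # fields: step number, instance, old pri:sec, "=>", new pri:sec, score
    old_pri, old_sec = fields[2].split(":")
    new_pri, new_sec = fields[4].split(":")
    return {
        "step": int(fields[0].rstrip(".")),
        "instance": fields[1],
        "old": (old_pri, old_sec),
        "new": (new_pri, new_sec),
        "score": float(fields[5]),
        "actions": actions.split(),
    }


step = parse_step(
    "    1. instance14  node1:node10  => node16:node10 0.42109120 a=f r:node16 f"
)
print(step["instance"], step["old"], step["new"], step["actions"])
```
.fi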

A more detailed output is obtained via the \fB-C\fR and \fB-p\fR options:

.in +4n
.nf
.RB "$" " hbal -C -p"
Loaded 20 nodes, 80 instances
Cluster is not N+1 happy, continuing but no guarantee that the cluster will end N+1 happy.
Initial cluster status:
N1 Name   t_mem f_mem r_mem t_dsk f_dsk pri sec  p_fmem  p_fdsk
 * node1  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node2  32762 31280 12000  1861  1026   0   8 0.95476 0.55179
 * node3  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node4  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node5  32762  1280  6000  1861   978   5   5 0.03907 0.52573
 * node6  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node7  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node8  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node9  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
 * node10 32762  7280 12000  1861  1026   4   4 0.22221 0.55179
   node11 32762  7280  6000  1861   922   4   5 0.22221 0.49577
   node12 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node13 32762  7280  6000  1861   922   4   5 0.22221 0.49577
   node14 32762  7280  6000  1861   922   4   5 0.22221 0.49577
 * node15 32762  7280 12000  1861  1131   4   3 0.22221 0.60782
   node16 32762 31280     0  1861  1860   0   0 0.95476 1.00000
   node17 32762  7280  6000  1861  1106   5   3 0.22221 0.59479
 * node18 32762  1280  6000  1396   561   5   3 0.03907 0.40239
 * node19 32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node20 32762 13280 12000  1861   689   3   9 0.40535 0.37068

Initial score: 0.52329131
Trying to minimize the CV...
    1. instance14  node1:node10  => node16:node10 0.42109120 a=f r:node16 f
    2. instance54  node4:node15  => node16:node15 0.31904594 a=f r:node16 f
    3. instance4   node5:node2   => node2:node16  0.26611015 a=f r:node16
    4. instance48  node18:node20 => node2:node18  0.21361717 a=r:node2 f
    5. instance93  node19:node18 => node16:node19 0.16166425 a=r:node16 f
    6. instance89  node3:node20  => node2:node3   0.11005629 a=r:node2 f
    7. instance5   node6:node2   => node16:node6  0.05841589 a=r:node16 f
    8. instance94  node7:node20  => node20:node16 0.00658759 a=f r:node16
    9. instance44  node20:node2  => node2:node15  0.00438740 a=f r:node15
   10. instance62  node14:node18 => node14:node16 0.00390087 a=r:node16
   11. instance13  node11:node14 => node11:node16 0.00361787 a=r:node16
   12. instance19  node10:node11 => node10:node7  0.00336636 a=r:node7
   13. instance43  node12:node13 => node12:node1  0.00305681 a=r:node1
   14. instance1   node1:node2   => node1:node4   0.00263124 a=r:node4
   15. instance58  node19:node20 => node19:node17 0.00252594 a=r:node17
Cluster score improved from 0.52329131 to 0.00252594

Commands to run to reach the above solution:
  echo step 1
  echo gnt-instance migrate instance14
  echo gnt-instance replace-disks -n node16 instance14
  echo gnt-instance migrate instance14
  echo step 2
  echo gnt-instance migrate instance54
  echo gnt-instance replace-disks -n node16 instance54
  echo gnt-instance migrate instance54
  echo step 3
  echo gnt-instance migrate instance4
  echo gnt-instance replace-disks -n node16 instance4
  echo step 4
  echo gnt-instance replace-disks -n node2 instance48
  echo gnt-instance migrate instance48
  echo step 5
  echo gnt-instance replace-disks -n node16 instance93
  echo gnt-instance migrate instance93
  echo step 6
  echo gnt-instance replace-disks -n node2 instance89
  echo gnt-instance migrate instance89
  echo step 7
  echo gnt-instance replace-disks -n node16 instance5
  echo gnt-instance migrate instance5
  echo step 8
  echo gnt-instance migrate instance94
  echo gnt-instance replace-disks -n node16 instance94
  echo step 9
  echo gnt-instance migrate instance44
  echo gnt-instance replace-disks -n node15 instance44
  echo step 10
  echo gnt-instance replace-disks -n node16 instance62
  echo step 11
  echo gnt-instance replace-disks -n node16 instance13
  echo step 12
  echo gnt-instance replace-disks -n node7 instance19
  echo step 13
  echo gnt-instance replace-disks -n node1 instance43
  echo step 14
  echo gnt-instance replace-disks -n node4 instance1
  echo step 15
  echo gnt-instance replace-disks -n node17 instance58

Final cluster status:
N1 Name   t_mem f_mem r_mem t_dsk f_dsk pri sec  p_fmem  p_fdsk
   node1  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node2  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node3  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node4  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node5  32762  7280  6000  1861  1078   4   5 0.22221 0.57947
   node6  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node7  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node8  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node9  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node10 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node11 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node12 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node13 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node14 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node15 32762  7280  6000  1861  1031   4   4 0.22221 0.55408
   node16 32762  7280  6000  1861  1060   4   4 0.22221 0.57007
   node17 32762  7280  6000  1861  1006   5   4 0.22221 0.54105
   node18 32762  7280  6000  1396   761   4   2 0.22221 0.54570
   node19 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node20 32762 13280  6000  1861  1089   3   5 0.40535 0.58565

.fi
.in

Here we see, besides the step list, the initial and final cluster
status, with the final one showing all nodes being N+1 compliant, and
the command list to reach the final solution. In the initial listing,
we see which nodes are not N+1 compliant.

The algorithm is stable as long as each step above is fully completed,
e.g. in step 8, both the migrate and the replace-disks are
done. Otherwise, if only the migrate is done, the input data is
changed in a way that the program will output a different solution
list (but hopefully will end in the same state).
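.PP
The relation between the action column of a step and the command list
printed with \fB-C\fR can be sketched in Python. The command strings
are copied from the sample output above; the function name
commands_for is hypothetical, and this is an illustration rather than
the program's own code.
.PP
.nf
```python
def commands_for(instance, actions):
    """Translate the actions of one step into gnt-instance commands."""
    commands = []
    for action in actions:
        if action == "f":
            commands.append("gnt-instance migrate " + instance)
        elif action.startswith("r:"):
            node = action[2:]
            commands.append(
                "gnt-instance replace-disks -n " + node + " " + instance
            )
        else:
            raise ValueError("unknown action: " + action)
    return commands


# Step 8 of the example: both commands must complete for the step to count.
for command in commands_for("instance94", ["f", "r:node16"]):
    print(command)
```
.fi
.PP
In the sample output each command is prefixed with echo, so the list
can be reviewed before anything is run.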
.SH SEE ALSO
.BR hn1 "(1), " hscan "(1), " ganeti "(7), " gnt-instance "(8), "
.BR gnt-node "(8)"