.TH HBAL 1 2009-03-23 htools "Ganeti H-tools"
.SH NAME
hbal \- Cluster balancer for Ganeti
.SH SYNOPSIS
.B hbal
.B "[backend options...]"
.B "[algorithm options...]"
.B "[reporting options...]"

.B hbal
.B --version

.TP
Backend options:
.BI "[ -m " cluster " ]"
|
.BI "[ -L[" path "] [-X]]"
|
.BI "[ -n " nodes-file " ]"
.BI "[ -i " instances-file " ]"

.TP
Algorithm options:
.BI "[ --max-cpu " cpu-ratio " ]"
.BI "[ --min-disk " disk-ratio " ]"
.BI "[ -l " limit " ]"
.BI "[ -e " score " ]"
.BI "[ -O " name... " ]"
.B "[ --no-disk-moves ]"
.BI "[ -U " util-file " ]"

.TP
Reporting options:
.BI "[ -C[" file "] ]"
.B "[ -p ]"
.B "[ --print-instances ]"
.B "[ -o ]"
.B "[ -v... | -q ]"

.SH DESCRIPTION
hbal is a cluster balancer that looks at the current state of the
cluster (nodes with their total and free disk, memory, etc.) and
instance placement and computes a series of steps designed to bring
the cluster into a better state.

The algorithm is designed to be stable (i.e. it will give the same
results when restarted from the middle of the solution) and
reasonably fast. It is not, however, designed to be a perfect
algorithm: it is possible to make it go into a corner from which it
can find no improvement, because it only looks one "step" ahead.

By default, the program will show the solution incrementally as it is
computed, in a somewhat cryptic format; to get the actual Ganeti
command list, use the \fB-C\fR option.
.SS ALGORITHM

The program works in independent steps; at each step, we compute the
best instance move that lowers the cluster score.

The possible move types for an instance are combinations of
failover/migrate and replace-disks such that we change one of the
instance's nodes while the other one remains (but possibly with a
changed role, e.g. from primary it becomes secondary). The list is:
.RS 4
.TP 3
\(em
failover (f)
.TP
\(em
replace secondary (r)
.TP
\(em
replace primary, a composite move (f, r, f)
.TP
\(em
failover and replace secondary, also composite (f, r)
.TP
\(em
replace secondary and failover, also composite (r, f)
.RE

We don't do the only remaining possibility of replacing both nodes
(r,f,r,f or the equivalent f,r,f,r), since this move needs an
exhaustive search over both candidate primary and secondary nodes,
and is O(n*n) in the number of nodes. Furthermore, it doesn't seem to
give better scores, but results in more disk replacements.
.SS CLUSTER SCORING

As said before, the algorithm tries to minimise the cluster score at
each step. Currently this score is computed as a sum of the following
components:
.RS 4
.TP 3
\(em
coefficient of variance of the percent of free memory
.TP
\(em
coefficient of variance of the percent of reserved memory
.TP
\(em
coefficient of variance of the percent of free disk
.TP
\(em
percentage of nodes failing the N+1 check
.TP
\(em
percentage of instances living (either as primary or secondary) on
offline nodes
.TP
\(em
coefficient of variance of the ratio of virtual-to-physical cpus (for
primary instances of the node)
.TP
\(em
coefficients of variance of the dynamic load on the nodes, for cpus,
memory, disk and network
.RE

The free memory and free disk values help ensure that all nodes are
somewhat balanced in their resource usage. The reserved memory helps
ensure that nodes are somewhat balanced in holding secondary
instances, and that no node keeps too much memory reserved for
N+1. Finally, the N+1 percentage helps guide the algorithm towards
eliminating N+1 failures, if possible.

Except for the N+1 failures and the offline instances percentage, we
use the coefficient of variance since this brings the values into the
same unit, so to speak, and into a restricted domain of values
(between zero and one). The percentage of N+1 failures, while also in
this numeric range, doesn't actually have the same meaning, but it
has been shown to work well.
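As an illustration of the per-metric components (a sketch in Python, not hbal's actual Haskell implementation; the figures are invented):

```python
import statistics

def coeff_of_variance(values):
    """Population standard deviation divided by the mean; zero for a
    perfectly balanced metric across all nodes."""
    mean = statistics.mean(values)
    if mean == 0:
        return 0.0
    return statistics.pstdev(values) / mean

# Hypothetical percent-of-free-memory values, one per node.
p_fmem = [0.04, 0.95, 0.04, 0.22, 0.22]

# The cluster score sums such per-metric components (plus the
# N+1 and offline-instance percentages, not shown here).
score = coeff_of_variance(p_fmem)
```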

    
The alternative of using, for the N+1 checks, the coefficient of
variance of (N+1 fail=1, N+1 pass=0) across nodes could hint the
algorithm to create more N+1 failures if most nodes are already N+1
fail. Since this (creating N+1 failures) is not allowed by other
rules of the algorithm, the N+1 checks would simply stop working in
that case.

The offline instances percentage (meaning the percentage of instances
living on offline nodes) will cause the algorithm to actively move
instances away from offline nodes. This, coupled with the restriction
on placement given by offline nodes, will cause evacuation of such
nodes.
The dynamic load values need to be read from an external file (Ganeti
doesn't supply them), and are computed for each node as: sum of
primary instance cpu load, sum of primary instance memory load, sum
of primary and secondary instance disk load (as DRBD generates write
load on secondary nodes too in the normal case, and in degraded
scenarios also read load), and sum of primary instance network
load. One way to generate these values for input to hbal would be to
track "xm list" for the instances over a day, compute the delta of
the cpu values, and feed that via the \fI-U\fR option for all
instances (keeping the other metrics as one). For the algorithm to
work, all that is needed is that the values are consistent for a
given metric across all instances (e.g. all instances use cpu% to
report cpu usage, but they could represent network bandwidth in
Gbps). Note that it's recommended not to have zero as the load value
for any instance metric, since then secondary instances are not well
balanced.

On a perfectly balanced cluster (all nodes the same size, all
instances the same size and spread across the nodes equally), the
values for all metrics would be zero. This doesn't happen too often
in practice :)
.SS OFFLINE INSTANCES

Since current Ganeti versions do not report the memory used by
offline (down) instances, ignoring the run status of instances will
cause wrong calculations. For this reason, the algorithm subtracts
the memory size of down instances from the free memory of their
primary node, in effect simulating the startup of such instances.
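This adjustment can be sketched as follows (illustrative Python, not hbal's actual code; the record layout and field names are invented for the example):

```python
# Hypothetical node/instance records; hbal's real data types differ.
nodes = {"node1": {"f_mem": 4096}}
instances = [
    {"name": "inst1", "pnode": "node1", "mem": 512, "running": False},
    {"name": "inst2", "pnode": "node1", "mem": 1024, "running": True},
]

# Subtract the memory of down instances from their primary node's
# free memory, simulating their startup.
for inst in instances:
    if not inst["running"]:
        nodes[inst["pnode"]]["f_mem"] -= inst["mem"]

# node1's free memory is now 4096 - 512 = 3584
```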

    
.SS OTHER POSSIBLE METRICS

It would be desirable to add more metrics to the algorithm,
especially dynamically-computed metrics, such as:
.RS 4
.TP 3
\(em
CPU usage of instances
.TP
\(em
Disk IO usage
.TP
\(em
Network IO
.RE

.SH OPTIONS
The options that can be passed to the program are as follows:
.TP
.B -C, --print-commands
Print the command list at the end of the run. Without this, the
program will only show a shorter, but cryptic, output.

Note that the moves list will be split into independent steps, called
"jobsets", but only for visual inspection, not for actual
parallelisation. It is not possible to parallelise these directly
when executing via "gnt-instance" commands, since a compound command
(e.g. failover and replace-disks) must be executed serially. Parallel
execution is only possible when using the Luxi backend and the
\fI-L\fR option.

The algorithm for splitting the moves into jobsets is to accumulate
moves until the next move touches nodes already touched by the
current moves; this means we can't execute in parallel (due to
resource allocation in Ganeti), and thus we start a new jobset.
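The splitting rule can be sketched like this (illustrative Python, not hbal's actual code; the move representation is invented for the example):

```python
def split_jobsets(moves):
    """Group moves into jobsets; a move touching a node already
    touched by the current jobset closes it and starts a new one."""
    jobsets, current, touched = [], [], set()
    for move in moves:
        nodes = set(move["nodes"])
        if nodes & touched:          # conflict: start a new jobset
            jobsets.append(current)
            current, touched = [], set()
        current.append(move)
        touched |= nodes
    if current:
        jobsets.append(current)
    return jobsets

moves = [
    {"instance": "inst1", "nodes": ["node1", "node2"]},
    {"instance": "inst2", "nodes": ["node3", "node4"]},
    {"instance": "inst3", "nodes": ["node2", "node5"]},  # reuses node2
]
# -> two jobsets: [inst1, inst2] and [inst3]
```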

    
.TP
.B -p, --print-nodes
Prints the before and after node status, in a format designed to
allow the user to understand the node's most important parameters.

The node list will contain the following information:
.RS
.TP
.B F
a character denoting the status of the node, with '-' meaning an
offline node, '*' meaning N+1 failure and blank meaning a good node
.TP
.B Name
the node name
.TP
.B t_mem
the total node memory
.TP
.B n_mem
the memory used by the node itself
.TP
.B i_mem
the memory used by instances
.TP
.B x_mem
the amount of memory which seems to be in use but cannot be
attributed to a particular instance; usually this means that the
hypervisor has some overhead or that there are other reporting errors
.TP
.B f_mem
the free node memory
.TP
.B r_mem
the reserved node memory, which is the amount of free memory needed
for N+1 compliance
.TP
.B t_dsk
total disk
.TP
.B f_dsk
free disk
.TP
.B pcpu
the number of physical cpus on the node
.TP
.B vcpu
the number of virtual cpus allocated to primary instances
.TP
.B pri
number of primary instances
.TP
.B sec
number of secondary instances
.TP
.B p_fmem
percent of free memory
.TP
.B p_fdsk
percent of free disk
.TP
.B r_cpu
ratio of virtual to physical cpus
.TP
.B lCpu
the dynamic CPU load (if the information is available)
.TP
.B lMem
the dynamic memory load (if the information is available)
.TP
.B lDsk
the dynamic disk load (if the information is available)
.TP
.B lNet
the dynamic net load (if the information is available)
.RE
.TP
.B --print-instances
Prints the before and after instance map. This is less useful than
the node status, but it can help in understanding instance moves.

.TP
.B -o, --oneline
Only shows a one-line output from the program, designed for the case
when one wants to look at multiple clusters at once and check their
status.

The line will contain four fields:
.RS
.RS 4
.TP 3
\(em
initial cluster score
.TP
\(em
number of steps in the solution
.TP
\(em
final cluster score
.TP
\(em
improvement in the cluster score
.RE
.RE
.TP
.BI "-O " name
This option (which can be given multiple times) will mark nodes as
being \fIoffline\fR. This means a couple of things:
.RS
.RS 4
.TP 3
\(em
instances won't be placed on these nodes, not even temporarily;
e.g. the \fIreplace primary\fR move is not available if the secondary
node is offline, since this move requires a failover.
.TP
\(em
these nodes will not be included in the score calculation (except for
the percentage of instances on offline nodes)
.RE
Note that hbal will also mark as offline any nodes which are reported
by RAPI as such, or that have "?" in any numeric field of the
file-based input.
.RE

.TP
.BI "-e" score ", --min-score=" score
This parameter denotes the minimum score we are happy with, and it
alters the computation in two ways:
.RS
.RS 4
.TP 3
\(em
if the cluster's initial score is lower than this value, we don't
enter the algorithm at all, and exit with success
.TP
\(em
during the iterative process, if we reach a score lower than this
value, we exit the algorithm
.RE
The default value of the parameter is currently \fI1e-9\fR (chosen
empirically).
.RE
.TP
.BI "--no-disk-moves"
This parameter prevents hbal from using disk move (i.e. "gnt-instance
replace-disks") operations. This results in a much quicker balancing,
but of course the improvements are limited. It is up to the user to
decide when to use one or the other.

.TP
.BI "-U" util-file
This parameter specifies a file holding instance dynamic utilisation
information that will be used to tweak the balancing algorithm to
equalise load on the nodes (as opposed to static resource
usage). The file is in the format "instance_name cpu_util mem_util
disk_util net_util", where the "_util" parameters are interpreted as
numbers and the instance name must match exactly the instance as read
from Ganeti. In case of unknown instance names, the program will
abort.

If not given, the default values are one for all metrics, and thus
dynamic utilisation has only one effect on the algorithm: the
equalisation of the secondary instances across nodes (this is the
only metric that is not tracked by another, dedicated value, and thus
the disk load of instances will cause secondary instance
equalisation). Note that a value of one will also influence slightly
the primary instance count, but that is already tracked via other
metrics and thus the influence of the dynamic utilisation will be
practically insignificant.
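A minimal utilisation file following the format above might look as below (instance names and figures are invented); the parsing sketch is illustrative Python, not hbal's actual parser:

```python
# Each line: instance_name cpu_util mem_util disk_util net_util
sample = """\
instance14 0.85 1.0 1.2 0.3
instance54 0.10 1.0 0.9 2.5
"""

def parse_util_file(text):
    """Map instance name -> (cpu, mem, disk, net) utilisation."""
    util = {}
    for line in text.splitlines():
        name, cpu, mem, disk, net = line.split()
        util[name] = (float(cpu), float(mem), float(disk), float(net))
    return util

loads = parse_util_file(sample)
# loads["instance14"] == (0.85, 1.0, 1.2, 0.3)
```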

    
.TP
.BI "-n" nodefile ", --nodes=" nodefile
The name of the file holding node information (if not collecting via
RAPI), instead of the default \fInodes\fR file (but see below how to
customise the default value via the environment).

.TP
.BI "-i" instancefile ", --instances=" instancefile
The name of the file holding instance information (if not collecting
via RAPI), instead of the default \fIinstances\fR file (but see below
how to customise the default value via the environment).

.TP
.BI "-m" cluster
Collect data not from files but directly from the
.I cluster
given as an argument, via RAPI. If the argument doesn't contain a
colon (:), then it is converted into a fully-built URL by prepending
https:// and appending the default RAPI port; otherwise, it's
considered a fully-specified URL and is used as-is.
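The URL construction can be sketched like this (illustrative Python, not hbal's actual code; 5080 is assumed here as Ganeti's default RAPI port):

```python
DEFAULT_RAPI_PORT = 5080  # assumed default Ganeti RAPI port

def build_url(cluster):
    """Expand a bare cluster name into a full RAPI URL; pass
    anything containing a colon through unchanged."""
    if ":" in cluster:
        return cluster
    return "https://%s:%d" % (cluster, DEFAULT_RAPI_PORT)
```

For example, `build_url("cluster1.example.com")` yields a full https URL with the default port, while an argument already containing a colon is used as-is.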

    
.TP
.BI "-L[" path "]"
Collect data not from files but directly from the master daemon,
which is to be contacted via LUXI (an internal Ganeti protocol). An
optional \fIpath\fR argument is interpreted as the path to the unix
socket on which the master daemon listens; otherwise, the default
path used by Ganeti when installed with "--localstatedir=/var" is
used.

.TP
.B "-X"
When using the Luxi backend, hbal can also execute the given
commands. The execution method is to run the individual jobsets (see
the \fI-C\fR option for details) in separate stages, aborting if at
any time a jobset doesn't have all jobs successful. Each step in the
balancing solution will be translated into exactly one Ganeti job
(having between one and three OpCodes), and all the steps in a jobset
will be executed in parallel. The jobsets themselves are executed
serially.

.TP
.BI "-l" N ", --max-length=" N
Restrict the solution to this length. This can be used, for example,
to automate the execution of the balancing.

.TP
.BI "--max-cpu " cpu-ratio
The maximum virtual-to-physical cpu ratio, as a floating point
number. For example, specifying \fIcpu-ratio\fR as \fB2.5\fR means
that, for a 4-cpu machine, a maximum of 10 virtual cpus should be
allowed to be in use for primary instances.
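The check implied by this option can be sketched as (illustrative Python, not hbal's actual code):

```python
def cpu_ok(vcpus_primary, pcpus, max_cpu_ratio):
    """True if the node's virtual-to-physical cpu ratio for primary
    instances stays within the allowed maximum."""
    return vcpus_primary / pcpus <= max_cpu_ratio

# A 4-pcpu node with --max-cpu 2.5 may hold up to 10 primary vcpus.
assert cpu_ok(10, 4, 2.5)
assert not cpu_ok(11, 4, 2.5)
```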

    
.TP
.BI "--min-disk " disk-ratio
The minimum amount of free disk space remaining, as a floating point
number. For example, specifying \fIdisk-ratio\fR as \fB0.25\fR means
that at least one quarter of disk space should be left free on nodes.

.TP
.B -v, --verbose
Increase the output verbosity. Each use of this option will increase
the verbosity (currently more than 2 doesn't make sense) from the
default of one.

.TP
.B -q, --quiet
Decrease the output verbosity. Each use of this option will decrease
the verbosity (less than zero doesn't make sense) from the default of
one.

.TP
.B -V, --version
Just show the program version and exit.

.SH EXIT STATUS

The exit status of the command will be zero, unless for some reason
the algorithm failed fatally (e.g. wrong node or instance data).

.SH ENVIRONMENT

If the variables \fBHTOOLS_NODES\fR and \fBHTOOLS_INSTANCES\fR are
present in the environment, they will override the default names for
the nodes and instances files. These will, of course, have no effect
when RAPI is used.
.SH BUGS

The program does not check its input data for consistency, and
aborts with cryptic error messages in this case.

The algorithm is not perfect.

The output format is not easily scriptable, and the program should
feed moves directly into Ganeti (either via RAPI or via a gnt-debug
input file).

.SH EXAMPLE

Note that these examples are not from the latest version (they don't
have full node data).
.SS Default output

With the default options, the program shows each individual step and
the improvement it brings in cluster score:

.in +4n
.nf
.RB "$" " hbal"
Loaded 20 nodes, 80 instances
Cluster is not N+1 happy, continuing but no guarantee that the cluster will end N+1 happy.
Initial score: 0.52329131
Trying to minimize the CV...
    1. instance14  node1:node10  => node16:node10 0.42109120 a=f r:node16 f
    2. instance54  node4:node15  => node16:node15 0.31904594 a=f r:node16 f
    3. instance4   node5:node2   => node2:node16  0.26611015 a=f r:node16
    4. instance48  node18:node20 => node2:node18  0.21361717 a=r:node2 f
    5. instance93  node19:node18 => node16:node19 0.16166425 a=r:node16 f
    6. instance89  node3:node20  => node2:node3   0.11005629 a=r:node2 f
    7. instance5   node6:node2   => node16:node6  0.05841589 a=r:node16 f
    8. instance94  node7:node20  => node20:node16 0.00658759 a=f r:node16
    9. instance44  node20:node2  => node2:node15  0.00438740 a=f r:node15
   10. instance62  node14:node18 => node14:node16 0.00390087 a=r:node16
   11. instance13  node11:node14 => node11:node16 0.00361787 a=r:node16
   12. instance19  node10:node11 => node10:node7  0.00336636 a=r:node7
   13. instance43  node12:node13 => node12:node1  0.00305681 a=r:node1
   14. instance1   node1:node2   => node1:node4   0.00263124 a=r:node4
   15. instance58  node19:node20 => node19:node17 0.00252594 a=r:node17
Cluster score improved from 0.52329131 to 0.00252594
.fi
.in
In the above output, we can see:
  - the input data (here from files) shows a cluster with 20 nodes
    and 80 instances
  - the cluster is not initially N+1 compliant
  - the initial score is 0.52329131

The step list follows, showing the instance, its initial
primary/secondary nodes, the new primary/secondary nodes, the
cluster score, and the actions taken in this step (with 'f' denoting
failover/migrate and 'r' denoting replace secondary).

Finally, the program shows the improvement in cluster score.

A more detailed output is obtained via the \fB-C\fR and \fB-p\fR
options:
.in +4n
.nf
.RB "$" " hbal"
Loaded 20 nodes, 80 instances
Cluster is not N+1 happy, continuing but no guarantee that the cluster will end N+1 happy.
Initial cluster status:
N1 Name   t_mem f_mem r_mem t_dsk f_dsk pri sec  p_fmem  p_fdsk
 * node1  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node2  32762 31280 12000  1861  1026   0   8 0.95476 0.55179
 * node3  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node4  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node5  32762  1280  6000  1861   978   5   5 0.03907 0.52573
 * node6  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
 * node7  32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node8  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node9  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
 * node10 32762  7280 12000  1861  1026   4   4 0.22221 0.55179
   node11 32762  7280  6000  1861   922   4   5 0.22221 0.49577
   node12 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node13 32762  7280  6000  1861   922   4   5 0.22221 0.49577
   node14 32762  7280  6000  1861   922   4   5 0.22221 0.49577
 * node15 32762  7280 12000  1861  1131   4   3 0.22221 0.60782
   node16 32762 31280     0  1861  1860   0   0 0.95476 1.00000
   node17 32762  7280  6000  1861  1106   5   3 0.22221 0.59479
 * node18 32762  1280  6000  1396   561   5   3 0.03907 0.40239
 * node19 32762  1280  6000  1861  1026   5   3 0.03907 0.55179
   node20 32762 13280 12000  1861   689   3   9 0.40535 0.37068

Initial score: 0.52329131
Trying to minimize the CV...
    1. instance14  node1:node10  => node16:node10 0.42109120 a=f r:node16 f
    2. instance54  node4:node15  => node16:node15 0.31904594 a=f r:node16 f
    3. instance4   node5:node2   => node2:node16  0.26611015 a=f r:node16
    4. instance48  node18:node20 => node2:node18  0.21361717 a=r:node2 f
    5. instance93  node19:node18 => node16:node19 0.16166425 a=r:node16 f
    6. instance89  node3:node20  => node2:node3   0.11005629 a=r:node2 f
    7. instance5   node6:node2   => node16:node6  0.05841589 a=r:node16 f
    8. instance94  node7:node20  => node20:node16 0.00658759 a=f r:node16
    9. instance44  node20:node2  => node2:node15  0.00438740 a=f r:node15
   10. instance62  node14:node18 => node14:node16 0.00390087 a=r:node16
   11. instance13  node11:node14 => node11:node16 0.00361787 a=r:node16
   12. instance19  node10:node11 => node10:node7  0.00336636 a=r:node7
   13. instance43  node12:node13 => node12:node1  0.00305681 a=r:node1
   14. instance1   node1:node2   => node1:node4   0.00263124 a=r:node4
   15. instance58  node19:node20 => node19:node17 0.00252594 a=r:node17
Cluster score improved from 0.52329131 to 0.00252594

Commands to run to reach the above solution:
  echo step 1
  echo gnt-instance migrate instance14
  echo gnt-instance replace-disks -n node16 instance14
  echo gnt-instance migrate instance14
  echo step 2
  echo gnt-instance migrate instance54
  echo gnt-instance replace-disks -n node16 instance54
  echo gnt-instance migrate instance54
  echo step 3
  echo gnt-instance migrate instance4
  echo gnt-instance replace-disks -n node16 instance4
  echo step 4
  echo gnt-instance replace-disks -n node2 instance48
  echo gnt-instance migrate instance48
  echo step 5
  echo gnt-instance replace-disks -n node16 instance93
  echo gnt-instance migrate instance93
  echo step 6
  echo gnt-instance replace-disks -n node2 instance89
  echo gnt-instance migrate instance89
  echo step 7
  echo gnt-instance replace-disks -n node16 instance5
  echo gnt-instance migrate instance5
  echo step 8
  echo gnt-instance migrate instance94
  echo gnt-instance replace-disks -n node16 instance94
  echo step 9
  echo gnt-instance migrate instance44
  echo gnt-instance replace-disks -n node15 instance44
  echo step 10
  echo gnt-instance replace-disks -n node16 instance62
  echo step 11
  echo gnt-instance replace-disks -n node16 instance13
  echo step 12
  echo gnt-instance replace-disks -n node7 instance19
  echo step 13
  echo gnt-instance replace-disks -n node1 instance43
  echo step 14
  echo gnt-instance replace-disks -n node4 instance1
  echo step 15
  echo gnt-instance replace-disks -n node17 instance58

Final cluster status:
N1 Name   t_mem f_mem r_mem t_dsk f_dsk pri sec  p_fmem  p_fdsk
   node1  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node2  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node3  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node4  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node5  32762  7280  6000  1861  1078   4   5 0.22221 0.57947
   node6  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node7  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node8  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node9  32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node10 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node11 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node12 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node13 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node14 32762  7280  6000  1861  1022   4   4 0.22221 0.54951
   node15 32762  7280  6000  1861  1031   4   4 0.22221 0.55408
   node16 32762  7280  6000  1861  1060   4   4 0.22221 0.57007
   node17 32762  7280  6000  1861  1006   5   4 0.22221 0.54105
   node18 32762  7280  6000  1396   761   4   2 0.22221 0.54570
   node19 32762  7280  6000  1861  1026   4   4 0.22221 0.55179
   node20 32762 13280  6000  1861  1089   3   5 0.40535 0.58565
.fi
.in
Here we see, besides the step list, the initial and final cluster
status, with the final one showing all nodes being N+1 compliant, and
the command list to reach the final solution. In the initial listing,
we see which nodes are not N+1 compliant.

The algorithm is stable as long as each step above is fully
completed, e.g. in step 8, both the migrate and the replace-disks
are done. Otherwise, if only the migrate is done, the input data is
changed in a way that will make the program output a different
solution list (but hopefully one ending in the same state).

.SH SEE ALSO
.BR hspace "(1), " hscan "(1), " hail "(1), "
.BR ganeti "(7), " gnt-instance "(8), " gnt-node "(8)"

.SH "COPYRIGHT"
.PP
Copyright (C) 2009 Google Inc. Permission is granted to copy,
distribute and/or modify under the terms of the GNU General Public
License as published by the Free Software Foundation; either version
2 of the License, or (at your option) any later version.
.PP
On Debian systems, the complete text of the GNU General Public
License can be found in /usr/share/common-licenses/GPL.