Statistics
| Branch: | Tag: | Revision:

root / hn1.1 @ d2ac5526

History | View | Annotate | Download (6.2 kB)

1 d2ac5526 Iustin Pop
.TH HN1 1 2009-03-22 htools "Ganeti H-tools"
2 b0045e4d Iustin Pop
.SH NAME
3 b0045e4d Iustin Pop
hn1 \- N+1 fixer for Ganeti
4 b0045e4d Iustin Pop
5 b0045e4d Iustin Pop
.SH SYNOPSIS
6 b0045e4d Iustin Pop
.B hn1
7 b0045e4d Iustin Pop
.B "[-C]"
8 b0045e4d Iustin Pop
.B "[-p]"
9 b0045e4d Iustin Pop
.B "[-o]"
10 b0045e4d Iustin Pop
.BI "[ -m " cluster "]"
11 b0045e4d Iustin Pop
.BI "[-n " nodes-file " ]"
12 b0045e4d Iustin Pop
.BI "[ -i " instances-file "]"
13 b0045e4d Iustin Pop
.BI "[-d " depth "]"
14 b0045e4d Iustin Pop
.BI "[-r " max-removals "]"
15 b0045e4d Iustin Pop
.BI "[-L " max-delta "]"
16 b0045e4d Iustin Pop
.BI "[-l " min-delta "]"
17 b0045e4d Iustin Pop
18 b0045e4d Iustin Pop
.B hn1
19 b0045e4d Iustin Pop
.B --version
20 b0045e4d Iustin Pop
21 b0045e4d Iustin Pop
.SH DESCRIPTION
22 b0045e4d Iustin Pop
hn1 is a cluster N+1 fixer that tries to compute the minimum number of
23 b0045e4d Iustin Pop
moves needed for getting all nodes to be N+1 compliant.
24 b0045e4d Iustin Pop
25 b0045e4d Iustin Pop
The algorithm is designed to be a 'perfect' algorithm, so that we
26 b0045e4d Iustin Pop
always examine the entire solution space until we find the minimum
27 b0045e4d Iustin Pop
solution. The algorithm can be tweaked via the \fB-d\fR, \fB-r\fR,
28 b0045e4d Iustin Pop
\fB-L\fR and \fB-l\fR options.
29 b0045e4d Iustin Pop
30 b0045e4d Iustin Pop
By default, the program will show the solution in a somewhat cryptic
31 b0045e4d Iustin Pop
format; for getting the actual Ganeti command list, use the \fB-C\fR
32 b0045e4d Iustin Pop
option.
33 b0045e4d Iustin Pop
34 b0045e4d Iustin Pop
\fBNote:\fR this program is somewhat deprecated; \fBhbal(1)\fR gives
35 b0045e4d Iustin Pop
usually much faster results, and a better cluster. It is recommended
36 b0045e4d Iustin Pop
to use this program only when \fBhbal\fR doesn't give a N+1 compliant
37 b0045e4d Iustin Pop
cluster.
38 b0045e4d Iustin Pop
39 b0045e4d Iustin Pop
.SS ALGORITHM
40 b0045e4d Iustin Pop
41 b0045e4d Iustin Pop
The algorithm works in multiple rounds, of increasing \fIdepth\fR,
42 b0045e4d Iustin Pop
until we have a solution.
43 b0045e4d Iustin Pop
44 b0045e4d Iustin Pop
First, before starting the solution computation, we compute all the
45 b0045e4d Iustin Pop
N+1-fail nodes and the instances they hold. These instances are
46 b0045e4d Iustin Pop
candidate for replacement (and only these!).
47 b0045e4d Iustin Pop
48 b0045e4d Iustin Pop
The program start then with \fIdepth\fR one (unless overridden via the
49 b0045e4d Iustin Pop
\fB-d\fR option), and at each round:
50 b0045e4d Iustin Pop
  - it tries to remove from the cluster as many instances as the
51 b0045e4d Iustin Pop
    current depth in order to make the cluster N+1 compliant
52 b0045e4d Iustin Pop
  - then, for each of the possible instance combinations that allow
53 b0045e4d Iustin Pop
    this (unless the total size is reduced via the \fB-r\fR option),
54 b0045e4d Iustin Pop
    it tries to put them back on the cluster while maintaining N+1
55 b0045e4d Iustin Pop
    compliance
56 b0045e4d Iustin Pop
57 b0045e4d Iustin Pop
It might be that at a given round, the results are:
58 b0045e4d Iustin Pop
  - no instance combination that can be put back; this means it is not
59 b0045e4d Iustin Pop
    possible to make the cluster N+1 compliant with this number of
60 b0045e4d Iustin Pop
    instances being moved, so we increase the depth and go on to the
61 b0045e4d Iustin Pop
    next round
62 b0045e4d Iustin Pop
  - one or more successful result, in which case we take the one that
63 b0045e4d Iustin Pop
    has as few changes as possible (by change meaning a replace-disks
64 b0045e4d Iustin Pop
    needed)
65 b0045e4d Iustin Pop
66 b0045e4d Iustin Pop
The main problem with the algorithm is that, being an exhaustive
67 b0045e4d Iustin Pop
search, the CPU time required grows very very quickly based on
68 b0045e4d Iustin Pop
depth. On a 20-node, 80-instances cluster, depths up to 5-6 are
69 b0045e4d Iustin Pop
quickly computed, and depth 10 could already take days.
70 b0045e4d Iustin Pop
71 b0045e4d Iustin Pop
Since the algorithm is designed to prune the search space as quickly
72 b0045e4d Iustin Pop
as possible, is by luck we find a good solution early at a given
73 b0045e4d Iustin Pop
depth, then the other solutions which would result in a bigger delta
74 b0045e4d Iustin Pop
(the number of changes) will not be investigated, and the program will
75 b0045e4d Iustin Pop
finish fast. Since this is random and depends on where in the full
76 b0045e4d Iustin Pop
solution space the good solution will be, there are two options for
77 b0045e4d Iustin Pop
cutting down the time needed:
78 b0045e4d Iustin Pop
  - \fB-l\fR makes any solution that has delta lower than its
79 b0045e4d Iustin Pop
    parameter succeed instantly
80 b0045e4d Iustin Pop
  - \fB-L\fR makes any solution with delta higher than its parameter
81 b0045e4d Iustin Pop
    being rejected instantly (and not descend on the search tree)
82 b0045e4d Iustin Pop
83 b0045e4d Iustin Pop
.SH OPTIONS
84 b0045e4d Iustin Pop
The options that can be passed to the program are as follows:
85 b0045e4d Iustin Pop
.TP
86 b0045e4d Iustin Pop
.B -C, --print-commands
87 b0045e4d Iustin Pop
Print the command list at the end of the run. Without this, the
88 b0045e4d Iustin Pop
program will only show a shorter, but cryptic output.
89 b0045e4d Iustin Pop
.TP
90 b0045e4d Iustin Pop
.B -p, --print-nodes
91 b0045e4d Iustin Pop
Prints the before and after node status, in a format designed to allow
92 b0045e4d Iustin Pop
the user to understand the node's most important parameters.
93 b0045e4d Iustin Pop
94 b0045e4d Iustin Pop
The node list will contain these informations:
95 d2ac5526 Iustin Pop
.RS
96 d2ac5526 Iustin Pop
.TP
97 d2ac5526 Iustin Pop
.B F
98 d2ac5526 Iustin Pop
a character denoting the status of the node, with '-' meaning an
99 d2ac5526 Iustin Pop
offline node, '*' meaning N+1 failure and blank meaning a good node
100 d2ac5526 Iustin Pop
.TP
101 d2ac5526 Iustin Pop
.B Name
102 d2ac5526 Iustin Pop
the node name
103 d2ac5526 Iustin Pop
.TP
104 d2ac5526 Iustin Pop
.B t_mem
105 d2ac5526 Iustin Pop
the total node memory
106 d2ac5526 Iustin Pop
.TP
107 d2ac5526 Iustin Pop
.B n_mem
108 d2ac5526 Iustin Pop
the memory used by the node itself
109 d2ac5526 Iustin Pop
.TP
110 d2ac5526 Iustin Pop
.B i_mem
111 d2ac5526 Iustin Pop
the memory used by instances
112 d2ac5526 Iustin Pop
.TP
113 d2ac5526 Iustin Pop
.B x_mem
114 d2ac5526 Iustin Pop
amount memory which seems to be in use but cannot be determined why or
115 d2ac5526 Iustin Pop
by which instance; usually this means that the hypervisor has some
116 d2ac5526 Iustin Pop
overhead or that there are other reporting errors
117 d2ac5526 Iustin Pop
.TP
118 d2ac5526 Iustin Pop
.B f_mem
119 d2ac5526 Iustin Pop
the free node memory
120 d2ac5526 Iustin Pop
.TP
121 d2ac5526 Iustin Pop
.B r_mem
122 d2ac5526 Iustin Pop
the reserved node memory, which is the amount of free memory needed
123 d2ac5526 Iustin Pop
for N+1 compliance
124 d2ac5526 Iustin Pop
.TP
125 d2ac5526 Iustin Pop
.B t_dsk
126 d2ac5526 Iustin Pop
total disk
127 d2ac5526 Iustin Pop
.TP
128 d2ac5526 Iustin Pop
.B f_dsk
129 d2ac5526 Iustin Pop
free disk
130 d2ac5526 Iustin Pop
.TP
131 d2ac5526 Iustin Pop
.B pri
132 d2ac5526 Iustin Pop
number of primary instances
133 d2ac5526 Iustin Pop
.TP
134 d2ac5526 Iustin Pop
.B sec
135 d2ac5526 Iustin Pop
number of secondary instances
136 d2ac5526 Iustin Pop
.TP
137 d2ac5526 Iustin Pop
.B p_fmem
138 d2ac5526 Iustin Pop
percent of free memory
139 d2ac5526 Iustin Pop
.TP
140 d2ac5526 Iustin Pop
.B p_fdsk
141 d2ac5526 Iustin Pop
percent of free disk
142 d2ac5526 Iustin Pop
.RE
143 b0045e4d Iustin Pop
144 b0045e4d Iustin Pop
.TP
145 b0045e4d Iustin Pop
.BI "-n" nodefile ", --nodes=" nodefile
146 b0045e4d Iustin Pop
The name of the file holding node information (if not collecting via
147 b0045e4d Iustin Pop
RAPI), instead of the default
148 b0045e4d Iustin Pop
.I nodes
149 b0045e4d Iustin Pop
file.
150 b0045e4d Iustin Pop
151 b0045e4d Iustin Pop
.TP
152 b0045e4d Iustin Pop
.BI "-i" instancefile ", --instances=" instancefile
153 b0045e4d Iustin Pop
The name of the file holding instance information (if not collecting
154 b0045e4d Iustin Pop
via RAPI), instead of the default
155 b0045e4d Iustin Pop
.I instances
156 b0045e4d Iustin Pop
file.
157 b0045e4d Iustin Pop
158 b0045e4d Iustin Pop
.TP
159 b0045e4d Iustin Pop
.BI "-m" cluster
160 b0045e4d Iustin Pop
Collect data not from files but directly from the
161 b0045e4d Iustin Pop
.I cluster
162 b0045e4d Iustin Pop
given as an argument via RAPI. This work for both Ganeti 1.2 and
163 b0045e4d Iustin Pop
Ganeti 2.0.
164 b0045e4d Iustin Pop
165 b0045e4d Iustin Pop
.TP
166 b0045e4d Iustin Pop
.BI "-d" DEPTH ", --depth=" DEPTH
167 b0045e4d Iustin Pop
Start the algorithm directly at depth \fID\fR, so that we don't
168 b0045e4d Iustin Pop
examine lower depth. This will be faster if we know a solution is not
169 b0045e4d Iustin Pop
found a lower depths, and thus it's unneeded to search them.
170 b0045e4d Iustin Pop
171 b0045e4d Iustin Pop
.TP
172 b0045e4d Iustin Pop
.BI "-l" MIN-DELTA ", --min-delta=" MIN-DELTA
173 b0045e4d Iustin Pop
If we find a solution with delta lower or equal to \fIMIN-DELTA\fR,
174 b0045e4d Iustin Pop
consider this a success and don't examine further.
175 b0045e4d Iustin Pop
176 b0045e4d Iustin Pop
.TP
177 b0045e4d Iustin Pop
.BI "-L" MAX-DELTA ", --max-delta=" MAX-DELTA
178 b0045e4d Iustin Pop
If while computing a solution, it's intermediate delta is already
179 b0045e4d Iustin Pop
higher or equal to \fIMAX-DELTA\fR, consider this a failure and abort
180 b0045e4d Iustin Pop
(as if N+1 checks have failed).
181 b0045e4d Iustin Pop
182 b0045e4d Iustin Pop
.TP
183 b0045e4d Iustin Pop
.B -V, --version
184 b0045e4d Iustin Pop
Just show the program version and exit.
185 b0045e4d Iustin Pop
186 b0045e4d Iustin Pop
.SH EXIT STATUS
187 b0045e4d Iustin Pop
188 b0045e4d Iustin Pop
The exist status of the command will be zero, unless for some reason
189 b0045e4d Iustin Pop
the algorithm fatally failed (e.g. wrong node or instance data).
190 b0045e4d Iustin Pop
191 b0045e4d Iustin Pop
.SH BUGS
192 b0045e4d Iustin Pop
193 b0045e4d Iustin Pop
The program does not check its input data for consistency, and aborts
194 b0045e4d Iustin Pop
with cryptic errors messages in this case.
195 b0045e4d Iustin Pop
196 b0045e4d Iustin Pop
The algorithm doesn't know when it won't be possible to reach N+1
197 b0045e4d Iustin Pop
compliance at all, and will happily churn CPU for ages without
198 b0045e4d Iustin Pop
realising it won't reach a solution.
199 b0045e4d Iustin Pop
200 b0045e4d Iustin Pop
The algorithm is too slow.
201 b0045e4d Iustin Pop
202 b0045e4d Iustin Pop
The output format is not easily scriptable, and the program should
203 b0045e4d Iustin Pop
feed moves directly into Ganeti (either via RAPI or via a gnt-debug
204 b0045e4d Iustin Pop
input file).
205 b0045e4d Iustin Pop
206 b0045e4d Iustin Pop
.SH SEE ALSO
207 d2ac5526 Iustin Pop
.BR hbal "(1), " hscan "(1), " ganeti "(7), " gnt-instance "(8), "
208 d2ac5526 Iustin Pop
.BR gnt-node "(8)"