root / hn1.1 @ d2ac5526
History | View | Annotate | Download (6.2 kB)
1 | d2ac5526 | Iustin Pop | .TH HN1 1 2009-03-22 htools "Ganeti H-tools" |
---|---|---|---|
2 | b0045e4d | Iustin Pop | .SH NAME |
3 | b0045e4d | Iustin Pop | hn1 \- N+1 fixer for Ganeti |
4 | b0045e4d | Iustin Pop | |
5 | b0045e4d | Iustin Pop | .SH SYNOPSIS |
6 | b0045e4d | Iustin Pop | .B hn1 |
7 | b0045e4d | Iustin Pop | .B "[-C]" |
8 | b0045e4d | Iustin Pop | .B "[-p]" |
9 | b0045e4d | Iustin Pop | .B "[-o]" |
10 | b0045e4d | Iustin Pop | .BI "[ -m " cluster "]" |
11 | b0045e4d | Iustin Pop | .BI "[-n " nodes-file " ]" |
12 | b0045e4d | Iustin Pop | .BI "[ -i " instances-file "]" |
13 | b0045e4d | Iustin Pop | .BI "[-d " depth "]" |
14 | b0045e4d | Iustin Pop | .BI "[-r " max-removals "]" |
15 | b0045e4d | Iustin Pop | .BI "[-L " max-delta "]" |
16 | b0045e4d | Iustin Pop | .BI "[-l " min-delta "]" |
17 | b0045e4d | Iustin Pop | |
18 | b0045e4d | Iustin Pop | .B hn1 |
19 | b0045e4d | Iustin Pop | .B --version |
20 | b0045e4d | Iustin Pop | |
21 | b0045e4d | Iustin Pop | .SH DESCRIPTION |
22 | b0045e4d | Iustin Pop | hn1 is a cluster N+1 fixer that tries to compute the minimum number of |
23 | b0045e4d | Iustin Pop | moves needed for getting all nodes to be N+1 compliant. |
24 | b0045e4d | Iustin Pop | |
25 | b0045e4d | Iustin Pop | The algorithm is designed to be a 'perfect' algorithm, so that we |
26 | b0045e4d | Iustin Pop | always examine the entire solution space until we find the minimum |
27 | b0045e4d | Iustin Pop | solution. The algorithm can be tweaked via the \fB-d\fR, \fB-r\fR, |
28 | b0045e4d | Iustin Pop | \fB-L\fR and \fB-l\fR options. |
29 | b0045e4d | Iustin Pop | |
30 | b0045e4d | Iustin Pop | By default, the program will show the solution in a somewhat cryptic |
31 | b0045e4d | Iustin Pop | format; for getting the actual Ganeti command list, use the \fB-C\fR |
32 | b0045e4d | Iustin Pop | option. |
33 | b0045e4d | Iustin Pop | |
34 | b0045e4d | Iustin Pop | \fBNote:\fR this program is somewhat deprecated; \fBhbal(1)\fR gives |
35 | b0045e4d | Iustin Pop | usually much faster results, and a better cluster. It is recommended |
36 | b0045e4d | Iustin Pop | to use this program only when \fBhbal\fR doesn't give a N+1 compliant |
37 | b0045e4d | Iustin Pop | cluster. |
38 | b0045e4d | Iustin Pop | |
39 | b0045e4d | Iustin Pop | .SS ALGORITHM |
40 | b0045e4d | Iustin Pop | |
41 | b0045e4d | Iustin Pop | The algorithm works in multiple rounds, of increasing \fIdepth\fR, |
42 | b0045e4d | Iustin Pop | until we have a solution. |
43 | b0045e4d | Iustin Pop | |
44 | b0045e4d | Iustin Pop | First, before starting the solution computation, we compute all the |
45 | b0045e4d | Iustin Pop | N+1-fail nodes and the instances they hold. These instances are |
46 | b0045e4d | Iustin Pop | candidate for replacement (and only these!). |
47 | b0045e4d | Iustin Pop | |
48 | b0045e4d | Iustin Pop | The program start then with \fIdepth\fR one (unless overridden via the |
49 | b0045e4d | Iustin Pop | \fB-d\fR option), and at each round: |
50 | b0045e4d | Iustin Pop | - it tries to remove from the cluster as many instances as the |
51 | b0045e4d | Iustin Pop | current depth in order to make the cluster N+1 compliant |
52 | b0045e4d | Iustin Pop | - then, for each of the possible instance combinations that allow |
53 | b0045e4d | Iustin Pop | this (unless the total size is reduced via the \fB-r\fR option), |
54 | b0045e4d | Iustin Pop | it tries to put them back on the cluster while maintaining N+1 |
55 | b0045e4d | Iustin Pop | compliance |
56 | b0045e4d | Iustin Pop | |
57 | b0045e4d | Iustin Pop | It might be that at a given round, the results are: |
58 | b0045e4d | Iustin Pop | - no instance combination that can be put back; this means it is not |
59 | b0045e4d | Iustin Pop | possible to make the cluster N+1 compliant with this number of |
60 | b0045e4d | Iustin Pop | instances being moved, so we increase the depth and go on to the |
61 | b0045e4d | Iustin Pop | next round |
62 | b0045e4d | Iustin Pop | - one or more successful result, in which case we take the one that |
63 | b0045e4d | Iustin Pop | has as few changes as possible (by change meaning a replace-disks |
64 | b0045e4d | Iustin Pop | needed) |
65 | b0045e4d | Iustin Pop | |
66 | b0045e4d | Iustin Pop | The main problem with the algorithm is that, being an exhaustive |
67 | b0045e4d | Iustin Pop | search, the CPU time required grows very very quickly based on |
68 | b0045e4d | Iustin Pop | depth. On a 20-node, 80-instances cluster, depths up to 5-6 are |
69 | b0045e4d | Iustin Pop | quickly computed, and depth 10 could already take days. |
70 | b0045e4d | Iustin Pop | |
71 | b0045e4d | Iustin Pop | Since the algorithm is designed to prune the search space as quickly |
72 | b0045e4d | Iustin Pop | as possible, is by luck we find a good solution early at a given |
73 | b0045e4d | Iustin Pop | depth, then the other solutions which would result in a bigger delta |
74 | b0045e4d | Iustin Pop | (the number of changes) will not be investigated, and the program will |
75 | b0045e4d | Iustin Pop | finish fast. Since this is random and depends on where in the full |
76 | b0045e4d | Iustin Pop | solution space the good solution will be, there are two options for |
77 | b0045e4d | Iustin Pop | cutting down the time needed: |
78 | b0045e4d | Iustin Pop | - \fB-l\fR makes any solution that has delta lower than its |
79 | b0045e4d | Iustin Pop | parameter succeed instantly |
80 | b0045e4d | Iustin Pop | - \fB-L\fR makes any solution with delta higher than its parameter |
81 | b0045e4d | Iustin Pop | being rejected instantly (and not descend on the search tree) |
82 | b0045e4d | Iustin Pop | |
83 | b0045e4d | Iustin Pop | .SH OPTIONS |
84 | b0045e4d | Iustin Pop | The options that can be passed to the program are as follows: |
85 | b0045e4d | Iustin Pop | .TP |
86 | b0045e4d | Iustin Pop | .B -C, --print-commands |
87 | b0045e4d | Iustin Pop | Print the command list at the end of the run. Without this, the |
88 | b0045e4d | Iustin Pop | program will only show a shorter, but cryptic output. |
89 | b0045e4d | Iustin Pop | .TP |
90 | b0045e4d | Iustin Pop | .B -p, --print-nodes |
91 | b0045e4d | Iustin Pop | Prints the before and after node status, in a format designed to allow |
92 | b0045e4d | Iustin Pop | the user to understand the node's most important parameters. |
93 | b0045e4d | Iustin Pop | |
94 | b0045e4d | Iustin Pop | The node list will contain these informations: |
95 | d2ac5526 | Iustin Pop | .RS |
96 | d2ac5526 | Iustin Pop | .TP |
97 | d2ac5526 | Iustin Pop | .B F |
98 | d2ac5526 | Iustin Pop | a character denoting the status of the node, with '-' meaning an |
99 | d2ac5526 | Iustin Pop | offline node, '*' meaning N+1 failure and blank meaning a good node |
100 | d2ac5526 | Iustin Pop | .TP |
101 | d2ac5526 | Iustin Pop | .B Name |
102 | d2ac5526 | Iustin Pop | the node name |
103 | d2ac5526 | Iustin Pop | .TP |
104 | d2ac5526 | Iustin Pop | .B t_mem |
105 | d2ac5526 | Iustin Pop | the total node memory |
106 | d2ac5526 | Iustin Pop | .TP |
107 | d2ac5526 | Iustin Pop | .B n_mem |
108 | d2ac5526 | Iustin Pop | the memory used by the node itself |
109 | d2ac5526 | Iustin Pop | .TP |
110 | d2ac5526 | Iustin Pop | .B i_mem |
111 | d2ac5526 | Iustin Pop | the memory used by instances |
112 | d2ac5526 | Iustin Pop | .TP |
113 | d2ac5526 | Iustin Pop | .B x_mem |
114 | d2ac5526 | Iustin Pop | amount memory which seems to be in use but cannot be determined why or |
115 | d2ac5526 | Iustin Pop | by which instance; usually this means that the hypervisor has some |
116 | d2ac5526 | Iustin Pop | overhead or that there are other reporting errors |
117 | d2ac5526 | Iustin Pop | .TP |
118 | d2ac5526 | Iustin Pop | .B f_mem |
119 | d2ac5526 | Iustin Pop | the free node memory |
120 | d2ac5526 | Iustin Pop | .TP |
121 | d2ac5526 | Iustin Pop | .B r_mem |
122 | d2ac5526 | Iustin Pop | the reserved node memory, which is the amount of free memory needed |
123 | d2ac5526 | Iustin Pop | for N+1 compliance |
124 | d2ac5526 | Iustin Pop | .TP |
125 | d2ac5526 | Iustin Pop | .B t_dsk |
126 | d2ac5526 | Iustin Pop | total disk |
127 | d2ac5526 | Iustin Pop | .TP |
128 | d2ac5526 | Iustin Pop | .B f_dsk |
129 | d2ac5526 | Iustin Pop | free disk |
130 | d2ac5526 | Iustin Pop | .TP |
131 | d2ac5526 | Iustin Pop | .B pri |
132 | d2ac5526 | Iustin Pop | number of primary instances |
133 | d2ac5526 | Iustin Pop | .TP |
134 | d2ac5526 | Iustin Pop | .B sec |
135 | d2ac5526 | Iustin Pop | number of secondary instances |
136 | d2ac5526 | Iustin Pop | .TP |
137 | d2ac5526 | Iustin Pop | .B p_fmem |
138 | d2ac5526 | Iustin Pop | percent of free memory |
139 | d2ac5526 | Iustin Pop | .TP |
140 | d2ac5526 | Iustin Pop | .B p_fdsk |
141 | d2ac5526 | Iustin Pop | percent of free disk |
142 | d2ac5526 | Iustin Pop | .RE |
143 | b0045e4d | Iustin Pop | |
144 | b0045e4d | Iustin Pop | .TP |
145 | b0045e4d | Iustin Pop | .BI "-n" nodefile ", --nodes=" nodefile |
146 | b0045e4d | Iustin Pop | The name of the file holding node information (if not collecting via |
147 | b0045e4d | Iustin Pop | RAPI), instead of the default |
148 | b0045e4d | Iustin Pop | .I nodes |
149 | b0045e4d | Iustin Pop | file. |
150 | b0045e4d | Iustin Pop | |
151 | b0045e4d | Iustin Pop | .TP |
152 | b0045e4d | Iustin Pop | .BI "-i" instancefile ", --instances=" instancefile |
153 | b0045e4d | Iustin Pop | The name of the file holding instance information (if not collecting |
154 | b0045e4d | Iustin Pop | via RAPI), instead of the default |
155 | b0045e4d | Iustin Pop | .I instances |
156 | b0045e4d | Iustin Pop | file. |
157 | b0045e4d | Iustin Pop | |
158 | b0045e4d | Iustin Pop | .TP |
159 | b0045e4d | Iustin Pop | .BI "-m" cluster |
160 | b0045e4d | Iustin Pop | Collect data not from files but directly from the |
161 | b0045e4d | Iustin Pop | .I cluster |
162 | b0045e4d | Iustin Pop | given as an argument via RAPI. This work for both Ganeti 1.2 and |
163 | b0045e4d | Iustin Pop | Ganeti 2.0. |
164 | b0045e4d | Iustin Pop | |
165 | b0045e4d | Iustin Pop | .TP |
166 | b0045e4d | Iustin Pop | .BI "-d" DEPTH ", --depth=" DEPTH |
167 | b0045e4d | Iustin Pop | Start the algorithm directly at depth \fID\fR, so that we don't |
168 | b0045e4d | Iustin Pop | examine lower depth. This will be faster if we know a solution is not |
169 | b0045e4d | Iustin Pop | found a lower depths, and thus it's unneeded to search them. |
170 | b0045e4d | Iustin Pop | |
171 | b0045e4d | Iustin Pop | .TP |
172 | b0045e4d | Iustin Pop | .BI "-l" MIN-DELTA ", --min-delta=" MIN-DELTA |
173 | b0045e4d | Iustin Pop | If we find a solution with delta lower or equal to \fIMIN-DELTA\fR, |
174 | b0045e4d | Iustin Pop | consider this a success and don't examine further. |
175 | b0045e4d | Iustin Pop | |
176 | b0045e4d | Iustin Pop | .TP |
177 | b0045e4d | Iustin Pop | .BI "-L" MAX-DELTA ", --max-delta=" MAX-DELTA |
178 | b0045e4d | Iustin Pop | If while computing a solution, it's intermediate delta is already |
179 | b0045e4d | Iustin Pop | higher or equal to \fIMAX-DELTA\fR, consider this a failure and abort |
180 | b0045e4d | Iustin Pop | (as if N+1 checks have failed). |
181 | b0045e4d | Iustin Pop | |
182 | b0045e4d | Iustin Pop | .TP |
183 | b0045e4d | Iustin Pop | .B -V, --version |
184 | b0045e4d | Iustin Pop | Just show the program version and exit. |
185 | b0045e4d | Iustin Pop | |
186 | b0045e4d | Iustin Pop | .SH EXIT STATUS |
187 | b0045e4d | Iustin Pop | |
188 | b0045e4d | Iustin Pop | The exist status of the command will be zero, unless for some reason |
189 | b0045e4d | Iustin Pop | the algorithm fatally failed (e.g. wrong node or instance data). |
190 | b0045e4d | Iustin Pop | |
191 | b0045e4d | Iustin Pop | .SH BUGS |
192 | b0045e4d | Iustin Pop | |
193 | b0045e4d | Iustin Pop | The program does not check its input data for consistency, and aborts |
194 | b0045e4d | Iustin Pop | with cryptic errors messages in this case. |
195 | b0045e4d | Iustin Pop | |
196 | b0045e4d | Iustin Pop | The algorithm doesn't know when it won't be possible to reach N+1 |
197 | b0045e4d | Iustin Pop | compliance at all, and will happily churn CPU for ages without |
198 | b0045e4d | Iustin Pop | realising it won't reach a solution. |
199 | b0045e4d | Iustin Pop | |
200 | b0045e4d | Iustin Pop | The algorithm is too slow. |
201 | b0045e4d | Iustin Pop | |
202 | b0045e4d | Iustin Pop | The output format is not easily scriptable, and the program should |
203 | b0045e4d | Iustin Pop | feed moves directly into Ganeti (either via RAPI or via a gnt-debug |
204 | b0045e4d | Iustin Pop | input file). |
205 | b0045e4d | Iustin Pop | |
206 | b0045e4d | Iustin Pop | .SH SEE ALSO |
207 | d2ac5526 | Iustin Pop | .BR hbal "(1), " hscan "(1), " ganeti "(7), " gnt-instance "(8), " |
208 | d2ac5526 | Iustin Pop | .BR gnt-node "(8)" |