.\" hspace.1 @ 2bbf77cc
.TH HSPACE 1 2009-06-01 htools "Ganeti H-tools"
.SH NAME
hspace \- Cluster space analyzer for Ganeti

.SH SYNOPSIS
.B hspace
.B "[-p]"
.B "[-v... | -q]"
.BI "[-O " name "...]"
.BI "[-m " cluster "]"
.BI "[-n " nodes-file "]"
.BI "[-i " instances-file "]"
.BI "[--memory " mem "]"
.BI "[--disk " disk "]"
.BI "[--req-nodes " req-nodes "]"
.BI "[--max-cpu " cpu-ratio "]"
.BI "[--min-disk " disk-ratio "]"

.B hspace
.B --version

.SH DESCRIPTION
hspace computes how many additional instances can fit on a cluster,
while maintaining N+1 status.

The program will try to place instances, all of the same size, on the
cluster, until the point where no N+1-compliant allocation is
possible. It uses the exact same allocation algorithm as the hail
iallocator plugin.

The output of the program is designed to be interpreted as a shell
fragment (or parsed as a \fIkey=value\fR file). Options which extend
the output (e.g. -p, -v) will output the additional information on
stderr (such that the stdout is still parseable).

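As a sketch of this, the stdout can be sourced directly by a POSIX
shell and the keys read as variables (the file name and values below
are purely illustrative, not from a real cluster):

```shell
# Hypothetical hspace output captured to a file; the key names follow
# this man page, but the values are made up for illustration.
cat > hspace.out <<'EOF'
HTS_SPEC_MEM=4096
HTS_ALLOC_COUNT=7
HTS_OK=1
EOF

# Source the key=value fragment and use the variables.
. ./hspace.out
echo "allocated $HTS_ALLOC_COUNT instances"
# prints: allocated 7 instances
```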
The following keys are available in the output of the script (all
prefixed with \fIHTS_\fR):
.TP
.I SPEC_MEM, SPEC_DSK, SPEC_CPU, SPEC_RQN
These represent the specifications of the instance model used for
allocation (the memory, disk, cpu, requested nodes).

.TP
.I CLUSTER_MEM, CLUSTER_DSK, CLUSTER_CPU, CLUSTER_NODES
These represent the total memory, disk, CPU count and total nodes in
the cluster.

.TP
.I INI_SCORE, FIN_SCORE
These are the initial (current) and final cluster score (see the hbal
man page for details about the scoring algorithm).

.TP
.I INI_INST_CNT, FIN_INST_CNT
The initial and final instance count.

.TP
.I INI_MEM_FREE, FIN_MEM_FREE
The initial and final total free memory in the cluster (which is not
necessarily all available for use).

.TP
.I INI_MEM_AVAIL, FIN_MEM_AVAIL
The initial and final total memory available for allocation in the
cluster. When allocating redundant instances, new instances can
increase the reserved memory, so not all of this memory is
necessarily usable for new instance allocations.

.TP
.I INI_MEM_RESVD, FIN_MEM_RESVD
The initial and final reserved memory (for redundancy/N+1 purposes).

.TP
.I INI_MEM_INST, FIN_MEM_INST
The initial and final memory used by instances (actual runtime used
RAM).

.TP
.I INI_MEM_OVERHEAD, FIN_MEM_OVERHEAD
The initial and final memory overhead: memory used by the node
itself and unaccounted memory (e.g. due to hypervisor overhead).

.TP
.I INI_MEM_EFF, FIN_MEM_EFF
The initial and final memory efficiency, represented as instance
memory divided by total memory.

.TP
.I INI_DSK_FREE, INI_DSK_AVAIL, INI_DSK_RESVD, INI_DSK_INST, INI_DSK_EFF
Initial disk stats, similar to the memory ones.

.TP
.I FIN_DSK_FREE, FIN_DSK_AVAIL, FIN_DSK_RESVD, FIN_DSK_INST, FIN_DSK_EFF
Final disk stats, similar to the memory ones.

.TP
.I INI_CPU_INST, FIN_CPU_INST
Initial and final number of virtual CPUs used by instances.

.TP
.I INI_CPU_EFF, FIN_CPU_EFF
The initial and final CPU efficiency, represented as the count of
virtual instance CPUs divided by the total physical CPU count.

.TP
.I INI_MNODE_MEM_AVAIL, FIN_MNODE_MEM_AVAIL
The initial and final maximum per-node available memory. This is not
very useful as a metric by itself, but it gives an impression of the
status of the nodes; for example, this value restricts the maximum
instance size that can still be created on the cluster.

.TP
.I INI_MNODE_DSK_AVAIL, FIN_MNODE_DSK_AVAIL
Like the above, but for disk.

.TP
.I ALLOC_USAGE
The current usage, represented as the initial number of instances
divided by the final number of instances.

.TP
.I ALLOC_COUNT
The number of instances allocated (the delta between FIN_INST_CNT and
INI_INST_CNT).

.TP
.I ALLOC_FAIL*_CNT
For the last allocation attempt (which would have increased
FIN_INST_CNT by one, had it succeeded), this is the count of the
failure reasons per failure type; currently defined are FAILMEM,
FAILDISK and FAILCPU, which represent errors due to insufficient
memory, disk and CPUs, and FAILN1, which represents a cluster that is
not N+1 compliant and on which no instances can be allocated at all.

.TP
.I ALLOC_FAIL_REASON
The most common failure reason, as one of the above FAIL*
strings.

.TP
.I OK
A marker representing the successful end of the computation, with
value "1". If this key is not present in the output, the computation
failed and any values present should not be relied upon.

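Because of this, a consumer should guard on the OK marker before
trusting any other key; a minimal sketch (the file name and values
are hypothetical):

```shell
# Hypothetical failed-run output: no HTS_OK line, only failure data.
cat > hspace.out <<'EOF'
HTS_ALLOC_FAILMEM_CNT=2
HTS_ALLOC_FAIL_REASON=FAILMEM
EOF

unset HTS_OK               # start clean; a failed run emits no HTS_OK
. ./hspace.out
if [ "${HTS_OK:-0}" = 1 ]; then
    echo "computation succeeded"
else
    echo "computation failed: ${HTS_ALLOC_FAIL_REASON:-unknown}"
fi
# prints: computation failed: FAILMEM
```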
.SH OPTIONS
The options that can be passed to the program are as follows:

.TP
.BI "--memory " mem
The memory size of the instances to be placed (defaults to 4GiB).

.TP
.BI "--disk " disk
The disk size of the instances to be placed (defaults to 100GiB).

.TP
.BI "--req-nodes " num-nodes
The number of nodes for the instances; the default of two means
mirrored instances, while passing one means plain (non-mirrored)
instances.

.TP
.BI "--max-cpu " cpu-ratio
The maximum virtual-to-physical cpu ratio, as a floating point number
greater than or equal to one. For example, specifying \fIcpu-ratio\fR
as \fB2.5\fR means that, for a 4-cpu machine, a maximum of 10 virtual
cpus should be allowed to be in use for primary instances.

.TP
.BI "--min-disk " disk-ratio
The minimum amount of free disk space remaining, as a floating point
number. For example, specifying \fIdisk-ratio\fR as \fB0.25\fR means
that at least one quarter of disk space should be left free on nodes.
A value of one doesn't make sense, as that would mean no disk space
at all may be used on the nodes.

.TP
.B -p, --print-nodes
Prints the before and after node status, in a format designed to allow
the user to understand the node's most important parameters.

The node list will contain the following information:
.RS
.TP
.B F
a character denoting the status of the node, with '-' meaning an
offline node, '*' meaning N+1 failure and blank meaning a good node
.TP
.B Name
the node name
.TP
.B t_mem
the total node memory
.TP
.B n_mem
the memory used by the node itself
.TP
.B i_mem
the memory used by instances
.TP
.B x_mem
the amount of memory which seems to be in use but cannot be attributed
to any instance; usually this means that the hypervisor has some
overhead or that there are other reporting errors
.TP
.B f_mem
the free node memory
.TP
.B r_mem
the reserved node memory, which is the amount of free memory needed
for N+1 compliance
.TP
.B t_dsk
total disk
.TP
.B f_dsk
free disk
.TP
.B pcpu
the number of physical cpus on the node
.TP
.B vcpu
the number of virtual cpus allocated to primary instances
.TP
.B pri
number of primary instances
.TP
.B sec
number of secondary instances
.TP
.B p_fmem
percent of free memory
.TP
.B p_fdsk
percent of free disk
.TP
.B r_cpu
ratio of virtual to physical cpus
.RE

.TP
.BI "-O " name
This option (which can be given multiple times) will mark nodes as
being \fIoffline\fR, and instances won't be placed on these nodes.

Note that hspace will also mark as offline any nodes which are
reported by RAPI as such, or that have "?" in file-based input in any
numeric fields.

.TP
.BI "-n " nodefile ", --nodes=" nodefile
The name of the file holding node information (if not collecting via
RAPI), instead of the default \fInodes\fR file (but see below how to
customize the default value via the environment).

.TP
.BI "-i " instancefile ", --instances=" instancefile
The name of the file holding instance information (if not collecting
via RAPI), instead of the default \fIinstances\fR file (but see below
how to customize the default value via the environment).

.TP
.BI "-m " cluster
Collect data not from files but directly from the
.I cluster
given as an argument via RAPI. If the argument doesn't contain a colon
(:), then it is converted into a fully-built URL by prepending
https:// and appending the default RAPI port; otherwise it's
considered a fully-specified URL and is used as-is.

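The colon rule above can be sketched as a small shell helper. This is
illustrative only, not hspace's own code: the function name
normalize_rapi_url is hypothetical, and 5080 is assumed here as the
default Ganeti RAPI port.

```shell
# Sketch of the -m argument normalization described above.
normalize_rapi_url() {
    case "$1" in
        *:*) printf '%s\n' "$1" ;;               # has a colon: full URL, use as-is
        *)   printf 'https://%s:5080\n' "$1" ;;  # bare name: build the URL
    esac
}

normalize_rapi_url cluster1.example.com
# prints: https://cluster1.example.com:5080
normalize_rapi_url https://cluster1.example.com:5080
# prints: https://cluster1.example.com:5080
```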
.TP
.B -v, --verbose
Increase the output verbosity. Each use of this option increases the
verbosity (currently more than 2 doesn't make sense) from the default
of one. At verbosity 2, the location of the new instances is shown on
standard error.

.TP
.B -q, --quiet
Decrease the output verbosity. Each use of this option decreases the
verbosity (less than zero doesn't make sense) from the default of
one.

.TP
.B -V, --version
Just show the program version and exit.

.SH EXIT STATUS

The exit status of the command will be zero, unless for some reason
the algorithm failed fatally (e.g. wrong node or instance data).

.SH BUGS

The algorithm is highly dependent on the number of nodes; its runtime
grows exponentially with this number, and as such is impractical for
really big clusters.

The algorithm doesn't rebalance the cluster or try to get the optimal
fit; it just allocates in the best place for the current step, without
taking into consideration the impact on future placements.

.SH ENVIRONMENT

If the variables \fBHTOOLS_NODES\fR and \fBHTOOLS_INSTANCES\fR are
present in the environment, they will override the default names for
the nodes and instances files. These, of course, have no effect when
RAPI is used.

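The precedence can be sketched as shell-style defaulting. This is
only an illustration of the behaviour described above (hspace resolves
the names internally); the variable names NODES_FILE and
INSTANCES_FILE and the /tmp/mynodes path are made up for the demo.

```shell
unset HTOOLS_INSTANCES        # ensure the built-in default applies below
HTOOLS_NODES=/tmp/mynodes     # simulate an override via the environment

NODES_FILE=${HTOOLS_NODES:-nodes}              # override wins: /tmp/mynodes
INSTANCES_FILE=${HTOOLS_INSTANCES:-instances}  # unset: default "instances"
echo "$NODES_FILE $INSTANCES_FILE"
# prints: /tmp/mynodes instances
```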
.SH SEE ALSO
.BR hbal "(1), " hscan "(1), " ganeti "(7), " gnt-instance "(8), "
.BR gnt-node "(8)"

.SH "COPYRIGHT"
.PP
Copyright (C) 2009 Google Inc. Permission is granted to copy,
distribute and/or modify under the terms of the GNU General Public
License as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
.PP
On Debian systems, the complete text of the GNU General Public License
can be found in /usr/share/common-licenses/GPL.