Ganeti-htools release notes
===========================

Version 0.2.3 (Thu, 4 Feb 2010)
--------------------------------

A small release:

- Fixes selection of secondary node: previously, if the cluster had
  many N+1 failures, an N+1 failed node could be selected as secondary
  even if it did not have enough memory to allow the instance to be
  migrated/failed over to it; this is bad for automated tools, since
  we can get the cluster in an unhealthy state
- Switch the text backend to a single input file, which is now
  generated by hscan and shouldn't be generated manually via
  gnt-node/instance list anymore; this allows richer information to be
  kept in the file, and slightly simplifies the internals of the text
  backend

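The secondary-selection fix above can be illustrated with a minimal
Python sketch. It is not the htools implementation: the ``Node`` fields
and the helpers ``can_accept_secondary`` and ``pick_secondaries`` are
hypothetical names, and the memory model (free memory minus memory
already set aside for other secondaries) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, simplified view of a node (all sizes in MiB)."""
    name: str
    free_mem: int      # memory currently unused on the node
    reserved_mem: int  # assumed: memory set aside for existing secondaries


def can_accept_secondary(node: Node, instance_mem: int) -> bool:
    """Return True if the instance could still be failed over to this node."""
    return node.free_mem - node.reserved_mem >= instance_mem


def pick_secondaries(nodes, instance_mem):
    """Keep only the candidates with enough spare memory for the instance."""
    return [n.name for n in nodes if can_accept_secondary(n, instance_mem)]
```

With this model, a node that has 2048 MiB free but 1536 MiB reserved is
rejected as secondary for a 1024 MiB instance, while a node with
4096 MiB free and 1024 MiB reserved is accepted.
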
Version 0.2.2 (Tue, 29 Dec 2009)
--------------------------------

Small release; 0.2.1 was broken and thus this was released earlier:

- Release 0.2.1 broke the LUXI backend due to a typo; fixed
- Added a live-test script that should catch errors like the above one
  in the future (needs a working, non-empty cluster)
- Changed RAPI and LUXI backends to treat drained nodes as offline,
  similar to the IAllocator backend change in 0.2.0 (which was wrongly
  marked as affecting all backends)
- Changed the metrics for offline instances and N1 score from percent to
  count, in order to increase the priority of evacuations
- Added a new metric (offline primary instances) which should fix the
  evacuation of an offline node in a 2-node cluster

Version 0.2.1 (Wed, 2 Dec 2009)
--------------------------------

- Added instance exclusion defined via instance tags
- Fixed the output of hspace to be again parseable from the shell

Version 0.2.0 (Tue, 10 Nov 2009)
--------------------------------

A significant release, with a few new major features:

- Added direct execution of the hbal solution when using the Luxi
  backend; the steps for each instance move are submitted as a single
  job, and the different jobs are submitted as groups in order to
  parallelise the execution of moves
- Added support for balancing based on dynamic utilisation data for
  instances, fed in via a text file; by default, all instances are
  considered equal and this change also improves the equalisation of
  secondary instances per node
- Added support for tiered capacity calculation in hspace, where we
  start from a maximum instance spec and decrease the spec when we run
  out of resources; this should give a better measure of available
  capacity on 'fragmented' clusters; this is done separately from the
  current fixed-mode computation

Also there have been many minor improvements:

- Added option for showing instances (“--print-instances”), similar to
  the print nodes option
- Added support for customising the node list via an argument to the
  print nodes option in the form of a comma-separated list of field
  names; the field names are currently not documented, as further
  changes are expected in a future release
- Enhanced the error reporting in the Luxi and Rapi backends
- Changed the handling of drained nodes, now being treated the same as
  offline nodes, for Ganeti 2.0.4+ compatibility
- A number of internal changes, simplifying code and merging some
  disparate functions
- Simplified the build system in relation to the creation of archives

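The tiered capacity idea described above (start from a maximum instance
spec, shrink it when resources run out) can be sketched in Python. This
is only an illustration under stated assumptions: it tracks memory
alone, uses a simple first-fit placement, and shrinks the spec by a
fixed step. The function name ``tiered_capacity`` and all its
parameters are hypothetical; the real hspace may place instances and
shrink specs differently.

```python
def tiered_capacity(nodes_free_mem, max_mem, min_mem, step):
    """Count how many instances fit per tier, largest tier first.

    nodes_free_mem: free memory per node (MiB)
    max_mem/min_mem: largest and smallest instance size to try (MiB)
    step: how much to shrink the spec when nothing fits any more (MiB)
    Returns a list of (spec_mem, instance_count) tuples.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    free = list(nodes_free_mem)  # work on a copy
    tiers = []
    spec = max_mem
    while spec >= min_mem:
        count = 0
        while True:
            # First-fit: take the first node that can still hold the spec.
            idx = next((i for i, f in enumerate(free) if f >= spec), None)
            if idx is None:
                break
            free[idx] -= spec
            count += 1
        if count:
            tiers.append((spec, count))
        spec -= step  # out of resources for this spec: try a smaller one
    return tiers
```

For two nodes with 1000 and 700 MiB free, a fixed 512 MiB spec fits
only two instances, whereas ``tiered_capacity([1000, 700], 512, 128,
128)`` also reports the smaller instances that fit into the leftover
fragments: ``[(512, 2), (384, 1), (128, 1)]``.
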
Version 0.1.8 (Tue, 29 Sep 2009)
--------------------------------

- Brown-paper-bag release fixing haddock issues

Version 0.1.7 (Mon, 28 Sep 2009)
--------------------------------

- Fixed a bug in the Luxi backend for big responses
- Fixed test suite exit code in presence of test failures
- Changed the migrate operation to run failover instead for instances
  which were marked as not running in the input data (their state could
  have changed since then, but this is better than the previous
  behaviour of always migrating)
- Added support for 'cheap' moves only (only migrate/failover) in
  balancing
- Added support for building without curl (thus no RAPI backend)

Version 0.1.6 (Wed, 19 Aug 2009)
--------------------------------

- Added support for Luxi (the native Ganeti protocol)
- Added support for simulated clusters (for hspace only)
- Added timeouts for the RAPI backend
- Fixed a few inconsistencies in the command line handling
- Fixed handling of errors while loading data
- The 'network' library is a new dependency due to the Luxi addition

Version 0.1.5 (Thu, 09 Jul 2009)
--------------------------------

- Removed obsolete hn1 program; this allowed removal of a lot of
  supporting code
- Lots of changes in hspace: the output is now a shell fragment so that
  scripts can source it or parse it more easily; added failure reasons;
  optimised to use less memory for large clusters
- Optimised the scoring algorithm (used by all tools) so that
  computations should now be faster

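Since the hspace output is described above as a shell fragment, a script
that cannot source it directly might parse it instead. The Python sketch
below assumes the fragment consists of plain ``KEY=VALUE`` assignments;
the key names in the usage note are invented for illustration and are
not actual hspace output.

```python
import shlex


def parse_shell_fragment(text):
    """Parse simple KEY=VALUE shell assignments into a dict of strings.

    Quoting is handled by shlex; anything that is not an assignment
    (no '=' in the token) is ignored.
    """
    result = {}
    for token in shlex.split(text, comments=True):
        key, sep, value = token.partition("=")
        if sep:
            result[key] = value
    return result
```

For example, ``parse_shell_fragment('EXAMPLE_OK=1\nEXAMPLE_REASON="out
of memory"\n')`` returns ``{'EXAMPLE_OK': '1', 'EXAMPLE_REASON': 'out of
memory'}``.
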
Version 0.1.4 (Tue, 16 Jun 2009)
--------------------------------

- Added CPU count/ratio of virtual-to-physical CPUs to the cluster
  scoring methods; this means that now the balancer, the iallocator
  plugin and so on will try to keep the VCPU-to-PCPU ratio equal across
  the cluster
- Fixed some hscan bugs
- Fixed the way iallocator reads the total disk size (was broken and it
  was always falling back to summing the disk sizes)
- Internals: fixed most compile-time warnings

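As a rough illustration of what keeping the VCPU-to-PCPU ratio equal
across the cluster means, the Python sketch below measures the spread of
that ratio over all nodes. Using the population standard deviation as
the measure is an assumption made for this example, and
``cpu_ratio_spread`` is a hypothetical name, not an htools function.

```python
from statistics import pstdev


def cpu_ratio_spread(nodes):
    """Spread of the virtual-to-physical CPU ratio across nodes.

    nodes: iterable of (vcpus, pcpus) pairs, one per node.
    A value of 0.0 means every node has the same ratio.
    """
    ratios = [vcpus / pcpus for vcpus, pcpus in nodes]
    return pstdev(ratios)
```

Two nodes with 8 and 4 VCPUs on 4 physical CPUs each give a spread of
0.5; moving instances until both nodes carry 6 VCPUs brings it to 0.0,
which is the direction a balancer using such a metric would prefer.
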
Version 0.1.3 (Fri, 05 Jun 2009)
--------------------------------

- Fix a bug in the ReplacePrimary instance moves, affecting most of the
  tools

Version 0.1.2 (Tue, 02 Jun 2009)
--------------------------------

- Add a new program, “hspace”, which computes the free space on a
  cluster (based on a given instance spec)
- Improvements in API docs and partially in the user docs
- Started adding unittests

Version 0.1.1 (Tue, 26 May 2009)
--------------------------------

- Add a new program, “hail”, which is an iallocator plugin and can
  allocate/relocate instances
- Experimental support for non-mirrored instances (hail supports them,
  hbal should no longer abort when it finds such instances and should
  simply ignore them)
- The RAPI port and/or scheme can be overridden now, and even “file://”
  schemes can be used if the message body has been saved under the
  appropriate name
- Lots of code reorganization, esp. rewritten loading pipeline
- Better data checking and better error messages in case validation
  fails; tools now consider nodes with errors in input data (‘?’
  returned by ganeti) as offline
- Small enhancement to the makefile for simpler packaging

Version 0.1.0 (Tue, 19 May 2009)
--------------------------------

- Drop compatibility with Ganeti 1.2
- Add a new minimum score option (with a very low default), should help
  with very good clusters (but is still not optimal)
- Add a --quiet option to hbal
- Add support for reading offline nodes directly from the cluster

Version 0.0.8 (Tue, 21 Apr 2009)
--------------------------------

- hbal: prevent mismatches when wrong node names are passed to -O, by
  aborting in this case
- add the ability to write the commands (-C) to a script via (-C<file>),
  so that it can be later executed directly; this has also changed the
  commands to include the necessary -f flags to skip confirmations
- add checks for extra arguments in hbal and hn1, so that unintended
  errors are caught
- raise the accepted “missing” memory limit to 512MB, to cover usual Xen
  reservations

Version 0.0.7 (Mon, 23 Mar 2009)
--------------------------------

- added support for offline nodes, which are not used as targets for
  instance relocation and if they hold instances the hbal algorithm will
  attempt to relocate these away
- added support for offline instances, which now will no longer skew the
  free memory estimation of nodes; the algorithm will no longer create
  conditions for N+1 failures when such instances are later started
- implemented a complete model of node resources, in order to prevent an
  unintended re-occurrence of cases like the offline instance where we
  miscalculate some node resource; this now gives a warning in case the
  node reported free disk or free memory deviates by more than a set
  amount from the expected value
- a new tool *hscan* that can generate the input text-file for the other
  tools by collection via RAPI
- some small changes to the build system to make it more friendly; also
  included the generated documentation in the source archive

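The resource-model check described above (warn when the reported free
memory deviates from the expected value by more than a set amount) can
be sketched in Python as follows. The accounting used here, where the
expected free memory is the node total minus the memory of its
instances, is a simplification, and the names ``missing_memory`` and
``check_node`` are hypothetical.

```python
def missing_memory(total_mem, reported_free, instance_mems):
    """Memory (MiB) unaccounted for by the instances on a node."""
    expected_free = total_mem - sum(instance_mems)
    return expected_free - reported_free


def check_node(name, total_mem, reported_free, instance_mems, limit):
    """Return a warning string if too much memory is missing, else None."""
    missing = missing_memory(total_mem, reported_free, instance_mems)
    if missing > limit:
        return f"node {name}: {missing} MiB of memory unaccounted for"
    return None
```

With a limit of 512 (the value mentioned in the 0.0.8 notes), a node
with 8192 MiB total, instances using 2048 and 1024 MiB, and 4608 MiB
reported free is exactly at the limit and produces no warning; the same
node reporting only 4000 MiB free would be flagged.
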
Version 0.0.6 (Mon, 16 Mar 2009)
--------------------------------

- re-factored the hbal algorithm to make it stable in the sense that it
  gives the same solution when restarted from the middle; barring
  rounding of disk/memory and incomplete reporting from Ganeti (for
  1.2), it should be now feasible to rely on its output without
  generating moves ad infinitum
- the hbal algorithm now uses two more variables: the node N+1 failures
  and the amount of reserved memory; the first of which tries to ‘fix’
  the N+1 status, the latter tries to distribute secondaries more
  equally
- the hbal algorithm now uses two more moves at each step:
  replace+failover and failover+replace (besides the original failover,
  replace, and failover+replace+failover)
- slightly changed the build system to embed GIT version/tags into the
  binaries so that we know from which tree a binary was built, either
  via ‘--version’ or via “strings hbal|grep version”
- changed the solution list and in general the hbal output to be
  clearer by default, and changed “gnt-instance failover” to
  “gnt-instance migrate”
- added man pages for the two binaries

Version 0.0.5 (Mon, 09 Mar 2009)
--------------------------------

- a few small improvements for hbal (possibly undone by later changes),
  hbal is now quite a bit faster
- fix documentation building
- allow hbal to work on non N+1 compliant clusters, but without
  guarantees that the end cluster will be compliant; in any case, this
  should give a smaller number of nodes that are not compliant if the
  cluster state permits it
- strip common domain suffix from nodes and instances, so that output is
  shorter and hopefully clearer

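Stripping a common domain suffix, as mentioned in the last item above,
could look roughly like the following Python sketch. It compares names
label by label from the right and always keeps at least one label per
name; ``strip_common_suffix`` is a hypothetical helper, and the real
tools may handle corner cases differently.

```python
def strip_common_suffix(names):
    """Remove the DNS labels shared by all names at their right-hand end."""
    if not names:
        return []
    parts = [name.split(".") for name in names]
    shortest = min(len(p) for p in parts)
    common = 0
    # Count trailing labels identical in every name, but never strip
    # the last remaining label of the shortest name.
    while (common < shortest - 1
           and len({p[-1 - common] for p in parts}) == 1):
        common += 1
    return [".".join(p[:len(p) - common]) for p in parts]
```

For instance, ``strip_common_suffix(["node1.example.com",
"node2.example.com", "inst1.example.com"])`` returns ``['node1',
'node2', 'inst1']``.
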
Version 0.0.4 (Sun, 15 Feb 2009)
--------------------------------

- better balancing algorithm in hbal
- implemented an RAPI collector; the cluster data can now be gathered
  automatically via RAPI and no longer needs a manual export of the
  node and instance lists

Version 0.0.3 (Wed, 28 Jan 2009)
--------------------------------

- initial release of hbal, a cluster rebalancing tool
- input data format changed due to hbal requirements

Version 0.0.2 (Tue, 06 Jan 2009)
--------------------------------

- fix handling of some common cases (cluster N+1 compliant from the
  start, too big depth given, failure to compute solution)
- add option to print the needed command list for reaching the proposed
  solution

Version 0.0.1 (Tue, 06 Jan 2009)
--------------------------------

- initial release of hn1 tool

.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: