Ganeti-htools release notes
===========================

Version 0.2.4 (Mon, 22 Feb 2010)
--------------------------------

Two improvements for node evacuation:

- hbal takes a new parameter ``--evac-mode`` that restricts the
  instances to be moved to the ones on offline/drained nodes, which
  should reduce the work done
- hail supports the new ``multi-evacuate`` mode of the IAllocator
  protocol, which will be released in a minor release on the Ganeti 2.1

Version 0.2.3 (Thu, 4 Feb 2010)
--------------------------------

- Fixes selection of secondary node: previously, if the cluster had
  many N+1 failures, an N+1 failed node could be selected as secondary
  even if it did not have enough memory to allow the instance to be
  migrated/failed over to it; this is bad for automated tools, since
  we can get the cluster into an unhealthy state
- Switch the text backend to a single input file, which is now
  generated by hscan and shouldn't be generated manually via
  gnt-node/instance list anymore; this allows richer information to be
  kept in the file, and slightly simplifies the internals of the text

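
The secondary-node fix above can be illustrated with a small sketch.
This is not the htools code; the ``Node`` fields and both helper names
are hypothetical, and the tie-breaking order (healthy nodes first, then
most free memory) is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_mem: int    # free memory in MiB (hypothetical field)
    n1_failed: bool  # True if the node already fails the N+1 check


def can_be_secondary(node: Node, instance_mem: int) -> bool:
    """A node qualifies only if the instance could actually be
    migrated or failed over to it, whatever its N+1 status is."""
    return node.free_mem >= instance_mem


def pick_secondary(nodes: List[Node], instance_mem: int) -> Optional[Node]:
    """Pick a secondary among the nodes that pass the memory check."""
    candidates = [n for n in nodes if can_be_secondary(n, instance_mem)]
    # Assumed preference: N+1-healthy nodes first, then most free memory.
    candidates.sort(key=lambda n: (n.n1_failed, -n.free_mem))
    return candidates[0] if candidates else None
```

With this check, a cluster where every node fails N+1 still never
yields a secondary that is too small to receive the instance.
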
Version 0.2.2 (Tue, 29 Dec 2009)
--------------------------------

Small release; 0.2.1 was broken and thus this one was released early:

- Release 0.2.1 broke the LUXI backend due to a typo, fixed
- Added a live-test script that should catch errors like the above one
  in the future (needs a working, non-empty cluster)
- Changed RAPI and LUXI backends to treat drained nodes as offline,
  similar to the IAllocator backend change in 0.2.0 (which was wrongly
  marked as affecting all backends)
- Changed the metrics for offline instances and N1 score from percent to
  count, in order to increase the priority of evacuations
- Added a new metric (offline primary instances) which should fix the
  evacuation of an offline node in a 2-node cluster

Version 0.2.1 (Wed, 2 Dec 2009)
--------------------------------

- Added instance exclusion defined via instance tags
- Fixed the output of hspace to be again parseable from the shell

Version 0.2.0 (Tue, 10 Nov 2009)
--------------------------------

A significant release, with a few new major features:

- Added direct execution of the hbal solution when using the Luxi
  backend; the steps for each instance move are submitted as a single
  job, and the different jobs are submitted as groups in order to
  parallelise the execution of moves
- Added support for balancing based on dynamic utilisation data for
  instances, fed in via a text file; by default, all instances are
  considered equal and this change also improves the equalisation of
  secondary instances per node
- Added support for tiered capacity calculation in hspace, where we
  start from a maximum instance spec and decrease the spec when we run
  out of resources; this should give a better measure of available
  capacity on 'fragmented' clusters; this is done separately from the
  current fixed-mode computation

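
The tiered capacity idea can be sketched as follows. This is a
simplified illustration and not the hspace algorithm: it models memory
only, places each instance on the node with the most free memory, and
shrinks the spec by a fixed step. The function name and all parameters
are hypothetical.

```python
def tiered_capacity(free_mem, max_spec, min_spec, step):
    """Return a list of (spec, count) tiers, largest spec first.

    free_mem is the free memory per node in MiB; max_spec, min_spec
    and step describe the instance sizes that are tried in turn.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    free = list(free_mem)
    tiers = []
    if not free:
        return tiers
    spec = max_spec
    while spec >= min_spec:
        count = 0
        while True:
            # Place the next instance on the node with most free memory.
            idx = max(range(len(free)), key=lambda i: free[i])
            if free[idx] < spec:
                break  # out of resources for this spec, go one tier down
            free[idx] -= spec
            count += 1
        if count:
            tiers.append((spec, count))
        spec -= step
    return tiers
```

For two nodes with 5000 and 3000 MiB free, a fixed 4096 MiB spec fits
only once, while the tiered run also finds room for a 2048 MiB
instance, which is the extra information a fragmented cluster needs.
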
Also there have been many minor improvements:

- Added option for showing instances (“--print-instances”), similar to
  the print nodes option
- Added support for customising the node list via an argument to the
  print nodes option in the form of a comma-separated list of field
  names; the field names are currently not documented, as further
  changes are expected in a future release
- Enhanced the error reporting in the Luxi and Rapi backends
- Changed the handling of drained nodes, now being treated the same as
  offline nodes, for Ganeti 2.0.4+ compatibility
- A number of internal changes, simplifying code and merging some
- Simplified the build system in relation to creation of archives

Version 0.1.8 (Tue, 29 Sep 2009)
--------------------------------

- Brown-paper-bag release fixing haddock issues

Version 0.1.7 (Mon, 28 Sep 2009)
--------------------------------

- Fixed a bug in the Luxi backend for big responses
- Fixed test suite exit code in presence of test failures
- Changed the migrate operation to run a failover instead for
  instances which were marked as not running in the input data (their
  state could have changed since then, but this is better than the
  current behaviour of always migrating)
- Added support for 'cheap' moves only (only migrate/failover) in
- Added support for building without curl (thus no RAPI backend)

Version 0.1.6 (Wed, 19 Aug 2009)
--------------------------------

- Added support for Luxi (the native Ganeti protocol)
- Added support for simulated clusters (for hspace only)
- Added timeouts for the RAPI backend
- Fixed a few inconsistencies in the command line handling
- Fixed handling of errors while loading data
- The 'network' package is a new dependency due to the Luxi addition

Version 0.1.5 (Thu, 09 Jul 2009)
--------------------------------

- Removed obsolete hn1 program; this allowed removal of a lot of
- Lots of changes in hspace: the output is now a shell fragment so
  that scripts can source it or parse it more easily; added failure
  reasons; optimised to use less memory for large clusters
- Optimised the scoring algorithm (used by all tools) so that
  computations should now be faster

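
Since the hspace output is a shell fragment, a script that does not
want to source it can parse it instead. The following is a minimal
sketch that assumes plain ``KEY=value`` assignments with shell-style
quoting; the variable names in the usage note are invented for the
example and are not the names hspace emits.

```python
import shlex


def parse_shell_fragment(text):
    """Parse KEY=value shell assignments into a dict of strings."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex handles quoting, e.g. KEY='two words'
        for token in shlex.split(line):
            key, sep, value = token.partition("=")
            if sep:
                result[key] = value
    return result
```

For instance, a fragment containing ``EXAMPLE_COUNT=12`` and
``EXAMPLE_REASON='out of memory'`` yields a dictionary with the two
keys mapped to ``"12"`` and ``"out of memory"``.
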
Version 0.1.4 (Tue, 16 Jun 2009)
--------------------------------

- Added CPU count/ratio of virtual-to-physical CPUs to the cluster
  scoring methods; this means that now the balancer, the iallocator
  plugin and so on will try to keep the VCPU-to-PCPU ratio equal across
- Fixed some hscan bugs
- Fixed the way iallocator reads the total disk size (was broken and it
  was always falling back to summing the disk sizes)
- Internals: fixed most compile-time warnings

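
One way to express "keeping the VCPU-to-PCPU ratio equal" as a score
component is to measure the spread of the per-node ratios. The sketch
below uses the population standard deviation for that; this choice and
the function name are assumptions made for illustration, not a
description of the actual scoring code.

```python
from statistics import pstdev


def cpu_ratio_spread(nodes):
    """Spread of the VCPU-to-PCPU ratio over (vcpus, pcpus) pairs.

    A value of 0.0 means every node has the same ratio; a balancer
    minimising this value prefers moves that even the ratios out.
    """
    ratios = [vcpus / pcpus for vcpus, pcpus in nodes]
    return pstdev(ratios)
```

Two nodes with 8 VCPUs on 4 PCPUs each give a spread of zero, while
putting all 16 VCPUs on one of them gives a larger value.
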
Version 0.1.3 (Fri, 05 Jun 2009)
--------------------------------

- Fix a bug in the ReplacePrimary instance moves, affecting most of the

Version 0.1.2 (Tue, 02 Jun 2009)
--------------------------------

- Add a new program, “hspace”, which computes the free space on a
  cluster (based on a given instance spec)
- Improvements in API docs and partially in the user docs
- Started adding unittests

Version 0.1.1 (Tue, 26 May 2009)
--------------------------------

- Add a new program, “hail”, which is an iallocator plugin and can
  allocate/relocate instances
- Experimental support for non-mirrored instances (hail supports them,
  hbal should no longer abort when it finds such instances and simply
- The RAPI port and/or scheme can be overridden now, and even “file://”
  schemes can be used if the message body has been saved under the
- Lots of code reorganization, esp. rewritten loading pipeline
- Better data checking and better error messages in case validation
  fails; tools now consider nodes with errors in the input data (‘?’
  returned by Ganeti) as offline
- Small enhancement to the makefile for simpler packaging

Version 0.1.0 (Tue, 19 May 2009)
--------------------------------

- Drop compatibility with Ganeti 1.2
- Add a new minimum score option (with a very low default), which
  should help with very good clusters (but is still not optimal)
- Add a --quiet option to hbal
- Add support for reading offline nodes directly from the cluster

Version 0.0.8 (Tue, 21 Apr 2009)
--------------------------------

- hbal: prevent mismatched (wrong) node names from being passed to -O,
  by aborting in this case
- add the ability to write the commands (-C) to a script via (-C<file>),
  so that it can be later executed directly; this has also changed the
  commands to include the necessary -f flags to skip confirmations
- add checks for extra arguments in hbal and hn1, so that unintended
- raise the accepted “missing” memory limit to 512MB, to cover usual Xen

Version 0.0.7 (Mon, 23 Mar 2009)
--------------------------------

- added support for offline nodes, which are not used as targets for
  instance relocation and if they hold instances the hbal algorithm will
  attempt to relocate these away
- added support for offline instances, which now will no longer skew the
  free memory estimation of nodes; the algorithm will no longer create
  conditions for N+1 failures when such instances are later started
- implemented a complete model of node resources, in order to prevent an
  unintended re-occurrence of cases like the offline instance one, where
  we miscalculate some node resource; this now gives a warning in case
  the node-reported free disk or free memory deviates by more than a set
  amount from the expected value
- a new tool *hscan* that can generate the input text-file for the other
  tools by collection via RAPI
- some small changes to the build system to make it more friendly; also
  included the generated documentation in the source archive

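
The resource-model warning described above boils down to comparing the
free memory a node reports with the value expected from the model. A
minimal sketch, for memory only, follows; the function names, the
argument layout and the default threshold are hypothetical and chosen
only for the example.

```python
def missing_memory(total, node_used, instance_mem, reported_free):
    """Difference between expected and reported free memory (MiB)."""
    expected_free = total - node_used - sum(instance_mem)
    return expected_free - reported_free


def check_node(name, total, node_used, instance_mem, reported_free,
               threshold=512):
    """Return a warning string if the deviation exceeds the threshold,
    otherwise None."""
    delta = missing_memory(total, node_used, instance_mem, reported_free)
    if abs(delta) > threshold:
        return "node %s: free memory deviates by %d MiB" % (name, delta)
    return None
```

Here ``instance_mem`` would list the memory of every instance assigned
to the node, which is how an instance that is assigned but not running
shows up as a deviation in reported free memory.
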
Version 0.0.6 (Mon, 16 Mar 2009)
--------------------------------

- re-factored the hbal algorithm to make it stable in the sense that it
  gives the same solution when restarted from the middle; barring
  rounding of disk/memory and incomplete reporting from Ganeti (for
  1.2), it should now be feasible to rely on its output without
  generating moves ad infinitum
- the hbal algorithm now uses two more variables: the node N+1 failures
  and the amount of reserved memory; the former tries to ‘fix’ the N+1
  status, the latter tries to distribute secondaries more
- the hbal algorithm now uses two more moves at each step:
  replace+failover and failover+replace (besides the original failover,
  replace, and failover+replace+failover)
- slightly changed the build system to embed GIT version/tags into the
  binaries so that we know from which tree a binary was built, either
  via ‘--version’ or via “strings hbal|grep version”
- changed the solution list and in general the hbal output to be
  clearer by default, and changed “gnt-instance failover” to
  “gnt-instance
- added man pages for the two binaries

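
The five move types listed above can be modelled as compositions of two
primitives acting on an instance's (primary, secondary) node pair. The
sketch below assumes that "replace" means replacing the secondary node
with a new target; the function names are hypothetical.

```python
def failover(primary, secondary):
    """Swap the roles of the two nodes."""
    return (secondary, primary)


def replace(primary, secondary, target):
    """Replace the secondary node with the target (assumed meaning)."""
    return (primary, target)


def apply_move(move, primary, secondary, target):
    """Apply a '+'-separated sequence of primitive steps."""
    pair = (primary, secondary)
    for step in move.split("+"):
        if step == "failover":
            pair = failover(*pair)
        elif step == "replace":
            pair = replace(pair[0], pair[1], target)
        else:
            raise ValueError("unknown step: %s" % step)
    return pair
```

Under this model, failover+replace+failover amounts to replacing the
primary node, replace+failover moves the primary to the target while
keeping the old primary as secondary, and failover+replace promotes the
old secondary and puts the new secondary on the target.
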
Version 0.0.5 (Mon, 09 Mar 2009)
--------------------------------

- a few small improvements for hbal (possibly undone by later changes),
  hbal is now quite a bit faster
- fix documentation building
- allow hbal to work on non N+1 compliant clusters, but without
  guarantees that the end cluster will be compliant; in any case, this
  should give a smaller number of nodes that are not compliant if the
  cluster state permits it
- strip common domain suffix from nodes and instances, so that output is
  shorter and hopefully clearer

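
Stripping the common domain suffix can be sketched as below. This is an
illustration rather than the htools code: the helper names are
hypothetical, the suffix is only cut at ``.`` boundaries, and at least
one label of every name is always kept.

```python
def common_domain_suffix(names):
    """Longest dot-separated suffix shared by all names ('' if none)."""
    if not names:
        return ""
    parts = [name.split(".")[::-1] for name in names]  # labels, reversed
    common = []
    for labels in zip(*parts):
        if len(set(labels)) != 1:
            break
        common.append(labels[0])
    # Never strip a whole name: keep at least its first label.
    shortest = min(len(p) for p in parts)
    if len(common) >= shortest:
        common = common[:shortest - 1]
    if not common:
        return ""
    return "." + ".".join(reversed(common))


def strip_common_suffix(names):
    """Return the names with the shared domain suffix removed."""
    suffix = common_domain_suffix(names)
    if not suffix:
        return list(names)
    return [name[:len(name) - len(suffix)] for name in names]
```

Given ``node1.example.com``, ``node2.example.com`` and
``inst1.example.com``, the output uses the short names ``node1``,
``node2`` and ``inst1``.
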
Version 0.0.4 (Sun, 15 Feb 2009)
--------------------------------

- better balancing algorithm in hbal
- implemented a RAPI collector; the cluster data can now be gathered
  automatically via RAPI and doesn't need manual export of node and

Version 0.0.3 (Wed, 28 Jan 2009)
--------------------------------

- initial release of hbal, a cluster rebalancing tool
- input data format changed due to hbal requirements

Version 0.0.2 (Tue, 06 Jan 2009)
--------------------------------

- fix handling of some common cases (cluster N+1 compliant from the
  start, too big depth given, failure to compute solution)
- add option to print the needed command list for reaching the proposed

Version 0.0.1 (Tue, 06 Jan 2009)
--------------------------------

- initial release of hn1 tool

.. vim: set textwidth=72 :