
root / Ganeti / HTools @ ee9724b9

# Date Author Comment
ee9724b9 10/16/2009 04:59 pm Iustin Pop

Start using the utilisation scores in balancing

This enables the per-node load/total available capacity scores to be
used in balancing. Note that the total available capacity is currently
fixed at one and cannot be changed by the user.

aa8d2e71 10/16/2009 04:59 pm Iustin Pop

Add loading and processing of utilisation data

This patch adds loading and processing the utilisation data during
instance moves. While the data is not yet used, it is correctly modified
by instance changes between nodes.

hbal has the new ‘-U’ command line argument for this. The format of the...

4f83a560 10/16/2009 02:54 pm Iustin Pop

Add an option to input utilisation data

a488a217 10/16/2009 02:43 pm Iustin Pop

Merge the Node.setPri and Node.addCpus functions

The latter is only used right after the former in the Loader module, and
we'll need more of this 'update node with the data of this instance'
functionality (which is different than addPri where all information must...

5b763470 10/16/2009 10:41 am Iustin Pop

Move some utility functions to Utils.hs

These were already duplicated (Text and Simu) and we need tryRead in more places.

183a9c3d 10/16/2009 10:09 am Iustin Pop

Show the load on nodes in node lists

The strange printf usage is due to some limitation (it seems) in ghc for
very long argument lists. The whole printout should be rewritten later.

2180829f 10/15/2009 05:05 pm Iustin Pop

Add initial structure for utilisation balancing

This patch adds the datatypes and modifies the nodes and instance types to have
such attributes. They are not used yet in any way.

507fda3f 10/15/2009 05:00 pm Iustin Pop

Allow displaying the instance map in hbal

This is similar to --print-nodes, but with far fewer fields.

181d4e04 10/15/2009 11:49 am Iustin Pop

Add an explicit export list to Instance.hs

This exports all functions, but it's still good to have.

3a3c1eb4 10/15/2009 11:33 am Iustin Pop

More hlint fixes

This makes (for now) the code hlint-clean. This is per se not a huge
gain, but it allows easier tracking of regressions in style later
(one-two new violations are easier to diagnose when not hidden among 20
“known” ones).

c15f7183 10/14/2009 04:43 pm Iustin Pop

Style change: camel-casing of unittests

f5b553da 10/14/2009 04:41 pm Iustin Pop

Style change: cluster CStats camel-casing

This is again the cs_x to csX name change.

2060348b 10/14/2009 04:41 pm Iustin Pop

Style change: node and instance attributes

This changes from a_b to aB in all node and instance attributes, to
match the standard Haskell style. Also attributes that should have been
camel-cased but weren't were changed (e.g. plist → pList, pnode →
pNode).

fca250e9 10/14/2009 01:45 pm Iustin Pop

Modify the internals of the detailed CV scores

Before we used a tuple; since we'll need more metrics in the future,
it's simpler to transform this into a list of doubles, whose elements
are handled homogeneously by all the code that needs them.
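
A minimal sketch of the idea, not the actual HTools code: the names `detailedScores`, `spread` and `totalScore` are hypothetical, and `spread` is only a stand-in metric. Once the per-metric values live in a list of doubles, adding a metric adds one element and leaves the type and the consumers unchanged.

```haskell
-- Before (illustrative): a fixed-size tuple; each new metric changes the type.
type ScoresTuple = (Double, Double)

-- Stand-in metric used only for this sketch: the range of a sample.
spread :: [Double] -> Double
spread [] = 0
spread xs = maximum xs - minimum xs

-- After: a homogeneous list; a new metric is just one more element.
detailedScores :: [Double] -> [Double] -> [Double]
detailedScores memVals dskVals = [spread memVals, spread dskVals]

-- Consumers treat every element the same way, here by summing them.
totalScore :: [Double] -> Double
totalScore = sum
```

For example, `totalScore (detailedScores [1, 3] [2, 2])` combines both metrics without the caller knowing how many there are.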

0df5a1b4 10/14/2009 11:56 am Iustin Pop

Add a command line option for executing jobs

Signed-off-by: Iustin Pop <>
Reviewed-by: Michael Hanselmann <>

dfbbd43a 10/14/2009 11:56 am Iustin Pop

Change iMoveToJob to properly create migrates

The current Cluster.iMoveToJob always creates failovers, which is not
what we want. This simply uses the original instance's status to select
between these two (this is not optimal by the way, since the status...

924f9c16 10/14/2009 11:55 am Iustin Pop

Extend the MoveJob type to hold the instance index

This will be needed in order to generate the proper instance move commands.

Signed-off-by: Iustin Pop <>

66dac8e0 10/12/2009 03:13 pm Iustin Pop

Fix haddock issues with tuple members

It seems that haddock cannot document tuple members - but arguably, once
one needs to do that, tuples should not be used anymore.

This just moves the comments to the tuple comment.

Signed-off-by: Iustin Pop <>...

e97f211e 10/08/2009 01:06 pm Guido Trotter

parseNode: don't lookup values in drained nodes

Currently parseNode skips looking for values in offline nodes, but tries
to read them for drained ones. With this patch we treat offline and
drained nodes in the same way (which is compatible with the iallocator...

a2e90275 10/02/2009 06:54 pm Iustin Pop

Store the instance move in the MoveJobs

This will automatically sort our Ganeti jobs into the independent job
sets, and then we can submit them separately.

92e32d76 10/02/2009 06:48 pm Iustin Pop

Move some more type definitions to Types.hs

6b20875c 10/02/2009 06:37 pm Iustin Pop

Add a function converting Placements into Jobs

This converts from htools-specific Placements into Ganeti standard
OpCodes, which will later allow execution via Luxi.

3173c987 10/02/2009 05:52 pm Iustin Pop

Record the move being performed in a Placement

This will allow a more descriptive output later in the solution list, as
opposed to trying to reconstruct the move from the node indices.

The patch also documents the Placement members.

6583e677 10/02/2009 03:54 pm Iustin Pop

Split the Luxi generic parts from the loader

The Luxi loader implements both a generic Ganeti Luxi client and the
loader; it is better if these two are separated. The patch adds a
Ganeti/Luxi.hs (not under HTools!) since that is generic for Ganeti, and
not related necessarily to htools.

0e8ae201 10/02/2009 02:56 pm Iustin Pop

hbal: Implement grouping of moves into jobsets

Since moving two instances between different node-quadruples (inst X: A,
B → C, D and inst Y: E, F → G, H) can be parallelised by Ganeti, it
makes sense to split the operation list into jobsets whose execution...
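
The grouping idea can be sketched as follows. This is an illustration under assumptions, not the hbal implementation: the `Move` representation and the greedy rule (a move joins the current jobset only if it shares no node with the moves already in it, otherwise it starts a new jobset) are hypothetical.

```haskell
import Data.List (intersect)

-- Hypothetical representation: an instance name plus the nodes its move touches.
type Move = (String, [Int])

-- A jobset holds moves whose node lists are pairwise disjoint.
type JobSet = [Move]

-- True when the move shares no node with any move already in the jobset.
independent :: Move -> JobSet -> Bool
independent (_, ns) = all (\(_, ms) -> null (ns `intersect` ms))

-- Greedy, order-preserving grouping: extend the current jobset when the
-- next move is independent of it, otherwise open a new jobset.
groupMoves :: [Move] -> [JobSet]
groupMoves = reverse . map reverse . foldl step []
  where
    step [] mv = [[mv]]
    step (cur : done) mv
      | independent mv cur = (mv : cur) : done
      | otherwise          = [mv] : cur : done
```

With moves X on nodes 1 to 4, Y on nodes 5 to 8, and Z on nodes 1 and 9, X and Y land in the same jobset and Z starts a second one, because Z shares node 1 with X.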

1cf97474 09/30/2009 03:12 am Iustin Pop

Change ExtLoader to only handle I/O errors

Due to the Control.Exception changes between 6.8 and 6.10, using it
portably is difficult. Since we're only interested in handling I/O
errors, we can use prelude's catch and not have to deal with
Control.Exception at all....

cf924b6d 09/29/2009 04:43 pm Iustin Pop

Brown-paper-bag release fixing haddock issues

Haddock doesn't like pre-processed files (at least not in all versions).
Thus we need to remove the ExtLoader module from the haddock-processed
file list.

fbb95f28 09/28/2009 05:09 pm Iustin Pop

Turn on, and fix, more warnings

The Makefile was intended to be -Wall and not simply -W, but I missed
that. This enables more warnings and also enables -Werror (except for
the tests).

685f5bc6 09/28/2009 05:09 pm Iustin Pop

Brown bag fix: invert a test

During testing I inverted the test to check that it triggers correctly,
and committed the inverted version by mistake. Fixing it.

45ab6a8d 09/28/2009 04:16 pm Iustin Pop

Add support for building without curl

Since curl is not always needed (e.g. when using only the luxi or, less
likely, the file backends) and is also not always available, it is
useful to be able to build without it. This of course disables the RAPI
backend.

This patch changes ExtLoader to build with the ‘-cpp’ option which makes...

e8f89bb6 09/28/2009 03:50 pm Iustin Pop

Split the external data loader out of CLI.hs

Currently the external data loader is in CLI.hs, which makes all
programs that need cli functionality (options, etc.) link against the
network modules (most importantly curl). This patch splits this
functionality into a new module such that (for example) hail which only...

084b2502 09/03/2009 01:58 am Iustin Pop

Fix luxi recvMsg for messages bigger than 4K

This patch fixes a logic bug in luxi that breaks receiving messages
bigger than 4096 bytes. Sending messages is not affected, as it uses a
different algorithm.
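
The commit message does not describe the bug itself, so the following is only a sketch of a receive loop that does not assume a message fits in a single read. It is written as a pure function over a list of already-read chunks; the end-of-message marker `eom` and the function name are assumptions made for the illustration.

```haskell
-- Assumed end-of-message marker; the real protocol constant may differ.
eom :: Char
eom = '\ETX'

-- Accumulate chunks until one contains the marker. Returns the message
-- and whatever followed the marker in that chunk, or Nothing when the
-- chunks run out before the marker is seen.
recvMsg :: [String] -> Maybe (String, String)
recvMsg = go []
  where
    go _ [] = Nothing
    go acc (c : cs) =
      case break (== eom) c of
        (part, _ : rest) -> Just (concat (reverse (part : acc)), rest)
        (part, [])       -> go (part : acc) cs
```

In a real client each chunk would come from a bounded socket read, and the leftover would be kept for the next message.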

cf35a869 09/01/2009 01:54 am Iustin Pop

Test some cases for the cluster score computation

1ae7a904 09/01/2009 01:54 am Iustin Pop

Add some more instance tests

This includes instance text load tests.

f25e5aac 08/30/2009 06:55 pm Iustin Pop

Split the balancing algorithm in two parts

Currently the computation, recursing part and the IO part (progress
updates) of the balancing main function (iterateDepth) are all in the
same function, which makes it hard to test. This patch moves the
decision/computation part (whether to proceed one more round, whether we...

c0501c69 08/26/2009 11:07 am Iustin Pop

Implement support for 'cheap' moves only

This patch adds support for cheap (failover/migrate) operations only in
the balancing algorithm and in the hbal command line options.

This allows a very quick balancing (compared to allowing replace-disks)
which can be useful as a scheduled operation.

633e6bcb 08/26/2009 10:45 am Iustin Pop

Simplify the wrapIO function

This fixes one warning from hlint.

c9926b22 08/26/2009 10:40 am Iustin Pop

Use migrate or failover based on instance state

While we can't guarantee that the instance will be in the same state by
the time the migrate/failover command will be run, we can at least try
to do the right thing assuming no other changes to the cluster state....
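
A small sketch of the selection, assuming that a running instance is migrated and a stopped one is failed over (the commit message only says the choice depends on instance state). The type and function names are hypothetical; the generated strings use the `gnt-instance migrate` and `gnt-instance failover` subcommands without any options.

```haskell
-- The two "cheap" ways of moving an instance to its secondary node.
data MoveOp = Migrate | Failover
  deriving (Eq, Show)

-- Assumption: running instances are migrated, stopped ones failed over.
chooseOp :: Bool -> MoveOp
chooseOp isRunning
  | isRunning = Migrate
  | otherwise = Failover

-- Build the command line for an instance, given its recorded state.
moveCommand :: Bool -> String -> String
moveCommand isRunning inst = unwords ["gnt-instance", verb, inst]
  where
    verb = case chooseOp isRunning of
      Migrate  -> "migrate"
      Failover -> "failover"
```

As the commit notes, the recorded state may be stale by the time the command runs, so this is a best-effort choice.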

f723de38 08/19/2009 02:03 pm Iustin Pop

Improve the error message for command line errors

Instead of using ioError . userError, we format the error ourselves.
This is nicer - no ‘)’ at the end of the output.

b2278348 08/18/2009 07:11 pm Iustin Pop

Add a simulated cluster data loader

This is useful especially for hspace, where we might want to simulate a
hypothetical cluster to check allocation beforehand.

175cc337 07/15/2009 08:22 pm Iustin Pop

CLI: Handle errors better

This patch adds an error handler for any exceptions that are raised
during the external data load phase. This can be improved further, but
it's a good start.

0427285d 07/15/2009 11:31 am Iustin Pop

Unify the command line options and structures

This patch moves all the command line options and their internal
representation into CLI.hs. This means that duplicated options between
any two binaries are no longer declared twice, and that we no longer
need the two *Option classes.

2485487d 07/14/2009 05:15 pm Iustin Pop

Fix a few hlint errors

26d47cf5 07/14/2009 05:00 pm Iustin Pop

CLI: Prevent incompatible options from being selected

This patch makes CLI abort if more than one backend is selected.

8e445e6d 07/14/2009 04:06 pm Iustin Pop

Add support for luxi backend in CLI/hspace/hbal

This patch changes the backend selection method in CLI to prefer, in order:
- a RAPI specification
- a Luxi specification
- and finally the node/instance files

It also modifies hspace and hbal to provide a ‘-L’ command line option...
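
The preference order listed above can be sketched as below. The `Options` record, its field names and the `Backend` type are hypothetical and exist only to make the example self-contained; they are not the CLI.hs definitions.

```haskell
import Control.Applicative ((<|>))

-- Hypothetical description of where cluster data comes from.
data Backend
  = Rapi String                 -- RAPI endpoint
  | Luxi FilePath               -- path to the master daemon socket
  | Files FilePath FilePath     -- node file and instance file
  deriving (Eq, Show)

-- Hypothetical subset of the parsed command line options.
data Options = Options
  { optRapi     :: Maybe String
  , optLuxi     :: Maybe FilePath
  , optNodeFile :: FilePath
  , optInstFile :: FilePath
  }

-- Prefer RAPI, then Luxi, and fall back to the node/instance files.
selectBackend :: Options -> Backend
selectBackend o =
  case (Rapi <$> optRapi o) <|> (Luxi <$> optLuxi o) of
    Just b  -> b
    Nothing -> Files (optNodeFile o) (optInstFile o)
```

Using `<|>` on `Maybe` values keeps the ordering explicit: the first specification that is present wins.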

53ec9022 07/14/2009 03:56 pm Iustin Pop

Initial commit of the luxi backend

This patch adds a luxi backend that allows direct query of the master
daemon on the local node. This patch doesn't enable the backend to be
used.

There are a couple of things still missing in the implementation:
- we don't have a master timeout in reads and writes, only a...

135a6c6a 07/14/2009 03:56 pm Iustin Pop

Introduce timeout in RAPI queries

The patch adds two constants in Types.hs for connect and query timeout,
then modifies Rapi.hs to use them as the connect and general curl
timeout.

Rapi could be improved more, as currently we wait double the total
timeout due to not aborting early in case the node queries failed.

7d11799b 07/09/2009 04:58 pm Iustin Pop

Fix a haddock issue

31e7ac17 07/09/2009 04:16 pm Iustin Pop

hspace: fix failure handling of tryAlloc results

Currently hspace doesn't handle failures from tryAlloc correctly; this
patch changes the iterateDepth function in hspace to return a Result (…)
so that errors can be propagated correctly.

The patch also changes one output key to be more clear and a typo in...

478df686 07/09/2009 03:44 pm Iustin Pop

Change the tryAlloc/tryReloc workflow

Currently, the tryAlloc and tryReloc functions return a list with all the
results, both failures and successes. This is fine for hail, which does
one round of allocations, but is not so good for hspace, which does
iterative rounds; since at each (successful) step we only take the best...

685935f7 07/08/2009 08:30 pm Iustin Pop

Simplify the Cluster.tryAlloc structures

Currently the tryAlloc function calls the
allocateOnSingle/allocateOnPair and then builds a new tuple with those
functions' result plus the new node list. This is however suboptimal
in two respects:
- the new nodes added are the 'old' versions of the respective nodes,...

8880d889 07/08/2009 07:38 pm Iustin Pop

Slight change to the internal allocation results

Currently the Cluster.AllocSolution type is defined as a list of
‘(OpResult Node.List, …)’ and the results for applyMove are defined as
‘(OpResult Node.List, …)’. Both of these mean that the failure/success
indication is hidden in the first element of this tuple, which makes it...

2bbf77cc 07/08/2009 05:34 pm Iustin Pop

hspace: switch output to shell-script format

This (big) patch changes the output of hspace from text-format
(separated by ‘: ’) to a shell-snippet, in ‘key=value’ format.

This will allow sourcing the output or parsing it via awk/sed/etc.
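
As an illustration of the `key=value` format, here is a minimal formatter. The prefix, the upper-casing of keys and the quoting rule are assumptions for this sketch and are not taken from the hspace source.

```haskell
import Data.Char (toUpper)

-- Wrap a value in single quotes, escaping embedded single quotes so the
-- result can be sourced by a POSIX shell.
shellQuote :: String -> String
shellQuote v = "'" ++ concatMap esc v ++ "'"
  where
    esc '\'' = "'\\''"
    esc c    = [c]

-- Render (key, value) pairs as one PREFIX_KEY='value' line each.
formatKV :: String -> [(String, String)] -> String
formatKV prefix = unlines . map line
  where
    line (k, v) = prefix ++ "_" ++ map toUpper k ++ "=" ++ shellQuote v
```

Output in this shape can be loaded with `eval` or `.` in a shell script, or split on the first `=` by awk or sed.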

de4ac2c2 07/08/2009 12:49 pm Iustin Pop

hspace: move instance count and score into CStats

Currently the instance count and cluster score are separated from the
other initial/final phase stats, even though they are very similar. This
patch moves computation of these two into totalResources/CStats and...

79a72ce7 07/07/2009 07:10 pm Iustin Pop

Fix unittests

The recent OpResult and CPU values additions broke unittests.

8c4c6a8a 07/07/2009 12:56 pm Iustin Pop

Export more stats in hspace

This patch changes Cluster.totalResources to compute more resources and
prints them in hspace.

2795466b 07/07/2009 12:20 pm Iustin Pop

Show errors on stderr instead of stdout

Currently many of the exit and warning conditions mistakenly display error
messages on stdout, which makes parsing the output of programs harder. This
patch attempts to fix such occurrences.

16103319 07/07/2009 11:06 am Iustin Pop

Fix score calculation to work with empty clusters

Currently the cluster score calculation includes an offline instance
percentage, expressed as “offline inst / (offline + online inst)”, which
results in NaN for empty clusters. This patch changes the calculation...
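
The NaN is easy to reproduce: for an empty cluster the expression is 0 / 0. The guard shown here (treating an empty cluster as ratio 0) is only one possible fix and is an assumption, since the rest of the commit message is elided; the function names are hypothetical.

```haskell
-- Naive ratio: evaluates 0 / 0 for an empty cluster, which is NaN.
offlineRatioNaive :: Int -> Int -> Double
offlineRatioNaive offline online =
  fromIntegral offline / fromIntegral (offline + online)

-- One possible guard (assumption): an empty cluster has ratio 0.
offlineRatio :: Int -> Int -> Double
offlineRatio offline online
  | total == 0 = 0
  | otherwise  = fromIntegral offline / fromIntegral total
  where
    total = offline + online
```

Guarding at this level keeps NaN from propagating into the overall cluster score, where any comparison against it would be false.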

e6f4f05c 07/07/2009 01:22 am Iustin Pop

Optimize the Utils.stdDev function

This patch optimizes the stdDev function in two respects:
- first, we don't do sum . map which builds an intermediate list, but
instead use a fold over the list to build incrementally the sum;
this should reduce both the time and space characteristics, as we...
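
A sketch of the first point, contrasting a `sum . map` formulation with strict folds. This is an illustration under assumptions rather than the actual Utils.stdDev: it computes the population standard deviation and expects a non-empty list.

```haskell
import Data.List (foldl')

-- Straightforward version: sum . map builds an intermediate list.
stdDevNaive :: [Double] -> Double
stdDevNaive xs = sqrt (sum (map sqDev xs) / n)
  where
    n = fromIntegral (length xs)
    mean = sum xs / n
    sqDev x = (x - mean) * (x - mean)

-- Fold-based version: one strict fold yields the sum and the length,
-- a second accumulates the squared deviations without a temporary list.
stdDevFold :: [Double] -> Double
stdDevFold xs = sqrt (sqSum / n)
  where
    (total, n) = foldl' sumLen (0, 0) xs
    sumLen (s, c) x = let s' = s + x; c' = c + 1
                      in s' `seq` c' `seq` (s', c')
    mean = total / n
    sqSum = foldl' (\acc x -> acc + (x - mean) * (x - mean)) 0 xs
```

The `seq` calls force both accumulator fields at each step, so the fold does not build a chain of unevaluated additions.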

d71d0a1d 07/07/2009 12:24 am Iustin Pop

Take the foldl out of Loader.fixNodes

Currently Loader.fixNodes is foldl' with a complicated function. It
makes more sense to take foldl' out of this function (and put it into
the caller) and let fixNodes be only this internal function.

41c3b292 07/07/2009 12:13 am Iustin Pop

Simplify Cluster.computeMoves

This patch changes the function Cluster.computeMoves to use guards and a
couple of subexpressions in order to greatly simplify it.

9f6dcdea 07/06/2009 11:50 pm Iustin Pop

Fix hlint-generated warnings

This big patch cleans up the code per hlint indications. Many removals
of extra parentheses, replacements of concat . map with concatMap,
extra dollar signs, eta reductions, etc. were performed.

The code still compiles and passes a couple of manual tests on sample...

44763b51 07/05/2009 06:56 pm Iustin Pop

Add computation of the failure reason in hspace

This patch enhances hspace to report why the allocation sequence
stopped, both in absolute error count and for the top reason.

c43c3354 07/05/2009 06:42 pm Iustin Pop

Return correct failure data from Node.add*

This patch alters the Node.addPri/addSec to return correct failure data.
It removes the computeFailN1 function from the module as that used to
combine both mem and disk checks in the same function and thus the real...

f2280553 07/05/2009 03:53 pm Iustin Pop

Introduce a new type for allocation results

Currently the allocation/move operations workflow returns ‘Maybe a’,
which is very convenient but loses all details about the failure mode.

This patch introduces a new data type which encodes the specific failure...
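
A sketch of what such a type can look like. The failure constructors, the `Node` record and `addInstance` are hypothetical; only the general shape (a result that carries the failure reason instead of collapsing it to `Nothing`) follows the description above.

```haskell
-- Hypothetical failure reasons.
data FailMode = FailMem | FailDisk
  deriving (Eq, Show)

-- Like Maybe, but a failure records why it happened.
data OpResult a = OpFail FailMode | OpGood a
  deriving (Eq, Show)

instance Functor OpResult where
  fmap _ (OpFail f) = OpFail f
  fmap g (OpGood x) = OpGood (g x)

instance Applicative OpResult where
  pure = OpGood
  OpFail f <*> _ = OpFail f
  OpGood g <*> r = fmap g r

instance Monad OpResult where
  OpFail f >>= _ = OpFail f
  OpGood x >>= k = k x

-- Hypothetical node with free memory and free disk.
data Node = Node { fMem :: Int, fDsk :: Int }
  deriving (Eq, Show)

-- Place an instance needing mem/dsk, reporting the first limit exceeded.
addInstance :: Int -> Int -> Node -> OpResult Node
addInstance mem dsk n
  | mem > fMem n = OpFail FailMem
  | dsk > fDsk n = OpFail FailDisk
  | otherwise    = OpGood n { fMem = fMem n - mem, fDsk = fDsk n - dsk }
```

Because the type is a monad, steps chain as they would with `Maybe`, for example `addInstance 2 10 node >>= addInstance 4 10`, and the first failure reason is preserved.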

266aea94 07/05/2009 03:21 pm Iustin Pop

Remove hn1 and related code

hn1 was deprecated for a while and this patch removes it altogether. The
support code in Cluster.hs is also removed.

301789f4 07/03/2009 10:01 pm Iustin Pop

Fix totalResources avail disk computation

This uses the newly-added Node.availDisk to compute the actual available
disk correctly, and display the total allocatable disk in hspace.

fe3d6f02 07/03/2009 10:01 pm Iustin Pop

Add an availDisk node function

This function returns the amount of available disk, which depends on
whether a low disk limit has been configured or not and on the free disk
space of the node.
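
A sketch of such a function, assuming the limit is modelled as an optional value and that free space below the limit counts as zero available. The record and its field types are hypothetical.

```haskell
-- Hypothetical node: free disk and an optional low-disk limit.
data Node = Node
  { fDsk  :: Int        -- free disk space
  , loDsk :: Maybe Int  -- minimum disk to keep free, if configured
  }

-- Disk that may still be allocated on the node.
availDisk :: Node -> Int
availDisk n = case loDsk n of
  Nothing -> fDsk n                 -- no limit: all free disk is usable
  Just lo -> max 0 (fDsk n - lo)    -- keep 'lo' in reserve, never negative
```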

836533fa 07/03/2009 10:00 pm Iustin Pop

Add two new autocomputed vars to Nodes

Currently we track the max disk usage/max vcpus as percentages, however
sometimes it's easier to check against minimum free disk or maximum
number of cpus, as units instead of percentages.

This patch adds two new variables, lo_dsk, hi_cpu, which are recomputed...
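
The conversion from ratios to units can be sketched as follows. The function names, argument types and the rounding (`floor`) are assumptions; the commit message only states that the two values are derived from the percentage limits.

```haskell
-- Minimum free disk in units, from a minimum-free ratio and total disk.
computeLoDsk :: Double -> Double -> Int
computeLoDsk minFreeRatio totalDisk = floor (minFreeRatio * totalDisk)

-- Maximum virtual cpus in units, from the allowed virtual:physical
-- ratio and the number of physical cpus.
computeHiCpu :: Double -> Double -> Int
computeHiCpu maxCpuRatio physCpus = floor (maxCpuRatio * physCpus)
```

Precomputing these once per node turns each later check into a plain integer comparison instead of a division.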

1a7eff0e 07/03/2009 12:50 am Iustin Pop

Add a new type for cluster statistics

Currently totalResources returns a 5-tuple of integers. This is not easy
to handle, as each change on the return type means that each caller must
be updated.

This patch adds a new type for cluster stats and uses that instead as...

e2af3156 07/02/2009 01:33 pm Iustin Pop

Add display of more stats in hspace

This patch changes Cluster.totalResources to compute more details about
the cluster status, and enhances hspace to display more of these.

0c936d24 06/16/2009 12:52 pm Iustin Pop

Fix a haddock/docstring issue

18b6444b 06/12/2009 03:16 am Iustin Pop

Implement cpu/disk limits in instance moves

We modify Node.addPri/addSec to take into account the limits on instance
adds.

844eff86 06/12/2009 02:56 am Iustin Pop

Add two new node attributes

Two new min disk free ratio and max cpu usage attributes are added to the
nodes. These will be used in the future to restrict allocation.

c6484f0b 06/12/2009 02:29 am Iustin Pop

Fix 'unused X' warnings

This removes some unused functions and imports to cleanup the warnings.

78694255 06/12/2009 02:22 am Iustin Pop

Fix the various monomorphism warnings

In a few places (e.g. tryRead or any printf call) it's a little bit hard
to add the correct type signatures, but in the end it is possible to fix
these warnings (which can bite one in subtle cases).

3c64b5aa 06/12/2009 01:12 am Iustin Pop

Small changes to the node list output

This is just some cleanup of the node list output, adding pcpu/vcpu
counters, and making the display slightly nicer.

0a8dd21d 06/11/2009 12:17 am Iustin Pop

Add cpu ratio to cluster calculation

f1e64aba 06/10/2009 11:37 pm Iustin Pop

Update cpu counters correctly after pinst changes

The cpu counters are updated on primary instance adds/removes.

1a82215d 06/10/2009 11:29 pm Iustin Pop

Add cpu-count-related attributes to nodes

This patch adds cpu-count related attributes to nodes:
- total cpus
- cpus in use
- ratio of virtual:physical cpus

We also correctly set the cpu values at load time, but we don't do
anything yet while moving instances around. The cpu ratio is shown in...

d752eb39 06/10/2009 10:42 pm Iustin Pop

Add a new vcpus attribute to instances

This patch adds reading of vcpu count for instances, in preparation for
using the vcpu ratio in cluster scoring.

734b1ff1 06/10/2009 10:31 pm Iustin Pop

Fix reading of total disk space in iallocator

IAllocator currently uses a wrong key name for reading the total disk
space (‘disk_usage’ which was copied from RAPI, but the actual
iallocator key is ‘disk_space_total’).

This patch fixes that and also makes iallocator always use this key,...

70db354e 06/04/2009 04:32 pm Iustin Pop

Fix the ReplacePrimary instance move

During a replace-primary instance move, on the real cluster the instance
is temporarily started on the secondary, and as such we must check that
the secondary node can hold it for this duration. Currently the code
does not, and depending on cluster scoring it will put instances on such...

9dcec001 06/01/2009 04:48 pm Iustin Pop

Rework the tryAlloc/tryReloc functions

Currently tryAlloc/tryReloc do not return the new instance, as this is
not needed for IAllocator alloc/reloc requests. However, for computing
the space, the new instance is useful, so we modify these functions to
return this information too....

a80bf544 06/01/2009 04:48 pm Iustin Pop

Add a utility function for triples

903a7d46 06/01/2009 02:44 pm Iustin Pop

Small doc change

And an alignment issue.

9b1e1cc9 06/01/2009 01:25 pm Iustin Pop

Ensure consistent naming of the tools

This patch makes sure that all references to the name of the software are
ganeti-htools, not simply htools.

e2fa2baf 06/01/2009 12:55 pm Iustin Pop

Add copyright/license information

This doc-patch adds copyright and license information to (hopefully) all
needed files.

7dd5ee6c 06/01/2009 12:24 pm Iustin Pop

tests: move the test declaration in QC.hs

This patch moves the test declaration into QC.hs, so that test.hs has to
be modified only when we add a new test category.

0991ed70 06/01/2009 12:18 pm Iustin Pop

Small whitespace change

dbba5246 06/01/2009 12:18 pm Iustin Pop

Move some alloc functions from hail into Cluster

These are generic enough to be used from multiple places; they belong
better in Cluster.hs than in the hail source.

19f38ee8 06/01/2009 12:18 pm Iustin Pop

Move the RqType and Request types to Loader.hs

These two will be more generic than now, and belong somewhere else -
Loader.hs is a generic module for data loading, thus we move them there.

d85a0a0f 06/01/2009 12:18 pm Iustin Pop

Cleanup an old function

Also replace a type with its synonym.

9188aeef 06/01/2009 12:18 pm Iustin Pop

Lots of documentation updates

This patch only makes doc build changes, doc changes, and moves functions
around (for more logical documentation). It should have no impact at all
on the code.

095d7ac0 06/01/2009 12:18 pm Iustin Pop

A simple test for Container.addTwo

7bc82927 06/01/2009 12:17 pm Iustin Pop

Add some very trivial Instance tests

This is more of an exercise in QuickCheck than strong testing.

9cf4267a 06/01/2009 12:14 pm Iustin Pop

Finish removal of unused params from PeerMap

This completes the removal started earlier by removing the need to
pass the number of nodes to Node.buildPeers, which is now unused.

15f4c8ca 06/01/2009 12:14 pm Iustin Pop

Add test infrastructure and initial tests

This patch adds a QuickCheck-based test infrastructure and initial tests
based on it. The PeerMap module has a 100% coverage ☺

Side-note: one has to read the source of QuickCheck to see how to use it
(especially the Batch submodule), the docs are not enough…

71e13e48 05/27/2009 11:50 pm Iustin Pop

Some cleanup of the PeerMap module

This patch removes some unused functions and does some cleanup of the
remaining ones.

17c59f4b 05/27/2009 11:09 pm Iustin Pop

Remove unused parameters from PeerMap creation

We remove some unused arguments (added way back for compatibility with
Arrays, which we didn't use in the end). This makes the code clearer
(and doesn't need the Ndx type to be an instance of Num).