root / Ganeti @ 2cae47e9

# Date Author Comment
2cae47e9 07/22/2010 01:42 am Iustin Pop

Allow balancing moves to introduce N+1 errors

This patch switches the applyMove function to the extended versions of
Node.addPri and addSec, and passes the override flag based on the state
of the node that we're moving away from.

3e3c9393 07/22/2010 01:42 am Iustin Pop

Introduce a relaxed add instance mode

In case an instance is living on an offline node, it doesn't make sense
to refuse moving it because that would create N+1 failures; failing N+1
is still much better than not running at all. Similarly, if the
secondary node of an instance is offline, meaning the instance doesn't...

2849670b 07/19/2010 02:20 pm Iustin Pop

Remove obsolete Container.maxNameLen

This was only used in one place (hbal), and is made obsolete by the change
to the dual name/alias structure.

14c972c7 07/19/2010 02:20 pm Iustin Pop

hbal: print short names in steps list

This was a regression from the name handling changes, as we started
using the original names for the solution list (which is not designed
for parsing/feeding back into ganeti).

fb33aaaf 07/19/2010 02:20 pm Iustin Pop

Remove an obsolete function

printSolution is no longer used, as we print the solution iteratively
now.

6dfa04fd 07/19/2010 12:13 am Iustin Pop

Allow '+' in node list fields

When the field list is prefixed with a plus sign, this will extend the
default field list, instead of replacing it entirely.
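
The '+' prefix rule can be sketched as below. This is an illustration only, not the code from the patch: `defaultFields`, `splitOnComma`, `nonEmpty` and `selectFields` are hypothetical names, the default list is invented, and comma-separated field names are assumed.

```haskell
-- Hypothetical default field list; the real defaults are not shown here.
defaultFields :: [String]
defaultFields = ["name", "pcnt", "scnt"]

-- Split a comma-separated string into its components.
splitOnComma :: String -> [String]
splitOnComma s =
  case break (== ',') s of
    (x, [])       -> [x]
    (x, _ : rest) -> x : splitOnComma rest

-- Drop empty field names (e.g. from a bare "+" or a trailing comma).
nonEmpty :: [String] -> [String]
nonEmpty = filter (not . null)

-- A leading '+' extends the defaults; anything else replaces them.
selectFields :: String -> [String]
selectFields ('+' : extra) = defaultFields ++ nonEmpty (splitOnComma extra)
selectFields spec          = nonEmpty (splitOnComma spec)
```

With these assumptions, `selectFields "+peermap"` yields the defaults followed by `peermap`, while `selectFields "name,peermap"` yields only the two named fields.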

16f08e82 07/19/2010 12:13 am Iustin Pop

Update the node list fields

This patch renames the pri/sec to pcnt/scnt, and adds the real primary
and secondary instance lists, the peermap and the index of a node as
selectable options.

124b7cd7 07/19/2010 12:13 am Iustin Pop

Cleanup a node's peer map when possible

If the last secondary instance of a peer is deleted (detected by the new
peer memory value being equal to zero), then the pair (pdx, 0) should be
deleted completely. This is not an optimization per se, but rather cleanup...
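
A minimal sketch of the cleanup rule, assuming the peer map is a plain association list from peer index to reserved memory. The `PeerMap` representation and the `setPeer` name are assumptions made for illustration, not the actual Node.hs definitions.

```haskell
-- Hypothetical peer map: (peer node index, memory reserved for that peer).
type PeerMap = [(Int, Int)]

-- Store the new memory value for a peer. A value of zero removes the
-- entry completely instead of leaving a (pdx, 0) pair behind.
setPeer :: Int -> Int -> PeerMap -> PeerMap
setPeer pdx mem pmap
  | mem == 0  = others
  | otherwise = (pdx, mem) : others
  where
    -- every entry that belongs to a different peer
    others = filter ((/= pdx) . fst) pmap
```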

95446d7a 06/21/2010 12:12 pm Iustin Pop

Fix another haddock special-char issue

db079755 06/21/2010 05:59 am Iustin Pop

Remove JOB_STATUS_GONE and add unittests

… for the serialization/deserialization of the job and opcode status.

Job status 'gone' was not actually used. It can be reintroduced if
needed.

41065165 06/21/2010 05:46 am Iustin Pop

Add opcode status constants/type

These mirror, again, the Ganeti constants, and are added for future use.

7e98f782 06/21/2010 05:46 am Iustin Pop

Rename the job status constants

The rename is done such that we match Ganeti's own constants.

95f490de 06/08/2010 03:48 am Iustin Pop

Optimise the Luxi.recvMsg function

Since the current buffer cannot contain (during network reads) an EOM,
we should look for the EOM only in the newly-received string. While
this shouldn't make much difference, in some tests it cuts the recvMsg
total time by around half....
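
The idea can be sketched as a pure function. Everything here is an assumption for illustration: the `RecvState` type, the `feedChunk` name, the use of `String` for the buffer, and a single-character end-of-message marker. The already-buffered data is known to be EOM-free, so only the new chunk is scanned.

```haskell
-- Assumed end-of-message marker (a single character in this sketch).
eom :: Char
eom = '\3'

-- Result of feeding one chunk: either more data is needed, or a full
-- message is available together with the bytes that followed the EOM.
data RecvState
  = NeedMore String          -- accumulated buffer, still without EOM
  | Complete String String   -- full message, leftover data
  deriving (Eq, Show)

-- Scan only the newly received chunk for the EOM; the old buffer is
-- appended unchanged and never searched again.
feedChunk :: String -> String -> RecvState
feedChunk buf chunk =
  case break (== eom) chunk of
    (_, [])            -> NeedMore (buf ++ chunk)
    (before, _ : rest) -> Complete (buf ++ before) rest
```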

04282772 06/08/2010 01:09 am Iustin Pop

Complete the client Luxi implementation

All current Luxi calls are supported after this patch. A bug in
ArchiveJob is also fixed (Ganeti's job IDs are strings).

9622919d 06/08/2010 12:35 am Iustin Pop

Add support for more LUXI calls

While not all are directly useful, having them will open some possibilities
(e.g. polling for job changes in hbal's -X mode, and auto-archiving the
jobs once they are successful).

4a007641 06/03/2010 12:08 am Iustin Pop

Fix some lint errors in the unit tests

683b1ca7 06/02/2010 11:55 pm Iustin Pop

Change the Luxi operations structure

Currently, we define the LuxiOp type as a simple enumeration, and leave
the arguments structure to the users of the Ganeti.Luxi module. This is
suboptimal for a couple of reasons: first, we decouple the operation
type from operation arguments, and that means we don't use the type...

9c0a748f 06/01/2010 11:51 pm Iustin Pop

Fix a warning in Loader tests

Incomplete pattern match…

c088674b 06/01/2010 08:54 pm Iustin Pop

Add a few Loader tests

These are not comprehensive, but at least we have a start.

9e35522c 05/28/2010 12:13 pm Iustin Pop

Reduce the warnings during the unittests

Since the unittests are not 'clean' from the p.o.v. of type
declarations, and cannot be made clean in all respects (e.g. orphan
instances), we silence some warnings for the test target, to have a
cleaner output.

88f25dd0 05/28/2010 12:25 am Iustin Pop

Introduce OpCode unittests

f36a8028 05/28/2010 12:00 am Iustin Pop

Introduce support for optional keys in JObjects

Some keys are optional in the Ganeti opcodes (e.g. ‘node’ in the
OpReplaceDisks), and as such we need to transform them into a Maybe value,
instead of failing.

The patch reworks a bit fromObj and adds maybeFromObj which parses such...
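
The distinction between required and optional keys can be sketched with a toy object type. The `JObject` alias and the `requiredKey`/`optionalKey` names are hypothetical; they only mirror the spirit of fromObj and maybeFromObj and do not reproduce their real signatures or the JSON library types.

```haskell
-- Toy stand-in for a parsed JSON object: a list of key/value pairs.
type JObject = [(String, String)]

-- Required key: a missing key is an error that names the key.
requiredKey :: String -> JObject -> Either String String
requiredKey k o =
  maybe (Left ("key '" ++ k ++ "' not found")) Right (lookup k o)

-- Optional key: a missing key becomes Nothing instead of a failure.
optionalKey :: String -> JObject -> Maybe String
optionalKey = lookup
```

For an object that has an instance name but no 'node' entry, `optionalKey "node"` returns `Nothing`, whereas `requiredKey "node"` returns a `Left` naming the missing key.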

c96d44df 05/27/2010 11:37 pm Iustin Pop

Replace fromJResult with annotateJResult

This patch replaces all old uses of fromJResult with the annotated
version, and removes the non-annotated version. All JSON parsing points
should now have annotated errors.

c8b662f1 05/27/2010 11:32 pm Iustin Pop

Add annotations to loadJSArray

This allows, for example, the RAPI backend to detail which information
(instance or node data) fails to parse.

50d26669 05/27/2010 11:23 pm Iustin Pop

Change fromObj error messages

Currently fromObj doesn't detail what we're trying to read, which can
lead to cryptic messages: "Cannot read Int". The patch changes this
function to annotate the error messages with the key/value we're trying
to convert, by using a new version of fromJResult....

82ea2874 05/27/2010 01:11 am Iustin Pop

A few more small Node unit-tests

39d11971 05/25/2010 08:17 pm Iustin Pop

Add more unittests

Instance, Node and Text modules have improved coverage.

3fea6959 05/20/2010 07:45 pm Iustin Pop

Add more unit tests for allocation/balance

The patch adds some simple unit-tests for both the allocation function
(we can allocate small instances on an empty cluster, we can allocate in
tiered mode starting from any size) and the balancing functions (one...

3ce8009a 05/20/2010 01:31 pm Iustin Pop

Move two functions from hspace to Cluster.hs

This is done so we can test a longer pipeline.

8423f76b 05/20/2010 01:31 pm Iustin Pop

Make CStats an instance of Show

This helps debugging via ghci.

381be58a 05/20/2010 12:19 pm Iustin Pop

Another haddock fix…

c854092b 05/20/2010 12:07 pm Iustin Pop

Accept both full and short names in CLI

This patch introduces some new functionality in the base Element type
and in Container which supports searching for all 'known' names of an
element, such that both short and full names are accepted for various
options like '-O' and '--excluded-instances'.
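
A minimal sketch of searching by every known name. The `Element` record, `allNames` and `findByName` are hypothetical stand-ins for the real Element class and Container functions, and the host names in the usage note are made up.

```haskell
import Data.List (find)

-- Hypothetical element with a full name and a shortened display alias.
data Element = Element
  { name  :: String
  , alias :: String
  } deriving (Eq, Show)

-- Every name under which an element is known.
allNames :: Element -> [String]
allNames e = [name e, alias e]

-- Find the first element known under the given name, full or short.
findByName :: String -> [Element] -> Maybe Element
findByName n = find (elem n . allNames)
```

Given an element named `node1.example.com` with alias `node1`, both strings resolve to the same element, and an unknown name yields `Nothing`.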

3e4480e0 05/20/2010 12:07 pm Iustin Pop

Stop modifying names for internal computations

Currently the name used internally is modified and holds the shortened
name of the nodes/instances. This has caused issues before, since we
always have to strip the suffix from input data and reapply it if we...

8bcdde0c 05/20/2010 12:07 pm Iustin Pop

Add a new node/instance field

This new field ('alias') will hold the shortened/beautified display
name. When resetting the name, the alias is reset too, and there's a new
function to update only the alias.

49f9627a 05/20/2010 12:07 pm Iustin Pop

Change some test constants

First, we reduce the max size of the disks, since Int on 32bits will
overflow for big simulated clusters. This is a real issue, that will
need fixing in real life, but for now we just "silence" this test.

Second, we increase the amount of time a test is allowed to run,...

3ed46bb7 05/19/2010 04:28 pm Iustin Pop

Fix some haddock comments

8fcf251f 05/19/2010 04:09 pm Iustin Pop

Add more unit tests

This increases the overall coverage by 5%-10% (depending on coverage
type). Some modules are still not unittested at all, as HUnit is a
better choice for them.

1e3dccc8 05/19/2010 04:08 pm Iustin Pop

Shuffle some constants around

… and export more functions. This will help with unit testing.

f4c0b8c5 05/18/2010 07:31 pm Iustin Pop

Remove the noLimit values and always use limits

This patch moves away from allowing no-limits for disk/cpu ratios, and always
uses a real limit. For disk, it's simple since we use 0, which means no
reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,...

e2436511 05/04/2010 02:42 pm Iustin Pop

Fix hspace's KM metrics

We returned the KM_POOL_* metrics as the final state, not as the delta
between the final and the initial state.

e87a419f 04/15/2010 05:16 pm Iustin Pop

Fix Node hiCpu computation

In case we're not enabling limits, let's restrict this to -1, instead of
-1 times the number of pcpus.

9b8fac3d 04/15/2010 12:50 pm Iustin Pop

Add a new function to compute allocation deltas

Given two cluster states, the new function can answer the following
questions:

- how much resources currently allocated
- how much resources finally allocated (delta from above is how much we
can actually allocate on the cluster)...

86ecce4a 04/15/2010 12:27 pm Iustin Pop

Introduce total vcpu tracking in CStats

We add a new field that tracks the available virtual cpus (expressed as
node cpus times the vcpu ratio).

bfefb674 04/14/2010 03:44 pm Iustin Pop

Merge branch 'master' into next

  • master:
    Fix iallocator crash when no solutions exist
    Fix IAllocator multi-evacuate message

57587760 03/31/2010 12:54 pm Iustin Pop

Fix iallocator crash when no solutions exist

Commit 5436576 added an un-guarded `head' call, which crashes with
“Prelude.head: empty list” when no results exist for the per-instance
allocation/relocation calls.

This patch fixes this, and also adds another check for an unguarded...
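
The general shape of guarding a `head` call can be sketched as follows. `firstSolution` and its error text are hypothetical, not the function changed by the patch; the point is only that an empty result list becomes a reportable error instead of a runtime crash.

```haskell
-- Return the first solution, or an error message when there is none.
-- Pattern matching covers the empty list, which a bare `head` does not.
firstSolution :: [a] -> Either String a
firstSolution []      = Left "no valid solutions found"
firstSolution (x : _) = Right x
```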

934c62dc 03/31/2010 12:51 pm Iustin Pop

Fix IAllocator multi-evacuate message

Since Ganeti passes full host names (not common-suffix-stripped), we
need to remove the suffix from the evac_nodes keys too. In case one node
is not part of the cluster, it will lead to a wrong error message, but
for now it fixes the problem.

e41f4ba0 03/09/2010 04:40 pm Iustin Pop

Fix iallocator crash when no solutions exist

Commit 5436576 added an un-guarded `head' call, which crashes with
“Prelude.head: empty list” when no results exist for the per-instance
allocation/relocation calls.

This patch fixes this, and also adds another check for an unguarded...

be811997 02/26/2010 03:42 pm Iustin Pop

Fix a haddock comment issue

For some versions of haddock, this can create problems.

a46f34d7 02/25/2010 03:47 pm Iustin Pop

Abstract instance running states into a list

This replaces some manual checks in a few places in the code with a
single list defined once.

5182e970 02/25/2010 03:39 pm Iustin Pop

A number of small fixes from hlint

c939b58e 02/25/2010 02:35 pm Iustin Pop

Fix unused-do-binds for ghc 6.12

GHC 6.12 has some new warnings, which are valid in most cases except
(IMHO) printf usage.

0903280b 02/25/2010 02:34 pm Iustin Pop

Fix unused imports for ghc 6.12

GHC 6.12 has become more picky about unused imports, so we need to
remove/tighten some of them.

ba9349b8 02/23/2010 07:10 pm Iustin Pop

hscan: implement LUXI backend scanning

This allows hscan to work also with NO_CURL (but only for the local
machine, of course).

5ab2b771 02/23/2010 02:53 pm Iustin Pop

Loader: abort for unknown to-be-excluded instances

c424cdc8 02/23/2010 02:13 pm Iustin Pop

balance function: use the movable flag directly

Instead of deciding based on secondary node, use the new flag.

39f979b8 02/23/2010 02:09 pm Iustin Pop

Update the loader pipeline to set the movable flag

This updates the movable flag on instances if they have only one node
(we don't rely on OpMoveInstance) or if they are set so via the command
line options.

This doesn't yet enable the use of the new flag.

a182df55 02/23/2010 01:56 pm Iustin Pop

Add a 'movable' flag on instances

This will be used instead of checking for no secondary and for
simplifying 'do not touch' instances.

10f396e1 02/23/2010 11:40 am Iustin Pop

Add an option for excluding instances from moves

54365762 02/22/2010 04:19 pm Iustin Pop

Implement IAllocator node evacuate request

This patch adds the new request loading/execution (trivial), but the
actual response formatting becomes more difficult as now the response
type differs by request.

Signed-off-by: Iustin Pop <>

12b0511d 02/22/2010 04:19 pm Iustin Pop

Add a tryEvac function

This will be used by the node evacuate IAllocator request type.

Signed-off-by: Iustin Pop <>

1fe81531 02/22/2010 04:19 pm Iustin Pop

Move a type declaration to Node.hs

We'll need AllocElement in both Cluster and IAlloc in the future, so we
move it to Node.hs which is imported by both.

Signed-off-by: Iustin Pop <>

23f9ab76 02/22/2010 04:19 pm Iustin Pop

Change an internal type from Maybe to list

In preparation for multiple responses, we change from Maybe to List
(both used in the container sense).

This allows us to keep the same workflow for all kinds of requests.

Signed-off-by: Iustin Pop <>

20c891d0 02/22/2010 04:19 pm Iustin Pop

IAllocator: move some keys into per-request data

Since not all structures will have these keys in the future, we move
them into per-structure keys.

Signed-off-by: Iustin Pop <>

2e28ac32 02/22/2010 03:50 pm Iustin Pop

Implement evacuation mode in hbal

This mode restricts the list of instances to be moved to the instances
living on the offline (and drained) nodes.

Signed-off-by: Iustin Pop <>

f0f21ec4 02/22/2010 03:50 pm Iustin Pop

Add an evac mode CLI option

Signed-off-by: Iustin Pop <>

df18fdfe 02/22/2010 03:50 pm Iustin Pop

Reorder options in CLI.hs

This should be no code change, just reordering of the options.

Signed-off-by: Iustin Pop <>

146b37eb 02/03/2010 01:29 pm Iustin Pop

Fix secondary node selection for existing N+1

In case a secondary node is already N+1 failed, currently the node
selection will accept a node that cannot start (at all) the new instance
as valid. This is wrong, so we add a new simple check to prevent the...

a4a6e623 02/03/2010 10:24 am Iustin Pop

Rewrite the node add checks for simpler layout

This will make it clearer than many if…then choices.

a804261a 01/14/2010 06:38 pm Iustin Pop

Move instance relocation test upper in the chain

Currently we test each instance for relocation in checkMove; however, it
is a little more clear if we pass only the relocatable instances to
checkMove. The patch also slightly rewrites (indentation/style) the...

5ad86777 01/14/2010 06:05 pm Iustin Pop

Split the balancing function in two parts

Currently in the balancing function we do two things:

- decide whether to do a new balancing round or not
- actually compute the balancing round

This is not nice, as the two parts are conceptually separate, so this...

71e635f3 01/12/2010 12:18 pm René Nussbaumer

Fixing a typo in option description

Signed-off-by: René Nussbaumer <>
Reviewed-by: Michael Hanselmann <>
Signed-off-by: Iustin Pop <>

16c2369c 01/07/2010 01:57 pm Iustin Pop

Switch the text file format to single-file

This patch changes from the two separate files to a single file, with
sections separated by a blank line. Currently only the node and instance
data is accepted, later the cluster tags will be read too via this
format....
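
Splitting a single file into blank-line-separated sections can be sketched as below. `splitSections` and `parseSections` are hypothetical names, and the sketch accepts exactly two sections (node data, then instance data) without interpreting the lines themselves.

```haskell
-- Split a list of lines into sections, using empty lines as separators.
splitSections :: [String] -> [[String]]
splitSections ls =
  case break null ls of
    (sec, [])       -> [sec]
    (sec, _ : rest) -> sec : splitSections rest

-- Expect exactly two sections: node lines, then instance lines.
parseSections :: String -> Either String ([String], [String])
parseSections txt =
  case splitSections (lines txt) of
    [nodes, instances] -> Right (nodes, instances)
    secs -> Left ("expected 2 sections, got " ++ show (length secs))
```

Note that in this sketch a trailing blank line produces an extra empty section and is therefore rejected; a real loader may choose to be more lenient.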

f5197d89 01/07/2010 12:44 pm Iustin Pop

Change the signatures of the text loader slightly

This is in preparation for the text format changes.

0ccaab44 12/28/2009 12:09 pm Iustin Pop

Fix small typo

This was found, of all things, via lintian during the Debian packaging…

0c860cff 12/11/2009 07:01 pm Iustin Pop

Convert n1_score metric from % to count

This increases the priority of fixing N+1 failures compared to balancing
metrics.

8ce618f3 12/11/2009 06:54 pm Iustin Pop

Merge branch 'master' into next

  • master:
    Use the oper_ram field if available
    rapi, luxi: treat drained nodes as offline

673f0f00 12/11/2009 06:47 pm Iustin Pop

Metric: count of primary instances/offline nodes

This helps with evacuation/failover of instances on 2-node clusters with
one node offline.

e4d31268 12/11/2009 06:43 pm Iustin Pop

Offline instance metric: change from % to count

Currently we use the offline instance percentage (with range [0, 1]),
but this is not good, since we want the evacuation of such instances to
have a high priority; therefore we change this to a count of offline...

6402a260 12/11/2009 06:17 pm Iustin Pop

Use the oper_ram field if available

For the RAPI and LUXI backends, we can get the actual memory usage (if
instances are running) via the oper_ram, whereas backend/memory only
tells what the instance will use at the next boot.

Not using oper_ram means that the node model is flawed and we consider...
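
The fallback described above amounts to preferring the runtime value when it exists. In this sketch `effectiveMem` is a hypothetical helper, and the runtime value is modelled as a `Maybe` that is `Nothing` when oper_ram is unavailable (for example, when the instance is not running).

```haskell
import Data.Maybe (fromMaybe)

-- Use the runtime memory (oper_ram) when available; otherwise fall back
-- to the configured backend memory.
effectiveMem :: Maybe Int  -- oper_ram, if reported
             -> Int        -- configured backend memory
             -> Int
effectiveMem operRam beMem = fromMaybe beMem operRam
```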

b45222ce 12/09/2009 12:30 pm Iustin Pop

rapi, luxi: treat drained nodes as offline

Commit e97f211 changed the iallocator backend to handle drained nodes as
offline. This commit completes that change by making the rapi and luxi
backend do the same (the text backend ignores any '?' values which are...

1cea2e1e 12/02/2009 04:58 pm Iustin Pop

Fix typo breaking LUXI backend

This really shows the need for actual dist-time full testing (not
unittests).

434c15d5 12/02/2009 12:57 pm Iustin Pop

Fix unittests after instance tags addition

f5e67f55 12/01/2009 02:49 pm Iustin Pop

Configure exclusion tags via the cluster tags

This patch adds reading of the exclusion tags from the cluster tags: any
tags starting with htools:iextags: will convert their suffix into an
exclusion tags prefix. In other words, "htools:iextags:service" will...
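
Extracting the exclusion-tag prefixes from the cluster tags can be sketched with `stripPrefix`. The `exTagsPrefix` and `extractExTags` names are assumptions for illustration; only the `htools:iextags:` prefix itself comes from the commit message.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Cluster tag prefix that marks an exclusion-tag definition.
exTagsPrefix :: String
exTagsPrefix = "htools:iextags:"

-- Keep only the tags that start with the prefix and return their
-- suffixes; all other cluster tags are ignored.
extractExTags :: [String] -> [String]
extractExTags = mapMaybe (stripPrefix exTagsPrefix)
```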

669ea132 12/01/2009 01:24 pm Iustin Pop

Read cluster tags in the IAllocator backend

f89235f1 12/01/2009 12:47 pm Iustin Pop

Read cluster tags in the LUXI backend

ea017cbc 12/01/2009 11:53 am Iustin Pop

Read cluster tags in the RAPI backend

This also shows them in hbal in verbose mode.

94e05c32 11/27/2009 05:13 pm Iustin Pop

Introduce support for reading the cluster tags

While these are not actually populated from the backends, and all the
programs ignore them, this patch contains the changes in the function
types required.

185297fa 11/17/2009 11:44 am Iustin Pop

Collapse the statistical functions into one

This allows us to get rid of two duplicate list length computations,
with a minor speedup.

e27eb8ab 11/17/2009 11:44 am Iustin Pop

Specialize the math functions

The statistics functions are currently defined as polymorphic with a
Floating constraint. Changing this to monomorphic on Double type makes
them stricter and much more performant (~70% speedup). This is a cheap
way to recoup some of the losses incurred by the recent proliferation of...
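
The difference between the two styles can be sketched with a population standard deviation. Both function names are hypothetical and the bodies are not taken from the patch; the only change between them is the type signature, which lets the compiler work with unboxed `Double` arithmetic instead of passing a `Floating` dictionary. No timing claim is made for this sketch.

```haskell
-- Polymorphic version: usable with any Floating type.
stdDevPoly :: Floating a => [a] -> a
stdDevPoly xs = sqrt (sum [(x - mean) ^ (2 :: Int) | x <- xs] / n)
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n

-- Monomorphic version: same body, fixed to Double.
stdDevDouble :: [Double] -> Double
stdDevDouble xs = sqrt (sum [(x - mean) ^ (2 :: Int) | x <- xs] / n)
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n
```

Both versions return NaN for an empty list, since the mean is computed as 0/0; callers are assumed to pass non-empty input.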

d844fe88 11/17/2009 11:44 am Iustin Pop

Use conflicting primaries count in cluster score

This small patch adds the number of conflicting primaries in the cluster
score. This is different from the other non-CV metrics where we usually
compute the percentage of failing instances (for that metric); but for a...

1e4b5230 11/17/2009 11:44 am Iustin Pop

Node: add function for conflicting primary count

b2999982 11/17/2009 11:44 am Iustin Pop

Add a new node list field

This patch adds a new node list field (ptags), showing the primary
instance tags.

0f15cc76 11/17/2009 11:44 am Iustin Pop

Add a command-line option to filter exclusion tags

Since we don't want all instance tags to be used for exclusion, we add a
command line option to filter on these. Since the iallocator protocol
cannot accept command line options, currently it's not possible to...

5f0b9579 11/17/2009 11:44 am Iustin Pop

Introduce tag-based exclusion of primary instances

This patch introduces exclusion of primary instances based on tags. This
is incomplete as currently all tags are being excluded, and we don't
optimise towards relocation of instances sharing tags on the same node.

17e7af2b 11/11/2009 12:10 pm Iustin Pop

Add a tags attribute to instances

… and read it in all the loaders. hscan is modified to save it to the
files it generates.

The attribute is not yet used in any place.

27671a61 11/11/2009 11:39 am Iustin Pop

Small change in some list arguments

This is simpler than the concat operator.

e98fb766 11/10/2009 02:59 pm Iustin Pop

Allow overriding the field list in -p

The print nodes option can now accept an optional field list to
customise the output. This is ugly, since the field names do not match
the header names, but it is at least barely customisable (at runtime).

76354e11 11/09/2009 05:49 pm Iustin Pop

Move more node-listing functionality in Node.hs

This will prepare for the runtime-selectable field list.

a7a8f280 11/09/2009 04:51 pm Iustin Pop

Change the default dynamic usage to baseUtil

This fixed the unbalanced secondary instances on partially empty
clusters, and helps in general for the cases where real utilisation data
is not available.

daee4bed 11/09/2009 03:43 pm Iustin Pop

Add a few comments in the scoring function