Statistics
| Branch: | Tag: | Revision:

root / Ganeti / HTools / Cluster.hs @ df18fdfe

History | View | Annotate | Download (28.9 kB)

# Date Author Comment
a804261a 01/14/2010 06:38 pm Iustin Pop

Move instance relocation test upper in the chain

Currently we test each instance for relocation in checkMove; however, it
is a little more clear if we pass only the relocatable instances to
checkMove. The patch also slightly rewrites (indendation/style) the...

5ad86777 01/14/2010 06:05 pm Iustin Pop

Split the balancing function in two parts

Currently in the balancing function we do two thing:

- take the decision where to do a new balancing round or not
- and actually computing the balancing round

This is not nice, as the two parts are conceptually separate, so this...

0c860cff 12/11/2009 07:01 pm Iustin Pop

Convert n1_score metric from % to count

This increases the priority of fixing N+1 failures compared to balancing
metrics.

673f0f00 12/11/2009 06:47 pm Iustin Pop

Metric: count of primary instances/offline nodes

This helps with evacuation/failover of instances on 2-node clusters with
one one offline.

e4d31268 12/11/2009 06:43 pm Iustin Pop

Offline instance metric: change from % to count

Currently we use the offline instance percentage (with range [0, 1]),
but this is not good, since we want the evacuation of such instances to
have a high priority; therefore we change this to a count of offline...

d844fe88 11/17/2009 11:44 am Iustin Pop

Use conflicting primaries count in cluster score

This small patch adds the number of conflicting primaries in the cluster
score. This is different from the other non-CV metrics where we usually
compute the percentage of failing instances (for that metric); but for a...

e98fb766 11/10/2009 02:59 pm Iustin Pop

Allow overriding the field list in -p

The print nodes option can now accept an optional field list to
customise the output. This is ugly, since the field names do not match
the header names, but it is at least barely customisable (at runtime).

76354e11 11/09/2009 05:49 pm Iustin Pop

Move more node-listing functionality in Node.hs

This will prepare for the runtime-selectable field list.

daee4bed 11/09/2009 03:43 pm Iustin Pop

Add a few comments in the scoring function

30ff0c73 10/21/2009 11:47 am Iustin Pop

Expand the --print-instances output

This adds run status, resource parameters and load parameters for
instances.

8c9af2f0 10/19/2009 12:17 am Iustin Pop

Simplify the cstats initializer

Since all values are initialized to zero, the exact ordering is not
important and thus we can use the positional mode for simpler code.

The patch also adds docstrings to the cstats functions.

668c03b3 10/19/2009 12:11 am Iustin Pop

Simplify Cluster.computeMoves

Since we now have an actual type for describing the instance moves
(IMove), it's simpler to convert this into the move description/move
commands, rather than re-computing the move based on initial and final
nodes. This makes the shell commands computation and over-Luxi command...

eb2598ab 10/18/2009 11:20 pm Iustin Pop

Remove obsolete export

The ‘Placement’ type has been moved to Types.hs but we kept exporting it
from Cluster, which is not needed.

c5f7412e 10/18/2009 08:21 pm Iustin Pop

Generalise the node/instance listing

This patch introduces a generic formatTable function (based on, and
similar to the Ganeti one, but different and more FP in style) and
changes the node and instance listing to it.

The node list (due to the many variables) is still a little bit hackish...

ad6cffe4 10/18/2009 07:38 pm Iustin Pop

Fix instance listing for non-redundant case

ee9724b9 10/16/2009 04:59 pm Iustin Pop

Start using the utilisation scores in balancing

This enables the per-node load/total available capacity scores to be
used in balancing. Note that the total available capacity is currently
fixed at zero and cannot be changed by the user.

183a9c3d 10/16/2009 10:09 am Iustin Pop

Show the load on nodes in node lists

The strange printf usage is due to some limitation (it seems) in ghc for
very long argument lists. The whole printout should be rewritten later.

507fda3f 10/15/2009 05:00 pm Iustin Pop

Allow displaying the instance map in hbal

This is similar to --print-nodes, but with much fewer fields.

f5b553da 10/14/2009 04:41 pm Iustin Pop

Style change: cluster CStats camel-casing

This is again the cs_x to csX name change.

2060348b 10/14/2009 04:41 pm Iustin Pop

Style change: node and instance attributes

This changes from a_b to aB in all node and instance attributes, to
match the standard Haskell style. Also attributes that should have been
camel-cased but weren't were changed (e.g. plist → pList, pnode →
pNode).

fca250e9 10/14/2009 01:45 pm Iustin Pop

Modify the internals of the detailed CV scores

Before we used a tuple; since we'll need more metrics in the future,
it's simpler to transform this into a list of doubles, whose elements
are handled homogeneously by all the code that needs them.

dfbbd43a 10/14/2009 11:56 am Iustin Pop

Change iMoveToJob to properly create migrates

The current Cluster.iMoveToJob always creates failovers, which is not
what we want. This simply used the original instances status to select
between these two (this is not optimal by the way, since the status...

924f9c16 10/14/2009 11:55 am Iustin Pop

Extend the MoveJob type to hold the instance index

This will be needed in order to generate the proper instance move commands.

Signed-off-by: Iustin Pop <>

a2e90275 10/02/2009 06:54 pm Iustin Pop

Store the instance move in the MoveJobs

This will automatically sort our Ganeti jobs into the independent job
sets, and then we can submit them separately.

92e32d76 10/02/2009 06:48 pm Iustin Pop

Move some more type definitions to Types.hs

6b20875c 10/02/2009 06:37 pm Iustin Pop

Add a function converting Placements into Jobs

This converts from htools-specific Placements into Ganeti standard
OpCodes, which will later allow execution via Luxi.

3173c987 10/02/2009 05:52 pm Iustin Pop

Record the move being performed in a Placement

This will allow a more descriptive output later in the solution list, as
opposed to trying to reconstruct the move from the node indices.

The patch also documents the Placement members.

0e8ae201 10/02/2009 02:56 pm Iustin Pop

hbal: Implement grouping of moves into jobsets

Since moving two instances between different node-quadruples (inst X: A,
B → C, D and inst Y: E, F → G, H) can be parallelised by Ganeti, it
makes sense to split the operation list into jobsets whose execution...

fbb95f28 09/28/2009 05:09 pm Iustin Pop

Turn on, and fix, more warnings

The Makefile was intented to be -Wall and not simply -W, but I missed
that. This enables more warnings and also enables -Werror (except for
the tests).

f25e5aac 08/30/2009 06:55 pm Iustin Pop

Split the balancing algorithm in two parts

Currently the computation, recursing part and the IO part (progress
updates) of the balancing main function (iterateDepth) are all in the
same function, which makes it hard to test. This patch moves the
decision/computation part (whether to proceed one more round, whether we...

c0501c69 08/26/2009 11:07 am Iustin Pop

Implement support for 'cheap' moves only

This patch adds support for cheap (failover/migrate) operations only in
the balancing algorithm and in the hbal command line options.

This allows a very quick balancing (compared to allowing replace-disks)
which can be useful as a scheduled operation.

c9926b22 08/26/2009 10:40 am Iustin Pop

Use migrate or failover based on instance state

While we can't guarantee that the instance will be in the same state by
the time the migrate/failover command will be run, we can at least try
to do the right thing assuming no other changes to the cluster state....

2485487d 07/14/2009 05:15 pm Iustin Pop

Fix a few hlint errors

7d11799b 07/09/2009 04:58 pm Iustin Pop

Fix a haddoc issue

31e7ac17 07/09/2009 04:16 pm Iustin Pop

hspace: fix failure handling of tryAlloc results

Currently hspace doesn't handle failures from tryAlloc correctly; this
patch changes the iterateDepth function in hspace to return a Result (…)
so that errors can be propagated correctly.

The patch also changes one output key to be more clear and a typo in...

478df686 07/09/2009 03:44 pm Iustin Pop

Change the tryAlloc/tryReloc workflow

Currently, the tryAlloc and tryReloc function return a list with all the
results, both failures and successes. This is fine for hail, which does
one round of allocations, but is not so good for hspace, which does
iterative rounds; since at each (successful) step we only take the best...

685935f7 07/08/2009 08:30 pm Iustin Pop

Simplify the Cluster.tryAlloc structures

Currently the tryAlloc function calls the
allocateOnSingle/allocateOnPair and the builds a new tuple with those
functions's result plus the new node list. This is however suboptimal
in two respects:
- the new nodes added are the 'old' versions of the respective nodes,...

8880d889 07/08/2009 07:38 pm Iustin Pop

Slight change to the internal allocation results

Currently the Cluster.AllocSolution type is defined as a list of
‘(OpResult Node.list, …)’ and the results for applyMove are defined as
‘(OpResult Node.List, …)’. Both these means that the failure/success
indication is hidden in the first elements of this tuple, which makes is...

de4ac2c2 07/08/2009 12:49 pm Iustin Pop

hspace: move instance count and score into CStats

Currently the instance count and cluster score are separated from the
other initial/final phase stats, even though they are very similar. This
patch moves computation of these two into totalResources/CStats and...

8c4c6a8a 07/07/2009 12:56 pm Iustin Pop

Export more stats in hspace

This patch changes Cluster.totalResources to compute more resources and
prints them in hspace.

16103319 07/07/2009 11:06 am Iustin Pop

Fix score calculation to work with empty clusters

Currently the cluster score calculation includes an offline instance
percentage, expressed as “offline inst / (offline + online inst)”, which
results in NaN for empty clusters. This patch changes the calculation...

41c3b292 07/07/2009 12:13 am Iustin Pop

Simplify Cluster.computeMoves

This patch changes the function Cluster.computeMoves to use guards and a
couple of subexpressions in order to greatly simplify it.

9f6dcdea 07/06/2009 11:50 pm Iustin Pop

Fix hlint-generated warnings

This big patch cleans up the code per hlint indications. Many removals
of extra parentheses, replacements of concat . map with concabtMap,
extra dollar signs, eta reductions, etc. were performed.

The code still compiles and passes a couple of manual tests on sample...

f2280553 07/05/2009 03:53 pm Iustin Pop

Introduce a new type for allocation results

Currently the allocation/move operations workflow return ‘Maybe a’,
which is very convenient but loses all details about the failure mode.

This patch introduces a new data type which encodes the specific failure...

266aea94 07/05/2009 03:21 pm Iustin Pop

Remove hn1 and related code

hn1 was deprecated for a while and this patch removes it altogether. The
support code in Cluster.hs is also removed.

301789f4 07/03/2009 10:01 pm Iustin Pop

Fix totalResources avail disk computation

This uses the newly-added Node.availDisk to compute the actual available
disk correctl, and display the total allocatable disk in hspace.

1a7eff0e 07/03/2009 12:50 am Iustin Pop

Add a new type for cluster statistics

Currently totalResources returns a 5-tuple of integers. This is not easy
to handle, as each change on the return type means that each caller must
be updated.

This patch adds a new type for cluster stats and uses that instead as...

e2af3156 07/02/2009 01:33 pm Iustin Pop

Add display of more stats in hspace

This patch changes Cluster.totalResources to compute more details about
the cluster status, and enhances hspace to display more of these.

0c936d24 06/16/2009 12:52 pm Iustin Pop

Fix a haddock/docstring issue

78694255 06/12/2009 02:22 am Iustin Pop

Fix the various monomorphism warning

In a few places (e.g. tryRead or any printf call) it's a little bit hard
to add the correct type signatures, but in the it is possible to fix
these warnings (which can bite one in subtle cases).

3c64b5aa 06/12/2009 01:12 am Iustin Pop

Small changes to the node list output

This is just some cleanup of the node list output, adding pcpu/vcpu
counters, and making the display slightly nicer.

0a8dd21d 06/11/2009 12:17 am Iustin Pop

Add cpu ratio to cluster calculation

1a82215d 06/10/2009 11:29 pm Iustin Pop

Add cpu-count-related attributes to nodes

This patch adds cpu-count related attributes to nodes:
- total cpus
- cpus in use
- ratio of virtual:physical cpus

We also set correctly the cpu values at load time, but we don't do
anything yet while moving instances around. The cpu ratio is shown in...

70db354e 06/04/2009 04:32 pm Iustin Pop

Fix the ReplacePrimary instance move

During a replace-primary instance move, on the real cluster the instance
is temporarily started on the secondary, and as such we must check that
the secondary node can hold it for this duration. Currently the code
does not, and depending on cluster scoring it will put instances on such...

9dcec001 06/01/2009 04:48 pm Iustin Pop

Rework the tryAlloc/tryReloc functions

Currently tryAlloc/tryReloc do not return the new instance, as this is
not needed for IAllocator alloc/reloc requests. However, for computing
the space, the new instance is useful, so we modify these functions to
return this information too....

e2fa2baf 06/01/2009 12:55 pm Iustin Pop

Add copyright/license information

This doc-patch adds copyright and license information to (hopefully) all
needed files.

0991ed70 06/01/2009 12:18 pm Iustin Pop

Small whitespace change

dbba5246 06/01/2009 12:18 pm Iustin Pop

Move some alloc functions from hail into Cluster

These are generic enough to be used from multiple places, they belong
better in Cluster.hs than in the hail source.

d85a0a0f 06/01/2009 12:18 pm Iustin Pop

Cleanup an old function

Also replace a type with its synonim.

9188aeef 06/01/2009 12:18 pm Iustin Pop

Lots of documentation updates

This patch does only doc build changes, doc changes and function move
around (for more logical documentation). It should have no impact at all
on the code.

f9fc7a63 05/27/2009 11:01 pm Iustin Pop

Remove an unused type synonim

608efcce 05/27/2009 10:45 pm Iustin Pop

Add type synonyms for the node/instance indices

This is a first step towards full datatype renaming. That requires more
changes, so at first we only want to document clearly what is a node
index, what is an instance index, and what is a plain Int.

262a08a2 05/27/2009 02:17 am Iustin Pop

Change the module import hierarchy

This patch makes the Types module a base module, and Node/Instance ones
import it, from the previous (opposite) situation. This will allow in
the future to use newtypes for the index and name types.

5e15f460 05/25/2009 09:31 pm Iustin Pop

hail: Implement non-mirrored instance allocation

This patch implements non-mirrored instance allocation, by allocating as
secondary node “noSecondary”.

4a340313 05/25/2009 02:09 am Iustin Pop

Implement hail allocate (for 2-node requests)

This patch implements allocate for two node requests. One node requests
can be done as soon as we have a valid allocateOn function for single
nodes.

58709f92 05/25/2009 02:06 am Iustin Pop

Working implementation if relocate

This patch completes the implementation of hail relocate. It maps all
valid destination nodes through a ReplaceSecondary IMove, filters out
the failed relocations, computes the resulting scores and picks the
lowest one.

db1bcfe8 05/24/2009 02:29 am Iustin Pop

Remove most uses of ktn/kti

This patch removes all uses of ktn/kti from the past-loader stages.

dbd6700b 05/24/2009 02:05 am Iustin Pop

Remove some extraneous uses of ktn/kti

Since we have Node/Instance.name, we can now simplify a few constructs.

446d8827 05/24/2009 01:16 am Iustin Pop

Move checkData from Cluster to Loader

This moves the remaining loading function to Loader (together with its
associated support functions).

e4c5beaf 05/23/2009 02:29 am Iustin Pop

More code reorganizations

This new big patch does a couple of more cleanups in the loading of data
chapter:
- introduce a Types module that holds most types (except the base
Node/Instance/etc.) so that multiple other modules can use these
(instead of only Cluster and its users)...

040afc35 05/22/2009 08:03 pm Iustin Pop

Rework the loader model

This big patch changes the loader model from “string data as common
format” to actual object structures as common format.

The text loading function move from Cluster.hs to a new Text.hs module,
some common functions are moved to a new Loader.hs module, and the...

7e7f6ca2 05/21/2009 03:54 am Iustin Pop

Experimental support for non-redundant instances

This patch adds experimental support to hbal for non-redundant instances
(i.e. instances with only one node). They are currently handled as
non-moveable, and as such the algorithm simply ignores them.

Supports needs to be added when reading from RAPI via hscan, and...

b33a2243 05/21/2009 03:31 am Iustin Pop

Small doc addition

1c035cb3 05/21/2009 03:26 am Iustin Pop

Introduce nice errors on invalid input fields

This patch switches from plain read to a wrapper over readsPrec that
returns better error messages than the buildin 'Prelude: no parse'.

62007053 05/21/2009 03:10 am Iustin Pop

Split node/instance parsing into functions

This allows easy checking for valid format of the input data (row-wise).

9d3fada5 05/21/2009 02:37 am Iustin Pop

Add initial validation checks in Cluster.loadData

This patch converts loadTabular and loadData to a monadic form, thus
allowing meaningful error messages from the node/instance load routines.

fd22ce8e 05/21/2009 02:09 am Iustin Pop

Convert Cluster.loadData to Result return

This patch changes Cluster.loadData to return a Result, instead of
directly the values; this will allow us to return meaningful error
values (e.g. when an instances lives on unknown node) rather than simply
abort. Currently the result is always an Ok, the actual signalling of...

234d8af0 05/20/2009 12:59 am Iustin Pop

Don't consider offline nodes as N+1 failed

This is just a cosmetic (I hope) change; the nodes shouldn't be used
anyway, and we only correct the display message.

00b15752 05/20/2009 12:45 am Iustin Pop

Add support for 'offline' nodes

This patch drops compatiblity with Ganeti 1.2 and adds support for
offline nodes in the cluster. When reading from RAPI, the drained nodes
are considered offline so that we don't allocate on them too.

b0517d61 04/25/2009 06:40 pm Iustin Pop

hbal: Add a new min-score option

This new parameter causes the algorithm to finish (or even not start at
all) if we reach/have a score better than it.

b161386d 04/25/2009 06:39 pm Iustin Pop

hbal: Change hardcoded tests to monadic composition

In some case we manually do “if isNothing … then Nothing else …”, which
can be very easily replaced with a monadic construct in the Maybe monad.

8930eef2 04/20/2009 03:12 pm Iustin Pop

Increase allowed missing memory to 512MB

Since Xen seems to “steal” some amounts of memory (depending on total
node memory), we increase the maximum allowed missing memory to 512MB,
based on gathered data from multiple machines.

e0eb63f0 04/19/2009 01:17 am Iustin Pop

Implement writing the command list to a script

This patch adds support in hbal for writing the command list to a shell
script, with error checking and allowing for early exit.

9cded5d3 03/23/2009 12:32 am Iustin Pop

Add checks for missing disk space

This small patch adds disk space checks to the Cluster.checkData
function, and simplifies a little the warning messages.

53f00b20 03/22/2009 12:40 pm Iustin Pop

Fix interaction between down instances and nodes

If an instance is down, it's memory is not reflected in the node used
memory, and thus the node free memory is higher than the actual value.
This patch deducts the memory for such instances from the node free...

f82f1f39 03/22/2009 12:24 pm Iustin Pop

Add a new instance field denoting run status

This patch modifies Rapi, the Cluster.loadData and hscan serialization to load
and save the instance run status. At instance level, we add both a boolean
field denoting the true/false run status, and a string field which holds the...

a1c6212e 03/22/2009 12:07 pm Iustin Pop

Show the x_mem/i_mem in node list

This patch adds checking of cluster data in the binaries and display of
node's x_mem/i_mem in the node list.

5d1baf63 03/22/2009 12:02 pm Iustin Pop

Add functions to check and fix cluster data

This patch adds a checkData function which goes over the node list and computes
the unaccounted memory, returning a list of warning messages (if any) and the
update nodes.

04be800a 03/22/2009 01:25 am Iustin Pop

Add node memory field to Node objects

This patch adds a new n_mem field to the node objects, and implements
read/save/show support for it. The field is not currently used (except
in the node list) but will be used for checking data consistency and
instance up/down status.

47a8bade 03/22/2009 01:12 am Iustin Pop

Pass actual types to node/instance constructors

This patch changes the parameters passed to the node and instance
constructors from generic Strings (which are then parsed via “read”) to
the actual used types, by converting them earlier in Cluster.loadData.

7847a037 03/22/2009 12:48 am Iustin Pop

Some small changes in preparation for hscan

This patch does some small changes:
- fixes a comment
- export more node functions (unneeded now, but hscan will use them)
- fixes Makefile rule for building the programs

740ec004 03/21/2009 11:00 pm Iustin Pop

Add a separate type for the [(Int, String)] list

This is added for better readability, since this is very often used in
declarations.

19777638 03/21/2009 04:48 pm Iustin Pop

Handle correctly offline nodes in cluster scoring

This patch changes two things with regard to offline nodes:
- first, it only calculates the various coefficients across online
nodes
- second, it adds a new score denoting the percentage of instances...

352806f7 03/21/2009 01:20 pm Iustin Pop

Show offline nodes in the node status list

This patch adds a new ‘-’ flag for the node status which denotes offline
nodes.

40d4eba0 03/21/2009 01:26 am Iustin Pop

Restrict move list based on offline node status

This patch changes the Cluster.checkInstanceMove function to restrict
the target move list based on which nodes are online.

7ae514ba 03/20/2009 08:58 pm Iustin Pop

Some updates to the apidoc rules

669d7e3d 03/20/2009 07:16 pm Iustin Pop

Introduce a namespace for the modules

The modules are moved from the ‘top’ namespace to ‘Ganeti.HTools’, in
compliance with standard practices.