/Ganeti/HTools/Cluster.hs - Changes - snf-ganeti - Greek Research and Technology Network's projects

root / Ganeti / HTools / Cluster.hs @ adc5c176

#	Date	Author	Comment
94d08202	08/30/2010 12:12 pm	Iustin Pop	Change iterateAlloc to return the instance list The Cluster.iterateAlloc and tieredAlloc functions are changed to also return the updated instance list, since it is needed to have a “full” cluster view.
0ca66853	07/27/2010 09:44 pm	Iustin Pop	hail: fix error message for failed multi-evac Currently we show the instance index, but this makes no sense outside the current running program. Instead, we show the instance name.
c3c7a0c1	07/22/2010 01:42 am	Iustin Pop	Change the meaning of the N+1 fail metric Currently, this metric tracks the nodes failing the N+1 check. While this helps (in some cases) to evacuate such nodes, it's not a good metric since it will rarely change during a step (only when the last instance moves away). Therefore we replace it with the count of...
8a3b30ca	07/22/2010 01:42 am	Iustin Pop	Introduce per-metric weights Currently all metrics have the same weight (we just sum them together). However, for the hard constraints (N+1 failures, offline nodes, etc.) we should handle the metrics differently based on their meaning. For example, an instance living on a primary offline node is worse than an...
2cae47e9	07/22/2010 01:42 am	Iustin Pop	Allow balancing moves to introduce N+1 errors This patch switches the applyMove function to the extended versions of Node.addPri and addSec, and passes the override flag based on the state of the node that we're moving away from.
14c972c7	07/19/2010 02:20 pm	Iustin Pop	hbal: print short names in steps list This was a regression from the name handling changes, as we started using the original names for the solution list (which is not designed for parsing/feeding back into ganeti).
fb33aaaf	07/19/2010 02:20 pm	Iustin Pop	Remove an obsolete function printSolution is no longer used, as we print the solution iteratively now.
6dfa04fd	07/19/2010 12:13 am	Iustin Pop	Allow '+' in node list fields When the field list is prefixed with a plus sign, this will extend the default field list, instead of replacing it entirely.
3fea6959	05/20/2010 07:45 pm	Iustin Pop	Add more unit tests for allocation/balance The patch adds some simple unit-tests for both the allocation function (we can allocate small instances on an empty cluster, we can allocate in tiered mode starting from any size) and the balancing functions (one...
3ce8009a	05/20/2010 01:31 pm	Iustin Pop	Move two functions from hspace to Cluster.hs This is done so we can test a longer pipeline.
8423f76b	05/20/2010 01:31 pm	Iustin Pop	Make CStats instance of show This helps debugging via ghci.
3e4480e0	05/20/2010 12:07 pm	Iustin Pop	Stop modifying names for internal computations Currently the name used internally is modified and holds the shortened name of the nodes/instances. This has caused issues before, since we always have to strip the suffix from input data and reapply it if we...
f4c0b8c5	05/18/2010 07:31 pm	Iustin Pop	Remove the noLimit values and always use limits This patch moves from allowing no-limits for disk/cpu ratios, and always use a real limit. For disk, it's simple since we use 0, which means no reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,...
e2436511	05/04/2010 02:42 pm	Iustin Pop	Fix hspace's KM metrics We returned the KM_POOL_* metrics as the final state, not as the delta between the final and the initial state.
9b8fac3d	04/15/2010 12:50 pm	Iustin Pop	Add a new function to compute allocation deltas Given two cluster states, the new function can answer the following questions: - how much resources currently allocated - how much resources finally allocated (delta from above is how much we can actually allocate on the cluster)...
86ecce4a	04/15/2010 12:27 pm	Iustin Pop	Introduce total vcpu tracking in CStats We add a new field that tracks the available virtual cpus (expressed as node cpus times the vcpu ratio).
5182e970	02/25/2010 03:39 pm	Iustin Pop	A number of small fixes from hlint
c424cdc8	02/23/2010 02:13 pm	Iustin Pop	balance function: use the movable flag directly Instead of deciding based on secondary node, use the new flag.
12b0511d	02/22/2010 04:19 pm	Iustin Pop	Add a tryEvac function This will be used by the node evacuate IAllocator request type. Signed-off-by: Iustin Pop <iustin@google.com>
1fe81531	02/22/2010 04:19 pm	Iustin Pop	Move a type declaration to Node.hs We'll need AllocElement in both Cluster and IAlloc in the future, so we move it to Node.hs which is imported by both. Signed-off-by: Iustin Pop <iustin@google.com>
23f9ab76	02/22/2010 04:19 pm	Iustin Pop	Change an internal type from Maybe to list In preparation for multiple responses, we change from Maybe to List (both used in the container sense). This allows us to keep the same workflow for all kind of requests. Signed-off-by: Iustin Pop <iustin@google.com>
2e28ac32	02/22/2010 03:50 pm	Iustin Pop	Implement evacuation mode in hbal This mode restricts the list of instances to be moved to the instances living on the offline (and drained) nodes. Signed-off-by: Iustin Pop <iustin@google.com>
a804261a	01/14/2010 06:38 pm	Iustin Pop	Move instance relocation test upper in the chain Currently we test each instance for relocation in checkMove; however, it is a little more clear if we pass only the relocatable instances to checkMove. The patch also slightly rewrites (indentation/style) the...
5ad86777	01/14/2010 06:05 pm	Iustin Pop	Split the balancing function in two parts Currently in the balancing function we do two things: - take the decision whether to do a new balancing round or not - and actually compute the balancing round This is not nice, as the two parts are conceptually separate, so this...
0c860cff	12/11/2009 07:01 pm	Iustin Pop	Convert n1_score metric from % to count This increases the priority of fixing N+1 failures compared to balancing metrics.
673f0f00	12/11/2009 06:47 pm	Iustin Pop	Metric: count of primary instances/offline nodes This helps with evacuation/failover of instances on 2-node clusters with one node offline.
e4d31268	12/11/2009 06:43 pm	Iustin Pop	Offline instance metric: change from % to count Currently we use the offline instance percentage (with range [0, 1]), but this is not good, since we want the evacuation of such instances to have a high priority; therefore we change this to a count of offline...
d844fe88	11/17/2009 11:44 am	Iustin Pop	Use conflicting primaries count in cluster score This small patch adds the number of conflicting primaries in the cluster score. This is different from the other non-CV metrics where we usually compute the percentage of failing instances (for that metric); but for a...
e98fb766	11/10/2009 02:59 pm	Iustin Pop	Allow overriding the field list in -p The print nodes option can now accept an optional field list to customise the output. This is ugly, since the field names do not match the header names, but it is at least barely customisable (at runtime).
76354e11	11/09/2009 05:49 pm	Iustin Pop	Move more node-listing functionality in Node.hs This will prepare for the runtime-selectable field list.
daee4bed	11/09/2009 03:43 pm	Iustin Pop	Add a few comments in the scoring function
30ff0c73	10/21/2009 11:47 am	Iustin Pop	Expand the --print-instances output This adds run status, resource parameters and load parameters for instances.
8c9af2f0	10/19/2009 12:17 am	Iustin Pop	Simplify the cstats initializer Since all values are initialized to zero, the exact ordering is not important and thus we can use the positional mode for simpler code. The patch also adds docstrings to the cstats functions.
668c03b3	10/19/2009 12:11 am	Iustin Pop	Simplify Cluster.computeMoves Since we now have an actual type for describing the instance moves (IMove), it's simpler to convert this into the move description/move commands, rather than re-computing the move based on initial and final nodes. This makes the shell commands computation and over-Luxi command...
eb2598ab	10/18/2009 11:20 pm	Iustin Pop	Remove obsolete export The ‘Placement’ type has been moved to Types.hs but we kept exporting it from Cluster, which is not needed.
c5f7412e	10/18/2009 08:21 pm	Iustin Pop	Generalise the node/instance listing This patch introduces a generic formatTable function (based on, and similar to the Ganeti one, but different and more FP in style) and changes the node and instance listing to it. The node list (due to the many variables) is still a little bit hackish...
ad6cffe4	10/18/2009 07:38 pm	Iustin Pop	Fix instance listing for non-redundant case
ee9724b9	10/16/2009 04:59 pm	Iustin Pop	Start using the utilisation scores in balancing This enables the per-node load/total available capacity scores to be used in balancing. Note that the total available capacity is currently fixed at zero and cannot be changed by the user.
183a9c3d	10/16/2009 10:09 am	Iustin Pop	Show the load on nodes in node lists The strange printf usage is due to some limitation (it seems) in ghc for very long argument lists. The whole printout should be rewritten later.
507fda3f	10/15/2009 05:00 pm	Iustin Pop	Allow displaying the instance map in hbal This is similar to --print-nodes, but with much fewer fields.
f5b553da	10/14/2009 04:41 pm	Iustin Pop	Style change: cluster CStats camel-casing This is again the cs_x to csX name change.
2060348b	10/14/2009 04:41 pm	Iustin Pop	Style change: node and instance attributes This changes from a_b to aB in all node and instance attributes, to match the standard Haskell style. Also attributes that should have been camel-cased but weren't were changed (e.g. plist → pList, pnode → pNode).
fca250e9	10/14/2009 01:45 pm	Iustin Pop	Modify the internals of the detailed CV scores Before we used a tuple; since we'll need more metrics in the future, it's simpler to transform this into a list of doubles, whose elements are handled homogeneously by all the code that needs them.
dfbbd43a	10/14/2009 11:56 am	Iustin Pop	Change iMoveToJob to properly create migrates The current Cluster.iMoveToJob always creates failovers, which is not what we want. This simply uses the original instance's status to select between these two (this is not optimal by the way, since the status...
924f9c16	10/14/2009 11:55 am	Iustin Pop	Extend the MoveJob type to hold the instance index This will be needed in order to generate the proper instance move commands. Signed-off-by: Iustin Pop <iustin@google.com>
a2e90275	10/02/2009 06:54 pm	Iustin Pop	Store the instance move in the MoveJobs This will automatically sort our Ganeti jobs into the independent job sets, and then we can submit them separately.
92e32d76	10/02/2009 06:48 pm	Iustin Pop	Move some more type definitions to Types.hs
6b20875c	10/02/2009 06:37 pm	Iustin Pop	Add a function converting Placements into Jobs This converts from htools-specific Placements into Ganeti standard OpCodes, which will later allow execution via Luxi.
3173c987	10/02/2009 05:52 pm	Iustin Pop	Record the move being performed in a Placement This will allow a more descriptive output later in the solution list, as opposed to trying to reconstruct the move from the node indices. The patch also documents the Placement members.
0e8ae201	10/02/2009 02:56 pm	Iustin Pop	hbal: Implement grouping of moves into jobsets Since moving two instances between different node-quadruples (inst X: A, B → C, D and inst Y: E, F → G, H) can be parallelised by Ganeti, it makes sense to split the operation list into jobsets whose execution...
fbb95f28	09/28/2009 05:09 pm	Iustin Pop	Turn on, and fix, more warnings The Makefile was intended to be -Wall and not simply -W, but I missed that. This enables more warnings and also enables -Werror (except for the tests).
f25e5aac	08/30/2009 06:55 pm	Iustin Pop	Split the balancing algorithm in two parts Currently the computation, recursing part and the IO part (progress updates) of the balancing main function (iterateDepth) are all in the same function, which makes it hard to test. This patch moves the decision/computation part (whether to proceed one more round, whether we...
c0501c69	08/26/2009 11:07 am	Iustin Pop	Implement support for 'cheap' moves only This patch adds support for cheap (failover/migrate) operations only in the balancing algorithm and in the hbal command line options. This allows a very quick balancing (compared to allowing replace-disks) which can be useful as a scheduled operation.
c9926b22	08/26/2009 10:40 am	Iustin Pop	Use migrate or failover based on instance state While we can't guarantee that the instance will be in the same state by the time the migrate/failover command will be run, we can at least try to do the right thing assuming no other changes to the cluster state....
2485487d	07/14/2009 05:15 pm	Iustin Pop	Fix a few hlint errors
7d11799b	07/09/2009 04:58 pm	Iustin Pop	Fix a haddock issue
31e7ac17	07/09/2009 04:16 pm	Iustin Pop	hspace: fix failure handling of tryAlloc results Currently hspace doesn't handle failures from tryAlloc correctly; this patch changes the iterateDepth function in hspace to return a Result (…) so that errors can be propagated correctly. The patch also changes one output key to be more clear and a typo in...
478df686	07/09/2009 03:44 pm	Iustin Pop	Change the tryAlloc/tryReloc workflow Currently, the tryAlloc and tryReloc function return a list with all the results, both failures and successes. This is fine for hail, which does one round of allocations, but is not so good for hspace, which does iterative rounds; since at each (successful) step we only take the best...
685935f7	07/08/2009 08:30 pm	Iustin Pop	Simplify the Cluster.tryAlloc structures Currently the tryAlloc function calls the allocateOnSingle/allocateOnPair and then builds a new tuple with those functions' result plus the new node list. This is however suboptimal in two respects: - the new nodes added are the 'old' versions of the respective nodes,...
8880d889	07/08/2009 07:38 pm	Iustin Pop	Slight change to the internal allocation results Currently the Cluster.AllocSolution type is defined as a list of ‘(OpResult Node.list, …)’ and the results for applyMove are defined as ‘(OpResult Node.List, …)’. Both of these mean that the failure/success indication is hidden in the first elements of this tuple, which makes it...
de4ac2c2	07/08/2009 12:49 pm	Iustin Pop	hspace: move instance count and score into CStats Currently the instance count and cluster score are separated from the other initial/final phase stats, even though they are very similar. This patch moves computation of these two into totalResources/CStats and...
8c4c6a8a	07/07/2009 12:56 pm	Iustin Pop	Export more stats in hspace This patch changes Cluster.totalResources to compute more resources and prints them in hspace.
16103319	07/07/2009 11:06 am	Iustin Pop	Fix score calculation to work with empty clusters Currently the cluster score calculation includes an offline instance percentage, expressed as “offline inst / (offline + online inst)”, which results in NaN for empty clusters. This patch changes the calculation...
41c3b292	07/07/2009 12:13 am	Iustin Pop	Simplify Cluster.computeMoves This patch changes the function Cluster.computeMoves to use guards and a couple of subexpressions in order to greatly simplify it.
9f6dcdea	07/06/2009 11:50 pm	Iustin Pop	Fix hlint-generated warnings This big patch cleans up the code per hlint indications. Many removals of extra parentheses, replacements of concat . map with concatMap, extra dollar signs, eta reductions, etc. were performed. The code still compiles and passes a couple of manual tests on sample...
f2280553	07/05/2009 03:53 pm	Iustin Pop	Introduce a new type for allocation results Currently the allocation/move operations workflow return ‘Maybe a’, which is very convenient but loses all details about the failure mode. This patch introduces a new data type which encodes the specific failure...
266aea94	07/05/2009 03:21 pm	Iustin Pop	Remove hn1 and related code hn1 was deprecated for a while and this patch removes it altogether. The support code in Cluster.hs is also removed.
301789f4	07/03/2009 10:01 pm	Iustin Pop	Fix totalResources avail disk computation This uses the newly-added Node.availDisk to compute the actual available disk correctly, and display the total allocatable disk in hspace.
1a7eff0e	07/03/2009 12:50 am	Iustin Pop	Add a new type for cluster statistics Currently totalResources returns a 5-tuple of integers. This is not easy to handle, as each change on the return type means that each caller must be updated. This patch adds a new type for cluster stats and uses that instead as...
e2af3156	07/02/2009 01:33 pm	Iustin Pop	Add display of more stats in hspace This patch changes Cluster.totalResources to compute more details about the cluster status, and enhances hspace to display more of these.
0c936d24	06/16/2009 12:52 pm	Iustin Pop	Fix a haddock/docstring issue
78694255	06/12/2009 02:22 am	Iustin Pop	Fix the various monomorphism warnings In a few places (e.g. tryRead or any printf call) it's a little bit hard to add the correct type signatures, but in the end it is possible to fix these warnings (which can bite one in subtle cases).
3c64b5aa	06/12/2009 01:12 am	Iustin Pop	Small changes to the node list output This is just some cleanup of the node list output, adding pcpu/vcpu counters, and making the display slightly nicer.
0a8dd21d	06/11/2009 12:17 am	Iustin Pop	Add cpu ratio to cluster calculation
1a82215d	06/10/2009 11:29 pm	Iustin Pop	Add cpu-count-related attributes to nodes This patch adds cpu-count related attributes to nodes: - total cpus - cpus in use - ratio of virtual:physical cpus We also set correctly the cpu values at load time, but we don't do anything yet while moving instances around. The cpu ratio is shown in...
70db354e	06/04/2009 04:32 pm	Iustin Pop	Fix the ReplacePrimary instance move During a replace-primary instance move, on the real cluster the instance is temporarily started on the secondary, and as such we must check that the secondary node can hold it for this duration. Currently the code does not, and depending on cluster scoring it will put instances on such...
9dcec001	06/01/2009 04:48 pm	Iustin Pop	Rework the tryAlloc/tryReloc functions Currently tryAlloc/tryReloc do not return the new instance, as this is not needed for IAllocator alloc/reloc requests. However, for computing the space, the new instance is useful, so we modify these functions to return this information too....
e2fa2baf	06/01/2009 12:55 pm	Iustin Pop	Add copyright/license information This doc-patch adds copyright and license information to (hopefully) all needed files.
0991ed70	06/01/2009 12:18 pm	Iustin Pop	Small whitespace change
dbba5246	06/01/2009 12:18 pm	Iustin Pop	Move some alloc functions from hail into Cluster These are generic enough to be used from multiple places, they belong better in Cluster.hs than in the hail source.
d85a0a0f	06/01/2009 12:18 pm	Iustin Pop	Cleanup an old function Also replace a type with its synonym.
9188aeef	06/01/2009 12:18 pm	Iustin Pop	Lots of documentation updates This patch does only doc build changes, doc changes and function move around (for more logical documentation). It should have no impact at all on the code.
f9fc7a63	05/27/2009 11:01 pm	Iustin Pop	Remove an unused type synonym
608efcce	05/27/2009 10:45 pm	Iustin Pop	Add type synonyms for the node/instance indices This is a first step towards full datatype renaming. That requires more changes, so at first we only want to document clearly what is a node index, what is an instance index, and what is a plain Int.
262a08a2	05/27/2009 02:17 am	Iustin Pop	Change the module import hierarchy This patch makes the Types module a base module, and Node/Instance ones import it, from the previous (opposite) situation. This will allow in the future to use newtypes for the index and name types.
5e15f460	05/25/2009 09:31 pm	Iustin Pop	hail: Implement non-mirrored instance allocation This patch implements non-mirrored instance allocation, by allocating as secondary node “noSecondary”.
4a340313	05/25/2009 02:09 am	Iustin Pop	Implement hail allocate (for 2-node requests) This patch implements allocate for two node requests. One node requests can be done as soon as we have a valid allocateOn function for single nodes.
58709f92	05/25/2009 02:06 am	Iustin Pop	Working implementation of relocate This patch completes the implementation of hail relocate. It maps all valid destination nodes through a ReplaceSecondary IMove, filters out the failed relocations, computes the resulting scores and picks the lowest one.
db1bcfe8	05/24/2009 02:29 am	Iustin Pop	Remove most uses of ktn/kti This patch removes all uses of ktn/kti from the past-loader stages.
dbd6700b	05/24/2009 02:05 am	Iustin Pop	Remove some extraneous uses of ktn/kti Since we have Node/Instance.name, we can now simplify a few constructs.
446d8827	05/24/2009 01:16 am	Iustin Pop	Move checkData from Cluster to Loader This moves the remaining loading function to Loader (together with its associated support functions).
e4c5beaf	05/23/2009 02:29 am	Iustin Pop	More code reorganizations This new big patch does a couple of more cleanups in the loading of data chapter: - introduce a Types module that holds most types (except the base Node/Instance/etc.) so that multiple other modules can use these (instead of only Cluster and its users)...
040afc35	05/22/2009 08:03 pm	Iustin Pop	Rework the loader model This big patch changes the loader model from “string data as common format” to actual object structures as common format. The text loading functions move from Cluster.hs to a new Text.hs module, some common functions are moved to a new Loader.hs module, and the...
7e7f6ca2	05/21/2009 03:54 am	Iustin Pop	Experimental support for non-redundant instances This patch adds experimental support to hbal for non-redundant instances (i.e. instances with only one node). They are currently handled as non-moveable, and as such the algorithm simply ignores them. Support needs to be added when reading from RAPI via hscan, and...
b33a2243	05/21/2009 03:31 am	Iustin Pop	Small doc addition
1c035cb3	05/21/2009 03:26 am	Iustin Pop	Introduce nice errors on invalid input fields This patch switches from plain read to a wrapper over readsPrec that returns better error messages than the builtin 'Prelude: no parse'.
62007053	05/21/2009 03:10 am	Iustin Pop	Split node/instance parsing into functions This allows easy checking for valid format of the input data (row-wise).
9d3fada5	05/21/2009 02:37 am	Iustin Pop	Add initial validation checks in Cluster.loadData This patch converts loadTabular and loadData to a monadic form, thus allowing meaningful error messages from the node/instance load routines.
fd22ce8e	05/21/2009 02:09 am	Iustin Pop	Convert Cluster.loadData to Result return This patch changes Cluster.loadData to return a Result, instead of directly the values; this will allow us to return meaningful error values (e.g. when an instance lives on an unknown node) rather than simply abort. Currently the result is always an Ok, the actual signalling of...
234d8af0	05/20/2009 12:59 am	Iustin Pop	Don't consider offline nodes as N+1 failed This is just a cosmetic (I hope) change; the nodes shouldn't be used anyway, and we only correct the display message.
