/ - Changes - snf-ganeti - Greek Research and Technology Network's projects

root @ b7b29191

#	Date	Author	Comment
b7b29191	11/24/2010 03:55 pm	Iustin Pop	RAPI: read the group UUID from the server This depends on future support from Ganeti (2.4+). Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
31463db5	11/24/2010 03:55 pm	Iustin Pop	IAlloc: read group uuid from the input message This makes the code incompatible with JSON files from Ganeti pre-2.4. Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
b3707354	11/24/2010 03:55 pm	Iustin Pop	Text: read/save the node group UUID Compatibility with old text files is kept by using the default UUID if the file (or even some records) don't have a UUID. Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
f5ed8632	11/24/2010 03:55 pm	Iustin Pop	Luxi: read the node uuid from the cluster This makes the code incompatible with Ganeti pre-2.4. Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
a68004b7	11/24/2010 03:55 pm	Iustin Pop	Node: add the node group's UUID This is not used anywhere yet, and the backends are all just adding the default UUID, not the real one. The patch also allows displaying the group UUID in the node list. Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
9b9da389	11/24/2010 03:54 pm	Iustin Pop	Utils: add a default UUID This will be used as a placeholder for the cases when we need a UUID (any UUID), but we don't have one handy. Signed-off-by: Iustin Pop <iustin@google.com> Reviewed-by: Balazs Lecz <leczb@google.com>
a3e8da03	11/23/2010 04:46 pm	Iustin Pop	Merge branch 'devel-0.2' into master
7570569e	11/23/2010 12:58 pm	Iustin Pop	Improve the standard deviation computation This does just two passes, instead of three, over the list. This reduces the overall runtime well enough (~25%) in some tests, but it's not reproducible using profiling, so I don't know how much the function itself is being sped-up....
543e859d	11/23/2010 12:58 pm	Iustin Pop	hbal: change handling of signal Currently, hbal does a one-two signal handling, where the first signal causes graceful termination, and the second one an immediate one (either SIGINT or SIGTERM can be used, interchangeably). However, this poses a timing problem: if two programs want to send a graceful termination...
5e718042	11/19/2010 01:35 pm	Iustin Pop	Simu loader: move the loading to non-IO code While we don't actually have IO code in the Simu loader, we do have the same interface. So we move the code again to a separate parseData function which is exported.
b3f0710c	11/19/2010 01:08 pm	Iustin Pop	Luxi loader: split parsing from loading
748bfcc2	11/19/2010 01:06 pm	Iustin Pop	Rapi loader: split parsing from loading The change is similar to the text loader change.
dadfc261	11/19/2010 01:00 pm	Iustin Pop	Text loader: split parsing from loadData This change, which will be followed by similar changes in the other loaders, splits the parsing of the data from the actual loading from disk. Since the parsing doesn't usually involve IO actions, we will be able to better test the parsing. The loading becomes a smaller part of...
9d775204	11/11/2010 01:02 pm	Iustin Pop	Ignore nodes which are not vm_capable This breaks compatibility with Ganeti pre-2.3.
92d43268	11/09/2010 09:51 am	Iustin Pop	Merge branch 'devel-0.2' devel-0.2: Fix tag exclusion weight
306cccd5	11/09/2010 09:37 am	Iustin Pop	Fix tag exclusion weight Currently, the tag exclusion metric has a weight of one, which means there might be cases where we won't move instances around because it upsets the cluster metrics. However, we do want to make a higher effort for cleaning up tag collisions, so we increase the weight to an...
718f135d	10/26/2010 12:37 pm	Iustin Pop	Force UTF-8 locale for pandoc invocation Pandoc 1.5.x uses the locale information to parse its input files (only 1.5; earlier and later versions always use UTF-8). Hence we need to enforce a UTF-8 locale for proper parsing of input files.
49148d15	10/25/2010 06:39 pm	Iustin Pop	Move from hand-written man pages to RST/pandoc This simplifies the maintenance of the man pages, and unifies the rst-to-* converter to pandoc.
92921ea4	10/21/2010 04:58 pm	Iustin Pop	Add design for htools/Ganeti 2.3 sync This is a work in progress, will be modified along with the progress of Ganeti 2.3.
ca8e1c6a	10/07/2010 04:09 pm	Iustin Pop	Update NEWS file for 0.2.7 release
e3ae9508	10/07/2010 03:42 pm	Iustin Pop	Fix some warnings in unittests
4886952e	10/06/2010 04:23 pm	Iustin Pop	Add a hack for normalized CPU values in hspace Currently, the key metrics/tiered spec computations show the virtual cpu count. However, since we do have a maximum ratio Vcpu/Pcpu, we can also show the “normalized” cpu count, i.e. the equivalent physical cpu count...
03c6d8fa	10/06/2010 03:56 pm	Iustin Pop	Improve the error message for tiered alloc option
03cb89f0	09/15/2010 06:30 pm	Iustin Pop	hbal: implement user-friendly termination requests Currently, hbal will abort immediately when requested (^C, or SIGINT, etc.). This is not nice, since then the already started jobs need to be tracked manually. This patch adds a signal handler for SIGINT and SIGTERM, which will, the...
5f715404	09/03/2010 06:10 pm	Iustin Pop	Document the gain options in hbal's manpage
848b65c9	09/03/2010 06:02 pm	Iustin Pop	Use the mingain options in the balancing algorithm Also adds them in hbal.
4f807a57	09/03/2010 03:35 pm	Iustin Pop	Add new CLI options for min gain during balancing Recent hbal seems to run many steps for small improvements (< 1e-3), so we should stop early in this case. We add a new option (-g), that will be used for the minimum gain during balancing. This check will only become active when the cluster score is...
d78ceb9e	09/02/2010 03:45 pm	Iustin Pop	Makefile: make the rst2html converter more strict This will make the automated builds flag any problems.
adc5c176	09/02/2010 03:43 pm	Iustin Pop	Add some more debugging functions These are just variations of the standard debug, but are provided for simpler code, since laziness can cause debug statements not to be computed.
74e89a14	09/02/2010 03:43 pm	Iustin Pop	Fix ReplaceSecondary moves for offline nodes The addition of a new secondary on a node is doing two memory tests: - in strict mode, reject if we get into N+1 failure - reject if the new instance memory is greater than the free memory (not available memory) on the node...
49d977db	09/02/2010 03:43 pm	Iustin Pop	Update NEWS file
db43d7b3	08/30/2010 12:12 pm	Iustin Pop	Update man pages for the new -S option
10852adb	08/30/2010 12:12 pm	Iustin Pop	hspace: mark new instances as running Otherwise the saved cluster state and the in-memory one are wrong.
3e9501d0	08/30/2010 12:12 pm	Iustin Pop	Implement cluster state saving in hspace This also uncovered a few issues with the allocation model (instances not being marked up, etc.). Compared to hbal, hspace will generate either one or two files (for both the standard and the tiered allocation mode), depending on the input...
94d08202	08/30/2010 12:12 pm	Iustin Pop	Change iterateAlloc to return the instance list The Cluster.iterateAlloc and tieredAlloc functions are changed to also return the updated instance list, since it is needed to have a “full” cluster view.
748654f7	08/30/2010 12:12 pm	Iustin Pop	Implement cluster state saving in hbal Also move the LUXI execution (-X) to the end, after all the output messages are printed. No good in waiting for the messages for a long while, especially as they are not up-to-date stats after the job execution, just an estimation of what the state will be.
4a273e97	08/30/2010 12:12 pm	Iustin Pop	Abstract the cluster serialization from hscan.hs This is currently hardcoded in an internal function in hscan.hs, and we move it to Text.hs for later use.
02da9d07	08/25/2010 07:40 pm	Iustin Pop	Add a new option --save-cluster This option will in the future be used to serialize the cluster state in hbal and hspace after the rebalance/allocation steps.
50811e2c	08/25/2010 07:04 pm	Iustin Pop	Add unittest for Node text serialization This checks that the Node text serialization and deserialization operations are idempotent when combined with each other.
a070c426	08/25/2010 06:53 pm	Iustin Pop	Switch unittest to custom hostnames Currently, the hostnames are almost fully arbitrary chars, which breaks the assumption that nodes/instances will be normal DNS hostnames. This patch adds some custom generators for these hostnames, that will allow better testing of text loader serialization/deserialization.
3bf75b7d	08/24/2010 07:30 pm	Iustin Pop	Move text serialization functions to Text.hs Currently these are in hscan, and cannot be reused easily.
57ef88df	07/29/2010 07:03 am	Iustin Pop	Fix a couple of typos in the manpages Again, thanks to lintian.
0ca66853	07/27/2010 09:44 pm	Iustin Pop	hail: fix error message for failed multi-evac Currently we show the instance index, but this makes no sense outside the current running program. Instead, we show the instance name.
84edb64b	07/27/2010 03:03 am	Iustin Pop	Update NEWS file for the 0.2.6 release
303bb0ed	07/27/2010 03:03 am	Iustin Pop	NEWS: Add double blank lines before headers This looks better for text-only viewing…
f688711c	07/23/2010 03:50 am	Iustin Pop	hscan: return exit code 2 for RAPI failures If some clusters failed during RAPI collection, exit with exit code 2 so that tests can detect this failure.
b7478ce1	07/23/2010 03:32 am	Iustin Pop	More enhancements to live-test.sh
b8262965	07/22/2010 04:57 pm	Iustin Pop	Fix another haddock issue
691dcd2a	07/22/2010 06:03 am	Iustin Pop	Remove an obsolete function and add Utils tests
b880f1d1	07/22/2010 03:32 am	Iustin Pop	Extend the live-test The (recently-enabled) live test coverage stats found a few low-hanging fruits in the tests we do…
7e9e8245	07/22/2010 02:27 am	Iustin Pop	Use --union for hpc sum … which fixes the issue noted in the previous commit (almost a brown paper bag change).
dc61c50b	07/22/2010 01:43 am	Iustin Pop	Preliminary support for coverage during live-test While this doesn't work correctly yet (hpc sum seems to only take common modules, not the sum of modules?), it prepares for gathering coverage data during live-test (as an alternative to unittest coverage data).
223dbe53	07/22/2010 01:42 am	Iustin Pop	Add some more imports to QC.hs This is needed so that in the coverage report we list all modules, even the ones we don't test at all, such that we get the complete results.
c3c7a0c1	07/22/2010 01:42 am	Iustin Pop	Change the meaning of the N+1 fail metric Currently, this metric tracks the nodes failing the N+1 check. While this helps (in some cases) to evacuate such nodes, it's not a good metric since it will rarely change during a step (only at the last instance moving away). Therefore we replace it with the count of...
8a3b30ca	07/22/2010 01:42 am	Iustin Pop	Introduce per-metric weights Currently all metrics have the same weight (we just sum them together). However, for the hard constraints (N+1 failures, offline nodes, etc.) we should handle the metrics differently based on their meaning. For example, an instance living on a primary offline node is worse than an...
2cae47e9	07/22/2010 01:42 am	Iustin Pop	Allow balancing moves to introduce N+1 errors This patch switches the applyMove function to the extended versions of Node.addPri and addSec, and passes the override flag based on the state of the node that we're moving away from.
3e3c9393	07/22/2010 01:42 am	Iustin Pop	Introduce a relaxed add instance mode In case an instance is living on an offline node, it doesn't make sense to refuse moving it because that would create N+1 failures; failing N+1 is still much better than not running at all. Similarly, if the secondary node of an instance is offline, meaning the instance doesn't...
2849670b	07/19/2010 02:20 pm	Iustin Pop	Remove obsolete Container.maxNameLen This was only used in one place (hbal), and is made obsolete by the change to the dual name/alias structure.
14c972c7	07/19/2010 02:20 pm	Iustin Pop	hbal: print short names in steps list This was a regression from the name handling changes, as we started using the original names for the solution list (which is not designed for parsing/feeding back into ganeti).
fb33aaaf	07/19/2010 02:20 pm	Iustin Pop	Remove an obsolete function printSolution is no longer used, as we print the solution iteratively now.
6dfa04fd	07/19/2010 12:13 am	Iustin Pop	Allow '+' in node list fields When the field list is prefixed with a plus sign, this will extend the default field list, instead of replacing it entirely.
16f08e82	07/19/2010 12:13 am	Iustin Pop	Update the node list fields This patch renames the pri/sec to pcnt/scnt, and adds the real primary and secondary instance lists, the peermap and the index of a node as selectable options.
124b7cd7	07/19/2010 12:13 am	Iustin Pop	Cleanup a node's peer map when possible If the last secondary instance of a peer is deleted (detected by the new peer memory value being equal to zero), then the pair (pdx, 0) should be deleted completely. This is not an optimization per se, but rather cleanup...
f9acea10	07/16/2010 09:31 pm	Iustin Pop	Fix handling of offline options and short names This needs to be abstracted in a separate function, but in the meantime we fix the issue in both places. Signed-off-by: Iustin Pop <iustin@google.com>
95446d7a	06/21/2010 12:12 pm	Iustin Pop	Fix another haddock special-char issue
db079755	06/21/2010 05:59 am	Iustin Pop	Remove JOB_STATUS_GONE and add unittests … for the serialization/deserialization of the job and opcode status. Job status 'gone' was not actually used. It can be reintroduced if needed.
41065165	06/21/2010 05:46 am	Iustin Pop	Add opcode status constants/type This mirrors, again, the Ganeti constants, and they are added for future use.
7e98f782	06/21/2010 05:46 am	Iustin Pop	Rename the job status constants The rename is done such that we match Ganeti's own constants.
95f490de	06/08/2010 03:48 am	Iustin Pop	Optimise the Luxi.recvMsg function Since the current buffer cannot contain (during network reads) an EOM, we should look for the EOM only in the newly-received string. While this shouldn't make much difference, in some tests it cuts the recvMsg total time by around half....
04282772	06/08/2010 01:09 am	Iustin Pop	Complete the client Luxi implementation All current Luxi calls are supported after this patch. A bug in ArchiveJob is also fixed (Ganeti's job IDs are strings).
9622919d	06/08/2010 12:35 am	Iustin Pop	Add support for more LUXI calls While not all are directly useful, having them will open some possibilities (e.g. polling for job changes in hbal's -X mode, and auto-archiving the jobs once they are successful).
4a007641	06/03/2010 12:08 am	Iustin Pop	Fix some lint errors in the unit tests
683b1ca7	06/02/2010 11:55 pm	Iustin Pop	Change the Luxi operations structure Currently, we define the LuxiOp type as a simple enumeration, and leave the arguments structure to the users of the Ganeti.Luxi module. This is suboptimal for a couple of reasons: first, we decouple the operation type from operation arguments, and that means we don't use the type...
9c0a748f	06/01/2010 11:51 pm	Iustin Pop	Fix a warning in Loader tests Incomplete pattern match…
c088674b	06/01/2010 08:54 pm	Iustin Pop	Add a few Loader tests These are not comprehensive, but at least we have a start.
8c5652f6	05/30/2010 09:16 pm	Iustin Pop	Modify the test runner to show test exceptions QuickCheck's batch driver (at least v1) doesn't show the test aborts, but simply discards the specific exception and increases the abort count. This makes it hard to debug the tests, so we modify our own test...
9e35522c	05/28/2010 12:13 pm	Iustin Pop	Reduce the warnings during the unittests Since the unittests are not 'clean' from the p.o.v. of type declarations, and cannot be made clean in all respects (e.g. orphan instances), we silence some warnings for the test target, to have a cleaner output.
06fe0cea	05/28/2010 12:39 am	Iustin Pop	Improve the test driver The tests are moved to a separate data structure, and we can select a subset of tests to run.
88f25dd0	05/28/2010 12:25 am	Iustin Pop	Introduce OpCode unittests
f36a8028	05/28/2010 12:00 am	Iustin Pop	Introduce support for optional keys in JObjects Some keys are optional in the Ganeti opcodes (e.g. ‘node’ in the OpReplaceDisks), and as such we need to transform them into a Maybe value, instead of failing. The patch reworks a bit fromObj and adds maybeFromObj which parses such...
c96d44df	05/27/2010 11:37 pm	Iustin Pop	Replace fromJResult with annotateJResult This patch removes all old uses of fromJResult with the annotated version, and removes the non-annotated version. All JSON parsing points should now have annotated errors.
c8b662f1	05/27/2010 11:32 pm	Iustin Pop	Add annotations to loadJSArray This allows, for example, the RAPI backend to detail which information (instance or node data) fails to parse.
50d26669	05/27/2010 11:23 pm	Iustin Pop	Change fromObj error messages Currently fromObj doesn't detail what we're trying to read, which can lead to cryptic messages: "Cannot read Int". The patch changes this function to annotate the error messages with the key/value we're trying to convert, by using a new version of fromJResult....
82ea2874	05/27/2010 01:11 am	Iustin Pop	A few more small Node unit-tests
39d11971	05/25/2010 08:17 pm	Iustin Pop	Add more unittests Instance, Node and Text modules have improved coverage.
3fea6959	05/20/2010 07:45 pm	Iustin Pop	Add more unit tests for allocation/balance The patch adds some simple unit-tests for both the allocation function (we can allocate small instances on an empty cluster, we can allocate in tiered mode starting from any size) and the balancing functions (one...
3ce8009a	05/20/2010 01:31 pm	Iustin Pop	Move two functions from hspace to Cluster.hs This is done so we can test a longer pipeline.
8423f76b	05/20/2010 01:31 pm	Iustin Pop	Make CStats instance of show This helps debugging via ghci.
ada2fc6d	05/20/2010 12:19 pm	Iustin Pop	Clarify options related to name passing After the name patches, we can pass in either the short or the full name, so update the hbal man page accordingly.
381be58a	05/20/2010 12:19 pm	Iustin Pop	Another haddock fix…
c854092b	05/20/2010 12:07 pm	Iustin Pop	Accept both full and short names in CLI This patch introduces some new functionality in the base Element type and in Container which supports searching for all 'known' names of an element, such that both short and full names are accepted for various options like '-O' and '--excluded-instances'.
3e4480e0	05/20/2010 12:07 pm	Iustin Pop	Stop modifying names for internal computations Currently the name used internally is modified and holds the shortened name of the nodes/instances. This has caused issues before, since we always have to strip the suffix from input data and reapply it if we...
8bcdde0c	05/20/2010 12:07 pm	Iustin Pop	Add a new node/instance field This new field ('alias') will hold the shortened/beautified display name. When resetting the name, the alias is reset too, and there's a new function to update only the alias.
49f9627a	05/20/2010 12:07 pm	Iustin Pop	Change some test constants First, we reduce the max size of the disks, since Int on 32bits will overflow for big simulated clusters. This is a real issue, that will need fixing in real life, but for now we just "silence" this test. Second, we increase the amount of time a test is allowed to run,...
3ed46bb7	05/19/2010 04:28 pm	Iustin Pop	Fix some haddock comments
8fcf251f	05/19/2010 04:09 pm	Iustin Pop	Add more unit tests This increases the overall coverage by 5%-10% (depending on coverage type). Some modules are still not unittested at all, as HUnit is a better choice for them.
1e3dccc8	05/19/2010 04:08 pm	Iustin Pop	Shuffle some constants around … and export more functions. This will help with unit testing.
f4c0b8c5	05/18/2010 07:31 pm	Iustin Pop	Remove the noLimit values and always use limits This patch moves away from allowing no-limits for disk/cpu ratios, and always uses a real limit. For disk, it's simple since we use 0, which means no reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,...
317b1040	05/17/2010 05:40 pm	Iustin Pop	hspace: change handling of N+1 bad clusters Currently we just print a fake result and exit early. This is bad, since it doesn't use the same codepaths for all the result printing, and has already led to a bug where hspace looks like completely ignoring the...
e2436511	05/04/2010 02:42 pm	Iustin Pop	Fix hspace's KM metrics We returned the KM_POOL_* metrics as the final state, not as the delta between the final and the initial state.
