Add DCStatus data type for the data collectors
Also adds the DCStatusCode, part of DCStatus, and the addStatusutility function for adding the "status" field to an already existingJSValue.
The design document is updated to have the status codes sorted by increasing...
Add Kind data type for data collectors
Also, add it to the DRBD data collector, and export it from there.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Add data type for data collector category
Also, update the DRBD data collector to use and export it.
Export the dcVersionInformation for the Drbd collector
Also, update the JSON output (and the design document) so that it is notin camelcase anymore. This is part of a bigger effort to remove camelcasefrom the exposed JSON.
Signed-off-by: Michele Tartara <mtartara@google.com>...
Add data collector version data type
Define the new data type and update the DRBD data collector to use it.
Fix comment line position in DRBD data collector
Export dcName information of Drbd data collector
Refactor ispecs in ipolicy structures
Minimum and maximum instance specs are put together into a single elementof the instance policy. This is in preparation for introducing multiplemin/max specs.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>...
gnt-cluster modify: dis/enabling storage types
This patch extends the 'gnt-cluster modify' command to manipulate the listof enabled storage types. Note that this currenlty does no validationwith respect to whether or not there are instances currently using a storage...
gnt-cluster info: show enabled storage types
This extends the 'gnt-cluster info' command to list the storage typesthat are enabled on the cluster. It also fixes the broken indentationin the 'handleCall' function.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Add 'enabled_storage_types' to the cluster config
This patch adds the cluster's new field 'enabled_storage_types'to the configuration objects in python and haskell.
Signed-off-by: Helga Velroyen <helgav@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Add monitoring HTTP API structure
Add all the supported commands to the API.The actual response is still to be implemented.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Add basic HTTP server functionalities to Mond
Add a stub implementation of the Mond HTTP server to Mond using the Haskellsnap-server library.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Add the core of the monitoring daemon
This commit adds the core infrastructure of the monitoring daemon,and integrates it in the build and test systems.
The actual functionality of the monitoring daemon is still completelymissing.
Typo 'repot' in Server.hs
Signed-off-by: Helga Velroyen <helgav@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Add Mond to the list of possible daemons
Also, add its logfiles and extra log files.
Fix network query field types/names in the Haskell code
The headers/type/descriptions had some differences from the Pythoncode, when checked for exact equivalence.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Fix gnt-backup list -o node via confd
Currently, the 'node' field is declared as a simple config field, sowhen only selecting this fields, the runtime gathering is no longerrun and it's presumed that all nodes have a backup. So the output isnot truthful (instead of just listing the nodes with at least one...
Make gnt-node list -o(p|s)inst_list output stable
Currently, both the Python and Haskell code return the internalinstance list unsorted, which means the output can vary depending onthe phase of the moon (well, the Haskell code actually uses internallya tree, sorted by the instance name, but it's implementation detail)....
Sort instance list in gnt-group list -opinst_list
The Python code currently sorts this, but the Haskell code not.
This should maybe have a test, but I'm not sure how far we want toencode such properties in tests… (and the real reason I'm not addingone is that we don't have a way to generate a random cluster with...
Change node disk/hv_state query in confd
Currently, the Python code returns either FS_UNAVAIL (if theseattributes are None) or the proper dicts. As we don't allow editing ofthese attributes, in most cases they will therefore be FS_UNAVAIL onthe client....
A few style fixes in Ganeti.Network
Side-effects of working on some other network-related stuff…
Add simple Ip4Address/Ip4Network types
This patch adds some very simple IPv4 address/network types, and usesthem in the 'Network' config object.
We need these in order to properly compute the reserved IP addresses,without depending on an external library (which I haven't found, by...
Add missing external_reservations query field in confd
Based on the implemented Ip4Network/Address types, we can now computethe (external) reservations.
Fix style error in hconfd
The first line of a function should be blank, unless it is able to contain thewhole function.
Merge branch 'devel-2.7'
Signed-off-by: Iustin Pop <iustin@google.com>...
Change hbal behaviour in case of early exit
Currently, hbal exits with status 1 if early exit is requested, evenwhen all jobs are successful. This is counter-intuitive behaviour, solet's fix it (Issue 386).
Note that the man page had conflicting information already, so it's a...
Conflicts: Makefile.am (curl changes and new hs directories)...
Constants.hs.in: improve Haddock markup in the template
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Switch Attoparsec parser from double to rational
According to the documentation, “This function is almost ten timesfaster than rational, but is slightly less accurate. For 94.2% ofnumbers, this function and rational give identical results, but forthe remaining 5.8%, this function loses precision around the 15th...
Fix node partial name matching in Haskell code
This implements QffHostname and fixes the node listing (as well asexport listing when filtering on node name).
This bug was hidden by the fact that node listing with "gnt-node listaa" works if you don't have live queries (as it was originally), as...
Fix bug in group queries related to node/instance fields
Since we use the primitive string type for group UUIDs, the groupfields have a bug where we pass the group name as filter for nodetests, whereas the nodes themselves use the group UUID. This results...
Abstract the individual query functions
After implementing a few of the query executor functions, it turns outthat we have the same general pattern:
- compile the filter- extract the selected fields- determine whether we need to run collectors- do a first pass filtering...
Allow confd to serve network list-fields queries
The fields are not yet complete, but at least we can enable thelist-field query to see what is there already.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Rename/make uniform the other query entities
Following the new naming style introduced in Exports.hs, this patchrenames the other resources to export non-qualified names (fieldMap asopposed to nodeFieldMap), and to use qualified module imports.
Also fixes a haddock issue in a docstring....
Fix improperly formatted docstring
Change the docstring of chompPrefix to prevent the error"doc comment parse failed" that was raised by some version ofhaddock while generating the documentation for this function.
Fix low verbosity levels in htools
In a few cases, we tested the verbosity level for (== 0), instead ofhigher/lower than a certain value. If the user passes multiple"--quiet" options, this can result in negative verbosity levels, whichbehave like "extra verbosity"....
Fix confd issue regarding --no-lvm-storage
If cluster is initialized with --no-lvm-storage then volume_group_namedoes not exist in config.data. Thus we must define it as optional inconfd.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>[iustin@google.com: fixed Haskell RPC definition]...
HRoller: print only online nodes
To make the graphs work even when instances live on offline nodes (eg.because we're offlining them just to exclude them, or because they haveinstance still on them) we just filter them out at the end, when we'regoing to print out the result....
HRoller: allow filtering by node group
Accept the -G option, and if it's passed require that it matches anodegroup, then only output nodes belonging to that group.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Improve the rpc-test program
This is an ugly patch, sorry. It adds the following features torpc-test, to help with (stress) testing the Haskell RPC client:
- customisable repeat count for the RPCs- customisable parallelisation factor- options to show timing stats and other information...
Implement Export queries in Haskell
This is a simple query as it has only two fields, however it's thefirst query that doesn't have a clear 'base' object and 1:1correspondence between such objects and the results (groups, nodes andnetworks do so).
We keep nodes as the 'base' object, since that's what we want to...
Add export_list RPC call definitions
This is straightforward, as the call has no parameters and a very simple return type.
Improve TemplateHaskell code to support empty objects
Currently, an empty objects will generate warnings as the arguments ofvarious functions are unused. By adding conditional code for this, wecan support generation of empty objects, e.g. like needed in Rpc code....
Fix another docstring typo
… no comment :)
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Switch the curl bindings from optional to required
Currently, we support curl being optional via some sporting exercises:ifdefs in the code, data types that represent 'Curl is disabled'state, etc. However, with the future work on RPC, we would have toeven make the dependencies list conditional on it, etc. This is too...
Reduce duplication of curl options computation
Some curl option are request-specific, but not node specific: e.g. rpctimeout, etc. The patch changes the HttpClientRequest type so that wecan pre-seed such options, instead of rebuilding the list in eachindividual request execution....
Simplify RPC error cases
This patch removes the node from the RPC error constructursCurlLayerError and OfflineNodeError. The rationale is that we anywayreturn tuples (node, result), and removing this duplication allowssimplified signatures/calls in the execution of RPC calls....
Add two utility functions for handling Either lists
These two functions permit operating in bulk on only the Left or Rightvalues in the original list, then reassembling the list back in theoriginal order.
Add a Ganeti-specific implementation of Curl Multi
As we want to be able to run queries against multiple nodes inparallel, and furthermore in parallel with other work, we need toimplement the Curl Multi interface (see libcurl-multi(3)).
This patch adds a Ganeti-specific such implementation, to be used...
Switch the RPC module over to the multi interface
This replaces the very-basic parMap of IO actions (fully serialised,as parMap won't work here), to the multi interface.
This makes a simple "time gnt-node list" on a 6-node cluster go from3.2s to ~0.9s, and allows even better parallelisation - before,...
Status change reason support for Reboot
Add support to the Reboot command for specifying the reason for the laststatus change.
Some features are implemented as functions, even if used only once, becausethey will be used by the future patches introducing reason support for all...
Infrastructure for specifying instance status change reason
This patch introduces some infrastructural modifications that will be used bythe following commits to implement the support for specifying the reason forthe last status change of an instance....
Unit tests for Query/Network.hs
This patch adds a couple of unit tests for Query/Network.hs.Note that they'll need to be adapted, once issue 362 is addressed.
Make Confd client usable for testing
Allow the Confd client to be able to connect to an arbitrary serverinstead of just the real one running on a cluster.
This is meant to be used for testing.
If a data collector using the Confd client is tested with shelltest, it will...
Add Haskell parser for "xm uptime"
In order to fetch precise information about the uptime of the VMsrunning in Xen, we need to analyze the output of the "xm uptime" command.
This commit adds the parser to do that, and its tests.
Add Haskell parser for "xm list --long"
In order to fetch precise information about the status of the VMs running inXen, we need to analyze the output of the "xm list --long" command.
Add request type to Confd server for getting instance list
Add to Confd server a new request type (and its implementation) to ask forthe list of instances in a node.
Add a function to change an OpCode's priority
This simply updates the metaopcode submit priority.
Add functions to parse CLI-level format of priorities
The current serialisation format for submit priorities isinteger-based, same as the opcode json serialisation. But for CLIlevel, we need to support a string-based format, so we add functionsto parse and format this representation....
Add CLI-level option to override the priority
This just defined the new priority, with the same name as the Python one.
Enable use of the priority option in hbal
This patch adds the option to hbal, and uses it to tweak the submittedjobs. There are also two small shelltests for testing the parsing.
Make hbal opcode annotation more generic
Currently, hbal code always uses annotateOpCode function, which meanswe would have to pass the options data to all function in the callchain if we wanted to make this more flexible.
By abstracting the type of the annotator and passing it as an argument...
Removes check for conflicts from NetworkDisconnect
This removes the check for conflicts from the Haskellversion of the OpCode NetworkDisconnect. This alignesthe Haskell code with the patch"Force conflicts check in LUNetworkDisconnect" (whichis currently under review). I will submit these patches...
Remove network_type slot (Issue 363)
This slot was not used by Ganeti so the same info can beprovided via tags. In order not to break configuration datawe add a FromDict() method in Network config object thatremoves the deprecated network_type (if found) and then invoke...
Remove family and size from network objects
This info is not used by Ganeti and therefore is removed.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>Reviewed-by: Guido Trotter <ultrotter@google.com>
Make ParticalNic's network field of type String
This was applied to "master" along with extra changes affecting themaster branch only. Cherry-picking just the Objects.hs change.
Remove use of 'head' and add hlint warning for it
Since 'head' is unsafe to use in most cases, this patch removes itsuse from most of the code, adds a lint warning for it (and for tail aswell), and adds override annotations in the few cases where it'sactually OK to use it (mainly when using head over the result of...
Harep.hs: fix a couple typos in comments and docstrings
Signed-off-by: Dato Simó <dato@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Making ParticalNic's network field of type String
This is yet another fix for type confusion between pythonand haskell. ;) The network field of PartialNic should bea string and not of type Network. This makes it necessaryto add a helper function to look up a network by name...
Fix typo in a comment
Fix Haskell log file naming after virtual cluster changes
Commit 3329f4de changed the Haskell log file from constants tofunctions, but introduced a bug: it uses now the daemon name insteadof the correct log file, which means "ganeti-confd.log" instead of...
harep: create repair jobs
Implement 'doRepair' to create a repair job from a list of opcodes ifthe instance's policy allows it (otherwise set an ENOPERM result label),and the instance was previously healthy (i.e. not in ArFailed orArPendingRepair)....
harep: do not wait for repair job completion to set tags
Because of instance locks, after submitting a repair job we weren't able toset the "pending" tag until at least the first opcode of the job finished.Introduce a small delay in the repair job so as to allow the subsequent...
harep: finish execution with a summary of instance states
The harep tool prints messages for every action that it performs (or thatit doesn't perform). In case nothing is to be done at all, always printsome statistics of the current state of the cluster....
harep: initial parsing of tags
Parse auto-repair tags to set each instance in one of ArHealthy, ArFailed,or ArPendingRepair. The implementation tries to be well behaved when oldtags have been left behind, which future patches will still try not to do....
harep: check for completed jobs at the start of the program
As a first step before detecting any brokeness with instances, see if anyof our previous repairs have completed, and move instances to ArFailed orArHealthy accordingly. Do nothing if there are still running jobs for the...
harep: pure function to detect brokeness with instances
Add a 'detectBroken' function that determines whether an instance is in anunhealthy state, and what's needed to repair it. The repair is specified asan AutoRepairType constant, and a list of opcodes. The opcodes will only be...
Program/Harep.hs: add skeleton for the new auto-repair tool
harep(1) detects certain kind of problems with instances and applies theallowed set of solutions. See doc/design-autorepair.rst.
Signed-off-by: Dato Simó <dato@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Fix typo ("Abrreviation" -> "Abbreviation" in Common.hs)
Jobs.hs: add a submitJobs function
This new 'submitJobs' function doesn't wait for job completion, it justreturns the job IDs immediately.
Jobs.hs: add an execJobsWaitOk function
This new 'execJobsWaitOk' returns "Bad" not only if an error occurs, but ifany of the submitted jobs doesn't succeed.
Jobs.hs: start with a shorter delay in waitForJobs
Instead of waiting 15 seconds every iteration of the waiting loop, startwith 0.5 seconds and increment exponentially until a certain maximum isreached. Keep the maximum at 15 seconds.
Signed-off-by: Dato Simó <dato@google.com>...
HTools/Types.hs: minor adjustments to auto-repair types
In particular:
- make ArHealthy take an optional AutoRepairData; this allows to represent the situation where a repair completed successfully, and hence there's an associated tag we might want to know about....
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
CLI.hs: fix double spaces in option help strings
Some help strings with continuation backslashes ('\') were providing aspace both before and after the backslash, resulting in double spaces inhelp output. Provide it only after the backslash, which fixes the issue and...
Add fields 'inst_cnt' and 'inst_list' to network queries
This adds the fields 'inst_cnt' and 'inst_list' to theHaskell implementation of the network queries.
Signed-off-by: Helga Velroyen <helgav@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Moving network query helper functions to Network.hs
So far, a couple of helper function for the networkqueries resided in Config.hs. I figured it makes moresense to move them to Query/Network.hs, since they arevery tailored to that purpose.
More fields for network queries
This adds more fields to the network queries:- group_cnt, free_count, reserved_count, and map
First part of Network Queries in Haskell
This is the beginning of the implementation of networkqueries. This includes establishing all infrastructureto run the network queries and implement querying ofsome simpler fields and the node group listing.
Convert Maybe results to RSUnavail
When displaying query results of type Maybe, one could use thefunction rsMaybe. Unfortunately, this function maps 'Nothing'values to RSNoData which gets displayed as '?' in the list ofquery results. These semantics do not fit if the result is of...
Extend config by networks and networks by UUIDs
For network queries to work, we need to extend the generalconfig type to include all available networks. Additionally,we add UUIDs to the network type itself.
Typo in comment of network type
Renames and cleanup of variable names in confd
The current names are quite confusing; this patch cleans up theconfusion by making sure we use different terms for the two threads,etc.
No actual code changes besides the renames.