Fix opcodes and parameters
Merge branch 'stable-2.9'
Merge branch 'stable-2.8' into stable-2.9
Update Harep, Query server, and tests
Update Harep, Haskell query server, and tests concerning Luxi andopcodes to reflect the changes to Haskell to Python opcodegeneration. This change is necessary because TagObject is replaced byTagKind and some types in opcodes and parameters changed to be...
Add cleanup parameter to instance failover
Most of the code is shared with instance migrate, so we actually only needto add the parameter and pass its value along the the common code.
Also, tests and harep are updated to support the right set of options to...
hroller: option --full-evacuation
Add an option to hroller, to plan for full evacuation of thenodes to be rebooted, i.e., also plan for replacement secondarynodes for all instances on the node after migrating out instanceswith this node as primary.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Extract a partition functional
Separate the partitionNonRedundant function in hroller into ageneral functional that partitions a list of nodes accordingto some clearing strategy and the specialization of movingnon-redundant instances out. In this way, we don't have to...
Extract functional for greedily clearing nodes
The method clearNodes in hroller greedily clears nodes ofnon-redundant instances by moving them to a different node. This patchseparates the greedy clearing algorithm from the specialization tonon-redundant instances; in this way, we don't have to duplicate code...
Make hroller not consider offline nodes for evacuation
When planing on where to evacuate the non-redundant instancesof the nodes to be rebooted, it doesn't make sense to consideroffline nodes. So add this restriction to hroller.
Update comments in hroller code
hroller schedules moves of instances to have rebooted nodesfree of instances with this node as primary. Update the commentsto reflect that this move planning is for non-redundant instancesonly.
Remove obsolete TODO
Originally, hroller started as a tool for offline maintenance only.There it made sense to warn about instances still running. By now,default planning is to migrate instance off the nodes to be rebooted,with options for other behavior (like pretending that all instances...
Index instances by their UUID
No longer index instances by their name but by their UUID in the clusterconfig. This change changes large parts of the code, as the followingadjustments were necessary: * Change the index key to UUID in the configuration and the...
Index nodes by their UUID
No longer index nodes by their name but by their UUID in the clusterconfig. This change changes large parts of the code, as the followingadjustments were necessary: * Change the index key to UUID in the configuration and the ConfigWriter, including all methods....
Add type annotation to avoid monomorphism restriction
Even though we need the let-bound variable showMoves onlyat type [(String, String)] -> IO (), it's most general typewould be (PrintfArg a, PrintfArg b) => [(a, b)] -> IO ().This causes the monomorphism restriction apply to that binding,...
add option --print-moves to hroller
If non-redundant instances are present in the cluster, hroller willplan for them to move to other nodes while the group is rebooted.This adds an option to also show this plan.
hspace prints info about spindles
Statistics about spindles are tracked. In human-readable output, spindlesare printed only when used (i.e., exclusive storage is enabled). Formachine-oriented output, they are always there.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>...
Spindles become part of htools resource spec
Spindles are now part of resource spec. Instances get created with spindlesspecified (which are just ignored when exclusive storage is disabled).
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>Reviewed-by: Klaus Aehlig <aehlig@google.com>
Add spindles to instance disks in htools
A new data type is introduced for disks to store both size and spindles.When available, spindles are filled with input data. Except for loading andstoring, spindles are ignored.
Restrict instance moves in hroller to the same node group
When scheduling rolling reboots, hroller looks for nodes to evacuatethe non-redundant instances to. This is done by greedily movinginstances to other nodes that can take them, policy wise and capacity...
Merge branch 'stable-2.8' into master
Parse NIC data from allocation request in hail
Add a NIC type and extend the Instance type by a list of NIC's. Parsethe NIC's in allocation requests and store them for now. Later patcheswill make use of this field in order to ensure that the requestedinstance is only placed in node groups wich are connected to those...
hroller: option to ignore non-redundant instances
Add an option to hroller restoring the old behavior on not takingany non-redundant instances into account when forming rebootgroups.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Make hroller also plan for non-redundant instances
Non-redundant instances need to be moved to a different nodebefore maintenance of the node. Even though they can be moved toany node, there must be enough capacity to host the instances of thereboot group to be evacuated....
hroller: option to skip nodes with non-redundant instances
So far, hroller ignores the fact, that non-redundant instances exist.One option to deal is non-redundant instances is to not schedule thosenodes for reboot. This is supported by adding the option --skip-non-redundant....
Remove trailing whitespace
Support online-maintenance in hroller
Make hroller take into account the nodes (redundant) instanceswill be migrated to. This be behavior can be overridden by the--offline-maintenance option which will make hroller plan underthe assumption that all instances will be shutdown before starting...
Add option --one-step-only to hroller
Add a new option to hroller to only output information about the firstreboot group. Together with the option --node-tags this allows for thefollowing work flow. First tag all nodes; then repeatedly compute thefirst node group, handle these nodes and remove the tags. In between...
Sort reboot groups by size
Make hroller output the node groups not containing the master nodesorted by size, largest group first. The master node still remainsthe last node of the last reboot group. In this way, most progressis made when switching back to normal cluster operations after the...
Fix lint errors (redundant bracket)
Add option to hroller to select nodes based on tags
Add option --node-tags to tell hroller to consider only nodeswith these tags. A use case would be a tag tracking on whichnodes the maintenance has not yet been carried out, e.g., ifrolling reboots are interleaved with other cluster operations....
Make hroller filter the nodes before coloring the graph
Hroller used to first compute a coloring of the node graph and thenfilter out the nodes that it had to work on. While the only filteringwas according to node groups this did not make a difference, as there...
hspace: Handle multiple ipolicy specs
With tiered allocation, hspace uses all the max specs in turn as theinitial instance spec.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Add multiple min/max specs in instance policy
Now instance policies can contain more than one min/max specs. This is themain element of the "Constrained instance sizes" section in the"Partitioned Ganeti" design doc.
This is a big patch, but changing the type of a configuration item requires...
Make hroller insist on finding precisely one master node
As people rely on the master node being the last node of the lastgroup, make hroller fail, if no master node could be found in thecluster. This happens, e.g., if a backend format is used that does not...
Make Hroller present master node last
If in the list of nodes to be scheduled for maintaince,one is marked as being the master node, schedule itas the last node in the last group.
Fix warnings hlint 1.8.43 complained about
These lines are ok according to previous versions of hlint but triggeran error with version 1.8.43.
Signed-off-by: Thomas Thrainer <thomasth@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Merge branch 'devel-2.7'
Make the disks parameter available to the constructor
In that way, tools building on Instance will benefit from the correctedverification semantics of the instance policy on disk space.
Refactor ispecs in ipolicy structures
Minimum and maximum instance specs are put together into a single elementof the instance policy. This is in preparation for introducing multiplemin/max specs.
Signed-off-by: Iustin Pop <iustin@google.com>...
Change hbal behaviour in case of early exit
Currently, hbal exits with status 1 if early exit is requested, evenwhen all jobs are successful. This is counter-intuitive behaviour, solet's fix it (Issue 386).
Note that the man page had conflicting information already, so it's a...
Fix low verbosity levels in htools
In a few cases, we tested the verbosity level for (== 0), instead ofhigher/lower than a certain value. If the user passes multiple"--quiet" options, this can result in negative verbosity levels, whichbehave like "extra verbosity"....
HRoller: allow filtering by node group
Accept the -G option, and if it's passed require that it matches anodegroup, then only output nodes belonging to that group.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
HRoller: print only online nodes
To make the graphs work even when instances live on offline nodes (eg.because we're offlining them just to exclude them, or because they haveinstance still on them) we just filter them out at the end, when we'regoing to print out the result....
Enable use of the priority option in hbal
This patch adds the option to hbal, and uses it to tweak the submittedjobs. There are also two small shelltests for testing the parsing.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Make hbal opcode annotation more generic
Currently, hbal code always uses annotateOpCode function, which meanswe would have to pass the options data to all function in the callchain if we wanted to make this more flexible.
By abstracting the type of the annotator and passing it as an argument...
Remove use of 'head' and add hlint warning for it
Since 'head' is unsafe to use in most cases, this patch removes itsuse from most of the code, adds a lint warning for it (and for tail aswell), and adds override annotations in the few cases where it'sactually OK to use it (mainly when using head over the result of...
Harep.hs: fix a couple typos in comments and docstrings
Signed-off-by: Dato Simó <dato@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
harep: finish execution with a summary of instance states
The harep tool prints messages for every action that it performs (or thatit doesn't perform). In case nothing is to be done at all, always printsome statistics of the current state of the cluster....
harep: do not wait for repair job completion to set tags
Because of instance locks, after submitting a repair job we weren't able toset the "pending" tag until at least the first opcode of the job finished.Introduce a small delay in the repair job so as to allow the subsequent...
harep: create repair jobs
Implement 'doRepair' to create a repair job from a list of opcodes ifthe instance's policy allows it (otherwise set an ENOPERM result label),and the instance was previously healthy (i.e. not in ArFailed orArPendingRepair)....
harep: pure function to detect brokeness with instances
Add a 'detectBroken' function that determines whether an instance is in anunhealthy state, and what's needed to repair it. The repair is specified asan AutoRepairType constant, and a list of opcodes. The opcodes will only be...
harep: check for completed jobs at the start of the program
As a first step before detecting any brokeness with instances, see if anyof our previous repairs have completed, and move instances to ArFailed orArHealthy accordingly. Do nothing if there are still running jobs for the...
harep: initial parsing of tags
Parse auto-repair tags to set each instance in one of ArHealthy, ArFailed,or ArPendingRepair. The implementation tries to be well behaved when oldtags have been left behind, which future patches will still try not to do....
Program/Harep.hs: add skeleton for the new auto-repair tool
harep(1) detects certain kind of problems with instances and applies theallowed set of solutions. See doc/design-autorepair.rst.
Signed-off-by: Dato Simó <dato@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Loader.hs: ignore expired ArSuspended policies
At the moment, because 'mergeData' is pure, it may set instance auto-repairpolicies that are of the form `ArSuspended $ Until timestamp_in_the_past`.If later on the auto-repair tool notices this, it has lost access to what...
Fix a bad data type in Hcheck.hs
While trying to understand why some code was not being tested, Irealised that we have a bad data type in Hcheck.hs.
We have "data Level = GroupLvl | ClusterLvl", but then we need to passthe group name/index as well, so we have functions that look like the...
Move src/Ganeti/HTools/Program.hs to Program/Main.hs
This removes one more tab conflict; this is the last module in ourcode where we have both x.hs and x/.
Furthermore, we collapse all actual code into the new Main.hs module,leaving the htools.hs basically empty (will allow better testing in...
Rename htools/ to src/
Per offline discussions, this is the first patch of therenames. Tested with "make distcheck", seems to work fine.
The only change outside of the renaming is a bit of simplification inthe .gitignore rules; otherwise, simply s/htools/src/....