Remove the noLimit values and always use limits
This patch moves away from allowing no-limits for disk/cpu ratios, and always uses a real limit. For disk, it's simple since we use 0, which means no reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,...
Fix hspace's KM metrics
We returned the KM_POOL_* metrics as the final state, not as the delta between the final and the initial state.
Fix Node hiCpu computation
In case we're not enabling limits, let's restrict this to -1, instead of -1 times the number of pcpus.
Add a new function to compute allocation deltas
Given two cluster states, the new function can answer the following questions:
- how many resources are currently allocated
- how many resources are finally allocated (delta from above is how much we can actually allocate on the cluster)...
Introduce total vcpu tracking in CStats
We add a new field that tracks the available virtual cpus (expressed as node cpus times the vcpu ratio).
Merge branch 'master' into next
Fix iallocator crash when no solutions exist
Commit 5436576 added an un-guarded `head' call, which crashes with “Prelude.head: empty list” when no results exist for the per-instance allocation/relocation calls.
This patch fixes this, and also adds another check for an unguarded...
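A minimal sketch of guarding such a call, assuming a hypothetical `Solution` type and error text (neither is taken from the actual patch):

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical result type standing in for an allocation solution.
type Solution = (String, Double)

-- An unguarded `head sols` raises "Prelude.head: empty list" on [].
-- The guarded version turns an empty result list into an error value.
bestSolution :: [Solution] -> Either String Solution
bestSolution sols =
  case listToMaybe sols of
    Nothing -> Left "No valid allocation solutions"
    Just s  -> Right s
```

Callers then have to handle the `Left` case explicitly instead of crashing at runtime.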
Fix IAllocator multi-evacuate message
Since Ganeti passes full host names (not common-suffix-stripped), we need to remove the suffix from the evac_nodes keys too. In case one node is not part of the cluster, it will lead to a wrong error message, but for now it fixes the problem.
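The suffix removal can be sketched as follows; `stripSuffix` and `normaliseKeys` are hypothetical names for illustration, not the functions used in the patch:

```haskell
import Data.List (isSuffixOf)

-- Drop a known common suffix (e.g. the cluster's domain) from a name;
-- names that do not end with the suffix are returned unchanged.
stripSuffix :: String -> String -> String
stripSuffix suffix name
  | not (null suffix) && suffix `isSuffixOf` name =
      take (length name - length suffix) name
  | otherwise = name

-- Apply the same normalisation to every evac_nodes key.
normaliseKeys :: String -> [String] -> [String]
normaliseKeys suffix = map (stripSuffix suffix)
```

A name from outside the cluster simply passes through unchanged, which matches the "wrong error message" caveat above: the later lookup fails on the unstripped name.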
Fix a haddock comment issue
For some versions of haddock, this can create problems.
Abstract instance running states into a list
This replaces some manual checks in a few places in the code with a single list defined once.
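As an illustration, with status strings that are assumptions rather than values quoted from the patch:

```haskell
-- Assumed status strings, for illustration only; the real list lives in
-- the htools source and may differ.
runningStates :: [String]
runningStates = ["running", "ERROR_up"]

-- One membership test replaces the scattered manual comparisons.
isRunning :: String -> Bool
isRunning status = status `elem` runningStates
```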
A number of small fixes from hlint
Fix unused-do-binds for ghc 6.12
GHC 6.12 has some new warnings, which are valid in most cases except (IMHO) printf usage.
Fix unused imports for ghc 6.12
GHC 6.12 has become more picky about unused imports, so we need to remove/tighten some of them.
hscan: implement LUXI backend scanning
This allows hscan to work also with NO_CURL (but only for the local machine, of course).
Loader: abort for unknown to-be-excluded instances
balance function: use the movable flag directly
Instead of deciding based on secondary node, use the new flag.
Update the loader pipeline to set the movable flag
This updates the movable flag on instances if they have only one node (we don't rely on OpMoveInstance) or if they are set so via the command line options.
This doesn't yet enable the use of the new flag.
Add a 'movable' flag on instances
This will be used instead of checking for no secondary and for simplifying 'do not touch' instances.
Add an option for excluding instances from moves
Add a tryEvac function
This will be used by the node evacuate IAllocator request type.
Signed-off-by: Iustin Pop <iustin@google.com>
Implement IAllocator node evacuate request
This patch adds the new request loading/execution (trivial), but the actual response formatting becomes more difficult as now the response type differs by request.
IAllocator: move some keys into per-request data
Since not all structures will have these keys in the future, we move them into per-structure keys.
Change an internal type from Maybe to list
In preparation for multiple responses, we change from Maybe to List (both used in the container sense).
This allows us to keep the same workflow for all kinds of requests.
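A small sketch of the idea, assuming a hypothetical `Solution` type: a single-answer request yields a list of zero or one elements, so the same code path can serve requests with many answers.

```haskell
import Data.Maybe (maybeToList)

-- Hypothetical solution type: a score and the chosen node names.
type Solution = (Double, [String])

-- A single-result request wraps its answer in a list of length 0 or 1.
fromSingle :: Maybe Solution -> [Solution]
fromSingle = maybeToList

-- The same formatting code then serves every request type.
describe :: [Solution] -> String
describe []   = "no solution"
describe sols =
  show (length sols) ++ " solution(s): " ++ unwords (concatMap snd sols)
```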
Move a type declaration to Node.hs
We'll need AllocElement in both Cluster and IAlloc in the future, so we move it to Node.hs which is imported by both.
Implement evacuation mode in hbal
This mode restricts the list of instances to be moved to the instances living on the offline (and drained) nodes.
Add an evac mode CLI option
Reorder options in CLI.hs
This should be no code change, just reordering of the options.
Fix secondary node selection for existing N+1
In case a secondary node is already N+1 failed, currently the node selection will accept a node that cannot start (at all) the new instance as valid. This is wrong, so we add a new simple check to prevent the...
Rewrite the node add checks for simpler layout
This will make it clearer than many if…then choices.
Move instance relocation test upper in the chain
Currently we test each instance for relocation in checkMove; however, it is a little more clear if we pass only the relocatable instances to checkMove. The patch also slightly rewrites (indentation/style) the...
Split the balancing function in two parts
Currently in the balancing function we do two things:
- take the decision whether to do a new balancing round or not
- and actually computing the balancing round
This is not nice, as the two parts are conceptually separate, so this...
Fixing a typo in option description
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Switch the text file format to single-file
This patch changes from the two separate files to a single file, with sections separated by a blank line. Currently only the node and instance data is accepted, later the cluster tags will be read too via this format....
Change the signatures of the text loader slightly
This is in preparation for the text format changes.
Convert n1_score metric from % to count
This increases the priority of fixing N+1 failures compared to balancing metrics.
Metric: count of primary instances/offline nodes
This helps with evacuation/failover of instances on 2-node clusters with one node offline.
Offline instance metric: change from % to count
Currently we use the offline instance percentage (with range [0, 1]), but this is not good, since we want the evacuation of such instances to have a high priority; therefore we change this to a count of offline...
Use the oper_ram field if available
For the RAPI and LUXI backends, we can get the actual memory usage (if instances are running) via the oper_ram, whereas backend/memory only tells what the instance will use at the next boot.
Not using oper_ram means that the node model is flawed and we consider...
rapi, luxi: treat drained nodes as offline
Commit e97f211 changed the iallocator backend to handle drained nodes as offline. This commit completes that change by making the rapi and luxi backend do the same (the text backend ignores any '?' values which are...
Fix typo breaking LUXI backend
This really shows the need for actual dist-time full testing (not unittests).
Fix unittests after instance tags addition
Configure exclusion tags via the cluster tags
This patch adds reading of the exclusion tags from the cluster tags: any tags starting with htools:iextags: will convert their suffix into an exclusion tags prefix. In other words, "htools:iextags:service" will...
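The conversion can be sketched as below. Only the `htools:iextags:` prefix comes from the message above; the function names and the exact way instance tags are matched against the resulting prefixes are assumptions.

```haskell
import Data.List (isPrefixOf, stripPrefix)
import Data.Maybe (mapMaybe)

-- Marker prefix named in the commit message.
exTagsPrefix :: String
exTagsPrefix = "htools:iextags:"

-- Cluster tags -> exclusion tag prefixes (the suffix after the marker);
-- cluster tags without the marker are ignored.
exclusionPrefixes :: [String] -> [String]
exclusionPrefixes = mapMaybe (stripPrefix exTagsPrefix)

-- Keep only the instance tags that start with one of the prefixes.
-- (Plain prefix matching is an assumption made for this sketch.)
filterExTags :: [String] -> [String] -> [String]
filterExTags prefixes = filter (\t -> any (`isPrefixOf` t) prefixes)
```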
Read cluster tags in the IAllocator backend
Read cluster tags in the LUXI backend
Read cluster tags in the RAPI backend
This also shows them in hbal in verbose mode.
Introduce support for reading the cluster tags
While these are not actually populated from the backends, and all the programs ignore them, this patch contains the changes in the function types required.
Add a command-line option to filter exclusion tags
Since we don't want all instance tags to be used for exclusion, we add a command line option to filter on these. Since the iallocator protocol cannot accept command line options, currently it's not possible to...
Add a new node list field
This patch adds a new node list field (ptags), showing the primary instance tags.
Node: add function for conflicting primary count
Use conflicting primaries count in cluster score
This small patch adds the number of conflicting primaries in the cluster score. This is different from the other non-CV metrics where we usually compute the percentage of failing instances (for that metric); but for a...
Specialize the math functions
The statistics functions are currently defined as polymorphic with a Floating constraint. Changing this to monomorphic on Double type makes them stricter and much more performant (~70% speedup). This is a cheap way to recoup some of the losses incurred by the recent proliferation of...
Collapse the statistical functions into one
This allows us to get rid of two duplicate list length computations, with a minor speedup.
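A sketch of what a collapsed, Double-specialised statistic could look like. The name `stdDev` and the choice of the population standard deviation are assumptions; the log does not show the real function.

```haskell
import Data.List (foldl')

-- Monomorphic on Double. The sum and the element count come from one
-- pass, and the count is shared by the mean and the deviation instead
-- of being recomputed. An empty list yields NaN in this sketch.
stdDev :: [Double] -> Double
stdDev xs =
  let (total, count) = foldl' (\(s, n) x -> (s + x, n + 1)) (0, 0) xs
      mean = total / count
      sqDiffs = foldl' (\acc x -> acc + (x - mean) ^ (2 :: Int)) 0 xs
  in sqrt (sqDiffs / count)
```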
Introduce tag-based exclusion of primary instances
This patch introduces exclusion of primary instances based on tags. This is incomplete as currently all tags are being excluded, and we don't optimise towards relocation of instances sharing tags on the same node.
Add a tags attribute to instances
… and read it in all the loaders. hscan is modified to save it to the files it generates.
The attribute is not yet used in any place.
Small change in some list arguments
This is simpler than the concat operator.
Allow overriding the field list in -p
The print nodes option can now accept an optional field list to customise the output. This is ugly, since the field names do not match the header names, but it is at least barely customisable (at runtime).
Move more node-listing functionality in Node.hs
This will prepare for the runtime-selectable field list.
Change the default dynamic usage to baseUtil
This fixes the unbalanced secondary instances on partially empty clusters, and helps in general for the cases where real utilisation data is not available.
Add a few comments in the scoring function
Enhance the error reporting for Rapi and Luxi
Currently the JSON conversion in Rapi and Luxi gives something like: Error: failed to load data. Details: Unable to read Double
This doesn't tell one where the error is (in a node specification? and...
Change the Utils.fromObj signature
Currently the fromObj function takes a JSON object which is then converted into a list of (String, JSValue) in which we make a lookup. However, most of the callers of this function call it repeatedly on the same object, which means we do the object→list conversion repeatedly....
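The shape of the change can be sketched without the JSON library. Here `Value` is a stand-in for JSValue, and the signature shown is an assumption about the new form, not a quote from the patch.

```haskell
-- Stand-in for a JSON value; the real code uses the JSON library's type.
data Value = VNum Double | VStr String deriving (Show, Eq)

-- Takes the already-converted key/value list, so a caller converts the
-- object once and then performs as many lookups as it needs.
fromObj :: String -> [(String, Value)] -> Either String Value
fromObj key fields =
  maybe (Left ("key '" ++ key ++ "' not found")) Right (lookup key fields)

-- Several lookups over the same list, with no repeated conversion.
memAndDisk :: [(String, Value)] -> Either String (Value, Value)
memAndDisk fields = do
  mem  <- fromObj "memory" fields
  disk <- fromObj "disk" fields
  return (mem, disk)
```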
Make some CLI options more consistent
Both the simulate and the tiered allocation mode take a machine spec on input via a comma-separated list. This patch makes this a little bit more consistent (always use disk,ram,cpu in this order).
hspace: show tiered-alloc stats in the output
This is a first attempt to get a readable output of tiered allocation stats in hspace's output. Not very nice, but it should be somewhat parseable.
A small style change in Node.hs
This imports PeerMap as P and reindents some lines.
Add support for shrinking instance specs
This patch adds a function that, for some given failure modes, shrinks a given instance in the hope that allocation will succeed when retried with the new spec.
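One way such a function might look. The failure modes, the record fields, the step sizes and the lower bounds below are all hypothetical; the log does not give the real values.

```haskell
-- Hypothetical failure modes and instance spec, for illustration.
data FailMode = FailMem | FailDisk | FailCPU | FailN1
  deriving (Show, Eq)

data Spec = Spec { specMem :: Int, specDisk :: Int, specCpu :: Int }
  deriving (Show, Eq)

-- Shrink the resource that caused the failure by one (assumed) step.
-- Nothing means the spec cannot shrink further, or the failure mode
-- is not one that shrinking can help with.
shrink :: FailMode -> Spec -> Maybe Spec
shrink FailMem s
  | specMem s > 128 = Just s { specMem = specMem s - 64 }
shrink FailDisk s
  | specDisk s > 1024 = Just s { specDisk = specDisk s - 256 }
shrink FailCPU s
  | specCpu s > 1 = Just s { specCpu = specCpu s - 1 }
shrink _ _ = Nothing
```

A retry loop would call `shrink` with the reported failure mode and attempt the allocation again until it succeeds or `Nothing` is returned.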
Convert option parsing to a monadic flow
This allows us to do verification of option arguments in the assignment functions themselves.
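A minimal sketch of this style using `System.Console.GetOpt` from base. The `Options` record, the `--max-length` option and the use of `Either String` as the monad are assumptions made for the example.

```haskell
import System.Console.GetOpt

-- Hypothetical options record with a single numeric field.
data Options = Options { optMaxLen :: Int } deriving (Show, Eq)

defaultOptions :: Options
defaultOptions = Options { optMaxLen = 10 }

-- Each option yields a function that may fail, so argument validation
-- lives next to the assignment itself.
type OptSetter = Options -> Either String Options

options :: [OptDescr OptSetter]
options =
  [ Option "l" ["max-length"]
      (ReqArg setMaxLen "N")
      "maximum number of moves (hypothetical option)"
  ]

setMaxLen :: String -> OptSetter
setMaxLen arg opts =
  case reads arg of
    [(n, "")] | n > 0 -> Right opts { optMaxLen = n }
    _ -> Left ("invalid value for max-length: " ++ arg)

-- Thread the options through all setters, stopping at the first error.
parseOpts :: [String] -> Either String Options
parseOpts argv =
  case getOpt Permute options argv of
    (setters, _, []) -> foldl (>>=) (Right defaultOptions) setters
    (_, _, errs)     -> Left (concat errs)
```

With pure setters of type `Options -> Options`, a bad argument could only be detected after parsing; here the fold short-circuits on the first `Left`.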
Rework the instance spec CLI options
This patch reworks the internal handling of the instance spec CLI option, and adds a tiered spec option that will be used in hspace to enable the (auxiliary) tiered-spec allocation mode.
It also introduces a new data type for holding the instance...
Some cleanup of Loader.mergeData
This doesn't need to be a monadic function, let's make it a simpler one.
hbal: ignore unknown instance in dynload file
Since the utilisation file might be generated at a different time from the hbal run, and instances could disappear in the meantime, it's better to simply ignore unknown instances rather than abort.
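The "ignore rather than abort" behaviour can be sketched with `mapMaybe`; the simplified data shapes (association lists, a single load value) and the name `resolveUtil` are assumptions.

```haskell
import Data.Maybe (mapMaybe)

-- Known instances as a name -> index association list (simplified).
-- Utilisation entries whose instance name is unknown are dropped
-- instead of aborting the whole load.
resolveUtil :: [(String, Int)] -> [(String, Double)] -> [(Int, Double)]
resolveUtil known = mapMaybe resolve
  where
    resolve (name, load) = fmap (\idx -> (idx, load)) (lookup name known)
```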
Expand the --print-instances output
This adds run status, resource parameters and load parameters for instances.
Change the Container.findByName function
This patch changes the signature and implementation of the function; returning the item makes more sense (saves a lookup later again in the container, and applying idx is cheap), and the previous implementation was ugly.
Some small style fixes
Simplify the cstats initializer
Since all values are initialized to zero, the exact ordering is not important and thus we can use the positional mode for simpler code.
The patch also adds docstrings to the cstats functions.
Simplify Cluster.computeMoves
Since we now have an actual type for describing the instance moves (IMove), it's simpler to convert this into the move description/move commands, rather than re-computing the move based on initial and final nodes. This makes the shell commands computation and over-Luxi command...
Remove obsolete export
The ‘Placement’ type has been moved to Types.hs but we kept exporting it from Cluster, which is not needed.
Generalise the node/instance listing
This patch introduces a generic formatTable function (based on, and similar to the Ganeti one, but different and more FP in style) and changes the node and instance listing to it.
The node list (due to the many variables) is still a little bit hackish...
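A sketch of what a generic table formatter in this style could look like. The signature and the per-column alignment flags are assumptions; the log names the function but not its interface.

```haskell
import Data.List (transpose)

-- Pad every cell to its column width. The flag per column selects
-- right alignment (True, e.g. for numbers) or left alignment (False).
formatTable :: [[String]] -> [Bool] -> [[String]]
formatTable rows alignRight =
  let widths = map (maximum . map length) (transpose rows)
      pad (w, right) cell =
        let fill = replicate (w - length cell) ' '
        in if right then fill ++ cell else cell ++ fill
  in map (zipWith pad (zip widths alignRight)) rows
```

Joining each formatted row with `unwords` and the rows with `unlines` then gives an aligned listing, with the header passed in as the first row.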
Fix instance listing for non-redundant case
Fix two haddock/happy docstring issues
Start using the utilisation scores in balancing
This enables the per-node load/total available capacity scores to be used in balancing. Note that the total available capacity is currently fixed at zero and cannot be changed by the user.
Add loading and processing of utilisation data
This patch adds loading and processing the utilisation data during instance moves. While the data is not yet used, it is correctly modified by instance changes between nodes.
hbal has the new ‘-U’ command line argument for this. The format of the...
Add an option to input utilisation data
Merge the Node.setPri and Node.addCpus functions
The latter is only used right after the former in the Loader module, and we'll need more of this 'update node with the data of this instance' functionality (which is different than addPri where all information must...
Move some utility functions to Utils.hs
These were already duplicated (Text and Simu) and we need tryRead in more places.
Show the load on nodes in node lists
The strange printf usage is due to some limitation (it seems) in ghc for very long argument lists. The whole printout should be rewritten later.
Add initial structure for utilisation balancing
This patch adds the datatypes and modifies the nodes and instance types to have such attributes. They are not used yet in any way.
Allow displaying the instance map in hbal
This is similar to --print-nodes, but with much fewer fields.
Add an explicit export list to Instance.hs
This exports all functions, but it's still good to have.
More hlint fixes
This makes (for now) the code hlint-clean. This is per se not a huge gain, but it allows easier tracking of regressions in style later (one-two new violations are easier to diagnose when not hidden among 20 “known” ones).
Style change: camel-casing of unittests
Style change: cluster CStats camel-casing
This is again the cs_x to csX name change.
Style change: node and instance attributes
This changes from a_b to aB in all node and instance attributes, to match the standard Haskell style. Also attributes that should have been camel-cased but weren't were changed (e.g. plist → pList, pnode → pNode).
Modify the internals of the detailed CV scores
Before we used a tuple; since we'll need more metrics in the future, it's simpler to transform this into a list of doubles, whose elements are handled homogeneously by all the code that needs them.
Add a command line option for executing jobs
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Change iMoveToJob to properly create migrates
The current Cluster.iMoveToJob always creates failovers, which is not what we want. This simply uses the original instance's status to select between these two (this is not optimal by the way, since the status...
Extend the MoveJob type to hold the instance index
This will be needed in order to generate the proper instance move commands.
Fix haddock issues with tuple members
It seems that haddock cannot document tuple members - but arguably, once one needs to do that, tuples should not be used anymore.
This just moves the comments to the tuple comment.
Signed-off-by: Iustin Pop <iustin@google.com>...
parseNode: don't lookup values in drained nodes
Currently parseNode skips looking for values in offline nodes, but tries to read them for drained ones. With this patch we treat offline and drained nodes in the same way (which is compatible with the iallocator...
Store the instance move in the MoveJobs
This will automatically sort our Ganeti jobs into the independent job sets, and then we can submit them separately.