Fix ReplaceSecondary moves for offline nodes
The addition of a new secondary on a node is doing two memory tests:
- in strict mode, reject if we get into N+1 failure
- reject if the new instance memory is greater than the free memory (not available memory) on the node...
Add unittest for Node text serialization
This checks that the Node text serialization and deserialization operations are idempotent when combined with each other.
Introduce a relaxed add instance mode
In case an instance is living on an offline node, it doesn't make sense to refuse moving it because that would create N+1 failures; failing N+1 is still much better than not running at all. Similarly, if the secondary node of an instance is offline, meaning the instance doesn't...
Update the node list fields
This patch renames the pri/sec to pcnt/scnt, and adds the real primary and secondary instance lists, the peermap and the index of a node as selectable options.
Cleanup a node's peer map when possible
If the last secondary instance of a peer is deleted (detected by the new peer memory value being equal to zero), then the pair (pdx, 0) should be deleted completely. This is not optimization per se, but rather cleanup...
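The cleanup rule described above can be illustrated with a minimal sketch. It assumes the peer map is a plain association list from peer index to reserved memory; the names `PeerMap`, `Key`, `Mem`, `findPeer` and `setPeer` are hypothetical and are not taken from the actual module.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical types: a peer map as an association list (an assumption).
type Key = Int
type Mem = Int
type PeerMap = [(Key, Mem)]

-- | Look up the memory reserved for a peer; a missing peer counts as zero.
findPeer :: Key -> PeerMap -> Mem
findPeer k = fromMaybe 0 . lookup k

-- | Set the memory for a peer. A new value of zero removes the pair
-- entirely instead of storing (k, 0).
setPeer :: Key -> Mem -> PeerMap -> PeerMap
setPeer k v pm
  | v == 0    = others
  | otherwise = (k, v) : others
  where others = filter ((/= k) . fst) pm
```

Because `findPeer` already treats a missing peer as zero, dropping the pair does not change any lookup result; it only keeps the map from accumulating empty entries.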
A few more small Node unit-tests
Accept both full and short names in CLI
This patch introduces some new functionality in the base Element type and in Container which supports searching for all 'known' names of an element, such that both short and full names are accepted for various options like '-O' and '--excluded-instances'.
Stop modifying names for internal computations
Currently the name used internally is modified and holds the shortened name of the nodes/instances. This has caused issues before, since we always have to strip the suffix from input data and reapply it if we...
Add a new node/instance field
This new field ('alias') will hold the shortened/beautified display name. When resetting the name, the alias is reset too, and there's a new function to update only the alias.
Fix some haddock comments
Shuffle some constants around
… and export more functions. This will help with unit testing.
Remove the noLimit values and always use limits
This patch moves away from allowing no-limits for disk/cpu ratios, and always uses a real limit. For disk, it's simple since we use 0, which means no reservations for disks. For CPU, we set an (arbitrary) limit of 64 v/p,...
Fix Node hiCpu computation
In case we're not enabling limits, let's restrict this to -1, instead of -1 times the number of pcpus.
Introduce total vcpu tracking in CStats
We add a new field that tracks the available virtual cpus (expressed as node cpus times the vcpu ratio).
A number of small fixes from hlint
Move a type declaration to Node.hs
We'll need AllocElement in both Cluster and IAlloc in the future, so we move it to Node.hs which is imported by both.
Signed-off-by: Iustin Pop <iustin@google.com>
Fix secondary node selection for existing N+1
In case a secondary node is already N+1 failed, currently the node selection will accept a node that cannot start (at all) the new instance as valid. This is wrong, so we add a new simple check to prevent the...
Rewrite the node add checks for simpler layout
This will make it clearer than many if…then choices.
Node: add function for conflicting primary count
Add a new node list field
This patch adds a new node list field (ptags), showing the primary instance tags.
Introduce tag-based exclusion of primary instances
This patch introduces exclusion of primary instances based on tags. This is incomplete as currently all tags are being excluded, and we don't optimise towards relocation of instances sharing tags on the same node.
Move more node-listing functionality in Node.hs
This will prepare for the runtime-selectable field list.
A small style change in Node.hs
This imports PeerMap as P and reindents some lines.
Some small style fixes
Generalise the node/instance listing
This patch introduces a generic formatTable function (based on, and similar to the Ganeti one, but different and more FP in style) and changes the node and instance listing to it.
The node list (due to the many variables) is still a little bit hackish...
Start using the utilisation scores in balancing
This enables the per-node load/total available capacity scores to be used in balancing. Note that the total available capacity is currently fixed at zero and cannot be changed by the user.
Add loading and processing of utilisation data
This patch adds loading and processing the utilisation data during instance moves. While the data is not yet used, it is correctly modified by instance changes between nodes.
hbal has the new ‘-U’ command line argument for this. The format of the...
Merge the Node.setPri and Node.addCpus functions
The latter is only used right after the former in the Loader module, and we'll need more of this 'update not with the data of this instance' functionality (which is different than addPri where all information must...
Show the load on nodes in node lists
The strange printf usage is due to some limitation (it seems) in ghc for very long argument lists. The whole printout should be rewritten later.
Add initial structure for utilisation balancing
This patch adds the datatypes and modifies the nodes and instance types to have such attributes. They are not used yet in any way.
Style change: node and instance attributes
This changes from a_b to aB in all node and instance attributes, to match the standard Haskell style. Also attributes that should have been camel-cased but weren't were changed (e.g. plist → pList, pnode → pNode).
Turn on, and fix, more warnings
The Makefile was intended to be -Wall and not simply -W, but I missed that. This enables more warnings and also enables -Werror (except for the tests).
Export more stats in hspace
This patch changes Cluster.totalResources to compute more resources andprints them in hspace.
Fix hlint-generated warnings
This big patch cleans up the code per hlint indications. Many removals of extra parentheses, replacements of concat . map with concatMap, extra dollar signs, eta reductions, etc. were performed.
The code still compiles and passes a couple of manual tests on sample...
Return correct failure data from Node.add*
This patch alters the Node.addPri/addSec to return correct failure data. It removes the computeFailN1 function from the module as that used to combine both mem and disk checks in the same function and thus the real...
Introduce a new type for allocation results
Currently the allocation/move operations workflow return ‘Maybe a’, which is very convenient but loses all details about the failure mode.
This patch introduces a new data type which encodes the specific failure...
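To show why a dedicated result type preserves more information than ‘Maybe a’, here is a minimal sketch. The type and constructor names (`FailMode`, `OpResult`, `OpFail`, `OpGood`) and the `place` helper are assumptions chosen for illustration; the description above does not give the real definitions.

```haskell
-- Hypothetical failure reasons (illustrative only).
data FailMode = FailMem | FailDisk
  deriving (Eq, Show)

-- Hypothetical result type: like Maybe, but the failure carries a reason.
data OpResult a = OpFail FailMode | OpGood a
  deriving (Eq, Show)

instance Functor OpResult where
  fmap f (OpGood x) = OpGood (f x)
  fmap _ (OpFail m) = OpFail m

instance Applicative OpResult where
  pure = OpGood
  OpGood f <*> r = fmap f r
  OpFail m <*> _ = OpFail m

instance Monad OpResult where
  OpGood x >>= f = f x
  OpFail m >>= _ = OpFail m

-- | Free resources on a node: (free memory, free disk).
type Free = (Int, Int)

-- | Try to place an instance needing (memory, disk) on a node and
-- report which resource was exhausted when the placement fails.
place :: Free -> (Int, Int) -> OpResult Free
place (fMem, fDsk) (iMem, iDsk)
  | iMem > fMem = OpFail FailMem
  | iDsk > fDsk = OpFail FailDisk
  | otherwise   = OpGood (fMem - iMem, fDsk - iDsk)
```

With a Monad instance, chained operations still short-circuit on the first failure exactly as they would with Maybe, but the caller can now tell a memory failure from a disk failure.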
Add an availDisk node function
This function returns the amount of available disk, which depends on whether a low disk limit has been configured or not and on the free disk space of the node.
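One way such a function could look is sketched below. The record fields `fDsk` and `loDsk`, the `noLimit` sentinel value of -1, and the clamping to zero are all assumptions made for the example, not the documented behaviour of the real function.

```haskell
-- Hypothetical node record holding only the fields this sketch needs.
data Node = Node
  { fDsk  :: Int  -- ^ free disk space on the node
  , loDsk :: Int  -- ^ minimum disk to keep free, or noLimit
  } deriving (Eq, Show)

-- | Assumed sentinel meaning "no low disk limit configured".
noLimit :: Int
noLimit = -1

-- | Disk that may still be allocated: all free disk when no limit is
-- configured, otherwise the free disk above the limit (never negative).
availDisk :: Node -> Int
availDisk n
  | loDsk n == noLimit = fDsk n
  | fDsk n < loDsk n   = 0
  | otherwise          = fDsk n - loDsk n
```

Under these assumptions, a node with 500 units free and a low limit of 100 has 400 units available, while the same node with no limit configured has all 500.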
Add two new autocomputed vars to Nodes
Currently we track the max disk usage/max vcpus as percentages, however sometimes it's easier to check against minimum free disk or maximum number of cpus, as units instead of percentages.
This patch adds two new variables, lo_dsk, hi_cpu, which are recomputed...
Implement cpu/disk limits in instance moves
We modify Node.addPri/addSec to take into account the limits on instance adds.
Add two new node attributes
Two new min disk free ratio and max cpu usage attributes are added to the nodes. These will be used in the future to restrict allocation.
Small changes to the node list output
This is just some cleanup of the node list output, adding pcpu/vcpu counters, and making the display slightly nicer.
Update cpu counters correctly after pinst changes
The cpu counters are updated on primary instance adds/removes.
Add cpu-count-related attributes to nodes
This patch adds cpu-count related attributes to nodes:
- total cpus
- cpus in use
- ratio of virtual:physical cpus
We also set correctly the cpu values at load time, but we don't do anything yet while moving instances around. The cpu ratio is shown in...
Add copyright/license information
This doc-patch adds copyright and license information to (hopefully) all needed files.
Lots of documentation updates
This patch does only doc build changes, doc changes and moving functions around (for more logical documentation). It should have no impact at all on the code.
Finish removal of unused params from PeerMap
This completes the removal started earlier by removing the need to pass the number of nodes to Node.buildPeers, which is now unused.
Add test infrastructure and initial tests
This patch adds a QuickCheck-based test infrastructure and initial tests based on it. The PeerMap module has a 100% coverage ☺
Side-note: one has to read the source of QuickCheck to see how to use it (especially the Batch submodule), the docs are not enough…
Remove unused parameters from PeerMap creation
We remove some unused arguments (added way back for compatibility with Arrays, which we didn't use in the end). This makes the code clearer (and doesn't need the Ndx type to be an instance of Num).
Add type synonyms for the node/instance indices
This is a first step towards full datatype renaming. That requires more changes, so at first we only want to document clearly what is a node index, what is an instance index, and what is a plain Int.
Change the module import hierarchy
This patch makes the Types module a base module, and Node/Instance ones import it, from the previous (opposite) situation. This will allow in the future to use newtypes for the index and name types.
Remove some extraneous uses of ktn/kti
Since we have Node/Instance.name, we can now simplify a few constructs.
Add a small class for Nodes and Instances
Since both nodes and instances support some common functionality (names and indices), we add a class so that we can access these attributes in a generic way.
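A small type class of this kind might look like the sketch below. The class name `Element`, its methods `nameOf` and `idxOf`, the two record types and the `describe` helper are all hypothetical; they only illustrate how a shared class gives generic access to names and indices.

```haskell
-- Hypothetical class exposing what nodes and instances have in common.
class Element a where
  nameOf :: a -> String
  idxOf  :: a -> Int

-- Minimal stand-in records (the real objects have many more fields).
data Node = Node { nodeName :: String, nodeIdx :: Int }
data Instance = Instance { instName :: String, instIdx :: Int }

instance Element Node where
  nameOf = nodeName
  idxOf  = nodeIdx

instance Element Instance where
  nameOf = instName
  idxOf  = instIdx

-- | A generic helper that works for any Element, node or instance.
describe :: Element a => a -> String
describe e = show (idxOf e) ++ ":" ++ nameOf e
```

With this in place, a function such as `describe` is written once and applies to both `Node "node1" 0` and `Instance "inst1" 3`, instead of needing one variant per object type.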
Add back names to nodes/instances
In order to simplify the data structures, we add back the name on the node and instance objects. We still keep the index for, well, indexing, but we will use the name directly from the object, in order to get rid of the ktn/kti arguments which are passed around everywhere.
More code reorganizations
This new big patch does a couple of more cleanups in the loading of data chapter:
- introduce a Types module that holds most types (except the base Node/Instance/etc.) so that multiple other modules can use these (instead of only Cluster and its users)...
Rework the loader model
This big patch changes the loader model from “string data as common format” to actual object structures as common format.
The text loading functions move from Cluster.hs to a new Text.hs module, some common functions are moved to a new Loader.hs module, and the...
Add support for 'offline' nodes
This patch drops compatibility with Ganeti 1.2 and adds support for offline nodes in the cluster. When reading from RAPI, the drained nodes are considered offline so that we don't allocate on them too.
Update all needed node fields on f_mem change
This fixes the setFmem function which didn't compute other related fields after free memory change. Ideally, this should be abstracted so that add/remove Pri and similar functions could reuse it instead of duplicating code.
Fix interaction between down instances and nodes
If an instance is down, its memory is not reflected in the node used memory, and thus the node free memory is higher than the actual value. This patch deducts the memory for such instances from the node free...
Show the x_mem/i_mem in node list
This patch adds checking of cluster data in the binaries and display of node's x_mem/i_mem in the node list.
Add a new node field x_mem
Nodes can have some memory unaccounted for, due to (e.g.) hypervisor overhead, rounding errors in reporting, etc.
It is better if we model this memory explicitly instead of hiding it, and since the n_mem addition it is actually required to do so....
Remove unused and obsolete function
The Node.str function is very old and is not useful since the node objects have many more fields today. This patch removes it, and if needed a full node display can be done via ‘show’.
Add node memory field to Node objects
This patch adds a new n_mem field to the node objects, and implements read/save/show support for it. The field is not currently used (except in the node list) but will be used for checking data consistency and instance up/down status.
Pass actual types to node/instance constructors
This patch changes the parameters passed to the node and instance constructors from generic Strings (which are then parsed via “read”) to the actual used types, by converting them earlier in Cluster.loadData.
Some small changes in preparation for hscan
This patch does some small changes:
- fixes a comment
- exports more node functions (unneeded now, but hscan will use them)
- fixes Makefile rule for building the programs
Show offline nodes in the node status list
This patch adds a new ‘-’ flag for the node status which denotes offline nodes.
Add a new 'offline' Node attribute
This patch adds a new node attribute - offline - which will serve to skip nodes from the target candidate list.
Small doc update in Node.hs
Introduce a namespace for the modules
The modules are moved from the ‘top’ namespace to ‘Ganeti.HTools’, in compliance with standard practices.