Add definitions for more Luxi calls
Split the Luxi generic parts from the loader
The Luxi loader implements both a generic Ganeti Luxi client and the loader; it is better if these two are separated. The patch adds a Ganeti/Luxi.hs (not under HTools!) since that is generic for Ganeti, and not necessarily related to htools.
hbal: Implement grouping of moves into jobsets
Since moving two instances between different node-quadruples (inst X: A, B → C, D and inst Y: E, F → G, H) can be parallelised by Ganeti, it makes sense to split the operation list into jobsets whose execution...
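One way to picture the grouping, as a sketch only: a move joins the current jobset as long as it touches no node already used by a move in that jobset; otherwise it opens a new jobset. The `Move` type, its fields and `splitJobs` are hypothetical names, and the exact rule hbal applies is not shown in this log.

```haskell
import Data.List (intersect)

-- | Hypothetical move description: an instance name plus every node
-- the move touches (old and new primary/secondary).
data Move = Move
  { moveInst  :: String
  , moveNodes :: [String]
  } deriving (Show, Eq)

-- | Greedily group consecutive moves into jobsets. A move joins the
-- current jobset only if it shares no node with the moves already in
-- it; otherwise it starts a new jobset. The order of moves is kept.
splitJobs :: [Move] -> [[Move]]
splitJobs = reverse . map reverse . foldl step []
  where
    step [] mv = [[mv]]
    step (cur:done) mv
      | any (overlaps mv) cur = [mv] : cur : done
      | otherwise             = (mv : cur) : done
    overlaps a b = not (null (moveNodes a `intersect` moveNodes b))
```

With the two moves from the description (X on A, B, C, D and Y on E, F, G, H), both land in the same jobset; a third move touching node A or E would start a second one.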
Change ExtLoader to only handle I/O errors
Due to the Control.Exception changes between GHC 6.8 and 6.10, using it portably is difficult. Since we're only interested in handling I/O errors, we can use Prelude's catch and not have to deal with Control.Exception at all....
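A minimal sketch of handling only I/O errors. Later versions of base no longer export `catch` from the Prelude, so this uses `System.IO.Error.catchIOError`, which has the same I/O-only behaviour. The `tryLoad` name is hypothetical and not taken from ExtLoader.

```haskell
import System.IO.Error (catchIOError)

-- | Run a loader action, turning any I/O error into a 'Left' message.
-- Exceptions that are not I/O errors are deliberately not handled.
tryLoad :: IO a -> IO (Either String a)
tryLoad action =
  fmap Right action `catchIOError` (return . Left . show)
```

A caller can then pattern-match on the result and print the message itself instead of letting the runtime report the exception.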
Brown-paper-bag release fixing haddock issues
Haddock doesn't like pre-processed files (at least not in all versions). Thus we need to remove the ExtLoader module from the haddock-processed file list.
Update NEWS file for the 0.1.7 release
Make the test suite return a reasonable exit code
Test.QuickCheck.Batch.runTests doesn't return any error statistics, which makes the test suite just display errors and always exit with exit code 0. This is not good, since one cannot then actually batch run...
Brown bag fix: invert a test
During testing I inverted the test to check that it triggers correctly, and committed the inverted test by mistake. Fixing it.
Turn on, and fix, more warnings
The Makefile was intended to use -Wall and not simply -W, but I missed that. This enables more warnings and also enables -Werror (except for the tests).
Add support for building without curl
Since curl is not always needed (e.g. when using only the luxi or, less likely, the file backend) and is also not always available, it is useful to be able to build without it. This of course disables the RAPI backend.
This patch changes ExtLoader to build with the ‘-cpp’ option which makes...
Split the external data loader out of CLI.hs
Currently the external data loader is in CLI.hs, which makes all programs that need cli functionality (options, etc.) link against the network modules (most importantly curl). This patch splits this functionality into a new module such that (for example) hail which only...
Fix luxi recvMsg for messages bigger than 4K
This patch fixes a logic bug in luxi that breaks receiving messages bigger than 4096 bytes. Sending messages is not impacted, as it uses a different algorithm.
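The general shape of a receive loop that copes with messages longer than one read can be sketched as follows. It assumes that each read yields a chunk of at most 4096 bytes and that a message ends with a terminator character (taken here to be `'\3'`); both are assumptions of this sketch, as are the names `eom` and `assembleMsg`. Chunks are modelled as a plain list so the logic stays pure.

```haskell
-- | Assumed end-of-message marker.
eom :: Char
eom = '\3'

-- | Concatenate chunks until one contains the terminator and return
-- the message without it. 'Nothing' means the chunks ran out before a
-- terminator was seen, i.e. the message is incomplete.
assembleMsg :: [String] -> Maybe String
assembleMsg = go []
  where
    go _ [] = Nothing
    go acc (c:cs) =
      case break (== eom) c of
        (part, []) -> go (part : acc) cs            -- no terminator yet
        (part, _)  -> Just (concat (reverse (part : acc)))
```

The point is that a chunk without a terminator is accumulated rather than treated as the whole message, so the result does not depend on how the data was split across reads.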
Add some more instance tests
This includes instance text load tests.
Test some cases for the cluster score computation
Split the balancing algorithm in two parts
Currently the computation, recursing part and the IO part (progress updates) of the balancing main function (iterateDepth) are all in the same function, which makes it hard to test. This patch moves the decision/computation part (whether to proceed one more round, whether we...
Implement support for 'cheap' moves only
This patch adds support for cheap (failover/migrate) operations only in the balancing algorithm and in the hbal command line options.
This allows a very quick balancing (compared to allowing replace-disks) which can be useful as a scheduled operation.
Simplify the wrapIO function
This fixes one warning from hlint.
Use migrate or failover based on instance state
While we can't guarantee that the instance will be in the same state by the time the migrate/failover command will be run, we can at least try to do the right thing assuming no other changes to the cluster state....
Improve the error message for command line errors
Instead of using ioError . userError, we format the error ourselves. This is nicer - no ‘)’ at the end of the output.
Update NEWS file for the 0.1.6 release
Add a simulated cluster data loader
This is useful especially for hspace, where we might want to simulate a hypothetical cluster to check allocation beforehand.
Fix a typo in hbal.hs
Signed-off-by: Guido Trotter <ultrotter@google.com>
CLI: Handle error better
This patch adds an error handler for any exceptions that are raised during the external data load phase. This can be improved further, but it's a good start.
Unify the command line options and structures
This patch moves all the command line options and their internal representation into CLI.hs. This means that duplicated options between any two binaries are no longer declared twice, and that we no longer need the two *Option classes.
Document the --vcpus option to hspace
Fix a few hlint errors
Man page updates
This patch beautifies the man pages for hbal and hspace.
CLI: Prevent incompatible options from being selected
This patch makes CLI abort if more than one backend is selected.
Update documentation for the new luxi backend
Add support for luxi backend in CLI/hspace/hbal
This patch changes the backend selection method in CLI to prefer, in order:
  - a RAPI specification
  - a Luxi specification
  - and finally the node/instance files
It also modifies hspace and hbal to provide a ‘-L’ command line option...
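The preference order can be sketched as a small pure function. The `Backend` type, its constructors and the argument layout of `selectBackend` are hypothetical; only the ordering (RAPI, then Luxi, then files) comes from the description above.

```haskell
-- | Hypothetical description of where the cluster data comes from.
data Backend
  = Rapi String              -- ^ RAPI endpoint
  | Luxi FilePath            -- ^ path to the master daemon socket
  | Files FilePath FilePath  -- ^ node file and instance file
  deriving (Show, Eq)

-- | Pick a backend: a RAPI specification wins, then a Luxi
-- specification, and only then the node/instance files.
selectBackend :: Maybe String -> Maybe FilePath
              -> FilePath -> FilePath -> Backend
selectBackend (Just url) _           _     _     = Rapi url
selectBackend Nothing    (Just sock) _     _     = Luxi sock
selectBackend Nothing    Nothing     nodes insts = Files nodes insts
```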
Initial commit of the luxi backend
This patch adds a luxi backend that allows direct query of the master daemon on the local node. This patch doesn't enable the backend to be used.
There are a couple of things still missing in the implementation:
  - we don't have a master timeout in reads and writes, only a...
Introduce timeout in RAPI queries
The patch adds two constants in Types.hs for connect and query timeout, then modifies Rapi.hs to use them as the connect and general curl timeout.
Rapi could be improved more, as currently we wait double the total timeout due to not aborting early in case the node queries failed.
Fix a haddock issue
Update NEWS file for the 0.1.5 release
This is basically a hspace release, so the changelog is small.
hspace: fix failure handling of tryAlloc results
Currently hspace doesn't handle failures from tryAlloc correctly; this patch changes the iterateDepth function in hspace to return a Result (…) so that errors can be propagated correctly.
The patch also changes one output key to be more clear and a typo in...
Change the tryAlloc/tryReloc workflow
Currently, the tryAlloc and tryReloc functions return a list with all the results, both failures and successes. This is fine for hail, which does one round of allocations, but is not so good for hspace, which does iterative rounds; since at each (successful) step we only take the best...
Simplify the Cluster.tryAlloc structures
Currently the tryAlloc function calls allocateOnSingle/allocateOnPair and then builds a new tuple with those functions' results plus the new node list. This is however suboptimal in two respects:
  - the new nodes added are the 'old' versions of the respective nodes,...
Slight change to the internal allocation results
Currently the Cluster.AllocSolution type is defined as a list of ‘(OpResult Node.List, …)’ and the results for applyMove are defined as ‘(OpResult Node.List, …)’. Both of these mean that the failure/success indication is hidden in the first element of this tuple, which makes it...
Add a 'tags' makefile target
This uses hasktags for building emacs TAGS.
hspace: switch output to shell-script format
This (big) patch changes the output of hspace from text-format (separated by ‘: ’) to a shell-snippet, in ‘key=value’ format.
This will allow sourcing the output or parsing it via awk/sed/etc.
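A sketch of producing such ‘key=value’ lines so that they can be sourced safely. Whether hspace quotes its values, and how, is not stated here; the single-quote escaping and the names `shellQuote` and `shellLine` are assumptions of this example.

```haskell
-- | Quote a value for a POSIX shell using single quotes. An embedded
-- single quote is written as '\'' (close the quote, emit an escaped
-- quote, reopen the quote).
shellQuote :: String -> String
shellQuote v = "'" ++ concatMap esc v ++ "'"
  where
    esc '\'' = "'\\''"
    esc c    = [c]

-- | Render one key/value pair as a line that a shell can source.
shellLine :: String -> String -> String
shellLine key value = key ++ "=" ++ shellQuote value
```

Printing one such line per statistic gives output that works both with `.`/`source` and with line-oriented tools such as awk or sed.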
hspace: move instance count and score into CStats
Currently the instance count and cluster score are separated from the other initial/final phase stats, even though they are very similar. This patch moves computation of these two into totalResources/CStats and...
Fix unittests
The recent OpResult and CPU values additions broke unittests.
Export more stats in hspace
This patch changes Cluster.totalResources to compute more resources and prints them in hspace.
Show errors on stderr instead of stdout
Currently many of the exit and warning conditions mistakenly display error messages on stdout, which makes parsing the output of programs harder. This patch attempts to fix such occurrences.
Fix score calculation to work with empty clusters
Currently the cluster score calculation includes an offline instance percentage, expressed as “offline inst / (offline + online inst)”, which results in NaN for empty clusters. This patch changes the calculation...
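The NaN comes from a floating-point 0/0 when there are no instances at all. A guarded version might look like the sketch below; returning 0 for the empty case is an assumption of this example (the replacement calculation is elided above), and `offlineRatio` is a hypothetical name.

```haskell
-- | Fraction of instances that are offline. For an empty cluster the
-- denominator is zero, so 0 is returned instead of letting 0/0
-- produce NaN. (Returning 0 is an assumption of this sketch.)
offlineRatio :: Int -> Int -> Double
offlineRatio offline online
  | total == 0 = 0
  | otherwise  = fromIntegral offline / fromIntegral total
  where
    total = offline + online
```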
Some docstring updates
hspace: convert N1 error exit into FailN1 result
Currently hspace exits with an error if the cluster is not N+1 compliant at the beginning of the run. This patch changes hspace such that this condition is instead treated as a zero-allocation-possible, FailN1 mode....
hspace: add display of instance spec
This is mostly for user-friendliness in the default mode, when we don't specify the instance parameters.
Optimize the Utils.stdDev function
This patch optimizes the stdDev function in two respects:
  - first, we don't do sum . map which builds an intermediate list, but instead use a fold over the list to build the sum incrementally; this should reduce both the time and space characteristics, as we...
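The fold-based approach can be sketched like this. It is not the actual Utils.stdDev; it computes a population standard deviation and returns 0 for an empty list, both of which are choices made for this example.

```haskell
import Data.List (foldl')

-- | Population standard deviation computed with strict left folds, so
-- no intermediate list is built. Returns 0 for an empty list.
stdDev :: [Double] -> Double
stdDev [] = 0
stdDev xs = sqrt (sqSum / n)
  where
    -- one pass for the sum and the element count
    (total, n) = foldl' step (0, 0) xs
    step (s, c) x =
      let s' = s + x
          c' = c + 1
      in s' `seq` c' `seq` (s', c')
    mean = total / n
    -- second pass: accumulate squared deviations directly
    sqSum = foldl' (\acc x -> acc + (x - mean) ^ (2 :: Int)) 0 xs
```

Forcing both tuple components in `step` matters: `foldl'` only evaluates the accumulator to weak head normal form, which for a pair would otherwise leave the sums as unevaluated thunks.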
Take the foldl out of Loader.fixNodes
Currently Loader.fixNodes is foldl' with a complicated function. It makes more sense to take foldl' out of this function (and put it into the caller) and let fixNodes be only this internal function.
Simplify Cluster.computeMoves
This patch changes the function Cluster.computeMoves to use guards and a couple of subexpressions in order to greatly simplify it.
Fix hlint-generated warnings
This big patch cleans up the code per hlint indications. Many removals of extra parentheses, replacements of concat . map with concatMap, extra dollar signs, eta reductions, etc. were performed.
The code still compiles and passes a couple of manual tests on sample...
Add computation of the failure reason in hspace
This patch enhances hspace to report why the allocation sequence stopped, both in absolute error count and for the top reason.
Return correct failure data from Node.add*
This patch alters Node.addPri/addSec to return correct failure data. It removes the computeFailN1 function from the module as that used to combine both mem and disk checks in the same function and thus the real...
Introduce a new type for allocation results
Currently the allocation/move operations workflow returns ‘Maybe a’, which is very convenient but loses all details about the failure mode.
This patch introduces a new data type which encodes the specific failure...
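To illustrate the idea of a result type that keeps the failure reason, here is a sketch. The constructor names, the set of failure modes and the `reserve` helper are assumptions of this example; only the `OpResult` and `FailN1` names appear elsewhere in this log.

```haskell
-- | Why an operation failed (illustrative set of reasons).
data FailMode = FailMem | FailDisk | FailN1
  deriving (Show, Eq)

-- | Like 'Maybe', but the failure case says what went wrong.
data OpResult a = OpFail FailMode | OpGood a
  deriving (Show, Eq)

instance Functor OpResult where
  fmap _ (OpFail e) = OpFail e
  fmap f (OpGood x) = OpGood (f x)

instance Applicative OpResult where
  pure = OpGood
  OpFail e <*> _ = OpFail e
  OpGood f <*> r = fmap f r

instance Monad OpResult where
  OpFail e >>= _ = OpFail e
  OpGood x >>= f = f x

-- | Hypothetical check: reserve memory, then disk, on a node described
-- only by its free memory and free disk. The first failing check
-- decides the reported reason.
reserve :: Int -> Int -> (Int, Int) -> OpResult (Int, Int)
reserve mem dsk (fMem, fDsk) = do
  m <- if mem > fMem then OpFail FailMem else OpGood (fMem - mem)
  d <- if dsk > fDsk then OpFail FailDisk else OpGood (fDsk - dsk)
  return (m, d)
```

Because the type is a monad, code written against ‘Maybe a’ in do-notation keeps its shape, while callers can now count or display the failure reasons.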
Remove hn1 and related code
hn1 was deprecated for a while and this patch removes it altogether. The support code in Cluster.hs is also removed.
Display two more stats in hspace
This adds two new stats - sum of reserved ram and disk.
Fix totalResources avail disk computation
This uses the newly-added Node.availDisk to compute the actual available disk correctly, and displays the total allocatable disk in hspace.
Add an availDisk node function
This function returns the amount of available disk, which depends on whether a low disk limit has been configured or not and on the free disk space of the node.
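A sketch of such a function, under the assumption that the limit is expressed as a minimum amount of disk to keep free. The `Node` record here is a stand-in with hypothetical field names, not the real Node type.

```haskell
-- | Stand-in node record holding only what this sketch needs.
data Node = Node
  { freeDisk :: Int        -- ^ free disk space on the node
  , loDisk   :: Maybe Int  -- ^ minimum free disk to keep, if configured
  } deriving (Show, Eq)

-- | Disk that can still be allocated: all free disk when no limit is
-- configured, otherwise the free disk above the limit (never negative).
availDisk :: Node -> Int
availDisk (Node free Nothing)    = free
availDisk (Node free (Just lim)) = max 0 (free - lim)
```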
Add two new autocomputed vars to Nodes
Currently we track the max disk usage/max vcpus as percentages; however, sometimes it's easier to check against minimum free disk or maximum number of cpus, as units instead of percentages.
This patch adds two new variables, lo_dsk, hi_cpu, which are recomputed...
Add a new type for cluster statistics
Currently totalResources returns a 5-tuple of integers. This is not easy to handle, as each change on the return type means that each caller must be updated.
This patch adds a new type for cluster stats and uses that instead as...
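The benefit of a record over a tuple can be sketched as follows. All type and field names here are illustrative, not the real definitions; the point is that callers use named accessors, so adding a field later does not force every caller to change.

```haskell
import Data.List (foldl')

-- | Stand-in for the per-node data being summed (hypothetical fields).
data NodeInfo = NodeInfo
  { niFreeMem  :: Int
  , niFreeDisk :: Int
  , niInsts    :: Int
  }

-- | Illustrative cluster statistics record.
data CStats = CStats
  { csFreeMem   :: Int
  , csFreeDisk  :: Int
  , csInstCount :: Int
  } deriving (Show, Eq)

-- | Sum the per-node values into one statistics record.
totalResources :: [NodeInfo] -> CStats
totalResources = foldl' add (CStats 0 0 0)
  where
    add cs n = cs
      { csFreeMem   = csFreeMem cs + niFreeMem n
      , csFreeDisk  = csFreeDisk cs + niFreeDisk n
      , csInstCount = csInstCount cs + niInsts n
      }
```

A caller that only needs the instance count writes `csInstCount (totalResources nodes)` and is unaffected when more statistics are added to the record.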
Enhance hspace resource display
The display of cluster resources is extracted into a separate function and enhanced to display more stats.
Add display of more stats in hspace
This patch changes Cluster.totalResources to compute more details about the cluster status, and enhances hspace to display more of these.
Fix a haddock/docstring issue
Update NEWS file for the 0.1.4 release
Fix some hscan bugs
Currently hscan has a number of bugs:
  - doesn't add the common suffix (csf) to the instance's nodes
  - doesn't export the cpus for either nodes or instances
  - doesn't support single-node instances
This patch fixes these issues.
Some documentation updates for the new parameters
Add cpu/disk limits in hbal
Add setting of node limits in hspace
Implement cpu/disk limits in instance moves
We modify Node.addPri/addSec to take into account the limits on instance adds.
Add two new node attributes
Two new attributes, min disk free ratio and max cpu usage, are added to the nodes. These will be used in the future to restrict allocation.
Fix 'unused X' warnings
This removes some unused functions and imports to cleanup the warnings.
Fix the various monomorphism warnings
In a few places (e.g. tryRead or any printf call) it's a little bit hard to add the correct type signatures, but in the end it is possible to fix these warnings (which can bite one in subtle cases).
Small changes to the node list output
This is just some cleanup of the node list output, adding pcpu/vcpu counters, and making the display slightly nicer.
Add cpu ratio to cluster calculation
Update cpu counters correctly after pinst changes
The cpu counters are updated on primary instance adds/removes.
Add cpu-count-related attributes to nodes
This patch adds cpu-count related attributes to nodes:
  - total cpus
  - cpus in use
  - ratio of virtual:physical cpus
We also set correctly the cpu values at load time, but we don't do anything yet while moving instances around. The cpu ratio is shown in...
Add a new vcpus attribute to instances
This patch adds reading of vcpu count for instances, in preparation for using the vcpu ratio in cluster scoring.
Fix reading of total disk space in iallocator
IAllocator currently uses a wrong key name for reading the total disk space (‘disk_usage’ which was copied from RAPI, but the actual iallocator key is ‘disk_space_total’).
This patch fixes that and also makes iallocator always use this key,...
Update NEWS and README for the 0.1.3 release
Small updates to the documentation and make a new small release.
Fix the ReplacePrimary instance move
During a replace-primary instance move, on the real cluster the instance is temporarily started on the secondary, and as such we must check that the secondary node can hold it for this duration. Currently the code does not, and depending on cluster scoring it will put instances on such...
Update NEWS file for the 0.1.2 release
Update the README file with hspace information
Fix hspace with plain type instances
This also fixes other required node numbers.
Add a man page for hspace
Rework the tryAlloc/tryReloc functions
Currently tryAlloc/tryReloc do not return the new instance, as this is not needed for IAllocator alloc/reloc requests. However, for computing the space, the new instance is useful, so we modify these functions to return this information too....
Add a utility function for triples
Initial add of the hspace tool
This is a tool that checks how many instances (of same size, specified by command line arguments) can be added to a cluster while remaining N+1 compliant.
Small doc change
And an alignment issue.
Ensure consistent naming of the tools
This patch makes sure that all references to the name of the software are ganeti-htools, not simply htools.
Small documentation update
Add copyright/license information
This doc-patch adds copyright and license information to (hopefully) all needed files.
tests: move the test declaration in QC.hs
This patch moves the test declaration into QC.hs, so that test.hs has to be modified only when we add a new test category.
Move some alloc functions from hail into Cluster
These are generic enough to be used from multiple places, so they belong in Cluster.hs rather than in the hail source.
Small whitespace change
Move the RqType and Request types to Loader.hs
These two will be more generic than now, and belong somewhere else. Loader.hs is a generic module for data loading, thus we move them there.
Cleanup an old function
Also replace a type with its synonym.
Lots of documentation updates
This patch only makes doc build changes, doc changes and function moves (for more logical documentation). It should have no impact at all on the code.
Change the check rule in Makefile
Since ghc won't trigger recompilation due to the -fhpc flag, it's not useful to rm && make test, as this will only relink the binary. Therefore we simplify this rule.
A simple test for Container.addTwo