utils.cfunc: Cleanup, more flexibility
- Split code using ctypes directly into a helper class
- Don't load “libc.so.6”, but use handle for main program instead (see comment in code)
- Clarify comment on errno with older ctypes versions
- Rename unittest since it can't be used for other functions (modifies...
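The idea of using a handle for the main program instead of a named library can be sketched as follows. This is a minimal illustration assuming a POSIX system, not the code from utils.cfunc; the helper name c_getpid is hypothetical and getpid was chosen only because it is harmless to call.

```python
import ctypes
import os

# Passing None asks the dynamic loader for a handle to the main program
# (like dlopen(NULL)), so no library file name such as "libc.so.6" is
# hard-coded. use_errno=True makes errno readable via ctypes.get_errno().
main_program = ctypes.CDLL(None, use_errno=True)


def c_getpid():
    """Call the C library's getpid() through the main-program handle."""
    return main_program.getpid()


if __name__ == "__main__":
    print(c_getpid() == os.getpid())
```

Symbols from the C library are reachable this way because the Python interpreter itself is linked against it.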
Rename utils.mlock to utils.cfunc
Renaming so that more code using ctypes could be added to the same file.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add design doc for virtual(ised) clusters
I am currently able to run a 2-node virtual cluster on my machine, with a very ad-hoc setup. But the results show clearly that this is doable, and that given the right tools, setting up such a cluster will be quite easy....
Document some useful Haskell tips
This improves devnotes.rst with some tricks for Haskell development, and additionally it makes two Makefile improvements:
- properly document lib/_vcsversion.py as a requirement for Constants.hs (but do not require rebuild when updated)...
Further cleanup in hspace
This moves the checking of results from the allocation functions to a separate function, so that we have less code duplication. It also does a bit of simplification in the printing functions.
Signed-off-by: Iustin Pop <iustin@google.com>...
A bit of cleanup in hspace
The node offline/mcpu is identical to hbal's setNodesStatus, so let's move that to CLI.hs and reuse it in hspace (also, rename it and drop one 's').
Also, the check for the number of nodes is obsolete, as we compute that from the disk template....
Add a type synonym for the allocation function sig
Both iterateAlloc and tieredAlloc share the same signature, but it's neither documented nor exported (needed for refactoring).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
htools: Simplify Luxi query results parsing
The logic is not entirely correct: the new Query interface exports the field status, and we don't use that yet. But the new code should be more readable.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Adjust htools code to new Luxi argument format
This partially undoes commit 92678b3, more specifically it removes the Store data type and the associated code, since all Luxi arguments are now lists.
Furthermore, since the qfilter field on Query is complex (it's...
constants: Verify exported names
The “constants” module is a bit special in the sense that we don't want to export random stuff from it. This unittest checks the naming convention and removes imported modules from the module's namespace.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
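A check of this kind could look roughly like the sketch below. The naming rule (upper-case letters, digits and underscores), the function name find_bad_names and the stand-in module are all assumptions made for illustration; the real unittest may use a different convention, and this sketch only reports leaked modules rather than removing them.

```python
import re
import types

# Assumed convention for this sketch: upper-case letters, digits, underscores.
_VALID_NAME_RE = re.compile(r"^[A-Z][A-Z0-9_]*$")


def find_bad_names(module):
    """Return the names in 'module' that break the assumed convention.

    Imported modules are reported too, since they should not be exported.
    """
    bad = []
    for name, value in vars(module).items():
        if name.startswith("__"):
            continue  # skip __name__, __doc__ and similar attributes
        if isinstance(value, types.ModuleType) or not _VALID_NAME_RE.match(name):
            bad.append(name)
    return sorted(bad)


# Hypothetical stand-in for a constants module
fake = types.ModuleType("fake_constants")
fake.MAX_NODES = 10          # follows the convention
fake.re = re                 # a leaked import
fake.helper = lambda: None   # lower-case name, not a constant
```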
http.client: Remove HTTP client pool code
This patch removes all remains of the HTTP client pool. Newly added unittests provide 96% coverage on http.client.
rpc: Remove thread-local storage with HTTP pool
The HTTP pool is no longer used.
Merge branch 'devel-2.5'
Tiny optimisation related to filter parsing
Currently, we get a luxi Client, then parse the filter, then execute the query. If parsing the filter fails, we connected to the masterd needlessly.
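The reordering described above amounts to parsing first and connecting second. A minimal sketch, using a hypothetical FakeClient and a toy parse_filter in place of the real LUXI client and filter parser:

```python
class FakeClient:
    """Hypothetical stand-in for a LUXI client; counts connections made."""
    connections = 0

    def __init__(self):
        FakeClient.connections += 1

    def query(self, parsed):
        return ["result for %s" % (parsed,)]


def parse_filter(text):
    """Toy parser for this sketch: accepts only 'name == value'."""
    parts = text.split("==")
    if len(parts) != 2:
        raise ValueError("cannot parse filter %r" % text)
    return (parts[0].strip(), parts[1].strip())


def run_query(text):
    # Parse before connecting: a bad filter fails here, without a connection.
    parsed = parse_filter(text)
    client = FakeClient()
    return client.query(parsed)
```

With this order, an unparsable filter raises before any client object exists.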
Merge branch 'stable-2.5' into devel-2.5
Standardise LUXI call argument types
Currently, we have 4 types of arguments in LUXI calls:
- most common, a list of values
- a single argument that is sent as a list of one element
- a single argument that is sent by itself
- a dictionary (only Query and QueryFields)...
Rename filter and filter_ to qfilter
We currently use 'filter' as the OpCode, QueryRequest and RAPI field name for representing a query filter. However, since 'filter' is a built-in function, we actually have to use filter_ throughout the code in order to not override the built-in function....
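The shadowing problem behind this rename can be shown in a few lines. The function names below are made up for the illustration and do not come from the Ganeti code base.

```python
def select_shadowed(items, filter):
    # The parameter hides the built-in filter() inside this function, so
    # this call invokes the argument (a list here) and raises TypeError.
    return list(filter(bool, items))


def select(items, qfilter):
    # With the parameter named qfilter the built-in remains usable.
    return list(filter(lambda item: item in qfilter, items))
```

Using qfilter everywhere avoids both the shadowing and the need for a trailing-underscore spelling inside the code.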
Merge branch 'devel-2.4' into stable-2.5
rpc: Disable HTTP client pool and reduce memory consumption
We noticed that “ganeti-masterd” can use large amounts of memory, especially on large clusters. Measurements showed a single PycURL client using about 500 kB of heap memory (the actual usage depends on versions,...
Haskell support for generic Query in Luxi
Until now htools did not have support for generic Query in Luxi. This patch introduces Query as a supported Luxi operation and replaces QueryNodes, QueryInstances and QueryGroups with Query.
Signed-off-by: Agata Murawska <agatamurawska@google.com>...
TH simplification for Luxi
This patch simplifies the generation of save constructors for LuxiOp by always using showJSON over an array of JSValues, instead of having to pass showJSON in most cases, except the 5-tuple case.
Dots in docstrings and hlint error fixes for htools
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Add design doc for the resource model changes
This is not complete, but is as close as I can get it for now. I expect people actually implementing the various changes to extend the design doc.
Preserve bridge MTU in KVM ifup script
Closes: #201 - KVM_IFUP does not set bridge-MTU on tap devices
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Remove the oneline output option in hbal
This was, AFAIK, never used, and complicates the output code enough that it's better to remove it.
Rework/split hbal's main function
This is just moving code around. A subsequent patch will do a bit more cleanup and changing the output.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Agata Murawska <agatamurawska@google.com>
Skip application of 'id' in TH code
This is just beautification when dumping splices to stdout, as ghc will optimise the 'id' away anyway.
Original generated code:
opToArgs QueryTags kind name = J.showJSON (id kind, id name)
Afterwards:
opToArgs QueryTags kind name = J.showJSON (kind, name)...
Don't send gratuitous ARP if master IP setup fails
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Document --ignore-errors and --error-codes
Update the man page of gnt-cluster to contain the documentation of the --ignore-errors and --error-codes verify options. Also, include the list of the error codes and their documentation.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>...
Add error codes documentation
Generalize docpp and sphinx_ext
hail: Fix result for node evacuation
According to the iallocator documentation the “node-evacuate” call needs to return a list of jobs, not a list of lists of jobs.
Use TemplateHaskell to create LUXI operations
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Documentation update for ovfconverter
Fixes for ovfconverter + vmware
Demote to warnings the errors in --ignore-errors
Treat the gnt-cluster verify errors identified by the error codes in --ignore-errors as warnings; just print a warning message for the user.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add --ignore-errors parameter to cluster verify
lib/cli.py
- add IGNORE_ERROR_OPT;
client/gnt_cluster.py
- pass the ignore_errors parameter to the opcodes
lib/opcode.py
- update OpClusterVerifyConfig, OpClusterVerify and OpClusterVerifyGroup to accept the ignore_errors parameter...
Move cluster verify error codes to constants
- move the cluster verify error codes from cmdlib._VerifyErrors to constants;
- add to each of them the CV (Cluster Verify) prefix;
- add the CV_ALL_ECODES and CV_ALL_ECODES_STRINGS constants;
- wrap the lines that exceed 80 characters after changing the error...
Restore backend.GetMasterInfo return values order
Change 5a8648eb609f7e3a8d7ad7f82e93cfdd467a8fb5 changed the order of the return values of backend.GetMasterInfo(). This broke the users of the master_info RPC.
This change restores the original order, and adds a comment in...
Add cluster netmask parameter
Add the master_netmask cluster parameter, which represents the netmask of the master IP, encoded as a CIDR suffix.
This parameter can be set via the --master-netmask of gnt-cluster init and gnt-cluster modify. The default behaviour is to be consistent with...
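For readers unfamiliar with the encoding, a CIDR suffix is the number of leading one-bits in the netmask. The sketch below uses Python's standard ipaddress module to show the correspondence for IPv4; the function name suffix_to_netmask is hypothetical and is unrelated to how Ganeti stores or validates the parameter.

```python
import ipaddress


def suffix_to_netmask(suffix):
    """Convert an IPv4 CIDR suffix (0..32) to a dotted-quad netmask string.

    Raises ValueError for suffixes outside the valid range.
    """
    network = ipaddress.ip_network("0.0.0.0/%d" % suffix)
    return str(network.netmask)


if __name__ == "__main__":
    print(suffix_to_netmask(24))
```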
Add ValidateNetmask and GetClass IPAddress methods
Also, add related tests to the test suite.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
cluster-merge: log an info message at node readd
node readd can take a long time, so it's good to have info messages to see progress.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
Bump version to 2.5.0~rc1
Fix Makefile rules for QCHelper.hs
Include QCHelper.hs in the distributed files, and also exclude it and the THH.hs file from coverage reports.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Andrea Spadaccini <spadaccio@google.com>
Fix issue when verifying cluster files
If a cluster has any non-master-candidate nodes, those don't contain all files (e.g. config.data). With commit aef59ae764dc (March 31st, 2011) the logic was changed and subsequently verifying a cluster with non-mc nodes would complain....
Revert "utils.log: Write error messages to stderr"
This reverts commit 34aa8b7c4bb6f5e2e788108e024c9cd70bdb3431. Writing error messages to stderr would also include backtraces, something we tried to avoid in the past.
Fix adding nodes after commit 64c7b3831dc
Commit 64c7b3831dc changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed.
Some TH simplifications
Now that the basic code works, let's use some aliases for simpler code and less ))))))))).
A few minor test improvements
This patch adds a few niceties to the test suite:
- allow matching test groups case-insensitively and emit warnings when given test group names that don't match anything
- add a new operator that is similar to assertEqual in Python: it...
Use TemplateHaskell to decorate tests with names
This changes the error message from "Test 4 failed …" to "Test prop_Loader_mergeData failed", which is much more readable. It also removes the duplication of test suite names in the test.hs file.
Use TemplateHaskell to generate opcode serialisation
This replaces the hand-coded opcode serialisation code with auto-generation based on TemplateHaskell.
Use TemplateHaskell to build the opID function
This replaces the hand-coded opID with one automatically generated from the constructor names, similar to the way Python does it, except it's done at compilation time as opposed to runtime.
Again, the code line delta does not favour this patch, but this...
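For comparison, deriving an opcode ID from a class name at runtime could look like the following sketch on the Python side. The CamelCase-to-upper-snake-case scheme, the helper name_to_op_id and the minimal OpCode base class are assumptions made for illustration, not the actual Ganeti implementation.

```python
import re

# Matches a lower-case letter or digit followed by an upper-case letter,
# i.e. the boundary between two words in a CamelCase name.
_CAMEL_RE = re.compile(r"([a-z0-9])([A-Z])")


def name_to_op_id(class_name):
    """Turn 'OpClusterVerify' into 'OP_CLUSTER_VERIFY' (assumed scheme)."""
    if not class_name.startswith("Op"):
        raise ValueError("class name %r lacks the 'Op' prefix" % class_name)
    return "OP_" + _CAMEL_RE.sub(r"\1_\2", class_name[2:]).upper()


class OpCode:
    """Hypothetical base class: subclasses get OP_ID when they are defined."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.OP_ID = name_to_op_id(cls.__name__)


class OpClusterVerify(OpCode):
    pass
```

Here the ID is computed when the class is created at import time, whereas the TemplateHaskell version described above produces the equivalent mapping during compilation.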
Use TemplateHaskell instead of hand-coded instances
This patch replaces the current hard-coded JSON instances (all alike, just manual conversion to/from string) with auto-generated code based on Template Haskell (http://www.haskell.org/haskellwiki/Template_Haskell)....
Rename some helper functions for consistency
This changes the names for some helper functions so that future patches are touching less unrelated code. The change replaces shortened prefixes with the full type name.
Split part of Utils.hs into JSON.hs
Utils is a bit big, let's split the JSON stuff (not all of it) into a separate module that doesn't have any other dependencies.
LUClusterVerifyGroup: Spread SSH checks over more nodes
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to...
Optimise cli.JobExecutor with many pending jobs
In the case we submit many pending jobs (> 100) to the masterd, the JobExecutor 'spams' the master daemon with requests for the status of all the jobs, even though in the end it will only choose a single job for polling....
Use --yes to deactivate master ip in cluster merge
Use deactivate-master-ip in cluster-merge
Use the gnt-cluster deactivate-master-ip command in cluster-merge to disable the master IP.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>...
Add gnt-cluster commands to toggle the master IP
Split starting and stopping master IP and daemons
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
listrunner: Don't pass arguments if there are none
If no arguments were specified the “exec_args” variable was “None”, leading to the command being run as “… ./… None”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>...
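The bug described above is what happens when None is formatted into a command string. A minimal sketch of the problem and one way to guard against it; build_command and its parameters are hypothetical names, not the listrunner code.

```python
def build_command_naive(script, exec_args=None):
    # Formatting None with %s yields the literal text "None".
    return "./%s %s" % (script, exec_args)


def build_command(script, exec_args=None):
    """Append arguments only when some were actually given."""
    if exec_args:
        return "./%s %s" % (script, exec_args)
    return "./%s" % script
```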
ssh: Quote strings in error message
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
utils.log: Write error messages to stderr
When “gnt-cluster copyfile” failed it would only print “Copy of file … to node … failed”. A detailed message is written using logging.error. Writing error messages to stderr can be helpful in figuring out what went wrong (the messages also go to the log file, but not everyone might...
Add signal handling doc to hbal man page
Also remove a bug note, since hbal has long been able to execute jobs directly.
Adapt non-KVM hypervisors to new migration RPCs
Add memory transfer progress info to migration
Make migration RPC non-blocking
To add status reporting for the KVM migration, the instance_migrate RPC must be non-blocking. Moreover, there must be a way to represent the migration status and a way to fetch it.
Move _TimeoutExpired to utils
Add an allocation limit to hspace
This is very useful for testing/benchmarking.
Small simplification in tryAlloc
Change how node pairs are generated/used
Currently, the node pairs used for allocation are a simple [(primary, secondary)] list of tuples, as this is how they were used before the previous patch. However, for that patch, we use them separately per primary node, and we have to unpack this list right after generation....
Parallelise instance allocation/capacity computation
This patch finally enables parallelisation in instance placement.
My original try for enabling this didn't work well, but it took a while (and liberal use of threadscope) to understand why. The attempt...
Abstract comparison of AllocElements
This is moved outside of the concatAllocs as it will be needed in another place in the future.
Change type of Cluster.AllocSolution
Originally, this data type was used both by instance allocation (1 result), and by instance relocation (many results, one per instance). As such, the field 'asSolutions' was a list, and the various code paths checked whether the length of the list matches the...
Migration: warn the user about hv version mismatch
Fix handling of cluster verify hooks
The change to enforce boolean results for cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true....
http.client: Show pending requests as “owner”
In the context of the lock monitor a “pending” item does not yet own the requested resource. Since these HTTP requests are already in progress they should be shown as owners.
http.client: Add nice name to requests
With this change a node name instead of the IP address can be shown for pending RPC requests:
Name                               Pending
rpc/node18.example.com/test_delay  thread:Jq1/Job692/TEST_DELAY
rpc/http: Show pending RPC requests in lock monitor
Not all requests use an instance of RpcRunner yet and therefore won't show up (only instances have access to the global Ganeti context). Currently only the IP address is accessible. Another patch will add a...
http.client: Factorize code interacting with cURL
This simplifies HttpClientPool.ProcessRequests significantly and will be handy for showing pending RPC requests in the lock monitor.
Redistribute the RAPI certificate
This reverts to the old behaviour in Ganeti 2.4 and before.
Adding qemu-img dependency to INSTALL
http.client: Reduce performance impact by assertion
Call dict.values once instead of N times.
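The pattern behind this change can be sketched as below. In Python 2, dict.values() builds a new list on every call, so evaluating it inside a loop repeats that work for each item. The function and variable names are hypothetical and the assertion being optimised in http.client is not reproduced here.

```python
def all_known_slow(pool, pending):
    # pool.values() is evaluated again for every element of 'pending'.
    return all(client in pool.values() for client in pending)


def all_known_fast(pool, pending):
    # Evaluate pool.values() once and reuse the result for every lookup.
    values = list(pool.values())
    return all(client in values for client in pending)
```

Both functions return the same result; only the number of calls to values() differs.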
rpc: Overhaul client structure
- Clearly separate node name to IP address resolution into separate functions
- Simplified code structure (one code path instead of several)
- Fully unittested
- Preparation for more RPC improvements
rpc: Make compression function module-global
No need to keep it in the class.
Keep only one global RPC runner in Ganeti context
Instead of having one RPC runner per mcpu processor this will keep only one instance as part of the masterd-wide Ganeti context. Upcoming patches will change the RPC runner to report pending requests to the...
Update INSTALL with ovfconverter requirements
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
TemporaryFilesManager implementation
Export: unittests
Export: documentation
Export: saving data to ovf file
Export: parsing data from config file