(snap) Snapshot support for ExtStorage
Extend existing RPC params with the snapshot name and allow snapshots not only for LVM but also for EXT.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
(grnet) Move disk options before nic ones in kvm command
Older versions of Ganeti did ensure that during startup disk devices will be inserted before nic devices in PCI configuration space. KVM inserts devices to PCI slots depending on the order of command line options....
(2.13) Pass the access parameter to ExtStorage template
Add the ExtStorage template to the set of templates that accept the 'access' parameter. The default 'access' of the node-group for ExtStorage devices will be 'kernelspace'.
Update the man page for gnt-instance to state that ExtStorage templates...
(2.13) Handle IDISK_ACCESS parameter in ComputeDisks
The IDISK_ACCESS disk parameter was not handled in the 'ComputeDisks' function, thus the 'access' parameter was ignored during the instance creation. This patch fixes this and also fixes a typo in '_VerifyDiskModification'....
(2.13) Implement GetUserspaceAccessUri for ExtStorage
Allow ExtStorage devices to support userspace access. The 'attach' script of an ExtStorage provider is now allowed to return more than one line. The first line will contain as always the block device path. Each one of the extra lines will contain a URI to be used...
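Illustrative sketch (not part of the patch): one way the multi-line 'attach' output described above could be split into a block device path and a list of URIs. The function name and the sample URI are hypothetical; the real parsing code in Ganeti may differ.

```python
def parse_attach_output(output):
    """Split 'attach' script output into (block_device_path, uris).

    Assumption: the first non-empty line is the block device path and
    every further non-empty line is one URI.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("attach script returned no output")
    return lines[0], lines[1:]


# Hypothetical provider output: a device path followed by one URI.
path, uris = parse_attach_output("/dev/rbd0\nkvm:rbd:pool/volume\n")
```

A provider that prints only the device path still works with this sketch: the URI list is then empty.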
(2.13) Move ExtStorage code out from bdev
Move the ExtStorage related code out from bdev to a new file called 'extstorage.py'.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>...
(2.13) Design document for ExtStorage userspace access
This patch extends the 'shared-storage' design document and more specifically the ExtStorage Interface to support userspace disk access.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>
Signed-off-by: Thomas Thrainer <thomasth@google.com>...
(2.13) Add 'access' disk option to man pages
Update 'gnt-instance' man page and document the 'access' disk option. Also fix a typo in 'metavg' disk parameter.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
(2.13) Make 'access' an optional disk parameter
This patch makes 'access' an optional disk parameter just like spindles, mode, name, vg and metavg. This option can only be set to 'kernelspace' or 'userspace'. When 'userspace' is used, the instance will access this disk directly without going through a block device....
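Illustrative sketch (not part of the patch): how an optional 'access' disk parameter restricted to the two values named above could be validated. The function and constant names are hypothetical.

```python
VALID_ACCESS = frozenset(["kernelspace", "userspace"])


def check_access(disk_params):
    """Return the 'access' value of a disk, or None if it was not given.

    Raises ValueError for any value other than the two allowed ones.
    """
    access = disk_params.get("access")
    if access is not None and access not in VALID_ACCESS:
        raise ValueError("invalid access mode: %r" % (access,))
    return access
```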
(2.13) Add DiskParams to Disk object
The 'DiskParams' slot was missing from Haskell's Disk objects. Since Wconfd is now responsible for writing the config file this was causing the 'params' slot to not be written in the config file.
Signed-off-by: Ilias Tsitsimpis <iliastsi@grnet.gr>...
(2.13) Rename DiskParams to GroupDiskParams
DiskParams was used for the cluster/group disk parameters type. This patch renames it to GroupDiskParams and uses the DiskParams type for the parameters of one single Disk object.
(2.11) Add andRestArguments to IDiskParams
In this way, we can pass through the opaque parameters required for disk creation and modification in the case of external storage.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>...
(2.11) Add function providing the canonical andRestArguments
The field catching the remaining fields will always be of the same shape, so add a function for this to make usage simple.
(2.11) Add genAndRestArguments :: Gen (Map String JSValue)
So that objects using AndRestArguments are available for testing. As the AndRestArguments are intended for passing through additional parameters passed on the command line, we restrict them to the...
(2.11) Add additional constructor AndRestArguments to OptionalType
A field of this type will capture all the remaining fields of an object as JSValues. Obviously, the intended use is to have precisely one such field. This mechanism will allow passing opaque values through, as it is, e.g., required for...
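Illustrative sketch (not part of the patch, which is Haskell): the idea of one field that captures all remaining fields of an object, shown on a plain dictionary. The function name and the sample keys are hypothetical.

```python
def split_known_and_rest(obj, known_fields):
    """Separate the declared fields of obj from everything else.

    The second dictionary plays the role of the single catch-all field:
    it holds every key that is not declared, with its value untouched.
    """
    known = {k: v for k, v in obj.items() if k in known_fields}
    rest = {k: v for k, v in obj.items() if k not in known_fields}
    return known, rest


# 'size' and 'provider' are declared; 'pool' is an opaque extra parameter.
known, rest = split_known_and_rest(
    {"size": 1024, "provider": "ext1", "pool": "fast"},
    {"size", "provider"},
)
```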
(2.11) Add 'provider' to IDiskParams
IDISK_PROVIDER was included in python's IDISK_PARAMS, so it should also be included in the Haskell code.
Now that luxid creates and enqueues jobs, without this patch the ExtStorage interface is broken as the user cannot pass the disk...
(2.11) Make BlockDev subclasses adhere the interface for Create
In commit 702c3270 two new parameters were added to the Create function of BlockDev. Make subclasses also adhere to this specification.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
(2.11) Make BlockDev subclasses adhere to new interface
In commit 702c3270 two new parameters were added to the constructor of BlockDev. Make the subclasses accept these additional parameters as well.
(2.11) Make disk.name and disk.uuid available in bdev
Until now Disk name and uuid were not available on bdev level. In case of ExtStorage, this info is useful, and may be for other templates in the future too.
This patch treats the name and uuid object slots just like the size...
(2.13) kvm: Add migration capabilities as an hvparam
Latest QEMU versions support various migration capabilities. Each can be enabled/disabled with 'migrate_set_capability' monitor command.
Version 1.7.0 defines x-rdma-pin-all, auto-converge, zero-blocks,...
(2.8r) Workaround for Issue 621
Upon LUNetworkDisconnect() and LUNetworkConnect() try to acquire all cluster's instances.
By that _LS_ACQUIRE_ALL acquire mode is set and not _LS_ACQUIRE_EXACT and thus the deleted lock does not cause any problem.
NOTE: This workaround is not merged upstream. They prefer to have...
'Raise' called inside 'CheckPrereq' needs the prereq kw
This patch fixes the missing 'prereq' keyword in calls to 'Raise' in the control flow of 'CheckPrereq', and updates the tests.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Use more efficient statistics for the standard deviation
Instead of using the full sample as statistics providing enough information to compute the standard deviation, use a slightly more elaborate one. It contains the standard statistics count, sum, and sum of squares, which can also...
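Illustrative sketch (not part of the patch, which is Haskell): a standard deviation kept as count, sum and sum of squares, so that changing one sample value only needs a constant-time update. The class and method names are hypothetical, and the population convention (dividing by the count) is an assumption.

```python
import math


class StdDevStats:
    """Count, sum and sum of squares of a sample."""

    def __init__(self, count, total, total_sq):
        self.count = count
        self.total = total
        self.total_sq = total_sq

    @classmethod
    def from_sample(cls, xs):
        """Compute the statistics directly from the full sample."""
        return cls(len(xs), sum(xs), sum(x * x for x in xs))

    def replace(self, old, new):
        """Return the statistics after one value changes from old to new."""
        return StdDevStats(self.count,
                           self.total - old + new,
                           self.total_sq - old * old + new * new)

    def std_dev(self):
        """Population standard deviation derived from the three numbers."""
        mean = self.total / self.count
        variance = self.total_sq / self.count - mean * mean
        return math.sqrt(max(variance, 0.0))  # clamp rounding noise below zero


stats = StdDevStats.from_sample([1.0, 2.0, 3.0, 4.0])
updated = stats.replace(4.0, 8.0)  # no pass over the whole sample needed
```

Repeated updates can accumulate rounding error relative to a direct recomputation, so comparing the two within a tolerance is a sensible check.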
Use statistics updates when allocating on pairs
When considering the various ways of positioning an instance on a pair of nodes, make use of the fact that the statistics are extremely similar (only two nodes changed) and obtain the new statistics by updating the old one, rather than by recomputing...
Factor score computation through abstract statistics
Logically separate the computation of the cluster score into two steps: the computation of the abstract statistics and its evaluation. In this way, we obtain an abstract value which we can update instead of recomputing it when considering different...
Verify the update of the standard deviation statistics
Add a test that verifies that the error introduced by updating a standard-deviation statistic of a sample with at least two elements is not too large, as compared to the direct computation.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Add data type for abstract statistics
Our cluster score is a weighted sum of certain sums and standard deviations of node characteristics. When placing a single instance, the cluster scores of a big number of quite similar clusters are computed: that of the original...
Relax test requirements
Instead of insisting on perfect equality of score allow for numerical inaccuracies and consider all differences in the cluster score smaller than 1e-12 negligible. Given that, by default, a cluster with a score of less than 1e-9 is considered perfectly balanced,...
Fix gnt-network client wrt instances report
Let the gnt-network client expect a list of instance names and not UUIDs as returned by QueryNetworks (by both old and new style query mechanism).
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Helga Velroyen <helgav@google.com>
Fix QueryNetworks wrt instances
QueryNetworks tries to find which instances are connected to which networks. The query mechanism in Haskell was written back when NICs referred to a network via its name and not its UUID. Fix luxi to comply with the current implementation (network slot of NIC object...
tiered allocation: try canonical search path first
In tiered allocation, instances are put on the cluster while they fit, and once no more instances of the given size can be fit, smaller instances are tried next. There is obviously some heuristics involved...
Add QA config flag for all performance tests
Add a config flag similar to "os", "env" or "rapi" which disables all performance related tests centrally. The individual config flags for jobqueue and parallel processing focused tests are not touched.
Also, add the flags to qa-sample.json....
build-bash-completion: reduce branches
The 'build-bash-completion' script has an enormous function which triggered a 'too many branches' lint error and was quite easily splittable in logical sub-functions.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Convert all the classes to new-style classes
... to make lint shut up.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>
Merge branch 'stable-2.9' into stable-2.10
Conflicts: src/Ganeti/Monitoring/Server.hs: trivial
Improve haskell style
...by fixing lint warnings found by HLint v1.8.57. In particular, make sure 'make hlint' passes for this version of hlint.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Add --no-locks option to gnt-debug delay
Add the possibility to not acquire locks during `gnt-debug delay`. This allows running many delay jobs in parallel instead of having them run sequentially.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>...
Include design-performance-tests.rst in index
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Document the --force-failover option
Extend the gnt-group man page by documenting the --force-failover option of the evacuation command.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
Support group evacuation by failover
Support evacuating a node group not using migration. This can be useful if the group evacuated to has different hardware.
Add an option --force-failover
...to be added to gnt-group evacuate forcing evacuation by means of failovers instead of migrations.
Extend OpGroupEvacuate by a ForceFailover paramter
Add a parameter to OpGroupEvacuate to force failovers to be used instead of migrations. This can be useful, if a group is evacuated to another with different hardware.
Mark performance tests design as implemented
The performance tests are implemented as outlined in the design doc, so mark the document as implemented.
check-man-warnings: use C.UTF-8 and set LC_ALL
check-man-warnings currently partially forces the en_US.UTF-8 locale by setting LANG. This implicitly assumes that the locale exists, which might not be the case when building e.g. in chroot environments. If the...
openvswitch fix
Document the --sequential option
Document that group evacuation is usually run in parallel, but can be made sequential by providing an appropriate option.
Support sequential evacuation
Make gnt-group evacuate support the --sequential option, which causes all evacuation moves to be executed sequentially. This can be used to avoid congestion on a possibly slow link between the node groups.
Add an option --sequential
...which can be used to tell commands like gnt-group evacuate to sequentially perform their action to keep load away from the cluster.
Extend OpGroupEvacuate by a sequential paramter
...telling it to run all the evacuation jobs sequentially. This might be useful to avoid too much load that otherwise might occur.
Fix passing of ispecs in cluster init during QA
The ispecs were previously passed as multiple parameters to gnt-cluster init, which did not yield the desired result. This patch changes this behavior and passes the min/std/max values in one parameter.
Signed-off-by: Thomas Thrainer <thomasth@google.com>...
On expanding jobs, extend reason trail
Certain op-codes expand to a set of jobs. For those new jobs, extend their reason trail with the reasons of the job that expanded to them. In this way, also for indirectly generated jobs a complete trace back to the initiator can be...
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Postpone 2.10.4 release to May 15th
Due to some tests not being completed by today, postpone the 2.10.4 release to tomorrow.
Signed-off-by: Thomas Thrainer <thomasth@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Don't fail QA if submitting a job takes too long
Degrade a QA error which was triggered if job submission takes too long to a warning. This will prevent spurious QA failures.
Revision bump for the 2.10.4 release
Prepare NEWS file for 2.10.4 release
Preparing the NEWS file for the release of 2.10.4 on Wednesday.
Add LC_ALL=en_US.UTF-8 before running check-man-warnings
check-man-warnings fails with the following error if LC_ALL is set to a non-UTF-8 locale (e.g., 'C'):
col: Invalid or incomplete multibyte or wide character
man: command exited with status 1: col -b -p -x
Signed-off-by: Yuto KAWAMURA <kawamuray.dadada@gmail.com>...
Move QAThreadGroup to qa_job_utils.py
Move QAThreadGroup to the utils module so it can easily be used with QAThread.
Extract GetJobStatuses and use an unified version
Unify two very similar functions which query the ganeti cluster for job statuses during QA.
Run disk template specific tests only if possible
Only run disk template specific tests if the corresponding disk template is really enabled. Also, move the (up to now wrong) check out of qa_performance.py to ganeti-qa.py, so no no-time test runs are reported...
Test parallel instance ops and plain instances
Test various instance operations while another instance is created in parallel. Also enable a test which creates twice as many plain instances as there are nodes in the cluster in parallel.
Test parallel creation of DRBD instances
Test the performance of parallel creation (and immediate removal) of DRBD backed instances. Twice as many instances are created as there are nodes in the cluster.
This also required some refactoring of the test code in order to reduce...
Test parallel job submission performance
Submit 200 delay jobs and verify that the submission rate does not drop as more jobs are added to the queue. Also verify that a `gnt-cluster info` is not slowed down by a large number of jobs in the queue.
Test parallel instance query operations
For each created instance, a `gnt-instance info` is issued. In addition, `gnt-instance list` is issued just as often.
Test parallel instance operations
Test parallel starting, stopping, rebooting and (if supported) reinstalling instances.
Test parallel instance modification
Submit modifications of backend parameters as well as OS parameters in parallel for the maximum amount of instances available.
Test parallel node-count instance creation
Test the parallel creation (and removal) of as many instances as there are nodes in the cluster.
Test parallel instance creation and removal
This is the first performance related test. It creates as many instances as available in the QA config in parallel and removes them (again in parallel) immediately after the creation succeeded.
In order to ease writing of additional tests, a lot of the logic is kept...
Fail in replace-disks if attaching disks fails
Previously, if attaching the new secondary during a replace-disks operation failed, only a warning was emitted. The subsequent sync-disks operation cannot finish in such a case, however.
Therefore, this patch changes the warning into an error. This way it's...
Conflicts: configure.ac # Taken both contributions
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Add a basic test for --restricted-migration
Essentially verify that, in the given example, a solution is still found and that the originally present failover is dropped.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Describe the --restricted-migration option
Add the --restricted-migration option to the man page together with a hint on the intended use case.
Support restricted migration
Make hbal support an option to disallow ReplacePrimary moves and restrict ReplaceAndFailover to instances where the primary node is drained. If used in evacuation mode, the only migration moves will be off the drained nodes....
Add an option for restricted migration
This option will allow node evacuation with migrations only off the nodes to be evacuated.
Add an example for node evacuation
The configuration shows an unbalanced cluster with a node being drained. The natural evacuation strategy includes frf-moves.
KVM: set IFF_ONE_QUEUE on created tap interfaces
The IFF_ONE_QUEUE flag directs the kernel to only queue tap packets once (as opposed to queueing them twice, once for the device, and once for the qdisc), possibly avoiding interface stalls when one of the queues overruns....
Add configure option to pass GHC flags
Adding the HEXTRA option to make might not be practical for a change that should be always applied, e.g., hiding a certain package. This patch allows the flags to be specified at the configure level.
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Add a test for parsing version strings
...even in the presence of patch levels.
Set correct Ganeti version on setup commands
When asked to execute a setup command, prefix it with a command sequence to test for the existence of the needed Ganeti version and to switch to it.
Add a utility to combine shell commands
Add a function that combines individual shell commands into a single command (calling the standard shell) that executes the given commands in sequence while they succeed.
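Illustrative sketch (not part of the patch): one way to combine shell commands into a single command that runs them in sequence and stops at the first failure. The function name is hypothetical and the use of `sh -c` with `&&` is an assumption about how "while they succeed" could be realised.

```python
import shlex


def combine_commands(commands):
    """Return one shell command running all commands until one fails.

    Each element of commands is a complete shell command string. They are
    chained with '&&' and the chain is quoted as a single 'sh -c' argument.
    """
    script = " && ".join(commands)
    return "sh -c " + shlex.quote(script)


combined = combine_commands(["true", "echo ok"])
```

Chaining with `&&` means a later command only runs if every earlier one exited with status 0, which matches the "while they succeed" wording.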
Add design doc for performance tests
This design doc describes which tests are added in order to test the performance of Ganeti, specifically when handling multiple jobs in parallel.
Note that this design doc is submitted to stable-2.10 so performance changes over different Ganeti versions can be captured....
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
If Automake version > 1.11, force serial tests
This fixes broken compilation on Debian Jessie (#802). See also http://stackoverflow.com/questions/15820844/
Thanks to Apollon Oikonomopoulos for finding the solution and Klaus Aehlig for helping to do it conditionally....
Fix failed DRBD disk creation cleanup
When creating a DRBD disk, Ganeti reserves minor numbers on a per-node basis. In case of a failed disk creation, these reservations should be released. During the name/uuid refactoring, the invocation of the function that releases the minors was not updated, resulting in no...
Fix lint errors introduced during cherry-pick
Calm a few lint errors introduced during cherry-picking code in qa_job_utils.py. The fixes were intentionally made in a way which should produce merge conflicts later on, so it's not forgotten to undo them.
Hooking up verification for shared file storage
As for the cluster modify, it was also forgotten to hook up the verification of the shared file storage paths although all the infrastructure was already in place.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Fix --shared-file-storage-dir option of gnt-cluster modify
While all infrastructure to make shared-file storage runtime-configurable was already submitted, the actual setting of the path was forgotten. This patch fixes it.
Clarify default setting of 'metavg'
This fixes issue 810, suggesting to clarify where the default for 'metavg' comes from.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Fix invocation of GetCommandOutput in QA
The cherry-picked function _GetOutputFromMaster() calls GetCommandOutput() with parameters only present in newer Ganeti branches. Remove those parameters as they cause errors in stable-2.10.
Clean up RunWithLocks
This patch cleans RunWithLocks up a little bit by reducing the number of delay function terminations, and using the QAThread class to ensure exceptions are thrown at the right time and in the right place.
Add an exception-trapping thread class
To have better control over threads, this patch adds a helper Thread subclass which captures any exceptions occurring for later use.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>...
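Illustrative sketch (not part of the patch): a Thread subclass that stores an exception raised by its target so that the caller can re-raise it later. The class and method names are hypothetical and do not reproduce the QAThread code.

```python
import threading


class TrappingThread(threading.Thread):
    """Run a callable in a thread and keep any exception it raises."""

    def __init__(self, fn, *fn_args, **fn_kwargs):
        super().__init__()
        self._fn = fn
        self._fn_args = fn_args
        self._fn_kwargs = fn_kwargs
        self.trapped = None  # exception raised by fn, if any

    def run(self):
        try:
            self._fn(*self._fn_args, **self._fn_kwargs)
        except Exception as exc:  # store instead of losing it in the thread
            self.trapped = exc

    def join_and_reraise(self):
        """Wait for the thread, then raise the trapped exception, if any."""
        self.join()
        if self.trapped is not None:
            raise self.trapped
```

With a plain `threading.Thread`, an exception in the target is only printed from inside the thread; storing it lets the caller decide when and where it surfaces.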
Wait for delay to provide interruption information
The RunWithLocks test assumed that gnt-debug delay would have the info needed for interruption ready immediately after being run, and in some situations this is not the case. This patch makes the test more patient...
Add an expected block option to RunWithLocks
To compensate for the cases where a QA test is supposed to block when a lock is present, add an additional option showing whether blocking is supposed to happen or not.
Track if a QA test was blocked by locks
This patch adds threading to the RunWithTests function, allowing one thread to execute the QA test, and the other to monitor if it is being blocked by locks set up during the test. If it is, terminate the blocking job, and let the QA continue, reporting the test failure at...
Add a RunWithLocks QA utility function
This patch adds a QA utility function that acquires a set of locks, and attempts to run a given function with the locks in place. Should the given function block, this function does not detect this - later patches will address the issue....
Set exclusion tags correctly in requested instance
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Jose Lopes <jabolopes@google.com>
Export extractExTags and updateExclTags
...from the htools Loader. These functions are needed when parsing the requested instance of an allocator request.