Centralize the creation of a WConfd context in Python code
This will allow easier modification of the structure of a clientidentity later.
Also add a helper method for creating a WConfd context from a context.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
Merge branch 'stable-2.11' into master
Merge branch 'stable-2.10' into stable-2.11
Merge branch 'stable-2.9' into stable-2.10
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Thomas Thrainer <thomasth@google.com>
kvm: Add migration capabilities as an hvparam
Latest QEMU versions support various migration capabilities. Each can be enabled/disabled with the 'migrate_set_capability' monitor command.
Version 1.7.0 defines x-rdma-pin-all, auto-converge, zero-blocks, and xbzrle migration capabilities....
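As a rough illustration of how a list of capabilities could be turned into monitor commands, the sketch below builds one 'migrate_set_capability' line per capability. The helper name and the colon-separated input format are assumptions made for this example only; they are not taken from the Ganeti code.

```python
def build_capability_commands(caps, enable=True):
    """Return one monitor command per capability (hypothetical helper)."""
    state = "on" if enable else "off"
    # Assumed input format: capability names separated by colons.
    names = [name for name in caps.split(":") if name]
    return ["migrate_set_capability %s %s" % (name, state) for name in names]


# Example: two of the capabilities named in the commit message.
for command in build_capability_commands("auto-converge:xbzrle"):
    print(command)
```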
Make watcher submit queries low priority
Make the watcher collect its data using low-priority jobs, to avoid blocking user/admin jobs. Note that repair jobs are still submitted with normal priority. Fixes issue 772.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Add a function for getting the list of candidate certs
.. to ConfigWriter as well.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Move utility functions for candidate certs. to ConfigWriter
In particular AddNodeToCandidateCerts and RemoveNodeFromCandidateCerts.
Calling 'cfg.Update(cluster)' causes problems in WConfd, as it doesn't operate on a shared configuration object any more....
Show OS variant information in gnt-os info
Currently, the non-standard/modified per-OS hypervisor parameters, or OS specific parameters can be listed only by the 'gnt-cluster info' command, which is a non-standard place to show them. Extend the 'gnt-os info' command to display the available/supported OS variants...
Skip rename when OS scripts are absent
When an instance does not have OS scripts because, for example, it uses an OS image, do not rename the instance after an import.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Use raw disks in import/export when OS scripts are absent
When an instance does not have OS scripts because, for example, it has an OS image, then the import/export should not try to run the OS scripts. Instead, it should use raw import/export.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>...
Add the thread identifier parameter to gnt-debug listlocks
.. to comply with the updated WConfd interface.
This makes the call less useful as the thread ID is often unknown to users. This needs to be improved in the future.
Add a thread ID to the WConfd client id
This makes it possible to distinguish threads that don't have a job id, which is needed for answering queries.
Since Python thread IDs aren't guaranteed to be unique, in future it'd be preferable to use a different, unique identifier....
Fix conflict between virtio + spice or soundhw
With regard to PCI slot occupied by a KVM instance we have observed the following:
1) Slot 0 will always be Host bridge.
2) Slot 1 will always be ISA bridge.
3) Slot 2 will always be VGA controller (even with -display none)....
Fix bitarray ops wrt PCI slots
Introduce new method `_GetFreeSlot()` responsible only for bitarray operations. It fixes the search when the bitarray is either '0000..' or '1111..'.
Use it instead of `_UpdatePCISlots()` and in `_GetFreePCISlot()`.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>...
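The two edge cases mentioned above (no slot occupied, every slot occupied) can be illustrated with a small sketch. It uses a plain list of booleans as a stand-in for the bitarray, and the function name, the 'reserve' parameter and the error type are assumptions for this example rather than the actual `_GetFreeSlot()` implementation.

```python
def get_free_slot(slots, reserve=False):
    """Return the index of the first free slot (hypothetical helper).

    `slots` is a list of booleans where True means "occupied"; it stands
    in for the bitarray used by the real code.
    """
    try:
        # Works for the all-free case ('0000..'): the answer is index 0.
        idx = slots.index(False)
    except ValueError:
        # All slots taken ('1111..'): report it instead of returning junk.
        raise RuntimeError("All slots are occupied")
    if reserve:
        slots[idx] = True
    return idx


slots = [True, True, True, False, False]  # slots 0-2 occupied
print(get_free_slot(slots, reserve=True))
```

Keeping the search separate from the bookkeeping means the "nothing free" case is handled in exactly one place.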
Remove unused functions to check OS variants
... as this is now performed on the node.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Remove unused RPC 'os_get'
... including the RPC post-process hook.
Remove calls to 'CheckNodeHasOS'
... because 'CheckOSParams' already checks the OS variant.
Add 'force_variant' to RPC 'os_validate'
Move function 'CheckOSVariant' to the node and add parameter 'force_variant' to RPC 'os_validate', thus making the node verify the OS variant together with the rest of the OS params.
Remove SSH copyfile from LU and assume the file exists
According to Ganeti design, files should not be copied from master to nodes and instead they are assumed to exist and it is the user's responsibility to ensure that the file does exist.
Fix OS image detection on master
Extend '_DumpDevice' to enable/disable file truncation
... because when the data source is infinite, truncation is not necessary, but when the data source is finite and is, for example, smaller than the device, truncation can reduce the disk size.
Fix disk truncation in download and dump OS images
Check if OS image exists on the node before dumping
Make mcpu acquire WConfD locks
So far, the mcpu acquires locks that live in memory of masterd. This design does not fit with our jobs-as-processes goal. So make mcpu acquire the corresponding locks in WConfD instead.
Note that this implies changes in various other files that call...
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Rename compression option in gnt-backup export
The gnt-backup export option --compress did not do what most would expect upon seeing the name: compress the exported image. Instead, it used compression to try and speed up the transfer of the image, decompressing it prior to writing the file. To reduce confusion, this...
Instance reinstall with OS images
OS images in 'LUInstanceCreate' and OS scripts optional
Instance create with OS image
Extend 'LUInstanceCreate' to image the instance's first disk if an OS image is specified via the OS params. If the OS image is a file, it will be copied to the node via SSH. If the OS image is a URL, it will be passed directly to the node, which will then download the file....
Function to check if the OS image parameter is valid
Add helper function to check if the OS image parameter contained in the OS parameters of an opcode is valid.
Function to image disks while ensuring that disks are paused
Function to remove instance if disks are degraded
RPC 'blockdev_image' to image devices
Add RPC 'blockdev_image' that uses 'ganeti.backend.BlockdevImage' to dump an image to an instance's disk device, optionally downloading that image.
Helper function to image a device by downloading or dumping
Add 'BlockdevImage' which downloads a file and dumps it to an instance's disk if the path is a URL, otherwise it dumps the file directly to the instance's disk.
Helper function that downloads an image and dumps it to disk
Generalize 'WipeDevice' to 'DumpDevice'
Helper functions to get and update OS image from OSParams
Add 'GetOSImage' and 'PutOSImage' which handle the OS image key in the OS parameters dict.
Fix export order according to definition order
Fix docstrings
Fix several docstrings.
Reuse method to parse name from OS 'name+variant' string
Fix error introduced during merge
A parameter was lost while resolving a conflict in the signature of a function.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
gnt-cluster copyfile: accept relative paths
If, on the command line, the argument to gnt-cluster copyfile is a relative path, consider this a shorthand for the corresponding absolute path.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Michele Tartara <mtartara@google.com>
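A minimal sketch of the relative-path handling described for gnt-cluster copyfile, assuming the path is resolved against the current working directory. The helper name and the optional 'cwd' parameter are hypothetical and only serve the example.

```python
import os.path


def normalize_copyfile_arg(path, cwd=None):
    """Expand a relative path to an absolute one (hypothetical helper)."""
    if os.path.isabs(path):
        return path
    # Assumption: relative paths are interpreted relative to the
    # directory the command was started from.
    base = cwd if cwd is not None else os.getcwd()
    return os.path.normpath(os.path.join(base, path))


print(normalize_copyfile_arg("etc/hosts", cwd="/srv"))
```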
Merge branch 'origin/stable-2.8' into stable-2.9
Let WConfd distribute SSConf to nodes
.. and remove the corresponding code from lib/config.py.
Signed-off-by: Petr Pudlak <pudlak@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Let the SSConf RPC server side handle lists
Since on the Haskell side we represent SSConf as a list of lines, let the Python side understand it as well.
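One way to picture "understanding lists" on the receiving side is a small normalization step that accepts either form. This is a sketch under assumptions: the function name is hypothetical, and joining lines with a newline is a guess at the intended file format, not something stated in the commit.

```python
def ssconf_value_to_text(value):
    """Normalize an SSConf value to text (hypothetical helper).

    Accepts either a plain string or a list/tuple of lines.
    """
    if isinstance(value, (list, tuple)):
        # Assumption: a list of lines maps to newline-separated text.
        return "\n".join(value)
    return value


print(ssconf_value_to_text(["node1.example.com", "node2.example.com"]))
```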
Let WConfd distribute the configuration to MCs
.. and remove the distribution from lib/config.py
Add a new RPC server call for uploading a single file
The server side processes the request exactly the same as for "upload_file".
Unlike "upload_file", the new call "upload_file_single" declares all required fields without requiring additional preprocessing....
Add more meaningful error messages to asserts in vcluster
.. to simplify debugging of RPC calls.
Improve RAPI detection of the watcher
If the watcher is not allowed to access RAPI, it doesn't mean that it is dead and needs to be restarted.
Fixes Issue 752.
Signed-off-by: Michele Tartara <mtartara@google.com>
Reviewed-by: Hrvoje Ribicic <riba@google.com>
Enable a timeout for instance shutdown
Add the timeout parameter to the StopInstance function of the hypervisor base class and to all its implementations.
Also, change the tests as required by this change.
Signed-off-by: Michele Tartara <mtartara@google.com>...
Allow KVM commands to have a timeout
Modify the function that sends commands to the KVM monitor so that it is possible to specify an optional timeout after which the command is killed.
Allow xen commands to have a timeout
Modify the function that runs Xen commands so that it is possible to specify an optional timeout after which the command is killed.
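The general pattern of "run a command, kill it after an optional timeout" can be sketched with the standard library. This is not the Ganeti helper; the function name and the choice to return None on timeout are assumptions for illustration.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout=None):
    """Run cmd and return its exit code, or None if it timed out.

    Hypothetical helper: with timeout=None the command may run forever.
    """
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return None
    return proc.returncode


# A command that finishes well within its timeout.
print(run_with_timeout([sys.executable, "-c", "pass"], timeout=30))
```

Making the timeout optional keeps existing callers unchanged, which matches the "optional timeout" wording of the two commits above.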
Fix wrong docstring
Fields must be the final elements in an epytext string.
Use correct lockfile for gnt-debug wconfd
As jobs are currently running in masterd, use the masterd livelock file.
Signed-off-by: Klaus Aehlig <aehlig@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Add utility to guess livelock file for an owner
As livelock files are constructed in a systematic manner, we can guess what the livelock file for a given owner is. While this will not necessarily work perfectly, it will be useful to simplify direct debugging of WConfD....
Make masterd create a livelock file
...so that it can request resources from WConfd.
Rename setup_queue to setup_context in masterd
...as this function sets up a much richer context than just the job queue, including the current lock management.
Add utilities for liveliness lock files
To request resources from WConfD, requesters have to provide the name of a file they own an exclusive lock on. In this way, their death can be detected. Add utility functions to obtain such a file name.
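The idea of proving liveness through an exclusively locked file can be sketched as follows. The helper name is hypothetical, and the use of flock() is an assumption; the commit does not say which locking primitive the utilities use.

```python
import fcntl
import os


def acquire_livelock(path):
    """Create path if needed and take an exclusive lock on it.

    Hypothetical helper. Returns the open file descriptor; the lock is
    held for as long as the descriptor stays open, so it disappears
    automatically when the owning process dies.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Non-blocking: fail immediately if somebody else holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd
```

A checker that wants to know whether the owner is still alive can try to take the same lock: success means the owner is gone.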
Ensure the existence of LIVELOCK_DIR
Add a path to store the lock files presented to WConfD
When requesting resources from WConfD, a file has to be presented on which an exclusive lock is owned, so that WConfD can detect when the requester dies. Add a path to a directory where these files are kept....
Convert int to float when checking config. consistency
When reading the configuration file from RPC JSON, values without a floating point are parsed as 'int', not as 'float', and later the consistency check fails.
This patch adds an automatic conversion from 'int' to 'float' during...
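The problem is easy to reproduce with the standard JSON parser, which yields an int for '1' and a float for '1.0'. The sketch below shows one possible conversion step; the function name and the idea of passing an explicit list of float-typed keys are assumptions for this example, not the actual patch.

```python
import json


def coerce_floats(params, float_keys):
    """Return a copy of params with the given keys converted to float.

    Hypothetical helper: only plain ints are converted; booleans (a
    subclass of int in Python) are left alone.
    """
    result = dict(params)
    for key in float_keys:
        value = result.get(key)
        if isinstance(value, int) and not isinstance(value, bool):
            result[key] = float(value)
    return result


# The key name below is made up for the example.
parsed = json.loads('{"ratio": 1, "count": 4}')
fixed = coerce_floats(parsed, ["ratio"])
print(type(parsed["ratio"]).__name__, type(fixed["ratio"]).__name__)
```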
Align timestamps in gnt-job info
This patch aligns the timestamps output as a part of gnt-job info, and performs minor refactorings in the process.
Signed-off-by: Hrvoje Ribicic <riba@google.com>
Reviewed-by: Petr Pudlak <pudlak@google.com>
Add alignment support to PrintGenericInfo
Aligning dictionary entries makes no difference to a YAML parser, but makes the output much easier to read and compare. This patch adds the possibility of specifying alignment groups to ordered dictionary entries....
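To show what aligned entries look like, here is a small sketch that pads every key in a group to the same width. It is not the PrintGenericInfo code; the function name and the list-of-pairs input are assumptions for illustration.

```python
def format_aligned(entries):
    """Format (key, value) pairs so that all values start in one column.

    Hypothetical helper: `entries` is a list of (key, value) string pairs
    that belong to the same alignment group.
    """
    if not entries:
        return []
    # Width of the longest key plus the trailing colon.
    width = max(len(key) for key, _ in entries) + 1
    return ["%s %s" % ((key + ":").ljust(width), value)
            for key, value in entries]


for line in format_aligned([("Received", "t1"), ("Processing start", "t2")]):
    print(line)
```

The extra spaces after the colon are ignored by a YAML parser, so the output stays machine-readable.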
Make gnt-job info output valid YAML
This patch changes gnt-job info to use standard functions defined in cli.py, and output valid YAML.
Make PrintGenericInfo handle tuples better
The PrintGenericInfo function in cli.py did not handle tuples as containers of items, making it impossible for these to be deserialized automatically when a YAML parser is used. This patch adds separate handling of tuples, including inlining them for readability when...
Make gnt-debug delay interruptible
The gnt-debug delay command could be useful as a means of acquiring locks for testing purposes. In practice, to be useful it should be interruptible, otherwise we risk race conditions or long delays.
This patch follows the examples of the move-instance command and the...
Add the interruptible option to gnt-debug delay
This patch allows the opcode option to be used through the gnt-debug client.
Factor Unix domain socket creation into helper class
As the delay class will also have to start using domain sockets, extract the functionality into a helper class.
Fix minor accidental concatenation
Handle incorrect duration more elegantly
The previous version of the LUTestDelay opcode relied on the utility function complaining about the negative duration. As this function has been removed for now, do the check ourselves, and issue a more appropriate exception....
Make gnt-debug delay command run in parallel
The gnt-debug delay command executes the delay first on the master, and only then on all the other nodes, causing a significant delay. This patch makes the command treat the master as it would all other nodes....
Fix typo in RAPI client utility
Remove duplicated '_CheckOSVariant'
It seems '_CheckOSVariant' was moved from 'ganeti.cmdlib.instance' to 'ganeti.cmdlib.instance_utils' but the source was never deleted. This patch deletes the source copy of this function.
Use node UUIDs for executing LU hooks
LUNodeAdd, the only LU still using a node name, is changed to override PreparePostHookNodes() and use node UUIDs only as well. This allows removing the support for 3-tuples as results of BuildHooksNodes() and removes the translation to node names....
Add PreparePostHookNodes to LUs
This method can be used to alter the list of node UUIDs on which post hooks are executed. PreparePostHookNodes is called after Exec, so LUs can use data only known after the execution of the LU.
Signed-off-by: Thomas Thrainer <thomasth@google.com>...
Fix error propagation in post-commit hooks
An error in the post-commit hooks could not be propagated correctly and could result in e.g. the return code of gnt-cluster verify being 0 even in the presence of an error in its output.
Fixes Issue 744.
Add listlocks to gnt-debug wconfd
So that wconfd's locking can be debugged directly.
Stop watcher from restarting down instances during an opcode
This patch changes the watcher to check whether an instance that is down is also locked by some LU before attempting to restart the instance. Without checking the lock status, the watcher could think...
Remove unused import in rpc/transport.py
.. which got there by mistake.
Retry luxi/wconfd RPC calls if the connection is closed
Since the daemon can decide to close a client connection after a timeout, the client needs to be able to automatically reconnect.
This patch introduces this functionality into the RPC client: If an attempt to send data fails on 'Broken pipe', it's retried one more...
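The retry-once behaviour can be sketched independently of any particular transport. The function name and the two callables are hypothetical; the real client code is not shown in the commit excerpt.

```python
def call_with_reconnect(send, reconnect):
    """Call send(); on a broken pipe, reconnect and try exactly once more.

    Hypothetical helper: `send` and `reconnect` are callables supplied by
    the caller. Any other error, or a second failure, propagates.
    """
    try:
        return send()
    except BrokenPipeError:
        # The daemon closed the connection, e.g. after an idle timeout.
        reconnect()
        return send()
```

Limiting the retry to a single attempt avoids looping forever when the daemon is really gone.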
Allow cluster mac prefix modification
Extend LUClusterSetParams to allow the modification of the cluster mac-prefix setting in 'gnt-cluster modify' command.
This fixes part of issue 239.
Signed-off-by: Dimitris Bliablias <bl.dimitris@gmail.com>
Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Show mac prefix setting in gnt-cluster info
Include mac-prefix setting in the output of 'gnt-cluster info' command.
Setting correct permissions of client cert (split-user)
This patch makes sure that the client certificate gets the right permissions and owner when created. Additionally it enhances the 'ensure_dirs' script to correct the permissions in case they are broken for whatever reason....
Add a command to gnt-debug to test various aspects of wconfd
For debugging purposes, support direct communication to WConfD from the command line for some of its commands. For the time being, support the echo command.
Add some whitespace to fix formatting
Some error messages were lacking spaces between lines, which made them hard to read.
Signed-off-by: Helga Velroyen <helgav@google.com>
Reviewed-by: Klaus Aehlig <aehlig@google.com>
Consider old client cert only when available
This fixes a bug which occurred only after upgrading from 2.10 to 2.11. During the cluster renew-crypto operation, Ganeti tries to include the old certificate in the candidate map while it is providing new certificates. This failed when there was no certificate...
Fix return of 'Validate'
Signed-off-by: Jose A. Lopes <jabolopes@google.com>
Reviewed-by: Helga Velroyen <helgav@google.com>
Add reason for job pickup to the trail
Add a new entry in the reason trail when a job is picked up by MasterD from the hard drive, after LuxiD put it there.
Note that the signature of NameToReasonSrc is changed in an incompatible way, although it's a public method because in this commit we also change its only...
Make the AddReason method public
It will need to be accessed from outside the class too in one of the next commits.
Let config.py use WConfd for reading/writing the config
Currently it only relays the reads/writes to the file to WConfd, everything else still remains in config.py.
Also if the 'ConfigWriter' is opened in "offline" mode (like in bootstrap.py), it doesn't use WConfd and resorts to the original...
Start WConfd temporarily during master failover
.. in order to update the configuration and distribute ssconf, before starting the daemons by the scripts.
Merge branch 'origin/stable-2.10' into stable-2.11
Signed-off-by: Hrvoje Ribicic <riba@google.com>...