KVM: reserve a PCI slot for the SCSI controller
Currently instances with disk_type=scsi are broken, because the SCSIcontroller uses a PCI slot not accounted for in the PCI assignmentlogic. We reserve a throw-away slot just for that.
This is a temporary workaround and will be reverted during the upcoming...
Check for LVM-based verification results only when enabled
This patch fixes a little glitch in 'gnt-cluster verify'.If LVM-based storage was disabled, it would still checkLVM-related verification results and print a confusingerror message.
Signed-off-by: Helga Velroyen <helgav@google.com>...
Fix "existing" typos
This patch fixes the wording of a couple of messages,including two typos of the word 'existing'.
Signed-off-by: Helga Velroyen <helgav@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>
Fix output of gnt-instance info after migration
After migrating a DRBD based instance, the output of gnt-instance infowas wrong wrt. DRBD minors. This patch fixes the output in such cases.
Signed-off-by: Thomas Thrainer <thomasth@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>
Verify configuration version number before parsing
As the attempt to convert the dict used as json representationof the configuration into a configuration object already makesassumptions about the internal representation, verify the versionbefore such an attempt. Fixes issue 783....
Merge branch 'stable-2.9' into stable-2.10
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Klaus Aehlig <aehlig@google.com>...
Merge branch 'stable-2.8' into stable-2.9
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Klaus Aehlig <aehlig@google.com>
Fix specification of TIDiskParams
Commit 580b1fdd incorrectly assumes that disk parameters arejust the standard ones, whereas the man page explicitly statesthat additional parameters can be passed as well, if they makesense for the chosen storage type. Fix this....
Add aliases for nodes
This patch adds a single alias for the secondary_ip property of thenode.
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Thomas Thrainer <thomasth@google.com>
Add renaming of group custom ndparams, ipolicy, diskparams
This patch adds the ability to set the group-specific parameters in thesame way they are described when returned by the info command - withthe "custom_" prefix.
Signed-off-by: Hrvoje Ribicic <riba@google.com>...
Add renaming of instance custom params
Much like the groups before, this patch allows custom_* params to besubmitted under the same name they can be retrieved as in the infocall.
Add support for value aliases to RAPI
This patch extends the metaclass used to generate RAPI handlers toallow creating aliases of certain values returned by GET methods.
Add aliases for cluster parameters
This patch adds aliases for two cluster parameters.
Add reason parameter to RAPI client functions
Only the functions for starting, stopping and rebooting a VM had a reasonparameter. Now, all the RAPI client functions generating opcodes do.
Also, one test is expanded to verify that a RAPI request with both body and...
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Thomas Thrainer <thomasth@google.com>
Make watcher submit queries low priority
Make the watcher collect its data using low-priority jobs,to avoid blocking user/admin jobs. Note that repair jobs arestill submitted normal priority. Fixes issue 772.
Signed-off-by: Klaus Aehlig <aehlig@google.com>...
Fix conflict between virtio + spice or soundhw
With regard to PCI slot occupied by a KVM instance we haveobserved the following:
1) Slot 0 will always be Host bridge.2) Slot 1 will always be ISA bridge.3) Slot 2 will always be VGA controller (even with -display none)....
Fix bitarray ops wrt PCI slots
Introduce new method `_GetFreeSlot()` responsible only for bitarrayoperations. It fixes search in case of bitarray is either '0000..'or '1111..'.
Use it instead of `_UpdatePCISlots()` and in `_GetFreePCISlot()`.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>...
Fix error introduced during merge
A parameter was lost while resolving a conflict in the signature of a function.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Klaus Aehlig <aehlig@google.com>
gnt-cluster copyfile: accept relative paths
If, on the command line, the argument to gnt-cluster copyfile isa relative path, consider this a shorthand for the correspondingabsolute path.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Merge branch 'origin/stable-2.8' into stable-2.9
Improve RAPI detection of the watcher
If the watcher is not allowed to access RAPI, it doesn't mean that it is deadand needs to be restarted.
Fixes Issue 752.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>
Enable a timeout for instance shutdown
Add the timeout parameter to the StopInstance function of the hypervisor baseclass and to all its implementations.
Also, change the tests as required by this change.
Signed-off-by: Michele Tartara <mtartara@google.com>...
Allow KVM commands to have a timeout
Modify the function that sends commands to the KVM monitor so that it ispossible to specify an optional timeout after which the command is killed.
Allow xen commands to have a timeout
Modify the function that runs Xen commands so that it is possible to specify anoptional timeout after which the command is killed.
Fix wrong docstring
Fields must be the final elements in an epytext string.
Use node UUIDs for executing LU hooks
LUNodeAdd, the only LU using a node name still, is changed to overwritePreparePostHookNodes() and use node UUIDs only as well.This allows to remove the support for 3-tuples as results ofBuildHooksNodes() and removes the translation to node names....
Add PreparePostHookNodes to LUs
This method can be used to alter the list of node UUIDs on which posthooks are executed. PreparePostHookNodes is called after Exec, so LUscan use data only known after the execution of the LU.
Signed-off-by: Thomas Thrainer <thomasth@google.com>...
Fix error propagation in post-commit hooks
An error in the post-commit hooks could not be propagated correctly and couldresult in e.g. the return code of gnt-cluster verify to be 0 even in presence ofan error in its output.
Fixes Issue 744.
Merge branch 'origin/stable-2.9' into stable-2.10
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Make gnt-debug locks display fake job locks properly
When a job is dependent on other jobs, a fake lock is created whosepending entry contains a list of job ids waiting on the job. gnt-debuglocks did not expect the job ids to be ints, crashing when encountering...
Make NiceSort treat integers well
NiceSort is invoked on arrays that may contain strings, but in othersituations can contain ints as well. As this surprisingly makes sense,add a tiny modification to make NiceSort work in these conditions.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Jose A. Lopes <jabolopes@google.com>
Fix expression describing optional parameters
The NIC's network and vlan are also newly added, hence need to beconsidered optional to remain backwards compatible.
Let the instance's tuple of nodes start with the primary
Before the tuple of nodes of an instance was created from a set, listingthe nodes in alphabetical order. This patch ensures that the primarynode is always the first one in the list.
Signed-off-by: Petr Pudlak <pudlak@google.com>...
Conflicts: lib/cmdlib/instance.py: manually apply 0973f9ed on...
Improve job status assert affected by race condition
In the sliver of time between choosing a waiting job to be executed andtrying to acquire locks for its execution, the status of the job can bechanged to canceling. An assert checking the job status neglected to...
Export and import Disk/NIC name
Name of Disk/NIC were not exported during backup until now.Use the exported info during gnt-backup import.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Fix backup import in case NIC is inside a network
Network UUID is written in .ini file during backup exportbut is not used by _ReadExportParams(). This patch fixes it.
Please note that in case a network is given, link and mode shouldnot be included in NIC options....
Override get() method of ConfigParser
During backup import/export SafeConfigParser() is used tosave/restore instance's configuration. There is a possibility if anexport is done with a different Ganeti version, a specific value notto be saved during export (e.g. the NIC/Disk name) but still...
Workaround for monitor bug related to greeting msg
QMP may return multiple greeting messages upon connection.This is reported on qemu-devel. The fix is one-liner butuntil it get's released this is a quick and dirty workaroundthat flushes the client's buffer after getting the first...
hotplug: Verify if a command succeeded or not
Just after issuing _CallHoplugCommands() we invoke_VerifyHotplugCommand() which parses `info pci` resultand searches for given PCI slot and device id.
If we previously had removed a device but it is still there...
hotplug: Call each qemu commmand with an own socat
Previously we issued one socat command with two "\n" separatedactions (e.g. netdev_add ...\ndevice_add...)
After having observed a strange monitor behavior [1] splittingthose commands and introducing a sleep time in between, may reduce...
Make the LUInstanceCreate return node names, not UUIDs
The LUInstanceCreate returned names instead of UUIDs in 2.6. Along theway, the names were internally replaced with UUIDs, and the abstractionleaked. This patch fixes the issue.
upgrade: start daemons after ensure-dirs
On upgrading a cluster, we only can rely on daemons startingup cleanly, if all needed directories are generated first. Soensure-dirs needs to be run first.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Petr Pudlak <pudlak@google.com>
Gracefully handle degraded instances in verification
The current code assumes that every instance either is of typediskless or has at least one disk. However, with the option toremove individual disk degraded 0-disk non-diskless instancescan occur. While such instances usually are not useful, Ganeti...
Be aware of the degraded case when cleaning up an instance
In the case of a degraded file-based instance, the file storage directoryfor that instance cannot be obtained by looking at the first disk. Usethe standard location, computed from first principles, in this case....
Preserve disk basename on instance rename
For file-based instances, upon rename, the directory containingthe instance disks is moved. Therefore, the basename needs tobe preserved in this case. Fix this. Note that so far, thisworked by accident as before 94e252a3 file names used to be...
Add QA tests for RAPI multi-instance allocation
The instance multi-allocation had no tests to detect its breakage, andthis patch fixes that.
Fix multi-allocation RAPI method
The OpInstanceMultiAlloc that the instances-multi-alloc RAPI methoduses accepts a list of OpInstanceCreate opcodes rather than a list ofdictionaries as provided by the method. This patch correctly constructsthe opcodes, allowing the RAPI call to work as expected....
Assign unique filenames to filebased disks
With the new format for cmdline arguments, the user is able to add adisk to an instance at a specific index. But filebased disks' filenameshave the form "{0}/disk{1}" where '{0}' is the file_storage_dir and'{1}' is the index of the disk. So if an instance has 3 disks and we...
Fix 'hvparams' of '_InstanceStartupMemory' on hypervisors
Most hypervisors were calling '_InstanceStartupMemory' but not passingthe 'hvparams' keyword argument. Actually, it is not necessary topass this argument given that it is an attribute in the instance...
Run drbdsetup syncer only on network attach
As late as DRBD 8.3.11, the drbdsetup syncer command has a bug causingnodes to hang from time to time, requiring manual intervention to fix.The use of the command cannot be avoided, but the incidence of use can...
Add correct locking of master node to gnt-debug delay
The gnt-debug delay command required locks for all nodes except themaster - this patch fixes the issue by adding master to the lockswhenever needed.
Add job id type assert to jqueue.py
While the changes introduced in previous patches should stop any jobid parameters reaching the queue as strings, add an assertion here tocatch any strings making it through.
Add job id transformation/check to Luxi Python client
This patch adds checks to the Luxi client, making sure that job idsare converted from strings to ints before being passed on, or that anerror is reported.
query: fix detection of master in _GetNodeRole()
Commit 1c3231aa changed the invocation of _GetNodeRole() to pass themaster node by UUID and not by name, but didn't change theimplementation to compare the nodes by name. As a result, the masternode (which is also a master candidate) would always fall through to the...
Include target node in hooks nodes for migration
In case of DRBD, hooks run on both primary (source) and secondary(target) nodes. To get the same behavior for DTS_EXT_MIRROR, where wedo not have secondary node, we should explicitly add target node tohooks nodes during instance migration/failover....
Remove deprecated _ERROR_DATA_KEY in QMP
Commit de253f14 of QEMU repo "BREAKS QMP's compatibility forthe error response" as it removes "data" key from qmp errorresponse messages. To this end we only log "class" and "desc" values of the message.
Run postupgrade hook after upgrade
To allow for necessary last-moment adaptions, of the new cluster,we run the post-upgrade hook of the target version, providingthe version we originally started from.
Provide path to post-upgrade
Also add the current version to the intent-to-upgrade file
Our design states, that the intent-to-upgrade file contains "the currentversion of ganeti, the version to change to, and the process ID". Make theimplementation fit with that design.
Improve backwards compatibility of Issue 649 fix
Commit e6e4ff4cf8d0100f331f94f7a27aa1e03a5d0e7d fixed Issue 649 by switching theseparator for usb_devices from comma to space. That solved the problem withthe command line, but RAPI was able to work with commas too, so, for backwards...
Change usb_devices separator to whitespace
The usb_devices parameter was using comma as a list separator, but this cannotwork because comma is already used as the hypervisor parameter separator.
Change it to use whitespace as a separator, in accordance to what already done...
Ensure that all the hypervisors exist in the config file
All the hypervisors are supposed to exist in the config file, but it might notbe so after upgrades from old versions. This patch ensures that all the missinghypervisors are added with their default values to the config file....
Fix pylint 0.26.0/Python 2.7 warning
pylint 0.26.0 on Python 2.7 generates a warning on the string '\ ',recommending to use the r prefix. This patch adds the missing prefix.
Signed-off-by: Thomas Thrainer <thomasth@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Add support for blktap2 file-driver
Newer Xen versions use blktap2 instead of blktap. This patch adds supportfor it in Ganeti.
Fixes Issue 638.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Thomas Thrainer <thomasth@google.com>
Fix RAPI network tag handling
The network tags were absent from an if check used to actually listtags. The patch fixes the oversight, and adds a proper error message incase the issue occurs again for a new tag type.
Make network tags searchable
This patch adds the network tags to the tags searched by gnt-clustersearch-tags, and in the process cleans up the code slightly.
Signed-off-by: Hrvoje Ribicic <riba@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Pass hvparams to GetInstanceInfo
...so that the xen command to be called can be determined. Thisfixes another semantical conflict of the last merge.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Jose Lopes <jabolopes@google.com>
Adapt parameters that moved to instance variables
Due to a change in the code organization in stable-2.9, somemethod variables became instance variables, causing a semanticmerge conflict. Fix this.
Signed-off-by: Klaus Aehlig <aehlig@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Support reseting arbitrary params of ext disks
If param=default and the param already exists then we removeit from params dict. This is stolen by GetUpdatedParams() whichis used for hvparams modification/inheritance.
This means that 'default' value is not accepted for an arbitrary...
Allow modification of arbitrary params for ext
Disks of ext template are allowed to have arbitrary parametersstored in the Disk object's params slot. Those parameters can bepassed during creation of a new disk, either in LUInstanceCreate()or in LUInsanceSetParams(). Still those parameters can not be...
Do not clear disk.params in UpgradeConfig()
Commits 5dbee5e and cce4616 fix disk upgrades concerning paramsslot. Since 2.7 params slot should be empty and gets filledany time needed.
Still ext template allows passing arbitrary params per disk.These params should be saved in config file for future use....
SetDiskID() before accepting an instance
SetDiskID() fills physical_id slot of a Disk object.
LUInstanceSetParams() does not invoke SetDiskID() upon creation of anew disk. As a result the physical_id slot of the Disk object inconfig data is missing.
In case of ext disk template, in AcceptInstance() we invoke...
Lock group(s) when creating instances
This is required to prevent race conditions such as removing a networkfrom a group and adding an instance at the same time. (See issue 621#2.)
Signed-off-by: Petr Pudlak <pudlak@google.com>Reviewed-by: Thomas Thrainer <thomasth@google.com>...
Fix job error message after unclean master shutdown
According to commit 599ee321eb, any job-related error messages shouldbe encoded within a Ganeti-specific error and not passed on as astring, to allow for easier parsing.
For jobs suffering from an undesirable status after an unclean master...
Add default file_driver if missing
If the file driver of an instance with file based storage is not specified, thedefault one is automatically added by the UpgradeConfig function.
Fixes Issue 571.
Signed-off-by: Michele Tartara <mtartara@google.com>Reviewed-by: Helga Velroyen <helgav@google.com>
Xen handle domain shutdown
Update Xen backend to properly recognize when a domain has beenshutdown by the user and to properly cleanup a shutdown domain whenGaneti requests Xen to stop this domain.
Partial cherry-pick from 9d22cc90609e3ee8f0f2b34b793a3daced3c0e61...
Fix evacuation out of drained node
Introduce _UpgradeSerializedRuntime() method
This method is invoked during _AnalizeSerializedRuntime() and ismeant to modify runtime files in the way cfgupgrade does forconfig.data. This could remove deprecated fields, change theformat of the file, add/remove sections, etc....
Fix a bug in InstanceSetParams concerning names
In case no name is passed in disk modifications we shouldkeep the old one. If name=none then set disk name to None.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>Reviewed-by: Jose A. Lopes <jabolopes@google.com>
SingleNotifyPipeCondition: don't share pollers
As widely known Ganeti uses a better1 lock condition notificationlibrary based on operating system pipes.
Inside this library we were using a shared poller for all threadswaiting for a condition. While poller is not thread safe, since (1)...
Fix error printing
Fixes issue 616.
Signed-off-by: Jose A. Lopes <jabolopes@google.com>Reviewed-by: Michele Tartara <mtartara@google.com>
Allow link local IPv6 gateways
Each host using IPv6 always has a link local address in fe80::/10. It iscommon to use fe80::1 as default gateway to ease client configuration.Ganeti prevented this usage, because it made sure that the IPv6 gatewayis in the IPv6 network the instance is connected to....
Fix NODE/NODE_RES locking in LUInstanceCreate
Both NODE and NODE_RES locks were acquired opportunistically if sorequested by the user. LUInstanceCreate requires, however, that theactually locked elements on NODE and NODE_RES level are the same.
This patch changes the locking of NODE_RES such that those locks are not...
KVM: use custom KVM path if set for version checking
This commit fixes two TODOs from 2008 about using the hardcoded"default" path for KVM where a custom one could've been set through`gnt-cluster modify`.
As a result, `gnt-cluster verify` will no longer fail if a custom...
Export NIC's UUID and name to network scripts
In case of kvm None values are not allowed in env dictso we have to add name only if not None.
In case of Xen since we are writing on a file thatis going to be sourced we should not add INTERFACE_NAME=None....
Use HooksDict() to export network options in Xen
Remove duplicate code that exports network options to environmentvariables.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>Signed-off-by: Thomas Thrainer <thomasth@google.com>Reviewed-by: Thomas Thrainer <thomasth@google.com>
Export tags via GetTags() to network scripts
Use GetTags() instance method in order to export instance tagsto NIC configuration scripts and files of kvm and xen hypervisors.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>Signed-off-by: Thomas Thrainer <thomasth@google.com>...
Introduce --hotplug-if-possible option
This will be useful for an external entity using RAPI thatwants to modify devices of instances.
The common use case for that is:"I want to add a NIC/disk to an instance. If it is runningthen try to hotplug the device. If not, then just add it to config."...