ht.WithDesc: Work around pylint warning
Explicitly defining “__call__” silences a pylint warning when wrapped type check functions are used directly. I had no idea pylint is this intelligent.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
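The idea of a wrapper with an explicit “__call__” can be sketched as follows. This is only an illustration, not the actual ht module code: the names DescWrapper, with_desc and t_int are hypothetical.

```python
class DescWrapper:
    """Hypothetical wrapper giving a check function a readable description."""

    def __init__(self, text, fn):
        self._text = text
        self._fn = fn

    def __call__(self, *args):
        # Defined explicitly, so static analysis can see that instances
        # are callable; it simply delegates to the wrapped function.
        return self._fn(*args)

    def __str__(self):
        return self._text


def with_desc(text):
    """Decorator factory wrapping a check function in DescWrapper."""
    def decorator(fn):
        return DescWrapper(text, fn)
    return decorator


@with_desc("Integer")
def t_int(val):
    return isinstance(val, int)
```

With this sketch, t_int(3) calls the wrapped function directly and str(t_int) returns the description.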
Merge branch 'devel-2.4'
ht: Add new check for numbers
Places which receive floats can usually also deal with integers, e.g. OpTestDelay. Tests are added and the new check function is used for the aforementioned opcode and verifying query results.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
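A check accepting both floats and integers could look roughly like this. The function name t_number is hypothetical, and excluding booleans is an assumption not stated in the commit message.

```python
def t_number(val):
    """Hypothetical check: accept integers and floats.

    bool is a subclass of int in Python; rejecting it here is an
    assumption made for this sketch.
    """
    return isinstance(val, (int, float)) and not isinstance(val, bool)
```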
Fix off-by-one bug in job serial generation
Commit 009e73d0 (September 2009) changed the job queue to generate multiple job serials at once. Ever since it would return one more than requested.
The “serial” file in the job queue directory is defined to contain the...
Reverts the patch series about console wrappers
This reverts commits 030a9cb8022b83bf43ec14dfbafd943299bc01c4 and ae082df0000a785b693b2f4aa434650a81a94bdf.
There are two problems:
- Makefile.am breakage, which is trivial to revert
- unittest breakage, which honestly I'm not sure how to fix and how...
Add gnt-instance start --pause
Creates the instance, but pauses execution before booting. This combined with 'gnt-instance console' unpausing instances means that the entire boot process can be viewed and monitored.
Signed-off-by: Stephen Shirley <diamond@google.com>...
Adding a wrapper around connecting to kvm console
The wrapper will connect to the console, and check in the background if the instance is paused, unpausing it as necessary.
Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Adding a wrapper around "xm console"
Fix lint error
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
RAPI: Document all feature strings
- Use constants and an assertion
- Update documentation for node migration
Remove old node evacuation opcode
LUNodeEvacStrategy has been replaced with LUNodeEvacuate.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Change RAPI for new node evacuation opcode
The change is not backwards compatible, see the updated NEWS file.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Change “gnt-node evacuate” to use new opcode
By default it'll now evacuate all instances from the node, not just secondaries.
Add new opcode to evacuate node
This new opcode will replace LUNodeEvacStrategy, which used to return a list of instances and new secondary nodes. With the new opcode the iallocator (if available) is tasked to generate the necessary operations in the form of opcodes. This moves some logic from the client to the...
Alias gnt-job show to gnt-job info
Am I the only one to make that mistake 10 times a week?
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Fix cluster verify for empty node groups
There were some implicit assertions in the code that all node groups have nodes, which is not necessarily true.
Additionally, the patch does a wrapping change.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Fix bug in recreate-disks for DRBD instances
The new functionality in 2.4.2 for recreate-disks to change nodes is broken for DRBD instances: it simply changes the nodes without caring for the DRBD minors mapping, which will lead to conflicts in non-empty...
Fix a lint warning
Patch db8e5f1c removed the use of feedback_fn, hence pylint warns now.
KVM: configure bridged NICs at migration start
Commit 5d9bfd870 moved tap interface handling from KVM to Ganeti, partly to also solve the problem of routed interfaces getting configured too early during live migrations, causing network anomalies. In that...
Fix bug in drbd8 replace disks on current nodes
Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has a bug in that it does modify the instance disk temporarily before changing it back to the same value. However, we don't need to, and shouldn't do that: what this operation does is simply change the LVM...
Conflicts: lib/cmdlib.py - use RequireSharedFileStorage there
Signed-off-by: Guido Trotter <ultrotter@google.com>...
remove bootstrap._InitSharedFileStorage
This function is a copy of bootstrap._InitFileStorage with the following differences:
- check constants.ENABLE_SHARED_FILE_STORAGE and not constants.ENABLE_FILE_STORAGE
- use different local variable names
- one different error string...
LUInstanceCreate: use opcodes.RequireFileStorage
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Don't add ",boot=on" to disks on kvm >= 0.14
Under newer kvm this prevents the vm from starting. Ah, change!
KVM: fix per-instance stored UID value
When using the pool security model, ExecuteKVMRuntime was storing the instance's UID using str(uid), which would result in storing the LockedUid.__repr__() result:
$ cat /var/run/ganeti/kvm-hypervisor/uid/xxxxxxxxxxxxx...
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
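The class of bug described for the stored UID can be reproduced with a small stand-in. The LockedUid class below is a hypothetical simplification written for this sketch (including its get_uid method and repr format), not the real implementation: str() on an object without its own “__str__” falls back to “__repr__”.

```python
class LockedUid:
    """Hypothetical stand-in for a UID taken from a pool."""

    def __init__(self, uid):
        self._uid = uid

    def get_uid(self):
        return self._uid

    def __repr__(self):
        # No __str__ is defined, so str(obj) ends up calling this method
        return "<LockedUid %s>" % self._uid


uid = LockedUid(4001)

# Buggy variant: stores the object's representation, not the number
wrong = str(uid)

# Fixed variant: convert the numeric UID itself
right = str(uid.get_uid())
```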
Add one forgotten element to the file disk path
This was left out during the fix/refactoring
Conflicts: lib/cmdlib.py - use constants.DTS_FILEBASED...
Add DTS_FILEBASED constant
LUInstanceCreate: fix file storage dir calculation
- Move the calculation at the beginning of CheckPrereq, since it doesn't modify any state, but still keeps locks
- Only perform the calculation if the actual disk template is filebased
- Error out if there is no defined file storage dir...
Check that filestorage is enabled when requested
Remove self.op.file_storage_dir isabs check
As the manpage says, and the code does, self.op.file_storage_dir is an additional relative path under the cluster file storage dir. As such it should not be absolute.
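The semantics described here, a relative path appended below the cluster file storage directory, might be sketched as below. This illustrates the intended path layout only, not the check that was removed; the function name, the example paths and the decision to raise ValueError are assumptions.

```python
import posixpath


def file_storage_path(cluster_dir, sub_dir, instance_name):
    """Hypothetical helper joining the cluster dir, optional sub dir and name."""
    if sub_dir and posixpath.isabs(sub_dir):
        # An absolute component would make join() discard cluster_dir
        raise ValueError("file_storage_dir must be relative: %r" % sub_dir)
    parts = [cluster_dir]
    if sub_dir:
        parts.append(sub_dir)
    parts.append(instance_name)
    return posixpath.join(*parts)


path = file_storage_path("/srv/ganeti/file-storage", "web", "inst1.example.com")
```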
Replace iallocator's mreloc w/ change-group and node-evac
This patch removes all occurrences of the “multi-relocate” iallocator mode. Commit 25ee7fd845 updated the design document and introduced separate modes, “change-group” and “node-evacuate”. The constants aren't...
jqueue: Allow loading of archived jobs
Chained jobs need to look at previous jobs, including archived ones. A nice side-effect of this change is the ability to look at archived jobs using “gnt-job info <id>” as long as the ID is known.
Adding basic abstraction layer for caching
This includes its own simple cache implementation and an interface to a memcache instance.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix locking issues in LUClusterVerifyGroup
- Use functions in ConfigWriter instead of custom loops
- Calculate nodes only once instances locks are acquired, removes one potential race condition
- Don't retrieve lists of all node/instance information without locks...
cmdlib: Acquire BGL for LUClusterVerifyConfig
LUClusterVerifyConfig verifies a number of configuration settings. For doing so, it needs a consistent list of nodes, groups and instances. So far no locks were acquired at all (except for the BGL in shared mode)....
Export/import instance tags
Fix issue with tags on instance creation
Commit 720f56c85a added the ability to specify tags when creating an instance. The “tags” attribute of an instance object needs to be a set, but the patch's code saved it as a list, causing breakage in other parts...
Export instance tags to instance hooks
Instance hooks now get an INSTANCE_TAGS environment variable, which contains a space-delimited list of the affected instance's tags.
Also update the documentation to reflect the change.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>...
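Building the space-delimited INSTANCE_TAGS value can be sketched like this. The helper name is hypothetical, and sorting the tags is an assumption made to keep the output deterministic.

```python
def build_instance_tags_env(tags):
    """Hypothetical helper returning the hook environment entry for tags."""
    # Tags are joined with single spaces; sorting is an assumption
    return {"INSTANCE_TAGS": " ".join(sorted(tags))}


env = build_instance_tags_env({"web", "prod"})
```

A hook script could then split the variable on whitespace to recover the individual tags.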
Add tagging option to gnt-instance create
Add TAG_ADD_OPT option to cli.py and use it in gnt-instance. Modify cli.GenericInstanceCreate() accordingly.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>...
Add tag handling to {Op,LU}InstanceCreate
Add a tag slot to opcodes.OpInstanceCreate. We do not reuse _PTags, as this is intended for OpTagsSet and thus:
a) is not documented
b) does not carry a default value, making it mandatory
Also pass the tags to the iallocator during instance creation....
http.client: Make debug log less noisy
The HTTP client code generates quite a lot of debug log messages. With this patch they're hidden unless explicitly enabled in the code.
jqueue: Fix potential race condition when cancelling queued jobs
When a job was cancelled, its status would be changed and the file written again. Since this was a final status, the job file could be moved anytime for archival. If the job was still in the queue, however,...
iallocator: add ht-checking for the request
Currently, we only ht-check the result value from the iallocator, and we send whatever we happen to check manually in the LUs that call the iallocator.
This is not good, as we have to duplicate checks in many places, and...
iallocator: rename mem_size to memory
Currently, the iallocator in 'allocate' requires mem_size on input but serialises that as 'memory'. This inconsistency makes it hard to automatically validate the parameters, hence this patch renames mem_size.
Signed-off-by: Iustin Pop <iustin@google.com>...
iallocator: change default for target_groups
Per the design doc, the target_groups request key "if present, it must either be the empty list, or contain a list of group UUIDs". Currently it defaults to None/null, which is not valid.
iallocator: export the hypervisor value
In 'allocate' mode, the documentation specifies that we export the hypervisor value (“Allocation needs, in addition: … hypervisor, the hypervisor of this instance”) and we need that on input, however we don't actually export it....
iallocator: fix incomplete refactoring
Commit fdbe29ee changed the iallocator modes from 'r'/'w' to 'ro'/'rw', but forgot one check in LUTestAllocator. This patch just completes the replacements.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
gnt-node migrate: Use LU-generated jobs
Until now LUNodeMigrate used multiple tasklets to evacuate all primary instances on a node. In some cases it would acquire all node locks, which isn't good on big clusters. With upcoming improvements to the LUs for instance failover and migration, switching to separate jobs looks...
Fix argument order in ReserveLV and ReserveMAC
ConfigWriter.ReserveLV() and ConfigWriter.ReserveMAC() called TemporaryReservationManager.Reserve() with the ec_id and resource arguments swapped. As a result, two reservation attempts for the same resource type...
ht: Accept both int and long as integers
This fixes a unittest failure on 32 bit systems. A recently added unittest for ht.TJobId uses a rather large number (2347625220). On 64 bit systems it is stored as “int”. On 32 bit systems however, Python uses “long”. The two types can be intermixed in Python as the...
ht: Add checks for anything, regexp, job ID, container items
The check for container items is useful for tuples and/or lists with non-uniform values. The “anything” check can be used when any value should be accepted for an item.
The job ID check, which uses the regexp check, will be used for...
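A per-position check for non-uniform containers, combined with an “anything” check, might look like the following. Both function names are hypothetical, and requiring at least as many values as checks is an assumption of this sketch.

```python
def t_any(_value):
    """Hypothetical check accepting any value."""
    return True


def t_items(checks):
    """Hypothetical check applying one check function per leading item."""
    def check_fn(value):
        return (isinstance(value, (list, tuple)) and
                len(value) >= len(checks) and
                all(check(item) for check, item in zip(checks, value)))
    return check_fn


def t_string(value):
    return isinstance(value, str)


# A (name, payload) pair: the name must be a string, the payload is free-form
t_named_payload = t_items([t_string, t_any])
```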
TLReplaceDisks: Move assertion checking locks
Commit 1bee66f3 added assertions for ensuring only the necessary locks are kept while replacing disks. One of them makes sure locks have been released during the operation. Unfortunately the commit added the check...
cli.JobExecutor: Handle empty name, allow adding job IDs
With LU-generated jobs only the ID is known.
cli.JobExecutor: Use counter for indexing jobs
If “SubmitPending” were mixed with calls to “QueueJob”, jobs in the internal structures would get duplicate indices. With this change each queued job is assigned a unique index, which will be used for sorting...
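Assigning indices from a single shared counter can be sketched as follows. The class and method names below are simplified stand-ins invented for this example, not the real cli.JobExecutor interface.

```python
import itertools


class JobExecutorSketch:
    """Hypothetical executor handing out one unique index per job."""

    def __init__(self):
        self._counter = itertools.count()
        self.queued = []     # (index, name, ops) not yet submitted
        self.submitted = []  # (index, name, ops) already submitted

    def queue_job(self, name, *ops):
        # Every job draws from the same counter, however it is added
        self.queued.append((next(self._counter), name, ops))

    def submit_pending(self):
        # Moving jobs between lists never changes or reuses an index
        self.submitted.extend(self.queued)
        self.queued = []


executor = JobExecutorSketch()
executor.queue_job("first", "op1")
executor.submit_pending()
executor.queue_job("second", "op2")
executor.submit_pending()
indices = [idx for idx, _name, _ops in executor.submitted]
```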
Fix bug in LUNodeMigrate
Commit aac4511a added CheckArguments to LUNodeMigrate with a call to _CheckIAllocatorOrNode. When no default iallocator is defined, evacuating a node would always fail:
$ gnt-node migrate node123
Migrate instance(s) '...'?
y/[n]/?: y...
config: Add method to get members of nodes' groups
This will be used for locking during node evacuation.
node evac: don't call IAllocator if no instances
Currently we generate an empty list only for the '-n node' invocation, but for iallocator we still call the iallocator (which needs an RPC call, etc.). By moving the computation of instances outside of the if...
gnt-debug: rename allocator to iallocator
I'm always confused by this strange difference, so let's rename the command to match what it tests.
RPC/Backend: Make UploadFile uid and gid agnostic
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Resolve uid/gid upon mainloop run
GetEntResolver: Make it possible to resolve uid/gid to name
utils.algo: Add InvertDict to invert a dict
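Inverting a dictionary, i.e. swapping keys and values, can be sketched like this. The function below is an illustration with a hypothetical name; rejecting duplicate values is an assumption of the sketch and may differ from the real helper.

```python
def invert_dict(data):
    """Hypothetical helper returning a dict mapping values back to keys."""
    inverted = {value: key for key, value in data.items()}
    if len(inverted) != len(data):
        # Duplicate values would silently overwrite each other
        raise ValueError("Duplicate values found, cannot invert")
    return inverted


uid_to_name = invert_dict({"root": 0, "daemon": 1})
```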
autotools: Add noded group
Fix a couple of style mistakes
Cluster verify: accept a --node-group option
This will trigger a ClusterVerifyGroup operation only on the specified group, skipping other groups as well as cluster-wide verifications.
Signed-off-by: Adeodato Simo <dato@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Cluster verify: check for nodes/instances with no group
Previously, all nodes and instances would always be visited/verified. By driving the verification by node group now, we will miss nodes and instances that can't be reached from existing node groups, should that rare...
Cluster verify: fix LV checks for split instances
When sharding by group, if a mirrored instance is split (primary and secondary) between two groups, its volumes will not be properly checked: the group of the primary will warn about a missing volume in the secondary,...
Cluster verify: make NV_NODELIST smaller
To cope with increasing cluster sizes, we now make nodes try to contact all other nodes in their group, and one node from every other group.
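The reduced contact list can be sketched as below. The data layout (a mapping from group name to node names) and the choice of the first node in sorted order as the representative of each foreign group are assumptions of this example; the real selection may differ.

```python
def nodes_to_contact(node, groups):
    """Hypothetical helper computing which nodes a given node should contact.

    groups maps a group name to the list of node names in that group.
    Raises StopIteration if the node is not a member of any group.
    """
    own_group = next(name for name, members in groups.items()
                     if node in members)

    # All other nodes in the node's own group
    targets = [name for name in sorted(groups[own_group]) if name != node]

    # Plus one representative from every other non-empty group
    for group in sorted(groups):
        if group != own_group and groups[group]:
            targets.append(sorted(groups[group])[0])

    return targets


example_groups = {
    "g1": ["n1", "n2", "n3"],
    "g2": ["n4", "n5"],
    "g3": [],
}
contacts = nodes_to_contact("n1", example_groups)
```

With N nodes spread over G groups, each node contacts roughly N/G + G peers instead of N - 1.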
Cluster verify: verify hypervisor parameters only once
The list of all hypervisor parameters has to be computed in LUClusterVerifyGroup, since it needs to be passed to nodes as NV_HVPARAMS. However, it is better only to verify said parameters once, out of LUClusterVerifyConfig....
Split LUClusterVerify into LUClusterVerify{Config,Group}
With this change, LUClusterVerifyConfig becomes a "light" LU that only verifies the global config and other, master-only settings, and the bulk of node/instance verification is done by LUClusterVerifyGroup, which only acts...
Cluster verify: factor out error codes and functions
We move all error code definitions, plus the _Error and _ErrorIf helpers, to a private _VerifyErrors mix-in class that can be later shared by the two new cluster verify LUs.
(_Error and _ErrorIf code was moved around verbatim, except to disable...
Cluster verify: make "instance runs in wrong node" node-driven
Previously, the "instance should not be running in this node" error was computed by verifying, for each instance, whether any node other than its primary was running it. But this is not a well-suited approach if we were...
Verify an absent vm_capable node for files
If we're not verifying all nodes, adding a node outside the current group for file checksums helps us make sure checksums are the same in all of the cluster.
Cluster verify: master must be present for _VerifyFiles
This commit prepares the call to _VerifyFiles for the case when the master node is not one of the nodes that's being verified (which will be the case for all node groups but one). We fix it by always passing master info and...
Cluster verify: don't assume we're verifying all nodes/instances
This commit fixes a few initial simple cases in which it was assumed that we're always working over the whole cluster. With this change, we differentiate between "nodes/instances to verify" and "checks that need...
Cluster verify: gather node/instance list in CheckPrereq
This commit introduces no behavior changes, and is only a minor refactoring that aids with a cleaner division of future LUClusterVerify work. The change consists in:
- substitute the {node,instance}{list,info} structures previously created...
Merge remote branch 'origin/devel-2.4'
cli: Replace hardcoded disk templates with constants
mcpu: Add missing docstring to _ProcessResult
config: Add function to get instances in node group
This will be used for evacuating instances in a node group.
iallocator: Stricter check for multi-evac result
Check new secondary nodes' group like it's already done for multi-relocation requests.
cmdlib: Use ganeti.ht for checking iallocator result
ht: Add strict check for dictionaries
This allows checking specific dictionary items, unlike TDict or TDictOf.
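A strict dictionary check, validating individual keys with their own check functions, might look like the following. The function name, its parameters (require_all, exclusive) and the example keys are hypothetical and chosen for this sketch only.

```python
def t_strict_dict(items, require_all=True, exclusive=True):
    """Hypothetical check validating specific dictionary items.

    items maps each known key to a check function for its value.
    """
    def check_fn(value):
        if not isinstance(value, dict):
            return False
        if require_all and not all(key in value for key in items):
            return False  # a required key is missing
        if exclusive and not all(key in items for key in value):
            return False  # an unknown key is present
        return all(items[key](val)
                   for key, val in value.items() if key in items)
    return check_fn


check_disk = t_strict_dict({
    "name": lambda val: isinstance(val, str),
    "size": lambda val: isinstance(val, int),
})
```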
cmdlib: Remove punctuation from error messages
gnt-debug: New iallocator mode
Add new iallocator mode to LUTestAllocator
cmdlib.IAllocator: Add multi-relocate support
Add constants for multi-relocation iallocator mode
Implement no_remember at RAPI level
Implement no_remember at CLI level
Introduce instance start/stop no_remember attribute
This will allow stopping or starting an instance without changing the remembered state. While this seems counter-intuitive at first (it will create cluster verify errors), it can help in a few corner cases:...
cmdlib.IAllocator: Fewer temporary variables
Reduce the number of temporary variables and generate dictionaries in one go.
TLMigrateInstance: do not migrate to self
Check that the instance is not being migrated to its current primary node during CheckPrereq. Otherwise migration is aborted because the instance is already running and cleaned-up, which causes the running instance to be killed....
SharedLock: Implement downgrade from exclusive to shared mode
If a job needs to modify a resource and then wait for a result, it must acquire the resource lock in exclusive mode. In some cases it would be possible to only have a shared lock for waiting. Until now it was not...
gnt-debug: Use constants for iallocator direction