Merge branch 'devel-2.5'
Conflicts: lib/cmdlib.py - trivial
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Allow per-hypervisor optional files
Rather than just allowing files for all nodes to be optional, we allow optional files to be per-category. The way this works is that they must be included in both lists (the new code checks for this).
The code also removes a duplicate assert (present both in verify and...
Add hypervisors ancillary files list
These lists will be used to declare some of the files not necessarily needed on all nodes. The files selected are files without which the various hypervisors can still work, but that when they are present should be synchronized across the cluster (or node group)....
Upload spice files in redist-conf
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Rename filter and filter_ to qfilter
We currently use 'filter' as the OpCode, QueryRequest and RAPI field name for representing a query filter. However, since 'filter' is a built-in function, we actually have to use filter_ throughout the code in order to not override the built-in function....
Add error codes documentation
Demote to warnings the errors in --ignore-errors
Treat the gnt-cluster verify errors identified by the error codes in --ignore-errors as warnings; just print a warning message for the user.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Add --ignore-errors parameter to cluster verify
lib/cli.py
- add IGNORE_ERROR_OPT;
client/gnt_cluster.py
- pass the ignore_errors parameter to the opcodes
lib/opcode.py
- update OpClusterVerifyConfig, OpClusterVerify and OpClusterVerifyGroup to accept the ignore_errors parameter...
Move cluster verify error codes to constants
- move the cluster verify error codes from cmdlib._VerifyErrors to constants;
- add to each of them the CV (Cluster Verify) prefix;
- add the CV_ALL_ECODES and CV_ALL_ECODES_STRINGS constants;
- wrap the lines that exceed 80 characters after changing the error...
Add cluster netmask parameter
Add the master_netmask cluster parameter, that represents the netmask of the master IP, encoded as a CIDR suffix.
This parameter can be set via the --master-netmask of gnt-cluster init and gnt-cluster modify. The default behaviour is to be consistent with...
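As a rough illustration of what "encoded as a CIDR suffix" means for an IPv4 master IP, the sketch below turns a suffix such as 24 into the equivalent dotted-quad mask. It relies on Python 3's standard `ipaddress` module rather than on any Ganeti helper, and the function name `cidr_suffix_to_netmask` is hypothetical.

```python
import ipaddress


def cidr_suffix_to_netmask(suffix):
  """Return the dotted-quad IPv4 netmask for a CIDR suffix (0-32).

  Hypothetical helper for illustration only; not part of Ganeti.
  """
  # 0.0.0.0 has no host bits set, so it is a valid network for any suffix.
  network = ipaddress.ip_network("0.0.0.0/%d" % suffix)
  return str(network.netmask)


netmask_24 = cidr_suffix_to_netmask(24)
netmask_32 = cidr_suffix_to_netmask(32)
```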
Merge branch 'stable-2.5' into devel-2.5
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Fix issue when verifying cluster files
If a cluster has any non-master-candidate nodes, those don't contain all files (e.g. config.data). With commit aef59ae764dc (March 31st, 2011) the logic was changed and subsequently verifying a cluster with non-mc nodes would complain....
Fix adding nodes after commit 64c7b3831dc
Commit 64c7b3831dc changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
LUClusterVerifyGroup: Spread SSH checks over more nodes
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to...
Add gnt-cluster commands to toggle the master IP
Split starting and stopping master IP and daemons
Add memory transfer progress info to migration
Make migration RPC non-blocking
To add status reporting for the KVM migration, the instance_migrate RPC must be non-blocking. Moreover, there must be a way to represent the migration status and a way to fetch it.
Migration: warn the user about hv version mismatch
Fix handling of cluster verify hooks
The change to enforce boolean results for cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true....
Redistribute the RAPI certificate
This reverts to the old behaviour in Ganeti 2.4 and before.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix OS creation's error handling when pausing sync
Commit 41e1e79 introduced a feature in which when wait_for_sync is not set, DRBD sync is paused during the OS installation.
Doing so, however, broke OS creation's error handling: the result value from the instance_os_add RPC call was overwritten by the one of the...
cmdlib: Support for CPU pinning
Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix for auto parameters on import
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
DeprecationWarning fixes for pylint
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to a lot of DeprecationWarning being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst)....
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
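For readers unfamiliar with the directive change described in the pylint entry above, here is a minimal sketch of the old and new inline forms. The function `on_signal` is a made-up example, not code from this commit; W0613 is pylint's "unused argument" message.

```python
# Old form, deprecated since pylint 0.21:
#   def on_signal(signum, frame):  # pylint: disable-msg=W0613
#
# Unified form accepted by pylint 0.21 and later:
def on_signal(signum, frame):  # pylint: disable=W0613
  """Callback with a signal-handler signature; 'frame' is deliberately unused."""
  return signum


handled = on_signal(15, None)
```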
Two more PEP8 fixes
cmdlib: Avoid wrapping using backslash
gnt_group: Avoid * magic using keyword arguments (the “pep8” tool doesn't like the inline comment in this case and will complain about spaces around the “*” operator)
PEP8 style fixes
Identified using the “pep8” utility.
Allow importing instance with full auto parameters
Disk template is no longer required when importing instance
… provided that disk_template value is set in the config.ini file.
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Documentation fix for importing with --src-dir option
Get rid of {disk,nic}_count variables
This also fixes an issue if "disk_template = diskless" and no "disk_count" was specified, while doing an import of said instance specifications.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Small improvements for cluster verify
- Check if BGL is actually owned
- Show group name as feedback
Allow locking to be used via OpQuery
The original design for query2 specifically excluded locking, but now it's turned out that it would be a good thing to have in watcher. This patch adds a new parameter to OpQuery and enables its use in LUQuery. A missing function is added to LUGroupQuery, a comment clarified in...
Change OpClusterVerifyConfig's result, verify results
This patch removes the list of node groups (not used anymore since commit fcad7225e3fc) from OpClusterVerifyConfig's result and adds result verification to all OpClusterVerify* opcodes.
Use LU-generated jobs for verifying cluster
This patch moves the logic for verifying the various node groups in a cluster into the master daemon. Job dependencies are used to ensure the configuration, which requires the BGL, is verified first.
With this change it will be possible to expose whole-cluster...
Allow fixing of split instances via relocate
Currently, the IAllocator code requests strictly that the (set of) groups of the nodes we're relocating from is equal to the set of groups we're relocating to.
This, however, makes it impossible to fix split instances, since (by...
Further cleanup after multi-evacuate removal
Commit f0edfcf6 removed the parsing of multi-evacuate result, but the code went from:
if mode in (multi-evac, relocate): … if mode relocate: …
to:
if mode relocate: … if mode == relocate...
Fix bug in IAllocator parsing of Evacuate result
Commit 342f9172 added stricter checks for the iallocator result in evacuate mode, but it does this irrespective of the result status. When the result has failed and (according to the design) the list of nodes is empty, this code will trigger the following:...
Remove iallocator's “multi-evacuate” mode
It is no longer used and has been deprecated in 2.5.
Add docstring to cmdlib.TLReplaceDisks._FindFaultyDisks
LUGroupVerifyDisks: Use _CheckInstanceNodeGroups' result
… instead of getting the list of instances once again from the configuration.
cmdlib: Factorize checking node groups' instances
Remove 15-second sleep from LUInstanceCreate
Remove 15 second sleep when wait_for_sync is not set. LUInstanceCreate already calls _WaitForSync with oneshot=True, which already performs an internal wait-loop for disks to start syncing.
Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>...
Add a readability alias
lu.glm.list_owned becomes lu.owned_locks, which is clearer for the reader.
Also rename three variables (which were before named owned_locks) to make clearer what they track.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Add opcode to change instance's group
This is quite similar to evacuating a group, but the locking is different.
Factorize checking instance's node groups
Lock potential target nodes for group evacuation
All potential target nodes should be locked while calculating a group evacuation.
Small changes in group evacuation
- Use OpPrereqError in CheckPrereq
- Clarify command synopsis
cmdlib: Factorize getting iallocator
The same logic will be used for changing an instance's group.
Pause DRBD sync for OS install if not wait_for_sync
When wait_for_sync is set to False in LUInstanceCreate, Ganeti lets DRBD sync in the background while performing the rest of the installation steps, including OS installation.
However, OS installation is a very disk-intensive task that interferes badly...
Optimise use of repeated/looping GetNodeInfo
This adds a new ConfigWriter.GetMultiNodeInfo function and replaces multiple/looping calls to GetNodeInfo with it.
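The idea of swapping a per-node loop for one batched call can be sketched as follows. `ToyConfig` is a stand-in written for this example, not Ganeti's ConfigWriter, and the list of `(name, info)` pairs returned by its `GetMultiNodeInfo` is an assumed shape chosen for illustration.

```python
class ToyConfig(object):
  """Toy stand-in for a configuration object; not Ganeti's ConfigWriter."""

  def __init__(self, nodes):
    self._nodes = dict(nodes)
    self.calls = 0  # how many times callers entered the config object

  def GetNodeInfo(self, name):
    """Look up a single node; every call counts as one entry."""
    self.calls += 1
    return self._nodes.get(name)

  def GetMultiNodeInfo(self, names):
    """Look up several nodes in one entry (return shape is an assumption)."""
    self.calls += 1
    return [(name, self._nodes.get(name)) for name in names]


cfg = ToyConfig({"node1": {"group": "g1"}, "node2": {"group": "g2"}})
names = ["node1", "node2"]

# Looping variant: one call into the config per node.
looped = [(name, cfg.GetNodeInfo(name)) for name in names]
calls_after_loop = cfg.calls

# Batched variant: a single call for the whole list.
batched = cfg.GetMultiNodeInfo(names)
calls_for_batch = cfg.calls - calls_after_loop
```

In this toy version the saving is only a call count; in a real configuration object each entry may also involve taking a lock, which is where batching tends to pay off.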
Optimise use of repeated/looping GetInstanceInfo
Similar to the previous patch, this adds a helper function to eliminate repeated calls into ConfigWriter.
Fix types passed to IAllocator
Iallocator mode reloc, parameter reloc_from takes a list; half of the code already forced this parameter to list, we add the other two cases where it is needed.
Add primary/second nodes' group as query fields
These will be very useful for ganeti-watcher as it needs to retrieve instances by group.
Remove requirement for variants on OS API v15+
This removes:
- the check in backend that such OSes have a variants file or if it exists that is non-empty; in order for this to work, we also rework the logic in backend._TryOSFromDisk to allow for optional OS files...
Fix group verification of offline nodes
Commit aef59ae7 reworked the file verification, but forgot to take into account offline nodes.
The fact that this was not detected yet is due to the fact that we don't test clusters with offline nodes in QA :(
Signed-off-by: Iustin Pop <iustin@google.com>...
Disallow variants for OSes that don't support them
Otherwise we get no variant checks at all, but the variant is still recorded.
Add helper for declaring all locks shared
This patch adds a function for abstracting “dict.fromkeys(locking.LEVELS, 1)”. It also removes a duplicate assignment for the share_locks in LUInstanceQuerydata.
Additionally, it moves the _SupportsOob function to the helper...
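A minimal sketch of what such a helper might look like, assuming a plain list of level names in place of Ganeti's `locking.LEVELS`. The name `share_all_locks` and the level strings are hypothetical; only the `dict.fromkeys(..., 1)` expression comes from the commit message.

```python
# Hypothetical stand-in for locking.LEVELS; the real values live in
# Ganeti's locking module.
LEVELS = ["cluster", "instance", "nodegroup", "node"]


def share_all_locks():
  """Return a new mapping that marks every lock level as shared (1)."""
  return dict.fromkeys(LEVELS, 1)


share_locks = share_all_locks()
```

Because the shared value is the immutable integer 1, `dict.fromkeys` is safe here; with a mutable default every key would point at the same object.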
Change OpClusterVerifyDisks to per-group opcodes
Until now verifying disks, which is also used by the watcher, would lock all nodes and instances. With this patch the opcode is changed to operate on per nodegroup, requiring fewer locks.
Both “gnt-cluster” and “ganeti-watcher” are changed for the...
cmdlib: Give instance name in error message on group evacuation
cmdlib: Factorize mapping instance LVs to node/volume
Most boring patch ever
s/'/"/ in (hopefully) the right places.
gnt-instance info: Return static info if node offline
Before this patch “gnt-instance info” would fail with the error message “Error checking node $node: Node is marked offline” if the instance's primary node is marked offline and the user didn't explicitly request...
Ignore offline primary when failing over
When the source node for a failover is marked offline, there's no need to require the user to specify “--ignore-consistency”.
To make it work at all, a number of bugs introduced by the merge of migration and failover are also fixed by this patch....
Merge branch 'devel-2.4'
gnt-node volumes: Fix instance names
Commit 84d7e26b changed “objects.Instance.MapLVsByN” to not just return the LV name, but to include the volume group name (e.g. “xenvg/d67e8700….disk0_data”). This in turn broke the mapping of volume names in LUNodeQueryvols, stopping instance names from being displayed in...
Fix instance failover (missing argument)
More fallout from commit 323f9095b49d.
Add error state to LUGroupEvacuate's exceptions
Add new opcode for evacuating group
Fix node evacuation
- Adjust for new iallocator result format
- Split some code into helper functions
Add gnt-instance start --pause
Creates the instance, but pauses execution before booting. This combined with 'gnt-instance console' unpausing instances means that the entire boot process can be viewed and monitored.
Signed-off-by: Stephen Shirley <diamond@google.com>...
Remove old node evacuation opcode
LUNodeEvacStrategy has been replaced with LUNodeEvacuate.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Add new opcode to evacuate node
This new opcode will replace LUNodeEvacStrategy, which used to return a list of instances and new secondary nodes. With the new opcode the iallocator (if available) is tasked to generate the necessary operations in the form of opcodes. This moves some logic from the client to the...
Fix cluster verify for empty node groups
There were some implicit assertions in the code that all node groups have nodes, which is not necessarily true.
Additionally, the patch does a wrapping change.
Fix a lint warning
Patch db8e5f1c removed the use of feedback_fn, hence pylint warns now.
Fix bug in recreate-disks for DRBD instances
The new functionality in 2.4.2 for recreate-disks to change nodes is broken for DRBD instances: it simply changes the nodes without caring for the DRBD minors mapping, which will lead to conflicts in non-empty...
Fix bug in drbd8 replace disks on current nodes
Currently the drbd8 replace-disks on the same node (i.e. -p or -s) has a bug in that it does modify the instance disk temporarily before changing it back to the same value. However, we don't need to, and shouldn't do that: what this operation does is simply change the LVM...
Conflicts: lib/cmdlib.py - use RequireSharedFileStorage there
LUInstanceCreate: use opcodes.RequireFileStorage
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Add one forgotten element to the file disk path
This was left out during the fix/refactoring
Conflicts: lib/cmdlib.py - use constants.DTS_FILEBASED...
Add DTS_FILEBASED constant
LUInstanceCreate: fix file storage dir calculation
- Move the calculation at the beginning of CheckPrereq, since it doesn't modify any state, but still keeps locks
- Only perform the calculation if the actual disk template is filebased
- Error out if there is no defined file storage dir...
Check that filestorage is enabled when requested
Remove self.op.file_storage_dir isabs check
As the manpage says, and the code does, self.op.file_storage_dir is an additional relative path under the cluster file storage dir. As such it should not be absolute.
Replace iallocator's mreloc w/ change-group and node-evac
This patch removes all occurrences of the “multi-relocate” iallocator mode. Commit 25ee7fd845 updated the design document and introduced separate modes, “change-group” and “node-evacuate”. The constants aren't...
Fix locking issues in LUClusterVerifyGroup
- Use functions in ConfigWriter instead of custom loops
- Calculate nodes only once instances locks are acquired, removes one potential race condition
- Don't retrieve lists of all node/instance information without locks...
cmdlib: Acquire BGL for LUClusterVerifyConfig
LUClusterVerifyConfig verifies a number of configuration settings. For doing so, it needs a consistent list of nodes, groups and instances. So far no locks were acquired at all (except for the BGL in shared mode)....
Export/import instance tags