Merge branch 'devel-2.5'
Merge branch 'stable-2.5' into devel-2.5
Merge branch 'devel-2.4' into devel-2.5
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Warn if we enable maintain-node-health without confd
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
LUInstanceCreate: Release unused node locks
After iallocator ran we can release any unused node locks. Since they must be in exclusive mode this should improve parallelization during instance creation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
cmdlib.TLReplaceDisks: Use itertools.count
… instead of a variable which needs to be incremented for every step.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
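As a rough illustration of the itertools.count change described above, a shared counter can replace a step variable that is incremented by hand. This is a minimal sketch: the `run_steps` helper and the step names are hypothetical and are not taken from cmdlib.

```python
import itertools


def run_steps(step_names, total):
    """Return feedback lines numbered by a shared counter.

    Hypothetical helper: the real TLReplaceDisks code reports its steps
    through the LU's own feedback functions, which are not shown here.
    """
    steps = itertools.count(1)  # yields 1, 2, 3, ... on each next()
    lines = []
    for name in step_names:
        # No "step += 1" bookkeeping is needed between steps.
        lines.append("STEP %d/%d %s" % (next(steps), total, name))
    return lines


feedback = run_steps(["Check device existence", "Check peer consistency"], 6)
```

Because the counter owns the numbering, adding or removing a step does not require touching any other step's number.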
Transition into and out of offline instance state
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Introduce admin_state as 3-values data type
Rename admin_up to admin_state
cmdlib._ReleaseLock: Do nothing if no locks are owned
The locking library doesn't like it when “release()” is called on a lockset or lock which isn't held by the current thread. Instead of modifying the library, which could have other side-effects, this rather simple change avoids errors when a LU simply tries to...
Use resource lock when setting node parameters
Also acquire instance and resource locks in shared mode (see comment).
Use node resource lock for replacing instance disks
If early-release is not used, the resource lock is kept while waiting for disks to sync.
Hold node resource locks while setting instance parameters
Important for when disks are converted. Release locks once they're not needed anymore. Make liberal use of assertions.
Hold node resource lock while moving instance
Acquire node resource lock when removing instance
Removing an instance affects available disk space and memory.
Use node resource lock when recreating instance disks
Recreating disks conflicts with other disk operations, therefore the node resource lock must be acquired.
LUClusterRepairDiskSizes: Use node resource locks
Since this doesn't really touch the node, but it conflicts with e.g. growing a disk, the resource lock must be acquired.
LUInstanceGrowDisk: Use node resource lock
Also add one more feedback line. Downgrade instance lock to shared mode while we're only waiting for disks to sync. The node lock is released when not needed anymore.
LUInstanceCreate: Hold node resource lock
The node resource lock is released once the disks are in sync (that is, after wiping).
LUNodeQueryvols: Acquire all locks in shared mode
Nothing is being written to.
LUNodeQueryStorage: Acquire all locks in shared mode
Nothing is written to.
cmdlib: Share lock in LUInstanceConsole
No writes are being done.
LUNodeQuery: Call implementation's DeclareLocks function
Just in case we ever add locks for querying nodes. Currently _NodeQuery's DeclareLocks is a no-op function.
Change master IP address RPCs for external script
Change the master IP address RPC call chain to accept the use_external_master_ip_script parameter. Introduces an unused parameter in backend.ActivateMasterIp and backend.DeactivateMasterIp, that will be used in the next commit....
Update cluster verify to check IP address scripts
Update cluster-verify to check the integrity of the default master IP address setup script and the presence and executability of the external one (if currently in use by the cluster).
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>...
Add --use-external-mip-script flag
- add a command line flag to gnt-cluster init and modify to change the value of the cluster parameter use_external_mip_script;
- add two constants representing the paths of the default script and of the external script;...
Add use_external_mip_script cluster parameter
Add the use_external_mip_script cluster parameter, that represents whether the master IP address turnup/turndown procedures must use a script provided by the user (True) or the one provided by Ganeti (False)....
Ensure unused ports return to the free port pool
Ensure ports previously allocated by calling ConfigWriter's AllocatePort() are returned to the pool of free ports when no longer needed:
Fail if node/group evacuation can't evacuate instances
If an instance can't be evacuated, only a message would be printed. With this change the operation always aborts. Newly added unittests check for this behaviour.
LUNodeSetParams: Lock affected instances only
Until now LUNodeSetParams would lock all instances if a node's secondary IP address was to be changed and would then release all instances it didn't actually need. With this patch the LU optimistically locks instances and, once it got the locks,...
Check BGL when adding/removing node
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Improve error handling in netmask change
- check if the master IP with the old netmask is up before attempting to change the netmask (to avoid a failed change netmask resulting in an undesired activation of the master IP);
- improve error messages of the backend function;...
LUInstanceRename: Compare name with name
… instead of object with name.
LUClusterRepairDiskSizes: Acquire instance locks in exclusive mode
Instances are modified if their disk size doesn't match.
cmdlib: Allow specifying lock level when calculating node locks
This is needed to lock node resources.
Pass MasterNetworkParameters instances in RPCs
Pass instances of objects.MasterNetworkParameters when calling RPCs for activation and deactivation of master IP.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Use MasterNetworkParameters attributes for RPC
Instead of manually unpacking the return values of cfg.GetMasterNetworkParameters, let it return an instance of objects.MasterNetworkParameters and pass its attributes.
Uniform master IP activation and deactivation
Add the master IP family parameter to the master IP deactivation RPCs, so that the activation and deactivation interfaces are uniform.
Explicitly pass params to change_master_netmask
Make the master explicitly pass the parameters to the change_master_netmask RPC, and change all the call flow to use the new interface.
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Explicitly pass params to deactivate_master_ip
Make the master explicitly pass the parameters to the deactivate_master_ip RPC, and change all the call flow to use the new interface.
Explicitly pass params to activate_master_ip
To remove the usage of ssconf in the backend, the master needs to push the parameters of activate_master_ip to the backend.
This patch changes the entire call path of activate_master_ip to use the new interface....
Check the results of master IP RPCs
A failed gnt-cluster (de)activate-master-ip would not produce any output to the user. This patch adds code that checks the results of the RPCs and raises an exception if appropriate.
Generalize HooksMaster
- remove any dependence on Logical Units from the HooksMaster;
- add a new function parameter to the constructor, a function that is expected to convert the results of the hooks execution into a format understood by the HooksMaster;...
cmdlib: Fix issue when marking node as online
When a node is marked as online (“gnt-node modify -O no …”), an RPC is made to the node to check whether the node daemon is running. My recent RPC changes led to offline nodes being ignored before the actual call is...
rpc: Convert call for HV parameter validation
Instead of filling the parameters in the RPC layer, that is now done before the wrapper is called, thereby simplifying the wrapper.
rpc: Convert two more instance-specific calls
Interface changes were necessary as these took more parameters than were actually passed over the wire. Those parameters were just passed to the instance serialization function.
cmdlib: Use constant for DRBD meta device size
… instead of a hardcoded value.
Fix parameters to RPC "os_validate"
All other RPC wrappers take the node name(s) as the first parameter.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Conflicts: lib/cmdlib.py - trivial
Signed-off-by: Guido Trotter <ultrotter@google.com>...
Allow per-hypervisor optional files
Rather than just allowing files for all nodes to be optional, we allow optional files to be per-category. The way this works is that they must be included in both lists (the new code checks for this).
The code also removes a duplicate assert (present both in verify and...
Add hypervisors ancillary files list
These lists will be used to declare some of the files not necessarily needed on all nodes. The files selected are files without which the various hypervisors can still work, but that when they are present should be synchronized across the cluster (or node group)....
Upload spice files in redist-conf
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
OpGroupVerifyDisks: Fix wrong result type declaration
If an instance had actually a missing disk, the type check would fail.
Rename filter and filter_ to qfilter
We currently use 'filter' as the OpCode, QueryRequest and RAPI field name for representing a query filter. However, since 'filter' is a built-in function, we actually have to use filter_ throughout the code in order to not override the built-in function....
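The naming problem behind this rename can be sketched briefly. The example is an assumption for illustration only: `select_names` is a hypothetical function, and the query filter is modelled as a plain predicate for brevity, which is not how Ganeti represents query filters.

```python
def select_names(names, qfilter):
    """Return the names accepted by the predicate `qfilter`.

    Had the parameter been called `filter`, it would hide the built-in
    inside this function, and the call below would invoke the predicate
    itself instead of the built-in filter().
    """
    return list(filter(qfilter, names))


selected = select_names(["node1", "inst1", "node2"],
                        lambda name: name.startswith("node"))
```

A distinct name such as `qfilter` avoids both the shadowing and the trailing-underscore workaround.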
Add error codes documentation
Demote to warnings the errors in --ignore-errors
Treat the gnt-cluster verify errors identified by the error codes in --ignore-errors as warnings; just print a warning message for the user.
Add --ignore-errors parameter to cluster verify
lib/cli.py
- add IGNORE_ERROR_OPT;
client/gnt_cluster.py
- pass the ignore_errors parameter to the opcodes
lib/opcode.py
- update OpClusterVerifyConfig, OpClusterVerify and OpClusterVerifyGroup to accept the ignore_errors parameter...
Move cluster verify error codes to constants
- move the cluster verify error codes from cmdlib._VerifyErrors to constants;
- add to each of them the CV (Cluster Verify) prefix;
- add the CV_ALL_ECODES and CV_ALL_ECODES_STRINGS constants;
- wrap the lines that exceed 80 characters after changing the error...
Add cluster netmask parameter
Add the master_netmask cluster parameter, that represents the netmask of the master IP, encoded as a CIDR suffix.
This parameter can be set via the --master-netmask of gnt-cluster init and gnt-cluster modify. The default behaviour is to be consistent with...
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
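For readers unfamiliar with the CIDR-suffix encoding used by master_netmask, the following sketch shows how a suffix maps to a dotted-quad IPv4 netmask. It relies on the Python 3 standard library `ipaddress` module rather than on any Ganeti code, and `netmask_from_cidr_suffix` is a hypothetical helper name.

```python
import ipaddress


def netmask_from_cidr_suffix(suffix):
    """Convert an IPv4 CIDR suffix (0-32) to a dotted-quad netmask.

    Illustration only; this is not the conversion Ganeti performs.
    """
    network = ipaddress.ip_network("0.0.0.0/%d" % suffix)
    return str(network.netmask)


mask = netmask_from_cidr_suffix(24)
```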
Fix issue when verifying cluster files
If a cluster has any non-master-candidate nodes, those don't contain all files (e.g. config.data). With commit aef59ae764dc (March 31st, 2011) the logic was changed and subsequently verifying a cluster with non-mc nodes would complain....
Fix adding nodes after commit 64c7b3831dc
Commit 64c7b3831dc changed the RPC call for verifying SSH connections. Unfortunately this case in adding nodes was missed.
LUClusterVerifyGroup: Spread SSH checks over more nodes
When verifying a group the code would always check SSH to all nodes in the same group, as well as the first node for every other group. On big clusters this can cause issues since many nodes will try to connect to...
Add gnt-cluster commands to toggle the master IP
Split starting and stopping master IP and daemons
Add memory transfer progress info to migration
Make migration RPC non-blocking
To add status reporting for the KVM migration, the instance_migrate RPC must be non-blocking. Moreover, there must be a way to represent the migration status and a way to fetch it.
Migration: warn the user about hv version mismatch
Fix handling of cluster verify hooks
The change to enforce boolean results for cluster verify group opcode missed the HooksCallBack, which uses a very ugly 1/0 logic. Furthermore, the logic is wrong, since it unconditionally resets the verify result to true....
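The kind of flaw described here can be sketched as follows. Both functions are hypothetical illustrations of the pattern (an unconditional reset combined with 1/0 values) and are not the actual HooksCallBack code.

```python
def combine_buggy(verify_ok, hook_failures):
    """Flawed pattern: the incoming result is overwritten with 1."""
    verify_ok = 1  # discards any failure found before the hooks ran
    for failed in hook_failures:
        if failed:
            verify_ok = 0
    return verify_ok


def combine_fixed(verify_ok, hook_failures):
    """Keep earlier failures and return a real boolean."""
    return bool(verify_ok) and not any(hook_failures)
```

With the flawed variant, a verification that had already failed is reported as successful whenever no hook fails; the second variant preserves the earlier failure.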
Redistribute the RAPI certificate
This reverts to the old behaviour in Ganeti 2.4 and before.
Fix OS creation's error handling when pausing sync
Commit 41e1e79 introduced a feature in which when wait_for_sync is not set, DRBD sync is paused during the OS installation.
Doing so, however, broke OS creation's error handling: the result value from the instance_os_add RPC call was overwritten by the one of the...
cmdlib: Support for CPU pinning
Signed-off-by: Tsachy Shacham <tsachy@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix for auto parameters on import
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
DeprecationWarning fixes for pylint
In version 0.21, pylint unified all the disable-* (and enable-*) directives to disable (resp. enable). This leads to a lot of DeprecationWarning being emitted even if one uses the recommended version of pylint (0.21.1, as stated in devnotes.rst)....
Two more PEP8 fixes
cmdlib: Avoid wrapping using backslash
gnt_group: Avoid * magic using keyword arguments (the “pep8” tool doesn't like the inline comment in this case and will complain about spaces around the “*” operator)
PEP8 style fixes
Identified using the “pep8” utility.
Allow importing instance with full auto parameters
Disk template is no longer required when importing instance
… provided that disk_template value is set in the config.ini file.
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Documentation fix for importing with --src-dir option
Signed-off-by: Agata Murawska <agatamurawska@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit b7d7876bd0e9844fab8be28bfa1fd5d563ec7412)
Conflicts:
lib/cmdlib.py (easily fixed)
Get rid of {disk,nic}_count variables
This also fixes an issue if "disk_template = diskless" and no "disk_count" was specified, while doing an import of said instance specifications.
Small improvements for cluster verify
- Check if BGL is actually owned
- Show group name as feedback
Allow locking to be used via OpQuery
The original design for query2 specifically excluded locking, but now it's turned out that it would be a good thing to have in watcher. This patch adds a new parameter to OpQuery and enables its use in LUQuery. A missing function is added to LUGroupQuery, a comment clarified in...
Change OpClusterVerifyConfig's result, verify results
This patch removes the list of node groups (not used anymore since commit fcad7225e3fc) from OpClusterVerifyConfig's result and adds result verification to all OpClusterVerify* opcodes.
Use LU-generated jobs for verifying cluster
This patch moves the logic for verifying the various node groups in a cluster into the master daemon. Job dependencies are used to ensure the configuration, which requires the BGL, is verified first.
With this change it will be possible to expose whole-cluster...
Allow fixing of split instances via relocate
Currently, the IAllocator code requests strictly that the (set of) groups of the nodes we're relocating from is equal to the set of groups we're relocating to.
This, however, makes it impossible to fix split instances, since (by...
Further cleanup after multi-evacuate removal
Commit f0edfcf6 removed the parsing of multi-evacuate result, but the code went from:
if mode in (multi-evac, relocate): … if mode relocate: …
to:
if mode relocate: … if mode == relocate...
Fix bug in IAllocator parsing of Evacuate result
Commit 342f9172 added stricter checks for the iallocator result in evacuate mode, but it does this irrespective of the result status. When the result has failed and (according to the design) the list of nodes is empty, this code will trigger the following:...