Convert set to a list in LUGetTags
The set triggers an exception on a list-tags command and on RAPI calls for tags, since it is not serializable by JSON.
Reviewed-by: iustinp
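The failure described above can be reproduced with the standard json module alone. The following is a minimal sketch, not the actual LUGetTags code: the helper name tags_for_json is hypothetical, and sorting the tags (rather than a plain list() conversion) is an assumption made here only to get stable output.

```python
import json


def tags_for_json(tags):
    """Return the tags as a sorted list so json.dumps can serialize them.

    Hypothetical helper; the sorting is an assumption for stable output.
    """
    return sorted(tags)


tags = {"web", "db"}

# A set cannot be serialized by the json module.
try:
    json.dumps(tags)
except TypeError as err:
    print("set is not serializable:", err)

# The list form serializes without problems.
print(json.dumps(tags_for_json(tags)))
```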
Convert SetInstanceParams to concurrency
Grab a lock for the instance we're working on, and update its params.
Use Update in SetInstanceParams
When we set the instance params we're not adding a new instance, but just updating an existing one, so why use AddInstance?
Convert LUConnectConsole to concurrency
For ConnectConsole we just need to lock the instance we're connecting to. We make a few rpcs to its primary node, but node daemons can now handle multiple queries and nodes cannot be removed while they have instances on them anyway. Note that since we return the ssh command, and...
Add _ExpandAndLockInstance auxiliary function.
LUs that take an instance name as input and need to expand its name and lock it can use it to simplify their ExpandNames call. Possibly an _ExpandAndLockNode will come as well.
Convert two (simple) LUs to be concurrent
LUQueryClusterInfo and LUDumpClusterConfig can be made concurrent and don't need to acquire any locks. In fact they don't interact with the cluster at all, but just with its configuration, which is thread-safe by...
Add missing empty line
Two top level definitions were separated only by one empty line. Fixing this.
Reviewed-by: imsnah
Remove the old locking functions
This removes (hopefully) all traces of the old locking functions and uses.
Convert LUTestDelay to concurrent usage
In order to do so:
- We set REQ_BGL to False
- We implement ExpandNames
That's it, really.
LogicalUnit: add ExpandNames function
New concurrent LUs will need to call ExpandNames so that any names passed in by the user are canonicalized, and can be used by hooks, locking and other parts of the code. This was done in CheckPrereq before, but it's now split out, as it's needed for locking, which in...
Add a missing import to cmdlib
cmdlib uses some constants from locking (i.e. locking levels) but doesn't import it. This patch fixes the issue.
Fix an error accessing the cfg
Since the context is passed to LogicalUnit, rather than the cfg, we can only access the cfg as self.cfg, self.context.cfg, or context.cfg (in the constructor). cfg is not valid anymore.
Add and remove instance/node locks
Whenever we add an instance or node to the cluster (i.e. to the config) and whenever we remove them, we should add/remove locks as well. In the future we may want to optimize this so that the configwriter does it, or it's handled at the context level, but till we're adding/removing...
Pass context to LUs
Rather than passing a ConfigWriter to the LUs we'll pass the whole context, from which a ConfigWriter can be extracted, but we can also access the GanetiLockManager. This also fixes the places where a FakeLU is created.
Fix a typo in LUTestDelay docstring
Add REQ_BGL LogicalUnit run requirement
When logical units have REQ_BGL set (it is currently the default) they need to be the only ganeti operation run on the cluster, and we'll guarantee it at the master daemon level. Currently only one thread is running at a time, so this requirement is never broken....
AddNode: move the initial setup to bootstrap
From the master node we can't start ssh and connect to the remote node, nor can we do it from ganeti-noded, as this ssh session will possibly ask for key confirmation and password. So the code to copy the ganeti-noded...
LUAddNode: use node-verify to check node hostname
As we can't use ssh.VerifyNodeHostname directly, we'll set up a mini node-verify to do checking between the master and the new node. In the future networking checks, or more nodes, can be added as well.
LUAddNode: use self.sstore, not a local ss
Since we're inside a LU we have access to self.sstore. No need to use ss, whose separate instantiation will disappear in a few patches! ;)
LUAddNode: upload files via rpc, not scp
We used to scp all the ssconf files, and the vnc password file to the new node. With this patch we use the upload_file rpc, specifying just the new node as a destination. All the files previously copied by scp are already allowed by the backend....
Change fping to TcpPing in two LUs
Two LUs are using RunCmd to call fping, in order to check for an IP presence on the network. Substituting it with TcpPing will get rid of it, which makes it not break in the new world order, where the master cannot fork....
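The idea of probing an address without forking an external fping process can be sketched with the standard socket module. This is only an illustration: the function name tcp_ping, its parameters, and its behaviour are assumptions and do not reproduce the TcpPing signature in Ganeti. In this sketch only a completed TCP connection counts as success.

```python
import socket


def tcp_ping(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper: runs entirely in-process, so no fork is needed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treated as "not present" here.
        return False
```

Because the check runs inside the calling process, it keeps working in a daemon that is not allowed to fork.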
When removing a node don't ssh to it
Even in 1.2 this behaviour is broken, as the rpc call will remove the ssh keys before we get a chance to log in. Now the rpc takes care of shutting down the node daemon as well, so we definitely can avoid this.
This makes the LURemoveNode operation work again with the threaded...
Remove spurious check during LUAddNode
There is no point in checking whether the cluster VNC password file exists as a prerequisite for AddNode, considering the check happens on the master node, not the target one. Removing this check.
Improve LURemoveNode BuildHooksEnv docstring
Cleanup old DRBD 0.7.x code
Apparently there were still some leftovers. While removing an instance, I got the message "unhandled exception 'module' object has no attribute 'LD_MD_R1'".
Fix gnt-cluster “command” and “copyfile”
Since the disabling of forking in the master daemon, the two ssh-based subcommands were not working anymore. However, there is no need at all for the commands to be run from the master daemon (permissions to read the cluster private ssh key notwithstanding), they can be run directly...
Add a ‘tags’ field to instance and node listing
Currently there isn't any easy way to list all nodes or instances and their tags; you have to query each node in turn, or list all the tags via something like “gnt-cluster search-tags '.*'”. Of course, this is...
Fix an error-handling case
There is a mistake in handling grow-disk for an invalid disk. This patch fixes it.
Implement disk grow at LU level
This patch adds a new opcode and LU for growing an instance's disk.
The opcode allows growing only one disk at a time, and will throw an error if the operation fails midway (e.g. on the primary node after it has been increased on the secondary node). As such, it might actually leave...
Move SetKey to WritableSimpleStore and use it
Before, we used to be able to update SimpleStore by just calling SetKey; this feature is now moved to an external class, which inherits from it. In this patch the new WritableSimpleStore class is also put to use, in the LUs that...
Activate down instances' disks on replace-disks
When replacing disks or evacuating nodes with instances administratively down, ganeti fails because the instance disks are not active. This patch activates them, performs the replacement, and shuts them down again....
FailoverInstance: change AddInstance with Update
We're not adding a new instance, just making configuration changes to the one we're working on.
Fix an error message in instance add
There is a mistake in the error message generated when we can't reach a node for checking for available disk space. Without it, the error message is:
Failure: prerequisites not met for this operation:
Cannot get current information from node '{u'gnte2.lab.k1024.org':...
Move InitCluster opcode into a single function
This allows us to initialize a new cluster. The code certainly contains bugs and hooks aren't implemented yet.
Move cmdlib._HasValidVG to utils.CheckVolumeGroupSize
This is required for splitting the cluster initialization code.
Move {Set,Remove}EtcHostsEntry wrappers to utils.py
This is required for the split of the cluster initialization code.
Reviewed-by: iustinp, ultrotter
Remove REQ_CLUSTER from opcode handling code
It's not needed anymore now that all opcodes require a cluster. Cluster initialization was the only exception.
Add check for node memory in instance creation
Currently the check for enough memory is done only on the instance start command and the failover command. But we also start an instance in instance create, therefore we need to check this instead of failing to start in...
Show cluster hypervisor for gnt-cluster info
Author: schreiberal
Reviewed-by: iustinp
Forward-port: make gnt-modify work with new HVM parameters
This fixes gnt-instance modify so it actually works with the new HVM parameters for Ganeti 1.2
Forward-port: show only parameters relevant to the instance
This patch modifies the code for "gnt-instance info .." to only display instance parameters that actually apply to that instance, i.e. for PVM instances no HVM parameters are shown and vice versa....
Forward-port: patch 2/4 extended HVM features for 1.2
This patch adds the commandline extensions and the code to store and display the extended HVM features.
Complete removal of md/drbd 0.7 code
This patch removes the last of the md and drbd 0.7 code. Clusters which have the old device types will be broken if they have this applied.
LURemoveInstance: fix op.ignore_failures usage
Currently the LURemoveInstance.Exec() method uses the ignore_failures attribute of the OpRemoveInstance opcode, but it doesn't check for its existence. The patch adds this attribute to _OP_REQP and to all the...
Implement node daemon connectivity tests
This patch adds to gnt-cluster verify checks for inter-node TCP communication on the node daemon port for both the primary and (if defined) secondary networks.
The output looks like (4-node cluster, one with the secondary interface...
Forward-port changes made to readd in 1.2
qa_node.py: Fix typo in message
cmdlib.py: Don't add readded node to node list
ganeti-qa.py: Make sure readd isn't done for master node
Use new ssconf function to check configuration version
Upgrades will be handled in future patches.
Export the number of cpus to iallocator scripts
Now that we have the number of cpus available from the hypervisors, we can export this to the iallocator scripts.
Reviewed-by: ultrotter
Add node cpu count to gnt-node list
This patch adds the backend and frontend changes needed for being able to list the cpu count.
Add cluster-verify hooks
Only post-hooks are run on cluster verify, and then their output is sent back to the LU, which upon failure displays it to the user and changes the result of the execution to a failure.
Add a LU Hooks notification function
Previously LUs could be failed by pre-hooks, and post-hooks just had effects by themselves. This patch allows a LU to define the HooksCallBack function if it wants to know about its hooks' results and alter its results in response....
Remove NoHooksLU.BuildHooksEnv
Since NoHooksLU defines HPATH as None, BuildHooksEnv will never be called (as the LogicalUnit.BuildHooksEnv docstring correctly states). Removing the function altogether, to avoid having dead code lying around, and to make sure...
LogicalUnit.BuildHooksEnv, update docstring
The LogicalUnit.BuildHooksEnv docstring used to say that the node list should not include the master node. This is obviously not the case checking the relevant code, and double-checking with iustin he confirmed it just document...
Raise PrereqError when exporting file-based instance
This patch adds a check to LUExportInstance.CheckPrereq to raise an error when an instance with file disks is exported.
Convert cli.SubmitOpCode to use the master
This patch converts the cli.py SubmitOpCode method to use the unix protocol and thus execute the opcodes via the master.
The patch allows a partial burnin to work with the master. Currently the query opcodes, since they are executed via the SubmitOpCode, are...
Move iallocator script execution to ganeti-noded
Currently the iallocator execution takes place in the master, which is a violation of the current architecture, and will create problems with a threaded master daemon.
This patch moves the execution to the backend, similar to the hooks...
Fix iallocator instance info
The commit "IAllocator: some more info exported" broke the instance list generation due to a wrong index variable. This patch fixes that.
IAllocator: some more info exported
This patch adds the following information to the exported info:
- hypervisor type (in the main dict)
- total memory used by primary instances (in each node dict) (can be computed from the node+instance dicts, but it's cheap to compute...
IAllocator: simplify node info computation
Currently we try to convert the values returned by call_node_info to ints, and if all succeed, we actually do the conversion. Simplify this by doing it in one step.
The patch also adds exporting of node memory as 'reserved_memory'....
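The "one step" conversion described above can be sketched as a single pass that either yields all integer values or reports failure. The function name, the field names, and the choice to signal failure with None are assumptions for illustration; the real call_node_info result layout is not shown in this log.

```python
def node_info_to_ints(raw, keys):
    """Convert the named fields of a node info dict to int in one pass.

    Returns a new dict on success, or None if any field is missing or is
    not convertible. Hypothetical helper; field names are illustrative.
    """
    try:
        return {key: int(raw[key]) for key in keys}
    except (KeyError, TypeError, ValueError):
        return None


info = node_info_to_ints(
    {"memory_total": "4096", "memory_free": 1024},
    ["memory_total", "memory_free"],
)
print(info)
```

Doing the conversion inside one try block avoids the separate "check everything, then convert everything" passes.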
Style fixes for trunk
This small patch fixes:
- wrong indentation in two places
- use of 'os' variable that hides global scope os module
Implement replace secondary via the iallocator
This patch implements secondary replace via the iallocator. The new opcode parameter 'iallocator' behaves like this: if passed, it will always compute and assign a new secondary, behaving in effect as if the...
Fix generalized relocate mode of IAllocator
The patch which generalized the IAllocator was half-true: it actually put the selection of the node inside the IAllocator, so callers were not able to specify replace primary node.
This patch does:
- split the arguments to the constructor in three sets: mode and name...
Add gnt-backup remove functionality
This patch also fixes the LUExportInstance Prereq docstring.
Generalize the replace_secondary mode in iallocator
Currently the replace_secondary mode is too restrictive. This patch changes this to a general 'relocate' mode where the node(s) to be changed are specified via a new key in the request dict ('nodes') so...
Send required_nodes field to the iallocator scripts
This patch adds the 'required_nodes' field in the request dict for the iallocator.
This means that the handmade-checks in the create instance can be simplified, and that the dumb allocator can be made simple. Therefore...
Move all iallocator functions into a class
This patch moves all the iallocator functions into a separate class that is then somewhat easier to use. It doesn't bring any new functionality.
The patch also changes the way the iallocator is called - the OpTestAllocator opcode is no longer needed, and all its parameters...
Abstract the json functions into a separate module
This simple patch adds a new module that holds the simplejson functions for serialization/deserialization. This reduces the amount of redundant code.
The patch also adds some normalizations to the json output:...
Add --readd option to “gnt-node add”
This allows us to readd a node after it failed and required a reinstallation or replacement.
IAllocator part 3: LUCreateInstance changes
This (final) patch allows the instance's nodes to be selectedautomatically based on the passed allocator algorithm.
The patch changes the pnode opcode parameter from required to optional; now either the pnode or the iallocator must be passed....
Reorder checks in instance create
This patch reorders the checks in the instance create prereq so that all checks and normalisations that are not node-dependent are done before the node-dependent checks.
This is done so that, after the instance-related opcode parameters are...
Implement 'out' direction on allocator tests
This patch adds the paths for searching for instance allocators and makes the LUTestAllocator code run the allocator and return the results if the direction specified is 'out'. 'out' means that the opcode will...
Allocator framework, 1st part: allocator input generation
In preparation for the introduction of the automatic instance allocator, this patch adds an allocator simulation opcode that, based on the input parameters, will return either the input message to the allocator...
Fix two pylint uninitialized variable errors
Bugfix: wrong identifier in CheckPrereq message
Move the disk size computation to its own function
This is currently hard-coded for the two drive case and will need to be reworked for multi-disk support.
The patch is needed to support passing the total required size to the iallocator interface.
Verify: make skipping checks possible
Add a general way to skip some checks at cluster-verify time and make the N+1 memory redundancy check optional.
Verify: add N+1 Memory redundancy verification
For every node we check that it can host all the instances for which it is currently secondary and which share the same primary. This ensures that if a node fails all its instances can fit on their secondary node. The code only works when...
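The N+1 check described above can be sketched as follows. This is an illustration under assumed data shapes, not the cluster-verify code: the function name, the three dictionaries, and the use of megabytes are all hypothetical.

```python
def n_plus_one_failures(free_memory, sinst_by_pnode, instance_memory):
    """Find nodes that could not absorb a failover of one of their peers.

    free_memory:     node name -> free memory (MB)
    sinst_by_pnode:  node name -> {primary node -> [secondary instances]}
    instance_memory: instance name -> memory needed (MB)

    Returns a list of (secondary node, failed primary, needed, free).
    All structures are assumptions made for this sketch.
    """
    failures = []
    for node, by_pnode in sorted(sinst_by_pnode.items()):
        for pnode, instances in sorted(by_pnode.items()):
            # Memory needed on `node` if `pnode` fails and all of its
            # instances with `node` as secondary are started there.
            needed = sum(instance_memory[name] for name in instances)
            if needed > free_memory[node]:
                failures.append((node, pnode, needed, free_memory[node]))
    return failures


print(n_plus_one_failures(
    {"node1": 1024, "node2": 4096},
    {"node1": {"node2": ["inst1", "inst2"]}, "node2": {"node1": ["inst3"]}},
    {"inst1": 512, "inst2": 1024, "inst3": 2048},
))
```

Each primary is considered separately, matching the single-node-failure assumption behind N+1 redundancy.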
Verify: save instance config
Save the instance config after we queried it in an instance_cfg dict. This can be used later by any function that wants it, without reloading it from the configuration module. It will be used for N+1 memory resilience checking....
Verify: add more instance information to node_info
The sinst-by-pnode field contains all secondary instances of a node, grouped by their primary node. This information allows us to see quickly whether, when a node dies, some of its instances cannot be started on their secondary node....
Verify: add instance information to node_info
With this patch node_info is changed to store information about which primary and secondary instances are configured on a node. This information is useful to check memory and disk allocation. A list of non-redundant instances is also...
Verify: Add and populate node_info dict
During information gathering we collect information from call_node_info, and then when we cycle through the nodes add it into a node_info dict containing a node's free memory and disk. This will be useful later to verify that the...
Rework the results of OpDiagnoseOS opcode
Currently, the opcode DiagnoseOS is the only opcode that returns a structure of objects.OS (which is a custom class, and not a simple python object) and furthermore all the processing of OS validity across nodes is left to the clients of this opcode....
Verify: remove useless check in _VerifyInstance
The list of instances passed to _VerifyInstance is the one coming from self.cfg.GetInstanceList(). So there's no point, inside that function, in checking whether the current instance is a member of that list. Moreover...
Verify: instance verification cleanup
The instance configuration is grabbed both in the _VerifyInstance function and in the loop that calls it. Clean this up by passing the configuration as a parameter.
Verify: fix crash when a node is down
Currently if ganeti-noded doesn't respond on a node, gnt-cluster verify will die when verifying primary instances for that node. Fix this by just emitting an error message if no information about running instances is returned from the...
Verify: fix ERROR message indentation
All ERROR messages in cluster verify are indented by four spaces, but this one is indented by two. Fixing this skew.
Reviewed-by: imsnah, iustinp
Small code style fix
Bugfix instance create when file-storage-dir None
os.path.join does not like None as argument and fails with AttributeError: 'NoneType' object has no attribute 'startswith'.
This patch makes sure the passed argument is a string in any case.
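A minimal sketch of the kind of guard described above follows. The function name and its arguments are hypothetical, and mapping None to an empty string is an assumption about how "a string in any case" might be achieved; the actual patch is not shown in this log. Note that on Python 3 the unguarded call fails with TypeError rather than the AttributeError quoted above.

```python
import os.path


def join_storage_dir(base_dir, sub_dir, instance_name):
    """Build an instance storage path, tolerating sub_dir=None.

    Hypothetical helper: None is replaced by an empty string before the
    value reaches os.path.join, which only accepts path-like arguments.
    """
    sub_dir = sub_dir or ""
    return os.path.normpath(os.path.join(base_dir, sub_dir, instance_name))


print(join_storage_dir("/srv/ganeti/file-storage", None, "instance1"))
print(join_storage_dir("/srv/ganeti/file-storage", "web", "instance1"))
```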
Two small code style fixes
Modify LURenameInstance to support file backend
This patch does two things:
- Modify LURenameInstance.Exec to rename directory when a file-based instance is renamed
- Modify config.RenameInstance() to replace the directory name in config.data for file devices...
Modify LUCreateInstance to support file backend
- Modify _GenerateDiskTemplate to support file-based disk template
- Modify _CreateDisks to create directory needed for file-based instances before creating the actual files
- Modify _RemoveDisks to delete directory for file-based instances...
Improve disk consistency error message again
This new version includes all the possible failure options.
Fix misleading error message when checking disks
_CheckDiskConsistency outputs "Can't get any data from node NODE" when no drbd is found on the target node. This causes a misleading error message to be output for example on failover (when the primary node is down, or the instance...
Handle better failing over non-running instances
Right now if you try to failover an instance which is not marked as up the operation will fail unless you pass the --ignore-consistency flag because the disks won't be considered to be consistent. Allow them to be if we know the...
Improve export and fix export-on-norun bug
Currently gnt-backup export chains the ShutdownInstance and StartupInstance opcodes to itself. This works but (a) it's suboptimal, because there's no need to deactivate the instance's disks as we are about to restart it anyway, and...
failover: only start instance if we should
gnt-instance failover on an instance marked as down will mistakenly bring it up. The watcher will then shut it down again, but it's a lot better (and safer) not to start it at all.
Change the 'gnt-cluster command' execution order
This patch makes the command execute last on the master (if the master is selected). The order for the other nodes is unchanged.
The patch also updates the man page with some explanations and an example....
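The ordering rule described above (master last, everything else unchanged) can be sketched as a small list transformation. The function name and the plain list of node names are assumptions for illustration, not the gnt-cluster implementation.

```python
def order_with_master_last(nodes, master):
    """Return nodes in their original order, with the master moved last.

    If the master is not among the selected nodes, the order is unchanged.
    Hypothetical helper for illustration.
    """
    ordered = [name for name in nodes if name != master]
    if master in nodes:
        ordered.append(master)
    return ordered


print(order_with_master_last(["node1", "master", "node2"], "master"))
```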
parms->params Refactoring
- Substitute all occurrences of name 'parms' with 'params'
- Small codestyle fix
Skip HasValidVG when --no-lvm-storage on cluster init
This patch does two things:
- Remove "vg_name" from _OP_REQP due to the introduction of --no-lvm-storage. Since the vg_name option now defaults to None and is only set to the DEFAULT_VG if lvm_storage is enabled, this is needed...
Add LUSetClusterParams to cmdlib
Add LUSetClusterParams, which is the LU to modify cluster options. This includes checks:
- not to disable lvm storage when it's already disabled
- not to enable lvm storage when it is already enabled
- not to disable lvm when lvm-based instances are present...