storage: Check that mapper is either used or None
This is a followup patch to the one moving GetAllocatable out tomodule level.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Fix bug in “gnt-node list-storage”
LVM PV storage units would always show as allocatable, even when theyweren't. For some reason I have not been able to determine, the functionparsing the attributes (“_GetAllocatable”) was not even called and thelist opcode simply returned the attribute string as the value (e.g....
Improve import/export timeout settings
With this patch, the exporting node will retry to connect a few times.The receiving node will make use of the master's increased timeout (seeprevious patch).
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Increase remote import/export timeout
It's been shown that 60 seconds may not be enough to establish aconnection.
gnt-instance info: Show disk template
The data was already there, but not shown.
Remove unused import from client.gnt_instance
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Adeodato Simo <dato@google.com>
gnt-instance console: Improve error reporting
If the SSH command fails, this will give a more detailed errormessage than before.
Increase timeout for connection on remote import
The source cluster has to shut down an instance before it can beexported. Doing so can take a while, but the default connection timeoutis only 60 seconds. Adding the shutdown timeout on the receiving cluster...
jqueue: Fix cancelling while in waitlock in queue
Since the recent change to leave jobs in the “waitlock” status (commit5fd6b6947), cancelling a job while it's back in the queue would break.This patch handles these cases and adds a unittest.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
cli: Extend message for LUXI timeouts
Point out that jobs already submitted continue to run.
Fix timeout handling in LUXI client
If the socket can't be read in time, it raises “socket.timeout”, forwhich there is special handling code. Unfortunately the exception blockwas in the wrong order and “socket.error” caught it before.
Merge branch 'stable-2.3' into devel-2.3
Fix gnt-cluster verify with diskless instances
`gnt-cluster verify` was failing with KeyError if there was anydiskless instance in the cluster. This was because _CollectDiskInfo()was not including these instances in the returned dictionary, but theywere expected to be present in LUVerifyCluster.Exec()....
jqueue: Keep jobs in “waitlock” while returning to queue
Iustin Pop reported that a job's file is updated many times while itwaits for locks held by other thread(s). After an investigation it wasconcluded that the reason was a design decision for job priorities to...
Fix disk status verification in LUClusterVerify
Commit b8d26c6 added disk status verification, but it has two(different) bugs for not healthy nodes.
For offline nodes, we don't add at all the disk status to theinstance/node dict, with the result that the instance is not present in...
Merge branch 'devel-2.2' into devel-2.3
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Merge branch 'stable-2.2' into stable-2.3
Fix rename for file-backed instances
Currently the code wrongly changes the disk logical/physical idcomponent representing the path from "$storage_dir/$iname/disk$seq" to"$storage_dir/$iname/disk/$seq" (note the additional slash) breaking therename.
Signed-off-by: Guido Trotter <ultrotter@google.com>...
locking: Clarify message for removed locks
Just being told that a lock doesn't exist can be confusing. One casewere this happens is when a job (e.g. instance modify) waits for a jobremoving the instance (e.g. export with remove).
impexpd: Disable OpenSSL compression in socat if possible
This uses an option only available in patched socat versions. Moreinformation is available from the INSTALL update included in thispatch.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>...
config.py: need explicit %-formatting in errors.OpPrereqError.
Signed-off-by: Adeodato Simo <dato@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reinstall instance: disallow offline secondaries
Currently, reinstallation of a DRBD instance with the secondary node offline does:
node1# gnt-instance reinstall -f instance1Waiting for job 139053 for instance1...Thu Nov 18 01:36:09 2010 - WARNING: Could not prepare block device disk/0 on node node3 (is_primary=False, pass=1): Node is marked offline...
Fix breakage in OS state modify
I was using the feedback_fn function incorrectly (it doesn'tautomatically expand the arguments).
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
Conflicts: lib/cmdlib.py (reverted & applied manually the change)
Signed-off-by: Iustin Pop <iustin@google.com>...
LUSetClusterParms: fix validation of beparams
Since the contents of the dict is validated via the ForceDictType, we cansimply require that it is a dict here. The previous check was wrong, as it wascopied from the HV checks (which also doesn't verify the leaf dict type)....
Add unittests for TemporaryReservationManager
And fix an error message.
TempReservationManager: Reserved() doesn't work
Note: It appears this has been around since the initial checkin ofTemporaryReservationManager. I have no idea what this could break, sosomeone else may want to test this more thoroughly.
Signed-off-by: David Knowles <dknowles@google.com>...
Fix disk checks in “gnt-cluster verify”
Tests have shown that the changes in commit b8d26c6e5 don't work aswanted. If any disk wasn't found on the node, all disks located on thesame node would show as faulty. The cause was incorrect exceptionhandling on the node....
Remove shebang line from ganeti.server.*
Some of then were forgotten.
Signed-off-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
Drop the -g shortcut for --vg-name
Changing the volume group is a lot less frequent than acting on a nodegroup. As such we drop the "-g" shortcut and require the long option tobe passed. In 2.3 the commands which used to accept the volume group as"-g" won't have any node group option, so no confusion will arise. Later...
Merge the common options between import and add
The "I always wanted to do this" commit.
Signed-off-by: Guido Trotter <ultrotter@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>
Improve LookupNodeGroup's docstring
Add ConfigWriter.GetNodeGroup
Remove private ip mention in error message
There is no "private" ip in Ganeti, we only have primary and secondaryones. Whether they are public or private is a per-installation detail.
luxi: disable two lint errors
This is already disabled for the same type of request a couple of linesabove. The new code was introduced in e986f20c but didn't have thedisables.
Add -s option to gnt-node modify
We can now change a nodes' secondary ip.
config: Write ssconf after renaming instance
This fixes a bug where the ssconf_instance_list file wasnot updated after an instance rename.
Move ganeti-noded to ganeti.server.noded
Move ganeti-rapi to ganeti.server.rapi
Prepare move of daemons to ganeti.server
Move ganeti-masterd to ganeti.server.masterd
Move ganeti-confd to ganeti.server.confd
Move ganeti-watcher to ganeti.watcher
Add support and checks for version in LUXI
A new constant, LUXI_VERSION, is used to verify the peer's version. Theversion is optional, so old(er) clients and servers talking to peers notsupporting it won't break. Example with mismatching library:
$ gnt-instance list...
luxi.ProtocolError: Derive from errors.LuxiError
This allows LUXI errors to be encoded and serialized.
LUExportInstance: Accept instance already shut down
To remove the instance after an export it needs to be stopped. This canbe achived using the parameter “shutdown”, or by explicitly shuttingdown the instance before exporting. The latter would still require the...
GanetiLockManager, remove default values
The nodes and instances parameters to the constructor are mandatoryanyway, as a value of None will fail when creating the LockSet. Ratherthan fixing this adding code lines, since we never used the defaultvalue, let's remove them and require that the parameters are passed....
ConfigWriter.GetNodeGroupList
Prevent onlining a node without working noded
This is just a basic check, plus a warning. In the future, we might domore checks, or prevent simple onlining (without readd) if --force isnot passed.
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>
Yet another rework in LUSetNodeParms
We will need the new role in CheckPrereq, so move its computation thereand save the new role to self.
Prevent moving/creating instances on non-vm nodes
This small patch modifies LUCreateInstance, LUReplaceDisks andLUMoveInstance to not use non-vm_capable nodes.
ConfigWriter: add some helper functions
This can be used to compute a node's instances easily, and a smallfunction to get all non-vm_capable nodes.
Add vm_capable to LUSetNodeParams
And also do some cleanup: we only run the role changed actions if thenode has actually changed roles.
Add vm_capable to gnt-node modify
Add support for vm_capable in cluster verify
The method to make vm_capable integrate easily into cluster verify is as follows:
- we add a new NV_VMNODES that represents nonvm-capable nodes the LU populates this list (it's expected that non-vm_capable nodes...
Add an UploadHelper to cmdlib
This is used in two places already, and will be needed in a third, solet's abstract it.
Add support for vm_capable in file distribution
Add the master/vm_capable flags in node add
Add the capability flags in node info output
Add a CheckNodeVmCapable helper in cmdlib
Also changes the error code for the other CheckNode* helpers toECODE_STATE, not ECODE_INVAL: ECODE_INVAL is for requests that areinvalid (e.g. create drbd instance with one node), whereas ECODE_STATEdenote requests that are not satisfiable due to cluster/node/instance...
LUClusterVerify: Complain if disk is marked faulty
This will show a warning if, for example, one side of a DRBDdisk becomes unavailable. The data is collected separatelyfrom the other verification data.
Example output:
Move gnt-backup to ganeti.client.gnt_backup
Move gnt-instance to ganeti.client.gnt_instance
Move gnt-job to ganeti.client.gnt_job
Move gnt-node to ganeti.client.gnt_node
Move gnt-cluster to ganeti.client.gnt_cluster
Move gnt-os to ganeti.client.gnt_os
Move gnt-debug to ganeti.client.gnt_debug
Allow programs to be part of the Ganeti library
Eventually this will help ensuring that clients and servers are of thesame version, as long as they're imported from the same path. Currentlyit's relatively easy for gnt-* and ganeti-* to be from a different...
Implement the master_capable flag in node modify
Add master_capab to gnt-node modify
Signed-off-by: Iustin Pop <iustin@google.com>Reviewed-by: Guido Trotter <ultrotter@google.com>Reviewed-by: René Nussbaumer <rn@google.com>
Export the capability flags in query, rapi, ialloc
Add the master/vm_capable flags to objects
This adds the flag and some initial handling. The rest of the changes,for cmdlib, come in a separate patch.
Rework node role changes
There have been many bugs in gnt-node modify. Let's try to introducesome more.
This patch reworks the node role changes from tracking the flag changesto completely overwriting the flags based on the new role. This pavesthe way for (in 2.4 or later) moving to a single attribute for nodes....
rpc: Work around epydoc warning
Aliasing the “threading” module allows us to avoid the “No informationavailable for ganeti.rpc._RpcThreadLocal's base threading.local” warningby epydoc.
Merge branch 'devel-2.2'
Allow remote imports without checked names
By default all names are checked (LUCreateInstance, name_check). In somecases it can be useful to disable this check, but doing so was notallowed for remote imports. One should be aware, however, that usingthis feature can lead to rename script failures when importing a remote...
Support modify of prealloc_wipe_disks config value
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Export a node's group information in iallocator
Rename node.nodegroup to node.group
In the context of a node, its group has (at least today) only onemeaning, that is the node's node group. As such, we renamenode.nodegroup to just node.group.
Note: if we want to keep node in there, it should be at least...
Rename --nodegroup to --node-group
For consistency with other CLI options.
Export node group data in iallocator
Split IAllocator._ComputeClusterData
The node and instance computations were all in this big function; weseparate them out for more clarity.
Putting the pieces together and invoke the wipe in cmdlib
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>Reviewed-by: Iustin Pop <iustin@google.com>
Adding RPC call for blockdev_wipe
Second iteration over backend.BlockdevWipe
This patch now uses dd entirely to wipe the disk, make itmuch easier to wipe in blocks so we can give interactive feedbackabout the status.
Signed-off-by: René Nussbaumer <rn@google.com>Reviewed-by: Michael Hanselmann <hansmi@google.com>...
ConfigWriter: Fix typo in error message parts
Simplify and extend the instance OS env
Some parameters were missing (uuid, c/mtime). We simplify the exportmethod; unfortunately we cannot simply iterate over slots since themapping is not 1:1.
ConfigWriter: prevent using a foreign config
If the configuration file doesn't denote this node as master, we preventstartup. This would have detected our previous race condition moreeasily, hence we add it as a permanent check.
Fix bootstrap.MasterFailover race with watcher
This fixes a recently diagnosed race condition between master failoverand the watcher.
Currently, the master failover first stops the master daemon, checksthat the IP is no longer reachable, and then distributes the updated...
ConfigWriter: protect against multiple writers
This should fix the case where there are two masters that both try todistribute the configuration file to the cluster. The first one that does so,will "win" the ownership of the config.data.
backend.Upload: switch to utils.SafeWriteFile
This allows serialization of updates to a given file, with respect toother cooperating writers.
Add a "safe" file wrapper over WriteFile
Add functions to read and compare file 'ID's
LUSetInstanceParams: Remove unused attribute
“os_new” is not used anywhere, removing it.
Adding backend method to wipe a block device
Allow to specify wipe command and flags at configure time
Fix remote imports
A simple typo…
Fix typo introduced in 8d8c4ef
Commit 8d8c4ef broke instance reinstall with different OS, due to anattribute typo.
Fix clearing of the default iallocator
And also update the man page.