Drop the -g shortcut for --vg-name
Changing the volume group is a lot less frequent than acting on a nodegroup. As such we drop the "-g" shortcut and require the long option to be passed. In 2.3 the commands which used to accept the volume group as "-g" won't have any node group option, so no confusion will arise. Later...
Merge the common options between import and add
The "I always wanted to do this" commit.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Improve LookupNodeGroup's docstring
Add ConfigWriter.GetNodeGroup
Remove private ip mention in error message
There is no "private" ip in Ganeti, we only have primary and secondary ones. Whether they are public or private is a per-installation detail.
luxi: disable two lint errors
This is already disabled for the same type of request a couple of lines above. The new code was introduced in e986f20c but didn't have the disables.
config: Write ssconf after renaming instance
This fixes a bug where the ssconf_instance_list file was not updated after an instance rename.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Move ganeti-noded to ganeti.server.noded
Move ganeti-rapi to ganeti.server.rapi
Prepare move of daemons to ganeti.server
Move ganeti-masterd to ganeti.server.masterd
Move ganeti-confd to ganeti.server.confd
Move ganeti-watcher to ganeti.watcher
Add support and checks for version in LUXI
A new constant, LUXI_VERSION, is used to verify the peer's version. The version is optional, so old(er) clients and servers talking to peers not supporting it won't break. Example with mismatching library:
$ gnt-instance list...
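The optional version check described above can be sketched as follows. This is only an illustration under stated assumptions: the value of LUXI_VERSION, the "version" key name, the JSON framing and the ProtocolError class shown here are hypothetical stand-ins, not the actual LUXI code.

```python
import json

# Hypothetical value and key name; the real LUXI constants may differ.
LUXI_VERSION = 2030000
KEY_VERSION = "version"


class ProtocolError(Exception):
    """Raised when a message cannot be accepted."""


def parse_request(raw):
    """Decode a request and verify the optional version field."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ProtocolError("Request must be a dictionary")
    version = msg.get(KEY_VERSION)
    # Peers that predate the version field simply omit it; accept those.
    if version is not None and version != LUXI_VERSION:
        raise ProtocolError("Version mismatch: expected %s, received %s"
                            % (LUXI_VERSION, version))
    return msg
```

Treating a missing version as acceptable is what keeps older peers working; only an explicit, different version is rejected.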
luxi.ProtocolError: Derive from errors.LuxiError
This allows LUXI errors to be encoded and serialized.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
LUExportInstance: Accept instance already shut down
To remove the instance after an export it needs to be stopped. This can be achieved using the parameter “shutdown”, or by explicitly shutting down the instance before exporting. The latter would still require the...
GanetiLockManager, remove default values
The nodes and instances parameters to the constructor are mandatory anyway, as a value of None will fail when creating the LockSet. Rather than fixing this by adding code lines, since we never used the default value, let's remove them and require that the parameters are passed....
ConfigWriter.GetNodeGroupList
Prevent onlining a node without working noded
This is just a basic check, plus a warning. In the future, we might do more checks, or prevent simple onlining (without readd) if --force is not passed.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Yet another rework in LUSetNodeParms
We will need the new role in CheckPrereq, so move its computation there and save the new role to self.
Prevent moving/creating instances on non-vm nodes
This small patch modifies LUCreateInstance, LUReplaceDisks and LUMoveInstance to not use non-vm_capable nodes.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
ConfigWriter: add some helper functions
This adds a helper that can be used to compute a node's instances easily, and a small function to get all non-vm_capable nodes.
Add vm_capable to LUSetNodeParams
And also do some cleanup: we only run the role changed actions if the node has actually changed roles.
Add vm_capable to gnt-node modify
Add support for vm_capable in cluster verify
The method to make vm_capable integrate easily into cluster verify is as follows:
- we add a new NV_VMNODES that represents non-vm_capable nodes
- the LU populates this list (it's expected that non-vm_capable nodes...
Add an UploadHelper to cmdlib
This is used in two places already, and will be needed in a third, so let's abstract it.
Add support for vm_capable in file distribution
Add the master/vm_capable flags in node add
Add the capability flags in node info output
Add a CheckNodeVmCapable helper in cmdlib
Also changes the error code for the other CheckNode* helpers to ECODE_STATE, not ECODE_INVAL: ECODE_INVAL is for requests that are invalid (e.g. create drbd instance with one node), whereas ECODE_STATE denotes requests that are not satisfiable due to cluster/node/instance...
LUClusterVerify: Complain if disk is marked faulty
This will show a warning if, for example, one side of a DRBD disk becomes unavailable. The data is collected separately from the other verification data.
Example output:
Move gnt-backup to ganeti.client.gnt_backup
Move gnt-instance to ganeti.client.gnt_instance
Move gnt-job to ganeti.client.gnt_job
Move gnt-node to ganeti.client.gnt_node
Move gnt-cluster to ganeti.client.gnt_cluster
Move gnt-os to ganeti.client.gnt_os
Move gnt-debug to ganeti.client.gnt_debug
Allow programs to be part of the Ganeti library
Eventually this will help ensuring that clients and servers are of the same version, as long as they're imported from the same path. Currently it's relatively easy for gnt-* and ganeti-* to be from a different...
Implement the master_capable flag in node modify
Add master_capab to gnt-node modify
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Export the capability flags in query, rapi, ialloc
Add the master/vm_capable flags to objects
This adds the flag and some initial handling. The rest of the changes, for cmdlib, come in a separate patch.
Rework node role changes
There have been many bugs in gnt-node modify. Let's try to introducesome more.
This patch reworks the node role changes from tracking the flag changes to completely overwriting the flags based on the new role. This paves the way for (in 2.4 or later) moving to a single attribute for nodes....
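A minimal sketch of the "overwrite all flags from the new role" idea, as opposed to tracking individual flag changes. The role names, the Node class and the helper functions are hypothetical and chosen for illustration; the actual constants and code in cmdlib may differ.

```python
# Hypothetical role names; the real constants may be named differently.
ROLE_CANDIDATE, ROLE_DRAINED, ROLE_OFFLINE, ROLE_REGULAR = range(4)

# Each role maps to the complete (master_candidate, drained, offline) tuple.
ROLE_TO_FLAGS = {
    ROLE_CANDIDATE: (True, False, False),
    ROLE_DRAINED: (False, True, False),
    ROLE_OFFLINE: (False, False, True),
    ROLE_REGULAR: (False, False, False),
}
FLAGS_TO_ROLE = dict((flags, role) for role, flags in ROLE_TO_FLAGS.items())


class Node(object):
    """Stand-in for a node object carrying the three role flags."""

    def __init__(self):
        self.master_candidate = False
        self.drained = False
        self.offline = False


def get_role(node):
    """Derive the current role from the three flags."""
    return FLAGS_TO_ROLE[(node.master_candidate, node.drained, node.offline)]


def set_role(node, new_role):
    """Overwrite all three flags at once, regardless of the old role."""
    (node.master_candidate, node.drained, node.offline) = \
        ROLE_TO_FLAGS[new_role]
```

Because every role change rewrites the full flag tuple, inconsistent combinations (for example offline and drained at once) cannot be produced by a role change, which is also what makes a later move to a single attribute straightforward.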
rpc: Work around epydoc warning
Aliasing the “threading” module allows us to avoid the “No information available for ganeti.rpc._RpcThreadLocal's base threading.local” warning by epydoc.
Merge branch 'devel-2.2'
Allow remote imports without checked names
By default all names are checked (LUCreateInstance, name_check). In some cases it can be useful to disable this check, but doing so was not allowed for remote imports. One should be aware, however, that using this feature can lead to rename script failures when importing a remote...
Support modify of prealloc_wipe_disks config value
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Export a node's group information in iallocator
Rename node.nodegroup to node.group
In the context of a node, its group has (at least today) only one meaning, that is the node's node group. As such, we rename node.nodegroup to just node.group.
Note: if we want to keep node in there, it should be at least...
Rename --nodegroup to --node-group
For consistency with other CLI options.
Export node group data in iallocator
Split IAllocator._ComputeClusterData
The node and instance computations were all in this big function; we separate them out for more clarity.
Putting the pieces together and invoke the wipe in cmdlib
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Adding RPC call for blockdev_wipe
Second iteration over backend.BlockdevWipe
This patch now uses dd entirely to wipe the disk, making it much easier to wipe in blocks so we can give interactive feedback about the status.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>...
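To illustrate why wiping in blocks with dd makes progress feedback easy, here is a rough sketch that only builds the dd command lines, one per chunk, together with the percentage completed after each. The function name, the MiB units and the device path in the usage note are assumptions for the example; this is not the backend.BlockdevWipe code, and nothing is executed.

```python
def wipe_commands(path, size_mib, chunk_mib):
    """Yield (percent_done, argv) for each chunk; all sizes are in MiB."""
    offset = 0
    while offset < size_mib:
        # The last chunk may be smaller than chunk_mib.
        count = min(chunk_mib, size_mib - offset)
        argv = ["dd", "if=/dev/zero", "of=%s" % path, "bs=1048576",
                "seek=%d" % offset, "count=%d" % count]
        offset += count
        yield (100.0 * offset / size_mib, argv)
```

A caller could iterate over wipe_commands("/dev/xenvg/disk0", 2500, 1024) (hypothetical device), run each argv, and report the percentage after every chunk finishes.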
ConfigWriter: Fix typo in error message parts
Simplify and extend the instance OS env
Some parameters were missing (uuid, c/mtime). We simplify the export method; unfortunately we cannot simply iterate over slots since the mapping is not 1:1.
ConfigWriter: prevent using a foreign config
If the configuration file doesn't denote this node as master, we prevent startup. This would have detected our previous race condition more easily, hence we add it as a permanent check.
Signed-off-by: Iustin Pop <iustin@google.com>...
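The foreign-config check in the ConfigWriter entry above amounts to comparing the master recorded in the configuration with the local node name. A minimal sketch, assuming a dictionary-shaped configuration with a "cluster"/"master_node" entry; the function, the exception class and that layout are hypothetical here:

```python
import socket


class ConfigurationError(Exception):
    """Raised when the loaded configuration belongs to another master."""


def verify_master(config, my_name=None):
    """Refuse to continue unless the config names this node as master."""
    if my_name is None:
        my_name = socket.gethostname()
    # Assumed layout: the master's name is stored under cluster/master_node.
    master = config["cluster"]["master_node"]
    if master != my_name:
        raise ConfigurationError(
            "Configuration says master is %s, but this node is %s"
            % (master, my_name))
```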
Fix bootstrap.MasterFailover race with watcher
This fixes a recently diagnosed race condition between master failover and the watcher.
Currently, the master failover first stops the master daemon, checksthat the IP is no longer reachable, and then distributes the updated...
ConfigWriter: protect against multiple writers
This should fix the case where there are two masters that both try to distribute the configuration file to the cluster. The first one that does so will "win" the ownership of the config.data.
backend.Upload: switch to utils.SafeWriteFile
This allows serialization of updates to a given file, with respect to other cooperating writers.
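One way cooperating writers can serialize updates is to remember an identifier of the file as last seen and refuse to overwrite it if that identifier has changed. The sketch below assumes the ID is the (device, inode, mtime) triple from os.stat; the function names are illustrative and this is not the utils.SafeWriteFile implementation. It also omits locking, so the check and the rename are not atomic here.

```python
import os
import tempfile


def get_file_id(path):
    """Return an identifier that changes whenever the file is replaced."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_mtime)


def safe_write_file(path, expected_id, data):
    """Replace path with data only if it still has the ID we last saw."""
    if expected_id is not None and get_file_id(path) != expected_id:
        raise RuntimeError("File %s was changed by another writer" % path)
    # Write to a temporary file in the same directory, then rename over
    # the target so readers never observe a partially written file.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as tmp:
        tmp.write(data)
    os.rename(tmp_path, path)
    return get_file_id(path)
```

A writer holding a stale ID gets an error instead of silently overwriting a newer version, which is how the first writer "wins" in the scenario above.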
Add a "safe" file wrapper over WriteFile
Add functions to read and compare file 'ID's
LUSetInstanceParams: Remove unused attribute
“os_new” is not used anywhere, removing it.
Adding backend method to wipe a block device
Allow to specify wipe command and flags at configure time
Fix remote imports
A simple typo…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Fix typo introduced in 8d8c4ef
Commit 8d8c4ef broke instance reinstall with different OS, due to an attribute typo.
Fix clearing of the default iallocator
And also update the man page.
gnt-instance reinstall: Allow overriding OS parameters
This allows OS installation scripts to make use of special parameters, e.g. to retain some data on reinstallation.
The RAPI resource is not updated as it takes all parameters via the query string and encoding arbitrary data in a query string is tricky....
Add option to ignore offline node on instance start/stop
In some cases it can be useful to mark an instance as started or stopped while its primary node is offline. With this patch, a new option, “--ignore-offline”, is introduced to “gnt-instance start” and “… stop”....
utils: Add function to find items in dictionary using regex
This basically extracts a small piece of code from ganeti-rapi and puts it into a utility function. RAPI resources are found using a dictionary in which the keys can either be static strings or compiled regular...
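A lookup over a dictionary whose keys are either plain strings or compiled regular expressions could look roughly like the following. The function name, the return shape and the resource names in the usage note are assumptions made for this sketch, not necessarily what the utility function in utils does.

```python
import re


def find_match(data, name):
    """Return (value, groups) for the first key matching name, else None."""
    # Static string keys are matched exactly and take precedence.
    if name in data:
        return (data[name], [])
    for key, value in data.items():
        # Compiled patterns expose a "match" method; plain strings do not.
        if hasattr(key, "match"):
            m = key.match(name)
            if m:
                return (value, list(m.groups()))
    return None
```

With a table such as {"/version": "R_version", re.compile(r"^/2/instances/([\w.-]+)$"): "R_2_instances_name"} (illustrative names), a static path resolves directly, while a pattern key also yields the captured instance name.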
Let gnt-cluster support prealloc_wipe_disks
This includes a new option for gnt-cluster init and appropriate output in gnt-cluster info, though gnt-cluster modify is not yet prepared.
http.client: Disable SSL session ID cache
This patch disables the SSL session ID cache for all cURL operations. This is needed because http.HttpBase's PyOpenSSL implementation does not currently set a context using SSL_set_session_id_context(3SSL); cURL tries to re-use the session ID and, according to...
http.auth: Fix docstring error
This was missing from commit 2287b920.
Merge branch 'stable-2.2'
Merge branch 'stable-2.2' into devel-2.2
Fix compatibility with Pyinotify 0.8
I didn't know why the code previously used “pyinotify.EventsCodes.ALL_FLAGS” instead of using the flags from “pyinotify.EventsCodes” directly. Turns out that Pyinotify 0.8 has them in “pyinotify”, not “pyinotify.EventsCodes”....
Extract base class from SingleFileEventHandler
The base class can contain code useful to other inotify users. As it is, “SingleFileEventHandler” cannot be used in ganeti-rapi, therefore it'll use its own small inotify handler class based on this base class....
http.auth.ReadPasswordFile: Don't read file directly
Reading the file before this function allows for better error reporting.
Move the parameter types to their own module
This is for cleanup, and for later reuse in other parts of the code (outside of LUs).
"Fix" handling of old software versions on startup
Currently, masterd startup with old software versions is very confusing for users: we present two tracebacks, with a message in the middle about "version mismatch". This can lead to users believing that all that needs...
Export more information via LUQueryInstances/RAPI
Currently, the custom instance parameters (hv, be, nicp) are only queryable via LUQueryInstanceData. LUQueryInstance returns only the filled parameters, thus its users (especially RAPI) have no way to know...
Set list of trusted SSL CAs for client to verify
As per SSL_CTX_set_client_CA_list(3SSL), set the list of acceptable CAs advertised to SSL clients to include the server's own certificate. This evidently fixes the pycurl/gnutls RPC client.
During the TLS Handshake, when client verification is requested, the...
Show instance state in instance console failures
The current message is not entirely clear, as it doesn't show the reason why the instance is not running.
Fix epydoc errors
And sorry!
jqueue: Fix bug when cancelling jobs
If a job was cancelled while it was waiting for locks, an assertion would've failed. This patch fixes the problem and provides a unit test to check for this situation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
mcpu: Raise directly in _AcquireLocks
Removes code duplication.
jqueue/gnt-job: Add job priority fields for display
These fields can help with debugging.
jqueue: Resume jobs from “waitlock” status (2nd try)
Commit 5ef699a0e had to roll back an earlier attempt at implementing this. With the improved job queue processor, this is finally possible.
Add prealloc_wipe_disks as a cluster-wide configuration variable
This is the first step for the support of wiping block devices prior to creation of the instance.
Conflicts: lib/rpc.py (trivial, copyright header)
RPC: disable curl's Expect header
This patch solves the very slow (~8-9 seconds) gnt-instance modify behaviour. Well, it solves in general the slow RPC behaviour, but it was most visible in that LU.
It seems that curl's behaviour with regard to file uploads (via PUT) and...
jqueue, CancelJob: Check status only once per call
This simplifies the code a bit--the status is only checked once.
Fix a rare bug in StartDaemonChild and GenericMain
I've seen cases where the result from str(sys.exc_info()[1]) is ""; this breaks the error reporting as the parent relies on non-empty error messages to properly detect child status (otherwise it will try to read...
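A guard against empty exception messages might look like the sketch below. The helper name and the fallback text are made up for illustration; the commit's actual fix is not shown in this log.

```python
def format_exception_message(err):
    """Return a message for err that is guaranteed to be non-empty."""
    msg = str(err)
    if not msg:
        # Some exceptions stringify to ""; fall back to the class name so
        # the parent always receives something to report.
        msg = "Unknown error: %s" % err.__class__.__name__
    return msg
```

For example, str(ValueError()) is the empty string, so the fallback branch is what keeps the parent's status detection working in that case.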
Enhance the error reporting
Since daemon startup errors will often be related to socket errors, it makes sense to change the original reporting:
Error when starting daemon process: "(98, 'Address already in use')"
Into:
Error when starting daemon process: 'Socket-related error: Address...
Change daemon.GenericMain/utils.Daemonize workflow
This patch copies the pipe-based error reporting functionality from utils.StartDaemon (I gave up for now on trying to merge the two).
This patch will fix two longstanding bugs:
- if we fork, we lose all error reporting from the child to the original...
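The general technique of reporting child errors to the parent through a pipe can be sketched as follows (POSIX only). This is a simplified illustration with a hypothetical function name, not the daemon.GenericMain or utils.Daemonize code: here the child exits right after its preparation step, whereas a real daemon would close the pipe and keep running.

```python
import os


def run_child_reporting_errors(prepare):
    """Fork and run prepare() in the child.

    Returns "" if prepare() succeeded, otherwise the child's error message.
    """
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report any failure through the pipe, then exit.
        os.close(rfd)
        status = 1
        try:
            prepare()
            status = 0
        except Exception as err:
            os.write(wfd, (str(err) or "unknown error").encode("utf-8"))
        finally:
            # Never return into the caller's code from the child.
            os._exit(status)
    # Parent: read until the child closes its end of the pipe.
    os.close(wfd)
    chunks = []
    while True:
        data = os.read(rfd, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(rfd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode("utf-8")
```

An empty read result means the child finished its preparation without writing anything, so the parent can treat that as success and report any non-empty text as the startup error.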
Change utils.GenericMain protocol
Currently, GenericMain does a two-staged workflow:
- Check, before forking
- then Exec, after forking
This means we don't have any possibility to treat preparation work (before the daemon is ready for work) different from the actual work....