Iustin Pop [Tue, 10 Feb 2009 14:43:57 +0000 (14:43 +0000)]
Convert blockdev_assemble rpc to (status, data)
This converts the RPC call blockdev_assemble to the new-style result
format. Note that we won't usually have error information, but it's the
first step toward it.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 13:40:59 +0000 (13:40 +0000)]
RAPI: fix a pylint warning
Child classes of _R_TAGS must define TAG_LEVEL, but for good style let's
define it here as well, to at least ensure we don't get an 'Unknown
attribute' exception.
Of course, this also silences a pylint warning.
Reviewed-by: amishchenko
Guido Trotter [Tue, 10 Feb 2009 11:59:17 +0000 (11:59 +0000)]
LUSetInstanceParams: use the correct hvparams
In LUSetInstanceParams we used to save the dict without defaults for the
instance params as hv_inst, but to use the populated one for the
instance (hv_new). Fixing this leads to instances without all the
parameters set.
Reviewed-by: iustinp
Guido Trotter [Tue, 10 Feb 2009 10:53:22 +0000 (10:53 +0000)]
KVM: Correct CheckParameterSyntax docstring
The comment is not really true anymore, as we have a lot of parameters
nowadays.
Reviewed-by: iustinp
Guido Trotter [Tue, 10 Feb 2009 10:53:08 +0000 (10:53 +0000)]
KVM: Fix _CallMonitorCommand error message
1) Only instance_name is available
2) There was a missing string parameter
Reviewed-by: iustinp
Iustin Pop [Tue, 10 Feb 2009 08:13:39 +0000 (08:13 +0000)]
Fix one more RAPI QA test
This was skipped in the previous QA patch.
Reviewed-by: imsnah
Guido Trotter [Mon, 9 Feb 2009 15:17:15 +0000 (15:17 +0000)]
KVM: Add usb mouse type parameter
In some cases 'mouse' may work better than 'tablet', so we'll handle
both by allowing the user to specify a parameter. By default no mouse is
used.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:59 +0000 (15:16 +0000)]
KVM: allow netboot
With this patch we allow KVM instances to be booted off the network.
The only issue is that this is not compatible with virtio nics, so
we disallow them when booting from the net.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:47 +0000 (15:16 +0000)]
KVM: actually support different nic types
When executing the KVM runtime we load the nic type from the runtime
hvparams and use it to specify the nic model type. As for the disk we
translate the DEV_PARAVIRTUAL type to 'virtio'.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:34 +0000 (15:16 +0000)]
KVM: export hvparams in the runtime
They'll be used to set the nic type when we execute the runtime, since
the nics are processed later. We need to save the hvparams because we
want to use the same one as when we saved the runtime, rather than use
the current instance ones, to avoid applying only some changed
parameters when the runtime is loaded.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:20 +0000 (15:16 +0000)]
KVM: actually support different disk types
By passing the relevant if= value to the disk we support different disk
types. The only change is that we'll translate "paravirtual" to
"virtio" to keep only one "paravirtualized" value around ganeti. The
if= value is calculated outside the disks loop, as it's the same for all
disks (as currently ganeti doesn't support per-disk params).
Reviewed-by: iustinp
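A minimal sketch of the disk type handling described in the entry above, not the actual Ganeti code. The constant and function names are hypothetical; only the "paravirtual" to "virtio" translation and the computation of the if= value outside the disk loop come from the commit message.

```python
# Hypothetical constant and helper names, used for illustration only.
DEV_PARAVIRTUAL = "paravirtual"


def disk_interface(disk_type):
    """Map the cluster-wide disk type to the value passed after 'if='."""
    if disk_type == DEV_PARAVIRTUAL:
        return "virtio"
    return disk_type


def build_drive_args(disk_paths, disk_type):
    """Return a flat list of '-drive' arguments for the given disk paths."""
    # Computed once, outside the loop: all disks share the same type,
    # since per-disk parameters are not supported.
    if_val = "if=%s" % disk_interface(disk_type)
    args = []
    for path in disk_paths:
        args.extend(["-drive", "file=%s,%s" % (path, if_val)])
    return args
```

For example, build_drive_args(["/dev/xenvg/disk0"], "paravirtual") yields a single -drive argument ending in if=virtio.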
Guido Trotter [Mon, 9 Feb 2009 15:16:07 +0000 (15:16 +0000)]
Xen-HVM: Improve the invalid disk/nic type error
Copy the message from the KVM one, adding a missing 'the' and a list of
possible values, to help the user in his decision.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:55 +0000 (15:15 +0000)]
KVM: parameters for different disk and nic types
- Add a bunch of NIC and disk types
- Specify which ones are valid disks and nics for KVM (the new ones
together with some of the old ones)
- Add the default values (paravirtual)
- Allow the disk and nic types as parameters and check their validity
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:40 +0000 (15:15 +0000)]
Rename the device type constants
These are not HVM specific, so have been given an HT generic name.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:26 +0000 (15:15 +0000)]
s/HT_HVM_VNC_BASE_PORT/VNC_BASE_PORT/g
The VNC base port has nothing to do with HVM, and is general to
VNC itself, so we're removing the HT_HVM prefix from the constant.
Reviewed-by: iustinp
Iustin Pop [Mon, 9 Feb 2009 14:04:08 +0000 (14:04 +0000)]
Add a new instance query flag ‘disk_usage’
This patch adds a new instance query flag called disk_usage that
retrieves the overall space used by an instance on each of its nodes.
This can be used when balancing the cluster or checking N+1 status.
The flag is also exported in RAPI. Note the flag is currently broken for
file-based instances, as it represents the amount of space in the
cluster volume group.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:57 +0000 (14:03 +0000)]
Uniformize some function names in backend.py
Currently, the names of the functions in backend.py that are actually
RPC procedures and are called from ganeti-noded do not correspond to
the RPC names. This makes it hard to actually see which functions are
exported and which functions are internal to backend.
This patch renames all blockdevice-related functions in backend.py to match
the name of the RPC call (without the ‘call’ or ‘perspective’ prefix).
This should make it easier to grep for a given function called in
cmdlib, without having to open and check in ganeti-noded what backend
function it corresponds to.
The patch also does two minor extra cleanups (rename a variable and
change a logging level).
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:47 +0000 (14:03 +0000)]
bdev: add and use two utility functions
This patch adds two utility functions for raising BlockDeviceError
exceptions and for running functions while ignoring this error. Most of
the manual “raise errors.BlockDeviceError” cases are converted to
_ThrowError, as this makes the code clearer.
We also change most of the DRBD error messages to include the minor
number because with the parallel execution of commands it's no longer
possible to identify the failed DRBD just from the timestamp, and the
minor number can be mapped back to the instance more easily.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:38 +0000 (14:03 +0000)]
rpc.call_blockdev_find: convert to (status, data)
This patch converts the call_blockdev_find - which searches for block
devices and returns their status - to the (status, data) format. We also
modify the backend function name to match the rpc call.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 10:41:21 +0000 (10:41 +0000)]
Export the cpu nodes and sockets from Xen
This is a hand-picked forward patch of commit 1755 on the 1.2 branch
(hand-picked since the trees diverged too much since then):
The patch changed the xen hypervisor to compute the number of cpu
sockets/nodes and enabled the command line and the RAPI to show this
information (for RAPI it is enabled by default in node details; for gnt-node
one can use the new “cnodes” and “csockets” fields).
Originally-Reviewed-by: ultrotter
For the KVM and fake hypervisors, the patch just exports 1 for both
nodes and sockets. This can be fixed by looking at the
/sys/devices/system/cpu/cpuN/topology directories, and computing the
actual information, but that should be done in a separate patch.
Reviewed-by: imsnah
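As a rough illustration of the follow-up suggested in the entry above, the sketch below counts CPU sockets by reading the physical_package_id file found in each cpuN/topology directory on Linux. The function name and the fallback to 1 are assumptions; this is not code from the patch, and it does not attempt to count NUMA nodes.

```python
import glob
import os


def count_cpu_sockets(sys_cpu_dir="/sys/devices/system/cpu"):
    """Count distinct physical_package_id values under cpuN/topology."""
    pattern = os.path.join(sys_cpu_dir, "cpu[0-9]*", "topology",
                           "physical_package_id")
    packages = set()
    for path in glob.glob(pattern):
        with open(path) as handle:
            packages.add(handle.read().strip())
    # Assumption: report 1 when nothing can be read, matching the value
    # the patch exports for the KVM and fake hypervisors.
    return len(packages) or 1
```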
Iustin Pop [Mon, 9 Feb 2009 10:31:44 +0000 (10:31 +0000)]
Fix handling OS errors in AddOSToInstance
This patch fixes the error handling in the add OS to instance function
with regard to invalid OSes. Previously, we didn't handle any such
errors, with the end result that the user would have to look in the node
daemon log.
The patch also renames the function to match the RPC call
name.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:38 +0000 (09:24 +0000)]
backend.DrbdAttachNet: don't ignore Open() errors
Currently the return value or errors from the block device Open() method
are ignored. This patch catches any BlockDeviceErrors and returns a
well-formatted result.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:29 +0000 (09:24 +0000)]
cmdlib: simplify some rpc error handling cases
By using the RemoteFailMsg() or the payload field of RpcResult, we can
simplify a few functions in cmdlib.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:21 +0000 (09:24 +0000)]
RpcResult: add a new payload field
For results which use the (status, payload) response type, it's easier
to define a ‘payload’ field on the result holding the payload than to
extract it using “data[1]” in the caller code.
Reviewed-by: ultrotter
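The idea in the entry above can be sketched as follows. This is a simplified stand-in class, not the real RpcResult; the constructor arguments and the choice to return None for failed or malformed results are assumptions.

```python
class RpcResult(object):
    """Simplified stand-in showing only the 'payload' convenience field."""

    def __init__(self, data=None, failed=False):
        self.data = data
        self.failed = failed

    @property
    def payload(self):
        """Second element of a (status, payload) response, else None."""
        if self.failed or not isinstance(self.data, (tuple, list)):
            return None
        if len(self.data) != 2:
            return None
        return self.data[1]
```

Caller code can then read result.payload instead of indexing result.data[1] directly.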
Iustin Pop [Mon, 9 Feb 2009 09:24:10 +0000 (09:24 +0000)]
LUCreateInstance: only set running flag at the end
In lockless queries, it's better if we see the instance in ADMIN_down
rather than ERROR_down during the time it's installed. As such, we
change the LU to only mark the instance 'up' at the time we are ready to
start it.
Reviewed-by: ultrotter
Guido Trotter [Sat, 7 Feb 2009 09:04:31 +0000 (09:04 +0000)]
KVM: don't boot from a virtio cdrom
Apparently it's not supported. Also add -boot command line parameters
to kvm, since they seem to help booting from the right place. Everything
will still only work when not using a kernel, but well... :)
Reviewed-by: iustinp
Guido Trotter [Sat, 7 Feb 2009 09:04:15 +0000 (09:04 +0000)]
KVM: don't boot from cdrom with no cdrom
Reviewed-by: iustinp
Guido Trotter [Sat, 7 Feb 2009 09:04:00 +0000 (09:04 +0000)]
Support cdrom image and boot order for KVM
The cdrom image has the same meaning as in Xen HVM, and so does
boot_order, even though it has a slightly different syntax, and uses the
value 'disk' to boot from disk and 'cdrom' to boot from cdrom.
Reviewed-by: iustinp
Guido Trotter [Sat, 7 Feb 2009 09:03:44 +0000 (09:03 +0000)]
Get rid of constants.HT_HVM_DEFAULT_BOOT_ORDER
Confusingly, as a leftover from 1.2, there was a
constants.HT_HVM_DEFAULT_BOOT_ORDER constant, with a value opposite to
the default HV_BOOT_ORDER hv param that got enabled only if
HV_BOOT_ORDER was set to None. Since setting it to None is very
hard/impossible for the user, and we didn't handle other "empty" values
(False, ''), we'll just force the parameter to have a valid value (after
all we have a default, and that's the way we use hvparams) and get rid
of the old constant altogether.
Reviewed-by: iustinp
Iustin Pop [Fri, 6 Feb 2009 13:06:02 +0000 (13:06 +0000)]
QA: switch RAPI to https
Since we now use SSL for RAPI by default, we need to switch the QA
tests to SSL too.
Reviewed-by: amishchenko
Iustin Pop [Fri, 6 Feb 2009 08:09:10 +0000 (08:09 +0000)]
Fix rapi job listing
This patch fixes a couple of issues with the job listing:
- in case of a non-existing job, nicely raise 404 instead of 500
- in the job detail listing, also list the job log, the job
timestamps, etc.
- the opcode migrate instance was missing its description field
Reviewed-by: imsnah
Iustin Pop [Thu, 5 Feb 2009 14:09:06 +0000 (14:09 +0000)]
rapi: fix SSL mode and use SSL by default
This patch fixes the SSL mode (by actually constructing SSL parameters
from the command line options) and enables SSL by default; the old “-S”
option which enabled SSL is now changed to “--no-ssl”. The certificate
and key are by default pointing to the Ganeti auto-generated certificate
for rapi.
Reviewed-by: imsnah
Iustin Pop [Thu, 5 Feb 2009 14:08:56 +0000 (14:08 +0000)]
Small improvement to the init.d example file
The start_action function is changed so that it can be called with
arguments - this could be used to parse a defaults file, etc.
Reviewed-by: imsnah
Guido Trotter [Thu, 5 Feb 2009 13:37:00 +0000 (13:37 +0000)]
KVM: add VNC TLS and X509 parameters
With these parameters VNC for KVM is able to be protected by tls,
optionally with an x509 certificate, and optionally verifying the
client as well. Additionally in this patch we limit the bind address to
being a directory, rather than a file or a directory, for simplicity, as
it allows for the same level of control anyway.
Reviewed-by: iustinp
Guido Trotter [Thu, 5 Feb 2009 13:36:43 +0000 (13:36 +0000)]
KVM: allow binding vnc to a file
Before we forced the VNC_BIND_ADDRESS to be an ip. Now we also accept a
path, and bind the instance to it, or to a file in it if it's a
directory.
Reviewed-by: iustinp
Iustin Pop [Thu, 5 Feb 2009 10:45:32 +0000 (10:45 +0000)]
Fix some issues for lockless queries
This patch converts some more jobs with only queries into cheaper luxi
queries (no job created), and fixes some fallout from the lockless
queries changes.
Reviewed-by: ultrotter
Iustin Pop [Thu, 5 Feb 2009 09:47:09 +0000 (09:47 +0000)]
Revive RAPI QA tests for 2.0-style RAPI
This patch fixes the RAPI QA tests to work with today's RAPI code and
also does some other minor improvements:
- QA: only create the cluster if so configured (‘create-cluster’ key),
this allows running parts of the QA suite against existing clusters
- export the “hvparams” for instances in RAPI
Reviewed-by: imsnah
Iustin Pop [Wed, 4 Feb 2009 19:14:27 +0000 (19:14 +0000)]
rapi: fix 'bulk' processing and add locking option
This patch fixes the 'bulk' parameter (before any non-empty
specification was considered True, in conflict with the documentation,
i.e. bulk=0 still did bulk queries).
The patch also adds optional locking on the instance/node listing (does
not have effect when we only list names).
Reviewed-by: imsnah
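The 'bulk' fix in the entry above amounts to interpreting the value rather than only testing that it is non-empty. A minimal sketch, assuming the query string is parsed with the standard library; the function name and the handling of unparsable values are hypothetical:

```python
from urllib.parse import parse_qs


def parse_bulk(query_args):
    """Return True only when 'bulk' is present and a non-zero integer."""
    values = query_args.get("bulk")
    if not values:
        return False
    try:
        return bool(int(values[0]))
    except ValueError:
        # Assumption: a value that is not an integer disables bulk mode.
        return False
```

With this, parse_bulk(parse_qs("bulk=0")) is False, whereas a plain truthiness test on the string "0" would have enabled bulk output.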
Iustin Pop [Wed, 4 Feb 2009 19:14:14 +0000 (19:14 +0000)]
rapi: cleanup and update to latest 2.0 API
This patch cleans up and updates the RAPI interface:
- queries are changed to luxi queries instead of jobs, where possible
- since we changed the API version, we remove the old-style attributes
(sda_size, ip, etc.) and replace them with 2.0 style
- a small optimization in the instance and node list, don't query
twice the names in bulk output
- switch the instance and node lists to no locking
Reviewed-by: imsnah
Iustin Pop [Wed, 4 Feb 2009 15:11:58 +0000 (15:11 +0000)]
Enable lockless node queries
Similar to the instance list, this patch enables lockless node queries.
“gnt-node list” now accepts the “--sync” flag which enables locking; the
default is lockless.
Reviewed-by: imsnah
Iustin Pop [Wed, 4 Feb 2009 15:11:46 +0000 (15:11 +0000)]
rapi: fix authentication and queries
For queries, we don't want to require authentication. We fix this by adding an
override GetAuthRealm in the rapi daemon.
We also fix a method name.
Reviewed-by: imsnah
Iustin Pop [Wed, 4 Feb 2009 15:11:34 +0000 (15:11 +0000)]
Add one new luxi query: cluster info
This is the last query that RAPI executes via opcodes and is purely
static (config values only). As such, we can convert it safely to a
query instead of job.
Reviewed-by: imsnah
Iustin Pop [Wed, 4 Feb 2009 10:31:00 +0000 (10:31 +0000)]
ssconf: add some more keys and some fixes
This patch adds the online node list and instance list to the ssconf
keys. In order to distribute the instance list correctly, we need to
update the cluster serial number on instance additions and removals.
The patch also changes the permissions on the ssconf files to be 0444:
- no write for root, in order to signal that these files should not be
modified
- read for everyone since the files don't contain sensitive data
anymore (and permissions can be controlled via the parent directory
if needed)
The patch also fixes a small typo on gnt-cluster.
Reviewed-by: ultrotter
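The permission change in the entry above can be illustrated with a small helper that writes a file and leaves it with mode 0444. The helper name and the write-to-temporary-then-replace approach are assumptions for this sketch, not a description of how the ssconf code does it.

```python
import os
import tempfile


def write_readonly_file(path, data):
    """Write 'data' to 'path' via a temporary file, leaving mode 0444."""
    directory = os.path.dirname(path) or "."
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
        # Read for everyone, write for nobody (not even the owner).
        os.chmod(tmp_name, 0o444)
        os.replace(tmp_name, path)
    except Exception:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```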
Iustin Pop [Wed, 4 Feb 2009 10:30:47 +0000 (10:30 +0000)]
Implement lockless query operations
This patch adds the framework for, and enables lockless OpQueryInstances. This
means that instances will be shown in ERROR_up or ERROR_down state, even though
this is not an error (but just an in-progress job).
The framework is implemented as follows:
- the OpQueryInstances, OpQueryNodes and OpQueryExports opcodes take
an additional “use_locking” flag which will denote whether to lock
or not; this patch only implements this for LUQueryInstances
- the luxi query functions take an additional argument use_locking
which is passed to the master daemon, and then passed to the above
opcodes
- cli.py exports a new SYNC_OPT command line option which implements
setting this flag to true
- except for gnt-instance list, which uses this option, and for
name-only queries (e.g. QueryNodes(fields=["names"])), all other
callers are setting this flag to True
- RAPI also sets the flag to True
The patch was tested with a continuous (0.2s sleep in-between)
gnt-instance list during a burnin, and no problems were observed.
Reviewed-by: ultrotter
Guido Trotter [Tue, 3 Feb 2009 16:05:10 +0000 (16:05 +0000)]
KVM: Make GetAllInstancesInfo concurrency-safe
Or actually more so. If this function gets called while instances get
shut down, it might try to report information on instances which don't
exist. Try to fail gracefully if that happens, by just skipping an
instance which has disappeared in the meantime.
Reviewed-by: iustinp
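A sketch of the "skip instances that vanished" pattern from the entry above. The two callables and the use of LookupError to signal a missing instance are hypothetical; the real hypervisor code has its own way of listing and querying instances.

```python
def get_all_instances_info(list_names, get_info):
    """Collect info for every listed instance, skipping vanished ones.

    'list_names' and 'get_info' are hypothetical callables; 'get_info'
    is assumed to raise LookupError when the instance no longer exists.
    """
    data = []
    for name in list_names():
        try:
            info = get_info(name)
        except LookupError:
            # The instance was shut down between listing and querying.
            continue
        if info:
            data.append(info)
    return data
```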
Guido Trotter [Tue, 3 Feb 2009 16:04:47 +0000 (16:04 +0000)]
Correct a typo in ReadPidFile's docstring
Reviewed-by: iustinp
Iustin Pop [Tue, 3 Feb 2009 15:42:42 +0000 (15:42 +0000)]
Fix unittest encoding breakage
Because we now sanitize the output from environment
scripts, the unittest needs to be adjusted. My bad for not checking it.
Reviewed-by: imsnah
Iustin Pop [Tue, 3 Feb 2009 14:45:53 +0000 (14:45 +0000)]
Allow gnt-node evacuate to use an iallocator
This is a partial implementation of fully automated node evacuation:
we allow passing an iallocator and all instance replace-disks will be
executed via that iallocator.
The individual OpReplaceDisks opcodes are submitted in a single job,
which causes them to be executed serially and thus keeps the iallocator
runs consistent. This also changes the behaviour so that the first
reallocation that failed will stop all the reallocations.
Reviewed-by: ultrotter
Iustin Pop [Tue, 3 Feb 2009 14:45:43 +0000 (14:45 +0000)]
Add gnt-node migrate
This is a (modified) forward-port of commit 1190 on the 1.2 branch:
This is the same as gnt-node failover, and is also a cut&paste of its
code (almost). It will be really really useful to quickly empty a
healthy node. I can be persuaded to merge MigrateNode and FailoverNode
in a common codebase, but could also forget about it and submit it if
nobody cares.
Reviewed-by: iustinp
The original MigrateNode function has been converted to the 2.0 style
(cli.JobExecutor). Also commit 2076 has been added that fixes a missing
opcode parameter.
Original-author: ultrotter
Reviewed-by: ultrotter
Iustin Pop [Tue, 3 Feb 2009 14:45:32 +0000 (14:45 +0000)]
An attempt at fixing some encoding issues
This patch unifies the hardcoded re-encoding attempts into a single
function in utils.py. This function is used to take either a unicode or
str object and convert it to an ASCII-only str object which can be safely
displayed and transmitted.
We then replace the current manual re-encodings with this function. In
mcpu we stop re-encoding the hooks output and instead we do it right at
the hook generation in backend.py.
This passes on my 'custom' lvs output with non-ASCII chars. But there
are probably other places we will need to fix.
Reviewed-by: ultrotter
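The single re-encoding function described in the entry above targeted Python 2's unicode and str types. A loose Python 3 analogue, taking str or bytes and returning an ASCII-only str, might look like the sketch below; the function name and the choice of backslash escapes for non-ASCII data are assumptions.

```python
def safe_encode(text):
    """Return an ASCII-only str for 'text', which may be str or bytes."""
    if isinstance(text, bytes):
        # Assumption: input bytes are UTF-8; undecodable bytes are kept
        # as backslash escapes instead of raising an error.
        text = text.decode("utf-8", errors="backslashreplace")
    # Non-ASCII characters become escapes such as \xe9 or \u20ac.
    return text.encode("ascii", errors="backslashreplace").decode("ascii")
```

Applying the function in one place, where hook output is generated, avoids repeating ad hoc re-encoding in every consumer.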
Iustin Pop [Tue, 3 Feb 2009 14:45:14 +0000 (14:45 +0000)]
lvmstrap: allow removable devices too
For testing or just in case a device is exported by a bad driver with
the 'removable' flag set, this patch adds a flag to lvmstrap that allows
it to use these devices too.
Reviewed-by: ultrotter
Iustin Pop [Tue, 3 Feb 2009 14:45:03 +0000 (14:45 +0000)]
Documentation: update the gnt-os manpage
This patch updates the gnt-os man page and the common footer page for
ganeti 2.0.
Reviewed-by: ultrotter
Iustin Pop [Tue, 3 Feb 2009 10:55:30 +0000 (10:55 +0000)]
Small patch for handling errors in node add
This small patch hopefully fixes the handling of ssh verify errors in
node add (note: untested).
Reviewed-by: ultrotter
Iustin Pop [Tue, 3 Feb 2009 10:55:19 +0000 (10:55 +0000)]
ssh: more details on failure
In case we fail without output from the ssh command, we should at least
add the exit code or any other failure reason to the error message, and
log it and the cmdline used to the node daemon log.
Reviewed-by: imsnah
Guido Trotter [Tue, 3 Feb 2009 10:45:12 +0000 (10:45 +0000)]
Give a sane permission to the known_host file
Reviewed-by: iustinp
Iustin Pop [Mon, 2 Feb 2009 14:49:10 +0000 (14:49 +0000)]
A couple of small changes to the OS environment
This patch correctly exports the mode of disks (rw/ro) and also exports
the instance OS.
Reviewed-by: imsnah
Iustin Pop [Mon, 2 Feb 2009 11:23:48 +0000 (11:23 +0000)]
Whitespace change: bad indentation in constants.py
This patch only changes some indentation in constants.py.
Reviewed-by: imsnah
Iustin Pop [Mon, 2 Feb 2009 11:23:40 +0000 (11:23 +0000)]
Return error messages in node add ssh handling
When the rpc call node_add fails, we don't have any error message. This
patch changes the call to return (status, data) so that the user can see
the correct error message.
Reviewed-by: imsnah
Guido Trotter [Sun, 1 Feb 2009 09:48:37 +0000 (09:48 +0000)]
gnt-instance: support no_PARAMETER value
Since parameters get set to False if a no_ is prefixed don't try to
interpret those boolean values, and pass them unchanged.
Reviewed-by: iustinp
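To illustrate the entry above: a key prefixed with no_ yields a real boolean False, so later value conversion has to pass booleans through instead of treating them as strings. Both function names below are hypothetical, and the accepted spellings of true/false are assumptions.

```python
def parse_key_values(spec):
    """Parse 'a=1,no_b,c' into a dict; 'no_b' becomes {'b': False}."""
    result = {}
    for item in spec.split(","):
        if not item:
            continue
        if "=" in item:
            key, value = item.split("=", 1)
            result[key] = value
        elif item.startswith("no_"):
            result[item[len("no_"):]] = False
        else:
            result[item] = True
    return result


def to_bool(value):
    """Interpret 'true'/'false' strings; pass real booleans unchanged."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError("Invalid boolean value: %r" % (value,))
```

Without the isinstance check, to_bool(False) would fail when calling .lower() on a boolean.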
Guido Trotter [Sun, 1 Feb 2009 09:48:23 +0000 (09:48 +0000)]
LUQueryClusterInfo: filter hvparams
We don't need to show hvparams for hypervisors which are not enabled on
the cluster.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:51:58 +0000 (15:51 +0000)]
KVM: advise about VNC support on GetShellCommand
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:51:44 +0000 (15:51 +0000)]
KVM: enable VNC if a VNC_BIND_ADDRESS is defined
We'll also enable a tablet usb device, as suggested by the kvm man page.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:51:29 +0000 (15:51 +0000)]
KVM: Allow the HV_VNC_BIND_ADDRESS parameter
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:51:14 +0000 (15:51 +0000)]
LUAddNode: copy the vnc password file also for KVM
Before we used to copy the file if xen-hvm was enabled on the cluster,
now we'll do that if any enabled hypervisor is in the new HTS_USE_VNC
group.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:51:00 +0000 (15:51 +0000)]
Add HT_KVM to HTS_REQ_PORT
HT_KVM doesn't technically require a port, but if it has one it can give
vnc displays to instances.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:50:38 +0000 (15:50 +0000)]
KVM: make the kernel and initrd arguments optional
Under KVM we don't strictly need a kernel and initrd. If some are passed
we'll use them, otherwise the guest OS will need to behave as fully
native, and have its own boot loader and kernel.
The root_path hypervisor parameter becomes mandatory only if a kernel is
specified.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:47:21 +0000 (15:47 +0000)]
KVM: add the HV_SERIAL_CONSOLE parameter
Up until now a KVM instance was forced to have a serial port.
With this change this is no longer mandatory, by default we'll use one,
but if the HV_SERIAL_CONSOLE parameter is set to False we'll do without.
Reviewed-by: iustinp
Guido Trotter [Thu, 29 Jan 2009 15:47:06 +0000 (15:47 +0000)]
GetShellCommand: get hvparams and beparams
Sometimes the hypervisor will use the instance hv and/or be parameters
to determine the best shell command. This is currently not possible,
though, as the instance hv/beparams are not filled, so we have to
pass the filled versions separately.
Reviewed-by: iustinp
Iustin Pop [Thu, 29 Jan 2009 15:09:21 +0000 (15:09 +0000)]
Implement software release version checks too
Currently the LUVerifyCluster only reports the protocol version changes,
not software ones. This is useful to know/monitor, so we add this too as
a warning.
Reviewed-by: ultrotter
Iustin Pop [Thu, 29 Jan 2009 15:09:11 +0000 (15:09 +0000)]
gnt-instance list: accept input names
Currently gnt-instance list will refuse to take arguments, and always
return the full list of instances. This patch allows it to pass names to
LUQueryInstances, so that we restrict the input to a given set of
instances.
Reviewed-by: ultrotter
Iustin Pop [Thu, 29 Jan 2009 15:08:57 +0000 (15:08 +0000)]
LUQueryInstances: keep the given order of names
Currently LUQueryInstances keeps the ordering of instances only in some cases,
and in others it will reorder the list. This patch fixes this by more clearly
separating the various cases (names passed or not and locking or not locking),
so that the output list is in the same order as always.
Of course, this disables the sorting when arguments are passed.
Reviewed-by: ultrotter
Iustin Pop [Thu, 29 Jan 2009 15:08:46 +0000 (15:08 +0000)]
locking.LockSet: don't modify input arguments
Currently LockSet.acquire() sorts its input argument in place if it's a
list. This is not good, since callers might depend on a specific
ordering of the input data, and this is a 'hidden' modification.
We fix it by simply using a sorted copy, instead of sorting in place.
Reviewed-by: ultrotter
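The difference between sorting in place and using a sorted copy, as described in the entry above, can be shown with a short sketch. The function and its 'acquire_one' callback are hypothetical and stand in for the real lock acquisition.

```python
def acquire_in_order(names, acquire_one):
    """Acquire locks in sorted order without reordering the caller's list."""
    acquired = []
    # sorted() returns a new list; names.sort() would modify the input.
    for name in sorted(names):
        acquire_one(name)
        acquired.append(name)
    return acquired
```

A caller that passes ["node2", "node1"] still sees its list in that order afterwards, while the locks are taken as node1, then node2.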
Iustin Pop [Thu, 29 Jan 2009 15:08:34 +0000 (15:08 +0000)]
Re-wrap some lines to keep them under 80 chars
This non-code change rewraps some lines in locking.py to keep them under
80 chars.
Reviewed-by: ultrotter
Iustin Pop [Thu, 29 Jan 2009 15:08:24 +0000 (15:08 +0000)]
Check that instance exists before confirm. queries
Currently we ask the user for confirmation, and only afterwards (try to)
remove, failover or migrate the instance. This doesn't work nicely if
the instance doesn't exist, so we make a query for the instance before
the prompt, which will throw an error in case it doesn't exist.
Side-note: the way the query works today is not really nice. It would be
better if we could query explicitly for a missing instance name, so that
this is done cleaner (explicit check) instead of side-effect (throw
exception). We do add code for this explicit check, except that today it
won't be used actually.
Reviewed-by: ultrotter
Oleksiy Mishchenko [Thu, 29 Jan 2009 15:03:42 +0000 (15:03 +0000)]
RAPI: tag work
Generalize tag work for instances/nodes/cluster tag management.
Reviewed-by: iustinp
Oleksiy Mishchenko [Thu, 29 Jan 2009 15:03:00 +0000 (15:03 +0000)]
RAPI: rlib1 removal
The resources we still need moved to rlib2.
Reviewed-by: iustinp
Oleksiy Mishchenko [Thu, 29 Jan 2009 15:02:20 +0000 (15:02 +0000)]
RAPI: Implement /2 resource
Reviewed-by: iustinp
Oleksiy Mishchenko [Thu, 29 Jan 2009 14:52:41 +0000 (14:52 +0000)]
RAPI: Deprecate version Rapi version1
It is impossible to keep backward compatibility due to
significant changes in the Ganeti core.
Reviewed-by: iustinp
Iustin Pop [Wed, 28 Jan 2009 19:06:11 +0000 (19:06 +0000)]
Fix gnt-cluster modify -H and offline nodes
Reviewed-by: ultrotter
Iustin Pop [Wed, 28 Jan 2009 19:06:00 +0000 (19:06 +0000)]
Actually mark drives as read-only if so configured
This patch correctly marks the drives as read-only for Xen, and raises
an exception for KVM since it doesn't support read-only drives.
Reviewed-by: ultrotter
Iustin Pop [Wed, 28 Jan 2009 14:46:58 +0000 (14:46 +0000)]
Fix some issues related to job cancelling
This patch fixes two issues with the cancel mechanism:
- cancelled jobs show as such, and not in error state (we mark them as
OP_STATUS_CANCELED and not OP_STATUS_ERROR)
- queued jobs which are cancelled don't raise errors in the master (we
treat OP_STATUS_CANCELED now)
Reviewed-by: imsnah
Guido Trotter [Tue, 27 Jan 2009 16:44:38 +0000 (16:44 +0000)]
Xen: use utils.WriteFile for the instance configs
Also raise HypervisorError rather than OpExecError.
Reviewed-by: iustinp
Guido Trotter [Tue, 27 Jan 2009 16:44:23 +0000 (16:44 +0000)]
Xen: use utils.Readfile to read the VNC password
Also raise HypervisorError rather than OpExecError.
Reviewed-by: iustinp
Iustin Pop [Tue, 27 Jan 2009 15:41:38 +0000 (15:41 +0000)]
Implement disk verify checks in config verify
This patch adds a simple check that the 'mode' attribute of top-level disks is
correct. It does not recurse over children.
The framework could be extended with other checks in the future.
Reviewed-by: imsnah
Iustin Pop [Tue, 27 Jan 2009 15:41:26 +0000 (15:41 +0000)]
Fix the mode attribute of newly-created disks
Currently, only the LUSetInstanceParams correctly sets up the mode
attribute via a manual operation. We remove this and instead do the
correct setting in the generic _GenerateDiskTemplate function, so that
we set the mode correctly for all disk creations.
Reviewed-by: ultrotter
Iustin Pop [Tue, 27 Jan 2009 15:41:15 +0000 (15:41 +0000)]
Rework the multi-instance gnt commands
This patch changes the multi-instance gnt-* commands (gnt-instance
start/stop, gnt-node evacuate/failover) such that the individual
operations are submitted in parallel, ideally improving the speed of the
execution.
The patch does this by abstracting the job set functionality into a new
class in cli.py, that takes care of the job submit, job poll and error
handling.
Reviewed-by: ultrotter
Iustin Pop [Tue, 27 Jan 2009 15:41:01 +0000 (15:41 +0000)]
Fix single-job archiving (gnt-job archive)
This is a simple typo from the conversion to multi-job archiving.
Reviewed-by: imsnah
Guido Trotter [Tue, 27 Jan 2009 11:31:38 +0000 (11:31 +0000)]
KVM and Xen: add the HV_ROOT_PATH parameter
This parameter allows a different path to be passed to the instance
kernel. The new parameter is mandatory, and by default has the value of
the old hardcoded value for both kvm and xen.
Beta1 clusters will need to have this parameter added for their
instances to be able to boot.
Reviewed-by: iustinp
Guido Trotter [Tue, 27 Jan 2009 11:31:19 +0000 (11:31 +0000)]
KVM: implement GetShellCommandForConsole
This is a class method, because it calls _InstanceSerial, which is
another class method. The patch changes it to classmethod for all the
hypervisor classes.
Reviewed-by: iustinp
Guido Trotter [Tue, 27 Jan 2009 11:30:57 +0000 (11:30 +0000)]
KVM: classify _Instance{Monitor,Serial,KVMRuntime}
Those methods need nothing from the instantiated class, and just
manipulate strings, and fetch some class global variables, so they can
be classmethods.
Reviewed-by: iustinp
Iustin Pop [Mon, 26 Jan 2009 15:08:02 +0000 (15:08 +0000)]
Release 2.0 beta 1
Even though alpha started at 0, we release beta 1 first as we did for
1.2.
Reviewed-by: imsnah, ultrotter
Iustin Pop [Mon, 26 Jan 2009 12:34:59 +0000 (12:34 +0000)]
Update the NEWS documents for beta1
Also import the NEWS entries from the 1.2 branch which were added since
we created it.
Reviewed-by: ultrotter
Guido Trotter [Fri, 23 Jan 2009 17:02:26 +0000 (17:02 +0000)]
Xen and KVM: correct a typo when checking args
A 'be' was missing from the error string for both xen and kvm,
when the kernel or initrd path was not absolute.
Reviewed-by: imsnah
Iustin Pop [Fri, 23 Jan 2009 13:33:41 +0000 (13:33 +0000)]
Sort the instance names in batcher
In case we submit multiple instances via batcher, it's nicer to have them
sorted.
Reviewed-by: imsnah
Iustin Pop [Fri, 23 Jan 2009 13:33:32 +0000 (13:33 +0000)]
Fix batcher for 2.0-style disks and nics
This patch fixes the gnt-instance batch-create command, and in doing so
also slightly changes two other functions:
- we change utils.ParseUnit so that it accepts integer values also
(both ParseUnit(5) and ParseUnit("5") return the same value)
- a bridge 'None' in LUCreateInstance will be converted to the default
bridge; currently only missing bridges will be accepted to mean the
default one
The main changes to batcher were the change to variable number of disks
and NICs.
The patch also adds a batcher-instances.json example file copied from
the 1.2 branch and properly modified.
Reviewed-by: imsnah, killerfoxi
Iustin Pop [Fri, 23 Jan 2009 12:36:44 +0000 (12:36 +0000)]
Make iallocator work with offline nodes
This patch changes the iallocator framework to work with offline nodes
and to export them properly to plugins. It does this by exporting only
the static configuration data for those nodes, and not attempting to
parse the runtime data.
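A rough sketch of the idea described above, under stated assumptions: the
function name export_node and the dictionary keys ("offline",
"memory_free", "vg_free", "free_memory", "free_disk") are hypothetical and
do not claim to match the real iallocator data format.

```python
def export_node(static_info, runtime_info, offline):
    """Build the per-node dictionary handed to an allocator plugin.

    static_info: dict of configuration data (hypothetical keys).
    runtime_info: dict of live data, or None when the node is offline.
    """
    entry = dict(static_info)
    entry["offline"] = offline
    if not offline:
        # Runtime values are only parsed for nodes we can actually query.
        entry["free_memory"] = int(runtime_info["memory_free"])
        entry["free_disk"] = int(runtime_info["vg_free"])
    return entry
```

An offline node is therefore still visible to the plugin, but carries only
its static data plus the offline flag.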
The patch also fixes bugs in iallocator related to the RpcResult
conversion, changes the should_run to admin_up attribute name (as per
the internals change), and adds “-I” as a short option for
“--iallocator” in gnt-instance, gnt-backup and burnin.
Reviewed-by: ultrotter
Iustin Pop [Fri, 23 Jan 2009 12:36:28 +0000 (12:36 +0000)]
Remove checking of DRBD metadata for validity
Currently the DRBD code checks that the metadata devices are valid
before creation, initial disk attachment and add children.
However, the process for checking validity requires a free DRBD minor,
and this conflicts with parallel checking.
There are at least three possible solutions:
- serialize all checks, which means we reduce parallelism and need
extra locks
- don't pass a valid minor number, but one like “/dev/drbd256” (which
is invalid); this works for the current version of DRBD, but since it's
not guaranteed to remain so it doesn't look nice
- don't do the checking at all, and rely on “drbdsetup ... disk ...”
to fail by itself
The reason for checking metadata was that in 1.2, this was much cheaper
than trying to activate devices (and the subsequent iteration over the
minors). However, in 2.0, they have the same cost, so we can choose
option 3: just remove the explicit checking and rely on drbdsetup and
the kernel to fail.
Since DRBD8._InitMeta still requires a minor number, the two places
where this is run are handled as follows:
- Create: we just use our own (unused currently) minor number
- AddChildren: we keep using FindUnusedMinor, with the caveat that
this function (used by replace-disks -n ...) cannot yet be
parallelized
Reviewed-by: ultrotter
Iustin Pop [Fri, 23 Jan 2009 12:36:18 +0000 (12:36 +0000)]
Rework the execution model in burnin
This patch changes (significantly) the execution model in burnin:
- for all runs, (almost) all instance mods in a single Burn* procedure
are done as part of a job; so for example add disk, stop, remove
disk, start are no longer done as separate jobs but as a single job
consisting of four opcodes
- for parallel runs, all Burn* procedures except the rename (which
uses a single target name) run in parallel; before, only the
creation was done in parallel
- due to the single-job execution and also parallel execution, the
logging messages are no longer happening synchronously with the
execution, so they are more informative than an actual execution log
The end result is that burnin now properly tests multi-opcode jobs and
also tests all opcodes (except rename) for parallel execution.
Note: On a test cluster, parallelization reduces burnin time from 23m to
15m.
Reviewed-by: ultrotter
Iustin Pop [Fri, 23 Jan 2009 12:36:09 +0000 (12:36 +0000)]
Relax the restrictions on temporary DRBD minors
Currently the restrictions are too harsh: there is a time interval
between when an instance gets a new disk and when the disk is added to
the configuration, during which the restriction is not met. We solve
this by allowing temporary DRBD minors to match existing minors (for the
same instance), such that parallel creations/minor allocations are OK.
The change is done by moving the addition of temporary minors to the
minor map after the instance minors are computed, and only considering
them as duplicate if the instance name doesn't match.
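The relaxed rule can be sketched as follows. This is an illustration
only: add_temporary_minors and the tuple layout are hypothetical and are
not taken from the config module. It assumes the minor map has already
been filled from the instances in the configuration.

```python
def add_temporary_minors(minor_map, temporary_minors):
    """Merge temporary reservations into an already computed minor map.

    minor_map: {(node, minor): instance_name}, built from the configuration.
    temporary_minors: iterable of (node, minor, instance_name) tuples.
    Returns a list of (node, minor, requesting_instance, current_owner).
    """
    conflicts = []
    for node, minor, instance in temporary_minors:
        owner = minor_map.get((node, minor))
        if owner is not None and owner != instance:
            # Only a minor held by a *different* instance is a duplicate.
            conflicts.append((node, minor, instance, owner))
        else:
            minor_map[(node, minor)] = instance
    return conflicts
```

A temporary minor that matches a configured minor of the same instance is
accepted silently, which covers the window between allocating the minor
and writing the new disk to the configuration.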
Reviewed-by: ultrotter
Iustin Pop [Fri, 23 Jan 2009 12:36:00 +0000 (12:36 +0000)]
Introduce more configuration consistency checks
This patch enhances the duplicate DRBD minors checks (currently just a
few) and adds automatic checks of configuration consistency at
configuration file writing time.
In order to do so and show meaningful error messages, the
_UnlockedComputeDRBDMap function is changed to not raise errors in case
of duplicates, but instead return both the minors map and the duplicate
list, and its callers now raise the error. This allows the VerifyConfig
function to return a complete list of duplicates.
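The pattern described above (collect duplicates instead of raising, and
let each caller decide) might look like the following sketch. All names
here (compute_minor_map, get_minor_map, verify_config,
ConfigurationError) and the input format are assumptions for
illustration, not the actual config module API.

```python
class ConfigurationError(Exception):
    """Hypothetical error type for an inconsistent configuration."""


def compute_minor_map(disks):
    """Return (minor_map, duplicates) without raising.

    disks: iterable of hypothetical (node, minor, instance_name) tuples.
    """
    minor_map = {}
    duplicates = []
    for node, minor, instance in disks:
        key = (node, minor)
        if key in minor_map:
            duplicates.append((node, minor, instance, minor_map[key]))
        else:
            minor_map[key] = instance
    return minor_map, duplicates


def get_minor_map(disks):
    """Caller that needs a clean map: raise if anything is duplicated."""
    minor_map, duplicates = compute_minor_map(disks)
    if duplicates:
        raise ConfigurationError("duplicate DRBD minors: %s" % (duplicates,))
    return minor_map


def verify_config(disks):
    """Verification caller: report every duplicate instead of stopping early."""
    _, duplicates = compute_minor_map(disks)
    return ["DRBD minor %d on %s used by both %s and %s" % (minor, node, a, b)
            for node, minor, a, b in duplicates]
```

Because the helper never raises, the verification path can list all
duplicates in one pass, while other callers keep their fail-fast
behaviour.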
The new checks required some small updates to the unittests for the
config module.
Reviewed-by: ultrotter