Iustin Pop [Tue, 17 Feb 2009 12:44:02 +0000 (12:44 +0000)]
Show more details for failed xen commands
This patch also logs the output of the xm commands in case of failures;
some corner cases were forgotten in the last redo.
Reviewed-by: imsnah
Iustin Pop [Tue, 17 Feb 2009 12:43:51 +0000 (12:43 +0000)]
Update the install and admin documents
This is not a real update, just a quick pass changing the obvious parts.
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 14:50:40 +0000 (14:50 +0000)]
QA: add support for burnin rename
This patch adds support for optionally doing the rename burnin test, and
adds an example to the sample QA file. To disable, either remove or
specify an empty rename target.
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 14:50:30 +0000 (14:50 +0000)]
Fix some bugs in reboot
There are two issues fixed in this patch:
- first, the recent RPC changes caused loss of data in hard reboot
type; we weren't reporting any results from the stop/start instance
calls;
- second, in soft or hard reboots, we didn't initialize the disk
physical ID; based on the last state of the instance's disks, this
can create a failure in identifying the disks
After this patch, burnin works again with reboot, and reports errors
correctly.
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 14:50:19 +0000 (14:50 +0000)]
Burnin: fix rename
In rename, we must stop different names in the first and second phases,
so we create two different opcodes for this purpose (instead of using
the same one twice, which doesn't work).
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 13:05:45 +0000 (13:05 +0000)]
Update NEWS for beta 2
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 12:17:18 +0000 (12:17 +0000)]
Convert IOErrors for /proc/drbd into our errors
If /proc/drbd can't be opened, this raises an IOError, but all the
error-handling behaviour in backend treats only BlockDeviceErrors. This
creates a plain failure in cluster verify and in other RPC calls.
This patch simply converts EnvironmentErrors into BlockDeviceErrors, and
also changes the RPC result for NV_DRBDLIST and its handling to be able
to show the error. The other RPC calls work by default now, due to the
existing error handling.
Reviewed-by: ultrotter
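A minimal sketch of the conversion described above, not the actual patch. The class and function names here are hypothetical stand-ins; only the idea (catch EnvironmentError, re-raise as the block device error the callers already handle) comes from the commit message.

```python
class BlockDeviceError(Exception):
    """Hypothetical stand-in for the project's block device error class."""


def read_proc_file(path="/proc/drbd"):
    """Return the lines of `path`, converting OS-level errors.

    EnvironmentError covers both IOError and OSError, so a single
    except clause is enough to translate either one.
    """
    try:
        with open(path) as handle:
            return handle.read().splitlines()
    except EnvironmentError as err:
        # Re-raise as the error type the existing handlers understand.
        raise BlockDeviceError("Can't read %s: %s" % (path, err))
```

Callers that already catch the block device error then need no changes, which matches the note that the other RPC calls work by default.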
Guido Trotter [Mon, 16 Feb 2009 12:16:59 +0000 (12:16 +0000)]
DEVNOTES: we have no --enable-rapi anymore
Remove it from the suggested development ./configure line
Reviewed-by: iustinp
Guido Trotter [Mon, 16 Feb 2009 12:09:25 +0000 (12:09 +0000)]
Convert default root partition to msdos style
As discussed, with 2.0 the msdos partition style should be the default in the
instance OS, so we're changing the default instance params accordingly.
A followup patch will update the debootstrap os.
Reviewed-by: iustinp
Iustin Pop [Mon, 16 Feb 2009 11:08:18 +0000 (11:08 +0000)]
watcher: fix checking of boot IDs
The recent change (commit 2151) to the watcher to make it handle offline
nodes also saves the offline attribute to the state file, but this is
not needed and also breaks the checking of the boot ID. This patch
simply removes it, restoring the correct behaviour.
Reviewed-by: imsnah
Iustin Pop [Mon, 16 Feb 2009 11:08:10 +0000 (11:08 +0000)]
watcher: autoarchive old jobs
This patch adds auto-archiving of jobs older than 6 hours to the
watcher.
Reviewed-by: imsnah
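The age check behind the auto-archiving above could look roughly like this. It is only a sketch: the job representation (pairs of job ID and end timestamp) and the function name are assumptions made for illustration, and only the six-hour threshold comes from the commit message.

```python
MAX_JOB_AGE = 6 * 3600  # six hours, in seconds


def jobs_to_archive(jobs, now):
    """Return the IDs of finished jobs older than MAX_JOB_AGE.

    `jobs` is an iterable of (job_id, end_time) pairs; an end_time of
    None marks a job that has not finished and is never selected.
    """
    return [job_id for job_id, end_time in jobs
            if end_time is not None and now - end_time > MAX_JOB_AGE]
```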
Iustin Pop [Fri, 13 Feb 2009 16:17:05 +0000 (16:17 +0000)]
RAPI: documentation updates
This patch fixes the version and does some update to the RAPI resources
docs.
Reviewed-by: imsnah
Iustin Pop [Fri, 13 Feb 2009 15:54:41 +0000 (15:54 +0000)]
RAPI: fixes related to write mode
This patch fixes many small issues related to write functions:
- update documentations w.r.t. how to add users
- update the instance add function for latest API
- add instance delete
- fix addition of tags
- update some error messages
Reviewed-by: imsnah
Iustin Pop [Fri, 13 Feb 2009 15:35:05 +0000 (15:35 +0000)]
Some small improvements to the fake hypervisor
This patch modifies the fake hypervisor to subtract the memory “used”
by “running” instances from the free memory, so the actual node
information changes based on the running instances.
Also some style changes and fixes are added.
Reviewed-by: ultrotter
Iustin Pop [Fri, 13 Feb 2009 15:34:48 +0000 (15:34 +0000)]
Implement the backward-compatible ‘-s’ disk option
This patch adds back to the instance creation command (gnt-instance add,
gnt-backup import) the ‘-s’ short form option for specifying a
single-disk instance.
Also a small bug in gnt-backup import is fixed.
Reviewed-by: ultrotter
Guido Trotter [Fri, 13 Feb 2009 13:49:06 +0000 (13:49 +0000)]
SetInstanceParams: export nic changes to hooks
Currently we export the old instance "as is" and any nic changes get
lost, so hooks won't know of a different ip, bridge, or mac address.
This patch fixes it by putting the nics in the override dict, if any
changes are done.
Reviewed-by: iustinp
Guido Trotter [Fri, 13 Feb 2009 12:28:14 +0000 (12:28 +0000)]
Remove two fixed FIXME and convert one to TODO
The cli FIXME is not something broken, but rather some better handling
feature we'd rather have, and the two backend FIXME are done (disks have
their read only parameter set, and the error is raised and thus reaches
the master).
Reviewed-by: iustinp
Iustin Pop [Fri, 13 Feb 2009 11:38:26 +0000 (11:38 +0000)]
RAPI: format error messages as JSON
This patch changes the format of the HTTP error messages from text/html, which
is hard to parse from RAPI clients, to JSON which can be automatically parsed.
The error message is an object, which contains always three keys:
- code, an integer with the error code
- message, a short description
- explain, holding (if available) a description of the error
In order to implement this, there is a bit of change to the http server
and executor classes. I've tested and the error handling still works
(but less optimal, no error message) in case the error formatting itself
raises an exception.
Reviewed-by: imsnah
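For illustration, an error body with the three keys listed above could be built as follows. The function name is hypothetical, and using null for a missing "explain" value is an assumption; the commit message only says the key is always present.

```python
import json


def format_error_body(code, message, explain=None):
    """Serialize an error as a JSON object with code, message, explain."""
    return json.dumps({
        "code": code,        # integer error code
        "message": message,  # short description
        "explain": explain,  # longer description, None if unavailable
    }, sort_keys=True)


body = format_error_body(404, "Not Found", "Nothing matches the given URI")
parsed = json.loads(body)
```

A client can then call json.loads on the response body instead of scraping an HTML page.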
Iustin Pop [Fri, 13 Feb 2009 11:38:08 +0000 (11:38 +0000)]
Make RAPI return 502/504 errors for luxi errors
This changes the RAPI error codes for luxi errors; a timeout error is
now reported properly as 504, while any other luxi error is reported as
502.
It would be good to convert even more errors into proper return codes in
the future.
Reviewed-by: imsnah
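The mapping described above amounts to something like the sketch below. The exception classes are hypothetical stand-ins for the luxi error types, and the 500 fallback for non-luxi errors is an assumption that the commit message does not cover.

```python
class LuxiError(Exception):
    """Hypothetical base class for luxi client errors."""


class LuxiTimeout(LuxiError):
    """Hypothetical timeout error raised by the luxi client."""


def http_status_for(err):
    """Pick the HTTP status code to report for an exception."""
    if isinstance(err, LuxiTimeout):
        return 504  # Gateway Timeout
    if isinstance(err, LuxiError):
        return 502  # Bad Gateway
    return 500      # assumed fallback for everything else
```

The timeout check has to come first, since in this sketch the timeout class derives from the generic one.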
Iustin Pop [Fri, 13 Feb 2009 11:37:57 +0000 (11:37 +0000)]
Fix ganeti-rapi startup with missing certificate
This patch displays a nicer error message compared to the default
stacktrace.
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 18:13:23 +0000 (18:13 +0000)]
job queue: log the opcode error too
Currently we only log "Error in opcode ...", but we don't log the error itself.
This is not good for debugging.
Reviewed-by: ultrotter
Guido Trotter [Thu, 12 Feb 2009 17:35:43 +0000 (17:35 +0000)]
LUSetInstanceParams: Fix nic handling
CheckArguments:
Use constants.VALUE_NONE rather than hardcoding the string "none"
If we're adding a nic fill the nic_dict with default values
Check if the mac is syntactically valid, if we have one
Don't allow the mac to be 'auto' when modifying a nic
CheckPrereq:
Check that bridge and mac if present in the dict are not None
(before this wasn't handled at all)
Generate the nic mac address here if demanded
Exec:
Do not generate nics and macs
Reviewed-by: iustin
Guido Trotter [Thu, 12 Feb 2009 17:35:28 +0000 (17:35 +0000)]
ConfigWriter.AddInstance check instance mac
There is a race condition in CreateInstance, since the mac address is
generated early and only added to the config (and thus really assured to
be unique) only at this point. Since it's possible that another instance
gets the same mac address in the meantime with this check we'll make the
instance creation fail before modifying the config data and thus having
a wrong in-memory config (which is bad!!).
Note that the same race condition exists, for example, in
SetInstanceParams, and should be fully addressed by a way to revert
config changes if writing them fails!
Reviewed-by: iustin
Guido Trotter [Thu, 12 Feb 2009 17:35:10 +0000 (17:35 +0000)]
Instance Creation: generate nics earlier
We want the real nic to be shown to the hooks and the allocators, so
we'll generate them in CheckPrereq. We also write a comment about the
race condition we generate. This race condition existed even before, so
moving this generation will just lengthen it a bit. A separate patch
mitigates its effects.
This patch also adds an ENDIF comment for a very long if, and removes a
double empty line inside the CheckPrereq function of LUCreateInstance.
Reviewed-by: iustin
Iustin Pop [Thu, 12 Feb 2009 17:09:20 +0000 (17:09 +0000)]
Handle better broken disks
While running burnin:
File "/usr/lib/python2.4/site-packages/ganeti/objects.py", line 497, in __str__
val += ", size=%dm)>" % self.size
TypeError: int argument required
This happened while handling another error, so we lose the original
error information.
So we should try to handle this better.
Reviewed-by: ultrotter
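One way to make such a __str__ method robust is sketched below. The class is a simplified, hypothetical stand-in for the real disk object; only the failing format expression is taken from the traceback above.

```python
class Disk(object):
    """Simplified stand-in for a disk object with a possibly broken size."""

    def __init__(self, dev_type, size):
        self.dev_type = dev_type
        self.size = size

    def __str__(self):
        val = "<Disk(type=%s" % self.dev_type
        if isinstance(self.size, int):
            val += ", size=%dm)>" % self.size
        else:
            # Don't raise while formatting; show whatever value is there.
            val += ", size='%s')>" % (self.size,)
        return val
```

With this shape, formatting a disk while handling another error can no longer mask the original exception.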
Iustin Pop [Thu, 12 Feb 2009 17:05:08 +0000 (17:05 +0000)]
Update the command line scripts man pages for 2.0
This patch updates the gnt-* scripts to show the new 2.0 syntax. It's
not guaranteed to be 80% complete.
Reviewed-by: ultrotter
Iustin Pop [Thu, 12 Feb 2009 17:04:45 +0000 (17:04 +0000)]
Some command line scripts fixes
This patch changes the gnt-node and gnt-job list commands to accept
arguments and list only the selected items, which is useful when having
many nodes or jobs.
It also removes the “--units” option from gnt-job list as we don't
actually use it.
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 17:04:32 +0000 (17:04 +0000)]
Do not check 'None' disk IDs for duplicates
In case of 'None' logical or physical IDs, we don't need to check them
for duplicates. This case can happen for DRBD devices in case of newly
added disks, for example.
Reviewed-by: imsnah
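A duplicate check that skips 'None' IDs can be sketched as follows. The function name and the tuple-shaped IDs in the usage below are assumptions for illustration, not the ConfigWriter code.

```python
def find_duplicate_ids(ids):
    """Return the IDs that occur more than once, ignoring None entries."""
    seen = set()
    duplicates = []
    for item in ids:
        if item is None:
            # Not yet assigned (e.g. a newly added disk): nothing to compare.
            continue
        if item in seen:
            duplicates.append(item)
        else:
            seen.add(item)
    return duplicates
```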
Iustin Pop [Thu, 12 Feb 2009 17:04:19 +0000 (17:04 +0000)]
Prevent race condition on MAC addresses
This patch adds a temporary set for MACs that have been requested but
are not yet in the configuration (as part of an instance NIC). The MACs
of an instance are automatically removed from this set when the instance
is updated (or first added to the config).
Reviewed-by: ultrotter
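The idea of a temporary reservation set can be sketched like this. The class and method names are hypothetical; the sketch only shows the bookkeeping the commit message describes (reserve on request, release when the instance reaches the configuration).

```python
class MacRegistry(object):
    """Tracks MACs in the config plus MACs requested but not yet saved."""

    def __init__(self):
        self._config_macs = set()
        self._temporary_macs = set()

    def reserve(self, mac):
        """Reserve a MAC, failing if it is configured or already reserved."""
        if mac in self._config_macs or mac in self._temporary_macs:
            raise ValueError("MAC %s is already in use" % mac)
        self._temporary_macs.add(mac)

    def commit_instance(self, macs):
        """Move an instance's MACs from the temporary set to the config."""
        for mac in macs:
            self._temporary_macs.discard(mac)
            self._config_macs.add(mac)
```

Two concurrent requests for the same address then fail at reservation time rather than at the point where the configuration is written.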
Iustin Pop [Thu, 12 Feb 2009 17:04:07 +0000 (17:04 +0000)]
Always use the same short option for iallocator
This patch changes the scripts so that the short name for the
“--iallocator” option is always ‘-I’.
Reviewed-by: ultrotter
Iustin Pop [Thu, 12 Feb 2009 17:03:58 +0000 (17:03 +0000)]
Some batcher fixes
Currently the batcher hypervisor parameter must be a dict with one
element (e.g. {"xen-hvm": { "acpi": true }}). This is overly complex and
hard to validate correctly; the patch splits it in two:
- one "hypervisor" string parameter, with the name of the hypervisor
- one "hvparams" dictionary, with the hypervisor parameters
The patch also changes the error handling in parsing the definition file
- since this is not a long-running file, we are less concerned with safe
closing of the file, and more with presenting meaningful error
messages.
Reviewed-by: killerfoxi
Iustin Pop [Thu, 12 Feb 2009 17:03:46 +0000 (17:03 +0000)]
Some small fixes
This patch removes the admin_ram LUQueryInstances field (is broken
anyway) and fixes the VNC address checks in the Xen Hypervisor.
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 17:03:33 +0000 (17:03 +0000)]
Fix LUQueryInstances fields.
The query fields are now regular expressions. We need to quote the dots,
otherwise invalid fields will be accepted but they will lose special
formatting in the cli scripts.
Reviewed-by: imsnah
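The effect of an unquoted dot is easy to reproduce with the standard re module. The field name below is made up for the example; the helper is a sketch, not the query code itself.

```python
import re


def field_matches(pattern, field):
    """Return True if `field` matches `pattern` as an anchored regex."""
    return re.match("^%s$" % pattern, field) is not None


# Unquoted: "." matches any character, so a bogus name is accepted.
loose = field_matches("disk.sizes", "disk_sizes")

# Quoted with re.escape: only the literal dot matches.
strict_bad = field_matches(re.escape("disk.sizes"), "disk_sizes")
strict_ok = field_matches(re.escape("disk.sizes"), "disk.sizes")
```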
Guido Trotter [Thu, 12 Feb 2009 09:15:52 +0000 (09:15 +0000)]
Apply the right permissions to /etc/hosts
In the current Ganeti version when modifying /etc/hosts we mistakenly
give it the permissions of the temporary file we create to define its
content, which is by default 0600. This breaks most non-root
applications, and thus must be corrected. This patch forces the mode to
be 0644 (but we might decide to just use the mode of the previous
/etc/hosts, if we want to be more polite against any eventual
administrative choice). We also add a new assertFileMode() method for
unit tests and actually check in the SetEtcHostsEntry and
RemoveEtcHostsEntry tests that the mode is correct, to be sure not to
reintroduce this bug. Also, a FIXME is added in the original
functions stating that it would be nice to use WriteFile+fn() rather
than reimplementing its functionality again.
Reviewed-by: iustinp
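The underlying issue is that tempfile.mkstemp creates files with mode 0600, and a rename carries that mode over to the target. A sketch of the fix, with a hypothetical helper name and an explicit chmod before the rename:

```python
import os
import tempfile


def replace_file(path, content, mode=0o644):
    """Atomically replace `path` with `content`, forcing `mode`."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp always creates the file readable by the owner only (0600).
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        # Set the final mode before the file becomes visible as `path`.
        os.chmod(tmp_name, mode)
        os.rename(tmp_name, path)
    except Exception:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Preserving the mode of the previous file, as the message suggests, would mean reading it with os.stat before the rename and passing it in as `mode`.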
Iustin Pop [Thu, 12 Feb 2009 07:34:21 +0000 (07:34 +0000)]
Fix RPC result handling in _AssembleInstanceDisks
For (status, data)-style RPC calls, the result data is in the ‘payload’
attribute. This was missed in the conversion patch, with the only side
effect that gnt-instance activate-disks didn't show a nice output
anymore.
Reviewed-by: ultrotter
Iustin Pop [Thu, 12 Feb 2009 07:33:41 +0000 (07:33 +0000)]
Man page updates for the ganeti daemons.
This patch adds new man pages for the master and RAPI daemons, and
updates the node daemon and watcher man pages.
Reviewed-by: ultrotter
Iustin Pop [Thu, 12 Feb 2009 07:32:03 +0000 (07:32 +0000)]
master daemon: allow skipping the voting process
This patch introduces a 'force' mode for the master daemon startup where
the voting process is not done, but the user has to confirm manually the
startup (before forking, of course).
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 07:31:26 +0000 (07:31 +0000)]
Remove a duplicate line in sed_vars
LOCALSTATEDIR is added twice to the sed variables.
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 07:31:04 +0000 (07:31 +0000)]
ConfigWriter: add checks for duplicate disk IDs
This patch adds a safety check for duplicate disk logical/physical IDs,
in order to prevent possible software bugs.
Reviewed-by: imsnah
Iustin Pop [Thu, 12 Feb 2009 07:30:44 +0000 (07:30 +0000)]
Switch the instance_shutdown rpc to (status, data)
This patch changes the return type from this RPC call to include status
information and renames the backend method to match the RPC call name.
The patch is a little bigger than the reboot one, since this call is
used in more than one place. However, all the points of call have the
same usage pattern, so the patch is trivial.
Reviewed-by: ultrotter
Iustin Pop [Thu, 12 Feb 2009 07:30:06 +0000 (07:30 +0000)]
Switch the instance_reboot rpc to (status, data)
This small patch changes the return type from this RPC call to include
status information and renames the backend method to match the RPC call
name.
Reviewed-by: ultrotter
Guido Trotter [Wed, 11 Feb 2009 18:29:25 +0000 (18:29 +0000)]
FileStorage: abort creating over an existing file
In FileStorage there is a TODO:
decide whether we should check for existing files and
abort or not
After Ganeti ate my instance data I decided. Let's abort.
In general there is no reason we should overwrite existing files, and
doing it can be very harmful for preexisting files on the host.
Reviewed-by: iustinp
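Refusing to overwrite can be done without a separate existence check by opening the file with O_EXCL. This is a sketch of that technique under assumed names, not necessarily how the FileStorage class implements the abort.

```python
import os


def create_storage_file(path, size_mb):
    """Create a new sparse file of `size_mb` MiB; fail if `path` exists."""
    # O_CREAT | O_EXCL makes the open fail with EEXIST for existing files.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.ftruncate(fd, size_mb * 1024 * 1024)
    finally:
        os.close(fd)
```

Because the check and the creation are a single system call, there is no window in which another process could create the file in between.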
Guido Trotter [Wed, 11 Feb 2009 16:23:26 +0000 (16:23 +0000)]
gnt-instance fix a typo in AddInstance
It's hvparams, not opts.hvparams.
Reviewed-by: iustinp
Guido Trotter [Wed, 11 Feb 2009 16:23:11 +0000 (16:23 +0000)]
_GenerateDiskTemplate: correct file disk index
Currently when adding disks the base for the index is not taken into
account, and disk 0 is added twice.
Reviewed-by: iustinp
Guido Trotter [Wed, 11 Feb 2009 10:20:13 +0000 (10:20 +0000)]
gnt-cluster, pass hvparams directly to dict()
If hvparams is not set it will be [], so dict() will transform it to an
empty dict, which is safe in all cases.
Reviewed-by: iustinp
Guido Trotter [Wed, 11 Feb 2009 10:20:00 +0000 (10:20 +0000)]
ganeti-noded: Create LOCK_DIR if missing
We need this directory for locks, so if for any reason it's not there
we'll create it. The permissions are the standard /var/lock permissions.
Reviewed-by: iustinp
Guido Trotter [Wed, 11 Feb 2009 10:19:49 +0000 (10:19 +0000)]
HTS_USE_VNC, rename and remove KVM
Currently we use the HTS_USE_VNC constant only to copy the vnc password
file. While KVM uses vnc it currently has no password support, and we won't
have time to add it for 2.0, so we are renaming the constant to
HTS_COPY_VNC_PASSWORD and only putting Xen HVM in it. In the future
(2.1) password handling will need to be reworked anyway.
Reviewed-by: iustinp
Iustin Pop [Tue, 10 Feb 2009 16:32:12 +0000 (16:32 +0000)]
Sort instance data in gnt-node info
The patch sorts the instance list in gnt-node info output, in order to
make it more readable (and stable).
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 16:05:24 +0000 (16:05 +0000)]
Some fixes to node add and re-add
The patch changes the pre-checks in node-add and re-add:
- if the node is not already in the cluster, refuse to re-add
- when re-adding, reuse the secondary IP from the cluster
configuration
- when re-adding, reset the offline and drained flags, so that RPC
calls work (and we can actually upload the keys)
The patch also adds a missing log entry in LUSetNodeParams.
Reviewed-by: imsnah
Guido Trotter [Tue, 10 Feb 2009 15:06:24 +0000 (15:06 +0000)]
Instance parameters: force typing
We want all the hv/be parameters to have a known type, rather than a
random mix of empty string, boolean values, and None, so we declare the
type of each variable and we enforce/convert it.
- Add some new constants for enforceable value types
- Add new constants dicts HVS_PARAMETER_TYPES and BES_PARAMETER_TYPES
holding not only the valid parameters but also their types
- Drop the old HVS_PARAMETERS and BES_PARAMETERS constants and calculate
the values from the type dict
- Convert all the default parameters to a valid type value
- Create a new ForceDictType utils function, to check/enforce a dict's
element value types, with relevant unit tests
- Drop a few custom functions to check/convert the BE param types in
utils and cli, in favor of ForceDictType
- Double-check the parameter types using ForceDictType in both scripts
and LogicalUnits, when possible.
As a bonus:
- Remove some old commented-out code in gnt-instance
- Remove some already fixed FIXME
- Fix a bug which prevented VALUE_DEFAULT from being applied to BE parameters
in SetInstanceParams because the value was checked for validity before
that transformation was made
- Fix a bug which prevented initing a cluster with hvparams passed from
working at all
- ForceDictType allows an allowed_values for exceptions, which makes us
able to do the checking even when some values must not be
converted/typechecked (for example the 'default' string in
SetInstanceParameters)
Reviewed-by: iustinp
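The type enforcement described above could be sketched as below. This is not the ForceDictType implementation: the function name, the type constants, the accepted boolean spellings and the parameter names in the usage are all assumptions; only the overall behaviour (check or convert each value against a type dict, with an allowed_values escape hatch) follows the commit message.

```python
VTYPE_STRING = "string"
VTYPE_BOOL = "bool"
VTYPE_INT = "int"


def force_dict_type(target, key_types, allowed_values=()):
    """Check/convert the values of `target` in place per `key_types`."""
    for key, value in list(target.items()):
        if key not in key_types:
            raise ValueError("Unknown parameter '%s'" % key)
        if value in allowed_values:
            continue  # e.g. the special 'default' string, left untouched
        kind = key_types[key]
        if kind == VTYPE_STRING:
            if not isinstance(value, str):
                raise ValueError("'%s' must be a string" % key)
        elif kind == VTYPE_BOOL:
            if isinstance(value, str):
                lowered = value.lower()
                if lowered in ("true", "yes"):
                    value = True
                elif lowered in ("false", "no"):
                    value = False
                else:
                    raise ValueError("'%s' must be a boolean" % key)
            else:
                value = bool(value)
        elif kind == VTYPE_INT:
            try:
                value = int(value)
            except (TypeError, ValueError):
                raise ValueError("'%s' must be an integer" % key)
        target[key] = value


params = {"memory": "512", "auto_balance": "yes", "vcpus": "default"}
types = {"memory": VTYPE_INT, "auto_balance": VTYPE_BOOL, "vcpus": VTYPE_INT}
force_dict_type(params, types, allowed_values=("default",))
```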
Iustin Pop [Tue, 10 Feb 2009 14:47:01 +0000 (14:47 +0000)]
Implement modification of the drained flag
This patch adds LU and cli-level support for modification of the node
drained flag. It is similar to the offline changes.
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:46:48 +0000 (14:46 +0000)]
Prevent allocations on drained nodes
This patch adds checks for drained nodes in the logical units that
allocate or move instances around. We also update an error message (not
style-compliant).
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:46:37 +0000 (14:46 +0000)]
cluster verify: show correctly drained nodes
This patch changes slightly the output of gnt-cluster verify for drained
nodes, and also adds a note with the total number of drained nodes
(similar to the offline nodes note).
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:46:26 +0000 (14:46 +0000)]
ConfigWriter: handle the drained node flag
This patch changes the master candidate pool computations in
ConfigWriter to properly handle drained nodes. They are now excluded
from counting towards the reachable number of candidates.
The patch also adds verification of consistency for the node status.
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:46:15 +0000 (14:46 +0000)]
burnin: do not use drained nodes
This patch updates burnin not to use drained nodes (similar to the
handling of offline nodes).
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:46:06 +0000 (14:46 +0000)]
dumb allocator: do not use drained nodes
This patch changes the dumb allocator not to use drained nodes (similar
to offline nodes).
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:45:56 +0000 (14:45 +0000)]
Allow query of the drained node attribute
This patch exports the drained attribute:
- LUQueryNodes accepts now the drained field
- RAPI exports it for node objects
- gnt-node info shows it now (along with the newly-added master_candidate and
offline flags)
- gnt-node list can list it (but not by default)
- to the iallocator scripts
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:45:39 +0000 (14:45 +0000)]
Add a ‘drained’ attribute to node objects
This attribute will be used to prevent any allocation on the node (any
of replace-disks with new secondary this node, failover to the node,
migration to the node).
The patch adds the attribute and initializes it correctly in cluster
init and for new nodes.
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:45:18 +0000 (14:45 +0000)]
Some error message cleanups
Reviewed-by: imsnah
Iustin Pop [Tue, 10 Feb 2009 14:45:03 +0000 (14:45 +0000)]
Cleanup of DRBD8._CheckMetaSize
This patch converts the _CheckMetaSize method to raise exceptions
instead of logging and returning False. This fits now in the new rpc
return types, so it's a cheap change.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 14:44:53 +0000 (14:44 +0000)]
Change the disk assembly to raise exceptions
This big patch converts the bdev Assemble() methods and the supporting
functions to raise exceptions instead of returning False. This is a big
patch, since the assembly functions touch other functions: add children,
creation, etc. However, the patch does not add much new code, rather it
reworks existing code.
One of the biggest changes is in the rework of the DRBD8._SlowAssemble()
method (one of the most complicated/ugly ones). Hopefully the new
version is a little bit more readable.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 14:44:41 +0000 (14:44 +0000)]
Change BlockDev.Remove() failure result
Currently, the Remove() methods of block devices return True/False.
This doesn't permit any error detail reporting.
This patch changes the return type to None for success, and raises
BlockDeviceError in case of failure. This permits the details to be
passed up the stack.
The patch also simplifies a little the Remove method of file-based
devices (no stat first, just try unlink).
Reviewed-by: ultrotter
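The new contract for file-based devices (None on success, an exception carrying the details on failure, no stat beforehand) can be sketched as follows. The names are hypothetical, and treating an already missing file as success is an assumption made for this sketch.

```python
import errno
import os


class BlockDeviceError(Exception):
    """Hypothetical stand-in for the project's block device error class."""


def remove_file_device(path):
    """Remove a file-backed device; return None on success."""
    try:
        # No stat() first: just try the unlink and look at the error.
        os.unlink(path)
    except OSError as err:
        if err.errno != errno.ENOENT:
            raise BlockDeviceError("Can't remove file '%s': %s" % (path, err))
```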
Iustin Pop [Tue, 10 Feb 2009 14:44:30 +0000 (14:44 +0000)]
Switch the blockdev_remove rpc to (status, data)
This converts the backend and cmdlib modules to a (status, data)
implementation of the blockdev_remove rpc call. bdev.py is not yet
converted, so we don't actually have error information.
We also fix a bug in _RemoveDisks by not reusing a variable.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 14:44:18 +0000 (14:44 +0000)]
Change BlockDev.Shutdown() failure result
Currently, the Shutdown() methods of block devices return True/False.
This doesn't permit any error detail reporting.
This patch changes the return type to None for success, and raises
BlockDeviceError in case of failure. This permits the details to be
passed up the stack.
For LVM and file-backed devices, this is a simple change. For DRBD, we
first remove the shutdown of disks in case of network activation
failures (since with static minors the minor is used anyway, we don't
gain anything by clearing it), and then we simply change _ShutdownAll()
to raise an exception.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 14:44:07 +0000 (14:44 +0000)]
Switch the blockdev_shutdown rpc to (status, data)
This converts the backend and cmdlib modules to a (status, data)
implementation of the blockdev_shutdown rpc call. bdev.py is not yet
converted, so we don't actually have error information.
We also fix a bug in _ShutdownInstanceDisks by not reusing a variable.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 14:43:57 +0000 (14:43 +0000)]
Convert blockdev_assemble rpc to (status, data)
This converts the RPC call blockdev_assemble to the new-style result
format. Note that we won't usually have error information, but it's the
first step toward it.
Reviewed-by: ultrotter
Iustin Pop [Tue, 10 Feb 2009 13:40:59 +0000 (13:40 +0000)]
RAPI: fix a pylint warning
Child classes of _R_TAGS must define TAG_LEVEL, but for good style let's
define it also here to at least ensure we don't get an 'Unknown
attribute' exception.
Of course, this also silences a pylint warning.
Reviewed-by: amishchenko
Guido Trotter [Tue, 10 Feb 2009 11:59:17 +0000 (11:59 +0000)]
LUSetInstanceParams: use the correct hvparams
In LUSetInstanceParams we used to save the dict without defaults for the
instance params as hv_inst, but to use the populated one for the
instance (hv_new). Fixing this leads to instances without all the
parameters set.
Reviewed-by: iustinp
Guido Trotter [Tue, 10 Feb 2009 10:53:22 +0000 (10:53 +0000)]
KVM: Correct CheckParameterSyntax docstring
The comment is not really true anymore, as we have a lot of parameters
nowadays.
Reviewed-by: iustinp
Guido Trotter [Tue, 10 Feb 2009 10:53:08 +0000 (10:53 +0000)]
KVM: Fix _CallMonitorCommand error message
1) Only instance_name is available
2) There was a missing string parameter
Reviewed-by: iustinp
Iustin Pop [Tue, 10 Feb 2009 08:13:39 +0000 (08:13 +0000)]
Fix one more RAPI QA test
This was skipped in the previous QA patch.
Reviewed-by: imsnah
Guido Trotter [Mon, 9 Feb 2009 15:17:15 +0000 (15:17 +0000)]
KVM: Add usb mouse type parameter
In some cases 'mouse' may work better than 'tablet', so we'll handle
both by allowing the user to specify a parameter. By default no mouse is
used.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:59 +0000 (15:16 +0000)]
KVM: allow netboot
With this patch we allow KVM instances to be booted off the network.
The only issue is that this is not compatible with virtio nics, so
we disallow them, when booting from the net.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:47 +0000 (15:16 +0000)]
KVM: actually support different nic types
When executing the KVM runtime we load the nic type from the runtime
hvparams and use it to specify the nic model type. As for the disk we
translate the DEV_PARAVIRTUAL type to 'virtio'.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:34 +0000 (15:16 +0000)]
KVM: export hvparams in the runtime
They'll be used to set the nic type when we execute the runtime, since
the nics are processed later. We need to save the hvparams because we
want to use the same one as when we saved the runtime, rather than use
the current instance ones, to avoid applying only some changed
parameters when the runtime is loaded.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:16:20 +0000 (15:16 +0000)]
KVM: actually support different disk types
By passing the relevant if= value to the disk we support different disk
types. The only change is that we'll translate "paravirtual" to
"virtio" to keep only one "paravirtualized" value around ganeti. The
if= value is calculated outside the disks loop, as it's the same for all
disks (as currently ganeti doesn't support per-disk params).
Reviewed-by: iustinp
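A sketch of the translation described above, assuming the disks are passed to KVM with its "-drive file=...,if=..." option. The function name and the plain list of paths are hypothetical simplifications.

```python
def drive_options(disk_paths, disk_type):
    """Build KVM -drive arguments for all disks of an instance."""
    # Computed once, outside the loop: the type is the same for all disks.
    if disk_type == "paravirtual":
        if_val = ",if=virtio"
    else:
        if_val = ",if=%s" % disk_type
    args = []
    for path in disk_paths:
        args.extend(["-drive", "file=%s%s" % (path, if_val)])
    return args
```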
Guido Trotter [Mon, 9 Feb 2009 15:16:07 +0000 (15:16 +0000)]
Xen-HVM: Improve the invalid disk/nic type error
Copy the message from the KVM one, adding a missing 'the' and a list of
possible values, to help the user in his decision.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:55 +0000 (15:15 +0000)]
KVM: parameters for different disk and nic types
- Add a bunch of NICs and DISKs types
- Specify which one are valid disks and nics for KVM (the new ones
together with some of the old ones)
- Add the default values (paravirtual)
- Allow the disk and nic types as parameters and check their validity
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:40 +0000 (15:15 +0000)]
Rename the device type constants
These are not HVM specific, so have been given an HT generic name.
Reviewed-by: iustinp
Guido Trotter [Mon, 9 Feb 2009 15:15:26 +0000 (15:15 +0000)]
s/HT_HVM_VNC_BASE_PORT/VNC_BASE_PORT/g
The VNC base port has nothing to do with HVM itself, and is general to
VNC itself, so we're removing the HT_HVM prefix to the constant.
Reviewed-by: iustinp
Iustin Pop [Mon, 9 Feb 2009 14:04:08 +0000 (14:04 +0000)]
Add a new instance query flag ‘disk_usage’
This patch adds a new instance query flag called disk_usage that
retrieves the overall space used by an instance on each of its nodes.
This can be used when balancing the cluster or checking N+1 status.
The flag is also exported in RAPI. Note the flag is currently broken for
file-based instances, as it represents the amount of space in the
cluster volume group.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:57 +0000 (14:03 +0000)]
Uniformize some function names in backend.py
Currently, the names of the functions in backend.py that are actually
RPC procedures and are called from ganeti-noded are not corresponding to
the RPC names. This makes it hard to actually see which functions are
exported and which functions are internal to backend.
This patch renames all blockdevice-related functions in backend.py to match
the name of the RPC call (without the ‘call’ or ‘perspective’ prefix).
This should make it easier to grep for a given function called in
cmdlib, without having to open and check in ganeti-noded what backend
function it corresponds to.
The patch also does two minor extra cleanups (rename a variable and
change a logging level).
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:47 +0000 (14:03 +0000)]
bdev: add and use two utility functions
This patch adds two utility functions for raising BlockDeviceError
exceptions and for running functions while ignoring this error. Most of
the manual “raise errors.BlockDeviceError” cases are converted to
_ThrowError, as this makes the code clearer.
We also change most of the DRBD error messages to include the minor
number because with the parallel execution of commands it's no longer
possible to identify the failed DRBD just from the timestamp, and the
minor number can be mapped back to the instance more easily.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 14:03:38 +0000 (14:03 +0000)]
rpc.call_blockdev_find: convert to (status, data)
This patch converts the call_blockdev_find - which searches for block
devices and returns their status - to the (status, data) format. We also
modify the backend function name to match the rpc call.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 10:41:21 +0000 (10:41 +0000)]
Export the cpu nodes and sockets from Xen
This is a hand-picked forward patch of commit 1755 on the 1.2 branch
(hand-picked since the trees diverged too much since then):
The patch changed the xen hypervisor to compute the number of cpu
sockets/nodes and enables the command line and the RAPI to show this
information (for RAPI it is enabled by default in node details; for gnt-node
one can use the new “cnodes” and “csockets” fields).
Originally-Reviewed-by: ultrotter
For the KVM and fake hypervisors, the patch just exports 1 for both
nodes and sockets. This can be fixed, by looking at the
/sys/devices/system/cpu/cpuN/topology directories, and computing the
actual information, but that should be done in a separate patch.
Reviewed-by: imsnah
Iustin Pop [Mon, 9 Feb 2009 10:31:44 +0000 (10:31 +0000)]
Fix handling OS errors in AddOSToInstance
This patch fixes the error handling in the add OS to instance function
with regard to invalid OSes. Previously, we didn't handle any such
errors, with the end result that the user would have to look in the node
daemon log.
The patch also renames the name of the function to match the RPC call
name.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:38 +0000 (09:24 +0000)]
backend.DrbdAttachNet: don't ignore Open() errors
Currently the return value or errors from the block device Open() method
are ignored. This patch catches any BlockDeviceErrors and returns a
well-formatted result.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:29 +0000 (09:24 +0000)]
cmdlib: simplify some rpc error handling cases
By using the RemoteFailMsg() or the payload field of RpcResult, we can
simplify a few functions in cmdlib.
Reviewed-by: ultrotter
Iustin Pop [Mon, 9 Feb 2009 09:24:21 +0000 (09:24 +0000)]
RpcResult: add a new payload field
For results which use the (status, payload) response type, it's easier
to define a ‘payload’ field on the result holding the payload than to
extract it using “data[1]” in the caller code.
Reviewed-by: ultrotter
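To show why a payload field is more convenient than indexing, here is a
simplified sketch. RpcResultSketch is a hypothetical stand-in written for
this example; the real RpcResult class has more fields and logic.

```python
class RpcResultSketch(object):
    """Hypothetical, simplified stand-in for an RPC result object."""

    def __init__(self, data):
        self.data = data
        # For (status, payload) style responses, expose the second
        # element under a descriptive name; otherwise leave it unset.
        if isinstance(data, (tuple, list)) and len(data) == 2:
            self.payload = data[1]
        else:
            self.payload = None


result = RpcResultSketch((True, {"memory_free": 1024}))
free_via_index = result.data[1]["memory_free"]    # old style in callers
free_via_field = result.payload["memory_free"]    # with the new field
```

Both lookups return the same value; the second one simply says what it
means.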
Iustin Pop [Mon, 9 Feb 2009 09:24:10 +0000 (09:24 +0000)]
LUCreateInstance: only set running flag at the end
In lockless queries, it's better if we see the instance in ADMIN_down
rather than ERROR_down while it is being installed. As such, we
change the LU to only mark the instance 'up' at the time we are ready to
start it.
Reviewed-by: ultrotter
Guido Trotter [Sat, 7 Feb 2009 09:04:31 +0000 (09:04 +0000)]
KVM: don't boot from a virtio cdrom
Apparently it's not supported. Also add -boot command line parameters
to kvm, since they seem to help booting from the right place. Everything
will still only work when not using a kernel, but well... :)
Reviewed-by: iustinp
Guido Trotter [Sat, 7 Feb 2009 09:04:15 +0000 (09:04 +0000)]
KVM: don't boot from cdrom with no cdrom
Reviewed-by: iustinp
Guido Trotter [Sat, 7 Feb 2009 09:04:00 +0000 (09:04 +0000)]
Support cdrom image and boot order for KVM
The cdrom image has the same meaning as in Xen HVM, and so does
boot_order, even though it has a slightly different syntax, and uses the
value 'disk' to boot from disk and 'cdrom' to boot from cdrom.
Reviewed-by: iustinp
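A rough sketch of how the two boot_order values from the commit above
might be turned into a kvm command line fragment. The helper name and
the mapping table are assumptions made for this example; the only
outside facts relied on are the QEMU/KVM "-boot c" (hard disk) and
"-boot d" (CD-ROM) drive letters.

```python
# Assumed mapping from the boot_order values named in the commit
# message to QEMU/KVM boot drive letters.
BOOT_LETTERS = {"disk": "c", "cdrom": "d"}


def boot_args(boot_order):
    """Return the kvm arguments for a given boot_order value."""
    try:
        return ["-boot", BOOT_LETTERS[boot_order]]
    except KeyError:
        raise ValueError("unsupported boot order: %r" % (boot_order,))
```

Rejecting unknown values early keeps an invalid setting from silently
falling back to whatever the hypervisor would pick on its own.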
Guido Trotter [Sat, 7 Feb 2009 09:03:44 +0000 (09:03 +0000)]
Get rid of constants.HT_HVM_DEFAULT_BOOT_ORDER
Confusingly, as a leftover from 1.2, there was a
constants.HT_HVM_DEFAULT_BOOT_ORDER constant, with a value opposite to
the default HV_BOOT_ORDER hv param, that got enabled only if
HV_BOOT_ORDER was set to None. Since setting it to None is very
hard/impossible for the user, and we didn't handle other "empty" values
(False, ''), we'll just force the parameter to have a valid value (after
all we have a default, and that's the way we use hvparams) and get rid
of the old constant altogether.
Reviewed-by: iustinp
Iustin Pop [Fri, 6 Feb 2009 13:06:02 +0000 (13:06 +0000)]
QA: switch RAPI to https
Since we now use SSL for RAPI by default, we need to switch the QA
tests to SSL too.
Reviewed-by: amishchenko
Iustin Pop [Fri, 6 Feb 2009 08:09:10 +0000 (08:09 +0000)]
Fix rapi job listing
This patch fixes a couple of issues with the job listing:
- in case of a non-existing job, nicely raise 404 instead of 500
- in the job detail listing, also list the job log, the job
timestamps, etc.
- the opcode migrate instance was missing its description field
Reviewed-by: imsnah
Iustin Pop [Thu, 5 Feb 2009 14:09:06 +0000 (14:09 +0000)]
rapi: fix SSL mode and use SSL by default
This patch fixes the SSL mode (by actually constructing SSL parameters
from the command line options) and enables SSL by default; the old “-S”
option which enabled SSL is now changed to “--no-ssl”. By default, the
certificate and key point to the Ganeti auto-generated certificate
for rapi.
Reviewed-by: imsnah
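The switch from an opt-in flag to an opt-out flag described above can be
sketched with argparse as follows. Only “--no-ssl” comes from the commit
message; the --ssl-cert and --ssl-key option names and the placeholder
path are assumptions for this example and are not the real defaults.

```python
import argparse

# Placeholder only; the real default points at the auto-generated
# RAPI certificate, whose path is not given in the commit message.
DEFAULT_CERT = "/path/to/rapi.pem"


def parse_args(argv):
    """Parse options with SSL enabled unless --no-ssl is given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-ssl", dest="ssl", action="store_false",
                        default=True, help="disable SSL")
    # Hypothetical option names for the certificate and key.
    parser.add_argument("--ssl-cert", default=DEFAULT_CERT)
    parser.add_argument("--ssl-key", default=DEFAULT_CERT)
    return parser.parse_args(argv)
```

With this shape, running the daemon without any options yields ssl=True
together with the default certificate and key.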
Iustin Pop [Thu, 5 Feb 2009 14:08:56 +0000 (14:08 +0000)]
Small improvement to the init.d example file
The start_action function is changed so that it can be called with
arguments - this could be used to parse a defaults file, etc.
Reviewed-by: imsnah
Guido Trotter [Thu, 5 Feb 2009 13:37:00 +0000 (13:37 +0000)]
KVM: add VNC TLS and X509 parameters
With these parameters, VNC for KVM can be protected by TLS,
optionally with an x509 certificate, and optionally verifying the
client as well. Additionally, in this patch we limit the bind address to
being a directory, rather than a file or a directory, for simplicity, as
it allows for the same level of control anyway.
Reviewed-by: iustinp
Guido Trotter [Thu, 5 Feb 2009 13:36:43 +0000 (13:36 +0000)]
KVM: allow binding vnc to a file
Previously we forced the VNC_BIND_ADDRESS to be an IP address. Now we
also accept a path, and bind the instance to it, or to a file in it if
it's a directory.
Reviewed-by: iustinp
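The address handling described in the commit above (an IP address, a
file path, or a directory) might be sketched like this. The function
names, the "<instance>.vnc" file naming and the IPv4-only check are
assumptions for this example; the "host:display" and "unix:path" forms
are the usual QEMU/KVM -vnc target syntaxes.

```python
import ipaddress
import os


def is_ipv4(value):
    """Return True if value parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True


def vnc_target(bind_address, instance_name, display=0):
    """Build the value passed to kvm's -vnc option (sketch only)."""
    if is_ipv4(bind_address):
        return "%s:%d" % (bind_address, display)
    if os.path.isdir(bind_address):
        # Directory: bind to a per-instance socket file inside it.
        # The ".vnc" suffix is an assumed naming scheme.
        socket_path = os.path.join(bind_address, instance_name + ".vnc")
        return "unix:%s" % socket_path
    # Anything else is treated as the socket file path itself.
    return "unix:%s" % bind_address
```

The later TLS/X509 commit narrows the path case to directories only, so
in that version the last branch would become an error instead.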