ganeti-local
14 years agoAdd check for OpenSSL entropy status
Michael Hanselmann [Tue, 24 Nov 2009 14:55:03 +0000 (15:55 +0100)]
Add check for OpenSSL entropy status

By checking for this explicitly, the errors (SSLEAY_RAND_BYTES, “PRNG
not seeded”) will happen in the start-up phase of the daemon and not
only when executing remote procedure calls.
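
As a rough illustration of such a start-up check, the sketch below uses
ssl.RAND_status() from the Python standard library. This is an assumption
made for the example: the commit message does not say which OpenSSL binding
the daemon uses, and check_prng_seeded is a hypothetical name.

```python
import ssl


def check_prng_seeded():
    """Fail early if OpenSSL's PRNG has not been seeded (hypothetical helper)."""
    # ssl.RAND_status() returns True once the PRNG has enough entropy.
    if not ssl.RAND_status():
        raise RuntimeError("OpenSSL PRNG is not seeded with enough entropy")
```

Calling such a function before the daemon starts serving requests moves the
failure to start-up time instead of the first remote procedure call.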

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoA couple of doc updates
Iustin Pop [Tue, 24 Nov 2009 09:57:35 +0000 (10:57 +0100)]
A couple of doc updates

Clarify the fact that temporary HV/BE params in instance start override
and do not extend the configured parameters; and change the instance
list headers from HVM_* to * since many of the parameters apply to KVM
too. Also fix a typo in the rapi documentation for '/2/nodes'.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoHandle EEXIST in utils.RenameFile
Michael Hanselmann [Thu, 19 Nov 2009 15:13:12 +0000 (16:13 +0100)]
Handle EEXIST in utils.RenameFile

This should fix an issue I've seen exactly once during testing. It might have
been caused by parallel RPC calls to archive jobs.

[…] ganeti-noded:112 ERROR Error in RPC call […]
 File "/usr/lib/python2.4/site-packages/ganeti/backend.py", line 2365, in JobQueueRename
   utils.RenameFile(old, new, mkdir=True)
 File "/usr/lib/python2.4/site-packages/ganeti/utils.py", line 322, in RenameFile
   os.makedirs(os.path.dirname(new), mkdir_mode)
 File "/usr/lib/python2.4/os.py", line 159, in makedirs
   mkdir(name, mode)
OSError: [Errno 17] File exists: '/var/lib/ganeti/queue/archive/0'
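
The traceback above can be avoided by tolerating EEXIST when creating the
target directory. The following is a minimal sketch of that idea, not the
actual utils.RenameFile code; rename_file and the demo paths are made up for
illustration.

```python
import errno
import os
import tempfile


def rename_file(old, new, mkdir=False, mkdir_mode=0o750):
    """Rename old to new, creating the target directory if requested."""
    try:
        os.rename(old, new)
        return
    except OSError as err:
        # Only a missing target directory is handled here.
        if not (mkdir and err.errno == errno.ENOENT):
            raise
    try:
        os.makedirs(os.path.dirname(new), mkdir_mode)
    except OSError as err:
        # A parallel caller may have created the directory in the meantime.
        if err.errno != errno.EEXIST:
            raise
    os.rename(old, new)


# Demo in a scratch directory (paths are invented for this example).
base = tempfile.mkdtemp()
src = os.path.join(base, "job-1")
open(src, "w").close()
dst = os.path.join(base, "archive", "0", "job-1")
rename_file(src, dst, mkdir=True)
```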

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove unused parameter “unlock” from cmdlib._WaitForSync
Michael Hanselmann [Thu, 19 Nov 2009 16:30:03 +0000 (17:30 +0100)]
Remove unused parameter “unlock” from cmdlib._WaitForSync

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix off-by-one error when modifying instance NIC
Michael Hanselmann [Mon, 16 Nov 2009 14:52:53 +0000 (15:52 +0100)]
Fix off-by-one error when modifying instance NIC

For an instance with exactly one NIC:

$ gnt-instance modify --net 1:ip=1.2.3.4 inst1
Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Invalid NIC index 1, valid values are 0 to 1

For an instance with no NIC at all, it fails with “Invalid NIC index 0, valid
values are 0 to 0”. This is fixed by this patch.
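
The intended bounds check can be sketched as follows. This is only an
illustration of the corrected range, assuming an index that must refer to an
existing NIC; check_nic_index and the wording of the no-NIC error are
hypothetical.

```python
def check_nic_index(nic_index, nic_count):
    """Validate an index that must refer to an existing NIC."""
    if nic_count == 0:
        raise ValueError("Instance has no NICs, cannot modify")
    # Valid indices are 0 .. nic_count - 1, so the upper bound is exclusive.
    if not 0 <= nic_index < nic_count:
        raise ValueError("Invalid NIC index %d, valid values are 0 to %d" %
                         (nic_index, nic_count - 1))
    return nic_index
```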

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRe-add check for duplicate instance IP
Michael Hanselmann [Mon, 16 Nov 2009 13:51:29 +0000 (14:51 +0100)]
Re-add check for duplicate instance IP

This was originally implemented in 0ce8f948 and partially
rolled back in 9b65e0d4. Apart from re-adding the check,
this patch does some housekeeping by renaming the “_helper”
function to “_AddIpAddress”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix gnt-instance list documentation
Guido Trotter [Mon, 16 Nov 2009 11:02:13 +0000 (11:02 +0000)]
Fix gnt-instance list documentation

(1) Both the man page and the online help report the link and mode
fields, which are in the code called nic_link and nic_mode.
(2) Add missing fields to the online help.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoconfig: Style fixes
Michael Hanselmann [Fri, 13 Nov 2009 12:53:06 +0000 (13:53 +0100)]
config: Style fixes

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd packaging notes to documentation
Michael Hanselmann [Thu, 12 Nov 2009 16:38:40 +0000 (17:38 +0100)]
Add packaging notes to documentation

This includes a few paragraphs about daemon-util.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix epydoc error
Michael Hanselmann [Thu, 12 Nov 2009 15:03:35 +0000 (16:03 +0100)]
Fix epydoc error

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agosphinx: Treat warnings as errors
Michael Hanselmann [Thu, 12 Nov 2009 14:40:43 +0000 (15:40 +0100)]
sphinx: Treat warnings as errors

This makes it easier to catch warnings.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoInclude INSTALL in documentation
Michael Hanselmann [Thu, 12 Nov 2009 14:40:05 +0000 (15:40 +0100)]
Include INSTALL in documentation

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoConvert INSTALL to RST
Michael Hanselmann [Wed, 11 Nov 2009 17:10:23 +0000 (18:10 +0100)]
Convert INSTALL to RST

This is in preparation for including it in the large
documentation.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix change of cluster nic parameters
Guido Trotter [Thu, 12 Nov 2009 16:35:14 +0000 (16:35 +0000)]
Fix change of cluster nic parameters

To stay on the safe side, we check for errors in all instances, and
refuse to act, reporting on the errors we found, if there are any
problems.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoNIC.CheckParameterSyntax: fix bridged check
Guido Trotter [Thu, 12 Nov 2009 17:08:25 +0000 (17:08 +0000)]
NIC.CheckParameterSyntax: fix bridged check

We should check that the strings are equal ("==") rather than that they
point to the same memory location ("is"); otherwise we skip the actual check.
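
A small example of why the comparison must be by value. The constant name and
its value are assumptions made for this sketch, and needs_link is a
hypothetical helper.

```python
NIC_MODE_BRIDGED = "bridged"  # assumed value, for illustration only


def needs_link(mode):
    """Return True if the NIC mode is bridged."""
    # "==" compares the string contents. "is" would compare object identity,
    # which is not guaranteed for equal strings built at run time.
    return mode == NIC_MODE_BRIDGED


# A string with the same contents that was built at run time, as it would be
# when loaded from a configuration file.
mode_from_config = "".join(["bri", "dged"])
```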

Signed-off-by: Guido Trotter <ultrotter@google.com>

14 years agoFix mispopulation of nic parameters at nic modify
Guido Trotter [Thu, 12 Nov 2009 15:44:51 +0000 (15:44 +0000)]
Fix mispopulation of nic parameters at nic modify

There's a bug in Ganeti 2.1 rc0 that makes nic parameters be populated
from the "filled in" dict, even if we're not changing any values in
them. This patch fixes the problem, by populating them from the correct
(unfilled) dict.
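
The difference between the two dicts can be sketched like this. All names and
values are hypothetical; the point is only that changes are applied to the
stored (unfilled) parameters, so cluster defaults are not copied into the
NIC.

```python
def fill_dict(defaults, custom):
    """Return defaults overridden by custom values (the "filled" dict)."""
    filled = dict(defaults)
    filled.update(custom)
    return filled


def modify_nic(stored_params, changes):
    """Apply changes to the stored (unfilled) parameters only."""
    updated = dict(stored_params)
    updated.update(changes)
    return updated


cluster_defaults = {"mode": "bridged", "link": "xen-br0"}
nic_params = {"link": "br1"}  # only explicit overrides are stored

# A modification that changes nothing must leave the stored dict as it was.
new_params = modify_nic(nic_params, {})
```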

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump version to 2.1.0~rc0 v2.1.0rc0
Michael Hanselmann [Wed, 11 Nov 2009 15:01:40 +0000 (16:01 +0100)]
Bump version to 2.1.0~rc0

Also add one item to NEWS.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate RAPI documentation on job results
Iustin Pop [Wed, 11 Nov 2009 13:55:21 +0000 (14:55 +0100)]
Update RAPI documentation on job results

This documents the new error classifier added for OpPrereqError.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRevert "Backport AC_PATH_PROGS_FEATURE_CHECK"
Michael Hanselmann [Wed, 11 Nov 2009 13:02:16 +0000 (14:02 +0100)]
Revert "Backport AC_PATH_PROGS_FEATURE_CHECK"

This reverts commit 52b699ecaa688a2aaac00fa64558e249d0bc9a26.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix and simplify socat escape detection
Michael Hanselmann [Wed, 11 Nov 2009 13:00:17 +0000 (14:00 +0100)]
Fix and simplify socat escape detection

- Program paths should not be --with-… options (see
  Autoconf docs)
- Simplify checks for escape functionality
- Make SOCAT_USE_ESCAPE variable a bool

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUse “daemon-util” to reload SSH keys
Michael Hanselmann [Thu, 5 Nov 2009 11:21:48 +0000 (12:21 +0100)]
Use “daemon-util” to reload SSH keys

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoKVMHypervisor: fix broken error format string
Guido Trotter [Sun, 8 Nov 2009 13:17:11 +0000 (13:17 +0000)]
KVMHypervisor: fix broken error format string

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoConfigWriter: simplify GenerateDRBDSecret
Guido Trotter [Tue, 27 Oct 2009 21:19:11 +0000 (17:19 -0400)]
ConfigWriter: simplify GenerateDRBDSecret

We can do this by adding a new TemporaryReservationManager

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoConfigWriter: move _temporary_macs to reservation
Guido Trotter [Tue, 27 Oct 2009 20:08:19 +0000 (16:08 -0400)]
ConfigWriter: move _temporary_macs to reservation

This solves the race conditions in mac reservation, as macs are actually
reserved, under the current ec id.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoConfigWriter: move _temporary_ids to reservation
Guido Trotter [Tue, 27 Oct 2009 19:27:44 +0000 (15:27 -0400)]
ConfigWriter: move _temporary_ids to reservation

In order to do this we need to pass a job id when reserving a resource.
We have one during _EnsureUUIDs because we passed it in from AddNode and
AddInstance. During config upgrade we use a fake job ID which we then
clean up. We can delete the _CleanupTemporaryIDs code, since the cleanup
is going to be done at job finish time by mcpu.
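
The reservation scheme described in this series can be sketched roughly as
below. The class and exception names come from the commit titles, but the
method names and internals are assumptions made for this example, not the
actual implementation.

```python
class ReservationError(Exception):
    """Raised when a resource is already reserved."""


class TemporaryReservationManager:
    """Tracks resources reserved per execution context (ec) id."""

    def __init__(self):
        self._ec_reserved = {}

    def reserved(self, resource):
        """Return True if any execution context holds the resource."""
        return any(resource in held for held in self._ec_reserved.values())

    def reserve(self, ec_id, resource):
        """Reserve a resource under the given ec id."""
        if self.reserved(resource):
            raise ReservationError("Duplicate reservation for %r" % (resource,))
        self._ec_reserved.setdefault(ec_id, set()).add(resource)

    def drop_ec_reservations(self, ec_id):
        """Drop everything held by one ec id, e.g. when its job finishes."""
        self._ec_reserved.pop(ec_id, None)
```

Keying reservations by execution context is what allows the cleanup to happen
in one place once the job is done.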

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoTemporaryReservationManager
Guido Trotter [Tue, 27 Oct 2009 19:42:37 +0000 (15:42 -0400)]
TemporaryReservationManager

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoconfig.Add{Node,Instance}: get the ec id
Guido Trotter [Tue, 27 Oct 2009 18:09:07 +0000 (14:09 -0400)]
config.Add{Node,Instance}: get the ec id

This is ok because adding a node or instance cannot happen in a query.

We get the ec id from the LU and pass it to _EnsureUUID, which does not
use it for now.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd config.DropECReservations
Guido Trotter [Tue, 27 Oct 2009 17:53:03 +0000 (13:53 -0400)]
Add config.DropECReservations

For now this function does nothing, but it gets called by mcpu when the
execution of an LU is done, making sure any pending reservations are
dropped.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoProcessor: support a unique execution id
Guido Trotter [Tue, 27 Oct 2009 15:47:49 +0000 (11:47 -0400)]
Processor: support a unique execution id

When the processor is executing a job, it can export the execution id to
its callers. This is not supported for Queries, as they're not executed
in a job.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove exceptions list from GenerateUniqueID
Guido Trotter [Tue, 27 Oct 2009 17:25:55 +0000 (13:25 -0400)]
Remove exceptions list from GenerateUniqueID

It's not used anywhere, so it's dead code.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd errors.ReservationError
Guido Trotter [Tue, 27 Oct 2009 21:17:36 +0000 (17:17 -0400)]
Add errors.ReservationError

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix pylint 'E' (error) codes
Iustin Pop [Wed, 4 Nov 2009 13:09:53 +0000 (14:09 +0100)]
Fix pylint 'E' (error) codes

This patch adds some silences and tweaks the code slightly so that
“pylint --rcfile pylintrc -e ganeti” doesn't give any errors.

The biggest change is in jqueue.py, the move of _RequireOpenQueue out of
the JobQueue class. Since that is actually a function and not a method
(never used as such) this makes sense, and also silences two pylint
errors.

Another real code change is in utils.py, where FieldSet.Matches will
return None instead of False for failure; this still works with the way
this class/method is used, and makes more sense (it resembles more
closely the re.match return values).
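
A simplified sketch of a matcher that follows the re.match convention (a
match object on success, None on failure). The class below is written for
this example and is not the actual utils.FieldSet code.

```python
import re


class FieldSet:
    """Matches field names against a set of anchored patterns."""

    def __init__(self, *patterns):
        self._compiled = [re.compile("^%s$" % pat) for pat in patterns]

    def matches(self, field):
        """Return the first match object, or None if nothing matches."""
        for regex in self._compiled:
            match = regex.match(field)
            if match:
                return match
        return None
```

Callers that only test the result for truth keep working, since None is
falsy just like False.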

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoA few more small documentation updates
Iustin Pop [Fri, 6 Nov 2009 12:21:51 +0000 (13:21 +0100)]
A few more small documentation updates

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove obsolete statement in autogen.sh
Iustin Pop [Fri, 6 Nov 2009 12:53:31 +0000 (13:53 +0100)]
Remove obsolete statement in autogen.sh

Nowadays we have actual files (tracked by VCS) in autotools/, so we know
the directory exists.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd use_localtime parameter for xen-hvm and kvm
Jun Futagawa [Thu, 5 Nov 2009 04:22:00 +0000 (13:22 +0900)]
Add use_localtime parameter for xen-hvm and kvm

Currently xen-hvm and kvm use different real time clock settings by
default. To reduce confusion, this patch adds an optional use_localtime
parameter.

If the real time clock on the instance is set to local time, the
parameter use_localtime should be True. The default is False. Note that
the real time clock changes from local to UTC in xen-hvm with the
default parameter.

Signed-off-by: Jun Futagawa <jfut@integ.jp>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoDocumentation updates for the global hvparams
Iustin Pop [Wed, 4 Nov 2009 17:06:48 +0000 (18:06 +0100)]
Documentation updates for the global hvparams

This patch does multiple documentation updates for the new framework,
all pretty straightforward.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove the KVM_MIGRATION_PORT configure.ac param
Iustin Pop [Wed, 4 Nov 2009 16:49:39 +0000 (17:49 +0100)]
Remove the KVM_MIGRATION_PORT configure.ac param

Since this is easily configurable at run-time, we remove the
configure-time parameter. If anyone is building custom packages, then
the default can be tweaked by a one-line patch to constants.py.

Note that this also fixes the type of the parameter: the default from
_autoconf.py is a string. This shouldn't matter unless a cluster ran
code between commit 78411c6 and this one.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoIntroduce 'global hypervisor parameters' support
Iustin Pop [Wed, 4 Nov 2009 16:41:04 +0000 (17:41 +0100)]
Introduce 'global hypervisor parameters' support

This patch adds support for global hypervisor parameters in instance
creation, instance modification, instance query and at instance load
time.

We basically prevent any query on these parameters, discard them at load
time, and do not allow their modification. Together, this should make
any such existing parameters go away and prevent new ones from being added.
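
The two behaviours (discard at load time, refuse on modification) might look
roughly like this. The set of global parameter names and both helper
functions are hypothetical and only illustrate the idea.

```python
# Hypothetical set of parameters that may only be set cluster-wide.
GLOBAL_HVPARAMS = frozenset(["migration_port"])


def strip_globals(hvparams):
    """Drop global parameters from instance-level settings (load time)."""
    return {key: value for key, value in hvparams.items()
            if key not in GLOBAL_HVPARAMS}


def check_no_globals(hvparams):
    """Refuse instance-level changes to global parameters."""
    used = sorted(GLOBAL_HVPARAMS.intersection(hvparams))
    if used:
        raise ValueError("Global hypervisor parameters cannot be customized"
                         " at instance level: %s" % ", ".join(used))
```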

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix the init script
Iustin Pop [Fri, 6 Nov 2009 10:55:49 +0000 (11:55 +0100)]
Fix the init script

The rewrite after the introduction of the daemon-util script has a
copy-paste error.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agognt-*: Print better error message for uninitialized cluster
Michael Hanselmann [Thu, 5 Nov 2009 13:30:33 +0000 (14:30 +0100)]
gnt-*: Print better error message for uninitialized cluster

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoCache JSON encoders and sort keys
Michael Hanselmann [Thu, 5 Nov 2009 12:31:21 +0000 (13:31 +0100)]
Cache JSON encoders and sort keys

The sort_keys argument is supported since simplejson 1.3.
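
For illustration, a cached encoder with sorted keys could look like the
sketch below. It uses the standard library json module, whose JSONEncoder
accepts the same sort_keys argument; dump_json is a hypothetical name.

```python
import json

# Create the encoder once and reuse it for every call.
_ENCODER = json.JSONEncoder(sort_keys=True)


def dump_json(data):
    """Serialize data with keys in sorted order."""
    return _ENCODER.encode(data)
```

Sorted keys make the output deterministic, which helps when comparing or
diffing serialized data.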

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd new “daemon-util” script to start/stop Ganeti daemons
Michael Hanselmann [Tue, 3 Nov 2009 13:42:51 +0000 (14:42 +0100)]
Add new “daemon-util” script to start/stop Ganeti daemons

Until now, Ganeti started and stopped its own daemons using custom functions.
To start a daemon, it was simply executed; to stop it again, it was sent the
appropriate signal. Init scripts would have to pay attention to the PID file
and other things.

With this patch, a new script is added (“daemon-util”, installed in
$prefix/lib/ganeti/), centralizing the starting and stopping of daemons. The
provided example init script is adjusted to use this new script. Ganeti's code
no longer calls its own init script.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agokvm console: use socat raw mode with escape
Guido Trotter [Thu, 29 Oct 2009 20:58:50 +0000 (16:58 -0400)]
kvm console: use socat raw mode with escape

If this is enabled at configure time, we pass in different parameters to
the socat console, making it a lot more manageable.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoconfigure: check for socat and its escape feature
Guido Trotter [Sat, 31 Oct 2009 21:38:08 +0000 (17:38 -0400)]
configure: check for socat and its escape feature

Currently we use a static value for the socat path, or we trust the
user-provided one. With this patch we still trust any user provided
value, but if none is passed we check for socat on the machine we're
being configured on. This allows us also to check if we can or cannot
use the escape= feature in socat.

If the user has forced the path in, he can also pass --with-socat-escape
in order to force the escape functionality to be used, even if a check
is not done.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoBackport AC_PATH_PROGS_FEATURE_CHECK
Guido Trotter [Tue, 3 Nov 2009 14:37:51 +0000 (09:37 -0500)]
Backport AC_PATH_PROGS_FEATURE_CHECK

In order to allow working with older versions of autoconf we backport
this macro, but only if it's not defined already (by autoconf itself).

This commit can be reverted after we decide support for autoconf 2.61
and below should be deprecated.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMigration: add check for listening target
Iustin Pop [Wed, 4 Nov 2009 13:21:55 +0000 (14:21 +0100)]
Migration: add check for listening target

This patch adds a check for listening on the remote port in Xen and KVM
migrations. This will generate a single “load of migration failed”
message for KVM, but will not otherwise prevent the migration. For Xen (which
has a dedicated, always listening daemon) this should not create any
problems.
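
One simple way to test whether something is listening on a port is to attempt
a TCP connection. The sketch below shows that approach under the assumption
that a plain connect is sufficient; the commit message does not describe how
the check is implemented, and is_port_listening is a hypothetical name.

```python
import socket


def is_port_listening(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
    sock.close()
    return True
```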

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoTLMigrateInstance: add error messages during Exec
Iustin Pop [Wed, 4 Nov 2009 12:08:43 +0000 (13:08 +0100)]
TLMigrateInstance: add error messages during Exec

Currently the migration of an instance doesn't show any error until the
end. We add two messages that show the progress better:

node1# gnt-instance migrate -f instance5
Wed Nov  4 04:04:34 2009 Migrating instance instance5
Wed Nov  4 04:04:34 2009 * checking disk consistency between source and target
Wed Nov  4 04:04:35 2009 * switching node node3 to secondary mode
Wed Nov  4 04:04:35 2009 * changing into standalone mode
Wed Nov  4 04:04:35 2009 * changing disks into dual-master mode
Wed Nov  4 04:04:40 2009 * wait until resync is done
Wed Nov  4 04:04:41 2009 * preparing node3 to accept the instance
Wed Nov  4 04:04:41 2009 * migrating instance to node3
Wed Nov  4 04:04:51 2009 Migration failed, aborting
Wed Nov  4 04:04:51 2009 * switching node node3 to secondary mode
Wed Nov  4 04:04:51 2009 * changing into standalone mode
Wed Nov  4 04:04:51 2009 * changing disks into single-master mode
Wed Nov  4 04:04:57 2009 * wait until resync is done
Failure: command execution error:
Could not migrate instance instance5: Failed to migrate instance: Failed
to migrate instance instance5: …

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agohypervisors: switch to using HV_MIGRATION_PORT
Iustin Pop [Wed, 4 Nov 2009 11:48:10 +0000 (12:48 +0100)]
hypervisors: switch to using HV_MIGRATION_PORT

This changes KVM to use HV_MIGRATION_PORT instead of KVM_MIGRATION_PORT
and enables passing the port for Xen migrations.

Since KVM_MIGRATION_PORT is not used anymore, we stop exporting it from
constants.py.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoIntroduce HV_MIGRATION_PORT hypervisor parameter
Iustin Pop [Wed, 4 Nov 2009 11:29:15 +0000 (12:29 +0100)]
Introduce HV_MIGRATION_PORT hypervisor parameter

This parameter will replace the direct use of KVM_MIGRATION_PORT and the
implicit use of the Xen migration port.

While it doesn't make sense to change this at instance level, we don't
have any other infrastructure for cluster-wide hypervisor parameters, so
we add it here (and document that it usually shouldn't be changed on a
per-instance basis).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agohypervisors: change MigrateInstance API
Iustin Pop [Wed, 4 Nov 2009 10:10:32 +0000 (11:10 +0100)]
hypervisors: change MigrateInstance API

Currently the $hypervisor.MigrateInstance takes the instance name. This
patch changes it to take the instance object, such that other instance
properties (especially hvparams) are available to it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRevert the instance IP conflicts
Iustin Pop [Wed, 4 Nov 2009 13:14:53 +0000 (14:14 +0100)]
Revert the instance IP conflicts

Since instances can live in different VLANs from nodes (especially in
routed mode), based on the 'link' parameter, we shouldn't always
restrict having duplicate IPs. Thus we only check the node IPs/cluster
IP for now.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate gitignore rules
Iustin Pop [Wed, 4 Nov 2009 12:01:49 +0000 (13:01 +0100)]
Update gitignore rules

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>

14 years agoIntroduce a wrapper for hostname resolving
Iustin Pop [Wed, 4 Nov 2009 09:16:03 +0000 (10:16 +0100)]
Introduce a wrapper for hostname resolving

Currently a few of the LU's CheckPrereq use utils.HostInfo which raises
a resolver error in case of failure. This is an exception from the
standard that CheckPrereq should raise an OpPrereqError if the error is
in the 'pre' phase (so that it can be retried).

This patch adds a new error code (resolver_error) and a wrapper over
utils.HostInfo that just converts the ResolverError into
OpPrereqError(…, errors.ECODE_RESOLVER). It then uses this wrapper in
cmdlib, bootstrap and some scripts.
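
The wrapper pattern can be sketched as follows. The error code string and
exception names follow the commit message, but the classes, get_host_info and
fake_resolver are stand-ins written for this example (the resolver is passed
in so the sketch does not depend on DNS).

```python
ECODE_RESOLVER = "resolver_error"


class ResolverError(Exception):
    """Stand-in for the error raised by the resolver."""


class OpPrereqError(Exception):
    """Stand-in for the prerequisite error; args are (message, error code)."""


def get_host_info(name, resolver):
    """Resolve name, converting resolver failures into OpPrereqError."""
    try:
        return resolver(name)
    except ResolverError as err:
        raise OpPrereqError("The given name (%s) does not resolve: %s" %
                            (name, err), ECODE_RESOLVER)


def fake_resolver(name):
    """Toy resolver with a single known host (documentation address)."""
    known = {"node1.example.com": "192.0.2.10"}
    if name not in known:
        raise ResolverError("unknown host")
    return known[name]
```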

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd a configuration verify check for duplicate IPs
Iustin Pop [Wed, 4 Nov 2009 09:09:33 +0000 (10:09 +0100)]
Add a configuration verify check for duplicate IPs

This patch adds a check that the cluster IP, the nodes' primary (and
secondary, if enabled) IP(s) and the instances' NIC IPs are unique in the
cluster.
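
A uniqueness check of this kind can be sketched by collecting the owners of
each address and reporting addresses with more than one owner. The data
layout and find_duplicate_ips are assumptions made for this example; in
particular, a secondary IP equal to the primary one is treated here as
"secondary not enabled".

```python
from collections import defaultdict


def find_duplicate_ips(cluster_ip, nodes, instances):
    """Return a dict mapping each duplicated IP to the list of its owners.

    nodes maps node name to (primary_ip, secondary_ip or None);
    instances maps instance name to a list of NIC IPs (None if unset).
    """
    owners = defaultdict(list)
    owners[cluster_ip].append("cluster IP")
    for name, (primary, secondary) in nodes.items():
        owners[primary].append("node %s, primary" % name)
        if secondary and secondary != primary:
            owners[secondary].append("node %s, secondary" % name)
    for name, nic_ips in instances.items():
        for idx, ip in enumerate(nic_ips):
            if ip:
                owners[ip].append("instance %s, nic %d" % (name, idx))
    return {ip: who for ip, who in owners.items() if len(who) > 1}
```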

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoWorkaround fake failures in drbd+live migration
Iustin Pop [Tue, 3 Nov 2009 15:14:07 +0000 (16:14 +0100)]
Workaround fake failures in drbd+live migration

This patch is an attempt to fix the ugly issue during migration:
  Cannot resync disks on node …: [True, 100]

If my understanding is correct, sometimes we poll the /proc/drbd file at
an inopportune moment, while it's being updated, or while the DRBD device
is changing state, and we see an unexpected state.

Based on the assumption that this is just a transient state, rather than
aborting directly, we change the backend.DrbdWaitSync() function to
retry the operation a few times, giving DRBD a chance to settle down at
the end of the resync.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAnother round of pylint-related style fixes
Iustin Pop [Tue, 3 Nov 2009 13:58:38 +0000 (14:58 +0100)]
Another round of pylint-related style fixes

A newer version of pylint, more warnings…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRevert "configure: check for socat and its escape feature"
Iustin Pop [Tue, 3 Nov 2009 13:42:43 +0000 (14:42 +0100)]
Revert "configure: check for socat and its escape feature"

This reverts commit 37fc2cf5ba8919cef407199ee540aad4b1a9a2b6, since it
introduces configure.ac changes that depend on very very new autoconf
macros that are not present in current stable distros (and it was not
advertised as such).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRevert "kvm console: use socat raw mode with escape"
Iustin Pop [Tue, 3 Nov 2009 13:42:12 +0000 (14:42 +0100)]
Revert "kvm console: use socat raw mode with escape"

This reverts commit ce0eb6694e3fb2510035501539c7acc92a0f174e, since it depends
on 37fc2cf5ba8919cef407199ee540aad4b1a9a2b6 which will be reverted too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoChange behaviour of ConfigWriter._WriteConfig
Iustin Pop [Tue, 3 Nov 2009 08:20:35 +0000 (09:20 +0100)]
Change behaviour of ConfigWriter._WriteConfig

This patch changes the behaviour of _WriteConfig in case of
configuration errors:

- before, it used to abort the saving (even though the in-memory
  configuration used by current jobs has already changed)
- now, we log it (both to the log and to the user) but continue, since
  we can't revert to a good version of the config anyway

This should make the internal behaviour of the code more consistent with
the external world, even though the config might be “wrong”; we leave
the cleanup to the user. This should not be as bad as it sounds, since
we haven't actually seen this case except for the ugly master candidates
handling, and that was fixed recently by Guido's patch series.
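
The new behaviour amounts to "report, then save anyway". A minimal sketch,
with write_config, verify, save and feedback_fn all being hypothetical names
and callables introduced for this example:

```python
import logging


def write_config(config, verify, save, feedback_fn=None):
    """Save config even if verification finds problems, but report them."""
    errors = verify(config)
    if errors:
        msg = "Configuration data is not consistent: %s" % ", ".join(errors)
        # Report to the log and, if possible, to the user ...
        logging.error(msg)
        if feedback_fn is not None:
            feedback_fn(msg)
    # ... but continue, because the in-memory state has already changed.
    save(config)
```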

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd an example script for backing up the config
Iustin Pop [Tue, 3 Nov 2009 12:54:06 +0000 (13:54 +0100)]
Add an example script for backing up the config

This requires git and lockfile-progs, and only backs up config.data (see
the comments why).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agokvm console: use socat raw mode with escape
Guido Trotter [Thu, 29 Oct 2009 20:58:50 +0000 (16:58 -0400)]
kvm console: use socat raw mode with escape

If this is enabled at configure time, we pass in different parameters to
the socat console, making it a lot more manageable.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoconfigure: check for socat and its escape feature
Guido Trotter [Sat, 31 Oct 2009 21:38:08 +0000 (17:38 -0400)]
configure: check for socat and its escape feature

Currently we use a static value for the socat path, or we trust the
user-provided one. With this patch we still trust any user provided
value, but if none is passed we check for socat on the machine we're
being configured on. This allows us also to check if we can or cannot
use the escape= feature in socat.

If the user has forced the path in, he can also pass --with-socat-escape
in order to force the escape functionality to be used, even if a check
is not done.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix version number in README
Guido Trotter [Mon, 2 Nov 2009 16:24:48 +0000 (11:24 -0500)]
Fix version number in README

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoutils: Convert to utils.Retry
Michael Hanselmann [Tue, 3 Nov 2009 13:02:26 +0000 (14:02 +0100)]
utils: Convert to utils.Retry

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoImplement cluster verify checks for wrong PV names
Iustin Pop [Tue, 3 Nov 2009 10:24:31 +0000 (11:24 +0100)]
Implement cluster verify checks for wrong PV names

Since ':' is not a valid character in PV names (for the way Ganeti uses
LVM), we need to check this and warn the user. This patch adds a new
NV_PVLIST cluster verify check and verifies the PV names returned from
the nodes.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoChange bdev.LogicalVolume.GetPVInfo usage
Iustin Pop [Tue, 3 Nov 2009 10:21:19 +0000 (11:21 +0100)]
Change bdev.LogicalVolume.GetPVInfo usage

We will need to selectively enumerate the PVs of (possibly) many VGs and
not only the allocatable ones. For this we make the VG selection and the
allocatable filtering optional. The two callers are modified for this
new calling syntax.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoThrow specific error when ':' exists in PV names
Iustin Pop [Tue, 3 Nov 2009 10:03:19 +0000 (11:03 +0100)]
Throw specific error when ':' exists in PV names

While ':' is not actually a supported character in PV names (it has a
special meaning for commands like lvcreate), we should throw specific
errors for this case instead of generic “Can't create LV”.

This patch does two things:

- modifies the separator used when listing PVs to be '|' such that we
  can actually parse ':' as part of PV names
- check if any of the discovered PVs have ':' in their name when
  creating LVs, and if so throw a specific error
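
The two steps above can be sketched as follows. The field layout (name, VG,
size, free) and the sample output are invented for this example and do not
reproduce real LVM output; parse_pv_list and find_bad_pv_names are
hypothetical names.

```python
def parse_pv_list(output, sep="|"):
    """Parse separator-delimited PV listing lines into tuples."""
    pvs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(sep)
        if len(fields) != 4:
            raise ValueError("Can't parse PV line: %r" % line)
        name, vg_name, size, free = fields
        pvs.append((name, vg_name, float(size), float(free)))
    return pvs


def find_bad_pv_names(pvs):
    """Return the PV names that contain ':'."""
    return [name for (name, _, _, _) in pvs if ":" in name]


# Invented sample; with '|' as separator a ':' in the name parses cleanly.
sample = ("/dev/sda3|xenvg|102400.00|51200.00\n"
          "/dev/disk/by-path/pci-0000:00:1f.2|xenvg|2048.00|2048.00\n")
pvs = parse_pv_list(sample)
```

With the bad names identified up front, the caller can raise a specific
error instead of letting LV creation fail with a generic message.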

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue: Convert to utils.Retry
Michael Hanselmann [Fri, 30 Oct 2009 16:40:46 +0000 (17:40 +0100)]
jqueue: Convert to utils.Retry

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agohv_xen: Convert to utils.Retry
Michael Hanselmann [Fri, 30 Oct 2009 16:39:46 +0000 (17:39 +0100)]
hv_xen: Convert to utils.Retry

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobootstrap: Convert to utils.Retry
Michael Hanselmann [Fri, 30 Oct 2009 16:38:38 +0000 (17:38 +0100)]
bootstrap: Convert to utils.Retry

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobdev: Convert to utils.Retry
Michael Hanselmann [Fri, 30 Oct 2009 16:38:24 +0000 (17:38 +0100)]
bdev: Convert to utils.Retry

Also replaces a hardcoded limit of 15 seconds with 1/4
of NET_RECONFIG_TIMEOUT.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobackend: Convert to utils.Retry
Michael Hanselmann [Fri, 30 Oct 2009 16:37:38 +0000 (17:37 +0100)]
backend: Convert to utils.Retry

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd generic retry loop function
Michael Hanselmann [Fri, 30 Oct 2009 16:36:59 +0000 (17:36 +0100)]
Add generic retry loop function

There are quite a few retry loops with timeouts in Ganeti's
code. Duplicating code is not good, so this patch introduces
a new function named “utils.Retry” to remedy this situation.
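
A generic retry loop of this kind might look roughly like the sketch
below. It is not the implementation added by this patch; the names
retry, RetryAgain and RetryTimeout, the argument order and the injected
clock/sleep functions are assumptions chosen for the example.

```python
import time


class RetryAgain(Exception):
    """Raised by the callback to ask for another attempt (hypothetical)."""


class RetryTimeout(Exception):
    """Raised when the timeout expires before the callback succeeds."""


def retry(fn, delay, timeout, args=(),
          _time=time.monotonic, _sleep=time.sleep):
    """Call fn(*args) until it returns or the timeout expires.

    fn signals "not ready yet" by raising RetryAgain; any other
    exception propagates to the caller unchanged.
    """
    deadline = _time() + timeout
    while True:
        try:
            return fn(*args)
        except RetryAgain:
            pass
        remaining = deadline - _time()
        if remaining <= 0:
            raise RetryTimeout()
        # Never sleep past the deadline
        _sleep(min(delay, remaining))
```

Callers then only supply the check itself, and the timeout handling
lives in one place instead of being duplicated in every loop.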

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoIgnore log messages in unittests
Michael Hanselmann [Tue, 3 Nov 2009 10:29:30 +0000 (11:29 +0100)]
Ignore log messages in unittests

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoSome improvements to gnt-node repair-storage
Iustin Pop [Tue, 27 Oct 2009 07:54:24 +0000 (16:54 +0900)]
Some improvements to gnt-node repair-storage

Currently the repair storage has two issues:

- down instances abort the operation, even though they should be
  ignored (it's not technically possible to know their disk status
  unless we activate their disks)
- if the VG is so broken that disks cannot be activated via gnt-instance
  activate-disks or gnt-instance startup, it's not possible to repair
  the VG at all

The patch makes the opcode skip down instances and also introduces an
``--ignore-consistency`` flag for forcing the execution of the LU.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoConvert the rest of the OpPrereqError users
Iustin Pop [Tue, 27 Oct 2009 05:55:15 +0000 (14:55 +0900)]
Convert the rest of the OpPrereqError users

This finishes the conversion of OpPrereqError creation to two-argument
style. Any remaining one-argument users do not break anything; they
just lose information about the errors.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd ecode to rpc.py's RpcResult.Raise()
Iustin Pop [Tue, 27 Oct 2009 05:27:46 +0000 (14:27 +0900)]
Add ecode to rpc.py's RpcResult.Raise()

This patch adds a new ecode argument to RpcResult.Raise(). This allows
specifying the error code (for both OpExec and OpPrereq errors).

Note that this patch also makes the OpExecError exceptions raised from
_FindFaultyInstanceDisks have the error code classification.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoIntroduce two-argument style for OpPrereqError
Iustin Pop [Tue, 27 Oct 2009 05:15:53 +0000 (14:15 +0900)]
Introduce two-argument style for OpPrereqError

This patch introduces a two-argument style for OpPrereqError. Only the
direct raise calls in cmdlib.py are converted, other users will follow.

cli.py is modified to handle both two-argument style and the current
format. RAPI doesn't need modification as the way we encode errors is
already using a list for the error arguments, so RAPI users only need to
start checking the list length and the second argument.
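
As a rough illustration of handling both styles on the client side,
consider the sketch below. The OpPrereqError class here is a stand-in
defined for the example, and the message format and the "wrong_input"
error code are assumptions, not taken from cli.py.

```python
class OpPrereqError(Exception):
    """Stand-in exception; args are (message,) or (message, error_code)."""


def format_prereq_error(err):
    """Format an OpPrereqError in either one- or two-argument style."""
    if len(err.args) == 2:
        # New style: the second argument classifies the error
        msg, ecode = err.args
        return ("Failure: prerequisites not met (error type: %s): %s"
                % (ecode, msg))
    # Old style: only a message, so no classification is available
    msg = err.args[0] if err.args else ""
    return "Failure: prerequisites not met: %s" % (msg,)
```

A RAPI client would do the equivalent check on the length of the
encoded argument list before reading the second element.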

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRemove the OpRetryError exception
Iustin Pop [Mon, 2 Nov 2009 11:52:49 +0000 (12:52 +0100)]
Remove the OpRetryError exception

This is only used in two places, in an error path that is no longer
valid since Ganeti 2.0. We remove the try..except since we should not
get it anymore (and if we do, then we should catch it in all
config.Update cases) and we remove the exception class completely.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoActivate disks while exporting an instance
Michael Hanselmann [Thu, 29 Oct 2009 17:31:26 +0000 (18:31 +0100)]
Activate disks while exporting an instance

Exporting an instance not running or without activated disks
will fail. This patch makes sure to activate disks before
exporting an instance if it's in the ADMIN_down state.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoEpydoc fixes
Michael Hanselmann [Fri, 30 Oct 2009 16:33:18 +0000 (17:33 +0100)]
Epydoc fixes

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agobackend: Don't overwrite function parameter with loop variable
Michael Hanselmann [Fri, 30 Oct 2009 13:46:30 +0000 (14:46 +0100)]
backend: Don't overwrite function parameter with loop variable

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd QA test for “gnt-node {list,modify,repair}-storage”
Michael Hanselmann [Thu, 29 Oct 2009 11:42:29 +0000 (12:42 +0100)]
Add QA test for “gnt-node {list,modify,repair}-storage”

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoUnify the query fields for the storage framework
Iustin Pop [Tue, 27 Oct 2009 03:56:00 +0000 (12:56 +0900)]
Unify the query fields for the storage framework

This patch unifies the query fields in the storage framework for all
types. Note that the information is still computed on-demand, so if e.g.
the used disk space is not requested for the ‘file’ type, it won't be
computed on nodes.

Summary of changes:
- improve the LVM storage type to support multiple lvm fields in the
  LIST_FIELDS declaration and constant (not-computed via lvm commands)
  fields
- rename utils.GetFilesystemFreeSpace to utils.GetFilesystemStats
  returning tuple of (total, free)
- add used and free as valid fields for lvm-vg (used being computed as
  vg_size-vg_free)
- make allocatable accepted for all types (ones which are always
  allocatable always return True)
- add a new list field ‘type’ that gives the currently selected type;
  not very useful today (except for understanding what the default
  output is) but in the future might help if we want to list multiple
  types
- add type, size and allocatable to the default output field list
- update the man page with details on how, for file storage, size ≠ used
  + free for non-mountpoint cases
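
A function returning a (total, free) tuple for a filesystem could be
sketched like this. It is not the code from utils.py: the lower-case
function names, the choice of MiB as unit and the use of f_bavail for
free space are assumptions for the example.

```python
import os


def get_filesystem_stats(path):
    """Return (total, free) in MiB for the filesystem containing path.

    Hypothetical sketch based on os.statvfs (Unix only).
    """
    st = os.statvfs(path)
    mib = 1024.0 * 1024.0
    total = (st.f_blocks * st.f_frsize) / mib
    # f_bavail counts blocks available to unprivileged users
    free = (st.f_bavail * st.f_frsize) / mib
    return (total, free)


def vg_used(vg_size, vg_free):
    """Derive the 'used' value for a volume group as size minus free."""
    return vg_size - vg_free
```

Because these numbers describe the whole filesystem, a file storage
directory that is not itself a mountpoint will generally not satisfy
size = used + free, which is the case the man page update describes.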

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMake cluster initialization more reliable
Michael Hanselmann [Thu, 29 Oct 2009 17:30:56 +0000 (18:30 +0100)]
Make cluster initialization more reliable

There was a race condition between starting the node daemon
and sending requests to write the ssconf files. With this
patch, the initialization waits up to ten seconds for the
node daemon to become responsive.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDon't show warnings on ADMIN_down instance failover
Michael Hanselmann [Thu, 29 Oct 2009 16:06:13 +0000 (17:06 +0100)]
Don't show warnings on ADMIN_down instance failover

Before:
$ gnt-instance failover -f inst1
… checking disk consistency between source and target
… - WARNING: Can't find disk on node node21.example.com
… shutting down instance on source node

After:
$ gnt-instance failover -f inst1
… not checking disk consistency as instance is not running
… shutting down instance on source node

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate NEWS
Michael Hanselmann [Thu, 29 Oct 2009 10:09:19 +0000 (11:09 +0100)]
Update NEWS

Add rapi_users changes, rearrange a bit and one wording change.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd remote API users and passwords documentation
Michael Hanselmann [Wed, 28 Oct 2009 18:33:54 +0000 (19:33 +0100)]
Add remote API users and passwords documentation

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoganeti-rapi: Use new function to verify passwords
Michael Hanselmann [Wed, 28 Oct 2009 17:08:28 +0000 (18:08 +0100)]
ganeti-rapi: Use new function to verify passwords

This enables the use of hashed passwords in rapi_users.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agohttp.auth: Add new function to verify passwords
Michael Hanselmann [Wed, 28 Oct 2009 17:07:53 +0000 (18:07 +0100)]
http.auth: Add new function to verify passwords

This new function supports two schemes for passwords:
- Old-style cleartext passwords
- Hashed passwords according to RFC2617 (H(A1))

Schemes are differentiated by their prefix, a concept also
used in OpenLDAP. Cleartext passwords can no longer start
with an opening brace ("{") unless they're prefixed with
"{cleartext}" (case insensitive).

Currently there's no documentation for rapi_users at all.
It'll be in a follow-up patch.
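
The prefix-based scheme selection described above can be sketched as
follows. This is an illustration only and not the function added to
http.auth: the function name, the "{HA1}" prefix for the hashed scheme
and the realm used in any example are assumptions. H(A1) is computed as
the MD5 hex digest of "username:realm:password", as defined in RFC2617.

```python
import hashlib
import hmac


def verify_password(username, realm, supplied, stored):
    """Check a supplied password against a stored rapi_users-style entry.

    Hypothetical sketch: entries without a "{scheme}" prefix are
    treated as cleartext; the scheme name is case insensitive.
    """
    scheme = "cleartext"
    value = stored
    if stored.startswith("{"):
        end = stored.find("}")
        if end == -1:
            raise ValueError("Missing closing brace in password entry")
        scheme = stored[1:end].lower()
        value = stored[end + 1:]

    if scheme == "cleartext":
        expected = value
        actual = supplied
    elif scheme == "ha1":
        # H(A1) per RFC2617: MD5 over "username:realm:password"
        a1 = "%s:%s:%s" % (username, realm, supplied)
        expected = value.lower()
        actual = hashlib.md5(a1.encode("utf-8")).hexdigest()
    else:
        raise ValueError("Unknown password scheme %r" % scheme)

    # Constant-time comparison of the two byte strings
    return hmac.compare_digest(actual.encode("utf-8"),
                               expected.encode("utf-8"))
```

Under this sketch a cleartext password that itself starts with "{"
has to be stored with an explicit "{cleartext}" prefix, matching the
restriction described in the commit message.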

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMakefile.am: Add more checks to distcheck-hook
Michael Hanselmann [Tue, 27 Oct 2009 14:24:46 +0000 (15:24 +0100)]
Makefile.am: Add more checks to distcheck-hook

Also use grep only to convert find's output to an exit status.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoDocumentation updates
Iustin Pop [Tue, 27 Oct 2009 02:33:21 +0000 (11:33 +0900)]
Documentation updates

Our admin guide was very very trivial. This patch updates it to contain
advice on when to use which commands, removes the instance
administration part from the installation guide (moved to the admin
guide), and adds a walkthrough document that should be useable as a
starting point for new admins.

The patch also adds emacs variables to the documents, and rewraps some
which were not already at 72 chars.

The doc updates also show backwards-compatible commands for Ganeti 2.0,
as we don't have a good up-to-date 2.0 document and people might refer
to this set of documentation even when running that.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix another style issue
Iustin Pop [Tue, 27 Oct 2009 04:59:38 +0000 (13:59 +0900)]
Fix another style issue

For the Nth time, re-fix shadowing of outer-scope variable :)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMake gnt-node list-storage more standard
Iustin Pop [Tue, 27 Oct 2009 03:24:19 +0000 (12:24 +0900)]
Make gnt-node list-storage more standard

This patch adds support for the -o+field,… format that the other list
commands accept and changes the format of the allocatable field from
simply str(bool) to Y/N.
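
The behaviour can be sketched as below. This is not the code from the
gnt-node script; the function names and the default field list are
assumptions made up for the example.

```python
# Hypothetical default field list, for illustration only
DEFAULT_FIELDS = ["node", "type", "name", "size", "used", "free",
                  "allocatable"]


def select_fields(option, default=DEFAULT_FIELDS):
    """Interpret the value passed to -o.

    None      -> default fields
    "+a,b"    -> default fields extended with a and b
    "a,b"     -> exactly a and b
    """
    if option is None:
        return list(default)
    if option.startswith("+"):
        return list(default) + option[1:].split(",")
    return option.split(",")


def format_allocatable(value):
    """Render a boolean as Y/N instead of str(bool)."""
    if value:
        return "Y"
    return "N"
```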

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRename the node storage commands
Iustin Pop [Tue, 27 Oct 2009 03:13:01 +0000 (12:13 +0900)]
Rename the node storage commands

To reduce confusion, the following gnt-node commands are renamed:

- physical-volumes → list-storage
- modify-volume → modify-storage
- repair-volume → repair-storage

The NEWS file is updated accordingly and it also gets emacs local
variables.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix an error handling case in TLReplaceDisks
Iustin Pop [Tue, 27 Oct 2009 04:44:55 +0000 (13:44 +0900)]
Fix an error handling case in TLReplaceDisks

pylint is your friend, since the compiler doesn't exist.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoProvide feedback from redistributing configuration
Michael Hanselmann [Tue, 27 Oct 2009 13:14:15 +0000 (14:14 +0100)]
Provide feedback from redistributing configuration

This is particularly useful for “gnt-cluster redist-conf”, but
also for all other cases where the configuration files are
rewritten on other nodes.

$ gnt-cluster redist-conf
… Copy of file /var/lib/ganeti/config.data to node … failed: Error while
executing backend function: [Errno 1] Operation not permitted
… Error while uploading ssconf files to node …: Error while executing backend
function: [Errno 1] Operation not permitted

$ gnt-node modify --offline no --force node3.example.com
… - WARNING: Not enough master candidates (desired 10, new value will be 4)
… Copy of file /var/lib/ganeti/config.data to node node8.example.com failed:
Error while executing backend function: [Errno 1] Operation not permitted
Modified node node3.example.com
 - offline -> True
 - master_candidate -> auto-demotion due to offline

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobash_completion: Move common code into function
Michael Hanselmann [Mon, 26 Oct 2009 18:22:00 +0000 (19:22 +0100)]
bash_completion: Move common code into function

This reduces the size of the script by about 9 kB.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMakefile.am: Wrap long lines
Michael Hanselmann [Mon, 26 Oct 2009 11:43:19 +0000 (12:43 +0100)]
Makefile.am: Wrap long lines

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years agoInclude NEWS in documentation again
Michael Hanselmann [Mon, 26 Oct 2009 11:39:05 +0000 (12:39 +0100)]
Include NEWS in documentation again

This was implemented in 350ecfecca and reverted in 700bb84367
after it broke “make distcheck”. With other changes in this
patch series this will work now.

Contributing to the original problem was that the news.rst file
was not distributed. When we distribute the built documentation,
the source must also be included (see Automake manual).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>