Michael Hanselmann [Wed, 11 Nov 2009 13:02:16 +0000 (14:02 +0100)]
Revert "Backport AC_PATH_PROGS_FEATURE_CHECK"
This reverts commit
52b699ecaa688a2aaac00fa64558e249d0bc9a26.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Wed, 11 Nov 2009 13:00:17 +0000 (14:00 +0100)]
Fix and simplify socat escape detection
- Program paths should not be --with-… options (see
Autoconf docs)
- Simplify checks for escape functionality
- Make SOCAT_USE_ESCAPE variable a bool
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 5 Nov 2009 11:21:48 +0000 (12:21 +0100)]
Use “daemon-util” to reload SSH keys
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Sun, 8 Nov 2009 13:17:11 +0000 (13:17 +0000)]
KVMHypervisor: fix broken error format string
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 21:19:11 +0000 (17:19 -0400)]
ConfigWriter: simplify GenerateDRBDSecret
We can do this by adding a new TemporaryReservationManager
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 20:08:19 +0000 (16:08 -0400)]
ConfigWriter: move _temporary_macs to reservation
This solves the race conditions in mac reservation, as macs are actually
reserved, under the current ec id.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 19:27:44 +0000 (15:27 -0400)]
ConfigWriter: move _temporary_ids to reservation
In order to do this we need to pass a job id when reserving a resource.
We have one during _EnsureUUIDs because we passed it in from AddNode and
AddInstance. During config upgrade we use a fake job ID which we then
cleanup. We can delete the _CleanupTemporaryIDs code, since the cleanup
is going to be done at job finish time by mcpu.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
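
The reservation commits above describe resources (IDs, MACs, DRBD secrets) being held under the current execution context (ec) id and released when the job finishes. A minimal sketch of that idea follows; the class layout, method names and the exception are illustrative assumptions, not Ganeti's actual code.

```python
class ReservationError(Exception):
    """Raised when a resource is already reserved (hypothetical stand-in)."""


class TemporaryReservationManager:
    """Tracks resources reserved per execution context (ec) id."""

    def __init__(self):
        # Maps ec_id -> set of resources reserved under that id.
        self._ec_reserved = {}

    def is_reserved(self, resource):
        """Return True if any execution context holds the resource."""
        return any(resource in held for held in self._ec_reserved.values())

    def reserve(self, ec_id, resource):
        """Reserve a resource under ec_id, refusing duplicates."""
        if self.is_reserved(resource):
            raise ReservationError("Duplicate reservation for %r" % (resource,))
        self._ec_reserved.setdefault(ec_id, set()).add(resource)

    def drop_ec_reservations(self, ec_id):
        """Release everything held by ec_id (e.g. when its job finishes)."""
        self._ec_reserved.pop(ec_id, None)


manager = TemporaryReservationManager()
manager.reserve("job-1", "aa:00:00:11:22:33")
manager.drop_ec_reservations("job-1")
```

Keying the reservations by ec id means a single call can release everything a finished job was holding, which is presumably why the separate cleanup code could be deleted.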
Guido Trotter [Tue, 27 Oct 2009 19:42:37 +0000 (15:42 -0400)]
TemporaryReservationManager
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 18:09:07 +0000 (14:09 -0400)]
config.Add{Node,Instance}: get the ec id
This is ok because adding a node or instance cannot happen in a query.
We get the ec id from the LU and pass it to _EnsureUUID, which will
then for now not use it.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 17:53:03 +0000 (13:53 -0400)]
Add config.DropECReservations
For now this function does nothing, but it gets called by mcpu when the
execution of an LU is done, making sure any pending reservations are
dropped.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 15:47:49 +0000 (11:47 -0400)]
Processor: support a unique execution id
When the processor is executing a job, it can export the execution id to
its callers. This is not supported for Queries, as they're not executed
in a job.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 17:25:55 +0000 (13:25 -0400)]
Remove exceptions list from GenerateUniqueID
It's not used anywhere, so it's dead code.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 27 Oct 2009 21:17:36 +0000 (17:17 -0400)]
Add errors.ReservationError
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 4 Nov 2009 13:09:53 +0000 (14:09 +0100)]
Fix pylint 'E' (error) codes
This patch adds some silences and tweaks the code slightly so that
“pylint --rcfile pylintrc -e ganeti” doesn't give any errors.
The biggest change is in jqueue.py, the move of _RequireOpenQueue out of
the JobQueue class. Since that is actually a function and not a method
(never used as such) this makes sense, and also silences two pylint
errors.
Another real code change is in utils.py, where FieldSet.Matches will
return None instead of False for failure; this still works with the way
this class/method is used, and makes more sense (it resembles more
closely the re.match return values).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
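
To illustrate the FieldSet.Matches change described above (returning None rather than False, in the manner of re.match), here is a small sketch. The class body is an assumption written for this example; only the return-value convention is taken from the commit message.

```python
import re


class FieldSet:
    """Matches field names against a set of anchored regular expressions."""

    def __init__(self, *patterns):
        self._patterns = [re.compile("^%s$" % p) for p in patterns]

    def matches(self, field):
        """Return the first match object, or None when nothing matches."""
        for pattern in self._patterns:
            m = pattern.match(field)
            if m:
                return m
        return None


fields = FieldSet("name", r"disk\.size/([0-9]+)")
```

Callers that only test truthiness keep working, since both None and False are falsy, while callers that need captured groups can use the match object directly.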
Iustin Pop [Fri, 6 Nov 2009 12:21:51 +0000 (13:21 +0100)]
A few more small documentation updates
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 6 Nov 2009 12:53:31 +0000 (13:53 +0100)]
Remove obsolete statement in autogen.sh
Nowadays we have actual files (tracked by VCS) in autotools/, so we know
the directory exists.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Jun Futagawa [Thu, 5 Nov 2009 04:22:00 +0000 (13:22 +0900)]
Add use_localtime parameter for xen-hvm and kvm
Currently xen-hvm and kvm use different real-time clock settings by default. To
reduce confusion, this patch adds an optional use_localtime parameter.
If the real time clock on the instance is set to local time, the
parameter use_localtime should be True. The default is False. Note that
the real time clock changes from local to UTC in xen-hvm with the
default parameter.
Signed-off-by: Jun Futagawa <jfut@integ.jp>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 4 Nov 2009 17:06:48 +0000 (18:06 +0100)]
Documentation updates for the global hvparams
This patch does multiple documentation updates for the new framework,
all pretty straightforward.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 16:49:39 +0000 (17:49 +0100)]
Remove the KVM_MIGRATION_PORT configure.ac param
Since this is easily configurable at run-time, we remove the
configure-time parameter. If anyone is building custom packages, then
the default can be tweaked by a one-line patch to constants.py.
Note that this also fixes the type of the parameter: the default from
_autoconf.py is a string. This shouldn't matter unless a cluster
ran code between commit 78411c6 and this one.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 16:41:04 +0000 (17:41 +0100)]
Introduce 'global hypervisor parameters' support
This patch adds support for global hypervisor parameters in instance
creation, instance modification, instance query and at instance load
time.
We basically prevent any query on these parameters, discard them at load
time, and do not allow their modification. Together, this should make
any such existing parameters go away and prevent new ones from being added.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Fri, 6 Nov 2009 10:55:49 +0000 (11:55 +0100)]
Fix the init script
The rewrite after the introduction of the daemon-util script has a
copy-paste error.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 5 Nov 2009 13:30:33 +0000 (14:30 +0100)]
gnt-*: Print better error message for uninitialized cluster
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 5 Nov 2009 12:31:21 +0000 (13:31 +0100)]
Cache JSON encoders and sort keys
The sort_keys argument is supported since simplejson 1.3.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
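
A rough sketch of the encoder caching mentioned above. It uses the standard library json module as a stand-in for simplejson (an assumption; the two share the JSONEncoder interface used here), and the helper name dump_json is hypothetical.

```python
import json

# Encoders are built once at import time and reused for every call.
# sort_keys=True yields a stable key order in the serialized output.
_ENCODER = json.JSONEncoder(sort_keys=True)
_ENCODER_INDENT = json.JSONEncoder(indent=2, sort_keys=True)


def dump_json(data, indent=False):
    """Serialize data with one of the cached encoders."""
    encoder = _ENCODER_INDENT if indent else _ENCODER
    return encoder.encode(data)
```

Sorted keys make the output deterministic, so two serializations of the same data compare equal as strings.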
Michael Hanselmann [Tue, 3 Nov 2009 13:42:51 +0000 (14:42 +0100)]
Add new “daemon-util” script to start/stop Ganeti daemons
Until now, Ganeti started and stopped its own daemons using custom functions.
To start, the daemon was just executed and then sent the appropriate signals to
stop it again. Init scripts would have to pay attention to the PID file and
other things.
With this patch, a new script is added (“daemon-util”, installed in
$prefix/lib/ganeti/), centralizing the starting and stopping of daemons. The
provided example init script is adjusted to use this new script. Ganeti's code
no longer calls its own init script.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Thu, 29 Oct 2009 20:58:50 +0000 (16:58 -0400)]
kvm console: use socat raw mode with escape
If this is enabled at configure time, we pass in different parameters to
the socat console, making it a lot more manageable.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Sat, 31 Oct 2009 21:38:08 +0000 (17:38 -0400)]
configure: check for socat and its escape feature
Currently we use a static value for the socat path, or we trust the
user-provided one. With this patch we still trust any user provided
value, but if none is passed we check for socat on the machine we're
being configured on. This allows us also to check if we can or cannot
use the escape= feature in socat.
If the user has forced the path in, he can also pass --with-socat-escape
in order to force the escape functionality to be used, even if a check
is not done.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Tue, 3 Nov 2009 14:37:51 +0000 (09:37 -0500)]
Backport AC_PATH_PROGS_FEATURE_CHECK
In order to allow working with older versions of autoconf we backport
this macro, but only if it's not defined already (by autoconf itself).
This commit can be reverted after we decide support for autoconf 2.61
and below should be deprecated.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 4 Nov 2009 13:21:55 +0000 (14:21 +0100)]
Migration: add check for listening target
This patch adds a check for listening on the remote port in Xen and KVM
migrations. This will be generating a single “load of migration failed”
message for KVM, but otherwise not prevent the migration. For Xen (which
has a dedicated, always listening daemon) this should not create any
problems.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
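
One way such a "is the target listening" check could look, sketched with a plain TCP connection attempt. The function name and the timeout default are assumptions for illustration; the commit does not say how the check is implemented.

```python
import socket


def is_listening(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True
```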
Iustin Pop [Wed, 4 Nov 2009 12:08:43 +0000 (13:08 +0100)]
TLMigrateInstance: add error messages during Exec
Currently the migration of an instance doesn't show any error until the
end. We add two messages that show better the progress:
node1# gnt-instance migrate -f instance5
Wed Nov 4 04:04:34 2009 Migrating instance instance5
Wed Nov 4 04:04:34 2009 * checking disk consistency between source and target
Wed Nov 4 04:04:35 2009 * switching node node3 to secondary mode
Wed Nov 4 04:04:35 2009 * changing into standalone mode
Wed Nov 4 04:04:35 2009 * changing disks into dual-master mode
Wed Nov 4 04:04:40 2009 * wait until resync is done
Wed Nov 4 04:04:41 2009 * preparing node3 to accept the instance
Wed Nov 4 04:04:41 2009 * migrating instance to node3
Wed Nov 4 04:04:51 2009 Migration failed, aborting
Wed Nov 4 04:04:51 2009 * switching node node3 to secondary mode
Wed Nov 4 04:04:51 2009 * changing into standalone mode
Wed Nov 4 04:04:51 2009 * changing disks into single-master mode
Wed Nov 4 04:04:57 2009 * wait until resync is done
Failure: command execution error:
Could not migrate instance instance5: Failed to migrate instance: Failed
to migrate instance instance5: …
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 11:48:10 +0000 (12:48 +0100)]
hypervisors: switch to using HV_MIGRATION_PORT
This changes KVM to use HV_MIGRATION_PORT instead of KVM_MIGRATION_PORT
and enables passing the port for Xen migrations.
Since KVM_MIGRATION_PORT is not used anymore, we stop exporting it from
constants.py.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 11:29:15 +0000 (12:29 +0100)]
Introduce HV_MIGRATION_PORT hypervisor parameter
This parameter will replace the direct use of KVM_MIGRATION_PORT and the
implicit use of the Xen migration port.
While it doesn't make sense to change this at instance level, we don't
have any other infrastructure for cluster-wide hypervisor parameters, so
we add it here (and document that it usually shouldn't be changed on a
per-instance basis).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 10:10:32 +0000 (11:10 +0100)]
hypervisors: change MigrateInstance API
Currently the $hypervisor.MigrateInstance takes the instance name. This
patch changes it to take the instance object, such that other instance
properties (especially hvparams) are available to it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 13:14:53 +0000 (14:14 +0100)]
Revert the instance IP conflicts
Since instances can live in different VLANs from nodes (especially in
routed mode), based on the 'link' parameter, we shouldn't always
restrict having duplicate IPs. Thus we only check the node IPs/cluster
IP for now.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 4 Nov 2009 12:01:49 +0000 (13:01 +0100)]
Update gitignore rules
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Iustin Pop [Wed, 4 Nov 2009 09:16:03 +0000 (10:16 +0100)]
Introduce a wrapper for hostname resolving
Currently a few of the LU's CheckPrereq use utils.HostInfo which raises
a resolver error in case of failure. This is an exception from the
standard that CheckPrereq should raise an OpPrereqError if the error is
in the 'pre' phase (so that it can be retried).
This patch adds a new error code (resolver_error) and a wrapper over
utils.HostInfo that just converts the ResolverError into
OpPrereqError(…, errors.ECODE_RESOLVER). It then uses this wrapper in
cmdlib, bootstrap and some scripts.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
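
The wrapper described above can be sketched roughly as follows. The error-code value and the ECODE_RESOLVER name come from the commit message; the exception class is a local stand-in, and the function name, the resolver argument and the use of socket.gethostbyname are assumptions made so the example is self-contained.

```python
import socket

ECODE_RESOLVER = "resolver_error"


class OpPrereqError(Exception):
    """Stand-in for the prerequisite error raised from CheckPrereq."""


def get_host_info(name, resolver=socket.gethostbyname):
    """Resolve name, converting resolver failures into OpPrereqError."""
    try:
        return resolver(name)
    except socket.gaierror as err:
        raise OpPrereqError("Cannot resolve %s: %s" % (name, err),
                            ECODE_RESOLVER)
```

Callers in the 'pre' phase then only need to handle one exception type, and the error code tells them the failure came from name resolution.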
Iustin Pop [Wed, 4 Nov 2009 09:09:33 +0000 (10:09 +0100)]
Add a configuration verify check for duplicate IPs
This patch adds a check that the cluster IP, the nodes primary (and
secondary, if enabled) IP(s) and the instances NIC IPs are unique in the
cluster.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
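
A minimal sketch of a uniqueness check like the one described above, assuming the IPs have already been collected as (ip, owner) pairs. The function name and the input shape are hypothetical.

```python
from collections import defaultdict


def find_duplicate_ips(entries):
    """Return {ip: [owners]} for every IP that has more than one owner.

    entries is an iterable of (ip, owner description) pairs.
    """
    owners = defaultdict(list)
    for ip, owner in entries:
        owners[ip].append(owner)
    return {ip: who for ip, who in owners.items() if len(who) > 1}


duplicates = find_duplicate_ips([
    ("192.0.2.1", "cluster IP"),
    ("192.0.2.10", "node1/primary"),
    ("192.0.2.10", "instance5/nic0"),
])
```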
Iustin Pop [Tue, 3 Nov 2009 15:14:07 +0000 (16:14 +0100)]
Workaround fake failures in drbd+live migration
This patch is an attempt to fix the ugly issue during migration:
Cannot resync disks on node …: [True, 100]
If my understanding is correct, sometimes we poll the /proc/drbd file at
an inopportune moment, while it's being updated, or while the DRBD device
is changing state, and we see an unexpected state.
Based on the assumption that this is just a transient state, rather than
aborting directly, we change the backend.DrbdWaitSync() function to
retry the operation a few times, giving DRBD a chance to settle down at
the end of the resync.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 3 Nov 2009 13:58:38 +0000 (14:58 +0100)]
Another round of pylint-related style fixes
A newer version of pylint, more warnings…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 13:42:43 +0000 (14:42 +0100)]
Revert "configure: check for socat and its escape feature"
This reverts commit
37fc2cf5ba8919cef407199ee540aad4b1a9a2b6, since it
introduces configure.ac changes that depend on very very new autoconf
macros that are not present in current stable distros (and it was not
advertised as such).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 13:42:12 +0000 (14:42 +0100)]
Revert "kvm console: use socat raw mode with escape"
This reverts commit
ce0eb6694e3fb2510035501539c7acc92a0f174e, since it depends
on
37fc2cf5ba8919cef407199ee540aad4b1a9a2b6 which will be reverted too.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 08:20:35 +0000 (09:20 +0100)]
Change behaviour of ConfigWriter._WriteConfig
This patch changes the behaviour of _WriteConfig in case of
configuration errors:
- before, it used to abort the saving (even though the in-memory
configuration used by current jobs has already changed)
- now, we log it (both to the log and to the user) but continue, since
we can't revert to a good version of the config anyway
This should make the internal behaviour of the code more consistent with
the external world, even though the config might be “wrong”; we leave
the cleanup to the user. This should not be as bad as it sounds, since
we haven't actually seen this case except for the ugly master candidates
handling, and that was fixed recently by Guido's patch series.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 12:54:06 +0000 (13:54 +0100)]
Add an example script for backing up the config
This requires git and lockfile-progs, and only backs up config.data (see
the comments why).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 29 Oct 2009 20:58:50 +0000 (16:58 -0400)]
kvm console: use socat raw mode with escape
If this is enabled at configure time, we pass in different parameters to
the socat console, making it a lot more manageable.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Sat, 31 Oct 2009 21:38:08 +0000 (17:38 -0400)]
configure: check for socat and its escape feature
Currently we use a static value for the socat path, or we trust the
user-provided one. With this patch we still trust any user provided
value, but if none is passed we check for socat on the machine we're
being configured on. This allows us also to check if we can or cannot
use the escape= feature in socat.
If the user has forced the path in, he can also pass --with-socat-escape
in order to force the escape functionality to be used, even if a check
is not done.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 2 Nov 2009 16:24:48 +0000 (11:24 -0500)]
Fix version number in README
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 3 Nov 2009 13:02:26 +0000 (14:02 +0100)]
utils: Convert to utils.Retry
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 3 Nov 2009 10:24:31 +0000 (11:24 +0100)]
Implement cluster verify checks for wrong PV names
Since ':' is not a valid character in PV names (for the way Ganeti uses
LVM), we need to check this and warn the user. This patch adds a new
NV_PVLIST cluster verify check and verifies the PV names returned from
the nodes.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 10:21:19 +0000 (11:21 +0100)]
Change bdev.LogicalVolume.GetPVInfo usage
We will need to enumerate selectively the PVs of (possibly) many VGs and
not only the allocatable ones. For this we make the VG selection and the
allocatable filtering optional. The two callers are modified for this
new calling syntax.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 3 Nov 2009 10:03:19 +0000 (11:03 +0100)]
Throw specific error when ':' exists in PV names
While ':' is not actually a supported character in PV names (it has a
special meaning for commands like lvcreate), we should throw specific
errors for this case instead of generic “Can't create LV”.
This patch does two things:
- modifies the separator used when listing PVs to be '|' such that we
can actually parse ':' as part of PV names
- check if any of the discovered PVs have ':' in their name when
creating LVs, and if so throw a specific error
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
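
The two steps above (a '|' separator so ':' survives parsing, then a check of the discovered names) can be sketched as below. The two-column input format and both function names are assumptions for illustration, not the actual LVM invocation or Ganeti code.

```python
def parse_pv_list(output, separator="|"):
    """Parse 'pv_name|vg_name' lines into (pv_name, vg_name) tuples."""
    pvs = []
    for line in output.splitlines():
        fields = line.strip().split(separator)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pvs.append((fields[0], fields[1]))
    return pvs


def find_invalid_pv_names(pvs):
    """Return the PV names that contain the unsupported ':' character."""
    return [name for name, _ in pvs if ":" in name]


sample = "/dev/sda2|xenvg\n/dev/disk/by-path/pci-0000:00:1f.2|xenvg\n"
invalid = find_invalid_pv_names(parse_pv_list(sample))
```

With ':' as the separator the second line would have been split in the wrong places, so the offending name could never have been reported.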
Michael Hanselmann [Fri, 30 Oct 2009 16:40:46 +0000 (17:40 +0100)]
jqueue: Convert to utils.Retry
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:39:46 +0000 (17:39 +0100)]
hv_xen: Convert to utils.Retry
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:38:38 +0000 (17:38 +0100)]
bootstrap: Convert to utils.Retry
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:38:24 +0000 (17:38 +0100)]
bdev: Convert to utils.Retry
Also replaces a hardcoded limit of 15 seconds with 1/4
of NET_RECONFIG_TIMEOUT.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:37:38 +0000 (17:37 +0100)]
backend: Convert to utils.Retry
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:36:59 +0000 (17:36 +0100)]
Add generic retry loop function
There are quite a few retry loops with timeouts in Ganeti's
code. Duplicating code is not good, so this patch introduces
a new function named “utils.Retry” to remedy this situation.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
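
For readers unfamiliar with the pattern, a generic retry loop with a timeout might look like the sketch below. This is not the actual utils.Retry: the signature, the exception names and the fixed delay are assumptions chosen to keep the example short.

```python
import time


class RetryAgain(Exception):
    """Raised by the wrapped function to request another attempt."""


class RetryTimeout(Exception):
    """Raised when the timeout expires without a successful attempt."""


def retry(fn, delay, timeout, args=()):
    """Call fn(*args) until it stops raising RetryAgain or time runs out."""
    deadline = time.time() + timeout
    while True:
        try:
            return fn(*args)
        except RetryAgain:
            pass
        remaining = deadline - time.time()
        if remaining <= 0:
            raise RetryTimeout()
        # Never sleep past the deadline.
        time.sleep(min(delay, remaining))
```

Signalling "try again" through an exception lets the wrapped function return any value, including None or False, as a legitimate result.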
Michael Hanselmann [Tue, 3 Nov 2009 10:29:30 +0000 (11:29 +0100)]
Ignore log messages in unittests
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 27 Oct 2009 07:54:24 +0000 (16:54 +0900)]
Some improvements to gnt-node repair-storage
Currently the repair storage has two issues:
- down instances are aborting the operation, even though they should be
ignored (it's not technically possible to know their disk status
unless we would activate their disks)
- if the VG is so broken that disks cannot be activated via gnt-instance
activate-disks or gnt-instance startup, it's not possible to repair
the VG at all
The patch makes the opcode skip down instances and also introduces an
``--ignore-consistency`` flag for forcing the execution of the LU.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 05:55:15 +0000 (14:55 +0900)]
Convert the rest of the OpPrereqError users
This finishes the conversion of OpPrereqError creation to two-argument
style. Any leftovers as one-argument are not breaking anything, just
losing information about the errors.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 05:27:46 +0000 (14:27 +0900)]
Add ecode to rpc.py's RpcResult.Raise()
This patch adds a new ecode argument to RpcResult.Raise(). This allows
specifying the error code (for both OpExec and OpPrereq errors).
Note that this patch also makes the OpExecError exceptions raised from
_FindFaultInstanceDisks have the error code classification.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 05:15:53 +0000 (14:15 +0900)]
Introduce two-argument style for OpPrereqError
This patch introduces a two-argument style for OpPrereqError. Only the
direct raise calls in cmdlib.py are converted, other users will follow.
cli.py is modified to handle both two-argument style and the current
format. RAPI doesn't need modification as the way we encode errors is
already using a list for the error arguments, so RAPI users only need to
start checking the list length and the second argument.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
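
A sketch of how a client could handle both the two-argument style and the older one-argument format, as the commit above describes for cli.py. The exception class is a local stand-in, and the error-code value and message wording are invented for the example.

```python
class OpPrereqError(Exception):
    """Stand-in: raised as OpPrereqError(message) or (message, ecode)."""


ECODE_INVAL = "wrong_input"  # hypothetical error-code value


def format_prereq_error(err):
    """Format an OpPrereqError, using the error code when present."""
    if len(err.args) == 2:
        msg, ecode = err.args
        return "Failure: prerequisites not met (%s): %s" % (ecode, msg)
    # Old one-argument style: no error code available.
    return "Failure: prerequisites not met: %s" % (err,)
```

Checking the argument count, rather than requiring the second argument, is what keeps leftover one-argument callers working.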
Iustin Pop [Mon, 2 Nov 2009 11:52:49 +0000 (12:52 +0100)]
Remove the OpRetryError exception
This is only used in two places, in an error path that is no longer
valid since Ganeti 2.0. We remove the try..except since we should not
get it anymore (and if we do, then we should catch it in all
config.Update cases) and we remove the exception class completely.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Thu, 29 Oct 2009 17:31:26 +0000 (18:31 +0100)]
Activate disks while exporting an instance
Exporting an instance not running or without activated disks
will fail. This patch makes sure to activate disks before
exporting an instance if it's in the ADMIN_down state.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 16:33:18 +0000 (17:33 +0100)]
Epydoc fixes
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Fri, 30 Oct 2009 13:46:30 +0000 (14:46 +0100)]
backend: Don't overwrite function parameter with loop variable
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 29 Oct 2009 11:42:29 +0000 (12:42 +0100)]
Add QA test for “gnt-node {list,modify,repair}-storage”
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 27 Oct 2009 03:56:00 +0000 (12:56 +0900)]
Unify the query fields for the storage framework
This patch unifies the query fields in the storage framework for all
types. Note that the information is still computed on-demand, so if e.g.
the used disk space is not requested for the ‘file’ type, it won't be
computed on nodes.
Summary of changes:
- improve the LVM storage type to support multiple lvm fields in the
LIST_FIELDS declaration and constant (not-computed via lvm commands)
fields
- rename utils.GetFilesystemFreeSpace to utils.GetFilesystemStats
returning tuple of (total, free)
- add used and free as valid fields for lvm-vg (used being computed as
vg_size-vg_free)
- make allocatable accepted for all types (ones which are always
allocatable always return True)
- add a new list field ‘type’ that gives the current selected type; not
very useful today (except for understanding what the default output
is) but in the future might help if we want to list multiple types
- add type, size and allocatable to the default output field list
- update the man page with details on how, for file storage, size ≠ used
+ free for non-mountpoint cases
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
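
As an illustration of a helper returning a (total, free) tuple, like the renamed utils.GetFilesystemStats above, here is a sketch built on os.statvfs. The snake_case name and the choice of bytes as the unit are assumptions; the real function may use different units.

```python
import os


def get_filesystem_stats(path):
    """Return (total, free) in bytes for the filesystem containing path."""
    st = os.statvfs(path)
    total = st.f_frsize * st.f_blocks
    # f_bavail counts blocks available to unprivileged users.
    free = st.f_frsize * st.f_bavail
    return (total, free)


total, free = get_filesystem_stats("/")
```

For a directory that is not its own mountpoint, these numbers describe the whole filesystem, which is why size need not equal used + free for the file storage type.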
Michael Hanselmann [Thu, 29 Oct 2009 17:30:56 +0000 (18:30 +0100)]
Make cluster initialization more reliable
There was a race condition between starting the node daemon
and sending requests to write the ssconf files. With this
patch, the initialization waits up to ten seconds for the
node daemon to become responsive.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 29 Oct 2009 16:06:13 +0000 (17:06 +0100)]
Don't show warnings on ADMIN_down instance failover
Before:
$ gnt-instance failover -f inst1
… checking disk consistency between source and target
… - WARNING: Can't find disk on node node21.example.com
… shutting down instance on source node
After:
$ gnt-instance failover -f inst1
… not checking disk consistency as instance is not running
… shutting down instance on source node
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 29 Oct 2009 10:09:19 +0000 (11:09 +0100)]
Update NEWS
Add rapi_users changes, rearrange a bit and one wording change.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Oct 2009 18:33:54 +0000 (19:33 +0100)]
Add remote API users and passwords documentation
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Oct 2009 17:08:28 +0000 (18:08 +0100)]
ganeti-rapi: Use new function to verify passwords
This enables the use of hashed passwords in rapi_users.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Oct 2009 17:07:53 +0000 (18:07 +0100)]
http.auth: Add new function to verify passwords
This new function supports two schemes for passwords:
- Old-style cleartext passwords
- Hashed passwords according to RFC2617 (H(A1))
Schemes are differentiated by their prefix, a concept also
used in OpenLDAP. Cleartext passwords can no longer start
with an opening brace ("{") unless they're prefixed with
"{cleartext}" (case insensitive).
Currently there's no documentation for rapi_users at all.
It'll follow in a subsequent patch.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
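
The prefix-based scheme selection above can be sketched as follows. The H(A1) value is the MD5 hex digest of "username:realm:password" as defined in RFC 2617; the "{ha1}" prefix name, the function name and the realm used in the example are assumptions, since the commit message only names "{cleartext}".

```python
import hashlib
import hmac


def verify_password(username, password, expected, realm):
    """Check password against a cleartext or prefixed stored value."""
    scheme, value = "cleartext", expected
    if expected.startswith("{"):
        end = expected.find("}")
        if end == -1:
            return False  # opening brace without a complete scheme prefix
        scheme = expected[1:end].lower()  # prefixes are case insensitive
        value = expected[end + 1:]

    if scheme == "cleartext":
        candidate = password
    elif scheme == "ha1":
        a1 = "%s:%s:%s" % (username, realm, password)
        candidate = hashlib.md5(a1.encode("utf-8")).hexdigest()
        value = value.lower()
    else:
        return False  # unknown scheme

    return hmac.compare_digest(candidate.encode("utf-8"),
                               value.encode("utf-8"))


stored = "{HA1}" + hashlib.md5(b"jack:Example Realm:secret").hexdigest()
ok = verify_password("jack", "secret", stored, "Example Realm")
```

A cleartext password that itself begins with "{" has to be stored with the explicit "{cleartext}" prefix, otherwise it would be read as a scheme name.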
Michael Hanselmann [Tue, 27 Oct 2009 14:24:46 +0000 (15:24 +0100)]
Makefile.am: Add more checks to distcheck-hook
Also use grep only to convert find's output to an exit status.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 27 Oct 2009 02:33:21 +0000 (11:33 +0900)]
Documentation updates
Our admin guide was very very trivial. This patch updates it to contain
advice on when to use which commands, removes the instance
administration part from the installation guide (moved to the admin
guide), and adds a walkthrough document that should be useable as a
starting point for new admins.
The patch also adds emacs variables to the documents, and rewraps some
which were not already at 72 chars.
The doc updates also show backwards-compatible commands for Ganeti 2.0,
as we don't have a good up-to-date 2.0 document and people might refer
to this set of documentation even when running that.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 04:59:38 +0000 (13:59 +0900)]
Fix another style issue
For the Nth time, re-fix shadowing of outer-scope variable :)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 03:24:19 +0000 (12:24 +0900)]
Make gnt-node list-storage more standard
This patch adds support for the -o+field,… format that the other list
commands accept and changes the format of the allocatable field from
simply str(bool) to Y/N.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 03:13:01 +0000 (12:13 +0900)]
Rename the node storage commands
To reduce confusion, the following gnt-node commands are renamed:
- physical-volumes → list-storage
- modify-volume → modify-storage
- repair-volume → repair-storage
The NEWS file is updated accordingly and it also gets emacs local
variables.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Oct 2009 04:44:55 +0000 (13:44 +0900)]
Fix an error handling case in TLReplaceDisks
pylint is your friend, since the compiler doesn't exist.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 27 Oct 2009 13:14:15 +0000 (14:14 +0100)]
Provide feedback from redistributing configuration
This is particularly useful for “gnt-cluster redist-conf”, but
also for all other cases where the configuration files are
rewritten on other nodes.
$ gnt-cluster redist-conf
… Copy of file /var/lib/ganeti/config.data to node … failed: Error while
executing backend function: [Errno 1] Operation not permitted
… Error while uploading ssconf files to node …: Error while executing backend
function: [Errno 1] Operation not permitted
$ gnt-node modify --offline no --force node3.example.com
… - WARNING: Not enough master candidates (desired 10, new value will be 4)
… Copy of file /var/lib/ganeti/config.data to node node8.example.com failed:
Error while executing backend function: [Errno 1] Operation not permitted
Modified node node3.example.com
- offline -> True
- master_candidate -> auto-demotion due to offline
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 18:22:00 +0000 (19:22 +0100)]
bash_completion: Move common code into function
This reduces the size of the script by about 9 kB.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 11:43:19 +0000 (12:43 +0100)]
Makefile.am: Wrap long lines
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 11:39:05 +0000 (12:39 +0100)]
Include NEWS in documentation again
This was implemented in 350ecfecca and reverted in 700bb84367 after
it broke “make distcheck”. With other changes in this patch series
this will work now.
Contributing to the original problem was that the news.rst file
was not distributed. When we distribute the built documentation,
the source must also be included (see Automake manual).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 11:00:30 +0000 (12:00 +0100)]
Makefile.am: Don't include MAINTAINERCLEANFILES in EXTRA_DIST
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 10:56:57 +0000 (11:56 +0100)]
Makefile.am: Use noinst_DATA instead of all-local target
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 26 Oct 2009 10:48:55 +0000 (11:48 +0100)]
Makefile.am: Make HTML doc building depend on stamp file
This patch also adds an explicit list of all files written by
sphinx (“docoutput”).
By using an explicit list the build process is more predictable
and will allow us to include the NEWS file again.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 23 Oct 2009 10:51:29 +0000 (12:51 +0200)]
Makefile.am: Use dependencies to create symlinks only if necessary
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Fri, 23 Oct 2009 10:04:29 +0000 (12:04 +0200)]
Makefile.am: Move stamp-directories to BUILT_SOURCES
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 26 Oct 2009 12:21:30 +0000 (21:21 +0900)]
Fix gnt-debug breakage due to options move
Commits d3ed23f and 4eb6265 broke gnt-debug due to renamed option
targets. Sorry again!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 26 Oct 2009 12:07:47 +0000 (21:07 +0900)]
Fix gnt-node evacuate w. iallocator
Commit 2bb5c911 moved around and changed the _RunAllocator function in
the DiskReplace → TaskLet conversion, but in the process it changed the
relocate_from argument from a list of nodes to just the secondary node.
This breaks the protocol and current iallocator scripts.
This patch fixes that but also adds a local variable 'instance' since
it's not nice to write self.instance so many times.
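
To illustrate the shape of the argument discussed above, here is a
minimal sketch. The function name and the surrounding dictionary layout
are assumptions made for this example; the only point taken from this
message is that relocate_from is a list of nodes, not a single node name.

```python
def build_relocate_request(instance_name, secondary_node):
    """Build a hypothetical relocation request for an allocator script."""
    return {
        "type": "relocate",
        "name": instance_name,
        # Always a list of node names, even when there is only one node.
        "relocate_from": [secondary_node],
    }
```

Passing the bare node name instead of a one-element list is the kind of
change that existing allocator scripts would not expect.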
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 23 Oct 2009 14:42:37 +0000 (10:42 -0400)]
InstanceIpToNodePrimaryIpQuery: use a query dict
In 95b487b we changed InstanceIpToNodePrimaryIpQuery to be able to query
multiple instances at once. We also need to be able to query ips
belonging to a specific nic link, so what we do is:
1) Move the "query" argument to a dict, containing different fields
2) Make the "query for a single ip" or "query for a list" options
explicit.
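
A minimal sketch of what such a query dict could look like on the
receiving side. The key names ("link", "ip", "ip_list") and the function
name are invented for this example and are not the real confd constants.

```python
def parse_query(query):
    """Split a query dict into (link, ips, is_list).

    Exactly one of "ip" (single address) or "ip_list" (several
    addresses) must be present; "link" is optional.
    """
    if not isinstance(query, dict):
        raise ValueError("query must be a dict")
    has_ip = "ip" in query
    has_list = "ip_list" in query
    if has_ip == has_list:
        raise ValueError("need exactly one of 'ip' or 'ip_list'")
    link = query.get("link")  # None means "use the default link"
    if has_ip:
        return link, [query["ip"]], False
    return link, list(query["ip_list"]), True
```

Returning the is_list flag lets the caller answer with a single value or
a list, matching how the question was asked.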
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 23 Oct 2009 14:29:37 +0000 (10:29 -0400)]
SimpleConfigReader: ips are partitioned by link
We were already half-doing it, but this completes the process.
1) We don't maintain a list of ips or an ip->instance map
2) We add a new link,ip->instance map (we already had the link->ips list)
3) We add the link parameter to GetInstanceByIp (making it
GetInstanceByLinkIp)
4) We change the GetInstanceByIp caller to pass None as link
(thus for now using only the default link)
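
The data structure behind the steps above can be sketched as follows.
This is an illustration only: the class and method names are
hypothetical and do not mirror SimpleConfigReader's real internals.

```python
class InstanceIpIndex:
    """Look up instances by (link, ip) instead of by ip alone."""

    def __init__(self, default_link):
        self._default_link = default_link
        self._by_link_ip = {}   # (link, ip) -> instance name
        self._ips_by_link = {}  # link -> list of ips

    def add(self, link, ip, instance):
        self._by_link_ip[(link, ip)] = instance
        self._ips_by_link.setdefault(link, []).append(ip)

    def get_instance_by_link_ip(self, ip, link=None):
        """Return the instance owning ip on link, or None if unknown.

        Passing link=None falls back to the default link.
        """
        if link is None:
            link = self._default_link
        return self._by_link_ip.get((link, ip))
```

Keying on the pair means the same address can appear on two different
links without one entry overwriting the other.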
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 23 Oct 2009 14:26:56 +0000 (10:26 -0400)]
SimpleConfigReader: queries for default nicparams
GetDefaultNicParams returns the default nic parameters.
GetDefaultNicLink returns the default nic link.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 7 Oct 2009 14:40:46 +0000 (15:40 +0100)]
Use RUN_IN_TEMPDIR in Makefile.am
Since we have this variable and use it in other places, remove the only
leftover hardcoded place.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Thu, 8 Oct 2009 09:12:25 +0000 (10:12 +0100)]
Import errors in confd __init__
It's used by some functions defined there.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Iustin Pop [Mon, 26 Oct 2009 05:39:49 +0000 (14:39 +0900)]
Allow '@' in tag values
This allows using an email address (as is) as part of a tag. The main
problem that could arise is when parsing tags from a shell script, but
(AFAIK) '@' is not a special character when used in values (happy to be
corrected if not true).
The patch also compiles the re once, at class init time, which should
use fewer resources; in my tests it is fine to use a compiled re from
multiple threads.
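
A small sketch of the idea, assuming a made-up class name and an
illustrative character set; the real set of allowed tag characters is
whatever the patch defines.

```python
import re


class TagValidator:
    # Compiled once, when the class body is executed, instead of on
    # every call. The character set below is an assumption.
    _VALID_TAG_RE = re.compile(r"^[\w.+*/:@-]+$")

    @classmethod
    def is_valid(cls, tag):
        """Return True if tag only uses allowed characters."""
        return bool(cls._VALID_TAG_RE.match(tag))
```

With '@' in the character class, a tag such as
"owner:admin@example.com" passes as is.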
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 26 Oct 2009 05:28:20 +0000 (14:28 +0900)]
Fix gnt-node modify-volume
This was broken by me in 064c21f, sorry!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Mon, 26 Oct 2009 05:18:23 +0000 (14:18 +0900)]
gnt-node: add short option -t for --storage-type
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 22 Oct 2009 21:43:31 +0000 (17:43 -0400)]
init script: allow singling out confd as well
Currently we can start/stop the various subdaemons, but not confd.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Michael Hanselmann [Thu, 22 Oct 2009 15:20:16 +0000 (17:20 +0200)]
cmdlib._AssembleInstanceDisks: Fix case where variable wouldn't be set
The “result” variable may not be set, or may still hold the value from
a previous loop iteration.
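
A generic illustration of this class of bug and one way to avoid it. It
is not the code from cmdlib; the function and parameter names are
hypothetical.

```python
def assemble_all(disks, assemble_one):
    """Call assemble_one on each disk; return (all_ok, results).

    The per-disk result is reset at the top of every iteration, so a
    failure can never be reported with the previous disk's value.
    """
    all_ok = True
    results = []
    for disk in disks:
        result = None  # reset so stale values cannot leak
        try:
            result = assemble_one(disk)
        except RuntimeError:
            all_ok = False
        results.append((disk, result))
    return all_ok, results
```

Without the reset, a failing call on the second disk would leave
"result" pointing at the first disk's value, and on the first disk it
would not be defined at all.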
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 22 Oct 2009 15:15:28 +0000 (17:15 +0200)]
Makefile: Use path from configure script for sphinx-build
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Ken Wehr <ksw@google.com>