Michael Hanselmann [Tue, 13 Oct 2009 16:29:56 +0000 (18:29 +0200)]
luxi: Pass socket path directly to exception, not in tuple
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Tue, 13 Oct 2009 12:26:49 +0000 (13:26 +0100)]
gnt-* use the correct opcode slot to build opcodes
gnt-* scripts were building wrong opcodes for commands which had the
shutdown_timeout slot (due to missing testing after renaming). Fixing.
Also change SHUTDOWN_TIMEOUT_OPT dest field name to "shutdown_timeout":
it was set to "timeout". It would still work that way, but possibly be
confusing.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Tue, 13 Oct 2009 11:22:35 +0000 (12:22 +0100)]
Update NEWS for instance shutdown timeout
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 13 Oct 2009 12:12:57 +0000 (14:12 +0200)]
Update documentation for recreate-disks
This also clarifies the UUIDs NEWS entry.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 13 Oct 2009 12:01:20 +0000 (14:01 +0200)]
rapi: fix tag operations
This patch fixes the tag PUT/DELETE operations, and additionally changes
the _Tags_* functions to take only positional and not keyword arguments
(the defaults do not make any sense at all, and they are always called
with all arguments).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Mon, 12 Oct 2009 15:18:37 +0000 (17:18 +0200)]
Update NEWS for Ganeti 2.1
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 12 Oct 2009 15:19:25 +0000 (17:19 +0200)]
Convert NEWS to ASCII
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Mon, 12 Oct 2009 18:05:48 +0000 (19:05 +0100)]
Update manpages for --shutdown-timeout
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 12 Oct 2009 11:49:50 +0000 (12:49 +0100)]
Add timeout options to other LUs
All the LUs that shut down the instance need to be able to pass the
timeout parameter as well.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 12 Oct 2009 15:43:55 +0000 (16:43 +0100)]
cli: add SHUTDOWN_TIMEOUT_OPT
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Mon, 12 Oct 2009 10:46:22 +0000 (12:46 +0200)]
mcpu: Change lock attempt timeout calculation
With this patch all timeouts are pre-calculated. The interface of
the _LockTimeoutStrategy class is also changed a bit; NextAttempt
now returns a new instance.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Wed, 7 Oct 2009 16:15:25 +0000 (18:15 +0200)]
Code and docstring style fixes
Found using pylint and epydoc.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Wed, 7 Oct 2009 14:11:22 +0000 (16:11 +0200)]
mcpu: Improve lock reporting with timeouts
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Wed, 7 Oct 2009 12:58:06 +0000 (14:58 +0200)]
mcpu: Implement lock timeouts
The timeout is always between ~0.1 and ~10.0 seconds. A small
variation of ±5% is added to prevent different jobs from
fighting each other. After 10 attempts to acquire the locks with
a timeout, a blocking acquire is made.
Lock status reporting will be improved in a separate patch.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
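A minimal sketch of the retry policy described in this commit, not the actual mcpu code. The helper name attempt_timeouts and the geometric growth between the bounds are assumptions; the message only states the ~0.1 to ~10.0 second range, the ±5% variation, and the 10 timed attempts before a blocking acquire.

```python
import random

MIN_TIMEOUT = 0.1   # seconds, lower bound from the commit message
MAX_TIMEOUT = 10.0  # seconds, upper bound from the commit message
MAX_ATTEMPTS = 10   # timed attempts before falling back to blocking
JITTER = 0.05       # +/- 5% variation


def attempt_timeouts(rng=random):
    """Yield one timeout per timed attempt, then None for a blocking acquire.

    The geometric growth is an assumption made for illustration only.
    """
    ratio = (MAX_TIMEOUT / MIN_TIMEOUT) ** (1.0 / (MAX_ATTEMPTS - 1))
    for attempt in range(MAX_ATTEMPTS):
        base = MIN_TIMEOUT * ratio ** attempt
        # Spread jobs apart so they do not retry in lockstep.
        yield base * rng.uniform(1.0 - JITTER, 1.0 + JITTER)
    yield None  # None stands for "no timeout", i.e. a blocking acquire
```

A caller would iterate over the generator and pass each value to its lock acquire call until one attempt succeeds.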
Michael Hanselmann [Mon, 5 Oct 2009 14:16:15 +0000 (16:16 +0200)]
mcpu: Remove unused exclusive_BGL attribute
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Fri, 9 Oct 2009 11:39:17 +0000 (13:39 +0200)]
locking.LockSet: Implement acquire timeouts
The timeout passed to LockSet.acquire() is measured over all lock acquires. If
LockSet.acquire fails to acquire all requested locks within the specified
amount of time, all locks are released again and the acquire fails.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
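The "one timeout measured over all acquires" behaviour can be sketched as follows. This uses plain threading.Lock objects and a hypothetical acquire_all helper; it is an illustration of the idea, not the LockSet implementation.

```python
import threading
import time


def acquire_all(locks, timeout):
    """Acquire every lock in order within one overall timeout.

    Returns True on success. On failure, releases whatever was acquired
    so far and returns False.
    """
    deadline = time.monotonic() + timeout
    held = []
    for lock in locks:
        # Each acquire only gets the time left from the shared budget.
        remaining = max(0.0, deadline - time.monotonic())
        if not lock.acquire(timeout=remaining):
            for taken in reversed(held):
                taken.release()
            return False
        held.append(lock)
    return True
```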
Guido Trotter [Fri, 9 Oct 2009 13:03:39 +0000 (14:03 +0100)]
Update gnt-instance(8) for shutdown --timeout
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 9 Oct 2009 10:52:38 +0000 (11:52 +0100)]
Accept shutdown timeout from the user
Using the new --timeout option:
- gnt-instance shutdown is changed to accept a timeout
- the opcode is changed to hold one
- the LU is changed to optionally get one
- the rpc is changed to carry one
- the backend is changed to take it as a parameter rather than
hardcoding it in the function
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 9 Oct 2009 10:52:02 +0000 (11:52 +0100)]
cli: add a timeout option
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 9 Oct 2009 11:04:07 +0000 (12:04 +0100)]
ChrootManager: clean StopInstance
Currently it has lots of duplicated code and internal retries.
Clean it up with the following assumptions:
We'll probably be called more than once.
It is ok to fail to stop, unless we're called with force=True.
If we're called only once, and with force=True it's ok not to run the
chroot "cleanup" script (it's a destroy after all, why should chroots
have more chances than other instances?).
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 9 Oct 2009 10:49:54 +0000 (11:49 +0100)]
KVMHypervisor: use the StopInstance retry feature
Since we know StopInstance is going to be called more than once (at
least twice, once with force and once without, but normally quite a lot
more) we don't need our own sleep/loop, and we can just send one monitor
command per call.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 9 Oct 2009 10:35:34 +0000 (11:35 +0100)]
backend.InstanceShutdown: small cleanup
1) unhardcode the timeout, abstracting it in a constant
2) Use time.time() rather than hiding the timeout in a range()
3) call hyper.StopInstance multiple times
-- currently all hypervisors just ignore all calls but the first
4) Use hyper.ListInstances() rather than GetInstanceList([hv_name])
-- it's cheaper :)
5) Change the final message to "forcing" from "using destroy"
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
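A rough sketch of the loop this cleanup describes, under stated assumptions: the hyper object is a hypothetical stand-in offering StopInstance(name, force, retry) and ListInstances(), and the real Ganeti signatures may differ. The poll_interval parameter is also an assumption.

```python
import time


def shutdown_instance(hyper, name, timeout, poll_interval=1.0):
    """Ask the hypervisor to stop `name`, retrying until `timeout` expires.

    Returns True if the instance stopped cleanly, False if it was forced.
    """
    end = time.time() + timeout
    tried_once = False
    while time.time() < end:
        if name not in hyper.ListInstances():
            return True
        # Tell the hypervisor whether this is the first try or a retry.
        hyper.StopInstance(name, force=False, retry=tried_once)
        tried_once = True
        time.sleep(poll_interval)
    if name not in hyper.ListInstances():
        return True
    # Timed out: fall back to forcing the instance down.
    hyper.StopInstance(name, force=True, retry=tried_once)
    return False
```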
Guido Trotter [Fri, 9 Oct 2009 10:17:57 +0000 (11:17 +0100)]
Add default instance shutdown timeout constant
It reflects the "current" two minutes we give to the instance.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Fri, 9 Oct 2009 10:13:20 +0000 (11:13 +0100)]
Hypervisors: Add retry= to StopInstance
Currently some hypervisors need the stop operations to be retried more
than once, while other ones only do it in one pass. With this change
we'll handle retries outside the hypervisor code, but telling whether
this is the first try or not.
Since this option is not used for now, all hypervisors just return if
called with retry set to on, maintaining the old behavior. Since the
fake hypervisor has an idempotent StopInstance call, we avoid returning
in that case.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 9 Oct 2009 13:31:27 +0000 (14:31 +0100)]
Get rid of utils.CommaJoin
- We never remember to use it (5 uses vs 21 ", ".join())
- It's longer to write than ", ".join()
- The added value of the apostrophe in the string is not very much
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Tue, 6 Oct 2009 12:33:20 +0000 (13:33 +0100)]
ethers hook: allow more than one daemon pidfile
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 7 Oct 2009 09:45:29 +0000 (10:45 +0100)]
burnin: skip instance moves on single node
If we have only one node, instance moves fail, because it tries to move
the instance to itself. Skipping the operation, because in that case it
doesn't make sense.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 5 Oct 2009 09:36:01 +0000 (10:36 +0100)]
Match instance and node names case insensitively
Since DNS cannot contain two names with different cases anyway, this
should be ok.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 5 Oct 2009 09:34:20 +0000 (10:34 +0100)]
Add case_sensitive keyword to MatchNameComponent
Now featuring unit testing, and more deterministic results on some
corner cases.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
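An illustrative sketch of name-component matching with a case_sensitive keyword. The function name match_name_component and the exact tie-breaking rules are assumptions, not the Ganeti implementation.

```python
import re


def match_name_component(key, names, case_sensitive=True):
    """Return the single name matching `key`, or None.

    A name matches when it equals `key` or starts with `key` followed by
    a dot (so "node1" matches "node1.example.com"). An exact match wins;
    otherwise exactly one prefix match is required.
    """
    if key in names:
        return key
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(r"^%s(\..*)?$" % re.escape(key), flags)
    matches = [name for name in names if pattern.match(name)]
    # Prefer a unique full-name match that differs only in case.
    exact = [name for name in matches if name.lower() == key.lower()]
    if len(exact) == 1:
        return exact[0]
    if len(matches) == 1:
        return matches[0]
    return None
```

Returning None for ambiguous input keeps the result deterministic when several names share the same first component.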
Guido Trotter [Mon, 5 Oct 2009 16:57:42 +0000 (17:57 +0100)]
VNC password: move to hv param and use in kvm
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Michael Hanselmann [Mon, 5 Oct 2009 11:41:37 +0000 (13:41 +0200)]
Implement strict mode for devel/review
This should prevent typos in aliases from going unnoticed.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Fri, 2 Oct 2009 16:51:30 +0000 (17:51 +0100)]
Update ganeti-os-interface(7) for API 15
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 15:50:10 +0000 (16:50 +0100)]
Check the OS name for variants
If an OS supports variants, unless --force-variant is specified a valid
variant must be passed.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 15:47:36 +0000 (16:47 +0100)]
Allow --force-variant for instance add/reinstall
Passing this option makes an undeclared variant be passed to the os "as
is", hoping it'll be able to figure it out (as per the design doc).
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 15:48:38 +0000 (16:48 +0100)]
Add force_variant slot to Create/ReinstallInstance
These two opcodes need to know whether an unknown variant must be forced
through or not.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 13:53:26 +0000 (14:53 +0100)]
Update client os lists to name+variant format
Lists of OSes are displayed by gnt-os list, rapi, and gnt-instance
reinstall --select-os, and checked by burnin. In all of these show the
list with name+variant, if the os has variants.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 13:41:38 +0000 (14:41 +0100)]
gnt-os diagnose: show os variants
We already show the per-node os variants, also show the global ones.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 13:36:32 +0000 (14:36 +0100)]
cli.CalculateOSNames
Given an os and its variants, return a list of "full" os names.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
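A minimal sketch of what such a helper might look like, assuming the "name+variant" format used elsewhere in this series. The function name and the behaviour for an OS without variants are assumptions.

```python
def calculate_os_names(os_name, os_variants):
    """Return "name+variant" for each variant, or [name] if there are none."""
    if os_variants:
        return ["%s+%s" % (os_name, variant) for variant in os_variants]
    return [os_name]
```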
Guido Trotter [Fri, 2 Oct 2009 13:34:52 +0000 (14:34 +0100)]
Add "variants" field to LUDiagnoseOS
If selected this field will contain a list of os variants supported on
all nodes.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 12:37:58 +0000 (13:37 +0100)]
Add per-node variants list to OS diagnose output
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 15:00:13 +0000 (16:00 +0100)]
OSFromDisk: handle variants when loading os
When we load an OS from disk, we need _TryOSFromDisk to get the real
name, without any variant. This allows any functionality that uses the
instance OS to handle a name with a variant.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 11:15:49 +0000 (12:15 +0100)]
OSEnvironment: populate OS_VARIANT
According to the design on api_version >= 15 the OS variant is the part
of the OS name after the "+" sign. If none is found, we just pass in the
first variant an OS declares (which is bound to exist, as we check for
it in _TryOSFromDisk).
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
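The variant lookup described here can be sketched as below. The helper name get_os_variant is hypothetical, and the sketch assumes the declared variant list is non-empty, as the commit message says is already checked.

```python
def get_os_variant(full_name, declared_variants):
    """Return the variant in "name+variant", else the first declared one."""
    _, sep, variant = full_name.partition("+")
    if sep:
        return variant
    # No "+" in the name: default to the first variant the OS declares.
    return declared_variants[0]
```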
Guido Trotter [Fri, 2 Oct 2009 10:47:55 +0000 (11:47 +0100)]
Populate OS variants if an api >= 15 is present
Adding the file name to the os_files dict will fill in the full path and
get it checked. If present, we also read it and split it into lines, one
per declared variant.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 10:27:40 +0000 (11:27 +0100)]
Add slot and constant for supported OS variants
The slot will contain a list of variants, and the variants file constant
contains the file in the os dir which is supposed to hold the list.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 10:39:59 +0000 (11:39 +0100)]
TryOSFromDisk: only check actual os scripts for +x
Currently all checked files in the loop are os scripts, so nothing will
change, but in the future we only want the +x bit on actual os scripts,
not necessarily all files.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 10:37:57 +0000 (11:37 +0100)]
TryOSFromDisk: s/os_scripts/os_files/
We'll be using this dict/loop to check more than just scripts, so we're
renaming the variables appropriately.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 10:26:24 +0000 (11:26 +0100)]
Convert os api version file name to a constant
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Fri, 2 Oct 2009 16:30:00 +0000 (17:30 +0100)]
Fix rpc.call_os_get to actually return the OS
Since nobody ever read the actual OS object, this bug was introduced in
the rpc conversion.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Jun Futagawa [Tue, 29 Sep 2009 07:43:15 +0000 (16:43 +0900)]
Add support for using the bootloader in xen-pvm
This patch adds three optional parameters:
- 'use_bootloader', whether to use the bootloader or not
- 'bootloader_path', absolute path to the bootloader
- 'bootloader_args', extra arguments to the bootloader
Syntax:
gnt-cluster modify --hypervisor-parameters \
xen-pvm:bootloader_path=/usr/bin/pygrub,use_bootloader=False
gnt-instance modify -H use_bootloader=True instance1.example.com
If use_bootloader is True, each domU can boot with its own kernel
instead of using the dom0 kernel.
Signed-off-by: Jun Futagawa <jfut@integ.jp>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: ultortter
Michael Hanselmann [Fri, 2 Oct 2009 15:44:29 +0000 (17:44 +0200)]
Disallow "xrange" function
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 2 Oct 2009 15:33:33 +0000 (17:33 +0200)]
Replace all xrange() with range()
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 2 Oct 2009 12:04:44 +0000 (14:04 +0200)]
More locking tests race conditions fixes
There were more race conditions. By adding a notify function to
SharedLock.acquire we can prevent them.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 2 Oct 2009 10:14:11 +0000 (12:14 +0200)]
check-python-code: Show line number for problems
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Thu, 1 Oct 2009 16:29:29 +0000 (17:29 +0100)]
LUSetNodeParams: autopromote self when needed
If we're de-offlining or de-draining a node we need to promote it to MC
if we do not have enough, or the config will be corrupt.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Thu, 1 Oct 2009 15:57:34 +0000 (16:57 +0100)]
Abstract self-promotion decision
During node add we decide whether to self promote to an MC. Abstract
this decision making to a separate function.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Thu, 1 Oct 2009 16:13:41 +0000 (17:13 +0100)]
Fix master candidate removal
Currently during a master candidate removal, when it's possible to
promote another node, the removal operation fails because of a corrupt
config before it's even possible to do the promotion. Fixing this by
doing the promotion before, excluding the current node.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Wed, 30 Sep 2009 16:25:29 +0000 (17:25 +0100)]
LUSetNodeParams: Don't break config on mc demotion.
If --force is used to demote an MC, but then there are not enough MCs in
the cluster, the configuration gets corrupted until a node is promoted.
In order to avoid that we only allow demotion with --force if the node
is offlined or drained at the same time, and we don't have any other
node available to promote.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Olivier Tharan <olive@google.com>
Guido Trotter [Thu, 1 Oct 2009 15:36:19 +0000 (16:36 +0100)]
Master candidate stats, return one more value
Other than returning the current number of candidates, and the number of
desired and possible candidates, we also return the maximum possible
number, even if greater than our desires. All callers for now ignore
this third value.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 30 Sep 2009 18:37:17 +0000 (19:37 +0100)]
SingleActionPipeCondition =~ s/Action/Notify/
With this patch we simplify usage of the SingleActionCondition (which
wasn't a condition at all) by making it a real condition. This way we
can just wait() on it, or notifyAll() as we would on a normal one. The
only catch is that notifyAll can be called only once, and wait can only
be called before notifyAll has, but luckily our PipeCondition, now quite
simplified, takes care of this, by providing a new SingleActionCondition
each time the previous one has been notified.
No Start/StopUsing functions are needed anymore, and thus the condition
is a lot more robust, and there's no way file descriptors can be left
open, as they are closed in a finally block, in the same function where
they were opened, by the last thread exiting the class.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 1 Oct 2009 13:09:35 +0000 (14:09 +0100)]
testNotification: add more checking about order
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 1 Oct 2009 12:56:32 +0000 (13:56 +0100)]
Abstract base condition test cases
This way they can be used to test different condition classes.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 1 Oct 2009 09:41:36 +0000 (10:41 +0100)]
Move the "done" queue inside _ThreadedTestCase
All (ok, all but one) _ThreadedTestCase users have a done Queue, so we
move its building in the _ThreadedTestCase setUp
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 30 Sep 2009 17:30:07 +0000 (18:30 +0100)]
Abstract "base" condition code in a separate class
Each condition has an underlying lock, the acquire and release methods,
and a few helper methods to check that it's called in the proper way.
Abstract them to a separate class so we can have more than one without
duplicating this code.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Thu, 1 Oct 2009 15:22:02 +0000 (17:22 +0200)]
locking.SharedLock: Fix bug in delete function
SharedLock.__acquire_unlocked uses keyword parameters. Just passing
the timeout would set the “shared” parameter.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
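The class of bug fixed here is easy to reproduce with a stand-in function. The parameter order and defaults below are assumptions chosen to mirror the description; they are not copied from SharedLock.

```python
def acquire_unlocked(shared=0, timeout=None):
    """Stand-in for a method whose first parameter is `shared`."""
    return {"shared": shared, "timeout": timeout}


# Buggy call: the positional timeout lands in the `shared` parameter.
buggy = acquire_unlocked(5.0)

# Fixed call: the keyword makes the intent explicit.
fixed = acquire_unlocked(timeout=5.0)
```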
Michael Hanselmann [Tue, 29 Sep 2009 15:20:06 +0000 (17:20 +0200)]
Rename LockSet.acquire parameter “blocking” to “timeout”
Also remove the “blocking” parameter from LockSet.remove and
GanetiLockManager.remove. There's no point in implementing timeouts on removal
unless we need them.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 30 Sep 2009 16:24:43 +0000 (18:24 +0200)]
Try to fix locking unittests
Our automated test system found a few problems in the new locking
unittests. This patch should fix them, although I wasn't able to
reproduce the problem. All are race conditions.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 28 Sep 2009 15:44:19 +0000 (17:44 +0200)]
Change SharedLock to new pipe(2)-based condition
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Mon, 28 Sep 2009 15:43:53 +0000 (17:43 +0200)]
Add _PipeCondition class
_PipeCondition is a condition implemented using pipe(2) and poll(2).
It allows the implementation of timeouts without using a busy-wait loop
with time.sleep.
Unlike Python's built-in threading.Condition class and to save file
descriptors and an internal queue, it can only be used to notify
all waiters. Ganeti's use case for this condition class doesn't
require the ability to notify only one waiter.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 29 Sep 2009 14:12:12 +0000 (16:12 +0200)]
Add _SingleActionPipeCondition class
This class will be used as a basic block for pipe(2)-based
conditions. Upon initialization it creates a pipe and can be
notified once (hence the “single action” in the name). A
callable helper class is used to wait for notifications.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
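A simplified sketch of a one-shot notification built on pipe(2) and poll(2). The class name and methods are hypothetical, and the locking a real condition needs is left out. It assumes a POSIX platform where select.poll is available.

```python
import os
import select


class SingleActionPipe:
    """A one-shot notification built on os.pipe() and select.poll().

    Illustrative only; it is not the Ganeti class.
    """

    def __init__(self):
        self._read_fd, self._write_fd = os.pipe()
        self._notified = False

    def notify_all(self):
        """Wake every waiter; may only be called once."""
        if self._notified:
            raise RuntimeError("already notified")
        self._notified = True
        # The byte is never read, so the read end stays readable for
        # every current and future poller.
        os.write(self._write_fd, b"x")

    def wait(self, timeout):
        """Return True if notified within `timeout` seconds, else False."""
        poller = select.poll()
        poller.register(self._read_fd, select.POLLIN)
        # poll() takes milliseconds; no busy-wait loop is needed.
        return bool(poller.poll(int(timeout * 1000)))

    def close(self):
        """Release both file descriptors."""
        os.close(self._read_fd)
        os.close(self._write_fd)
```

Because the waiters block in poll(2) with a timeout, timed waits need no sleep loop, at the cost of two file descriptors per pending notification.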
Michael Hanselmann [Thu, 24 Sep 2009 12:21:18 +0000 (14:21 +0200)]
SharedLock: implement timeouts
This patch greatly simplifies the SharedLock code and implements
timeouts for the acquire() and delete() functions. A wrapper around
Python's threading.Condition class must be used to ensure thread
safety when checking whether there are any waiters left.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Luca Bigliardi [Wed, 30 Sep 2009 13:51:11 +0000 (14:51 +0100)]
Extend confd instances ips query
The query now accepts a link parameter.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 30 Sep 2009 12:57:24 +0000 (14:57 +0200)]
Merge remote branch 'origin/master' into mogu
* origin/master:
Fix burnin's verbose mode
Final NEWS update and version increase for 2.0.4
Encode the actual exception raised by LU execution
Move the luxi error handling into errors.py
Fix the confusing ssh/hostname message in node add
Add man page for ganeti-cleaner
Update NEWS file for version 2.0.4
Automatically cleanup _temporary_ids at save
Separate the computation of all config IDs
Change config upgrade to be explicit
Fix unittest breakage due to new test file
Fix /proc/drbd parsing in presence of gaps
repair-size: ensure child disks have sane sizes
Fix yet another bug in LURepairDiskSizes
Fix a bug in LURepairDiskSizes
Conflicts:
NEWS (trivial, the RST change)
lib/cmdlib.py (trivial, some small change in 2.0)
lib/config.py (due to the cherry-picks and UUID changes in 2.1)
Signed-off-by: Iustin Pop <iustin@google.com>
Luca Bigliardi [Tue, 29 Sep 2009 17:07:37 +0000 (18:07 +0100)]
Improve description of migrate/failover post hooks env
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Luca Bigliardi [Tue, 29 Sep 2009 16:15:36 +0000 (17:15 +0100)]
Update env vars for instances in hooks documentation
Remove variables which are listed at the beginning of the section and variables
which are not declared when building hooks env.
Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 29 Sep 2009 15:48:42 +0000 (17:48 +0200)]
Fix burnin's verbose mode
The timestamp needs special formatting, which was done for the internal
buffer storage but not for the messages logged in verbose mode. This
patch unifies the formatting for these two cases.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 29 Sep 2009 10:19:59 +0000 (12:19 +0200)]
Final NEWS update and version increase for 2.0.4
QA passed successfully, let's try to have a release.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 24 Sep 2009 16:31:31 +0000 (17:31 +0100)]
Add initial confd client unittests
Some basic tests for the confd client library
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Tue, 29 Sep 2009 08:25:33 +0000 (09:25 +0100)]
confd/client: make it possible to update peer list
Until now the peers have to be the same all the time. Adding a new
function to update the list, and call it from the constructor to avoid
duplicating code.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 29 Sep 2009 09:06:55 +0000 (11:06 +0200)]
Remove ‘-u’ from masterd shebang
This is not needed anymore - the original change was more than a year
ago when masterd was in its incipient phase.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 28 Sep 2009 15:53:30 +0000 (16:53 +0100)]
confd/client: pass self to upcalls
It may be handy for upcalls to know which client called them, and call
it back. So we create a new "client" field in the upcall target,
containing the current client instance
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 28 Sep 2009 09:23:53 +0000 (10:23 +0100)]
devel/upload.in: make it more project generic
Only install ganeti specific files if they exist. This way we can call
ganeti's devel/upload in other sub-projects (e.g. nbma) and have it
uploaded to a host as well, without having to create a new script there.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 25 Sep 2009 17:15:05 +0000 (18:15 +0100)]
ConfdFilterCallback: fix a bug in expire
The HandleExpire function takes the whole "up" structure, and not just
the salt.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 25 Sep 2009 16:56:00 +0000 (17:56 +0100)]
ganeti-confd: cleanup imports
Many functionalities of confd have been moved to other classes/modules,
and the main confd daemon doesn't reference these modules directly
anymore.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 25 Sep 2009 14:51:00 +0000 (15:51 +0100)]
ganeti-confd: don't depend on the os log dir
ganeti-confd doesn't need to log anything related to os installations.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 27 Aug 2009 13:02:50 +0000 (15:02 +0200)]
Encode the actual exception raised by LU execution
Currently, the actual exception raised during an LU execution (one of
OpPrereqError, OpExecError, HooksError, etc.) is lost because the
jqueue.py code simply sets that to a str(err), and the code in cli.py
simply passes that string to OpExecError.
This patch moves to encoding the errors as per errors.EncodeError and
changes the cli code to parse and raise that (if possible).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit
bcb66fcabfb31ac63beebcc2249bbb8cb30703ae)
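The idea of carrying the exception class across the wire can be sketched as follows. The helper names encode_error and decode_error, the (class name, args) format, and the fallback to OpExecError for unknown classes are assumptions for illustration; the exception classes are local stand-ins for the ones named in the message.

```python
class OpPrereqError(Exception):
    """Stand-in for the error raised when prerequisites fail."""


class OpExecError(Exception):
    """Stand-in for the error raised during execution."""


# Only classes listed here can be rebuilt on the client side.
_KNOWN = {cls.__name__: cls for cls in (OpPrereqError, OpExecError)}


def encode_error(err):
    """Turn an exception into a serializable (class name, args) pair."""
    return (type(err).__name__, list(err.args))


def decode_error(encoded):
    """Rebuild the exception, falling back to OpExecError if unknown."""
    name, args = encoded
    cls = _KNOWN.get(name, OpExecError)
    return cls(*args)
```

With this shape the client can raise the same class the LU raised, rather than wrapping every failure in a single error type.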
Iustin Pop [Thu, 27 Aug 2009 13:01:23 +0000 (15:01 +0200)]
Move the luxi error handling into errors.py
Currently the luxi error handling is hardcoded as special encoding on
the masterd-side and special decoding on the client side. This patch
moves it to errors.py such that other parts of the code can reuse the
same encoding.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit
6956e9cd11e7fc3849824d9953c622143a732bd7)
Iustin Pop [Fri, 25 Sep 2009 15:34:47 +0000 (17:34 +0200)]
Fix the confusing ssh/hostname message in node add
Before, it used to say:
ssh/hostname verification failed node1.example.com -> hostname mismatch, got
node2
Now it says for wrong hostnames (maybe too verbose):
ssh/hostname verification failed (checking from node1.example.com): hostname
mismatch, expected node2.example.com but got node3
And for non-FQDN hostnames:
ssh/hostname verification failed (checking from node1.example.com): hostname
not FQDN: expected node2.example.com but got node2
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Fri, 25 Sep 2009 15:25:47 +0000 (17:25 +0200)]
Add man page for ganeti-cleaner
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Fri, 25 Sep 2009 14:11:08 +0000 (15:11 +0100)]
Remove secrets and kill confd on cluster leave
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Wed, 23 Sep 2009 15:05:19 +0000 (16:05 +0100)]
Implement ConfdFilterCallback
This callback can be stacked with another one, and will filter duplicate
or old results, making handling of results easier.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 24 Sep 2009 15:34:25 +0000 (16:34 +0100)]
Confd client: add module level documentation
Populate the docstring with documentation on the client library's usage.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Fri, 25 Sep 2009 14:11:29 +0000 (16:11 +0200)]
Update NEWS file for version 2.0.4
We don't bump up the version yet, pending more QA tests.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 25 Sep 2009 11:53:34 +0000 (13:53 +0200)]
Add uuid on node/instance add and cluster init
This patch does a little bit of cleanup first, since we want to call
GenerateUniqueID without reacquiring the lock.
Note that we don't necessarily need to do this for the cluster, since at
first startup ConfigWriter will do it anyway. But it's better to
explicitly do this instead of relying on the automated upgrade.
Additionally this patch adds ctime/mtime population at cluster init
time. mtime is not necessarily needed (master will update it
automatically, but we're doing it anyway for consistency).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 25 Sep 2009 11:38:03 +0000 (13:38 +0200)]
Fix RAPI QA, broken by NIC parameters changes
This also adds the new nic.links query field.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 23 Sep 2009 16:39:05 +0000 (18:39 +0200)]
Automatically cleanup _temporary_ids at save
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit
34d657bae4361b9d6fd8c6314dc7cca57b51c773)
Iustin Pop [Wed, 23 Sep 2009 15:37:46 +0000 (17:37 +0200)]
Separate the computation of all config IDs
We will need this in another place, so we abstract the 'compute all
current IDs' functionality into a separate function. We also change the
name of the _ComputeAllLVs to _AllLVs to match the other _All*s
functions.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
(cherry picked from commit
34e54ebc1417f843bc96c86025f15fb76c3a023a)
Iustin Pop [Tue, 22 Sep 2009 13:54:20 +0000 (15:54 +0200)]
Change config upgrade to be explicit
Currently the config upgrade is done at each object instantiation, that
means that ganeti-noded will run UpgradeConfig on all objects received
remotely (instances, disks, nics). This is not so good, so this patch
changes it so that only the ConfigWriter runs this method at
configuration load time.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
(cherry picked from commit
90d726a80ce40a4990a26566211514e863985efc)
Iustin Pop [Fri, 25 Sep 2009 12:22:58 +0000 (14:22 +0200)]
Merge commit 'origin/next'
* commit 'origin/next': (74 commits)
Fix gnt-node modify online help
Fix gnt-job info entry in gnt-job(8)
locking: Don't swallow exceptions
Add check for duplicate MACs in instance add
scripts/gnt-node: fix a help string
Optimise multi-job submit
Extend gnt-debug with more debugging options
Return cluster tags from LUQueryClusterInfo
Add script to clean archived jobs after 21 days
rapi: export more static node information
Pass the correct signal to handlers
cli: Use ToStdout/ToStderr instead of print
Fix small typo in gnt-node
Simplify handling of boolean args in rapi
Fix checks in LUSetNodeParms for the master node
Improve the example startup script
Fix insserv dependencies
Fix a typo in InitCluster
Ignore results from drained nodes in iallocator
Ship the ethers hook
...
Michael Hanselmann [Thu, 24 Sep 2009 14:01:27 +0000 (16:01 +0200)]
Wrap documentation to max 72 characters per line
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 23 Sep 2009 15:36:36 +0000 (17:36 +0200)]
Set Vim textwidth in each documentation file
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>