ganeti-local
12 years agoFix types passed to IAllocator
Iustin Pop [Tue, 2 Aug 2011 13:01:34 +0000 (15:01 +0200)]
Fix types passed to IAllocator

In IAllocator mode reloc, the parameter reloc_from takes a list; half of
the code already forced this parameter to a list, so we add the other two
cases where it is needed.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: change absolute to relative symlinks
Iustin Pop [Tue, 2 Aug 2011 12:59:00 +0000 (14:59 +0200)]
htools: change absolute to relative symlinks

Currently we use absolute symlinks, but this doesn't work when we
install remotely (because we install first to a local temp dir, then
rsync to remote machines). To fix this, we change to manually computed
relative paths, which is not ideal, but it works.

One possible alternative would be to use hard-links…
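
A minimal sketch of computing a relative link target, assuming Python and a hypothetical directory layout. It only illustrates the idea; it is not the build logic this patch changes.

```python
import os


def relative_symlink(target, link_path):
    """Create link_path as a symlink to target using a relative path."""
    # Express the target relative to the directory holding the link, so
    # the link stays valid when the whole tree is copied elsewhere.
    rel_target = os.path.relpath(target, os.path.dirname(link_path))
    os.symlink(rel_target, link_path)
    return rel_target
```

With a tree such as `<root>/lib/htools/htools` and a link at `<root>/bin/hbal` (both names are assumptions), the stored target would be `../lib/htools/htools`, which no longer depends on where `<root>` lives.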

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agojqueue: Add short delay before detecting job changes
Michael Hanselmann [Tue, 2 Aug 2011 09:48:09 +0000 (11:48 +0200)]
jqueue: Add short delay before detecting job changes

By sleeping for 100ms after receiving a notification for a changed job
file the job is given some additional time to change again. This
significantly reduces the number of LUXI calls for WaitForJobChanges
(depending on the job, in my tests with “gnt-cluster verify
--debug-simulate-errors” by about 80%), and improves performance (the
same job went from around 7 seconds to around 3.5 seconds).

This method is not perfect. The algorithm could be made more complex,
e.g. by increasing the delay on each change, etc., but for now this
simple change provides a good improvement.
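
The idea can be sketched as follows, assuming Python. The two callables are hypothetical stand-ins for the file notification and the job file reader; this is not the jqueue code itself.

```python
import time


def wait_for_change(wait_for_notification, read_status, delay=0.1):
    """Block until notified, pause briefly, then read the status once.

    wait_for_notification and read_status are hypothetical callables
    used only for this illustration.
    """
    wait_for_notification()
    # Give the job time to change again, so that several quick updates
    # are reported by one call instead of one call each.
    time.sleep(delay)
    return read_status()
```

The trade-off is a fixed latency of `delay` seconds per reported change in exchange for fewer round trips.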

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoAdd primary/second nodes' group as query fields
Michael Hanselmann [Thu, 28 Jul 2011 11:37:20 +0000 (13:37 +0200)]
Add primary/second nodes' group as query fields

These will be very useful for ganeti-watcher as it needs to retrieve
instances by group.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFix doclint failures
Iustin Pop [Tue, 2 Aug 2011 06:58:27 +0000 (08:58 +0200)]
Fix doclint failures

Commit 54ca6e4b2 renamed some arguments, but didn't also rename them
in the docstrings.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agowatcher: Separate function for writing instance status file
Michael Hanselmann [Fri, 29 Jul 2011 13:56:05 +0000 (15:56 +0200)]
watcher: Separate function for writing instance status file

For now this will do another query to the master daemon, but with the
split for node groups this issue will go away.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agowatcher: Make RAPI error messages less technical
Michael Hanselmann [Fri, 29 Jul 2011 13:49:55 +0000 (15:49 +0200)]
watcher: Make RAPI error messages less technical

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agowatcher.state: Use strings, not objects
Michael Hanselmann [Fri, 29 Jul 2011 13:43:14 +0000 (15:43 +0200)]
watcher.state: Use strings, not objects

Until now the state class would receive instances as objects
(ganeti.watcher.Instance), but this is not necessary. By using strings
the interface is simplified.

This patch also simplifies some code accessing the internal structures,
e.g. setting a key of a dictionary. Some instances of “del dict[key]”
are replaced with “dict.pop(key, None)” to suppress any exceptions if
the key doesn't exist.
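
A short illustration of the difference, assuming Python. The dictionary contents are made up for the example.

```python
state = {"inst1.example.com": {"restart_count": 2}}

# "del" raises KeyError when the key is missing.
try:
    del state["inst2.example.com"]
    del_raised = False
except KeyError:
    del_raised = True

# pop() with a default removes the key if present and stays silent
# otherwise, returning the default instead.
removed = state.pop("inst2.example.com", None)
existing = state.pop("inst1.example.com", None)
```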

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agowatcher: Raise error on unknown hook status
Michael Hanselmann [Fri, 29 Jul 2011 13:20:42 +0000 (15:20 +0200)]
watcher: Raise error on unknown hook status

Also, remove punctuation from one error message.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agowatcher: Reformat constants
Michael Hanselmann [Fri, 29 Jul 2011 13:19:04 +0000 (15:19 +0200)]
watcher: Reformat constants

Make them match with style guide.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoAdd new watcher constants
Michael Hanselmann [Fri, 29 Jul 2011 13:13:58 +0000 (15:13 +0200)]
Add new watcher constants

WATCHER_STATEFILE will be removed at the end of this
patch series.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFix formatting of frozensets
Stephen Shirley [Fri, 29 Jul 2011 12:15:40 +0000 (14:15 +0200)]
Fix formatting of frozensets

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocli: Add constant for node group option
Michael Hanselmann [Thu, 28 Jul 2011 09:26:36 +0000 (11:26 +0200)]
cli: Add constant for node group option

ganeti-watcher will use this constant to pass the option to itself for
processing all node groups.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoReplace %r with '%s' in masterd/instance.py
Iustin Pop [Fri, 29 Jul 2011 08:55:44 +0000 (10:55 +0200)]
Replace %r with '%s' in masterd/instance.py

I still don't know why Michael is a fan of %r, but in the meantime
this patch changes:

WARNING: import u'import-2011-07-29_01_39_33-y3gZKV' on node1 failed:
Exited with status 1

into:

WARNING: import 'import-2011-07-29_01_39_33-y3gZKV' on node1 failed:
Exited with status 1

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd "reboot_behavior" hypervisor flag
Stephen Shirley [Mon, 20 Jun 2011 15:52:55 +0000 (17:52 +0200)]
Add "reboot_behavior" hypervisor flag

During instance installations, you do not want the instance to reboot
and start again with the same parameters, as that will most likely
re-start the install process. Therefore, when the instance requests a
reboot it should instead shut down. This flag allows this to be
controlled.

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoRemoved non-existing -t option from the gnt-cluster man page
Andrea Spadaccini [Fri, 29 Jul 2011 09:18:55 +0000 (10:18 +0100)]
Removed non-existing -t option from the gnt-cluster man page

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoClear the OS scripts environment
Iustin Pop [Thu, 28 Jul 2011 13:21:30 +0000 (15:21 +0200)]
Clear the OS scripts environment

The OS scripts currently run with the whole noded environment; this is
different from the hooks which run with a cleared one and most likely
an oversight.

This _might_ create problems when upgrading, so it needs to be clearly
announced for the new version.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agowatcher: Split state class into separate module
Michael Hanselmann [Wed, 27 Jul 2011 08:22:20 +0000 (10:22 +0200)]
watcher: Split state class into separate module

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoRename watcher's constant for instance status file
Michael Hanselmann [Wed, 27 Jul 2011 08:46:52 +0000 (10:46 +0200)]
Rename watcher's constant for instance status file

“upfile” is a bad name.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoFixed a typo in the installation tutorial
Andrea Spadaccini [Thu, 28 Jul 2011 19:37:04 +0000 (20:37 +0100)]
Fixed a typo in the installation tutorial

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agowatcher: Split node maintenance into separate module
Michael Hanselmann [Tue, 26 Jul 2011 12:14:18 +0000 (14:14 +0200)]
watcher: Split node maintenance into separate module

The node maintenance class is standalone.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFixed doc compilation under Sphinx 1.0.7
Andrea Spadaccini [Thu, 28 Jul 2011 11:23:47 +0000 (12:23 +0100)]
Fixed doc compilation under Sphinx 1.0.7

Sphinx 1.0.7 complains if an indented block in .warning starts with :option.
This fixes it.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoMerge branch 'devel-2.4'
Iustin Pop [Thu, 28 Jul 2011 11:10:28 +0000 (13:10 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  Add support for cluster/OS parameters in QA
  Add OS search path to gnt-cluster info

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoRemove requirement for variants on OS API v15+
Iustin Pop [Wed, 27 Jul 2011 15:18:48 +0000 (17:18 +0200)]
Remove requirement for variants on OS API v15+

This removes:

- the check in backend that such OSes have a variants file and that,
  if it exists, it is non-empty; in order for this to work, we also
  rework the logic in backend._TryOSFromDisk to allow for optional OS
  files
- the check in cluster verify that such OSes have a non-empty variant
  list (the check for consistent variants is still kept)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd support for cluster/OS parameters in QA
Iustin Pop [Thu, 28 Jul 2011 09:18:08 +0000 (11:18 +0200)]
Add support for cluster/OS parameters in QA

Currently there is no way to QA with (for example) an initrd because
the QA only inits the cluster with the default parameters. This makes
it impossible to QA using anything but the default parameters, which
doesn't always work.

Additionally, we add OS parameters and OS hypervisor parameters, for
completeness and for testing that these commands also work.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoRevert "cli.JobExecutor: Feedback function for info output"
Iustin Pop [Wed, 27 Jul 2011 16:46:47 +0000 (18:46 +0200)]
Revert "cli.JobExecutor: Feedback function for info output"

This reverts commit 7421df8e5f2cf31022085b332d1300640ba5854b.

The feedback_fn argument to JobExecutor is used for PollJob, and thus
has a fixed signature: a single arg, tuple of (timestamp, log type,
log message). Its use as a drop-in replacement for ToStdout doesn't
work, as that function has a different signature.

For now, I propose to revert this, until we either change JobExecutor
to use the same log messages (and add an intermediate wrapper between
JobExecutor and ToStdout) or we add another parameter to
JobExecutor.__init__.
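
The "intermediate wrapper" option could look roughly like this, assuming Python. Here `to_stdout` and `make_feedback_fn` are hypothetical names for illustration, not the actual Ganeti functions.

```python
def to_stdout(fmt, *args):
    """Hypothetical stand-in for a printf-style output function."""
    text = fmt % args if args else fmt
    print(text)
    return text


def make_feedback_fn(output_fn):
    """Adapt a printf-style function to the single-tuple signature."""
    def feedback_fn(entry):
        # The feedback callback receives one argument: a tuple of
        # (timestamp, log type, log message).
        timestamp, log_type, log_msg = entry
        return output_fn("%s %s: %s", timestamp, log_type, log_msg)
    return feedback_fn
```

The adapter keeps both signatures unchanged, at the cost of one extra layer between the job executor and the output function.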

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoExtend the ovf-support design with format translation
Agata Murawska [Thu, 28 Jul 2011 08:10:02 +0000 (10:10 +0200)]
Extend the ovf-support design with format translation

Signed-off-by: Agata Murawska <agatamurawska@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoAdd a QA constant for cluster verify command
Iustin Pop [Wed, 27 Jul 2011 16:45:16 +0000 (18:45 +0200)]
Add a QA constant for cluster verify command

This seems to be used and reused multiple times, let's abstract it…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix group verification of offline nodes
Iustin Pop [Wed, 27 Jul 2011 16:02:46 +0000 (18:02 +0200)]
Fix group verification of offline nodes

Commit aef59ae7 reworked the file verification, but forgot to take
into account offline nodes.

The fact that this was not detected yet is due to the fact that we
don't test clusters with offline nodes in QA :(

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoDisallow variants for OSes that don't support them
Iustin Pop [Wed, 27 Jul 2011 11:53:24 +0000 (13:53 +0200)]
Disallow variants for OSes that don't support them

Otherwise we get no variant checks at all, but the variant is still
recorded.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix QA OS API failure
Iustin Pop [Wed, 27 Jul 2011 10:32:43 +0000 (12:32 +0200)]
Fix QA OS API failure

The patch changing the OS API in QA to 20 was not complete, sorry.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoQA: test using OS API v20
Iustin Pop [Tue, 26 Jul 2011 17:11:28 +0000 (19:11 +0200)]
QA: test using OS API v20

v20 is (mostly) a superset of the other versions, so testing with it
should be better than with v10. This properly detects the breakage
fixed by the previous patch.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix OS queries for API v20 w/parameters
Iustin Pop [Tue, 26 Jul 2011 16:34:46 +0000 (18:34 +0200)]
Fix OS queries for API v20 w/parameters

OS parameters is a list of tuples, so we can't pass it directly to
utils.NiceSort, hence we use a sort key.
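
A simplified sketch of sorting tuples with a key, assuming Python. The `nice_sort` function below is a hypothetical reimplementation written for this example and does not reproduce the signature or behaviour of utils.NiceSort.

```python
import re


def nice_sort(items, key=None):
    """Sort so that embedded numbers compare numerically (simplified)."""
    def natural(item):
        text = key(item) if key is not None else item
        # Split into digit and non-digit runs; digit runs compare as
        # integers, and the leading tag keeps element types uniform.
        return [(0, int(part), "") if part.isdigit() else (1, 0, part)
                for part in re.findall(r"\d+|\D+", text)]
    return sorted(items, key=natural)


# Each OS parameter is a tuple, so the name is extracted with a key.
os_params = [("param10", "tenth"), ("param2", "second"), ("param1", "first")]
sorted_params = nice_sort(os_params, key=lambda pair: pair[0])
```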

This was not detected in QA since QA only tests API v10 :(

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd helper for declaring all locks shared
Iustin Pop [Tue, 26 Jul 2011 11:03:20 +0000 (13:03 +0200)]
Add helper for declaring all locks shared

This patch adds a function for abstracting
“dict.fromkeys(locking.LEVELS, 1)”. It also removes a duplicate
assignment for the share_locks in LUInstanceQuerydata.

Additionally, it moves the _SupportsOob function to the helper
function list.
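
Such a helper can be sketched as below, assuming Python. The level names and the function name are hypothetical; the real levels live in the locking module.

```python
# Hypothetical lock levels, for illustration only.
LEVELS = ["cluster", "instance", "nodegroup", "node"]


def share_all_locks():
    """Return a dict declaring every lock level as shared (1 = shared)."""
    # A fresh dict is built on every call, so callers may modify it.
    return dict.fromkeys(LEVELS, 1)
```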

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoAdd ht-based result checks to opcodes
Michael Hanselmann [Tue, 26 Jul 2011 11:12:11 +0000 (13:12 +0200)]
Add ht-based result checks to opcodes

This adds the infrastructure necessary to check opcode results using
ht-based functions. Checks are added for two opcodes.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoChange OpClusterVerifyDisks to per-group opcodes
Michael Hanselmann [Tue, 26 Jul 2011 09:34:00 +0000 (11:34 +0200)]
Change OpClusterVerifyDisks to per-group opcodes

Until now verifying disks, which is also used by the watcher,
would lock all nodes and instances. With this patch the opcode
is changed to operate per node group, requiring fewer locks.

Both “gnt-cluster” and “ganeti-watcher” are changed for the
new interface.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocmdlib: Give instance name in error message on group evacuation
Michael Hanselmann [Tue, 26 Jul 2011 09:32:22 +0000 (11:32 +0200)]
cmdlib: Give instance name in error message on group evacuation

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocmdlib: Factorize mapping instance LVs to node/volume
Michael Hanselmann [Tue, 26 Jul 2011 09:31:42 +0000 (11:31 +0200)]
cmdlib: Factorize mapping instance LVs to node/volume

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocli.JobExecutor: Feedback function for info output
Michael Hanselmann [Tue, 26 Jul 2011 09:31:04 +0000 (11:31 +0200)]
cli.JobExecutor: Feedback function for info output

This will be used in the watcher where we don't want to
pollute stdout unless in debug mode.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoAdd OS search path to gnt-cluster info
Ben Lipton [Mon, 25 Jul 2011 17:22:36 +0000 (13:22 -0400)]
Add OS search path to gnt-cluster info

Otherwise, it's pretty hard to figure it out from the command line.

Signed-off-by: Ben Lipton <benlipton@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocluster-merge: remove a hardcoded constant
Guido Trotter [Tue, 26 Jul 2011 08:46:02 +0000 (10:46 +0200)]
cluster-merge: remove a hardcoded constant

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocluster-merge: remove option list from usage
Guido Trotter [Tue, 26 Jul 2011 08:23:52 +0000 (10:23 +0200)]
cluster-merge: remove option list from usage

It doesn't make sense to have to keep them up to date twice, and --help
already lists all of them with help strings.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agocluster-merge: add instance restart strategy opt
Guido Trotter [Mon, 25 Jul 2011 15:17:23 +0000 (15:17 +0000)]
cluster-merge: add instance restart strategy opt

Right now we always restart all instances, which is not right if some
instances were already down for other reasons. Thus we add an option to
decide how to handle this. The right default should be "up" which is:
"restart all instances which were switched off by the merge", but since
that's not implemented yet, the default remains the old one, for now.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFix recompilation of htools on regen-vcs-version
Iustin Pop [Mon, 25 Jul 2011 16:42:52 +0000 (18:42 +0200)]
Fix recompilation of htools on regen-vcs-version

Currently, most htools code depends on Constants.hs which is generated
from constants.py and also depends on _autoconf.py. Also, _autoconf.py
depends on vcs-version, which all together means that when 'make
regen-vcs-version' is run, for example by ./devel/upload, most of the
Haskell code needs recompilation.

Since htools already has its 'optimised' vcs-version (and doesn't use
the _autoconf.VCS_VERSION constants), we can optimise this as follows:

- _autoconf.py doesn't contain the VCS_VERSION anymore, and that is
  instead moved to _vcsversion.py
- constants.py depends on and imports this new module
- _autoconf.py doesn't get regenerated at vcs-version changes, but
  only at re-running configure/changing Makefile time

The end result is that only htools/Ganeti/HTools/Version.hs is
recompiled now, which is a significant speedup (usually < 1 second
versus 10 seconds previously).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoAdd another name for the --yes-do-it option
Iustin Pop [Mon, 25 Jul 2011 16:26:18 +0000 (18:26 +0200)]
Add another name for the --yes-do-it option

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoMost boring patch ever
Iustin Pop [Mon, 25 Jul 2011 11:07:36 +0000 (13:07 +0200)]
Most boring patch ever

s/'/"/ in (hopefully) the right places.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Mon, 25 Jul 2011 13:02:23 +0000 (15:02 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  Reopen daemon's stdio on SIGHUP
  Reopen log file only once after SIGHUP
  Don't leak file descriptors when setting up daemon output
  Fix aliases in bash completion

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoReopen daemon's stdio on SIGHUP
Michael Hanselmann [Mon, 25 Jul 2011 12:35:51 +0000 (14:35 +0200)]
Reopen daemon's stdio on SIGHUP

Before this patch daemons would continue to refer to an old logfile for
their standard I/O if they had been asked to reopen the log (SIGHUP).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoReopen log file only once after SIGHUP
Michael Hanselmann [Mon, 25 Jul 2011 11:08:22 +0000 (13:08 +0200)]
Reopen log file only once after SIGHUP

Commit b6fa9a44 added a re-openable log handler. The log file is
reopened when a daemon is sent a HUP signal. Due to a bug in the code,
fixed by this patch, the log file would be reopened for every single log
message thereafter.
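
The message does not say what the bug was. One plausible way such behaviour can arise is a "reopen requested" flag that is never cleared; the toy handler below, assuming Python and a hypothetical class name, shows the reopen-once pattern.

```python
class ReopenableLog:
    """Toy log handler that reopens its file on request (a sketch)."""

    def __init__(self, path):
        self._path = path
        self._reopen = False
        self._stream = open(path, "a")
        self.reopen_count = 0

    def request_reopen(self):
        # Meant to be called from a SIGHUP handler; only sets a flag.
        self._reopen = True

    def emit(self, message):
        if self._reopen:
            self._stream.close()
            self._stream = open(self._path, "a")
            self.reopen_count += 1
            # Clearing the flag is the important step: without it the
            # file would be reopened for every later message.
            self._reopen = False
        self._stream.write(message + "\n")
        self._stream.flush()
```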

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoDon't leak file descriptors when setting up daemon output
Michael Hanselmann [Mon, 25 Jul 2011 10:02:24 +0000 (12:02 +0200)]
Don't leak file descriptors when setting up daemon output

When a daemon's output is configured using “utils.SetupDaemonFDs”, the
function must use dup2(2). Unfortunately the code didn't close the
original file descriptors, leaking them in the process.
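
A minimal sketch of the pattern, assuming Python. The function name is hypothetical and this is not the body of utils.SetupDaemonFDs.

```python
import os


def redirect_fd(path, target_fd):
    """Point target_fd at path without leaking the temporary descriptor."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # dup2 makes target_fd refer to the same open file as fd.
        os.dup2(fd, target_fd)
    finally:
        # The original descriptor is no longer needed; closing it avoids
        # the leak. Skip the close if open() happened to return target_fd.
        if fd != target_fd:
            os.close(fd)
```

After the call, only `target_fd` refers to the file, so repeated redirections do not accumulate open descriptors.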

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agohtools: rework the algorithm for ChangeAll mode
Iustin Pop [Mon, 18 Jul 2011 11:17:41 +0000 (13:17 +0200)]
htools: rework the algorithm for ChangeAll mode

I think I've identified the problem with the current ChangeAll
mode. The current algorithm works as follows:

- identify a new primary by choosing the node which gives best score
  as new secondary
- failover to it
- identify a new secondary by choosing the node which gives best score
  as new secondary

This means that the future primary is 'fixed' after the first
iteration, leading to possibly suboptimal results. This patch changes
the algorithm to do what, in hindsight, seems the obvious thing to do:
- generate all pairs (primary, secondary)
- identify the pair that after the above sequence (r:np, f, r:ns)
  gives the best group score

This fixes some of the corner cases I've seen in relocation, but not
all; the remaining cases are related to multi-instance relocation and
while they can't be fixed in the current framework, the needed
rebalancing is much smaller than with the current algorithm.
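
The pair-based search can be sketched as follows, assuming Python rather than the Haskell used in htools. The `score` callable and the "lower is better" convention are assumptions made for this illustration.

```python
from itertools import permutations


def best_pair(nodes, score):
    """Return the (primary, secondary) pair with the lowest score.

    score is a hypothetical callable giving the group score that would
    result from placing the instance on the pair; lower is assumed to
    be better.
    """
    # permutations(nodes, 2) yields every ordered pair of distinct nodes,
    # so primary and secondary are chosen together rather than in turn.
    return min(permutations(nodes, 2), key=lambda pair: score(*pair))
```

Evaluating all pairs costs on the order of n² score computations for n nodes, which is the price of not fixing the primary early.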

The patch also fixes an issue with the docstring of another function.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agognt-instance info: Return static info if node offline
Michael Hanselmann [Fri, 22 Jul 2011 11:27:05 +0000 (13:27 +0200)]
gnt-instance info: Return static info if node offline

Before this patch “gnt-instance info” would fail with the error message
“Error checking node $node: Node is marked offline” if the instance's
primary node is marked offline and the user didn't explicitly request
static information only. With this patch the LU will automatically
return static information if the instance's primary node is marked
offline.

Some explicit loops are changed to map().

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoIgnore offline primary when failing over
Michael Hanselmann [Fri, 22 Jul 2011 11:04:57 +0000 (13:04 +0200)]
Ignore offline primary when failing over

When the source node for a failover is marked offline, there's no need
to require the user to specify “--ignore-consistency”.

To make it work at all, a number of bugs introduced by the merge of
migration and failover are also fixed by this patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agohtools: replace two hardcoded uses of pri+sec nodes
Iustin Pop [Fri, 15 Jul 2011 08:53:45 +0000 (10:53 +0200)]
htools: replace two hardcoded uses of pri+sec nodes

This patch replaces two explicit uses of the primary and secondary
nodes with Instance.allNodes, which means the code is more flexible if
the internal layout of the instance changes.

I've verified that the output of involvedNodes is not required to be
4 elements long, and as such the function docstring has been updated.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agohtools: add target_node member to migrate opcode
Iustin Pop [Sat, 9 Jul 2011 17:48:36 +0000 (19:48 +0200)]
htools: add target_node member to migrate opcode

… and failover too. Not many changes otherwise except for
serialisation and unittests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agohtools: do not change node disk for non-local storage
Iustin Pop [Sat, 9 Jul 2011 09:17:10 +0000 (11:17 +0200)]
htools: do not change node disk for non-local storage

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agohtools: add more functions for local disk storage
Iustin Pop [Sat, 9 Jul 2011 09:01:49 +0000 (11:01 +0200)]
htools: add more functions for local disk storage

These will be used in Node.hs for proper add/remove instance
code. Furthermore, we restrict the movable status to the right disk
templates only, so that we don't attempt to move the 'wrong' instance
types.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years agoInitial design doc for OVF support
Agata Murawska [Thu, 21 Jul 2011 15:31:44 +0000 (17:31 +0200)]
Initial design doc for OVF support

Signed-off-by: Agata Murawska <agatamurawska@google.com>
[iustin@google.com: fixed formatting issues]

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agoFix aliases in bash completion
Michael Hanselmann [Fri, 22 Jul 2011 09:55:46 +0000 (11:55 +0200)]
Fix aliases in bash completion

Ever since commit 2d48a3a2 aliases were not included in the bash
completion script. This patch also replaces one tab with two spaces.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agognt-instance console: Use query instead of opcode
Michael Hanselmann [Fri, 22 Jul 2011 06:14:39 +0000 (08:14 +0200)]
gnt-instance console: Use query instead of opcode

This means opening the console no longer requires the instance lock,
allowing it to be used during long-running operations (e.g. replacing a
disk).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoMerge branch 'devel-2.4'
Michael Hanselmann [Fri, 22 Jul 2011 09:05:55 +0000 (11:05 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  gnt-node volumes: Fix instance names

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoAdd opcode attribute for comments
Michael Hanselmann [Fri, 22 Jul 2011 05:35:44 +0000 (07:35 +0200)]
Add opcode attribute for comments

This attribute allows programmatic submitters of jobs (e.g. iallocator)
to add a comment to each opcode, describing its purpose. Example:

$ gnt-job info 123
Job ID: 123
  …
  Opcodes:
    OP_INSTANCE_REPLACE_DISKS
      …
      Input fields:
        comment: Replaces disks on inst1.example.com
      …

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agognt-node volumes: Fix instance names
Michael Hanselmann [Fri, 22 Jul 2011 08:27:09 +0000 (10:27 +0200)]
gnt-node volumes: Fix instance names

Commit 84d7e26b changed “objects.Instance.MapLVsByN” to not just return
the LV name, but to include the volume group name (e.g.
“xenvg/d67e8700….disk0_data”). This in turn broke the mapping of volume
names in LUNodeQueryvols, stopping instance names from being displayed in
“gnt-node volumes”.

This patch fixes the issue and does some cleanup.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFixed one option name and a typo in the docs
Andrea Spadaccini [Thu, 21 Jul 2011 15:54:42 +0000 (16:54 +0100)]
Fixed one option name and a typo in the docs

The -g vg-name option was deprecated in commit
04367e70ad71eea3f0f19e7889dc68fb9783c98a.

Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoFix instance failover (missing argument)
Michael Hanselmann [Thu, 21 Jul 2011 13:22:23 +0000 (15:22 +0200)]
Fix instance failover (missing argument)

More fallout from commit 323f9095b49d.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoImplement instance failover via RAPI
Michael Hanselmann [Thu, 21 Jul 2011 13:20:45 +0000 (15:20 +0200)]
Implement instance failover via RAPI

No idea why this was missed before.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoExport job dependencies through lock monitor
Michael Hanselmann [Thu, 21 Jul 2011 08:47:27 +0000 (10:47 +0200)]
Export job dependencies through lock monitor

This makes them visible to the user. Example:

$ gnt-debug locks -o name,pending
Name    Pending
job/890 job:891,892
job/892 job:894

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agolocking.GLM: Allow adding locks to monitor
Michael Hanselmann [Thu, 21 Jul 2011 08:49:21 +0000 (10:49 +0200)]
locking.GLM: Allow adding locks to monitor

This will be used for exporting job dependencies through
the lock monitor.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoMake lock monitor more versatile
Michael Hanselmann [Wed, 13 Jul 2011 20:43:22 +0000 (22:43 +0200)]
Make lock monitor more versatile

With this change it'll be possible to register other lock information
providers. One use case for this is job dependencies, which can be shown
in the output of “gnt-debug locks”, too.

The lock monitor is changed to accept more than one return value from
the function providing the information. Unfortunately it's hard to keep
weak references to bound methods, so I settled on keeping a weak
reference on the object instead (see note in docstring).
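
The difficulty can be demonstrated in a few lines, assuming Python on CPython. The `Provider` class is hypothetical and only stands in for an information provider.

```python
import weakref


class Provider:
    def get_info(self):
        return ["job/890"]


provider = Provider()

# A bound method object is created afresh on each attribute access, so a
# plain weak reference to it is typically dead straight away (CPython).
method_ref = weakref.ref(provider.get_info)

# A weak reference to the object itself works while the object is alive;
# the method is then looked up at call time.
object_ref = weakref.ref(provider)
info = object_ref().get_info()
```

Newer Python versions also provide `weakref.WeakMethod` for this situation.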

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years agoUpdate documentation regarding Haskell dependencies
Iustin Pop [Fri, 8 Jul 2011 14:07:42 +0000 (16:07 +0200)]
Update documentation regarding Haskell dependencies

These were forgotten when the supported library versions were changed.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: add two more small unittests
Iustin Pop [Fri, 8 Jul 2011 13:52:14 +0000 (15:52 +0200)]
htools: add two more small unittests

This adds tests for the opToResult and eitherToResult functions from
Types.hs, and changes two other tests for the same module to test JSON
serialisation (which automatically also tests the lower-level to/from
string conversion functions).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: update hail man page with the new modes
Iustin Pop [Fri, 8 Jul 2011 13:23:26 +0000 (15:23 +0200)]
htools: update hail man page with the new modes

Also mark the deprecated modes we no longer support.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: a few more hlint fixes
Iustin Pop [Fri, 8 Jul 2011 13:18:07 +0000 (15:18 +0200)]
htools: a few more hlint fixes

Tested only on GHC 7.x, will test on 6.1x too before commit.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years agohtools: further docstring fixes
Iustin Pop [Fri, 8 Jul 2011 12:52:01 +0000 (14:52 +0200)]
htools: further docstring fixes

This adds parameter documentation for Cluster.iMoveToJob (I think it
was not clear if the new or old node list is needed) and fixes other
docstring style issues.

After this patch, all modules except for CLI.hs (which has many
obvious declarations for command-line options) and QC.hs (unittests)
have 100% doc-strings.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago htools: add JSON instance for EvacMode
Iustin Pop [Fri, 8 Jul 2011 12:19:17 +0000 (14:19 +0200)]
htools: add JSON instance for EvacMode

This abstracts the JSON parsing of the type EvacMode near its
definition, and simplifies its conversion in IAlloc.parseData.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago htools: add human-readable output to hspace
Iustin Pop [Fri, 8 Jul 2011 11:53:14 +0000 (13:53 +0200)]
htools: add human-readable output to hspace

Currently, hspace can only output a machine-readable format that
(while detailed) is hard for people to parse quickly. This patch adds
(and enables by default) a human-readable output that shows the most
important metrics in a simple format.

Most of the work of the patch is in moving the display of various
metrics from the 'main' function to separate functions, each of which
can output either a machine-readable or a human-readable format.

The patch also corrects a bug in the CPU efficiency display: before,
the efficiency was computed as instance virtual CPUs divided by total
physical CPUs, which is almost always greater than one. It is more
correct to divide by the total virtual CPUs, which gives a more
meaningful number (when the p-to-v CPU ratio has been defined
correctly).
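
A small worked example of the two formulas, written in Python rather
than the Haskell of hspace. The function and parameter names are
invented for this sketch, and it assumes that the total number of
virtual CPUs is the physical count multiplied by the configured ratio.

```python
def cpu_efficiency(instance_vcpus, physical_cpus, vcpu_ratio):
    """Return the (old, new) efficiency figures as fractions.

    Assumption for this sketch: total virtual CPUs are the physical
    CPUs multiplied by the physical-to-virtual ratio.
    """
    total_vcpus = physical_cpus * vcpu_ratio
    old = instance_vcpus / physical_cpus   # previous, misleading figure
    new = instance_vcpus / total_vcpus     # corrected figure
    return old, new


# 96 instance vCPUs on 32 physical CPUs with a 4:1 ratio.
old, new = cpu_efficiency(instance_vcpus=96, physical_cpus=32, vcpu_ratio=4.0)
print("old: %.2f, new: %.2f" % (old, new))  # old: 3.00, new: 0.75
```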

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago Fix job constants use in htools
Iustin Pop [Thu, 21 Jul 2011 11:35:36 +0000 (13:35 +0200)]
Fix job constants use in htools

Commit 56c094b4 added use of job constants, but I didn't pay
attention and ended up mixing things: job constants were used for
opcode ones, and the job ones didn't get converted.

This patch corrects it and uses only C.* constants throughout the Jobs
module.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago Add error state to LUGroupEvacuate's exceptions
Michael Hanselmann [Thu, 21 Jul 2011 09:53:29 +0000 (11:53 +0200)]
Add error state to LUGroupEvacuate's exceptions

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Rename *_STATUS_WAITLOCK to …_WAITING
Michael Hanselmann [Thu, 21 Jul 2011 09:23:28 +0000 (11:23 +0200)]
Rename *_STATUS_WAITLOCK to …_WAITING

This patch renames the {JOB,OP}_STATUS_WAITLOCK constants to
{JOB,OP}_STATUS_WAITING, as per design document for chained jobs.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago gnt-group: Add command to evacuate whole group
Michael Hanselmann [Wed, 20 Jul 2011 11:39:58 +0000 (13:39 +0200)]
gnt-group: Add command to evacuate whole group

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Add new opcode for evacuating group
Michael Hanselmann [Tue, 17 May 2011 14:10:38 +0000 (16:10 +0200)]
Add new opcode for evacuating group

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Fix locking issue with job dependencies
Michael Hanselmann [Thu, 14 Jul 2011 22:55:20 +0000 (00:55 +0200)]
Fix locking issue with job dependencies

When jobs waiting for a dependency are notified, they're re-added to the
queue. This would require owning the queue lock in exclusive mode, but
since the function doing so is called from within the job/opcode
processor, it only holds the lock in shared mode.

This patch changes the result of the processor from a boolean to a
status value (integer). This way the caller can be notified about
actions to take, including notifying waiting jobs. The function adding
jobs to the queue can now acquire the lock in exclusive mode.
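
The idea of returning a status instead of a boolean can be sketched as
follows. The constant names, the dictionary-based jobs and both
functions are invented for illustration; the real processor and its
locking are more involved.

```python
# Hypothetical status values returned by the processor.
DEFER, WAITDEP, FINISHED = range(1, 4)


def process(job, finished):
    """Stand-in for the job processor; it never modifies the queue."""
    dep = job["depends_on"]
    if dep is not None and dep not in finished:
        return WAITDEP
    if job["ops_left"] > 1:
        job["ops_left"] -= 1
        return DEFER
    job["ops_left"] = 0
    return FINISHED


def handle(job, queue, waiting, finished):
    """Caller side: act on the status after the processor has returned."""
    status = process(job, finished)
    if status == DEFER:
        queue.append(job)
    elif status == WAITDEP:
        waiting.setdefault(job["depends_on"], []).append(job)
    elif status == FINISHED:
        finished.add(job["id"])
        # Waiting jobs are re-added here, outside the processor, where
        # the caller is free to take the queue lock in exclusive mode.
        queue.extend(waiting.pop(job["id"], []))
    return status
```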

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago jqueue: Read-only jobs don't need processor lock
Michael Hanselmann [Thu, 14 Jul 2011 21:31:33 +0000 (23:31 +0200)]
jqueue: Read-only jobs don't need processor lock

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Add support for KVM keymaps
Sébastien Bocahu [Wed, 20 Jul 2011 17:49:20 +0000 (19:49 +0200)]
Add support for KVM keymaps

Signed-off-by: Sébastien Bocahu <zecrazytux@zecrazytux.net>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago gnt-debug: Add tests for job dependencies
Michael Hanselmann [Fri, 8 Jul 2011 01:43:23 +0000 (03:43 +0200)]
gnt-debug: Add tests for job dependencies

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago jqueue: Implement submitting multiple jobs with dependencies
Michael Hanselmann [Fri, 8 Jul 2011 21:49:03 +0000 (23:49 +0200)]
jqueue: Implement submitting multiple jobs with dependencies

With this change users of the “SubmitManyJobs” interface can use
relative job dependencies. Relative job IDs in dependencies are resolved
before handing the job off to the workerpool.
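
A possible shape of the resolution step, as a sketch only. It assumes
that a relative job ID is a negative number counting back from the
current job within the same submission (-1 being the job submitted
just before it); the function name and data layout are hypothetical.

```python
def resolve_relative_deps(new_job_ids, deps_per_job):
    """Replace relative job IDs with the IDs assigned at submission.

    new_job_ids:  IDs given to the submitted jobs, in submission order.
    deps_per_job: for each job, a list of job IDs it depends on.
    """
    resolved = []
    for index, deps in enumerate(deps_per_job):
        job_deps = []
        for dep in deps:
            if dep < 0:
                target = index + dep
                if target < 0:
                    raise ValueError("Relative job ID %d of job %d points "
                                     "before the submission" % (dep, index))
                job_deps.append(new_job_ids[target])
            else:
                # Absolute IDs are passed through unchanged.
                job_deps.append(dep)
        resolved.append(job_deps)
    return resolved
```

For example, with assigned IDs [100, 101, 102] and dependencies
[[], [-1], [-2, 7]], the second and third jobs both end up depending on
job 100, and the absolute ID 7 is left as it is.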

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Fix node evacuation
Michael Hanselmann [Wed, 20 Jul 2011 11:18:41 +0000 (13:18 +0200)]
Fix node evacuation

- Adjust for new iallocator result format
- Split some code into helper functions

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Do proper name lookup for the -O option
Guido Trotter [Fri, 15 Jul 2011 14:26:10 +0000 (14:26 +0000)]
Do proper name lookup for the -O option

hspace and hbal treat -O differently, and use aliases for short names
(although hbal succeeds in that, and hspace doesn't). Unify this with
a name lookup, using the same functions we use for instance
selection/exclusion.

Some of the code is, by the way, a bit repetitive and could probably be
merged into a single function. That needs to be a monadic one, though,
so I promise to do it as soon as I realize how to write one! ;)

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago jqueue: Add “writable” flag to memory objects
Michael Hanselmann [Thu, 14 Jul 2011 20:48:06 +0000 (22:48 +0200)]
jqueue: Add “writable” flag to memory objects

Basically only one instance of the job, the one being processed,
should be serialized to disk and replicated to other nodes. With
this flag assertions can be added in various places.
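
A minimal sketch of how such a flag enables assertions. The class,
attribute and function names are illustrative and not taken from
jqueue.

```python
class QueuedJob:
    """Hypothetical in-memory job object with a writable flag."""

    def __init__(self, job_id, writable):
        self.id = job_id
        self.writable = writable
        self.log = []


def update_job_file(job, storage):
    """Serialize the job; only the writable instance may do so."""
    assert job.writable, "Can't update read-only job"
    storage[job.id] = list(job.log)
```

A read-only copy loaded for a query would be created with
writable=False, so an accidental attempt to write it back fails the
assertion instead of silently overwriting the processed job.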

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Implement chained jobs
Michael Hanselmann [Wed, 1 Jun 2011 15:42:04 +0000 (17:42 +0200)]
Implement chained jobs

An overview is available in the design document for this change,
doc/design-chained-jobs.rst.

When a job enters the job processor, the current opcode's dependencies
are evaluated. If a referenced job has not yet reached the desired
status, the current job is registered as a dependant. The job processor
will continue to work on other pending tasks. When a job finishes it
notifies any pending dependants by re-adding them to the workerpool.

A per-job processor lock is necessary for rare cases where the same job
can be re-added twice.

There is no way to view waiting jobs at the moment, but I plan to
export this information to “gnt-debug locks”.

A so-called dependency manager takes care of managing waiting jobs and
keeping track of their status.
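
To illustrate the bookkeeping described above, here is a toy
dependency manager. It is a sketch under simplifying assumptions (no
locking, statuses as plain strings, a callback standing in for the
workerpool) and does not reproduce the class in jqueue.

```python
class DependencyManager:
    """Toy dependency manager; names and structure are hypothetical."""

    def __init__(self, requeue):
        self._requeue = requeue   # callable re-adding a job to the pool
        self._status = {}         # job ID -> last known status
        self._waiters = {}        # job ID -> set of dependant job IDs

    def check_and_register(self, job_id, dep_job_id, wanted):
        """Return True if job_id may continue, else register it."""
        if self._status.get(dep_job_id) in wanted:
            return True
        self._waiters.setdefault(dep_job_id, set()).add(job_id)
        return False

    def notify_waiters(self, job_id, status):
        """Record the final status and re-add all pending dependants."""
        self._status[job_id] = status
        for waiter in sorted(self._waiters.pop(job_id, ())):
            self._requeue(waiter)
```

A re-added job runs through check_and_register again, which now finds
the recorded status and lets the job proceed if that status is among
the wanted ones.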

Unittests are included.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Add implementation details to design for chained jobs
Michael Hanselmann [Fri, 15 Jul 2011 21:45:04 +0000 (23:45 +0200)]
Add implementation details to design for chained jobs

As requested by Iustin.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago Add support for GPT by using parted for disks bigger than 2TB.
Pedro Macedo [Tue, 19 Jul 2011 15:37:56 +0000 (17:37 +0200)]
Add support for GPT by using parted for disks bigger than 2TB.

Signed-off-by: Pedro Macedo <pmacedo@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago Remove constants for iallocator multi-relocate
Michael Hanselmann [Fri, 15 Jul 2011 22:56:33 +0000 (00:56 +0200)]
Remove constants for iallocator multi-relocate

They're no longer necessary.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

12 years ago htools: add a machine-readable CLI flag
Iustin Pop [Fri, 8 Jul 2011 11:07:21 +0000 (13:07 +0200)]
htools: add a machine-readable CLI flag

This will be used in hspace to toggle between "human" readable
and machine readable output formats.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years ago htools: move the '-p' option to htools.rst
Iustin Pop [Fri, 8 Jul 2011 13:29:42 +0000 (15:29 +0200)]
htools: move the '-p' option to htools.rst

This is a common option and has a long description.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

12 years ago htools: move tiered spec map helper to Hspace.hs
Iustin Pop [Fri, 8 Jul 2011 10:46:05 +0000 (12:46 +0200)]
htools: move tiered spec map helper to Hspace.hs

This is used just in hspace, so let's help in making Cluster.hs
smaller. We also split the function in two, as computing the spec map
and formatting it are two different tasks.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years ago htools: import the program modules in QC.hs
Iustin Pop [Fri, 8 Jul 2011 09:57:06 +0000 (11:57 +0200)]
htools: import the program modules in QC.hs

This adds the code of the binaries to the coverage report, which thus
finally shows the real coverage over all logic code (except for the
htools.hs code, which is very small and not related to the algorithms,
so it doesn't matter).

Next steps will be to actually add coverage for this code, especially
for hbal and hspace, which are relatively big compared to hail and
hscan (around 800 expressions versus 200-300 expressions).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years ago htools: switch hspace to the generic binary
Iustin Pop [Fri, 8 Jul 2011 09:50:16 +0000 (11:50 +0200)]
htools: switch hspace to the generic binary

This is the last patch of the binaries conversion.

For information, we now have a single binary that is approx. 5.4MiB in
size, compared to 4 binaries that were approx. 5.1-5.2MiB each; this
will result in a smaller package and install size, and the single
compilation phase should also help.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years ago htools: switch hscan to the generic binary
Iustin Pop [Fri, 8 Jul 2011 09:45:10 +0000 (11:45 +0200)]
htools: switch hscan to the generic binary

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

12 years ago htools: switch hbal to the generic binary
Iustin Pop [Fri, 8 Jul 2011 09:41:01 +0000 (11:41 +0200)]
htools: switch hbal to the generic binary

In addition, the patch adds a separate Makefile variable for holding
the binary roles to make it more clear what we symlink.
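
How a single binary can serve several symlinked roles is sketched
below in Python, assuming the role is chosen from the name the program
was invoked as. The role names are the four tools mentioned in this
log; the handlers are stubs and the function names are invented.

```python
import os


def make_stub(name):
    """Build a placeholder handler that just reports how it was called."""
    return lambda args: (name, list(args))


# One entry per role; each symlink to the binary carries one of these names.
ROLES = dict((name, make_stub(name))
             for name in ("hail", "hbal", "hscan", "hspace"))


def select_role(argv0):
    """Map the invoked program name (e.g. a symlink) to its handler."""
    name = os.path.basename(argv0)
    try:
        return ROLES[name]
    except KeyError:
        raise SystemExit("Unknown role %r, expected one of: %s"
                         % (name, ", ".join(sorted(ROLES))))
```

A main function would then call select_role(sys.argv[0]) and pass the
remaining arguments to the returned handler.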

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>