ganeti-local
Michael Hanselmann [Wed, 27 Apr 2011 14:42:56 +0000 (16:42 +0200)]
iallocator: Relocation nodes must be in same group

Quoting from iallocator.rst: “[…] ``relocate`` request is used when an
existing instance needs to be moved within its node group […]”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Thu, 28 Apr 2011 11:01:13 +0000 (13:01 +0200)]
Fix 'unused import' lint error

Sorry!

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Thu, 28 Apr 2011 08:26:53 +0000 (10:26 +0200)]
SetEtcHostsEntry: maintain existing ordering

Currently RemoveEtcHostsEntry keeps the ordering, but SetEtcHostsEntry
does not, as it always writes the new entry at the end of the file. I
personally dislike this as it "uglifies" my custom host files, so this
patch makes it update the record in place instead of moving it.

The patch also simplifies the construction of the new line (we were
doing duplicate work for no gain).
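
A minimal sketch of the in-place update described above, assuming the
hosts file is handled as a list of lines. The function name and signature
are hypothetical and this is not the actual SetEtcHostsEntry code:

```python
def set_hosts_entry(lines, ip, hostname, aliases=()):
    """Return a new list of hosts-file lines with the entry for `hostname` set.

    An existing line naming `hostname` is rewritten where it stands; the
    entry is appended at the end only if no such line exists.
    """
    new_line = "\t".join([ip, hostname] + list(aliases))
    out = []
    written = False
    for line in lines:
        # Ignore trailing comments; fields[0] is the address, the rest are names.
        fields = line.split("#", 1)[0].split()
        if len(fields) > 1 and hostname in fields[1:]:
            if not written:
                out.append(new_line)  # replace in place, keeping the position
                written = True
            continue  # drop any further duplicate lines for the same host
        out.append(line)
    if not written:
        out.append(new_line)
    return out
```

Because the whole result is built in memory first, it can then be written
out in a single call rather than line by line.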

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Wed, 27 Apr 2011 15:33:48 +0000 (17:33 +0200)]
Convert utils.nodesetup to utils.WriteFile(data=…)

It makes no sense to iteratively write the new etc/hosts file, as we
can pre-compute the desired contents (neither the old nor the new
versions are safe against concurrent changes anyway).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Michael Hanselmann [Thu, 21 Apr 2011 10:58:53 +0000 (12:58 +0200)]
QA: Add tests for node group tags

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 21 Apr 2011 09:29:31 +0000 (11:29 +0200)]
RAPI: Add support for tagging node groups

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 21 Apr 2011 09:25:56 +0000 (11:25 +0200)]
gnt-group: Add commands for tagging groups

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 21 Apr 2011 09:24:53 +0000 (11:24 +0200)]
masterd: Add support for tagging node groups

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

René Nussbaumer [Thu, 21 Apr 2011 09:46:32 +0000 (11:46 +0200)]
QA: Adding a config option to disable cluster epo

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Apollon Oikonomopoulos [Tue, 19 Apr 2011 16:23:14 +0000 (19:23 +0300)]
TLMigrateInstance: remove 10s sleeps

TLMigrateInstance._ExecMigration contains two 10-second sleeps between
individual migration steps.

Apart from prolonging the migration duration by 20s, the second sleep
causes FinalizeMigration to be called 10 seconds after the real
migration completion; since FinalizeMigration is used for configuring
KVM network interfaces of “incoming” instances, this incurs a
10-to-12-second-long network downtime for migrated instances.

This patch removes both calls.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Tue, 19 Apr 2011 13:08:10 +0000 (15:08 +0200)]
Update manpages and other documents with editor settings

No rewrapping is done in this patch, just updates to the settings.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Fri, 8 Apr 2011 11:27:13 +0000 (13:27 +0200)]
gnt-group list: Query filter support

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Tue, 19 Apr 2011 12:56:04 +0000 (14:56 +0200)]
gnt-node list: Query filter support

Update manpage, quote field names.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Fri, 8 Apr 2011 11:07:13 +0000 (13:07 +0200)]
gnt-instance list: Query filter support

Update manpage, quote field names.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 15:36:05 +0000 (17:36 +0200)]
cli: Add support for parsing query filters

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 15:35:44 +0000 (17:35 +0200)]
cli: Add option to force names to be treated as filter

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 15:33:42 +0000 (17:33 +0200)]
opcodes: Change parameter type definition for query filter

The old definition wouldn't accept integers.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 15:33:08 +0000 (17:33 +0200)]
cli: Error reporting for query filter parsing

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 15:31:34 +0000 (17:31 +0200)]
qlang: Add function to distinguish filters from names

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Fri, 1 Apr 2011 12:24:15 +0000 (14:24 +0200)]
Update ganeti.7 manpage for query filter language

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Fri, 15 Apr 2011 10:18:08 +0000 (12:18 +0200)]
qlang: Add parser for query filter language

With this parser, command line utilities will be able to provide filters
through query2 in a simplistic language. Example filters:

  name == "node3.example.com"
  master or (name == "node4.example.com")
  be/memory == 128 and name =~ /^web/i
  "inst1.example.com" in sinst_list
  status != "up"
  not master

Parts of the syntax came from Python, others from Perl. Documentation
will be added in follow-up patches.
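
One way to picture what such filters amount to once parsed is a nested
prefix-notation list evaluated against an item's fields. The sketch below
is illustrative only: the operator spellings, the list layout and the
evaluate() helper are assumptions, not the representation qlang uses.

```python
import re


def evaluate(flt, item):
    """Evaluate a nested prefix-notation filter against a dict of fields."""
    op, args = flt[0], flt[1:]
    if op == "and":
        return all(evaluate(a, item) for a in args)
    if op == "or":
        return any(evaluate(a, item) for a in args)
    if op == "not":
        return not evaluate(args[0], item)
    if op == "true":  # bare field name, e.g. "master"
        return bool(item[args[0]])
    if op == "==":
        return item[args[0]] == args[1]
    if op == "!=":
        return item[args[0]] != args[1]
    if op == "=~":  # args: field, pattern, optional re flags
        flags = args[2] if len(args) > 2 else 0
        return re.search(args[1], item[args[0]], flags) is not None
    if op == "in":  # args: value, name of a list-valued field
        return args[0] in item[args[1]]
    raise ValueError("unknown operator: %r" % (op,))


# be/memory == 128 and name =~ /^web/i
flt = ["and", ["==", "be/memory", 128], ["=~", "name", "^web", re.I]]
```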

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Mon, 18 Apr 2011 09:55:24 +0000 (11:55 +0200)]
htools: make some error messages more explicit

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Michael Hanselmann [Fri, 15 Apr 2011 16:31:27 +0000 (18:31 +0200)]
Add instance query field for OS parameters

These were not available as a query field before. Update unittests
and description text for the other “..params” fields.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Thu, 14 Apr 2011 14:15:09 +0000 (16:15 +0200)]
QA: also run gnt-cluster repair-disk-sizes

So that we don't again break this without realising it.

The patch also replaces one ' with ".

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Tue, 12 Apr 2011 14:38:11 +0000 (16:38 +0200)]
Fix shared_file_storage_dir on upgrades

If the cluster was upgraded from 2.4 or earlier, this key won't exist
(it's only set to a correct value on cluster init), so we need to
properly set it to a null string (disabled).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Iustin Pop [Tue, 12 Apr 2011 14:09:55 +0000 (16:09 +0200)]
QA: run the redist-conf command

This was (AFAICS) completely missing from the QA suite.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Iustin Pop [Tue, 12 Apr 2011 14:04:43 +0000 (16:04 +0200)]
Prevent ssconf values from having non-string values

For whatever reason, my test cluster managed to acquire
shared_file_storage_dir with a None value, instead of empty
string. This is not flagged in masterd itself, but the node daemon
will fail in writing the value to disk, as it calls len() on the
received value.

Since this is a bad case, we should detect it as soon as possible (we
basically shouldn't be able to set it), but in the meantime we at
least prevent ssconf writes with such values.
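
A minimal sketch of such a guard, assuming the ssconf values arrive as a
plain dictionary. The function name is hypothetical, and it checks against
Python 3's str rather than whatever type check the real code uses:

```python
def check_ssconf_values(values):
    """Raise ValueError unless every value in `values` is a string."""
    bad = sorted(key for key, val in values.items()
                 if not isinstance(val, str))
    if bad:
        raise ValueError("non-string ssconf values for keys: %s"
                         % ", ".join(bad))
    return values
```

Rejecting the value before it is sent means the failure shows up where the
bad value originates, not later in the node daemon's len() call.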

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

Iustin Pop [Fri, 8 Apr 2011 10:29:08 +0000 (12:29 +0200)]
Add some tests for the auto_balance attribute

It tests node add/remove secondary, rather than cluster-level N+1
checks, but it's better than nothing.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Iustin Pop [Fri, 8 Apr 2011 08:40:16 +0000 (10:40 +0200)]
Node operations: take into account auto_balance

This patch changes the add to secondary/remove from secondary code to
not deduct/add the instance's memory if the instance is not
auto_balanced.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Iustin Pop [Thu, 7 Apr 2011 15:14:16 +0000 (17:14 +0200)]
Read/write auto_balance via Text

This also means _another_ change in the text format; we really should
move to json…

The unittests are also updated for the new 9-column layout and
improved a bit.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Iustin Pop [Thu, 7 Apr 2011 15:00:49 +0000 (17:00 +0200)]
Read auto_balance via Rapi

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Iustin Pop [Thu, 7 Apr 2011 14:49:38 +0000 (16:49 +0200)]
Read auto_balance via Luxi

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Iustin Pop [Thu, 7 Apr 2011 15:00:32 +0000 (17:00 +0200)]
Show the auto_balance flag in the instance listing

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Adeodato Simo <dato@google.com>

Michael Hanselmann [Mon, 11 Apr 2011 14:36:49 +0000 (16:36 +0200)]
cli: Replace hardcoded strings with constants

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Thu, 7 Apr 2011 14:23:44 +0000 (16:23 +0200)]
Add a new attribute to Instance.Instance

This will mirror Ganeti's be/auto_balance one, which we need to use to
properly match N+1 computations.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Fri, 8 Apr 2011 09:14:51 +0000 (11:14 +0200)]
Some more changes to Makefile.am for htools

I duplicate the BINARY= rule in the ghc invocation in order to be able
to silence the if, which was confusing.

Additionally, a new target for running just the htools unit-tests is
provided.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Thu, 7 Apr 2011 10:42:09 +0000 (12:42 +0200)]
htools: Make opcode naming consistent with Ganeti codebase

This patch just cleans up the htools codebase to make it more consistent
with the naming of the Ganeti codebase.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 10:03:51 +0000 (12:03 +0200)]
Merge branch 'devel-2.4'

* devel-2.4:
  LUInstanceQueryData: Don't acquire locks unless requested
  Increase the lock timeouts before we block-acquire
  daemon.py: move startup log message before prep_fn
  Display the actual memory values in N+1 failures
  ssh.VerifyNodeHostname: remove the quiet flag
  Add error checking and merging for cluster params
  RAPI: Document need for Content-type header in requests
  Fix output for “gnt-job info”
  watcher: Fix misleading usage output
  Clarify --force-join parameter message
  locking: Fix race condition in lock monitor
  utils: Export NiceSortKey function
  Revert "Only merge nodes that are known to not be offline"
  cluster-merge: only operate on online nodes
  Only merge nodes that are known to not be offline
  Treat empty oob_program param as default
  Fix bug in instance listing with orphan instances
  Fix bug related to log opening failures
  Bump version for 2.4.1 release
  cfgupgrade: Fix critical bug overwriting RAPI users file

Conflicts:
NEWS: Trivial
lib/opcodes.py: Added parameter descriptions, used variable for
  "use_locking"

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Thu, 7 Apr 2011 09:44:52 +0000 (11:44 +0200)]
Merge branch 'stable-2.4' into devel-2.4

* stable-2.4:
  Add error checking and merging for cluster params
  Clarify --force-join parameter message
  Treat empty oob_program param as default
  Fix bug in instance listing with orphan instances
  Fix bug related to log opening failures
  Bump version for 2.4.1 release
  cfgupgrade: Fix critical bug overwriting RAPI users file

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Thu, 7 Apr 2011 08:19:35 +0000 (10:19 +0200)]
OpCodes.hs: make allow_failover optional

And default to False, like in the Python codebase.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Iustin Pop [Wed, 6 Apr 2011 17:00:27 +0000 (19:00 +0200)]
htools: add a utility function for JSON parsing

This allows extracting values that might be missing from a JSON object
but have a well-defined default value.
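
The htools helper itself is Haskell; the same idea in Python, as a rough
sketch with a hypothetical function name, could look like this:

```python
import json


def from_obj_with_default(obj, key, default):
    """Return obj[key] if the key is present, else `default`."""
    return obj[key] if key in obj else default


# An opcode serialised without the optional "allow_failover" key.
opcode = json.loads('{"instance_name": "inst1.example.com"}')
allow_failover = from_obj_with_default(opcode, "allow_failover", False)
```

Here allow_failover ends up False because the key is absent, while a
present key (even one set to a falsy value) is returned as-is.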

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Iustin Pop [Thu, 7 Apr 2011 07:45:43 +0000 (09:45 +0200)]
Two small Makefile fixes related to htools

First, fix hs-coverage on non-pristine tree, where the index.html file
already existed, and second, disallow compilation of htools binaries
if configure, for some reason, didn't enable them.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

René Nussbaumer [Wed, 6 Apr 2011 16:29:28 +0000 (18:29 +0200)]
htools: Use OpMigrateInstance with allow_failover option

Previously, hbal decided on the fly whether an instance was migratable.
Since we implemented failover fallback in commit d5cafd31456, we can
start using that.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 6 Apr 2011 16:32:31 +0000 (18:32 +0200)]
LUInstanceQueryData: Don't acquire locks unless requested

Until now LUInstanceQueryData always acquired locks for the instance(s)
and nodes involved. In combination with long-running operations this
prevented the use of “gnt-instance info”, even with the “--static”
option. With this patch, locks are only acquired when explicitly
requested in the opcode (like all query operations).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

René Nussbaumer [Tue, 29 Mar 2011 09:12:25 +0000 (11:12 +0200)]
gnt-instance migrate: Adding --allow-failover option

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

René Nussbaumer [Mon, 28 Mar 2011 12:54:33 +0000 (14:54 +0200)]
TLMigrateInstance: Merge failover code, allow fallback

As the prerequisite-checking code for failover is almost identical, it
is easy to move it over to TLMigrateInstance. This allows us to fall
back to failover if migration fails its prereq check for some reason.

Please note that everything from LUInstanceFailover.Exec is taken over
unchanged to TLMigrateInstance._ExecFailover, with adaptations only to
opcode fields and variable references, not to the logic. Some effort is
still needed to merge the logic with the migration code (for example
DRBD handling), but this should happen in a separate iteration.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Mon, 4 Apr 2011 13:59:39 +0000 (15:59 +0200)]
Increase the lock timeouts before we block-acquire

This has been observed to cause problems on real clusters via the
following mechanism:

- a long job (e.g. a replace-disks) is keeping an exclusive lock on an
  instance
- the watcher starts and submits its query instances opcode which
  wants shared locks for all instances
- after about an hour, the watcher job falls back to blocking acquire,
  after having acquired all other locks
- any instance opcode that wants an exclusive lock for an instance
  cannot start until the watcher has finished, even though there's no
  actual operation on that instance

In order to alleviate this problem, we simply increase the max timeout
until lock acquires are sent back to either blocking acquire or
priority increase. The timeout is computed such that we wait ~10 hours
(instead of one) for this to happen, which should be within the
maximum lifetime of a reasonable opcode on a healthy cluster. The
timeout also means that priority increases will happen every half hour.

We also increase the max wait interval to 15 seconds, otherwise we'd
have too many retries with the increased interval.
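
As a rough illustration of the arithmetic (not the actual implementation):
ten hours with a priority increase every half hour gives 20 rounds of
1800 seconds each, and within a round the individual acquire timeouts grow
until they hit the 15-second cap. The function name, the 1-second starting
wait and the growth factor below are assumptions.

```python
def lock_attempt_timeouts(total=1800.0, min_wait=1.0, max_wait=15.0,
                          growth=1.25):
    """Return per-attempt timeouts whose sum just reaches `total` seconds.

    Each timeout is `growth` times the previous one, clamped to the
    [min_wait, max_wait] range.
    """
    timeouts = [min_wait]
    elapsed = min_wait
    while elapsed < total:
        nxt = min(max(timeouts[-1] * growth, min_wait), max_wait)
        timeouts.append(nxt)
        elapsed += nxt
    return timeouts


# 10 hours split into half-hour rounds, one priority increase per round.
rounds = int(10 * 3600 / 1800)
```

With a 15-second cap, one 1800-second round needs at least 120 attempts; a
smaller cap would multiply the number of retries, which is the reason given
above for raising the maximum wait interval.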

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Michael Hanselmann [Wed, 30 Mar 2011 15:54:21 +0000 (17:54 +0200)]
utils: Add function generating regex for DNS name globbing

The intent of this function is to be able to provide a globbing operator
for query filters. One should be able to say, for example, something to
the effect of “gnt-instance shutdown '*.site'”.
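
A small sketch of how a glob over DNS names might be turned into a regular
expression. The function name is hypothetical, and the choice that "*" and
"?" never match across a dot is an assumption, not necessarily what the
real function does:

```python
import re


def dns_glob_to_regex(pattern):
    """Translate a glob over DNS names into an anchored regular expression.

    "*" matches any run of characters inside one name component and "?"
    matches a single such character; neither crosses a dot.
    """
    parts = []
    for char in pattern:
        if char == "*":
            parts.append("[^.]*")
        elif char == "?":
            parts.append("[^.]")
        else:
            parts.append(re.escape(char))  # dots and the like match literally
    return "^" + "".join(parts) + "$"
```

Under these assumptions, the pattern '*.site' matches "web1.site" but not
"web1.example.site".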

Also rename a variable in MatchNameComponent.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 31 Mar 2011 16:43:25 +0000 (18:43 +0200)]
Verify file consistency using centrally computed list

Until now “gnt-cluster verify” (LUClusterVerify) would compute its own
list of files to check for consistency. This list was not complete and
certain inconsistencies were missed.

With this patch the code is changed to use the list of files used by
LUClusterRedistConf. The new check needs to be on a whole-cluster level,
and no longer per node.
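
The reason the check has to be cluster-wide can be sketched as follows:
only with every node's checksums side by side can differing or missing
copies be spotted. The data layout and function name are hypothetical, and
the sketch simplifies by expecting every file on every node:

```python
from collections import defaultdict


def find_inconsistent_files(checksums_by_node):
    """Map each inconsistent file to {checksum: [nodes]}.

    A file is inconsistent if nodes report different checksums for it or
    if some node lacks it; missing nodes are listed under the key None.
    """
    by_file = defaultdict(lambda: defaultdict(list))
    for node, files in checksums_by_node.items():
        for fname, csum in files.items():
            by_file[fname][csum].append(node)

    all_nodes = set(checksums_by_node)
    problems = {}
    for fname, variants in by_file.items():
        present = {n for nodes in variants.values() for n in nodes}
        missing = sorted(all_nodes - present)
        if len(variants) > 1 or missing:
            report = {csum: sorted(ns) for csum, ns in variants.items()}
            if missing:
                report[None] = missing
            problems[fname] = report
    return problems
```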

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Thu, 31 Mar 2011 16:39:52 +0000 (18:39 +0200)]
cmdlib: Factorize computation of ancillary files

… and change the logic in _RedistributeAncillaryFiles. Virtually the
same list of files will be used to verify the files' consistency.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Fri, 1 Apr 2011 12:16:33 +0000 (14:16 +0200)]
qlang: Remove OP_GLOB operator

It'll be implemented using OP_REGEXP by the parser.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Wed, 30 Mar 2011 16:38:44 +0000 (18:38 +0200)]
query: Add implementation of regex match operator

So far this operator was not implemented. This patch adds an additional
value preparation function to the function table for binary operators,
used to compile the regular expression. Unittests are included.
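
The idea of a per-operator value preparation function can be sketched as
below: the table pairs each binary operator with an optional prepare step
that runs once when the filter is compiled, so the regular expression is
not recompiled for every item. Table layout and names are assumptions, not
the actual query module code.

```python
import operator
import re

# operator name -> (value preparation function or None, comparison function)
BINARY_OPS = {
    "==": (None, operator.eq),
    "!=": (None, operator.ne),
    "=~": (re.compile,
           lambda field_value, rx: rx.search(field_value) is not None),
}


def compile_binary(op, field, value):
    """Return a predicate over item dicts; `value` is prepared only once."""
    prep, compare = BINARY_OPS[op]
    prepared = prep(value) if prep is not None else value
    return lambda item: compare(item[field], prepared)


is_web = compile_binary("=~", "name", r"^web")
```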

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Mon, 4 Apr 2011 16:21:53 +0000 (18:21 +0200)]
cmdlib: Fix mistake made in commit 75c7520f0

Commit 75c7520f0 used the wrong constant. I double-checked all other
changes made in the commit.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Michael Hanselmann [Mon, 4 Apr 2011 14:40:23 +0000 (16:40 +0200)]
cmdlib: Replace hardcoded values with constants

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Mon, 4 Apr 2011 10:13:44 +0000 (12:13 +0200)]
daemon.py: move startup log message before prep_fn

Before this, the output in the rapi daemon log was:
2011-04-04 03:09:51,026: ganeti-rapi pid=17447 INFO Reading users file
at /var/lib/ganeti/rapi/users
2011-04-04 03:09:51,027: ganeti-rapi pid=17447 INFO ganeti-rapi daemon
startup

Which is confusing, as it might look like the read of the users file
is part of the previous run. This is because we log the 'daemon
startup' message after the prepare_fn, which can log things on its
own.

The patch simply moves the 'daemon startup' message just before
prepare_fn call.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Mon, 4 Apr 2011 09:33:01 +0000 (11:33 +0200)]
Display the actual memory values in N+1 failures

This changes the display from:
Mon Apr  4 02:29:46 2011 * Verifying N+1 Memory redundancy
Mon Apr  4 02:29:46 2011   - ERROR: node node2: not enough memory to
accomodate instance failovers should node node1 fail

To:

Mon Apr  4 02:32:50 2011 * Verifying N+1 Memory redundancy
Mon Apr  4 02:32:50 2011   - ERROR: node node2: not enough memory to
accomodate instance failovers should node node1 fail (33536MiB needed,
27910MiB available)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Adeodato Simo [Wed, 30 Mar 2011 15:20:22 +0000 (16:20 +0100)]
Update iallocator.rst for multi-reloc mode

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

René Nussbaumer [Fri, 1 Apr 2011 12:42:07 +0000 (14:42 +0200)]
RAPI: Convert instance shutdown to the new FillOpCode

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Michael Hanselmann [Thu, 31 Mar 2011 12:31:06 +0000 (14:31 +0200)]
Fix QA breakage caused by 3fd7f6524

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Iustin Pop [Thu, 31 Mar 2011 16:41:09 +0000 (18:41 +0200)]
ssh.VerifyNodeHostname: remove the quiet flag

This is not needed for this function, and can interfere with debugging
of ssh failures.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Thu, 31 Mar 2011 16:07:35 +0000 (18:07 +0200)]
Add a simple wrapper over utils.Retry

The new wrapper makes moving legacy code to utils.Retry or adding
retries in existing code simpler.
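
To illustrate the kind of convenience such a wrapper offers, here is a
standalone sketch that does not use utils.Retry at all; the name, the
signature and the return-last-result behaviour are assumptions:

```python
import time


def simple_retry(expected, fn, delay, timeout, args=(),
                 sleep_fn=time.sleep, time_fn=time.monotonic):
    """Call fn(*args) until it returns `expected` or `timeout` seconds pass.

    The last result is returned either way, so callers can inspect it.
    """
    deadline = time_fn() + timeout
    while True:
        result = fn(*args)
        if result == expected or time_fn() >= deadline:
            return result
        sleep_fn(delay)
```

Legacy code that loops by hand with sleep calls can then be reduced to a
single call such as simple_retry(True, check_fn, 1.0, 30.0), where check_fn
stands for whatever condition the old loop polled.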

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Thu, 31 Mar 2011 10:52:52 +0000 (12:52 +0200)]
Automatically enable hail if enabled and found

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Iustin Pop [Thu, 31 Mar 2011 10:49:07 +0000 (12:49 +0200)]
Expose whether htools was enabled to Python code

This exports whether htools was enabled at configure-time, and adds a
constant for our reference iallocator.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

René Nussbaumer [Thu, 31 Mar 2011 08:21:54 +0000 (10:21 +0200)]
test.ganeti.process_unittest: Fix race condition

There was a race condition on heavily loaded test systems that randomly
caused the timeout unittests to fail, because the signal handler was not
yet set up when the timeout hit.

Therefore we introduce a workaround to wait until a program reached a
certain point (for example after signal handling setup) before we
actually go for the real run. The wait of course has a timeout as well,
but it's pretty high. If we hit the 20 seconds we have really big issues
anyway.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Iustin Pop [Thu, 31 Mar 2011 08:51:31 +0000 (10:51 +0200)]
Improve references to htools in the documentation

Was not sure about the bit in admin.rst, hope it's fine.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Iustin Pop [Wed, 30 Mar 2011 12:00:31 +0000 (14:00 +0200)]
Clarify the need for QuickCheck/Haskell tests

Expands the devnotes.rst doc and adds warnings in the Makefile.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Michael Hanselmann [Wed, 30 Mar 2011 11:32:26 +0000 (13:32 +0200)]
RAPI client: Remove support for version 0 instance creation requests

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Wed, 30 Mar 2011 10:52:02 +0000 (12:52 +0200)]
RAPI server: Drop support for instance creation format 0

Ganeti 2.1.3, released in June 2010, added support for a new, extensible
instance creation request format, called version 1. This patch removes
support for the old and undocumented version 0 format.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Simeon Miteff [Mon, 28 Mar 2011 20:47:36 +0000 (22:47 +0200)]
Improved GanetiRapiClient docstrings

- Added @rtype and/or @return where missing
- Fixed @param for Query() filter_ parameter (colon was missing)

Signed-off-by: Simeon Miteff <simeon.miteff@gmail.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

Adeodato Simo [Fri, 25 Mar 2011 20:57:44 +0000 (20:57 +0000)]
Add design for inter-group instance moves (multi-reloc)

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Adeodato Simo [Thu, 24 Mar 2011 21:28:47 +0000 (21:28 +0000)]
iallocator.rst: refactor for readability; minor improvements

This commit breaks down the "Input message" section of iallocator.rst into
two separate subsections: one detailing keys that are required in all
operation types; a second one detailing the "request" element, which is
different for each type of request.

Some other minor improvements are included as well:

  - update input example to version 2, and add the "nodegroups" and
    "enabled_hypervisors" top-level elements, and the "group" and
    "hypervisor" attributes for nodes and allocation request, respectively.

  - sort keys in the example dictionaries according to the order in earlier
    sections, for easy comparison of documentation with its examples.

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Stephen Shirley [Fri, 25 Feb 2011 15:01:38 +0000 (16:01 +0100)]
Add error checking and merging for cluster params

Set the default stderr logging level to WARNING so the relevant output
can be seen.

Signed-off-by: Stephen Shirley <diamond@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

René Nussbaumer [Mon, 28 Mar 2011 08:50:08 +0000 (10:50 +0200)]
Relax instance ERROR on admin_down on offline node

This fixes an issue where a stopped instance is reported as ERROR
in cluster verify if it lives on an offline node. As the instance is
down this shouldn't happen.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Michael Hanselmann [Fri, 25 Mar 2011 13:22:24 +0000 (14:22 +0100)]
Implement submitting jobs from logical units

The design details can be seen in the design document
(doc/design-lu-generated-jobs.rst).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

Adeodato Simo [Wed, 23 Mar 2011 16:49:24 +0000 (16:49 +0000)]
iallocator.rst: give pointers for alloc_policy semantics

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

Adeodato Simo [Wed, 23 Mar 2011 16:41:54 +0000 (16:41 +0000)]
Doc fix in iallocator.rst: multi-evac requires "evac_nodes"

The request argument for multi-evacuate mode is "evac_nodes", not "nodes"
(the example later in the file has the correct name already).
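
As a minimal sketch of the corrected argument name, the request part of a
multi-evacuate message could be built as below. The helper name and the node
name "node2" are illustrative only, and the rest of the iallocator input
message is omitted.

```python
import json


def make_multi_evac_request(node_names):
    """Build the request dictionary for multi-evacuate mode.

    The list of nodes goes under "evac_nodes", not "nodes".
    (Hypothetical helper, for illustration only.)
    """
    return {
        "type": "multi-evacuate",
        "evac_nodes": list(node_names),
    }


request = make_multi_evac_request(["node2"])
print(json.dumps(request, indent=2, sort_keys=True))
```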

Signed-off-by: Adeodato Simo <dato@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agowatcher: improve logging a bit
Iustin Pop [Thu, 24 Mar 2011 15:28:19 +0000 (16:28 +0100)]
watcher: improve logging a bit

Add some debug logging to detail why we don't run some steps.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

13 years agoRAPI: Document need for Content-type header in requests
Michael Hanselmann [Thu, 24 Mar 2011 14:13:12 +0000 (15:13 +0100)]
RAPI: Document need for Content-type header in requests

This was added to the NEWS file in commit ab221ddf, but never
documented properly.
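
A rough sketch of a client setting the header explicitly, assuming a JSON
request body. The host, port, resource path and body are placeholders rather
than a real RAPI resource, and the request is only built, not sent.

```python
import json
import urllib.request


def build_rapi_request(url, body, method="PUT"):
    """Build (but do not send) a request with an explicit Content-Type.

    Hypothetical helper; the body is serialized as JSON.
    """
    data = json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    # The header this commit documents as required for requests
    req.add_header("Content-Type", "application/json")
    return req


# Placeholder URL and body, for illustration only
req = build_rapi_request(
    "https://cluster.example.com:5080/2/example-resource",
    {"example": True},
)
print(req.get_method(), req.full_url)
print(req.header_items())
```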

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix output for “gnt-job info”
Michael Hanselmann [Thu, 24 Mar 2011 11:51:31 +0000 (12:51 +0100)]
Fix output for “gnt-job info”

If the result of an opcode was a non-empty dictionary, it
would be impossible to differentiate between input and result:

  Input fields:
    […]
    debug_level: 0
    fields: cluster_name,master_node,volume_group_name
    jobs: [[True, u'37922'], [True, u'37923'], [True, u'37924']]

Expected output:

  Input fields:
    […]
    debug_level: 0
    fields: cluster_name,master_node,volume_group_name
  Result:
    jobs: [[True, u'37922'], [True, u'37923'], [True, u'37924']]
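
The idea can be sketched as follows: always emit a "Result:" heading before
the result, so a dictionary result cannot be mistaken for further input
fields. This is an illustration with a hypothetical function name, not the
actual gnt-job code.

```python
def format_opcode_info(input_fields, result):
    """Return output lines with input and result under separate headings.

    Hypothetical helper, for illustration only.
    """
    lines = ["  Input fields:"]
    for key in sorted(input_fields):
        lines.append("    %s: %s" % (key, input_fields[key]))

    # The explicit heading is what separates the result from the input
    lines.append("  Result:")
    if isinstance(result, dict) and result:
        for key in sorted(result):
            lines.append("    %s: %s" % (key, result[key]))
    else:
        lines.append("    %s" % (result,))
    return lines


lines = format_opcode_info(
    {"debug_level": 0},
    {"jobs": [[True, "37922"]]},
)
print("\n".join(lines))
```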

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoRemove old ensure-dirs (no longer needed)
René Nussbaumer [Mon, 21 Mar 2011 15:47:47 +0000 (16:47 +0100)]
Remove old ensure-dirs (no longer needed)

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoRewrite of ensure-dirs in python
René Nussbaumer [Fri, 18 Mar 2011 09:30:09 +0000 (10:30 +0100)]
Rewrite of ensure-dirs in python

I provided unittests to test the important pieces of the infrastructure.
The one remaining function (RecursiveEnsure) is not easy to unittest
but also not critical if it fails to operate correctly.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agohs-coverage: make a symlink hpc_index.html
Iustin Pop [Wed, 23 Mar 2011 15:23:50 +0000 (16:23 +0100)]
hs-coverage: make a symlink hpc_index.html

This allows Apache to display the directory in a nicer way.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAnother attempt at fixing htools build without curl
Iustin Pop [Wed, 23 Mar 2011 15:06:50 +0000 (16:06 +0100)]
Another attempt at fixing htools build without curl

OK, my previous small fix was not good. There is another issue: haddock
(the documentation generator) needs to pass the same compiler options
(i.e. in our case, -DNO_CURL) to ghc. But without curl, it shouldn't
scan the RAPI library at all, as that is not used in our
builds.

Clearly, this is not a nice thing. So this patch changes from
including/excluding RAPI conditionally (in two places, the
ExtLoader.hs module and in hscan.hs), to always including RAPI, and
moves the curl/no curl logic to RAPI itself, where it belongs.

Together with passing --optghc to haddock, this makes the builds
consistent both with and without RAPI. I also undo the removal of RAPI
from QC.hs.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix some lint warnings in htools code
Iustin Pop [Wed, 23 Mar 2011 14:42:09 +0000 (15:42 +0100)]
Fix some lint warnings in htools code

hlint gives more suggestions, but some make the code (IMHO) harder to
read.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoFix lint errors in the htools code
Iustin Pop [Wed, 23 Mar 2011 12:25:22 +0000 (13:25 +0100)]
Fix lint errors in the htools code

These are just changes from hlint suggestions. Still compiles and
passes unittests.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd opcode summary to SubmitManyJobs errors
Michael Hanselmann [Wed, 23 Mar 2011 16:01:13 +0000 (17:01 +0100)]
Add opcode summary to SubmitManyJobs errors

Requested-by: Iustin Pop <iustin@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd design for submitting jobs from LUs
Michael Hanselmann [Mon, 28 Feb 2011 17:04:45 +0000 (18:04 +0100)]
Add design for submitting jobs from LUs

This patch adds a design document describing how jobs can be submitted
from within LUs.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoFix Haskell unittests without RAPI
Iustin Pop [Wed, 23 Mar 2011 12:54:27 +0000 (13:54 +0100)]
Fix Haskell unittests without RAPI

Since for now we don't test the RAPI backend directly, we can simply
skip the import. Later we can make a conditional import if needed.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoAdd import/export version 2 design document
Michael Hanselmann [Tue, 1 Feb 2011 13:47:25 +0000 (14:47 +0100)]
Add import/export version 2 design document

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd design document for lighttpd as HTTP server
Michael Hanselmann [Mon, 24 Jan 2011 18:43:25 +0000 (19:43 +0100)]
Add design document for lighttpd as HTTP server

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoAdd design document for X509 CA
Michael Hanselmann [Mon, 24 Jan 2011 18:42:45 +0000 (19:42 +0100)]
Add design document for X509 CA

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoWrap long lines in configure.ac
Michael Hanselmann [Wed, 23 Mar 2011 11:25:45 +0000 (12:25 +0100)]
Wrap long lines in configure.ac

- Use m4_normalize to make single-line strings while removing
  unnecessary spaces
- Wrap lines longer than 80 characters

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

13 years agoUpdate INSTALL and devnotes.rst with Haskell notes
Iustin Pop [Tue, 22 Mar 2011 17:58:55 +0000 (18:58 +0100)]
Update INSTALL and devnotes.rst with Haskell notes

This documents the needed libraries for Haskell development. It also
fixes a tiny typo in devnotes.rst.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoRevert and change the apidoc and coverage dirs
Iustin Pop [Tue, 22 Mar 2011 17:18:36 +0000 (18:18 +0100)]
Revert and change the apidoc and coverage dirs

Based on Michael's suggestion, this patch partially reverts my
changes. The new directories are:

- doc/api/py
- doc/api/hs
- doc/coverage/py
- doc/coverage/hs

Basically the Python-specific output moves one level down (into py/)
compared to the original location, and the Haskell stuff goes into
hs/.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoRename away htools/NEWS
Iustin Pop [Tue, 22 Mar 2011 17:06:50 +0000 (18:06 +0100)]
Rename away htools/NEWS

Also add a mention that it is obsolete.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoCleanup the Haskell-related Makefile.am variables
Iustin Pop [Tue, 22 Mar 2011 17:04:52 +0000 (18:04 +0100)]
Cleanup the Haskell-related Makefile.am variables

This should be more readable now. I even wanted to use the nicer
_SOURCES, but _SOURCES is special in Automake (again), so _SRCS it is.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoRemove obsolete htools/Makefile
Iustin Pop [Tue, 22 Mar 2011 10:42:35 +0000 (11:42 +0100)]
Remove obsolete htools/Makefile

Only one target wasn't ported over (the TAGS one), as hasktags is not
easily available in distributions, so it doesn't make sense to enable
it for all developers.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoMove hlint rule to the main Makefile
Iustin Pop [Tue, 22 Mar 2011 10:35:22 +0000 (11:35 +0100)]
Move hlint rule to the main Makefile

Since we do have errors currently, this is not enabled from the main
'make lint' rule. That will get cleaned up later.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoPort the live-test htools rule to the main Makefile
Iustin Pop [Tue, 22 Mar 2011 10:26:13 +0000 (11:26 +0100)]
Port the live-test htools rule to the main Makefile

This was a bit tricky, as the compilation from the top-dir changes the
paths in the .tix/.mix files.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

13 years agoEnable htools apidoc generation and unify dir names
Iustin Pop [Tue, 22 Mar 2011 09:47:59 +0000 (10:47 +0100)]
Enable htools apidoc generation and unify dir names

Previously, Python api doc was under doc/api (which didn't match with
the target rule, apidoc). After this patch, we have the following:

- make py-apidoc generates Python api doc under doc/py-apidoc
- make hs-apidoc generates Haskell api doc under doc/hs-apidoc
- make apidoc does both (if hs-apidoc enabled at configure time)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>