Michael Hanselmann [Wed, 18 Apr 2012 21:15:38 +0000 (23:15 +0200)]
Stop using locks in LUXI "QueryTags"
Also mark it as deprecated in NEWS as normal queries can be used
instead.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 19 Apr 2012 18:03:12 +0000 (20:03 +0200)]
Convert LUClusterConfigQuery to query2
The main intention of this patch is to make it possible to retrieve
cluster tags via query2. While at it I decided to convert
LUClusterConfigQuery right away. Some of the values returned by
LUClusterQuery are also included, but the conversion of LUClusterQuery
is not yet complete.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 19 Apr 2012 18:10:18 +0000 (20:10 +0200)]
Fix RAPI QA with exports via query2
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 19 Apr 2012 20:14:49 +0000 (22:14 +0200)]
Remove unused constants
These are not used anywhere in Python or Haskell.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 18 Apr 2012 16:38:14 +0000 (18:38 +0200)]
Convert listing exports to query2
This solves one case where locks are acquired during LUXI queries.
Pretty late into the transition I noticed that OpBackupQuery had a
“use_locking” parameter for a long time, but didn't use it. Since
most of the other changes were already done and this allows exports to
be listed via RAPI (/2/query), I decided to finish.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 18 Apr 2012 16:39:38 +0000 (18:39 +0200)]
utils.algo: Use str.isdigit instead of regular expression
str.isdigit is about 4x faster than using a regular expression ("\d+").
This is in the inner sorting code so speed matters.
$ python -m timeit -s 'import re; s = re.compile("^\d+$")' \
's.match(""); s.match("Hello World"); s.match("1234")'
1000000 loops, best of 3: 0.937 usec per loop
$ python -m timeit '"".isdigit(); "Hello World".isdigit(); "1234".isdigit()'
1000000 loops, best of 3: 0.218 usec per loop
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
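As a rough illustration of where such a check sits in sorting code, here is a
minimal sketch of a natural-sort key that uses str.isdigit to decide which
chunks to compare numerically. The helper name nice_sort_key is hypothetical
and this is not the code from utils.algo; it also assumes ASCII input.

```python
import re

# Splitting on a capturing group keeps the digit runs in the result, so
# string chunks and digit chunks strictly alternate.
_DIGIT_SPLIT = re.compile(r"(\d+)")


def nice_sort_key(value):
    """Build a sort key that compares embedded numbers numerically.

    Hypothetical helper for illustration; assumes ASCII input.
    """
    return [int(part) if part.isdigit() else part
            for part in _DIGIT_SPLIT.split(value)]


names = ["node10", "node2", "node1"]
print(sorted(names, key=nice_sort_key))
```

The per-chunk test is the hot path when sorting many names, which is why a
cheap string method is preferable to a regular expression match there.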
Chris Schrier [Wed, 18 Apr 2012 22:10:50 +0000 (18:10 -0400)]
Include PycURL error code in GanetiApiError.
Signed-off-by: Chris Schrier <schrierc@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Apr 2012 18:48:30 +0000 (20:48 +0200)]
Drop objects.QueryRequest
It was only used in one place and wasn't really necessary.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:11:49 +0000 (21:11 +0200)]
gnt-os modify: Add "--submit" option
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:11:36 +0000 (21:11 +0200)]
gnt-node: Add "--submit" and "--priority" to commands
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:11:21 +0000 (21:11 +0200)]
gnt-instance: Add "--submit" and "--priority" to commands
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:11:05 +0000 (21:11 +0200)]
gnt-group: Add "--submit" and "--priority" to commands
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:10:42 +0000 (21:10 +0200)]
gnt-cluster modify: Add "--submit" option
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:10:26 +0000 (21:10 +0200)]
gnt-backup: Add "--submit" to two commands
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 19:09:43 +0000 (21:09 +0200)]
Document "--submit" in ganeti.7
Like “--priority” and “--dry-run”, the “--submit” option is available
for many commands and can be documented in a central place. This patch
also fixes a small number of style issues.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 18 Apr 2012 11:58:56 +0000 (13:58 +0200)]
Fix further QA failures due to query changes
Hopefully these will be the last ones…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Tue, 17 Apr 2012 18:06:55 +0000 (20:06 +0200)]
Fix error in opcode result processing
LUXI queries are processed without callbacks (see
server.masterd.ClientOps._Query). With commit 07923a3c the logic for
checking an opcode's result for jobs to submit was changed and
subsequently raised an exception (“'NoneType' object has no attribute
'SubmitManyJobs'”) in such a case. Before said commit the exception
would also have been raised if an opcode used by a query submitted jobs.
This patch changes the logic to only resolve the method if callbacks are
defined and to use an exception-raising implementation otherwise.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
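The logic described above could look roughly like the following sketch. Only
the SubmitManyJobs name comes from the commit message; the exception class,
the function names and the error text are assumptions made for illustration.

```python
class JobSubmissionError(Exception):
    """Hypothetical error for opcodes that may not submit jobs."""


def _fail_submit_many_jobs(_jobs):
    # Exception-raising stand-in used when no callbacks are available.
    raise JobSubmissionError("Opcodes used for queries must not submit jobs")


def resolve_submit_fn(callbacks):
    """Return the function used to submit jobs found in an opcode result.

    The method is only looked up when callbacks are defined, so a missing
    callbacks object no longer triggers an AttributeError on None.
    """
    if callbacks is None:
        return _fail_submit_many_jobs
    return callbacks.SubmitManyJobs
```

With this shape, a query that never submits jobs is unaffected, while one
that does gets a clear error instead of an attribute lookup failure.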
Alexander Schreiber [Tue, 17 Apr 2012 14:47:53 +0000 (16:47 +0200)]
Add "show" as alias for "info" to gnt commands
This patch adds support for "show" as an alias for "info" to
gnt-(cluster|instance|node|os). It already exists in gnt-job.
Signed-off-by: Alexander Schreiber <als@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Fri, 13 Apr 2012 21:28:21 +0000 (23:28 +0200)]
Copy debug level, priority and set comment for LU-generated opcodes
Before this patch, a node evacuation submitted with high priority would
only compute the solution at that priority, but the actual evacuation
ran at normal priority.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
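A minimal sketch of the idea, assuming a simplified opcode object: the class,
the helper name and the comment format below are hypothetical and only
illustrate copying the settings from the submitting opcode to the generated
ones.

```python
class OpCodeSketch:
    """Hypothetical stand-in for an opcode; not Ganeti's real class."""

    def __init__(self, name, priority=0, debug_level=0, comment=None):
        self.name = name
        self.priority = priority
        self.debug_level = debug_level
        self.comment = comment


def inherit_settings(parent, children):
    """Copy debug level and priority from parent to generated opcodes."""
    for op in children:
        op.priority = parent.priority
        op.debug_level = parent.debug_level
        if not op.comment:
            # Assumed comment format, for illustration only.
            op.comment = "Submitted by %s" % parent.name
    return children
```

Without such a step the generated opcodes keep their defaults, which matches
the evacuation behaviour described in the message.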
Michael Hanselmann [Fri, 13 Apr 2012 14:53:21 +0000 (16:53 +0200)]
Fix QA failures with "gnt-job list"
Jobs have no “name” field.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 12 Apr 2012 22:08:20 +0000 (00:08 +0200)]
gnt-job list: Add options for commonly used filters
While “gnt-job list” would also accept filters on the command line (e.g.
“'status == "error"'”), having shortcuts in the form of options comes in
handy.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 12 Apr 2012 19:17:24 +0000 (21:17 +0200)]
Merge branch 'devel-2.5'
* devel-2.5: (29 commits)
gnt-* {add,list,remove}-tags: Unify options
Bump version for 2.5.0 final release
configure.ac: Fix “too many arguments” error
Fix extra whitespace
Further fixes concerning drbd port release
Fix a bug concerning TCP port release
Fix extra whitespace
Fix a bug concerning TCP port release
ganeti.initd: Add “status” action
Add whitelist for opcodes using BGL
LUOobCommand: acquire BGL in shared mode
Fix docstring bug
LUNodeAdd: Verify version in Prereq
LUNodeAdd: Verify version in Prereq
Fix LV status parsing to accept newer LVM
gnt-instance info: Show node group information
cmdlib: Factorize checking acquired node group locks
Bump version for 2.5.0~rc6 release
cmdlib: Stop forking in LUClusterQuery
locking: Notify only once on release
...
Conflicts:
NEWS: Trivial
daemons/daemon-util.in: Copyright line
lib/client/gnt_group.py: Tag operations
lib/cmdlib.py: Not so trivial, hopefully correct
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 12 Apr 2012 17:31:37 +0000 (19:31 +0200)]
gnt-* {add,list,remove}-tags: Unify options
- Listing tags is a query, so neither “--priority” nor “--submit” make
sense
- Support both options for adding/removing tags
- Also remove “--submit” from “gnt-node health”; it doesn't work and
doesn't make sense for listing node health
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 12 Apr 2012 16:12:48 +0000 (18:12 +0200)]
Merge branch 'stable-2.5' into devel-2.5
* stable-2.5:
Bump version for 2.5.0 final release
configure.ac: Fix “too many arguments” error
Fix extra whitespace
Further fixes concerning drbd port release
Fix a bug concerning TCP port release
Fix extra whitespace
Fix a bug concerning TCP port release
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 11 Apr 2012 17:34:58 +0000 (19:34 +0200)]
Bump version for 2.5.0 final release
Also update NEWS file.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 11 Apr 2012 18:26:35 +0000 (20:26 +0200)]
Merge branch 'devel-2.4' into stable-2.5
* devel-2.4:
Fix extra whitespace
Further fixes concerning drbd port release
Fix a bug concerning TCP port release
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 11 Apr 2012 17:34:54 +0000 (19:34 +0200)]
configure.ac: Fix “too many arguments” error
If GHC_PKG_QUICKCHECK contains multiple values, the test would fail
with “too many arguments”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Fri, 30 Mar 2012 10:44:51 +0000 (12:44 +0200)]
Fix extra whitespace
Sorry, didn't catch this before…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit
54b010cad1ea0a536ed037bf315a04dd1c079964)
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Dimitris Aragiorgis [Mon, 2 Apr 2012 18:27:35 +0000 (21:27 +0300)]
Further fixes concerning drbd port release
Commit 3b3b1bc does not entirely fix the bug introduced in commit
f396ad8. It fixes consistency of config data in permanent storage, but
does not ensure consistency in data held in runtime memory of masterd.
The bug of duplicate ports is still triggered when LUInstanceRemove()
invokes _RemoveDisks() and this returns False (in case
call_blockdev_remove RPC fails). The drbd ports get returned to the
pool, but execution is aborted and RemoveInstance() is never invoked.
Because port handling is not done with TemporaryReservationManager,
ensure that ports are released only if the disk-related config data is
deleted.
In _RemoveDisks() release ports only if all RPCs succeed.
Extend _RemoveDisks() to include ignore_failures argument passed by
_RemoveInstance() to handle the ports appropriately.
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
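The rule for _RemoveDisks() can be sketched as follows. This is not the
cmdlib code: the function signature, the dict-based disks and the callables
are hypothetical, and treating ignore_failures as "release anyway" is an
assumption about how the argument is meant to be used.

```python
def remove_disks(disks, remove_fn, release_port_fn, ignore_failures=False):
    """Remove all disks, then release their ports only when it is safe.

    disks: list of dicts with a "port" entry (None if the disk has no port).
    remove_fn: stands in for the blockdev removal RPC, returns True on success.
    release_port_fn: stands in for returning a port to the pool.
    """
    all_ok = True
    for disk in disks:
        if not remove_fn(disk):
            all_ok = False

    # Assumption: ports go back to the pool only if every removal worked,
    # or if the caller will delete the instance config regardless.
    if all_ok or ignore_failures:
        for disk in disks:
            if disk.get("port") is not None:
                release_port_fn(disk["port"])

    return all_ok
```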
Dimitris Aragiorgis [Fri, 30 Mar 2012 09:47:14 +0000 (11:47 +0200)]
Fix a bug concerning TCP port release
Commit f396ad8 returns the TCP port used by DRBD disk back to the
TCP/UDP port pool using AddTcpUdpPort().
However, AddTcpUdpPort() writes the config on every invocation,
using _WriteConfig(). This causes two problems:
* it causes critical errors logged by VerifyConfig(), after the DRBD
disk removal, and until the actual instance removal.
* if the code following AddTcpUdpPort() fails, the port is already
returned back to the pool, which causes the port to have duplicates
(inconsistent config).
AddTcpUdpPort() is invoked in three cases:
* during InstanceRemove() through _RemoveDisks().
* during InstanceSetParams() in case of disk removal.
* during InstanceSetParams() through _ConvertDrbdToPlain().
This commit fixes the problem by removing the _WriteConfig() call from
AddTcpUdpPort(), delegating it to Update() via the
TemporaryReservationManager and ensuring AddTcpUdpPort() precedes
Update().
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
[iustin@google.com: small comments adjustments]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
(cherry picked from commit
3b3b1bca566a005acd622a5b6e49528e5e3dbe85)
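To show why dropping the write from AddTcpUdpPort() helps, here is a small
sketch with a hypothetical config class. It only models the ordering (add the
port in memory, persist once in update) and is not Ganeti's ConfigWriter.

```python
class ConfigSketch:
    """Hypothetical in-memory config with an explicit write step."""

    def __init__(self, ports):
        self._ports = set(ports)       # runtime view of the port pool
        self.saved_ports = set(ports)  # stands in for the on-disk config
        self.writes = 0

    def add_tcp_udp_port(self, port):
        # No write here: the pool changes in memory only, so a failure
        # in the code that follows leaves the stored config untouched.
        self._ports.add(port)

    def update(self):
        # A single write persists the port together with the other changes.
        self.saved_ports = set(self._ports)
        self.writes += 1


cfg = ConfigSketch([11000])
cfg.add_tcp_udp_port(11001)
cfg.update()
```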
Michael Hanselmann [Fri, 30 Mar 2012 13:01:38 +0000 (15:01 +0200)]
Fix query unittests after converting jobs to query2
I missed these among some shelltest-related failures.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:31:46 +0000 (14:31 +0200)]
QA: Add tests for “gnt-job list”
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:37:55 +0000 (14:37 +0200)]
gnt-job list: Switch to query2
This brings “gnt-job list” up to the same level as “gnt-instance list”
with filters. Further updates will add more parameters for the most
common filters (e.g. only running jobs).
Also update the man page.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:36:12 +0000 (14:36 +0200)]
gnt-job info: Convert to query2
Otherwise detecting unavailable jobs is hard (“status” is None, is this
an error or just an unavailable job?).
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:34:16 +0000 (14:34 +0200)]
Add job support to query2 via LUXI
This enables the use of filters through query2 when listing jobs.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:32:59 +0000 (14:32 +0200)]
jqueue: Cache prepared field list in _JobChangesChecker
… instead of re-calculating it on every file change.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
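The caching amounts to moving the field preparation into the constructor. A
minimal sketch, with a hypothetical class and a caller-supplied prepare_fn
standing in for the query machinery:

```python
class JobChangesCheckerSketch:
    """Hypothetical checker that prepares its field list once."""

    def __init__(self, fields, prepare_fn):
        # Prepared here rather than on every file change notification.
        self._prepared = prepare_fn(fields)

    def snapshot(self, job):
        """Evaluate the cached getters against the current job state."""
        return [getter(job) for getter in self._prepared]
```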
Michael Hanselmann [Wed, 28 Mar 2012 12:30:37 +0000 (14:30 +0200)]
NEWS: Deprecate LUXI calls replaced with query2
Adding the “luxi” namespace is necessary in “sphinx_ext”.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:28:50 +0000 (14:28 +0200)]
jqueue: Convert GetInfo to query2
This rather inefficient implementation (fields are evaluated on every
call to GetInfo) is not good for WaitForJobChanges and doesn't support
filters, but that will be rectified in later patches.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:29:54 +0000 (14:29 +0200)]
query: Add definitions for job fields
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:25:48 +0000 (14:25 +0200)]
qlang.MakeFilter: Enable use of different name field
Jobs don't have a “name” field, so we must be able to control
the field used for simple filters.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
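A sketch of what a configurable name field means for simple filters. The
function below is hypothetical and much simpler than qlang.MakeFilter; the
nested-list filter shape and the "|" and "=" operator strings are assumptions
used for illustration.

```python
def make_name_filter(names, namefield="name"):
    """Build an OR-filter matching any of the given names.

    namefield selects which field the names are compared against, e.g.
    "id" for jobs, which have no "name" field.
    """
    if not names:
        return None
    return ["|"] + [["=", namefield, name] for name in names]


print(make_name_filter(["123", "456"], namefield="id"))
```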
Michael Hanselmann [Wed, 28 Mar 2012 14:19:55 +0000 (16:19 +0200)]
Merge cli.FormatTimestamp and utils.FormatTime
… to some degree at least. Unittests are included.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 13:55:21 +0000 (15:55 +0200)]
constants: Don't hardcode priorities for LOCK_ATTEMPTS_TIMEOUT
Also include unittest for LOCK_ATTEMPTS_TIMEOUT.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 13:41:36 +0000 (15:41 +0200)]
jqueue._QueuedOpCode: Change a docstring
There was a typo and it's not necessary to repeat the class name.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 12:32:03 +0000 (14:32 +0200)]
locking: Remove unused OldStyleQueryLocks
No longer used after commit 090377807.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Fri, 30 Mar 2012 10:44:51 +0000 (12:44 +0200)]
Fix extra whitespace
Sorry, didn't catch this before…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Dimitris Aragiorgis [Tue, 13 Mar 2012 14:48:58 +0000 (16:48 +0200)]
Fix a bug concerning TCP port release
Commit f396ad8 returns the TCP port used by DRBD disk back to the
TCP/UDP port pool using AddTcpUdpPort().
However, AddTcpUdpPort() writes the config on every invocation,
using _WriteConfig(). This causes two problems:
* it causes critical errors logged by VerifyConfig(), after the DRBD
disk removal, and until the actual instance removal.
* if the code following AddTcpUdpPort() fails, the port is already
returned back to the pool, which causes the port to have duplicates
(inconsistent config).
AddTcpUdpPort() is invoked in three cases:
* during InstanceRemove() through _RemoveDisks().
* during InstanceSetParams() in case of disk removal.
* during InstanceSetParams() through _ConvertDrbdToPlain().
This commit fixes the problem by removing the _WriteConfig() call from
AddTcpUdpPort(), delegating it to Update() via the
TemporaryReservationManager and ensuring AddTcpUdpPort() precedes
Update().
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
[iustin@google.com: small comments adjustments]
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 17:00:49 +0000 (19:00 +0200)]
ganeti.initd: Add “status” action
Eric Rostetter sent a patch adding a “status” action, but unfortunately
his code was apparently specific to Red Hat. I hope this implementation
is more distribution-agnostic; after all “status_of_proc” is part of
LSB. Example output:
$ /etc/init.d/ganeti status
ganeti-noded is not running ... failed!
ganeti-masterd is running.
ganeti-rapi is running.
ganeti-confd is running.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 16:11:04 +0000 (18:11 +0200)]
Add whitelist for opcodes using BGL
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Dimitris Aragiorgis [Fri, 23 Mar 2012 19:29:55 +0000 (21:29 +0200)]
Update IP pool management design doc
Update IP pool management design doc to be consistent
with the implementation.
* Add new NIC parameter: 'network'
Can be None for backwards compatibility.
* Introduce the term 'netparams'
The NIC inherits netparams (mode, link) as its nicparams
if assigned to a network. Netparams are defined during
network connection to a nodegroup.
* Introduce the term 'Conflicting IPs'
Ensure IP uniqueness inside nodegroups.
* Update 'Hooks' section.
* Update 'Hook variables' section
* Update 'Userland interface' to reflect the implementation
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Wed, 28 Mar 2012 16:04:20 +0000 (18:04 +0200)]
Merge branch 'stable-2.5' into devel-2.5
* stable-2.5:
LUOobCommand: acquire BGL in shared mode
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Bernardo Dal Seno [Wed, 28 Mar 2012 11:42:46 +0000 (13:42 +0200)]
LUOobCommand: acquire BGL in shared mode
Fixed a typo so that now LUOobCommand acquires the BGL in shared mode, as
intended.
Signed-off-by: Bernardo Dal Seno <bdalseno@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
René Nussbaumer [Mon, 26 Mar 2012 15:15:55 +0000 (17:15 +0200)]
LUNodeAdd: Make the version call only dependent on DNS
Also move the version check into prereq to abort before altering cluster
state if the versions mismatch.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
René Nussbaumer [Mon, 26 Mar 2012 15:06:02 +0000 (17:06 +0200)]
RPC: Add a new client type for DNS only
This patch moves the “call_version” to a new RPC client definition and
then adds a new runner using the DNS resolver for getting the host
address.
The standard “BootstrapRunner”, where the call was before, tries to
resolve node names using ssconf first, which doesn't work properly when
re-adding a node with a new primary IP address.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 27 Mar 2012 09:27:29 +0000 (11:27 +0200)]
Update install document
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Mar 2012 09:27:22 +0000 (11:27 +0200)]
Update admin doc
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Mar 2012 09:27:09 +0000 (11:27 +0200)]
Update walkthrough document
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 23 Mar 2012 09:17:56 +0000 (09:17 +0000)]
Update default instance kernel version
We switch from vmlinuz-2.6-… to vmlinuz-3-…. To do this nicely, we
also add a ./configure-time setting for the KVM instance kernel.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 23 Mar 2012 08:58:02 +0000 (08:58 +0000)]
Update INSTALL and devnotes documents
Added the new Haskell library requirements, for both normal and
developer usage.
Furthermore, all commands are now converted to the shell-example
lexer.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Mar 2012 09:00:35 +0000 (11:00 +0200)]
Fix escaping of percent signs in the shell lexer
Of course, we do have cases where we want to escape the percent signs,
and our regexes were not fully correct for this case.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Mon, 26 Mar 2012 12:57:26 +0000 (14:57 +0200)]
Add a special lexer for sphinx/pygments
This will be used throughout our docs for better formatting example
shell sessions, with custom markup for comments, user fixed input and
user variable input.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Fri, 23 Mar 2012 14:59:06 +0000 (15:59 +0100)]
ganeti.7: Add more filter examples
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Sun, 25 Mar 2012 14:32:35 +0000 (16:32 +0200)]
Enable -Werror by default for htools
Since the code base is now "clean" across all supported GHC versions
(6.12-7.4), we can enable -Werror again.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Sun, 25 Mar 2012 14:26:02 +0000 (16:26 +0200)]
Switch to new-style exception handling
Currently, we're using Prelude.catch to handle I/O errors in
htools. This style of error handling has been deprecated for a while,
but it still used to work without warnings.
However, the GHC release 7.4 has started to emit deprecation warnings
for it, so we change to the Control.Exception module; the code is a
bit less clean since we only care about I/O errors (but
Control.Exception deals with other error types too), so we have to
filter the exceptions.
Note that the new style exception handling is not really "new"; it has
existed since at least GHC 6.12, which is our oldest supported
compiler.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Sat, 24 Mar 2012 23:21:15 +0000 (00:21 +0100)]
Change a type computation for compatibility with 6.12
This is the last warning related to TemplateHaskell that was 6.12
specific; for some reason, it doesn't "see" that traw/tname were used.
The patch just replaces the quoting syntax with an explicit type
declaration; while less readable, it also has fewer warnings.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Sat, 24 Mar 2012 22:34:42 +0000 (23:34 +0100)]
Fix compatibility with TemplateHaskell from GHC 7.4
GHC 7.4 has updated the TemplateHaskell library, and it turns out that
the way we built the JSON instance implementation for showJSON was not
good (probably this is why GHC 6.12 was generating some warnings).
The patch changes the build of showJSON to be the same as readJSON,
which was working fine. As a bonus, this fixes both the 7.4 issue and
the 6.12 one.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 23 Mar 2012 16:32:57 +0000 (17:32 +0100)]
Add trivial tests for gnt-* cli
While testing some other stuff, I realised that the gnt-* commands
could be broken (as in, the script fails with syntax errors), but make
check doesn't detect it. Since we have shelltest, we can now add
trivial tests for this case.
One downside is that starting the scripts seems to be much slower
than the htools binaries, so we can't add as many tests.
The other downside is that shelltest is now required for all
development work, but I think this is a small disadvantage compared to
the increased testing possibilities.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Fri, 23 Mar 2012 09:08:41 +0000 (09:08 +0000)]
Fix hardcoded Xen kernel path
We already have a ./configure-time variable for this, but it seems to
be actually unused.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 22 Mar 2012 21:00:06 +0000 (21:00 +0000)]
Fix docstring bug
Fix a typo introduced in commit c85b15c1, which breaks epydoc.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 22 Mar 2012 23:30:28 +0000 (23:30 +0000)]
Enable selection between Python and Haskell confd
This patch changes configure.ac and Makefile.am so that the user can pass:
- --disable-confd (or --enable-confd=no) to disable it completely
- --enable-confd=yes or --enable-confd=python to select the
traditional implementation (this is the default setting)
- --enable-confd=haskell to select hconfd
The only "not nice" thing is that I've chosen to keep the
hconfd.hs/hconfd name, and we rename it after install via an
install-exec-hook. The other choice is possible too (to rename the
source file/binary).
One additional note is that if we select haskell, the _rule_ for
creating daemons/ganeti-confd disappears; whereas if we select python,
the rule for htools/hconfd still exists (one can build it explicitly),
it just is not installed. This is due to the different way in which
the rules are declared.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 22 Mar 2012 22:59:00 +0000 (22:59 +0000)]
Fix qemu-img configure.ac check
By accident, commit a002ed7 introduced the qemu-img checks in the
htools block. I found this also by mistake while investigating
another issue :)
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Guido Trotter [Fri, 23 Mar 2012 12:39:41 +0000 (12:39 +0000)]
Merge branch 'stable-2.5' into devel-2.5
* stable-2.5:
LUNodeAdd: Verify version in Prereq
Fix LV status parsing to accept newer LVM
Bump version for 2.5.0~rc6 release
Revert "Stop acquiring BGL for LUXI queries"
LUClusterVerifyConfig: Share BGL, acquire all locks in shared mode
KVM: don't add -nographic using spice
Stop acquiring BGL for LUXI queries
Fix type error in LUInstanceChangeGroup
Conflicts:
lib/hypervisor/hv_kvm.py
- trivial, keep both changes
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
René Nussbaumer [Fri, 23 Mar 2012 11:18:15 +0000 (12:18 +0100)]
LUNodeAdd: Verify version in Prereq
There are other ways to leave the cluster in a broken state than just
the version check. However they are not very trivial to fix in 2.5. So
leave it up to 2.6 for a nicer fix.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit
e2ea8de1663b9a49219f2ea0709653b424384436)
René Nussbaumer [Fri, 23 Mar 2012 11:18:15 +0000 (12:18 +0100)]
LUNodeAdd: Verify version in Prereq
There are other ways to leave the cluster in a broken state than just
the version check. However they are not very trivial to fix in 2.5. So
leave it up to 2.6 for a nicer fix.
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 22 Mar 2012 19:16:20 +0000 (19:16 +0000)]
Fix LV status parsing to accept newer LVM
LVM version 2.02.93 (or at least, sometime after .88) has extended the
lv_attr field with two more flags; we only care about the first digit,
so let's change the "!= 6" check to "< 6".
Thanks to Robin H Johnson <robbat2@gentoo.org> for finding this issue.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
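The relaxed check can be sketched like this. The function is hypothetical
and not the code from Ganeti's block device module; it assumes the usual
lv_attr layout in which the fifth character is the state and the sixth marks
an open device.

```python
def parse_lv_attr(attr):
    """Return (active, is_open) from an lv_attr string.

    Accepts six or more characters, so newer LVM versions that append
    extra flags keep working; shorter strings are rejected.
    """
    if len(attr) < 6:
        raise ValueError("lv_attr too short: %r" % attr)
    active = attr[4] != "-"   # assumed: "-" in the state column means inactive
    is_open = attr[5] == "o"  # assumed: "o" marks an open device
    return (active, is_open)
```

A strict "!= 6" comparison would reject an eight-character value such as
"-wi-ao--", while the "< 6" form still guards the two index lookups.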
Michael Hanselmann [Thu, 22 Mar 2012 17:23:08 +0000 (18:23 +0100)]
gnt-instance info: Show node group information
This requires acquiring the node group locks in shared mode.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 22 Mar 2012 16:56:38 +0000 (17:56 +0100)]
cmdlib: Factorize checking acquired node group locks
The “cur_group_uuid” parameter is optional to prepare for using the
factorized code from LUInstanceQueryData.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 22 Mar 2012 12:34:53 +0000 (12:34 +0000)]
Rework exit model
While updating the confd code, I realised that we have _lots_ of
duplication in the exit model for the various programs.
So this patch attempts to abstract all the exits via a couple of new
functions; sorry for the somewhat big patch, but I hope the payoff is
worth the change: the actual exit conditions are much clearer.
Note that the patch (also) moves the exitIfBad function to Utils.hs,
since that is more logical.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 22 Mar 2012 15:26:02 +0000 (16:26 +0100)]
Bump version for 2.5.0~rc6 release
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 22 Mar 2012 14:55:01 +0000 (14:55 +0000)]
Fix out-of-tree builds
The new shell tests do not succeed out-of-tree, due to static paths
and other issues. This trivial patch fixes these issues; make distcheck
now passes.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 22 Mar 2012 14:13:36 +0000 (15:13 +0100)]
cmdlib: Stop forking in LUClusterQuery
While debugging another issue we realized that LUClusterQuery forks.
This turned out to be the “platform.architecture” function from the
Python library. It uses the “file” command to determine the architecture
of the Python binary.
This patch adds two new functions to the “runtime” module to get this
information once per process instead of doing it every single time
LUClusterQuery is used. Forking is a no-go in a multi-threaded
environment anyway.
A future change will also have to change the terminology in “gnt-cluster
info”: it reports the binary architecture simply as “architecture”, when
it's actually the binaries' architecture. Kernel and userland can be
different.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
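A minimal sketch of the "once per process" approach. The function names and
the probe parameter are hypothetical, not the ones added to the "runtime"
module; platform.architecture is the standard library call named in the
message.

```python
import platform

# Cached result; filled in once, before worker threads are started.
_arch_info = None


def init_arch_info(probe=platform.architecture):
    """Determine the architecture once and remember it.

    probe is injectable for testing; the default may spawn the "file"
    command, which is why it should run before any threads exist.
    """
    global _arch_info
    if _arch_info is None:
        _arch_info = probe()
    return _arch_info


def get_arch_info():
    """Return the cached value without ever forking."""
    if _arch_info is None:
        raise RuntimeError("init_arch_info() has not been called")
    return _arch_info
```

Separating initialisation from lookup keeps the expensive call out of the
query path, so later callers only read a cached tuple.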
Iustin Pop [Wed, 21 Mar 2012 01:06:29 +0000 (01:06 +0000)]
Convert manual shell tests to shelltestrunner
This is more of a RFC… Basically most of the shell-based tests are
converted from exec+grep to shelltestrunner.
Things are not all fine and nice though:
- we have dependencies between tests, as some generate some data files
needed later; this is not nice, and we depend on serial execution in
testrunner
- we can still fail with not-so-nice messages in the offline-test
script (when we generate most of the data)
But overall, I think the tests are much nicer to
define/read/debug:
- each test is standalone, with the only dependency being an optional
input data file; this is much better than a single monolithic shell
script
- in case of failures, the failure is clearly shown by shell test,
both for exit code and stdout/stderr
- shelltest can run in --debug mode, where the exact details are shown
much better than the alternative of "set -x" for the shell script
Comments welcome!
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 21 Mar 2012 16:48:07 +0000 (16:48 +0000)]
Add command line option for controlling syslog use
… and enable it in hconfd.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Wed, 21 Mar 2012 15:08:28 +0000 (15:08 +0000)]
Add support for syslog logging to Ganeti.Logging
Currently this is initialised to “no” (i.e. disabled) from Daemon.hs, but
will in the future allow command-line options for controlling it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Michael Hanselmann [Wed, 21 Mar 2012 15:21:36 +0000 (16:21 +0100)]
locking: Notify only once on release
Don't notify for every released lock in shared mode. The last one is
enough.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Michael Hanselmann [Wed, 21 Mar 2012 13:55:22 +0000 (14:55 +0100)]
locking: Handle spurious notifications on lock acquire
This has been a TODO since the implementation of lock priorities in
September 2010. Under certain conditions a waiting acquire can be
notified at a time when it can't actually get the lock. In this case it
would try and fail to acquire the lock and then return to the caller
before the timeout ends.
While this is not bad (nothing breaks), it isn't nice either. A separate
patch will prevent unnecessary notifications when shared locks are
released.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
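The general technique for tolerating spurious notifications, as described in the commit above, is to re-check the condition after every wake-up and to wait again for the remaining time only. The sketch below shows that pattern with Python's threading.Condition. It is an assumption-laden illustration: SimpleLock is a hypothetical exclusive-only lock and not Ganeti's SharedLock, which also handles shared mode and priorities.

```python
import threading
import time


class SimpleLock:
    """Hypothetical exclusive lock with a timeout-aware acquire."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False

    def acquire(self, timeout):
        """Try to get the lock, giving up only once the timeout has expired."""
        deadline = time.monotonic() + timeout
        with self._cond:
            while self._held:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                # A wake-up does not guarantee the lock is free; the loop
                # re-checks and waits again for the remaining time.
                self._cond.wait(remaining)
            self._held = True
            return True

    def release(self):
        """Free the lock and wake one waiter."""
        with self._cond:
            self._held = False
            self._cond.notify()
```

Without the loop, a notification that arrives while the lock is still held would make acquire() return early, which is the behaviour the commit message describes.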
Michael Hanselmann [Wed, 21 Mar 2012 13:54:39 +0000 (14:54 +0100)]
locking: Fix lock deletion with timeout
While working on another SharedLock fix I realized timeouts on lock
deletion don't work very well if the timeout actually expires. This
patch fixes the issue and adds a new unittest.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Andrea Spadaccini [Wed, 28 Sep 2011 14:56:22 +0000 (15:56 +0100)]
Move _TimeoutExpired to utils
Signed-off-by: Andrea Spadaccini <spadaccio@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
(cherry picked from commit
f8326fcaac87958241d78526e5868d23d78ac286)
Michael Hanselmann [Wed, 21 Mar 2012 17:11:48 +0000 (18:11 +0100)]
Revert "Stop acquiring BGL for LUXI queries"
This reverts commit
0fa753bad2cf5a0cf88953347e5da3aebbf21956.
Turns out there are more queries acquiring locks than we'd like. This
patch goes to version 2.6 and a separate patch fixes the immediate
issues in LUClusterVerifyConfig.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Michael Hanselmann [Wed, 21 Mar 2012 15:59:17 +0000 (16:59 +0100)]
LUClusterVerifyConfig: Share BGL, acquire all locks in shared mode
Instead of acquiring the BGL in exclusive mode (which blocks all other
operations), we acquire all locks for groups, nodes and instances in
shared mode before verifying the configuration.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
Guido Trotter [Wed, 21 Mar 2012 14:58:05 +0000 (14:58 +0000)]
KVM: don't add -nographic using spice
This fixes issue 222.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 21 Mar 2012 11:45:42 +0000 (11:45 +0000)]
Only build hconfd if --enable-confd was passed
A later, more complete patch will allow selecting between either the
Python version or the Haskell version. This is just a temporary
solution to help building without all the needed Haskell libraries.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Tue, 20 Mar 2012 13:13:11 +0000 (13:13 +0000)]
Build test helpers that point to hpc-htools
Instead of using just shell constructs to run hpc-htools correctly
(i.e. HTOOLS=role htools/hpc-htools …), let's add some shell fragments
that do this for us.
This will ease the running of tests directly.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Tue, 20 Mar 2012 10:48:08 +0000 (11:48 +0100)]
Allow hail to read data from stdin
This patch makes hail treat '-' as denoting stdin, per the usual Unix
convention. This will help with testing.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
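The '-' convention mentioned in the hail commit above is simple to express. hail itself is written in Haskell; the Python sketch below only illustrates the idea, and the function name read_input is hypothetical.

```python
import sys


def read_input(path):
    """Return the contents of path, treating '-' as standard input."""
    if path == "-":
        return sys.stdin.read()
    with open(path, "r") as handle:
        return handle.read()
```

With this in place a test can pipe a request into the tool instead of writing a temporary file first.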
Iustin Pop [Mon, 19 Mar 2012 17:02:56 +0000 (18:02 +0100)]
Update hconfd bind address handling
Instead of hardcoded IPv4 INADDR_ANY, this patch changes hconfd to use
either the “any” address for the configured cluster address family
(ipv4/ipv6), or whatever the user passes in via the --bind option.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 19 Mar 2012 17:01:52 +0000 (18:01 +0100)]
Add the bind-address option
This implements the same logic as the Python code: if the option is
not used, use the default appropriate for the cluster, otherwise try
to parse and use whatever was passed in.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
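The bind address selection described in the two commits above (use the “any” address of the cluster's address family unless the user passed an address) might look roughly like this in Python. The function name and parameters are hypothetical, and the Haskell implementation in hconfd is not reproduced here.

```python
import ipaddress
import socket


def resolve_bind_address(family, bind_option=None):
    """Pick the address a daemon should bind to.

    family is socket.AF_INET or socket.AF_INET6 (the cluster's family);
    bind_option is the raw value of a --bind style option, or None.
    """
    if bind_option is None:
        # No option given: use the wildcard address of the cluster family.
        return "0.0.0.0" if family == socket.AF_INET else "::"
    # Option given: parse it; ip_address raises ValueError on bad input.
    return str(ipaddress.ip_address(bind_option))
```

Parsing the user-supplied value up front means a malformed address is rejected at start-up rather than when the socket is bound.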
Iustin Pop [Tue, 20 Mar 2012 17:57:33 +0000 (17:57 +0000)]
Add two utility functions
Both of these work with/on the Result type, so we add them to
BasicTypes. The functions will be used as more generic versions of
some more specialised functions that are right now spread across the
modules.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Mon, 19 Mar 2012 16:47:45 +0000 (17:47 +0100)]
Add skeleton ssconf module
This currently has only one export function in it, which will be used
for future bind address functionality in daemons.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Tue, 20 Mar 2012 16:57:12 +0000 (17:57 +0100)]
Stop acquiring BGL for LUXI queries
Short description: This fixes an issue whereby masterd would become
unresponsive on the LUXI socket, leading to client timeouts. While made
worse in 2.5, the underlying issue was already present in 2.4.
Longer description: Until now all LUXI queries would acquire the BGL
(big Ganeti lock) in shared mode. With the exception of OpNodeAdd and
OpNodeRemove, this was also the case for all opcodes before version 2.5.
In 2.5 we split OpClusterVerify into multiple opcodes, one of which
(OpClusterVerifyConfig) now acquires the BGL in exclusive mode. Whether
or not doing so is good is a separate discussion: OpNodeAdd and
OpNodeRemove, as of this writing, still require an exclusive BGL.
OpClusterVerifyConfig is run more often than OpNodeAdd or OpNodeRemove
in normal clusters, which is why we only recognized this issue in 2.5.
What would happen is that once OpClusterVerifyConfig tried to acquire
its exclusive BGL while it was actually held by other opcodes (e.g.
OpInstanceReplaceDisks), the locking code would not grant shared
acquires for the BGL, even when the exclusive acquire is removed from
the queue for a short amount of time after a timeout. This is necessary
to prevent lock starvation.
In this situation further LUXI queries requiring the BGL in shared mode,
e.g. OpClusterQuery, would block and the client would eventually time
out. Over time they fill the client request workerpool's queue and at
that point even requests not requiring the BGL stop working. Once the
long-running operation(s) holding the BGL in shared mode finish,
OpClusterVerifyConfig gets it in exclusive mode and everything returns
to normal. LUXI recovers very soon too.
I'd like to thank Bernardo Dal Seno for his contribution to this bugfix.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Bernardo Dal Seno <bdalseno@google.com>
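The blocking behaviour explained in the commit above can be reproduced with a toy, non-blocking model of a shared/exclusive lock. This is a deliberate simplification under stated assumptions: ToySharedLock and its methods are hypothetical, there is no queue or timeout handling, and Ganeti's real locking code is considerably more involved.

```python
class ToySharedLock:
    """Toy model: a pending exclusive request blocks new shared acquires."""

    def __init__(self):
        self.shared_holders = 0
        self.exclusive_held = False
        self.exclusive_pending = False

    def try_acquire_shared(self):
        # Refusing shared acquires while an exclusive request is pending
        # prevents starvation of the exclusive acquirer.
        if self.exclusive_held or self.exclusive_pending:
            return False
        self.shared_holders += 1
        return True

    def release_shared(self):
        self.shared_holders -= 1

    def try_acquire_exclusive(self):
        if self.exclusive_held:
            return False
        if self.shared_holders == 0:
            self.exclusive_pending = False
            self.exclusive_held = True
            return True
        # Shared holders are still active: remember the request.
        self.exclusive_pending = True
        return False

    def release_exclusive(self):
        self.exclusive_held = False


bgl = ToySharedLock()
bgl.try_acquire_shared()     # long-running opcode holds the BGL (shared)
bgl.try_acquire_exclusive()  # exclusive request has to wait
blocked = not bgl.try_acquire_shared()  # a query wanting the BGL is refused
```

In the model, as in the description, the query stays blocked until the long-running holder releases the lock and the exclusive request has been served.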
Alexander Schreiber [Tue, 20 Mar 2012 11:00:00 +0000 (12:00 +0100)]
Typo fix: s/aditional/additional/
Trivial fix for a typo in message output of LUInstanceSetParams
Signed-off-by: Alexander Schreiber <als@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Mon, 19 Mar 2012 15:41:22 +0000 (16:41 +0100)]
Fix exported constants
I "forgot" to run the unittests before commit :(
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>