Manuel Franceschini [Mon, 2 Aug 2010 17:26:07 +0000 (19:26 +0200)]
Support IPv6 cluster init
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Mon, 2 Aug 2010 16:10:38 +0000 (18:10 +0200)]
Add primary_ip_family to ssconf
Since this parameter will be used on all daemon startups, it needs to be
available on all nodes. This is achieved by querying it via ssconf. This
patch additionally adds a getter method to readily retrieve the primary
ip family from a ConfigWriter object.
This patch also disables the 'R0904: Too many public methods' pylint
warning, as it crosses the 50 methods limit.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Mon, 2 Aug 2010 16:03:21 +0000 (18:03 +0200)]
Add new cluster parameter primary_ip_version
We expose the ip_version (4, 6) to the external interface and internally
we convert it to ip_family (AF_INET=2, AF_INET6=10). This makes the code
more concise as all functions deal with family rather than version.
This patch does not yet expose this parameter via gnt-cluster, but only uses
the constant IP4_VERSION. This will be enabled in a future patch.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
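As a rough illustration of the version/family conversion described in "Add new cluster parameter primary_ip_version" above, the mapping could look like the sketch below. IP4_VERSION is named in the commit message; IP6_VERSION and the two helper functions are hypothetical names chosen for this example, not Ganeti's actual API.

```python
import socket

IP4_VERSION = 4
IP6_VERSION = 6  # assumed counterpart of IP4_VERSION, not taken from the log

# External interface uses versions (4, 6); internal code uses address families.
# The numeric values (AF_INET=2, AF_INET6=10) are platform-specific, so the
# socket constants are used instead of literals.
_VERSION_TO_FAMILY = {
    IP4_VERSION: socket.AF_INET,
    IP6_VERSION: socket.AF_INET6,
}
_FAMILY_TO_VERSION = dict((fam, ver) for ver, fam in _VERSION_TO_FAMILY.items())


def version_to_family(version):
    """Convert an IP version (4 or 6) to a socket address family."""
    try:
        return _VERSION_TO_FAMILY[version]
    except KeyError:
        raise ValueError("unsupported IP version: %r" % (version,))


def family_to_version(family):
    """Convert a socket address family back to an IP version."""
    try:
        return _FAMILY_TO_VERSION[family]
    except KeyError:
        raise ValueError("unsupported address family: %r" % (family,))
```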
Manuel Franceschini [Thu, 19 Aug 2010 08:26:25 +0000 (10:26 +0200)]
netutils: make re class attribute and catch IndexError
These missing changes were initially agreed upon but then forgotten.
First, we move the valid name regex to the class-level such that it
won't be compiled for every invocation of GetIP() and we wrap the result
of getaddrinfo() into a try/except to catch a possible IndexError.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
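The two changes in "netutils: make re class attribute and catch IndexError" above can be sketched as follows. This is only an illustration: the regular expression, the method names and the choice to return None on an empty result are assumptions made for the example, not the code from the patch.

```python
import re
import socket


class Hostname(object):
    # Compiled once when the class is defined, not on every call.
    # The pattern itself is a placeholder for this sketch.
    _VALID_NAME_RE = re.compile(r"^[a-z0-9._-]{1,255}$")

    @classmethod
    def is_valid_name(cls, name):
        """Check a name against the class-level regex."""
        return bool(cls._VALID_NAME_RE.match(name))

    @staticmethod
    def first_address(results):
        """Pick the IP from a getaddrinfo()-style result list.

        Returns None if the list is empty, instead of letting the
        IndexError escape to the caller.
        """
        try:
            # Each entry is (family, type, proto, canonname, sockaddr).
            return results[0][4][0]
        except IndexError:
            return None

    @classmethod
    def get_ip(cls, hostname, family=socket.AF_INET):
        """Resolve a hostname; name resolution errors still propagate."""
        return cls.first_address(socket.getaddrinfo(hostname, None, family))
```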
Guido Trotter [Wed, 18 Aug 2010 16:59:28 +0000 (17:59 +0100)]
Merge branch 'devel-2.2'
* devel-2.2:
RAPI client: Support modifying instances
RAPI: Allow modifying instance
Small fixes for instance creation via RAPI documentation
gnt-debug: Extend job queue tests
jqueue: Mark opcodes following failed ones as failed, too
jqueue: Work around race condition between job processing and archival
jqueue: More checks for cancelling queued job
errors: Function to check whether value is encoded error
jqueue: Add more debug output
gnt-backup: Pass error code to OpPrereqError
Fix --master-netdev arg name in gnt-cluster(8)
Restore 'tablet mouse on vnc' behavior
Document the usb_mouse hv parameter
Revert "Add -usbdevice tablet to KVM when using vnc"
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Guido Trotter [Wed, 18 Aug 2010 16:44:12 +0000 (17:44 +0100)]
Merge branch 'devel-2.1' into devel-2.2
* devel-2.1:
Fix --master-netdev arg name in gnt-cluster(8)
Restore 'tablet mouse on vnc' behavior
Document the usb_mouse hv parameter
Revert "Add -usbdevice tablet to KVM when using vnc"
Conflicts:
man/gnt-instance.sgml
- merge
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Wed, 18 Aug 2010 08:58:41 +0000 (10:58 +0200)]
Fix some small newline style issues
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Aug 2010 16:50:37 +0000 (18:50 +0200)]
RAPI client: Support modifying instances
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Aug 2010 16:50:11 +0000 (18:50 +0200)]
RAPI: Allow modifying instance
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Aug 2010 15:16:38 +0000 (17:16 +0200)]
Small fixes for instance creation via RAPI documentation
- Inconsistencies
- Missing types
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Aug 2010 10:22:43 +0000 (12:22 +0200)]
gnt-debug: Extend job queue tests
Test multiple opcodes, also with failure.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Aug 2010 13:52:11 +0000 (15:52 +0200)]
jqueue: Mark opcodes following failed ones as failed, too
When an opcode fails, the job queue would leave following opcodes as “queued”,
which can be quite confusing. With this patch, they're all marked as failed and
assertions are added to check this.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Tue, 17 Aug 2010 13:33:52 +0000 (15:33 +0200)]
jqueue: Work around race condition between job processing and archival
This is a simplified version of a patch I sent earlier to make sure the job
file is only written once with a finalized status.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Fri, 13 Aug 2010 09:32:15 +0000 (11:32 +0200)]
rapi.client, http.client: Format url correctly when using IPv6
This patch moves the FormatAddress helper function from daemon.py to
netutils.py. This enables its use in http.client as well as in
rapi.client. Furthermore this adds functionality to format IPv6
addresses according to RFC 3986.
It is required for use of literal IPv6 addresses in URLs in pycurl.
For some reason it worked also without the bracketing ("["<address>"]"),
but we do not want to rely on that.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
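For "rapi.client, http.client: Format url correctly when using IPv6" above, the RFC 3986 bracketing can be illustrated with a small sketch. The function name and signature here are hypothetical and do not claim to match the FormatAddress helper that the patch moves to netutils.py.

```python
import socket


def format_address(family, address):
    """Format a (host, port) pair for use in a URL.

    IPv6 literals are wrapped in square brackets as required by
    RFC 3986, so that the port separator stays unambiguous.
    """
    host, port = address
    if family == socket.AF_INET6:
        host = "[%s]" % host
    if port is None:
        return host
    return "%s:%d" % (host, port)
```

With this sketch, an IPv6 RAPI endpoint would be written as `https://[2001:db8::1]:5080/2/info` instead of relying on the client library to cope with an unbracketed literal.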
Manuel Franceschini [Fri, 16 Jul 2010 14:23:07 +0000 (16:23 +0200)]
Support IPv6 in lib/http/server.py
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Manuel Franceschini [Wed, 28 Jul 2010 13:41:22 +0000 (15:41 +0200)]
Support for resolving hostnames to IPv6 addresses
This patch enables IPv6 name resolution by using socket.getaddrinfo
instead of socket.gethostbyname_ex.
It renames the HostInfo class to Hostname and unifies its use throughout
the code. This is achieved by using static calls where no object is
needed and removes some obsolete code.
For now, we just resolve to IPv4 addresses, but this will change once it
is needed.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Fri, 2 Jul 2010 11:24:48 +0000 (13:24 +0200)]
Always use address instead of hostname in rpc.Client
In light of the upcoming IPv6 support, this patch enables the rpc.Client
to always use a node's address to connect to it. This is necessary as we
do not want to rely on name resolution to connect to the correct IP
address on a dual-stack machine.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Manuel Franceschini [Fri, 2 Jul 2010 09:16:28 +0000 (11:16 +0200)]
cluster init: Write ssconf before noded starts
This change is needed as we will need to read the primary ip version
cluster parameter before we start the node daemon. The reason is that we
need to know in advance if we bind to the IPv4 or IPv6 any address.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Manuel Franceschini [Wed, 14 Jul 2010 12:23:53 +0000 (14:23 +0200)]
Introduce new IPAddress classes
This patch unifies the netutils functions dealing with IP addresses to
three classes:
- IPAddress: Common IP address functionality
- IPv4Address: IPv4-specific functionality
- IPv6Address: IPv6-specific functionality
Furthermore it adds methods to check whether an address is a loopback
address, replacing the .startswith("127") for IPv4 and adding IPv6
support.
It also provides the basis for future IPv6 address handling. Methods to
convert IP strings to their corresponding integer values will make it
possible to canonicalize IPv6 addresses.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
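The class layout from "Introduce new IPAddress classes" above might be sketched like this. The class names follow the commit message; the attribute and method names, and the use of integer comparison for the loopback check, are assumptions for the sake of the example.

```python
import socket


class IPAddress(object):
    """Common IP address functionality."""
    family = None
    bits = None
    loopback_net = None  # (network address, prefix length)

    @classmethod
    def to_int(cls, address):
        """Convert an address string to its integer value."""
        packed = socket.inet_pton(cls.family, address)
        return int.from_bytes(packed, "big")

    @classmethod
    def is_loopback(cls, address):
        """Check membership in the loopback network by comparing prefixes."""
        net, prefix = cls.loopback_net
        shift = cls.bits - prefix
        return cls.to_int(address) >> shift == cls.to_int(net) >> shift


class IPv4Address(IPAddress):
    """IPv4-specific functionality."""
    family = socket.AF_INET
    bits = 32
    loopback_net = ("127.0.0.0", 8)


class IPv6Address(IPAddress):
    """IPv6-specific functionality."""
    family = socket.AF_INET6
    bits = 128
    loopback_net = ("::1", 128)
```

Comparing integer values rather than strings means that different spellings of the same IPv6 address, such as `::1` and `0:0:0:0:0:0:0:1`, are treated alike, which a `.startswith()` test cannot do.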
Guido Trotter [Wed, 21 Jul 2010 15:31:33 +0000 (16:31 +0100)]
Add template 2.3 design doc
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Fri, 13 Aug 2010 17:18:36 +0000 (19:18 +0200)]
jqueue: More checks for cancelling queued job
We can also check when the lock status is updated. This will
improve job cancelling.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Aug 2010 17:19:07 +0000 (19:19 +0200)]
errors: Function to check whether value is encoded error
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 12 Aug 2010 11:08:46 +0000 (13:08 +0200)]
jqueue: Add more debug output
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Mon, 16 Aug 2010 13:53:41 +0000 (15:53 +0200)]
gnt-backup: Pass error code to OpPrereqError
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Tue, 17 Aug 2010 12:42:58 +0000 (14:42 +0200)]
Merge branch 'devel-2.1'
* devel-2.1:
Fix path in ganeti-rapi man page
Adjust message in case ~/.ssh is no directory
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 17 Aug 2010 12:18:34 +0000 (14:18 +0200)]
Re-add the 'live' parameter to migration opcodes
This patch reintroduces the live parameter, for backwards compatibility
at the Luxi level. This way, clients can work transparently with both
2.1 and 2.2, even though sub-optimally.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Thu, 12 Aug 2010 15:50:47 +0000 (11:50 -0400)]
Fix --master-netdev arg name in gnt-cluster(8)
This fixes Issue 114.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 9 Aug 2010 18:30:17 +0000 (14:30 -0400)]
Restore 'tablet mouse on vnc' behavior
We needed to revert commit 5b062a58ac76b39c2dc6a7e1543affdf43dc7ee7
because it was in conflict with the usb_mouse hv parameter. Here we
reintroduce its functionality only when usb_mouse is not specified.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 9 Aug 2010 16:19:07 +0000 (12:19 -0400)]
Document the usb_mouse hv parameter
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Guido Trotter [Mon, 9 Aug 2010 16:11:07 +0000 (12:11 -0400)]
Revert "Add -usbdevice tablet to KVM when using vnc"
This reverts commit 5b062a58ac76b39c2dc6a7e1543affdf43dc7ee7.
This fixes issue 109. The mouse type can be set with the usb_mouse
hv parameter.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Mon, 16 Aug 2010 13:54:26 +0000 (15:54 +0200)]
Fix path in ganeti-rapi man page
This takes care of issue 116.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 13 Aug 2010 10:26:31 +0000 (12:26 +0200)]
workerpool: Don't keep reference to task arguments
The workerpool should not keep any reference to task arguments after
they were processed by RunTask. Doing so led to jobs being cached
by the job queue's WeakValueDictionary for longer than they should've
been.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Michael Hanselmann [Fri, 13 Aug 2010 10:21:28 +0000 (12:21 +0200)]
cli.SubmitOpCode: Pass keyword parameter as keyword
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Michael Hanselmann [Tue, 10 Aug 2010 15:54:42 +0000 (17:54 +0200)]
gnt-backup: Don't show confusing message w/o target node
“gnt-backup export” requires the target node. Until now, the master
daemon would complain that the “parameter
'OP_BACKUP_EXPORT.target_node' fails validation”. With this patch,
an additional check is done in the client program.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Michael Hanselmann [Tue, 10 Aug 2010 14:41:50 +0000 (16:41 +0200)]
masterd.instance: Add missing argument
_DiskTransferPrivate takes three parameters, not two.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Michael Hanselmann [Tue, 10 Aug 2010 13:51:11 +0000 (15:51 +0200)]
Adjust message in case ~/.ssh is no directory
Use actual path, not something hardcoded.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 18:47:50 +0000 (20:47 +0200)]
RAPI client: Fix docstring for migrating instance
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 17:52:43 +0000 (19:52 +0200)]
QA: Test renaming instance via RAPI
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 17:33:31 +0000 (19:33 +0200)]
RAPI client: Support renaming instances
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 17:12:32 +0000 (19:12 +0200)]
Allow renaming instances via RAPI
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 18:44:07 +0000 (20:44 +0200)]
RAPI client: Don't re-use PycURL object
With this patch, a new PycURL object will be created for each request.
This should make the RAPI client safe for simultaneous calls from
multiple threads. Unittests are adjusted accordingly.
An unnecessary variable assignment is also removed from the unittest
script.
This patch survived a small QA and unittests.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Manuel Franceschini [Mon, 9 Aug 2010 15:49:23 +0000 (17:49 +0200)]
Add --no-name-check to 'gnt-instance rename' man page
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Manuel Franceschini [Thu, 5 Aug 2010 12:56:12 +0000 (14:56 +0200)]
Fix bug in bdev when drbd version format is x.x.x.x
This patch fixes a bug reported in [0]. Newer drbd versions can have
another digit beyond the regular major, minor and point release digits.
We modify the regex to match that with an optional part which is
not saved.
Furthermore it adds unittests that test for these different cases. Now
the data read from /proc is passed into the _GetVersion method, which
makes testing easier.
[0] http://code.google.com/p/ganeti/issues/detail?id=110
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
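The regex change in "Fix bug in bdev when drbd version format is x.x.x.x" above can be illustrated with a short sketch. The pattern, the function name and the sample input lines are assumptions for this example; passing the data in as a parameter mirrors the testing approach described in the commit message.

```python
import re

# The fourth component is matched by a non-capturing group, so it is
# accepted but not saved.
_VERSION_RE = re.compile(r"^version: (\d+)\.(\d+)\.(\d+)(?:\.\d+)?\s")


def parse_drbd_version(proc_data):
    """Extract (major, minor, point) from the lines read from /proc/drbd."""
    first_line = proc_data[0] if proc_data else ""
    match = _VERSION_RE.match(first_line)
    if not match:
        raise ValueError("cannot parse DRBD version from %r" % (first_line,))
    return tuple(int(part) for part in match.groups())
```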
Iustin Pop [Fri, 30 Jul 2010 16:11:12 +0000 (12:11 -0400)]
Bump version to 2.2.0~rc0
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 14:32:56 +0000 (16:32 +0200)]
move-instance: Use constants for parameters
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 14:32:33 +0000 (16:32 +0200)]
Allow instance NIC's IP address to be None
Also add some assertions.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 14:32:00 +0000 (16:32 +0200)]
Test instance NIC and disk parameter names
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 30 Jul 2010 14:31:26 +0000 (16:31 +0200)]
Add new parameter type “maybe string”
Before strict checking was implemented, NIC IP addresses could be set
to “None”. Commit bd061c35 added more strict checking, including
enforcing the IP address to be a string. With this new type, it
can again be set to None.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 29 Jul 2010 15:55:38 +0000 (17:55 +0200)]
cmdlib: Change expected type for source CA on remote import
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Thu, 29 Jul 2010 15:55:14 +0000 (17:55 +0200)]
move-instance: Pass OS parameters to new instance
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Fri, 30 Jul 2010 14:15:48 +0000 (10:15 -0400)]
Update NEWS file for the first release candidate
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 29 Jul 2010 23:00:19 +0000 (19:00 -0400)]
Fix a few job archival issues
This patch fixes two issues with job archival. First, the
LoadJobFromDisk can return 'None' for no-such-job, and we shouldn't add
None to the job list; we can't anyway, as this raises an exception:
node1# gnt-job archive foo
Unhandled protocol error while talking to the master daemon:
Caught exception: cannot create weak reference to 'NoneType' object
After fixing this, job archival of missing jobs will just continue
silently, so we modify gnt-job archive to log jobs which were not
archived and to return exit code 1 for any missing jobs.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
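The first problem described in "Fix a few job archival issues" above comes from storing None in a weak-value mapping. A minimal sketch of the guard follows, with a stand-in Job class and a hypothetical remember_jobs helper (neither is Ganeti code):

```python
import weakref


class Job(object):
    """Stand-in for a queued job; only the ID matters here."""
    def __init__(self, job_id):
        self.id = job_id


def remember_jobs(memcache, loaded):
    """Store loaded jobs in a WeakValueDictionary, skipping missing ones.

    'loaded' is an iterable of (job_id, job_or_None) pairs. Returns the
    list of job IDs that could not be found, so the caller can report
    them and exit with a non-zero status.
    """
    missing = []
    for job_id, job in loaded:
        if job is None:
            # None cannot be weakly referenced; storing it would raise
            # TypeError.
            missing.append(job_id)
            continue
        memcache[job_id] = job
    return missing
```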
Iustin Pop [Thu, 29 Jul 2010 22:37:10 +0000 (18:37 -0400)]
burnin: fix handling of empty job sets
If we call burnin with only existing instances, then it will fail to
create any of them, and thus in the removal phase it won't have anything
to remove. Since calling luxi.SUBMIT_MULTIPLE_JOBS with an empty job set
is an error (and will raise an exception), this creates a very strange
error in burnin (which is unfortunately hidden by ExecJobSet()).
As such, we modify CommitQueue to return immediately if it has an empty
op queue.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 29 Jul 2010 22:13:58 +0000 (18:13 -0400)]
Change semantics of --force-multi for reinstall
Currently, we require both --force and --force-multiple for skipping the
confirmation on instance reinstalls. After offline conversations, this
has been deemed to be excessive, and this patch changes the meaning of
--force-multiple to be a “stronger” force, and not require both.
So, to skip the prompts:
- single instance reinstallation requires either --force or
--force-multiple
- multiple instance reinstallation requires --force-multiple
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
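The new rules from "Change semantics of --force-multi for reinstall" above reduce to a small predicate. The function below is a hypothetical sketch of that logic, not the code in gnt-instance:

```python
def needs_confirmation(instance_count, force, force_multiple):
    """Return True if the user must still be prompted before a reinstall.

    --force-multiple acts as a stronger --force: it is sufficient on its
    own in every case, while --force only covers a single instance.
    """
    if instance_count > 1:
        return not force_multiple
    return not (force or force_multiple)
```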
Iustin Pop [Thu, 29 Jul 2010 21:14:19 +0000 (17:14 -0400)]
Change handling of non-Ganeti errors in jqueue
Currently, if a job execution raises a Ganeti-specific error (i.e.
subclass of GenericError), then we encode it as (error class, [error
args]). This matches the RAPI documentation.
However, if we get a non-Ganeti error, then we encode it as simply
str(err), a single string. This means that the opresult field is not
according to the RAPI docs, and thus it's hard to reliably parse the
job results.
This patch changes the encoding of a failed job (via failure) to always
be an OpExecError, so that we always encode it properly. For the command
line interface, the behaviour is the same, as any non-Ganeti errors get
re-encoded as OpExecError anyway. For the RAPI clients, it only means
that we always present the same type for results. The actual error value
is the same, since the err.args is either way str(original_error);
compare the original (doesn't contain the ValueError):
"opresult": [
"invalid literal for int(): aa"
],
with:
"opresult": [
[
"OpExecError",
[
"invalid literal for int(): aa"
]
]
],
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
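The encoding rule from "Change handling of non-Ganeti errors in jqueue" above can be sketched as follows. The exception classes here are local stand-ins with the names used in the commit message, and encode_exception is a hypothetical helper, not the function used by the job queue.

```python
class GenericError(Exception):
    """Stand-in for the base class of Ganeti-specific errors."""


class OpPrereqError(GenericError):
    """Stand-in for a Ganeti-specific error."""


class OpExecError(GenericError):
    """Stand-in for the error used to wrap everything else."""


def encode_exception(err):
    """Encode any exception as (error class name, [error args]).

    Non-Ganeti errors are first wrapped in OpExecError, so the result
    always has the same shape.
    """
    if not isinstance(err, GenericError):
        err = OpExecError(str(err))
    return (err.__class__.__name__, list(err.args))
```

A client can then always unpack the result as a (class name, args) pair, regardless of what went wrong inside the job.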
Iustin Pop [Thu, 29 Jul 2010 21:41:24 +0000 (17:41 -0400)]
Implement gnt-cluster master-ping
This can be used from shell scripts to quickly check the status of the
master node, before launching a series of jobs (and handling the failure
of the jobs due to masterd or other issues).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Thu, 29 Jul 2010 18:41:09 +0000 (14:41 -0400)]
Instance migration: remove error on missing link
Since we don't support upgrades from 1.2.4 without restarting the
instance, the 'not restarted since 1.2.5' check/error is
wrong/misleading.
Since the live migration works without the links (it recreates
them during the disk reconfiguration anyway), we remove the error and
transform it into a warning (to the node daemon log only,
unfortunately).
For 2.3, we'll need to change the symlink creation from instance start
time to disk activation time (but that requires more RPC changes).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Michael Hanselmann [Thu, 29 Jul 2010 16:08:11 +0000 (18:08 +0200)]
Add check for RAPI paths to start with /2
During a discussion in July 2010 it was decided that we'll stabilize on /2. See
message ID <20100716180012.GA9423@google.com> for reference.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
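A check like the one in "Add check for RAPI paths to start with /2" above could be written roughly as below. The function name and the sample paths used in the usage note are illustrative assumptions, and the real check may allow exceptions that this sketch does not.

```python
def find_unversioned_paths(paths, prefix="/2"):
    """Return all resource paths that are not under the given prefix."""
    return [path for path in paths
            if not (path == prefix or path.startswith(prefix + "/"))]
```

For example, `find_unversioned_paths(["/2/instances", "/nodes"])` would report `/nodes` as the offending entry.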
Michael Hanselmann [Thu, 29 Jul 2010 13:18:47 +0000 (15:18 +0200)]
Ensure assertions are evaluated in tests
A lot of assertions are used in Ganeti's code. Some unittests even check
whether AssertionError is raised in some cases. Explicitly ensuring
assertions are evaluated makes sure those tests don't fail and
assertions are checked.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
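One way to verify what "Ensure assertions are evaluated in tests" above is concerned with is a small runtime probe. This is a generic Python sketch with a hypothetical function name, not the mechanism used by the patch:

```python
def assertions_enabled():
    """Return True if 'assert' statements are evaluated by the interpreter.

    When Python runs with -O, assert statements are compiled away, so the
    probe below does not raise and the function returns False.
    """
    try:
        assert False
    except AssertionError:
        return True
    return False
```

A test runner could call this once at startup and refuse to continue if it returns False.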
David Knowles [Tue, 20 Jul 2010 21:46:13 +0000 (17:46 -0400)]
RAPI client: The os argument for instance reinstalls is optional
Signed-off-by: David Knowles <dknowles@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Fri, 16 Jul 2010 17:44:06 +0000 (19:44 +0200)]
QA: Test instance migration via CLI and RAPI
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 16 Jul 2010 17:43:50 +0000 (19:43 +0200)]
RAPI client: Support migrating instances
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Fri, 16 Jul 2010 17:43:28 +0000 (19:43 +0200)]
RAPI: Support migrating instances
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Sat, 17 Jul 2010 21:04:32 +0000 (23:04 +0200)]
workerpool: Change signature of AddTask function to not use *args
By changing it to a normal parameter, which must be a sequence, we can
start using keyword parameters.
Before this patch all arguments to “AddTask(self, *args)” were passed as
arguments to the worker's “RunTask” method. Priorities, which should be
optional and will be implemented in a future patch, must be passed as a keyword
parameter. This means “*args” can no longer be used as one can't combine *args
and keyword parameters in a clean way:
>>> def f(name=None, *args):
... print "%r, %r" % (args, name)
...
>>> f("p1", "p2", "p3", name="thename")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: f() got multiple values for keyword argument 'name'
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
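The signature change from "workerpool: Change signature of AddTask function to not use *args" above can be sketched like this. The class is a toy stand-in, and the priority keyword is only a placeholder for the future parameter mentioned in the commit message.

```python
class WorkerPool(object):
    """Toy stand-in that only records tasks instead of running them."""

    def __init__(self):
        self._tasks = []

    def add_task(self, args, priority=None):
        """Queue a task.

        'args' must be a sequence; its items are later passed to the
        worker's RunTask method. Because it is a normal parameter,
        keyword parameters such as 'priority' can be added cleanly.
        """
        if not isinstance(args, (tuple, list)):
            raise TypeError("args must be a tuple or list, got %r" % (args,))
        self._tasks.append((priority, tuple(args)))
```

A call such as `pool.add_task(("p1", "p2", "p3"), priority=5)` is then unambiguous, unlike the `*args` variant shown in the traceback above.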
Michael Hanselmann [Sat, 17 Jul 2010 21:00:56 +0000 (23:00 +0200)]
workerpool: Add two additional assertions
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Sat, 17 Jul 2010 20:58:09 +0000 (22:58 +0200)]
workerpool: Additional check in BaseWorker.ShouldTerminate
Document that it should only be called from within RunTask and
add an assertion for this. This means we can no longer use a
method on the pool and hence remove WorkerPool.ShouldWorkerTerminate.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Sat, 17 Jul 2010 20:56:18 +0000 (22:56 +0200)]
workerpool: Remove unused worker method
HasRunningTask is never used except for an assertion, where we
don't really need the lock.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Sat, 17 Jul 2010 20:32:43 +0000 (22:32 +0200)]
workerpool: Move waiting for new tasks for a worker to the pool
This way fewer private variables of the pool are accessed by the worker.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Michael Hanselmann [Sat, 17 Jul 2010 20:25:27 +0000 (22:25 +0200)]
workerpool: Use common function to add tasks
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Wed, 28 Jul 2010 23:39:53 +0000 (19:39 -0400)]
Fix install document regarding DRBD usage
This is related to issue 105.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Wed, 28 Jul 2010 21:17:10 +0000 (17:17 -0400)]
Update RAPI documentation for the OS changes
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Tue, 27 Jul 2010 21:07:20 +0000 (17:07 -0400)]
Rename masterfailover to master-failover
Most (all?) of our commands use dash-separator: replace-disks,
verify-disks, add-tags, etc. “gnt-cluster masterfailover” is an old
exception to this rule.
The patch replaces it with master-failover, adds a compatibility alias,
and updates the documentation for this change.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 28 Jul 2010 18:26:04 +0000 (14:26 -0400)]
RAPI: Add os params to instance creation v1
Since the RAPI QA suite doesn't seem to offer easy testing of failed
creations, I didn't add this to the QA. Pointers on how to do it are
welcome.
The patch also changes the 'os' argument to be required, since that is
how the LU expects it, and without it we just fail later instead of
directly at submission time.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 28 Jul 2010 18:24:23 +0000 (14:24 -0400)]
makefile: fix TAGS building
“find .” requires that “-path” arguments start with a dot, otherwise
they are not matched. Additionally, we also include the QA files in the
tags, for easier search while modifying the QA suite.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 28 Jul 2010 20:08:29 +0000 (16:08 -0400)]
Improve handling of lost jobs
Currently, if the cli.JobExecutor class is being used, and one of the
jobs is archived before it can check its result, it will raise a
stack trace, as _ChooseJob is not prepared to handle this case.
This patch makes JobExecutor work better with lost jobs (it still reports
them as 'failed', but it doesn't break and returns a proper error
message), and modifies the generic FormatError to report the JobLost
exception properly, instead of as "Unhandled Ganeti Exception".
Since JobExecutor is hard to test properly, I only tested this manually,
via a fake invocation.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Wed, 28 Jul 2010 15:50:38 +0000 (11:50 -0400)]
luxi: convert permission errors into exception
This patch adds handling of permission errors so that we don't show
tracebacks when a non-root user runs a gnt-* command. Since in the
future we'll have different permissions, we need to handle this in RAPI
too.
It also fixes a typo in RAPI error message and the docstrings of LUXI
errors.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Michael Hanselmann [Mon, 19 Jul 2010 14:40:38 +0000 (16:40 +0200)]
cmdlib: Return new name from rename operations
The new name is then displayed by the clients.
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Manuel Franceschini [Wed, 28 Jul 2010 14:58:19 +0000 (16:58 +0200)]
gnt-instance rename: Fix bug and rename params
This patch fixes a bug when gnt-instance rename was invoked with
--no-name-check. It renames the internal variables to be consistent with
the ones in the equivalent instance add code. Furthermore it checks whether
an instance rename is invoked with --no-name-check but without
--no-ip-check and throws an exception if so.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Iustin Pop [Sat, 24 Jul 2010 00:11:00 +0000 (20:11 -0400)]
QA: add tests for the reserved lvs feature
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 23 Jul 2010 23:28:29 +0000 (19:28 -0400)]
Add modification of the reserved logical volumes
This doesn't allow addition/removal of individual volumes, only
wholesale replace of the entire list. It can be improved later, if we
ever get generic container parameters.
The man page change replaces some tabs with spaces (hence the
whitespace changes).
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 15 Apr 2010 15:08:40 +0000 (17:08 +0200)]
Add printing of reserved_lvs in cluster info
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 15 Apr 2010 15:07:03 +0000 (17:07 +0200)]
Introduce a new cluster parameter - reserved_lvs
This parameter, which is a list of regular expression patterns, will
make cluster verify ignore any such LVs. It will not prevent creation or
removal of such volumes by the backend code.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 23 Jul 2010 19:29:31 +0000 (15:29 -0400)]
Change the meaning of call_node_start_master
Currently, backend.StartMaster (the function behind this RPC call) will
activate the master IP and then, if the start_daemons parameter is true,
it will also activate the master role.
While this works, it has two issues:
- first, it will activate the master IP unconditionally, even if this
node will not start the master daemon due to missing votes
- second, the activation of the IP is done twice if start_daemons is
true, because the master daemon does its own activation too
This behaviour seems to be unmodified since Summer 2008, so probably any
rationale on why this is done in two places is forgotten.
The patch changes this so that the function does *either* IP activation or
master role activation but not both. So the IP will be activated only
once (from the master daemon or from LURenameCluster), and it will only
be done if the masterd got enough votes for startup.
I can see only one downside to this change: if masterd won't actually
start (due to missing votes), RAPI will still start, and without the
master IP activated. But this is no worse than before, when both RAPI
was running and the IP was activated.
Note that the behaviour of StopMaster remains the same, as no one else
does the IP removal.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 23 Jul 2010 18:12:16 +0000 (14:12 -0400)]
masterd: move the IP activation from Exec to Check
Currently, the master IP activation is done in the Exec function. Since
the original masterd process returns after forking, and Exec is run in
the (grand)child process, this means that after 'ganeti-masterd' has
returned there are still initialization tasks running.
Normally this is not a problem, but in cases where one does quick master
failovers, this creates a race condition which hits the QA scripts
especially hard.
To solve this, and make the startup process cleaner (the system is in
steady state after the command has returned, even though masterd startup
could still fail), we move the IP activation to Check(). This also
allows error messages about the IP activation to be seen on the console.
With this patch enabled, I can no longer reproduce the double-failover
errors, which were occurring before in 4/5 cases.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Fri, 23 Jul 2010 17:51:44 +0000 (13:51 -0400)]
Move the UsesRPC decorator from cli to rpc
This is needed because not just the cli scripts need this decorator, but
the master daemon too (and it already duplicated the code once).
In cli.py we just leave a stub, so that we don't have to modify all the
scripts to import rpc.py.
We then change the master daemon code to reuse this decorator, instead
of duplicating it.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
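
A decorator of the kind moved by the UsesRPC commit above typically wraps a function with setup and teardown. The sketch below assumes that shape; `init_rpc`, `shutdown_rpc` and `runs_with_rpc` are hypothetical names, and the real decorator's body is not shown in this log.

```python
import functools

events = []  # records calls so the ordering is visible


def init_rpc():
    """Hypothetical stand-in for RPC layer initialisation."""
    events.append("init")


def shutdown_rpc():
    """Hypothetical stand-in for RPC layer shutdown."""
    events.append("shutdown")


def runs_with_rpc(fn):
    """Run fn with the RPC layer set up, tearing it down afterwards."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        init_rpc()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs even if fn raises, so the layer is never left open.
            shutdown_rpc()
    return wrapper


@runs_with_rpc
def list_nodes():
    events.append("work")
    return ["node1", "node2"]


result = list_nodes()
print(events)
```

Keeping a thin stub under the old name in the original module, as the commit does for cli.py, lets existing callers keep importing it unchanged.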
Iustin Pop [Fri, 23 Jul 2010 21:41:35 +0000 (17:41 -0400)]
watcher: smarter handling of instance records
This patch implements a few changes to the instance handling. First, old
instances which no longer exist on the cluster are removed from the
state file, to keep things clean.
Second, the instance restart counters are reset every 8 hours, since
some error cases might be transient (e.g. networking issues, or machine
temporarily down), and if the problem takes more than 5 restarts but is
not permanent, watcher will not restart the instance. The value of 8
hours is, I think, both conservative (so as not to hammer the cluster too
often with restarts) and fast enough to clear semi-transient problems.
And last, if an instance is not restarted due to exhausted retries, the
watcher now warns about it; otherwise it's hard to understand why watcher
doesn't want to restart an ERROR_down instance.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
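
The record handling described in the watcher commit above can be sketched like this. The state layout (a dict of name to `restarts` and `last_attempt`), the function names, and the exact comparison against the limit are assumptions made for illustration; only the 8 hour expiry and the limit of 5 come from the commit message.

```python
MAX_RESTARTS = 5            # restart limit mentioned in the commit
RETRY_EXPIRATION = 8 * 3600  # counters are reset after 8 hours


def cleanup_state(state, live_instances, now):
    """Drop records of vanished instances and expire old counters."""
    for name in list(state):
        if name not in live_instances:
            # Instance no longer exists on the cluster.
            del state[name]
        elif now - state[name]["last_attempt"] > RETRY_EXPIRATION:
            # The last attempt is old enough; start counting afresh.
            del state[name]


def should_restart(state, name):
    """Return True while the instance still has retries left."""
    record = state.get(name, {"restarts": 0})
    return record["restarts"] < MAX_RESTARTS


now = 9 * 3600
state = {
    "gone": {"restarts": 1, "last_attempt": now - 60},
    "old": {"restarts": 5, "last_attempt": 0},
    "recent": {"restarts": 5, "last_attempt": now - 60},
}
cleanup_state(state, {"old", "recent"}, now)
for name in ("old", "recent"):
    if not should_restart(state, name):
        print("warning: not restarting %s, retries exhausted" % name)
```

In this run only "recent" triggers the warning: "old" had its counter expired and "gone" was removed from the state altogether.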
Iustin Pop [Thu, 22 Jul 2010 17:54:17 +0000 (13:54 -0400)]
Update the RAPI node migrate for the 'live' change
This patch adds handling of the new 'mode' parameter to the RAPI server,
while keeping compatibility with the old mode. Note that in the old mode
(when 'live' is being passed), the auto-mode doesn't work.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
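
The compatibility handling described in the RAPI commit above amounts to mapping the legacy boolean onto the new tri-state value. A minimal sketch, assuming the query arguments arrive as a dict and that the mode strings are "live" and "non-live"; the function name and the error raised on conflicting arguments are hypothetical:

```python
MIGRATION_LIVE = "live"         # assumed constant value
MIGRATION_NONLIVE = "non-live"  # assumed constant value


def parse_migration_mode(query_args):
    """Return the migration mode, or None to let the server choose."""
    if "live" in query_args and "mode" in query_args:
        raise ValueError("only one of 'live' and 'mode' may be given")
    if "live" in query_args:
        # Legacy boolean: it can only express two of the three states,
        # which is why the automatic mode is unreachable this way.
        if query_args["live"]:
            return MIGRATION_LIVE
        return MIGRATION_NONLIVE
    # New style: an absent 'mode' means automatic selection.
    return query_args.get("mode")


print(parse_migration_mode({"live": 0}))
print(parse_migration_mode({"mode": "live"}))
print(parse_migration_mode({}))
```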
Iustin Pop [Thu, 22 Jul 2010 17:42:01 +0000 (13:42 -0400)]
Update the RAPI client for the migration mode
See the discussion on the previous patch about this. Basically unless we
want to add a new 'feature' marking for the live migration parameter,
there is no simple way to handle this nicely in the client.
Given that the client was/is marked as experimental, this patch simply
replaces live with mode. This means that this client won't work with 2.1
clusters…
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Thu, 22 Jul 2010 17:38:26 +0000 (13:38 -0400)]
Fix burnin and live migration
This is breakage from the original 'live' parameter changes.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Tue, 20 Jul 2010 16:26:44 +0000 (18:26 +0200)]
Rename the OpMigrate* parameter 'live' to 'mode'
This is needed as now the parameter is no longer boolean, but tri-state.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Iustin Pop [Mon, 19 Jul 2010 14:57:59 +0000 (16:57 +0200)]
Rename migration type to migration mode
This is in preparation for the rename of the opcode 'live' parameter to
'mode'.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Manuel Franceschini [Thu, 22 Jul 2010 16:09:59 +0000 (18:09 +0200)]
utils: Fix incorrect docstring
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Iustin Pop [Thu, 22 Jul 2010 17:21:56 +0000 (13:21 -0400)]
Merge branch 'devel-2.1' into master
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Iustin Pop [Thu, 22 Jul 2010 17:00:15 +0000 (13:00 -0400)]
Fix issue when changing the disk template to drbd
If we pass the current primary node, the conversion will fail horribly
with LVM creation errors. Instead, we catch and check for this
condition in CheckPrereq.
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Guido Trotter [Wed, 21 Jul 2010 15:27:32 +0000 (16:27 +0100)]
Remove a couple of empty design sections
The 2.1 and 2.2 designs contain sections with no actual content, as they
are detailed for each single change. Removing the global empty ones.
Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>
Manuel Franceschini [Wed, 21 Jul 2010 09:29:40 +0000 (11:29 +0200)]
Disable 'invalid name' pylint warning for tools/setup-ssh
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
Manuel Franceschini [Mon, 19 Jul 2010 19:07:57 +0000 (21:07 +0200)]
Always set commonName in X509 certificates
Due to the current switch of the RPC client to PycURL, a bug with newer
versions of libcurl surfaced. When the 'Subject' or 'Issuer' of
'server.pem' were empty, SSL handshake failed.
This patch changes the certificate generation functions such that they
always use "ganeti.example.com" as commonName (CN) for 'Subject' and
'Issuer'.
Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
René Nussbaumer [Tue, 13 Jul 2010 09:38:36 +0000 (11:38 +0200)]
Adding tool to setup SSH on a remote host
This prepares the remote node to be joined into a cluster
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
René Nussbaumer [Wed, 14 Jul 2010 09:04:35 +0000 (11:04 +0200)]
Adding new (optional) dependency to configure.ac
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
René Nussbaumer [Tue, 13 Jul 2010 14:29:49 +0000 (16:29 +0200)]
Adding constants for setup-ssh
Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>