ganeti-local
14 years agoReuse backend parameters from export
Iustin Pop [Fri, 9 Apr 2010 15:21:24 +0000 (17:21 +0200)]
Reuse backend parameters from export

Similar to the previous patches, if we're missing some parameters and
the export has them (either in the new style or old-style), we reuse
them.
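
The fallback described above can be sketched roughly as follows. This is
a minimal illustration, not the Ganeti code: the helper name
fill_missing_params and the sample keys are hypothetical, and a
parameter is assumed to count as "missing" when it is absent or None.

```python
def fill_missing_params(requested, exported):
    """Return the requested params, falling back to exported values.

    Hypothetical helper: a key is treated as missing when it is absent
    from 'requested' or explicitly set to None.
    """
    merged = dict(exported)
    for key, value in requested.items():
        if value is not None:
            merged[key] = value
    return merged


# Sample data invented for the illustration.
user_params = {"memory": 512, "vcpus": None}
export_params = {"memory": 256, "vcpus": 2, "auto_balance": True}
print(fill_missing_params(user_params, export_params))
```

Values given explicitly by the user always win; the export is only
consulted for what the user left out.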

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoReuse disk information from export
Iustin Pop [Fri, 9 Apr 2010 15:09:04 +0000 (17:09 +0200)]
Reuse disk information from export

If the user doesn't pass the disk information on import, automatically
reuse the number and size of disks. This loses the iv_name attribute,
but that is only cosmetic and cannot be changed by the user.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoReuse hypervisor parameters in import
Iustin Pop [Fri, 9 Apr 2010 14:49:04 +0000 (16:49 +0200)]
Reuse hypervisor parameters in import

If available, we reuse the parameters from the export info.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRead disk template from export info
Iustin Pop [Fri, 9 Apr 2010 12:58:32 +0000 (14:58 +0200)]
Read disk template from export info

This patch changes the instance import to read the disk template
automatically from the export info, if the opcode doesn't already
specify a disk template.

To do this, we have a couple of additional changes:

- change from required parameter to optional one for disk_template
- move check for disabled file storage at ./configure time to the
  generic _CheckDiskTemplate checker
- move checks of the disk template from CheckArguments to CheckPrereq

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoCreateInstance: separate the reading of the export
Iustin Pop [Fri, 9 Apr 2010 12:07:38 +0000 (14:07 +0200)]
CreateInstance: separate the reading of the export

We move the reading of the export to a separate function, to simplify
CheckPrereq and also read it earlier. This will allow building the
missing opcode parameters from the export information, instead of
requiring all of them on the command line.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMove code from ExpandNames to CheckPrereq
Iustin Pop [Fri, 9 Apr 2010 11:49:34 +0000 (13:49 +0200)]
Move code from ExpandNames to CheckPrereq

This is needed since only in CheckPrereq we have the nodes locked, and
future import enhancements will need to have access to the export info
during the parameter build.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoCreateInstance: Move some code to CheckArguments
Iustin Pop [Fri, 9 Apr 2010 11:23:39 +0000 (13:23 +0200)]
CreateInstance: Move some code to CheckArguments

ExpandNames holds too much non-locking code (first LU to be converted to
ExpandNames, and we didn't have CheckArguments at that point), and this
patch moves the checks that are lock-independent to CheckArguments.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoExport more instance parameters in instance export
Iustin Pop [Fri, 9 Apr 2010 09:50:51 +0000 (11:50 +0200)]
Export more instance parameters in instance export

Currently the backend parameters are not exported automatically, but
only a few directly in the '[instance]' section. Hypervisor type and
hypervisor parameters are not exported at all.

This patch creates two separate sections for the be and hv parameters,
and stores the parameters (including ones that come from the cluster
defaults, but not the hypervisor globals for example) in the export.
The import code is not changed yet.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoExport the nicparams too during instance export
Iustin Pop [Fri, 29 Jan 2010 12:16:06 +0000 (13:16 +0100)]
Export the nicparams too during instance export

The patch tries to export all params (based on the dict defined in
constants), using None for missing keys.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoHandle errors better for wrong nic_count in export
Iustin Pop [Fri, 29 Jan 2010 12:14:51 +0000 (13:14 +0100)]
Handle errors better for wrong nic_count in export

This fixes an old 'FIXME' entry.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoQA: Make sure RAPI credentials are setup on cluster init
René Nussbaumer [Mon, 12 Apr 2010 12:48:19 +0000 (14:48 +0200)]
QA: Make sure RAPI credentials are setup on cluster init

This patch makes sure that the Ganeti RAPI credentials, if any, are
set up at cluster init time.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoburnin: only remove instances we actually added
Guido Trotter [Fri, 9 Apr 2010 14:06:13 +0000 (15:06 +0100)]
burnin: only remove instances we actually added

Currently burnin, if proceeding in parallel, will remove all instances
which were passed, even if they failed to add. This is bad because it
will also remove instances which existed before burnin started. By
adding the instances to the removal queue only if their creation was
successful (passing the action as a post processing action to
ExecOrQueue) we guarantee pre-existing instances are saved.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoburnin.ExecOrQueue: add post-process function
Guido Trotter [Sat, 27 Mar 2010 10:37:51 +0000 (10:37 +0000)]
burnin.ExecOrQueue: add post-process function

If a post-process function is passed to ExecOrQueue it is executed if
and only if the job is successful. This happens immediately if we're
proceeding iteratively, and at the end, when we collect all job results,
if we're proceeding in parallel.
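
A rough sketch of the behaviour described above, assuming a toy runner
rather than the real burnin class: JobRunner, exec_or_queue and
commit_queue are hypothetical names, and jobs are plain callables that
raise on failure.

```python
from functools import partial


class JobRunner:
    """Toy stand-in for burnin's job handling (not the real class)."""

    def __init__(self, parallel):
        self.parallel = parallel
        self.queued = []

    def exec_or_queue(self, name, job, post_process=None):
        """Run the job now (iterative mode) or queue it (parallel mode)."""
        if self.parallel:
            self.queued.append((name, job, post_process))
        else:
            self._run(name, job, post_process)

    def commit_queue(self):
        """Run all queued jobs, post-processing only the successful ones."""
        queued, self.queued = self.queued, []
        for name, job, post_process in queued:
            self._run(name, job, post_process)

    @staticmethod
    def _run(name, job, post_process):
        try:
            job()
        except Exception as err:
            print("job %s failed: %s" % (name, err))
            return
        if post_process is not None:
            post_process()


def failing_job():
    raise RuntimeError("creation failed")


to_remove = []
runner = JobRunner(parallel=True)
runner.exec_or_queue("inst1", lambda: None,
                     post_process=partial(to_remove.append, "inst1"))
runner.exec_or_queue("inst2", failing_job,
                     post_process=partial(to_remove.append, "inst2"))
runner.commit_queue()
print(to_remove)
```

Only instances whose creation job succeeded end up in the removal list,
which is the property the burnin fix relies on.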

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoburnin.ExecOrQueue: remove variable argument list
Guido Trotter [Fri, 9 Apr 2010 13:12:55 +0000 (14:12 +0100)]
burnin.ExecOrQueue: remove variable argument list

In order to later add an optional parameter we transform the variable
ops argument list into an explicit list.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix new pylint errors
Guido Trotter [Fri, 9 Apr 2010 14:46:13 +0000 (15:46 +0100)]
Fix new pylint errors

Under squeeze pylint reports the following errors:
************* Module ganeti.serializer
E1103:155:LoadSignedJson: Instance of 'False' has no 'get' member (but some types could not be inferred)
************* Module ganeti-masterd
E1103:166:ClientRqHandler.handle: Instance of 'False' has no 'get' member (but some types could not be inferred)
E1103:167:ClientRqHandler.handle: Instance of 'False' has no 'get' member (but some types could not be inferred)
************* Module gnt-instance
E1103:431:BatchCreate: Instance of 'False' has no 'keys' member (but some types could not be inferred)

For the first two cases it's actually wrong: we had checked before that
the variable on which "get" is called is actually a dict. In the third
case though such a check doesn't exist, so we add it. Then we silence the
error all three times.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRename the confd_client unittest (to confd.client)
Iustin Pop [Fri, 9 Apr 2010 08:40:30 +0000 (10:40 +0200)]
Rename the confd_client unittest (to confd.client)

This is to keep the same naming across all tests (modules separated by
dots, followed by _unittest.py).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMake watcher request the max coverage
Iustin Pop [Thu, 8 Apr 2010 16:15:57 +0000 (18:15 +0200)]
Make watcher request the max coverage

Since the actions are potentially destructive, we should try to get a
consistent view of the cluster, so it's better to get the most coverage
possible.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoConfdClient.SendRequest: allow max coverage
Iustin Pop [Thu, 8 Apr 2010 16:08:51 +0000 (18:08 +0200)]
ConfdClient.SendRequest: allow max coverage

This patch changes the coverage parameter to allow specification of max
coverage (via -1), versus auto-computation (default, 0) and manual
specification.
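
The three modes can be pictured with a small sketch. The function name
resolve_coverage, the cap used for the automatic case and the error
handling are assumptions made for the illustration, not the actual
ConfdClient logic.

```python
def resolve_coverage(coverage, peer_count, auto_cap=6):
    """Translate the coverage argument into a number of peers to query.

    -1 selects every peer, 0 lets the client choose, and a positive
    value is taken as given. The automatic rule (all peers, capped at
    'auto_cap') is an assumption for this sketch.
    """
    if coverage == -1:
        return peer_count
    if coverage == 0:
        return min(peer_count, auto_cap)
    if 0 < coverage <= peer_count:
        return coverage
    raise ValueError("invalid coverage %r for %d peers"
                     % (coverage, peer_count))


print(resolve_coverage(-1, 10), resolve_coverage(0, 10),
      resolve_coverage(3, 10))
```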

Unittests are updated for this case too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDocument the watcher node maintenance feature
Iustin Pop [Tue, 23 Mar 2010 09:25:13 +0000 (10:25 +0100)]
Document the watcher node maintenance feature

The patch significantly changes the watcher man page, as it was very
simplistic.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoWatcher: automatic shutdown of orphan resources
Iustin Pop [Mon, 22 Mar 2010 15:10:03 +0000 (16:10 +0100)]
Watcher: automatic shutdown of orphan resources

This patch changes the watcher so that it maintains (on all nodes) the
list of instances and DRBD devices by shutting down ones that confd
daemons indicate should not be running on this node.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoExport the maintain_node_health option in ssconf
Iustin Pop [Tue, 23 Mar 2010 08:50:24 +0000 (09:50 +0100)]
Export the maintain_node_health option in ssconf

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd a new cluster parameter maintain_node_health
Iustin Pop [Tue, 23 Mar 2010 08:47:37 +0000 (09:47 +0100)]
Add a new cluster parameter maintain_node_health

This will be used to conditionally enable the watcher node maintenance
feature.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd a new confd callback (StoreResultCallback)
Iustin Pop [Wed, 7 Apr 2010 13:47:02 +0000 (15:47 +0200)]
Add a new confd callback (StoreResultCallback)

This new callback simply stores (without calling any lower-level
callback) the last result; coupled with the filtering callback, this
ensures that it has the 'best' response after all have been received.

The result can then be retrieved via the GetResponse method.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoConfdClient: add synchronous wait for replies mode
Iustin Pop [Wed, 7 Apr 2010 13:01:57 +0000 (15:01 +0200)]
ConfdClient: add synchronous wait for replies mode

Currently, there is no way for a user of the confd client library to
know how many replies there should be, whether all have been received,
etc. This is bad since we can't reliably detect the consistency of the
results.

This patch attempts to fix this by adding a synchronous WaitForReply
function that will wait until either a timeout expires, or until a
minimum number of replies have been received (interested users should
add similar functionality for the async case). The callback
functionality will still do call-backs into the user-provided code
during the wait, but after this function has returned, we know that we
received all possible replies.

Note: To account for the interval between initial send of the request,
and calling of this function, we modify the expiration time of the
request.
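
As an illustration of the wait-until-timeout-or-minimum-replies idea,
here is a minimal sketch. It uses a queue.Queue as a stand-in for the
socket, and the name wait_for_replies and its return value are
hypothetical; the real WaitForReply works on the client's own request
bookkeeping.

```python
import queue
import time


def wait_for_replies(replies, timeout, min_replies):
    """Block until 'min_replies' items arrive or 'timeout' seconds pass.

    'replies' is a queue.Queue filled by some other part of the program.
    Returns a tuple (got_enough, received_items).
    """
    deadline = time.monotonic() + timeout
    received = []
    while len(received) < min_replies:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            received.append(replies.get(timeout=remaining))
        except queue.Empty:
            break
    return len(received) >= min_replies, received


pending = queue.Queue()
pending.put("reply-from-node1")
pending.put("reply-from-node2")
print(wait_for_replies(pending, timeout=0.1, min_replies=2))
```

Computing the deadline once, and only the remaining time on each
iteration, keeps the total wait bounded however many replies trickle in.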

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoConfdClient: unify some internal variables
Iustin Pop [Wed, 7 Apr 2010 15:16:43 +0000 (17:16 +0200)]
ConfdClient: unify some internal variables

Currently the requests are tracked in _request and in _expire_requests.
This is convenient, but it restricts the ability to extend the request
tracking, e.g. via packet stats and/or extension of expiration time.

This patch introduces a new simple class _Request that holds all
properties of pending requests; it then uses instances of this class as
values in _request instead of tuples, and removes the _expire_requests.

The only drawback is the change in behaviour of _ExpireRequests:
previously, it used to scan the list only up to the first non-expired
request, after which it aborted. Now it will scan the entire dict, which
(depending on workload) could change the time behaviour. I don't think
this is a problem, as:
- deleting from the head of a list is very expensive (list.pop(0);
  list.append() is an order of magnitude more expensive than deleting
  an element from a dictionary and re-adding it)
- we shouldn't have more than tens or hundreds of pending requests; in case
  this assumption changes, we could introduce a no-more-often-than-X
  expiration policy, etc.
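
A minimal sketch of the dict-based tracking described above. The class
and function names and the "salt" keys are illustrative only; the real
_Request class holds more properties.

```python
class Request:
    """Holds the properties of one pending request (illustrative only)."""

    def __init__(self, payload, expiry):
        self.payload = payload
        self.expiry = expiry


def expire_requests(requests, now):
    """Delete every request whose expiry time has passed.

    Unlike a sorted list, the dict has to be scanned in full, but each
    deletion is cheap.
    """
    expired = [salt for salt, req in requests.items() if req.expiry < now]
    for salt in expired:
        del requests[salt]
    return expired


pending = {"salt-a": Request("ping", expiry=10),
           "salt-b": Request("ping", expiry=20)}
print(expire_requests(pending, now=15), sorted(pending))
```

Collecting the expired keys first avoids modifying the dict while
iterating over it.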

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix consistency checks in ConfdFilterCallback
Iustin Pop [Wed, 7 Apr 2010 13:44:54 +0000 (15:44 +0200)]
Fix consistency checks in ConfdFilterCallback

Commit 49b3fda added consistency checks, but these are wrongly triggered
for old responses - we need to make sure to check that we have the same
serial.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix utils.WaitForFdCondition inner retry loop
Iustin Pop [Wed, 7 Apr 2010 12:41:16 +0000 (14:41 +0200)]
Fix utils.WaitForFdCondition inner retry loop

Commit dfdc4060 added WaitForFdCondition which uses utils.Retry without
handling timeout exceptions. This breaks any nested retry loops.

This patch fixes the above function, and also changes utils.Retry to
detect and warn about similar cases in the future. In addition, we add a
few small unittests for utils.Retry.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix bug introduced in 76e5f8b54: mkdir mode
Michael Hanselmann [Wed, 7 Apr 2010 13:26:55 +0000 (15:26 +0200)]
Fix bug introduced in 76e5f8b54: mkdir mode

After commit 76e5f8b54, mkdir_mode in utils.RenameFile is
no longer passed to Makedirs. This is fixed by this patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoutils: Move wrapper code around os.makedirs into separate function
Michael Hanselmann [Wed, 7 Apr 2010 11:12:22 +0000 (13:12 +0200)]
utils: Move wrapper code around os.makedirs into separate function

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix unittest for the rapi client library
Iustin Pop [Tue, 6 Apr 2010 15:38:27 +0000 (17:38 +0200)]
Fix unittest for the rapi client library

Wrong escape, so we make sure to use proper escapes (we want the
backslashes to be embedded, not interpreted). Also change " to ' to be
easier to read.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: David Knowles <dknowles@google.com>

14 years agoAdding RAPI client library.
David Knowles [Tue, 16 Mar 2010 17:21:38 +0000 (13:21 -0400)]
Adding RAPI client library.

Signed-off-by: David Knowles <dknowles@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
(modified slightly the unittest to account for
 missing httplib2 library)

14 years agoExtend ConfdFilterCallback with consistency checks
Iustin Pop [Thu, 18 Mar 2010 15:46:54 +0000 (16:46 +0100)]
Extend ConfdFilterCallback with consistency checks

Note that users of the callback will have to manually check the
attribute.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAbstract the confd client creation
Iustin Pop [Thu, 18 Mar 2010 15:56:08 +0000 (16:56 +0100)]
Abstract the confd client creation

Most creation of confd clients will do the same steps: read MC file,
parse it, read HMAC key, etc. We abstract this functionality so that
we don't duplicate the code.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove unused import from test file
Guido Trotter [Wed, 31 Mar 2010 15:40:56 +0000 (16:40 +0100)]
Remove unused import from test file

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agokvm_flag hypervisor parameter
Guido Trotter [Wed, 31 Mar 2010 15:37:09 +0000 (16:37 +0100)]
kvm_flag hypervisor parameter

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMove the runas user at execution time
Guido Trotter [Tue, 30 Mar 2010 15:37:02 +0000 (16:37 +0100)]
Move the runas user at execution time

Everything still works the same way, but the user is calculated each
time we start kvm, rather than stored in the config file. This makes it
easier to implement the "pool" security model.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoSend "501 Not Implemented" back when method not found
René Nussbaumer [Tue, 30 Mar 2010 14:16:04 +0000 (16:16 +0200)]
Send "501 Not Implemented" back when method not found

Before this was "400 Bad Request" and thus it didn't reflect
the reality.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdding QA RAPI tests for activate-disks and deactivate-disks calls
René Nussbaumer [Fri, 26 Mar 2010 12:56:58 +0000 (13:56 +0100)]
Adding QA RAPI tests for activate-disks and deactivate-disks calls

* This also adds support for authenticated RAPI calls
* Other HTTP methods than GET/POST

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoSerializableConfigParser: Make Loads class indep
Guido Trotter [Wed, 24 Mar 2010 15:42:39 +0000 (15:42 +0000)]
SerializableConfigParser: Make Loads class indep

Currently SerializableConfigParser.Loads is a static method that returns
a SerializableConfigParser. With this patch we change it to a class
method that returns a member of the class. This way a subclass calling
Loads on itself will get its own member, rather than a bare
SerializableConfigParser.
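
The difference can be shown with a small sketch. It uses Python 3's
configparser and lower-case method names for brevity, so it is an
approximation of the idea rather than the Ganeti class; ExportInfo is a
hypothetical subclass.

```python
import configparser


class SerializableConfigParser(configparser.ConfigParser):
    """Simplified stand-in for the Ganeti class of the same name."""

    @classmethod
    def loads(cls, data):
        """Parse 'data' and return an instance of the calling class."""
        parser = cls()  # 'cls' is the subclass when called on a subclass
        parser.read_string(data)
        return parser


class ExportInfo(SerializableConfigParser):
    """Hypothetical subclass used only for this illustration."""


info = ExportInfo.loads("[instance]\nname = inst1\n")
print(type(info).__name__, info.get("instance", "name"))
```

With a static method the body would have to name
SerializableConfigParser explicitly, so ExportInfo.loads() would return
the base class instead.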

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years agoUnbreak command line job submission
Guido Trotter [Tue, 23 Mar 2010 13:07:41 +0000 (13:07 +0000)]
Unbreak command line job submission

A change introduced in 5299e61f modified the contents of
JobExecutor.jobs, missing a place where this tuple was deconstructed.
This creates a traceback in gnt-* <any> --submit, fixed by this patch.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAllow file storage to be grown
Guido Trotter [Tue, 23 Mar 2010 09:29:51 +0000 (09:29 +0000)]
Allow file storage to be grown

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoWrite grow support for file storage
Guido Trotter [Mon, 22 Mar 2010 16:16:12 +0000 (16:16 +0000)]
Write grow support for file storage

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoWatcher: fix some doc typos
Iustin Pop [Mon, 22 Mar 2010 15:21:45 +0000 (16:21 +0100)]
Watcher: fix some doc typos

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years agoWatcher: do not warn for missing hooks dir
Iustin Pop [Mon, 22 Mar 2010 15:15:46 +0000 (16:15 +0100)]
Watcher: do not warn for missing hooks dir

If the hooks dir does not exist, do not warn needlessly. This is similar
to commit a9b7e346 (for backend.py).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years agoExtend the hypervisor API with name-only shutdown
Iustin Pop [Mon, 22 Mar 2010 14:49:23 +0000 (15:49 +0100)]
Extend the hypervisor API with name-only shutdown

Currently the ShutdownInstance method of the hypervisors takes a full
instance object. However, when doing instance shutdowns from the node
only, we don't have a full object, just the name.

To handle this use case, we add a new ‘name’ argument to the method,
which makes the shutdown not use/rely on the ‘instance’ argument. The
KVM and fake hypervisors need a little bit of work, otherwise the change
is straightforward.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years agoDistribute list of enabled hypervisors in ssconf
Iustin Pop [Mon, 22 Mar 2010 12:27:07 +0000 (13:27 +0100)]
Distribute list of enabled hypervisors in ssconf

This can be used by nodes to know which hypervisors they are supposed to
support.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years agoganeti-confd: Call pyinotify flags correctly
Guido Trotter [Mon, 22 Mar 2010 16:25:27 +0000 (16:25 +0000)]
ganeti-confd: Call pyinotify flags correctly

The "apparently pylint was right" commit.

Although the pyinotify constants work on old distributions, they fail on
new ones, with new python. Fixing this by calling them in a way that
works everywhere.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix burnin error when trying to grow a file volume
Guido Trotter [Mon, 22 Mar 2010 16:17:41 +0000 (16:17 +0000)]
Fix burnin error when trying to grow a file volume

Abstract the growable disk types in a ganeti constant, and only run
disk grow, from burnin, on them.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoSome epydoc fixes
Iustin Pop [Thu, 18 Mar 2010 13:23:48 +0000 (14:23 +0100)]
Some epydoc fixes

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoA rewrite of LUClusterVerify
Iustin Pop [Thu, 18 Mar 2010 10:33:42 +0000 (11:33 +0100)]
A rewrite of LUClusterVerify

Per issue 90, current cluster verify is very very brittle. It's one of
the oldest pieces of code, with only additions without cleanups over the
last years.

Among its problems:

- data initialization interspersed with verification of RPC results,
  leading to non-initialized data for some branches
- due to the above, we order strictly some checks and we have the case
  where a bad node time result will skip checking of node volumes
- many many local variables, with each new check adding a new dict,
  leading to a spaghetti of dicts in the main Exec function
- monolithic code, both Exec() and _NodeVerify() do a lot of
  independent checks

This patch does an imperfect rewrite, but at least we gain:

- a clear infrastructure for adding more checks (the new NodeImage
  class, with its clear and documented fields), and removal of most
  per-node dicts from the Exec() function
- the new NodeImage object should allow better type safety, e.g. by
  allowing pylint to check the actual object attributes rather than
  strings as dict keys
- a-priori initialization of data fields, eliminating the need to
  introduce dependencies between checks
- per-result-key status field, allowing elimination of duplicate error
  messages (where we want)
- split of most independent checks into separate functions, for greater
  clarity

The new code, being new, will probably introduce for the short term more
bugs than it removes. However, it should offer a much better way for
extending cluster verify in the future.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoIntroduce a bool CLI option type
Iustin Pop [Mon, 22 Mar 2010 16:23:36 +0000 (17:23 +0100)]
Introduce a bool CLI option type

This option type enforces its value to either True or False, relieving
the scripts from manually parsing the values in each function.
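
One way to build such an option type is optparse's documented extension
mechanism (subclassing Option and registering a type checker). The
sketch below follows that pattern; the accepted spellings, the class
name CliOption and the --auto-balance option are assumptions for the
example.

```python
import copy
import optparse


def check_bool(option, opt, value):
    """Convert a yes/no style string to True/False, or reject it."""
    value = value.lower()
    if value in ("yes", "true"):
        return True
    if value in ("no", "false"):
        return False
    raise optparse.OptionValueError(
        "option %s: invalid boolean value %r" % (opt, value))


class CliOption(optparse.Option):
    """Option class that knows an additional 'bool' type."""
    TYPES = optparse.Option.TYPES + ("bool",)
    TYPE_CHECKER = copy.copy(optparse.Option.TYPE_CHECKER)
    TYPE_CHECKER["bool"] = check_bool


parser = optparse.OptionParser(option_class=CliOption)
parser.add_option("--auto-balance", type="bool", dest="auto_balance")
opts, _ = parser.parse_args(["--auto-balance", "no"])
print(opts.auto_balance)
```

Scripts then receive a real boolean in opts.auto_balance and no longer
need to parse the string themselves.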

We also update the bash completion code to use the option type if
possible.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix backend.VerifyNode behaviour for VG problems
Iustin Pop [Thu, 18 Mar 2010 09:54:24 +0000 (10:54 +0100)]
Fix backend.VerifyNode behaviour for VG problems

In case LVM is broken, backend.GetVolumeList will raise an RPC exception
(as expected since it's a function exposed over RPC). Therefore we must
be prepared to catch any such exceptions, so that we don't fail the
whole verify call in this case. cmdlib is already prepared to handle
string results for this response key.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdding missing documentation to make the docs better
René Nussbaumer [Mon, 22 Mar 2010 15:50:15 +0000 (16:50 +0100)]
Adding missing documentation to make the docs better

Also fixed a typo I noticed.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove race condition in FileStorage.Create
Guido Trotter [Mon, 22 Mar 2010 15:41:33 +0000 (15:41 +0000)]
Remove race condition in FileStorage.Create

Rather than checking that the file doesn't exist, and then creating it,
we create it with O_CREAT | O_EXCL, making sure the checking/creation is
atomic.
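
A minimal sketch of the atomic create, using the standard os.open flags
named above. The helper name create_exclusive, the file mode and the
temporary directory are choices made for the example.

```python
import os
import tempfile


def create_exclusive(path, mode=0o600):
    """Create 'path' atomically; fails if the file already exists."""
    # O_EXCL together with O_CREAT makes the existence check and the
    # creation a single operation, so there is no window for a race.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    os.close(fd)


with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "disk0")
    create_exclusive(target)
    try:
        create_exclusive(target)
        second_attempt = "created"
    except FileExistsError:
        second_attempt = "refused"
print(second_attempt)
```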

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoKVM: Check instances for actual liveness
Guido Trotter [Mon, 22 Mar 2010 11:08:50 +0000 (11:08 +0000)]
KVM: Check instances for actual liveness

Currently if we find a live process with the pid we saved we assume kvm
is alive. What could happen, though, is that the pidfile has been
reused.

In order to avoid that we change the check to make sure, everywhere,
that the process we see is our actual kvm process. In order to do so we
open its cmdline, and check that it contains the correct instance name
in the -name argument passed to kvm.
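
The check can be sketched as below, assuming a Linux /proc filesystem
where /proc/<pid>/cmdline holds the NUL-separated arguments. Both
function names are hypothetical and the real hypervisor code differs in
detail.

```python
def instance_name_from_cmdline(raw):
    """Return the value following -name in a NUL-separated cmdline.

    Returns None if no -name argument is present.
    """
    args = raw.split(b"\0")
    for idx, arg in enumerate(args[:-1]):
        if arg == b"-name":
            return args[idx + 1].decode()
    return None


def is_our_kvm(pid, instance_name):
    """Check that 'pid' is a process started with -name instance_name."""
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as handle:
            raw = handle.read()
    except OSError:
        return False  # process is gone, or /proc is not available
    return instance_name_from_cmdline(raw) == instance_name


print(instance_name_from_cmdline(b"kvm\0-name\0inst1\0-m\0512\0"))
```

A process that merely reuses the saved pid will not carry the expected
-name argument, so it is no longer mistaken for the instance.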

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoKVM: improve GetInstanceInfo docstring
Guido Trotter [Mon, 22 Mar 2010 11:09:14 +0000 (11:09 +0000)]
KVM: improve GetInstanceInfo docstring

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRevert "Only override any and all if not defined"
Guido Trotter [Mon, 22 Mar 2010 15:03:21 +0000 (15:03 +0000)]
Revert "Only override any and all if not defined"

This reverts commit bd5617020a50bcd08269330638d64078c1b30b71.

Turns out our and python's any/all are not compatible.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdding RAPI call to deactivate-disks for an instance
René Nussbaumer [Mon, 22 Mar 2010 15:16:47 +0000 (16:16 +0100)]
Adding RAPI call to deactivate-disks for an instance

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdding RAPI call for activate-disks on an instance
René Nussbaumer [Mon, 22 Mar 2010 15:16:17 +0000 (16:16 +0100)]
Adding RAPI call for activate-disks on an instance

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd a hint to masterd for inconsistent clusters
Iustin Pop [Thu, 18 Mar 2010 15:18:05 +0000 (16:18 +0100)]
Add a hint to masterd for inconsistent clusters

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoSimpleConfigReader: add docstrings
Guido Trotter [Thu, 18 Mar 2010 14:26:12 +0000 (14:26 +0000)]
SimpleConfigReader: add docstrings

All non-oneliner functions, after this patch, have their docstring.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoburnin: implement basic confd testing
Guido Trotter [Mon, 15 Mar 2010 13:23:01 +0000 (13:23 +0000)]
burnin: implement basic confd testing

Just a few queries are checked, but this should give us confidence that
at least the basic confd framework is working properly.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAsyncUDPSocket.process_next_packet
Guido Trotter [Mon, 15 Mar 2010 11:43:21 +0000 (11:43 +0000)]
AsyncUDPSocket.process_next_packet

This function allows receiving socket data synchronously.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoWaitForSocketCondition: rename, handle EINTR
Guido Trotter [Tue, 16 Mar 2010 15:08:31 +0000 (15:08 +0000)]
WaitForSocketCondition: rename, handle EINTR

- Rename WaitForSocketCondition to SingleWaitForFdCondition
  - Avoid potentially infinite loop, if we continue to get interrupted
  - Handle eintr correctly
  - Avoid the poller try/finally, as the poller object gets destroyed
    anyway
- Provide a new WaitForFdCondition
  - Using retry, guarantee to continue checking until the timeout
    expires
  - Needs an extra helper class, as it uses retry in a very custom way
    (no sleep happens, because the poller sleeps by itself)
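
The retry-until-deadline behaviour can be sketched as follows. This is a
simplified Python 3 illustration that does not use utils.Retry; the
function name and return convention are assumptions, and select.poll is
only available on Unix-like systems.

```python
import math
import os
import select
import time


def wait_for_fd_condition(fd, event, timeout):
    """Poll 'fd' until 'event' is reported or 'timeout' seconds pass.

    Returns the reported event mask, or None on timeout.
    """
    poller = select.poll()
    poller.register(fd, event)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        try:
            # poll() expects milliseconds
            ready = poller.poll(int(math.ceil(remaining * 1000)))
        except InterruptedError:
            # Interrupted by a signal: loop again with the time left,
            # so repeated interruptions cannot extend the total wait.
            continue
        if ready:
            return ready[0][1]


read_fd, write_fd = os.pipe()
print(wait_for_fd_condition(read_fd, select.POLLIN, 0.05))
os.write(write_fd, b"x")
print(bool(wait_for_fd_condition(read_fd, select.POLLIN, 0.05)
           & select.POLLIN))
os.close(read_fd)
os.close(write_fd)
```

No explicit sleep is needed because the poll call itself blocks for the
remaining time.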

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agomove http.WaitForSocketCondition to utils
Guido Trotter [Tue, 16 Mar 2010 13:59:25 +0000 (13:59 +0000)]
move http.WaitForSocketCondition to utils

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoConfdCountingCallback
Guido Trotter [Mon, 15 Mar 2010 11:42:12 +0000 (11:42 +0000)]
ConfdCountingCallback

This new confd callback counts received replies for the registered
queries.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoConfdClient: add synchronous features
Guido Trotter [Mon, 15 Mar 2010 11:40:54 +0000 (11:40 +0000)]
ConfdClient: add synchronous features

By sending requests with async=False, and receiving replies with
ReceiveReply we can more easily use confd from a synchronous client.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoReplace @keyword with @param in confd client
Guido Trotter [Fri, 12 Mar 2010 12:39:46 +0000 (12:39 +0000)]
Replace @keyword with @param in confd client

@keyword was used inappropriately.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAsyncUDPSocket: abstract do_read function
Guido Trotter [Mon, 15 Mar 2010 13:21:22 +0000 (13:21 +0000)]
AsyncUDPSocket: abstract do_read function

This basically implements read handling, without catching all
exceptions. When using the socket in synchronous mode, it's useful to
avoid losing exception data (which, in an async daemon, can only be
logged)

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBurnin: don't add/remove routed nics
Guido Trotter [Thu, 11 Mar 2010 15:17:17 +0000 (15:17 +0000)]
Burnin: don't add/remove routed nics

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoOnly override any and all if not defined
Guido Trotter [Mon, 15 Mar 2010 11:37:38 +0000 (11:37 +0000)]
Only override any and all if not defined

If any or all are already defined (because we're using a new version of
python) just link them inside "utils" rather than redefining them.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobackend: Two small style fixes
Michael Hanselmann [Wed, 17 Mar 2010 16:52:41 +0000 (17:52 +0100)]
backend: Two small style fixes

- Pass keyword parameter as such
- Replace “not x == y” with “x != y”

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAllow cluster copy file over the replication net
Iustin Pop [Wed, 17 Mar 2010 16:08:13 +0000 (17:08 +0100)]
Allow cluster copy file over the replication net

This patch introduces the option “--use-replication-network” for the
cluster copyfile functionality, which is useful if the primary and
secondary network are significantly different (see issue 32).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoEnhance cli.GetOnlineNodes query/filtering
Iustin Pop [Wed, 17 Mar 2010 15:49:59 +0000 (16:49 +0100)]
Enhance cli.GetOnlineNodes query/filtering

This patch allows GetOnlineNodes to return the secondary IPs instead of
the node names, and to provide filtering of the master node (required to
be done in this function in case we return the secondary IPs).
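
As an illustration only, the filtering could look like the Python sketch below. The function name, the parameter names and the tuple layout of the node records are assumptions made for this example, not the real cli.GetOnlineNodes interface.

```python
def get_online_nodes(nodes, master, secondary_ips=False, filter_master=False):
    """Return names (or secondary IPs) of the online nodes.

    nodes is assumed to be a list of (name, offline, secondary_ip) tuples.
    """
    result = []
    for name, offline, sip in nodes:
        if offline:
            continue
        # The master must be filtered here: once only IPs are returned,
        # the caller can no longer tell which entry is the master.
        if filter_master and name == master:
            continue
        result.append(sip if secondary_ips else name)
    return result
```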

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoInstance creation: implement --no-install mode
Iustin Pop [Wed, 17 Mar 2010 14:00:14 +0000 (15:00 +0100)]
Instance creation: implement --no-install mode

This is a simple patch that adds the no-install mode for instance
creation, allowing import of the actual OS from a foreign source (instead
of requiring the preparation of data in a form expected by the import
scripts).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAllow OS changes without reinstallation
Iustin Pop [Wed, 17 Mar 2010 13:33:44 +0000 (14:33 +0100)]
Allow OS changes without reinstallation

This patch modifies LUSetInstanceParams to allow OS name changes, without
reinstallation, in case an OS gets renamed on-disk.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agocmdlib: Abstract OS checks
Iustin Pop [Wed, 17 Mar 2010 13:19:50 +0000 (14:19 +0100)]
cmdlib: Abstract OS checks

This patch moves the node-has-os checks to a separate function.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoImprove “gnt-cluster renew-crypto”
Michael Hanselmann [Tue, 16 Mar 2010 13:51:38 +0000 (14:51 +0100)]
Improve “gnt-cluster renew-crypto”

- Report exception text immediately instead of just logging it
- Remove leftover assertion from when it still used “gnt-cluster
  modify”

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix behaviour of gnt-node modify -C no
Iustin Pop [Tue, 16 Mar 2010 10:34:06 +0000 (11:34 +0100)]
Fix behaviour of gnt-node modify -C no

The current check on whether we require auto_promote or not is wrong, as
we check whether we will have exactly the correct number of master
candidates left. But it is fine if we have more than the required number
(e.g. when CPS=10 and mc_remaining=19), and in that case we shouldn't
require auto promotion.
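
A minimal sketch of the corrected condition, written from the description above rather than from the actual code (the function name and signature are hypothetical): auto promotion is needed only when fewer candidates remain than the candidate pool size.

```python
def needs_auto_promote(mc_remaining, candidate_pool_size):
    """Return True if demoting the node leaves too few master candidates."""
    # The faulty variant compared with "!=", which also fired when more
    # candidates than required were left.
    return mc_remaining < candidate_pool_size
```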

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRightname confd's HMAC key
Michael Hanselmann [Mon, 15 Mar 2010 15:53:22 +0000 (16:53 +0100)]
Rightname confd's HMAC key

Currently, the ganeti-confd's HMAC key is called “cluster HMAC key” or
simply “HMAC key” everywhere. With the implementation of inter-cluster
instance moves, another HMAC key will be introduced for signing critical
data. They cannot be the same, so this patch clarifies the purpose of the
“cluster HMAC key” by renaming it. The actual file name is not changed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRename SSL_CERT_FILE to NODED_CERT_FILE
Michael Hanselmann [Mon, 15 Mar 2010 15:15:47 +0000 (16:15 +0100)]
Rename SSL_CERT_FILE to NODED_CERT_FILE

To be consistent with RAPI_CERT_FILE, the rather generically named
“SSL_CERT_FILE” constant is renamed to “NODED_CERT_FILE”. The actual file
name is not changed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoClarify the error message for ':' in PV names
Iustin Pop [Mon, 15 Mar 2010 16:14:25 +0000 (17:14 +0100)]
Clarify the error message for ':' in PV names

As described in issue 93, just saying ':' is not a valid char can be
confusing.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoImplement QA tests for disk template changes
Iustin Pop [Sun, 14 Mar 2010 15:39:01 +0000 (16:39 +0100)]
Implement QA tests for disk template changes

The new test depends on the drbd type tests being enabled, and tests
conversion to plain and back to drbd.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoUpdate instance modify documentation
Iustin Pop [Sun, 14 Mar 2010 15:39:00 +0000 (16:39 +0100)]
Update instance modify documentation

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoImplement conversion from drbd to plain
Iustin Pop [Sun, 14 Mar 2010 15:38:59 +0000 (16:38 +0100)]
Implement conversion from drbd to plain

This is much simpler than the opposite, with fewer possibilities of
failures.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoImplement conversion from plain to drbd
Iustin Pop [Sun, 14 Mar 2010 15:38:58 +0000 (16:38 +0100)]
Implement conversion from plain to drbd

This patch adds a new mode to instance modify, the changing of the disk
template. For now only plain to drbd conversion is supported, and the
new secondary node must be specified manually (no iallocator support).

The procedure for conversion works as follows:

- a completely new disk template is created, matching the count, size
  and mode of the instance's current disks
- we create manually (not via _CreateDisks) all the missing volumes
- we rename on the primary the LVs to the new name
- we create manually the DRBD devices

Failures during the creation of volumes will leave orphan volumes.
Failure during the rename might leave some disks renamed and some not,
leading to an inconsistent instance.

Once the disks are renamed, we update the instance information and wait
for resync. Any failures of the DRBD sync must be manually handled (like
a normal failure, e.g. by running replace-disks, etc.).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAbstract check that an instance is down
Iustin Pop [Sun, 14 Mar 2010 15:38:57 +0000 (16:38 +0100)]
Abstract check that an instance is down

Multiple LUs require that an instance is not running while they operate
on the instance (reinstall, rename, modify, recreate disks, deactivate
disks). The code to do this check is duplicated many times, and not very
consistent (some use call_instance_list, some call_instance_info).

The patch moves this check into a separate function that is then reused.
The only drawback is that _SafeShutdownInstanceDisks now raises an
OpPrereqError (even though it is run during Exec()), but this use case
is fine (there are no other modifications in that Exec).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAbstract node free disk space check
Iustin Pop [Sun, 14 Mar 2010 15:38:56 +0000 (16:38 +0100)]
Abstract node free disk space check

Both create instance and grow disk check the free disk space on nodes
using the same, duplicate code. Since we'll need this in other places in
the future, we abstract the check into a new function.

The patch adjusts the error message to be more in-line with the one for
memory checking, and fixes the exception raised for RPC errors.
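
For illustration, a shared check of this kind might look like the Python sketch below. The function name, the input dictionary and the exception types are assumptions chosen for a self-contained example; they are not the cmdlib implementation.

```python
def check_nodes_free_disk(free_by_node, requested_mib):
    """Raise if any node cannot hold requested_mib of new disk space.

    free_by_node maps node name to free space in MiB; a non-integer value
    stands in for a failed query on that node.
    """
    for node in sorted(free_by_node):
        free = free_by_node[node]
        if not isinstance(free, int):
            raise RuntimeError("Can't compute free disk space on node %s" %
                               node)
        if requested_mib > free:
            raise ValueError("Not enough disk space on node %s: required"
                             " %d MiB, available %d MiB" %
                             (node, requested_mib, free))
```

Both callers (instance creation and disk growth) can then share one error message format.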

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAbstract disk template verification
Iustin Pop [Sun, 14 Mar 2010 15:38:55 +0000 (16:38 +0100)]
Abstract disk template verification

This is a simple check, but we'll need it in multiple places.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoUpdate documentation for disk adoption
Iustin Pop [Sun, 14 Mar 2010 00:55:19 +0000 (01:55 +0100)]
Update documentation for disk adoption

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoImplement disk adoption mode in gnt-instance
Iustin Pop [Sun, 14 Mar 2010 00:55:18 +0000 (01:55 +0100)]
Implement disk adoption mode in gnt-instance

This patch modifies the parsing of the “--disk” argument to instance
create to accept “adopt” as a valid key, which builds the correct disk
structure for OpCreateInstance.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoLUCreateInstance: implement disk adoption mode
Iustin Pop [Sun, 14 Mar 2010 00:55:17 +0000 (01:55 +0100)]
LUCreateInstance: implement disk adoption mode

This new mode, valid only for the plain disk template, allows creation
of an instance based on existing logical volumes (preserving data),
rather than creation of new volumes and OS creation.

The new mode works as follows:

- instead of size, all disks passed in must have an 'adopt' key, which
  signifies the LV name to be used
- all disks must have this key, or none should
- we check the volume existence, and from the result we fill in the
  actual size
- online (in-use) volumes are not allowed
- 'stealing' of another instance's volumes is prevented via reservation
  of the LV names
- during creation, we rename the logical volumes to the standard Ganeti
  format (based on UUID)
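
The all-or-none rule from the list above can be sketched in Python as follows. This is an illustrative helper with a hypothetical name and a generic exception, not the code in LUCreateInstance.

```python
def check_adoption(disks):
    """Return True if every disk is adopted, False if none is.

    disks is a list of dicts; an adopted disk carries an 'adopt' key whose
    value is the name of the existing logical volume.
    """
    has_adopt = ["adopt" in disk for disk in disks]
    if any(has_adopt) and not all(has_adopt):
        raise ValueError("Either all disks are adopted or none is")
    # all([]) is True, so an empty disk list must not count as adoption.
    return bool(disks) and all(has_adopt)
```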

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoLUCreateInstance: Move parameter init earlier
Iustin Pop [Sun, 14 Mar 2010 00:55:16 +0000 (01:55 +0100)]
LUCreateInstance: Move parameter init earlier

This way, the parameters are available in CheckArguments too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoConfigWriter: add an LV reservation manager
Iustin Pop [Sun, 14 Mar 2010 00:55:15 +0000 (01:55 +0100)]
ConfigWriter: add an LV reservation manager

This patch adds an LV reservation manager to be used for LV names. Since
we now have four such managers, we create a list for easier release.
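
A rough Python sketch of the pattern, with hypothetical class and function names (this is not the ConfigWriter code): each manager tracks names reserved per owner, and keeping the managers in a list lets one loop release everything an owner holds.

```python
class ReservationManager(object):
    """Tracks temporarily reserved names, grouped by owner."""

    def __init__(self):
        self._by_owner = {}  # owner id -> set of reserved names

    def is_reserved(self, name):
        return any(name in names for names in self._by_owner.values())

    def reserve(self, owner, name):
        if self.is_reserved(name):
            raise ValueError("Name %r is already reserved" % name)
        self._by_owner.setdefault(owner, set()).add(name)

    def drop_owner(self, owner):
        self._by_owner.pop(owner, None)


# Illustrative: one manager per resource type, collected for easy release.
lv_names = ReservationManager()
all_managers = [ReservationManager(), ReservationManager(),
                ReservationManager(), lv_names]


def drop_all_reservations(owner):
    for manager in all_managers:
        manager.drop_owner(owner)
```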

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix two issues related to check-man
Iustin Pop [Mon, 15 Mar 2010 12:55:48 +0000 (13:55 +0100)]
Fix two issues related to check-man

First, we don't need to check man pages at sed time, because this means
everyone building the package will do it - we only need to check at docbook
time, which is mostly at developer time.

Second, don't force LC_ALL to C, as this breaks newer man-db. I've
verified that removing LC_ALL works fine across etch, hardy, lenny and
squeeze/sid.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoutils.RunCmd: Test case with reset_env set and setting variables
Michael Hanselmann [Mon, 15 Mar 2010 14:33:47 +0000 (15:33 +0100)]
utils.RunCmd: Test case with reset_env set and setting variables

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoImplement replacing cluster certs and keys via “gnt-cluster renew-crypto”
Michael Hanselmann [Fri, 12 Mar 2010 15:16:08 +0000 (16:16 +0100)]
Implement replacing cluster certs and keys via “gnt-cluster renew-crypto”

Recent changes to “gnt-cluster verify” made it complain about expiring SSL
certificates. While it was possible to replace the SSL certificates and
other cluster secrets manually before, doing so was cumbersome. Cluster
certificates, keys and secrets can now be replaced easily.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agocli: Add helper function to stop and start whole cluster
Michael Hanselmann [Fri, 12 Mar 2010 10:51:22 +0000 (11:51 +0100)]
cli: Add helper function to stop and start whole cluster

Replacing cluster certificates and keys requires all cluster daemons to be
shut down. This might also be handy for the cluster merger tool, though
the function might need a few more extensions.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agocfgupgrade: Use new bootstrap function for certs and keys
Michael Hanselmann [Fri, 12 Mar 2010 10:49:16 +0000 (11:49 +0100)]
cfgupgrade: Use new bootstrap function for certs and keys

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agobootstrap: Add new function to create cluster certs and keys
Michael Hanselmann [Fri, 12 Mar 2010 10:49:47 +0000 (11:49 +0100)]
bootstrap: Add new function to create cluster certs and keys

The code to generate cluster certificates, keys and secrets is currently
spread over several places. It makes sense to move it to a separate
function as we want to provide the user with the ability to automatically
replace all cluster certificates and keys.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>