ganeti-local
14 years ago  Remove a couple of empty design sections
Guido Trotter [Wed, 21 Jul 2010 15:27:32 +0000 (16:27 +0100)]
Remove a couple of empty design sections

The 2.1 and 2.2 designs contain sections with no actual content, as they
are detailed for each single change. Removing the global empty ones.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>

14 years ago  Disable 'invalid name' pylint warning for tools/setup-ssh
Manuel Franceschini [Wed, 21 Jul 2010 09:29:40 +0000 (11:29 +0200)]
Disable 'invalid name' pylint warning for tools/setup-ssh

Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years ago  Always set commonName in X509 certificates
Manuel Franceschini [Mon, 19 Jul 2010 19:07:57 +0000 (21:07 +0200)]
Always set commonName in X509 certificates

Due to the recent switch of the RPC client to PycURL, a bug with newer
versions of libcurl surfaced: when the 'Subject' or 'Issuer' of
'server.pem' was empty, the SSL handshake failed.

This patch changes the certificate generation functions such that they
always use "ganeti.example.com" as commonName (CN) for 'Subject' and
'Issuer'.

Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Adding tool to setup SSH on a remote host
René Nussbaumer [Tue, 13 Jul 2010 09:38:36 +0000 (11:38 +0200)]
Adding tool to setup SSH on a remote host

This prepares the remote node to be joined into a cluster.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Adding new (optional) dependency to configure.ac
René Nussbaumer [Wed, 14 Jul 2010 09:04:35 +0000 (11:04 +0200)]
Adding new (optional) dependency to configure.ac

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Adding constants for setup-ssh
René Nussbaumer [Tue, 13 Jul 2010 14:29:49 +0000 (16:29 +0200)]
Adding constants for setup-ssh

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Change AddAuthorizedKey to also allow filehandles
René Nussbaumer [Tue, 13 Jul 2010 09:37:53 +0000 (11:37 +0200)]
Change AddAuthorizedKey to also allow filehandles

This is required to use this function over paramiko
sftp file handles.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
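
Illustrative sketch (not the code from this commit): one way a helper can accept either a path or an already-open file object. The function name `add_authorized_key`, its return value, and the requirement that the handle be a seekable text-mode object are assumptions made for this example.

```python
import io


def add_authorized_key(file_obj, key):
    """Append `key` unless an equivalent line is already present.

    `file_obj` is either a path or an open, seekable text file object.
    Returns True if the key was written, False if it was already there.
    """
    if isinstance(file_obj, str):
        # A path was given: open it and reuse the file-object code path.
        with open(file_obj, "a+") as handle:
            return add_authorized_key(handle, key)

    key_fields = key.split()
    file_obj.seek(0)
    content = file_obj.read()
    for line in content.splitlines():
        # Compare on whitespace-separated fields to ignore spacing changes.
        if line.split() == key_fields:
            return False

    file_obj.seek(0, io.SEEK_END)
    if content and not content.endswith("\n"):
        file_obj.write("\n")
    file_obj.write(key.rstrip("\n") + "\n")
    return True
```

Because the function only relies on `seek`, `read` and `write`, any object offering those methods can be passed in, which is the property the commit message asks for.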

14 years ago  Update .gitignore for vcs-version
Iustin Pop [Mon, 19 Jul 2010 14:14:37 +0000 (16:14 +0200)]
Update .gitignore for vcs-version

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  RAPI client: Encode empty body to JSON
Michael Hanselmann [Fri, 16 Jul 2010 17:20:04 +0000 (19:20 +0200)]
RAPI client: Encode empty body to JSON

If the body consists of an empty dict, it should also be encoded.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
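
Illustrative sketch (not the RAPI client's actual code): the kind of check that distinguishes "no body" from "empty body". The helper name `encode_body` is hypothetical.

```python
import json


def encode_body(body):
    """Return the JSON-encoded request body, or None when there is no body."""
    # A plain truthiness test (`if body:`) would skip an empty dict,
    # because {} is falsy. Comparing against None keeps it.
    if body is not None:
        return json.dumps(body)
    return None
```

With this check an empty dict is sent as the two-character document `{}` instead of being dropped.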

14 years ago  Introduce git reference/tag tracking for debugging
Iustin Pop [Fri, 16 Jul 2010 08:18:24 +0000 (10:18 +0200)]
Introduce git reference/tag tracking for debugging

This patch adds a new vcs-version file that is generated via git (and
can be adapted if VCS is changed) and then embedded as VCS_VERSION in
the constants module.

This means two things:
- local modifications without committing to git (or when using a tar.gz
  archive + mods) will not be reflected
- version is fixed at the time of the last make regen-vcs-version (dist time,
  or devel/upload which calls this)

Thus this is geared more toward developers than end users.

The patch:

- adds rules for generating the vcs-version file
- adds a dist-hook for re-generating the file (if possible) and copying
  the updated version to the distdir
- modifies devel/upload to re-generate the file before upload

The output of --version will look like:
gnt-cluster (ganeti v2.2.0beta0-184-gebca7e6) 2.2.0~beta0

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
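
Illustrative sketch: the string in the sample output has the shape that `git describe` prints (tag, number of commits since the tag, abbreviated hash prefixed with `g`). Assuming the generated value follows that shape, it can be split apart as below; the function name `parse_vcs_version` is hypothetical and not part of the patch.

```python
import re

# <tag>-<commits since tag>-g<abbreviated hash>
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_vcs_version(text):
    """Split a describe-style version into (tag, commits_since_tag, sha)."""
    text = text.strip()
    match = _DESCRIBE_RE.match(text)
    if match is None:
        # Exactly on a tag (or an unknown format): no distance, no hash.
        return (text, 0, None)
    return (match.group("tag"), int(match.group("commits")), match.group("sha"))
```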

14 years ago  Fix epydoc warning "Lists must be indented."
Luca Bigliardi [Fri, 16 Jul 2010 15:29:16 +0000 (16:29 +0100)]
Fix epydoc warning "Lists must be indented."

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Convert RPC client to PycURL
Michael Hanselmann [Tue, 6 Jul 2010 13:56:49 +0000 (15:56 +0200)]
Convert RPC client to PycURL

Instead of using our custom HTTP client, using PycURL's multi
interface allows us to get rid of the HTTP client threadpool.
The majority of the code is still in the ganeti.http.client
module.

A simple per-thread HTTP client pool gives cURL a chance to
cache and retain as much information as possible (e.g. SSL certs).
Unused HTTP clients (e.g. due to removed nodes) are deleted after
25 requests going through the pool.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
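
Illustrative sketch of the pooling idea described above, without PycURL: clients are cached per node and those not requested during a full cycle of requests are dropped. The class name `ClientPool`, the `factory` callback and the exact eviction rule are assumptions for this example; the real pool may differ.

```python
class ClientPool:
    """Cache one client per node and forget clients that are no longer used."""

    def __init__(self, factory, cleanup_interval=25):
        self._factory = factory              # called as factory(node) on a miss
        self._cleanup_interval = cleanup_interval
        self._clients = {}
        self._used = set()                   # nodes requested in this cycle
        self._requests = 0

    def get(self, node):
        client = self._clients.get(node)
        if client is None:
            client = self._clients[node] = self._factory(node)
        self._used.add(node)
        self._requests += 1
        if self._requests >= self._cleanup_interval:
            # Drop every cached client nobody asked for during this cycle.
            for name in set(self._clients) - self._used:
                del self._clients[name]
            self._used.clear()
            self._requests = 0
        return client
```

Keeping one such pool per thread (for example in a `threading.local` attribute) avoids sharing client objects between threads.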

14 years ago  Implement lock names for debugging purposes
Iustin Pop [Mon, 4 May 2009 20:51:04 +0000 (22:51 +0200)]
Implement lock names for debugging purposes

This patch adds lock names to SharedLocks and LockSets, that can be used
later for displaying the actual locks being held/used in places where we
only have the lock, and not the entire context of the locking operation.

Since I realized that the production code doesn't call LockSet with the
proper members= syntax, but directly as positional parameters, I've
converted this (and the arguments to GlobalLockManager) into positional
arguments.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Merge branch 'devel-2.1'
Guido Trotter [Fri, 16 Jul 2010 13:05:23 +0000 (14:05 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  Bump up version to release 2.1.6
  Update NEWS file for 2.1.6

Conflicts:
NEWS
  - merge
configure.ac
  - keep 2.2 version

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Bump up version to release 2.1.6 (tag: v2.1.6)
Guido Trotter [Fri, 16 Jul 2010 11:04:02 +0000 (12:04 +0100)]
Bump up version to release 2.1.6

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Update NEWS file for 2.1.6
Guido Trotter [Fri, 16 Jul 2010 11:17:40 +0000 (12:17 +0100)]
Update NEWS file for 2.1.6

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Fix pylint complaints introduced in commit e58f87a958c
Michael Hanselmann [Fri, 16 Jul 2010 00:00:04 +0000 (02:00 +0200)]
Fix pylint complaints introduced in commit e58f87a958c

Due to a small mistake I missed three non-critical pylint complaints for
commit e58f87a958c. They're fixed with this patch.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: Add cpu_mask hypervisor parameter
Balazs Lecz [Fri, 9 Jul 2010 12:30:39 +0000 (13:30 +0100)]
LXC: Add cpu_mask hypervisor parameter

Also implement syntax checking.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add ParseCpuMask() utility function
Balazs Lecz [Mon, 12 Jul 2010 16:54:47 +0000 (17:54 +0100)]
Add ParseCpuMask() utility function

Also adds a generic ParseError exception.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
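
Illustrative sketch of a CPU mask parser. The commit message does not spell out the accepted syntax, so the comma-separated list of ids and ranges used here (for example `0-2,4`) is an assumption, as are the lower-case function name and the error messages.

```python
class ParseError(Exception):
    """Raised when a textual value cannot be parsed."""


def parse_cpu_mask(cpu_mask):
    """Turn a mask such as "0-2,4" into a sorted list of CPU ids."""
    if not cpu_mask:
        return []
    cpus = set()
    for part in cpu_mask.split(","):
        bounds = [b.strip() for b in part.split("-")]
        # Accept "N" or "N-M" only; empty or non-numeric pieces are errors.
        if len(bounds) > 2 or not all(b.isdecimal() for b in bounds):
            raise ParseError("Invalid CPU id or range %r in %r" % (part, cpu_mask))
        low, high = int(bounds[0]), int(bounds[-1])
        if low > high:
            raise ParseError("Reversed CPU range %r in %r" % (part, cpu_mask))
        cpus.update(range(low, high + 1))
    return sorted(cpus)
```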

14 years ago  Add a migration type global hypervisor parameter
Iustin Pop [Thu, 15 Jul 2010 16:05:46 +0000 (18:05 +0200)]
Add a migration type global hypervisor parameter

Since the stability of live versus non-live migration depends on the
hypervisor (e.g. Xen-PVM versus Xen-HVM), we introduce a new parameter
for the mode we should use by default (if not overridden by the user, in
the opcode).

The meaning of the opcode 'live' field changes from boolean to either
None (use the hypervisor default), or one of the allowed migration
string constants. The live parameter of the TLMigrateInstance is still a
boolean, computed from the opcode field (which is no longer passed to
the TL).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
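
Illustrative sketch of the resolution described above: the opcode field is either None or a mode string, and the boolean handed on is computed from it. The constant names and values (`MIGRATION_LIVE`, `MIGRATION_NONLIVE`) and the function name are placeholders chosen for this example.

```python
# Placeholder constants; the real names and values live in the constants module.
MIGRATION_LIVE = "live"
MIGRATION_NONLIVE = "non-live"
MIGRATION_MODES = frozenset([MIGRATION_LIVE, MIGRATION_NONLIVE])


def resolve_live_flag(op_live, hv_default):
    """Compute the boolean 'live' flag from the opcode field.

    op_live is None (use the hypervisor default) or one of the mode strings.
    """
    mode = hv_default if op_live is None else op_live
    if mode not in MIGRATION_MODES:
        raise ValueError("Unknown migration mode %r" % (mode,))
    return mode == MIGRATION_LIVE
```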

14 years ago  Add test for some aspects of job queue
Michael Hanselmann [Thu, 15 Jul 2010 16:23:17 +0000 (18:23 +0200)]
Add test for some aspects of job queue

This new opcode and gnt-debug sub-command test some aspects of the
job queue, including the status of a job. The bug fixed in commit
2034c70d507 was identified using this test. A future patch will
run this test automatically from the QA scripts.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LUVerifyCluster: update _ValidateNode description
Luca Bigliardi [Thu, 15 Jul 2010 16:13:06 +0000 (17:13 +0100)]
LUVerifyCluster: update _ValidateNode description

Change _ValidateNode description to reflect what the function actually does.

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  KVM hypervisor: Use utils.ShellWriter for network script
Michael Hanselmann [Wed, 14 Jul 2010 18:15:59 +0000 (20:15 +0200)]
KVM hypervisor: Use utils.ShellWriter for network script

This patch converts hv_kvm to use utils.ShellWriter for writing
the network script. It also adds a few unittests (the first
for any hypervisor module).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Move ShellWriter class to utils
Michael Hanselmann [Wed, 14 Jul 2010 17:32:23 +0000 (19:32 +0200)]
Move ShellWriter class to utils

Also add unittest.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Rename test for utils.IgnoreProcessNotFound
Michael Hanselmann [Wed, 14 Jul 2010 17:32:55 +0000 (19:32 +0200)]
Rename test for utils.IgnoreProcessNotFound

Usually our tests are named “Test…”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>

14 years ago  jqueue: Factorize code waiting for job changes
Michael Hanselmann [Wed, 14 Jul 2010 15:29:56 +0000 (17:29 +0200)]
jqueue: Factorize code waiting for job changes

By splitting the _WaitForJobChangesHelper class into multiple smaller
classes, we gain in several places:

- Simpler code, less interaction between functions and variables
- Easy to unittest (close to 100% coverage)
- Waiting for job changes has no direct knowledge of queue anymore (it
  doesn't reference queue functions anymore, especially not private ones)
- Activate inotify only if there was no change at the beginning (and
  checking again right away to avoid race conditions)

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago  Merge remote branch 'origin/devel-2.1'
Michael Hanselmann [Tue, 13 Jul 2010 19:01:17 +0000 (21:01 +0200)]
Merge remote branch 'origin/devel-2.1'

* origin/devel-2.1:
  RAPI client: Implement old instance creation request format
  rlib2: Use constants for disk and NIC parameters

Conflicts:
test/ganeti.rapi.client_unittest.py: Trivial
test/ganeti.rapi.rlib2_unittest.py: Trivial

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  RAPI client: Implement old instance creation request format
Michael Hanselmann [Tue, 13 Jul 2010 17:45:44 +0000 (19:45 +0200)]
RAPI client: Implement old instance creation request format

Commit 8a47b4478 implemented instance creation in the RAPI client,
but it left out support for the old instance creation request format.
This patch now implements the old format as well as possible. This
will only be used when talking to clusters before Ganeti 2.1.3.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  rlib2: Use constants for disk and NIC parameters
Michael Hanselmann [Mon, 12 Jul 2010 20:16:52 +0000 (22:16 +0200)]
rlib2: Use constants for disk and NIC parameters

These constants were added in commit bd061c35, but the parsing code
was not updated. This also fixes a bug where a NIC's MAC address
wasn't used.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Use reserved documentation IPs and domains
Manuel Franceschini [Mon, 12 Jul 2010 13:44:20 +0000 (15:44 +0200)]
Use reserved documentation IPs and domains

Use RFC 5737 IP addresses and RFC 2606 domain names in all
unittests, docs, qa and docstrings.

Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
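
Illustrative sketch: a small checker for the reserved ranges and names. The three IPv4 blocks are the documentation networks from RFC 5737 and the names are those reserved by RFC 2606; the helper function names are made up for this example.

```python
import ipaddress

# RFC 5737 documentation networks (TEST-NET-1, TEST-NET-2, TEST-NET-3).
DOC_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")
]
# RFC 2606 reserved second-level domains and top-level domains.
DOC_DOMAINS = ("example.com", "example.net", "example.org")
DOC_TLDS = ("test", "example", "invalid", "localhost")


def is_documentation_ip(address):
    """True if the IPv4 address lies in one of the RFC 5737 networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in DOC_NETWORKS)


def is_documentation_name(name):
    """True if the host name falls under an RFC 2606 reserved domain."""
    name = name.rstrip(".").lower()
    if name.rsplit(".", 1)[-1] in DOC_TLDS:
        return True
    return any(name == d or name.endswith("." + d) for d in DOC_DOMAINS)
```

A check like this could be run over test fixtures to catch real addresses or host names that slip back in.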

14 years ago  Provide feedback function for all LU methods
Michael Hanselmann [Thu, 8 Jul 2010 15:21:39 +0000 (17:21 +0200)]
Provide feedback function for all LU methods

By exposing mcpu's _Feedback function (now renamed to “Log”) to LUs,
methods like ExpandNames can also write to the job execution log.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  jqueue: Setup inotify before checking for any job changes
Michael Hanselmann [Thu, 8 Jul 2010 15:09:03 +0000 (17:09 +0200)]
jqueue: Setup inotify before checking for any job changes

Since the code waiting for job changes was modified to use inotify,
a race condition between checking for changes the first time and
setting up inotify occurs. If the job is modified after the check
but before inotify is active, changes would only be noticed after
the timeout (29 seconds in most cases) expired.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
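
Illustrative sketch of the ordering that closes this race, with the notification mechanism abstracted away. The three callbacks (`check`, `start_watch`, `wait`) are hypothetical stand-ins; no inotify API is used here.

```python
def wait_for_change(check, start_watch, wait, timeout):
    """Return the change reported by check(), or None on timeout.

    check()       -> the new state, or None if nothing changed yet
    start_watch() -> arms the change notification
    wait(timeout) -> blocks; True if a notification arrived in time
    """
    # Arm the watcher first: a modification made after this point is
    # either seen by the check below or wakes up wait().
    start_watch()
    result = check()
    if result is not None:
        return result
    if wait(timeout):
        return check()
    return None
```

With the opposite order (check, then arm the watcher) a change landing between the two steps would go unnoticed until the timeout expires.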

14 years ago  cli.SubmitOpCode: Support custom job reporter
Michael Hanselmann [Thu, 8 Jul 2010 15:06:14 +0000 (17:06 +0200)]
cli.SubmitOpCode: Support custom job reporter

This is necessary to reuse SubmitOpCode while adding processing for
custom message types.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add function to format all job log messages
Michael Hanselmann [Thu, 8 Jul 2010 15:04:59 +0000 (17:04 +0200)]
Add function to format all job log messages

Just calling utils.SafeEncode on the log message failed when the entry
wasn't of the type ELOG_MESSAGE and the message wasn't a string. Now
non-message log entries are formatted using repr().

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
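
Illustrative sketch of the formatting rule described above. The value of `ELOG_MESSAGE`, the function name, and the `safe_encode` stand-in (which is not utils.SafeEncode) are assumptions for this example.

```python
ELOG_MESSAGE = "message"  # assumed value, for illustration only


def safe_encode(text):
    """Stand-in encoder: escape every non-ASCII character."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


def format_log_message(log_type, log_msg):
    """Return a printable string for any job log entry."""
    if log_type != ELOG_MESSAGE or not isinstance(log_msg, str):
        # Non-message entries can be tuples, dicts, numbers and so on.
        log_msg = repr(log_msg)
    return safe_encode(log_msg)
```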

14 years ago  baserlib: Fix feedback function
Michael Hanselmann [Thu, 8 Jul 2010 15:02:14 +0000 (17:02 +0200)]
baserlib: Fix feedback function

The feedback function is called with only one parameter, a tuple
with the message details.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Confd IPv6 support
Manuel Franceschini [Wed, 30 Jun 2010 09:55:18 +0000 (11:55 +0200)]
Confd IPv6 support

This patch series basically adds a new parameter 'family' to the constructors
of daemon.AsyncUDPSocket and confd.client.ConfdUDPClient. This enables the
users of these two classes to support IPv6.

In ganeti-confd.ConfdAsyncUDPClient a method to check the address families of
all peers is added.

Furthermore it adds unittests for the added functionality.

Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
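
Illustrative sketch of a peer family check, using only the standard library. The function names are hypothetical, and raising `ValueError` on mixed families is an assumption about how such a check might report the problem.

```python
import ipaddress
import socket


def address_family(address):
    """Map a textual IP address to socket.AF_INET or socket.AF_INET6."""
    version = ipaddress.ip_address(address).version
    return socket.AF_INET6 if version == 6 else socket.AF_INET


def common_family(peers):
    """Return the single address family shared by all peers."""
    families = {address_family(peer) for peer in peers}
    if len(families) != 1:
        raise ValueError("Peers must all use the same address family")
    return families.pop()
```

The returned value is what would be passed as the `family` argument when the UDP socket is created.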

14 years ago  LXC: Fix GetInstanceInfo()
Balazs Lecz [Thu, 8 Jul 2010 17:45:48 +0000 (18:45 +0100)]
LXC: Fix GetInstanceInfo()

Don't try to get cgroups info if instance is not running.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: Fix wording of error messages
Balazs Lecz [Thu, 8 Jul 2010 17:18:51 +0000 (18:18 +0100)]
LXC: Fix wording of error messages

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: Create per-instance log files
Balazs Lecz [Thu, 8 Jul 2010 17:12:09 +0000 (18:12 +0100)]
LXC: Create per-instance log files

This replaces the single global log file with per-instance logs.
The instance log file is not truncated when the instance is started.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Merge branch 'devel-2.1'
Iustin Pop [Fri, 9 Jul 2010 13:48:00 +0000 (15:48 +0200)]
Merge branch 'devel-2.1'

* devel-2.1:
  Enable from-repository builds on old distributions

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>

14 years ago  Enable from-repository builds on old distributions
Iustin Pop [Fri, 9 Jul 2010 13:11:39 +0000 (15:11 +0200)]
Enable from-repository builds on old distributions

… or on distributions which simply have other implementations of man,
that do not support '--warnings'.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Manuel Franceschini <livewire@google.com>

14 years ago  Introduce lib/netutils.py
Manuel Franceschini [Mon, 5 Jul 2010 16:50:39 +0000 (18:50 +0200)]
Introduce lib/netutils.py

This patch moves network utility functions to a dedicated module.

Signed-off-by: Manuel Franceschini <livewire@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add oper_vcpus instance status field
Balazs Lecz [Wed, 7 Jul 2010 18:02:26 +0000 (18:02 +0000)]
Add oper_vcpus instance status field

This introduces a new instance status field, named "oper_vcpus".
It contains the actual number of VCPUs an instance is using as
seen by the hypervisor.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: Fix GetAllInstancesInfo()
Balazs Lecz [Wed, 7 Jul 2010 16:59:06 +0000 (16:59 +0000)]
LXC: Fix GetAllInstancesInfo()

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LUNodeEvacuationStrategy: Use default iallocator
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:05:01 +0000 (15:05 +0300)]
LUNodeEvacuationStrategy: Use default iallocator

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LUCreateInstance: use cluster-wide iallocator
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:04:44 +0000 (15:04 +0300)]
LUCreateInstance: use cluster-wide iallocator

LUCreateInstance uses the cluster-wide default iallocator if no iallocator or
primary node is specified manually.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  _CheckIAllocatorOrNode unit tests
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:04:15 +0000 (15:04 +0300)]
_CheckIAllocatorOrNode unit tests

Add unit tests to check the function of _CheckIAllocatorOrNode

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add _CheckIAllocatorOrNode for common iallocator/node checks
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:03:58 +0000 (15:03 +0300)]
Add _CheckIAllocatorOrNode for common iallocator/node checks

_CheckIAllocatorOrNode will be called by LUs wishing to use an instance
allocator or a target node. It performs sanity checks and will modify the LU's
opcode's iallocator slot to use the cluster-wide allocator if
appropriate.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
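
Illustrative sketch of the kind of check described above. The exact sanity checks are not listed in the commit message, so rejecting "both given" and "neither given without a default" is an assumption; `Op`, `OpPrereqError` and the function name are stand-ins defined only for this example.

```python
class OpPrereqError(Exception):
    """Stand-in for the error raised when opcode prerequisites fail."""


class Op:
    """Minimal stand-in for an opcode with iallocator and node slots."""

    def __init__(self, iallocator=None, node=None):
        self.iallocator = iallocator
        self.node = node


def check_iallocator_or_node(op, default_iallocator):
    """Validate the iallocator/node choice, filling in the cluster default."""
    if op.iallocator is not None and op.node is not None:
        raise OpPrereqError("Specify either an iallocator or a node, not both")
    if op.iallocator is None and op.node is None:
        if not default_iallocator:
            raise OpPrereqError("No iallocator or node given and no"
                                " cluster-wide default iallocator set")
        # Fall back to the cluster-wide default by updating the opcode slot.
        op.iallocator = default_iallocator
```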

14 years ago  Document the default instance allocator in gnt-cluster.sgml
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:03:26 +0000 (15:03 +0300)]
Document the default instance allocator in gnt-cluster.sgml

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add default_iallocator cluster parameter
Apollon Oikonomopoulos [Thu, 8 Jul 2010 12:02:55 +0000 (15:02 +0300)]
Add default_iallocator cluster parameter

Add a cluster parameter to hold the iallocator that will be used by default
when required and no alternative (manually-specified iallocator or
manually-specified node(s)) is given.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: Report actual number of CPUs
Balazs Lecz [Wed, 7 Jul 2010 15:57:00 +0000 (15:57 +0000)]
LXC: Report actual number of CPUs

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Merge branch 'devel-2.1'
Luca Bigliardi [Wed, 7 Jul 2010 14:48:37 +0000 (15:48 +0100)]
Merge branch 'devel-2.1'

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Mlockall: decrease warnings if ctypes module is not present
Luca Bigliardi [Tue, 6 Jul 2010 14:28:58 +0000 (15:28 +0100)]
Mlockall: decrease warnings if ctypes module is not present

The node daemon prints a lot of warnings if the --no-mlock option is not
specified and the ctypes module is not present.

With this patch the warning is printed only at noded startup.

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Add a delay in master failover
Iustin Pop [Tue, 6 Jul 2010 12:20:15 +0000 (14:20 +0200)]
Add a delay in master failover

I have seen some very rare errors where (it seems) the address is
still live for a short while after removing it from the old master, thus
the new master will fail in startup/adding its own IP address.

To guard against this, we add a delay/retry before we proceed, if the
IP is still reachable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>
(cherry picked from commit 425f0f5470c912ff4a615d14c8b924116abe5c92)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
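
Illustrative sketch of a delay/retry loop of this kind. The reachability probe is passed in as a callback, and the attempt count and delay are arbitrary example values, not the ones used by the patch.

```python
import time


def wait_until_unreachable(is_reachable, attempts=5, delay=1.0, sleep=time.sleep):
    """Poll until the old master IP stops answering.

    is_reachable() -> True while the address still responds.
    Returns True once it is gone, False if it still answers after
    the last attempt.
    """
    for attempt in range(attempts):
        if not is_reachable():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the old master time to release the address
    return False
```

The caller can then decide whether to proceed with the failover or abort when the function returns False.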

14 years ago  LXC: Use lxc-info to get instance info
Balazs Lecz [Tue, 6 Jul 2010 17:58:09 +0000 (17:58 +0000)]
LXC: Use lxc-info to get instance info

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: add lxc.console to the generated lxc.conf file
Balazs Lecz [Mon, 5 Jul 2010 14:48:39 +0000 (14:48 +0000)]
LXC: add lxc.console to the generated lxc.conf file

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Merge branch 'devel-2.1'
Iustin Pop [Wed, 7 Jul 2010 08:06:08 +0000 (10:06 +0200)]
Merge branch 'devel-2.1'

* devel-2.1:
  QA, burnin: allow selection of reboot types
  Add a QA option to disable reboots during burnin

Conflicts:
qa/qa-sample.json (trivial)
qa/qa_cluster.py  (trivial)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

14 years ago  QA, burnin: allow selection of reboot types
Iustin Pop [Tue, 6 Jul 2010 16:52:41 +0000 (18:52 +0200)]
QA, burnin: allow selection of reboot types

After some more investigation, only the soft reboot type fails for Xen
3.4 (due to the reboot/uptime time counter). As such, it's better to
allow selective testing, since in general we do want to test these
opcodes and the command line script.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

14 years ago  Fix a typo in gnt-instance's man page
Iustin Pop [Tue, 6 Jul 2010 14:55:56 +0000 (16:55 +0200)]
Fix a typo in gnt-instance's man page

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Rework the export failure handling
Iustin Pop [Tue, 6 Jul 2010 14:49:45 +0000 (16:49 +0200)]
Rework the export failure handling

Currently, the way to signal export failures is by the return value.
This means that if a client doesn't check the values (e.g. burnin), any
failure is ignored. And this is what we've been doing forever in
burnin (not actually testing that the export is successful).

This patch changes the behaviour of ExportInstance: it will abort with
an exception for any error, and removes the custom handling from
gnt-backup. This makes the behaviour consistent for any client (e.g.
RAPI), and it prevents false positives. If, for a given instance, a
subset of disks should not be backed up, the OS scripts should handle
that case.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  gnt-instance: fix GenericManyOps
Iustin Pop [Tue, 6 Jul 2010 13:33:05 +0000 (15:33 +0200)]
gnt-instance: fix GenericManyOps

Currently, GenericManyOps ignores the actual success or failure results
from the individual jobs. We change this to return '0' (i.e. success)
only when all jobs succeeded, as many times we have just one job.

Together with the JobExecutor change, this will report failures
correctly when used with a drained queue and submit only, or when used
normally and the opcode actually fails.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
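
Illustrative sketch of deriving an exit code from per-job results, assuming they arrive as a list of (status, result) tuples with a truthy status meaning success. The function name and the failure code 1 are choices made for this example.

```python
def exit_code_from_results(results):
    """Return 0 only if every (status, result) tuple reports success."""
    # Note: all() over an empty list is True, so "no jobs" counts as success.
    return 0 if all(status for status, _ in results) else 1
```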

14 years ago  cli.JobExecutor.WaitOrShow: always return status
Iustin Pop [Tue, 6 Jul 2010 13:29:36 +0000 (15:29 +0200)]
cli.JobExecutor.WaitOrShow: always return status

Currently, for the 'wait' case, we return a list of tuples (status,
result), in the order of submitted jobs, but we don't return anything
for the no-wait case.

This patch changes the no-wait case to return a list of tuples (status,
result), where result can be either a job ID or an error message.
Processing in clients can then ignore whether we did wait or not, and
test the overall or individual status.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Fix opcode transition from WAITLOCK to RUNNING
Iustin Pop [Tue, 6 Jul 2010 13:14:08 +0000 (15:14 +0200)]
Fix opcode transition from WAITLOCK to RUNNING

With the recent changes in the job queue, an old bug surfaced: we never
serialized the status change when in NotifyStart, thus a crash of the
master would have left the job queue oblivious to the fact that the job
was actually running.

In the previous implementation, queries against the job status were
using the in-memory object, so they 'saw' and reported correctly the
running status. But the new implementation just looks at the on-disk
version, and thus didn't see this transition.

The patch also moves NotifyStart to a decorator-based version (like the
other functions), which generates a lot of churn in the diff, sorry.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  hv_chroot: use utils.GetMounts()
Balazs Lecz [Mon, 5 Jul 2010 18:27:10 +0000 (18:27 +0000)]
hv_chroot: use utils.GetMounts()

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  LXC: use utils.GetMounts()
Balazs Lecz [Mon, 5 Jul 2010 17:57:27 +0000 (17:57 +0000)]
LXC: use utils.GetMounts()

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add utils.GetMounts()
Balazs Lecz [Mon, 5 Jul 2010 16:20:17 +0000 (16:20 +0000)]
Add utils.GetMounts()

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago  Add a delay in master failover
Iustin Pop [Tue, 6 Jul 2010 12:20:15 +0000 (14:20 +0200)]
Add a delay in master failover

I have seen some very rare errors where (it seems) the address is
still live for a short while after removing it from the old master, thus
the new master will fail in startup/adding its own IP address.

To guard against this, we add a delay/retry before we proceed, if the
IP is still reachable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: René Nussbaumer <rn@google.com>

14 years ago  gnt-cluster: deal with drbd helper in init/modify/info
Luca Bigliardi [Mon, 28 Jun 2010 16:06:20 +0000 (17:06 +0100)]
gnt-cluster: deal with drbd helper in init/modify/info

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add drbd helper and storage options
Luca Bigliardi [Mon, 28 Jun 2010 16:04:47 +0000 (17:04 +0100)]
Add drbd helper and storage options

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Report drbd helper in query info LU
Luca Bigliardi [Mon, 28 Jun 2010 16:04:13 +0000 (17:04 +0100)]
Report drbd helper in query info LU

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Check and set drbd helper in set params LU
Luca Bigliardi [Mon, 28 Jun 2010 16:03:21 +0000 (17:03 +0100)]
Check and set drbd helper in set params LU

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Check and set drbd helper during bootstrap
Luca Bigliardi [Mon, 28 Jun 2010 15:51:06 +0000 (16:51 +0100)]
Check and set drbd helper during bootstrap

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add drbd_helper rpc call
Luca Bigliardi [Mon, 28 Jun 2010 15:47:40 +0000 (16:47 +0100)]
Add drbd_helper rpc call

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Check drbd usermode helper in cluster verify
Luca Bigliardi [Fri, 25 Jun 2010 10:23:25 +0000 (11:23 +0100)]
Check drbd usermode helper in cluster verify

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Set drbd usermode helper on config upgrade
Luca Bigliardi [Fri, 25 Jun 2010 10:01:10 +0000 (11:01 +0100)]
Set drbd usermode helper on config upgrade

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Generalize a recursive check on logical disks
Luca Bigliardi [Fri, 25 Jun 2010 14:39:47 +0000 (15:39 +0100)]
Generalize a recursive check on logical disks

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add drbd_usermode_helper to configuration
Luca Bigliardi [Fri, 25 Jun 2010 09:57:55 +0000 (10:57 +0100)]
Add drbd_usermode_helper to configuration

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  VerifyNode: add usermode helper reply
Luca Bigliardi [Thu, 24 Jun 2010 10:02:23 +0000 (11:02 +0100)]
VerifyNode: add usermode helper reply

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  BaseDRBD: provide a way to query usermode_helper parameter
Luca Bigliardi [Wed, 16 Jun 2010 16:59:20 +0000 (17:59 +0100)]
BaseDRBD: provide a way to query usermode_helper parameter

Signed-off-by: Luca Bigliardi <shammash@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Fix a broken commandline switch option
René Nussbaumer [Mon, 5 Jul 2010 14:31:29 +0000 (16:31 +0200)]
Fix a broken commandline switch option

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago  Add a QA option to disable reboots during burnin
Iustin Pop [Mon, 5 Jul 2010 13:37:43 +0000 (15:37 +0200)]
Add a QA option to disable reboots during burnin

Since we have seen cases where (repeated) reboots are not supported
(e.g. Xen 3.4+), we need to be able to control this in the QA
configuration.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoCheck pycurl module at configure time
Guido Trotter [Sat, 3 Jul 2010 07:34:47 +0000 (08:34 +0100)]
Check pycurl module at configure time

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoOpCreateInstance: do not require hv/be/os params
Iustin Pop [Mon, 5 Jul 2010 11:28:36 +0000 (13:28 +0200)]
OpCreateInstance: do not require hv/be/os params

It is perfectly legal to create an instance using only defaults
(although beparams will usually be passed in), so let's relax the
requirement for these three parameters.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix ganeti-rapi version string
Iustin Pop [Fri, 2 Jul 2010 15:20:51 +0000 (17:20 +0200)]
Fix ganeti-rapi version string

This was "broken" for almost a year :)

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoSilence the check-dirs check
Iustin Pop [Fri, 2 Jul 2010 15:19:22 +0000 (17:19 +0200)]
Silence the check-dirs check

The big shell fragment is just noise in the common case where it
doesn't fail.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove _CheckBooleanOpField
Iustin Pop [Thu, 1 Jul 2010 17:03:03 +0000 (19:03 +0200)]
Remove _CheckBooleanOpField

This is no longer used, and we can remove it.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRework the "type" system
Iustin Pop [Thu, 1 Jul 2010 17:01:48 +0000 (19:01 +0200)]
Rework the "type" system

This patch merges the _OP_REQP and _OP_DEFS class attributes into a
_OP_PARAMS list, which holds both. The associated unittest checks that
all opcode attributes are declared and checked, and that no LU uses the
old fields (could be removed later).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
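
As a rough illustration of the merged declaration described in this commit, the sketch below keeps required and defaulted parameters in a single list. The tuple layout, the _NO_DEFAULT sentinel, the t_bool/t_string checkers, ExampleLU and apply_params are all hypothetical names chosen for this example; the commit message only states that _OP_PARAMS holds what _OP_REQP and _OP_DEFS held before.

```python
# Hypothetical sketch, not the Ganeti implementation.
_NO_DEFAULT = object()  # sentinel marking a parameter as required


def t_bool(value):
    """Type checker: accept only real booleans."""
    return isinstance(value, bool)


def t_string(value):
    """Type checker: accept only strings."""
    return isinstance(value, str)


class ExampleLU(object):
    # One list carries the name, the default (or "required") and the check.
    _OP_PARAMS = [
        ("instance_name", _NO_DEFAULT, t_string),
        ("force", False, t_bool),
    ]


def apply_params(lu_class, op):
    """Fill in defaults and type-check the dict `op` against _OP_PARAMS."""
    result = dict(op)
    for name, default, check in lu_class._OP_PARAMS:
        if name not in result:
            if default is _NO_DEFAULT:
                raise ValueError("missing required parameter %r" % name)
            result[name] = default
        if not check(result[name]):
            raise TypeError("parameter %r failed its type check" % name)
    return result
```

With a single list, a unit test can iterate over every declared parameter and confirm that each one has a checker, which is the kind of check the commit message mentions.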

14 years agoMake _CheckDiskTemplate a valid type checker
Iustin Pop [Thu, 1 Jul 2010 17:00:41 +0000 (19:00 +0200)]
Make _CheckDiskTemplate a valid type checker

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoA few more type definitions
Iustin Pop [Thu, 1 Jul 2010 17:00:06 +0000 (19:00 +0200)]
A few more type definitions

This is to simplify the type declarations in the actual LUs.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMerge branch 'stable-2.1'
Guido Trotter [Thu, 1 Jul 2010 14:25:41 +0000 (15:25 +0100)]
Merge branch 'stable-2.1'

* stable-2.1:
  Bump up version for 2.1.5 release
  RapiClient: fix multi-authentication in Python 2.6
  Remove rapi-user and rapi-pass from qa-sample.json
  qa: fix gnt-instance modify -t drbd
  qa: shutdown instance before trying disk convert
  Fix check in gnt-instance modify -t
  Document optional ctypes dependency
  Update NEWS for the 2.1.5 release
  Pass force variant option at instance creation
  BatchCreate: get force_variant from specs not opts
  BatchCreate: set a default for force_variant

Conflicts:
INSTALL
  - merge
NEWS
  - merge
configure.ac
  - keep 2.2 version
lib/rapi/client.py
  - keep curl version

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump up version for 2.1.5 release v2.1.5
Guido Trotter [Wed, 30 Jun 2010 10:48:25 +0000 (11:48 +0100)]
Bump up version for 2.1.5 release

Also update the release date and the NEWS file.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRapiClient: fix multi-authentication in Python 2.6
Guido Trotter [Thu, 1 Jul 2010 13:36:52 +0000 (14:36 +0100)]
RapiClient: fix multi-authentication in Python 2.6

In Python 2.6 the urllib2.HTTPBasicAuthHandler has a "retried" count for
failed authentications. The handler fails after 5 of them. To solve this
we reset the handler's "retried" member variable to 0 after every
successful request. This is a bit ugly, but makes the client work again
for more than 5 requests under all versions of Python.

Note that the digest authentication handler has a reset_retry_count()
method to do this, but the method is not defined for the basic
authentication handler, so we must reset the variable itself.

This member variable is unused in 2.4 and 2.5, so the change doesn't
affect the client under older Python versions.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
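
The effect of the counter can be shown with a toy model. ToyBasicAuthHandler below is a hypothetical stand-in and not urllib2.HTTPBasicAuthHandler; the limit of 5 is taken from the commit message, and the exact comparison used by the standard library may differ. Only the "retried" attribute name comes from the text above.

```python
# Toy model of a retry counter that is never reset on success.
class ToyBasicAuthHandler(object):
    MAX_RETRIES = 5  # assumption: limit quoted in the commit message

    def __init__(self):
        self.retried = 0

    def handle_challenge(self):
        """Count one authentication challenge; fail once the limit is hit."""
        if self.retried >= self.MAX_RETRIES:
            raise RuntimeError("basic auth failed")
        self.retried += 1


def send_request(handler, reset_after_success):
    """Simulate one authenticated request that ends successfully."""
    handler.handle_challenge()
    if reset_after_success:
        # The workaround: clear the counter after every successful request.
        handler.retried = 0
    return "ok"
```

Without the reset, the sixth call to send_request on the same handler raises; with it, the handler can be reused indefinitely.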

14 years agoRemove rapi-user and rapi-pass from qa-sample.json
Guido Trotter [Thu, 1 Jul 2010 10:17:34 +0000 (11:17 +0100)]
Remove rapi-user and rapi-pass from qa-sample.json

After commit 725ec2f10019c35bafeb1aabfce6f14174bf4f46 they are unused.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoutils.OwnIpAddress: Change try/except for Python 2.4
Michael Hanselmann [Thu, 1 Jul 2010 11:45:49 +0000 (13:45 +0200)]
utils.OwnIpAddress: Change try/except for Python 2.4

Python 2.4 doesn't support “except” and “finally” in the same block.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
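
A minimal sketch of the usual workaround follows: the try/except block is nested inside a try/finally block, which Python 2.4 accepts. The function run_with_cleanup and its arguments are hypothetical and are not taken from utils.OwnIpAddress.

```python
def run_with_cleanup(action, cleanup):
    """Run action(), map ValueError to None, and always call cleanup()."""
    try:
        # Inner block: only "except" clauses.
        try:
            return action()
        except ValueError:
            return None
    finally:
        # Outer block: only "finally", so no single block mixes the two.
        cleanup()
```

The nested form behaves the same as a combined try/except/finally on newer interpreters, so it is safe to keep once Python 2.4 support is dropped.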

14 years agoRAPI client: Switch to pycURL
Michael Hanselmann [Thu, 1 Jul 2010 11:38:59 +0000 (13:38 +0200)]
RAPI client: Switch to pycURL

Currently the RAPI client uses the urllib2 and httplib modules from
Python's standard library. They're used with pyOpenSSL in a very fragile
way, and there are known issues when receiving large responses from a RAPI
server.

By switching to PycURL we leverage the power and stability of the
widely-used curl library (libcurl). This brings us much more flexibility
than before, and timeouts were easily implemented (something that would
have involved a lot of work with the built-in modules).

There's one small drawback: Programs using libcurl have to call
curl_global_init(3) (available as pycurl.global_init) while exactly one
thread is running (e.g. before other threads) and are supposed to call
curl_global_cleanup(3) (available as pycurl.global_cleanup) upon exiting.
See the manpages for details. A decorator is provided to simplify this.

Unittests for the new code are provided, increasing the test coverage of
the RAPI client from 74% to 89%.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
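
The decorator mentioned in this commit is not shown here, so the following is only a sketch of the general shape such a helper could take. The name global_init_cleanup and its two callable arguments are assumptions made for this example; they keep the code runnable without PycURL installed.

```python
import functools


def global_init_cleanup(init, cleanup):
    """Build a decorator that calls init() before and cleanup() after fn."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            init()  # must run while only one thread exists
            try:
                return fn(*args, **kwargs)
            finally:
                cleanup()  # runs even if fn raises
        return wrapper
    return decorator


# With PycURL available, the callables could be, for example:
#   init    = lambda: pycurl.global_init(pycurl.GLOBAL_ALL)
#   cleanup = pycurl.global_cleanup
```

Applied to a program's main function, this ensures initialization happens before any worker threads are started and cleanup happens on exit, as the manpages require.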

14 years agobaserlib: Use boolean type for boolean variables
Michael Hanselmann [Wed, 30 Jun 2010 15:45:45 +0000 (17:45 +0200)]
baserlib: Use boolean type for boolean variables

This does not yet fix all issues in the RAPI interface which were
introduced with the type system. More testing is needed.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoqa: fix gnt-instance modify -t drbd
Guido Trotter [Thu, 1 Jul 2010 10:01:00 +0000 (11:01 +0100)]
qa: fix gnt-instance modify -t drbd

We need to pass the secondary node name, not a dict, which is an invalid
value.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFixing Makefile.am to reflect the document move and the addition of cluster merger
René Nussbaumer [Thu, 1 Jul 2010 11:29:12 +0000 (13:29 +0200)]
Fixing Makefile.am to reflect the document move and the addition of cluster merger

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdding a user document for the use of cluster-merge
René Nussbaumer [Tue, 29 Jun 2010 13:58:16 +0000 (15:58 +0200)]
Adding a user document for the use of cluster-merge

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>