ganeti-local
14 years agoAdd "adopt" to the allowed disk parameters
Apollon Oikonomopoulos [Fri, 18 Jun 2010 14:52:05 +0000 (17:52 +0300)]
Add "adopt" to the allowed disk parameters

"adopt" was missing from bd061c3, thus breaking disk adoption.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMerge branch 'stable-2.1'
Guido Trotter [Fri, 18 Jun 2010 10:41:30 +0000 (11:41 +0100)]
Merge branch 'stable-2.1'

* stable-2.1:
  Bump up version for the 2.1.4 release
  Update NEWS about the latest 2.1 change
  Fix handling of errors from socket.gethostbyname
  Update a comment in qa-sample.json
  RAPI client: Add support for Python 2.6
  Update NEWS for Ganeti 2.1.4

Conflicts:
NEWS: keep both
configure.ac: keep the 2.2 version
qa/qa-sample.json: merge nearby changes

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump up version for the 2.1.4 release v2.1.4
Guido Trotter [Thu, 17 Jun 2010 15:09:26 +0000 (16:09 +0100)]
Bump up version for the 2.1.4 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoUpdate NEWS about the latest 2.1 change
Guido Trotter [Thu, 17 Jun 2010 17:06:22 +0000 (18:06 +0100)]
Update NEWS about the latest 2.1 change

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix handling of errors from socket.gethostbyname
Iustin Pop [Wed, 16 Jun 2010 03:16:05 +0000 (05:16 +0200)]
Fix handling of errors from socket.gethostbyname

Socket functions can raise more than just gaierror. Most of the time,
socket.gethostbyname_ex will raise gaierror, but rarely it will also
raise herror. For completeness, we catch all socket exceptions with data
of type (code, description).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
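
A minimal sketch of the idea, not the code from this commit: catch both
gaierror and herror and hand back their (code, description) data. The
resolve() helper and its "resolver" parameter are hypothetical names
used only for illustration.

```python
import socket


def resolve(hostname, resolver=socket.gethostbyname_ex):
    """Return (result, None) on success or (None, (code, description))."""
    try:
        return resolver(hostname), None
    except (socket.gaierror, socket.herror) as err:
        # Both exception types carry a (code, description) pair in args.
        return None, (err.args[0], err.args[1])
```

Passing the resolver as a parameter is only a convenience so that the
error paths can be exercised without touching the network.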

14 years agoUpdate a comment in qa-sample.json
Guido Trotter [Thu, 17 Jun 2010 16:53:56 +0000 (17:53 +0100)]
Update a comment in qa-sample.json

Fix the sentence to say what it means.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agognt-debug: remove @todo from GenericOpCodes
Guido Trotter [Thu, 17 Jun 2010 14:46:09 +0000 (15:46 +0100)]
gnt-debug: remove @todo from GenericOpCodes

- the function is not broken, and we're using it nowadays
- we have example json files and all, which show its usage
=> the todo is incorrect

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue.AddManyJobs: use AddManyTasks
Guido Trotter [Thu, 17 Jun 2010 13:02:20 +0000 (14:02 +0100)]
jqueue.AddManyJobs: use AddManyTasks

Rather than adding the jobs to the worker pool one at a time, we add
them all together, which is slightly faster, and ensures they don't get
started while we loop.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoWorkerpool.AddManyTasks: check tasks type
Guido Trotter [Thu, 17 Jun 2010 13:32:55 +0000 (14:32 +0100)]
Workerpool.AddManyTasks: check tasks type

Each task has to be a sequence, or the RunTask call will fail.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
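
A rough sketch of the two ideas (adding all tasks under one lock
acquisition and checking that each task is a sequence), assuming a
much simpler pool than the real one. TinyPool and its method names are
hypothetical; no worker threads are started here.

```python
import collections
import threading


class TinyPool(object):
    """Toy task queue; stands in for a real worker pool."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = collections.deque()

    def add_many_tasks(self, tasks):
        tasks = list(tasks)
        # Validate everything first, so a bad entry adds nothing at all.
        for task in tasks:
            if not isinstance(task, (tuple, list)):
                raise TypeError("Each task must be a sequence: %r" % (task,))
        # One lock acquisition for the whole batch: workers waiting on the
        # same lock cannot pick up the first task while we are still adding.
        with self._lock:
            self._tasks.extend(tasks)

    def pending(self):
        with self._lock:
            return len(self._tasks)
```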

14 years agocount the number of tasks done in the wp unittest
Guido Trotter [Thu, 17 Jun 2010 13:02:32 +0000 (14:02 +0100)]
count the number of tasks done in the wp unittest

Currently there's no way to know if something actually gets done.
After this check we actually test that the threads do their job.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRAPI client: Add support for Python 2.6
Michael Hanselmann [Thu, 17 Jun 2010 14:48:43 +0000 (16:48 +0200)]
RAPI client: Add support for Python 2.6

The httplib module used by urllib2 requires its sockets to have a
makefile() method to provide a file-like interface (or rather
file-in-Python-like) to the socket. PyOpenSSL doesn't implement
makefile() as the semantics require files to call dup(2) on the
underlying file descriptors, something not easily done on SSL sockets.

Python versions up to and including 2.5 have a class to simulate makefile(),
httplib.FakeSocket. With the addition of SSL support in Python 2.6, this
class was deprecated and no longer functions.

This patch adds a new, simpler wrapper class which is used in Python 2.6
and above only. It's good enough for this use.

There are general problems in these generic wrapper classes--none of
them handles SSL I/O properly. They break, for example, when the server
requests a renegotiation. This will need more work.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoBump RPC protocol version to 40
Michael Hanselmann [Thu, 17 Jun 2010 12:14:19 +0000 (14:14 +0200)]
Bump RPC protocol version to 40

Many RPC calls have changed in Ganeti 2.2, hence bumping the RPC protocol
version.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoChange ganeti-cleaner unittest to not use random values
Michael Hanselmann [Thu, 17 Jun 2010 12:12:16 +0000 (14:12 +0200)]
Change ganeti-cleaner unittest to not use random values

Using random values in unittests isn't good. This one broke exactly
when building the 2.2.0~beta0 release. I suspect there were duplicate
job IDs generated (due to $large being not so large).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate NEWS for Ganeti 2.1.4
Guido Trotter [Thu, 17 Jun 2010 11:06:36 +0000 (12:06 +0100)]
Update NEWS for Ganeti 2.1.4

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump version to 2.2.0~beta0 v2.2.0beta0
Michael Hanselmann [Thu, 17 Jun 2010 09:40:03 +0000 (11:40 +0200)]
Bump version to 2.2.0~beta0

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix parameter names in SimpleFillBE/NIC docstrings
Guido Trotter [Thu, 17 Jun 2010 10:08:53 +0000 (11:08 +0100)]
Fix parameter names in SimpleFillBE/NIC docstrings

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAsyncAwaker: use shutdown on the socketpair
Guido Trotter [Thu, 17 Jun 2010 08:42:36 +0000 (09:42 +0100)]
AsyncAwaker: use shutdown on the socketpair

This makes sure the out_socket can only be used for writing, and the
in_socket for reading.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
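
The effect can be sketched with a plain socketpair, assuming a
Unix-like system. This is an illustration only, not the AsyncAwaker
code; the variable names are chosen to match the description above.

```python
import socket

# socketpair() returns two connected sockets.
in_socket, out_socket = socket.socketpair()

# Make the pair one-directional: no writes on in_socket, no reads on
# out_socket.
in_socket.shutdown(socket.SHUT_WR)
out_socket.shutdown(socket.SHUT_RD)

out_socket.send(b"\0")          # the wake-up byte still goes through
received = in_socket.recv(1)

try:
    in_socket.send(b"\0")       # writing in the wrong direction
    wrong_direction_failed = False
except OSError:
    wrong_direction_failed = True

in_socket.close()
out_socket.close()
```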

14 years agoWorkerPool.AddManyTasks
Guido Trotter [Thu, 17 Jun 2010 08:15:17 +0000 (09:15 +0100)]
WorkerPool.AddManyTasks

Useful if we want to add many tasks at once, without contention from the
previously added one starting.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue: make replication on job update optional
Guido Trotter [Thu, 17 Jun 2010 07:44:25 +0000 (08:44 +0100)]
jqueue: make replication on job update optional

Sometimes it's useful to write to the local filesystem, but immediate
replication to all master candidates is not needed.

The _WriteAndReplicateFileUnlocked function gets renamed to
_UpdateJobQueueFile, as calling "write and replicate, but don't
replicate" seemed a bit strange.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agos/queue._GetJobInfoUnlocked/job.GetInfo/
Guido Trotter [Tue, 15 Jun 2010 11:08:41 +0000 (12:08 +0100)]
s/queue._GetJobInfoUnlocked/job.GetInfo/

The job queue currently has a static _GetJobInfoUnlocked method.
Changing it to be a normal method of _QueuedJob, which makes more sense.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAbstract loading job file from disk
Guido Trotter [Tue, 15 Jun 2010 10:17:24 +0000 (11:17 +0100)]
Abstract loading job file from disk

Move the work from _LoadJobUnlocked to _LoadJobFileFromDisk, which can
then be used in other contexts as well. Also, if we fail to deserialize
the job, archive it as well (before, we archived it only if we failed to
create the related object, but kept it there if deserialization failed).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMakefile: Add support for local Makefile additions
Michael Hanselmann [Thu, 17 Jun 2010 09:25:56 +0000 (11:25 +0200)]
Makefile: Add support for local Makefile additions

With the recent addition of a check for directories listed in Makefile,
local custom directories are always reported as unlisted. This patch
adds support for a “Makefile.local” file, which can adjust settings in
Makefile. Example: “DIRCHECK_EXCLUDE += xyz .mydata doc/manhtml”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoListVisibleFiles: do not sort output
Guido Trotter [Fri, 11 Jun 2010 20:23:33 +0000 (21:23 +0100)]
ListVisibleFiles: do not sort output

Among all users, it turns out just one *may* need the output to be sorted.
All the others can cope without.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agojqueue: simplify removal from _nodes
Guido Trotter [Mon, 14 Jun 2010 12:17:33 +0000 (13:17 +0100)]
jqueue: simplify removal from _nodes

In some places we do try/del/except and in others just pop. Using pop
everywhere saves lines of code.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
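
The two styles compared in the message above look roughly like this.
The function names and the "nodes" dict are hypothetical stand-ins for
the job queue's internal state.

```python
def remove_node_verbose(nodes, name):
    # try/del/except: four lines to ignore a missing key.
    try:
        del nodes[name]
    except KeyError:
        pass


def remove_node(nodes, name):
    # dict.pop with a default does the same in one line.
    nodes.pop(name, None)
```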

14 years agoImprove gnt-debug man page
Manuel Franceschini [Mon, 14 Jun 2010 11:59:15 +0000 (13:59 +0200)]
Improve gnt-debug man page

Signed-off-by: Manuel Franceschini <livewire@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove a TODO
Iustin Pop [Sun, 13 Jun 2010 06:05:24 +0000 (08:05 +0200)]
Remove a TODO

Since OS objects are not stored in the configuration, we cannot put
os_hvp there, therefore the TODO is obsolete…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRework LUSetInstanceParams._GetUpdatedParams
Iustin Pop [Sun, 13 Jun 2010 05:45:27 +0000 (07:45 +0200)]
Rework LUSetInstanceParams._GetUpdatedParams

Currently, this function does three things:
- special handling of constants.VALUE_DEFAULT
- type enforcing of the resulting dict
- filling the dictionary with defaults

However, the last two do not belong in this function:
- in the future, not all parameter dictionaries will be able to be
  enforced
- filling the dictionary with defaults cannot be done via a defaults
  dict in all cases, and should be done by the specialized functions
  (ideally we'd pass a partial function instance here, but we don't have
  that yet…)

As such, we remove the last two items, and move them to the callers; this is
overall the same complexity, as we were calling this function in just
three places and constructing the many arguments was also complicated.

Furthermore, we move the function out of LUSetInstanceParams, as in the
future it will be used by LUSetClusterParams too.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
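
What remains after the rework is only the special handling of the
"default" marker. A minimal sketch, assuming the marker is the string
"default" and using a hypothetical function name:

```python
VALUE_DEFAULT = "default"  # assumed value of constants.VALUE_DEFAULT


def get_updated_params(old_params, update_dict):
    """Return a copy of old_params with update_dict applied.

    A value equal to VALUE_DEFAULT removes the key, so that the
    cluster-level default applies again when the caller fills the dict.
    """
    params = dict(old_params)
    for key, value in update_dict.items():
        if value == VALUE_DEFAULT:
            params.pop(key, None)
        else:
            params[key] = value
    return params
```

Type enforcement and filling with defaults are left to the caller, as
the message describes.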

14 years agoSplit the core-OS and instance-specific env
Iustin Pop [Fri, 11 Jun 2010 00:30:11 +0000 (02:30 +0200)]
Split the core-OS and instance-specific env

Since we'll need to be able to generate the OS-specific environment
separately from the instance one, we move it to a separate function. We
also add a new OS_NAME env. var which is identical to the INSTANCE_OS
one (which won't exist for OS-only environments).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd cluster.SimpleFill*() functions
Iustin Pop [Sun, 13 Jun 2010 05:22:07 +0000 (07:22 +0200)]
Add cluster.SimpleFill*() functions

Currently, the existing cluster.Fill* functions take as argument an
instance. This means that in any case where we don't have an actual
instance object, we have to resort to calling the low-level
objects.FillDict function.

This is bad for two reasons:
- we have to know of, and we hardcode, the cluster object internals
  (e.g. that the nicparams are stored in a dict indexed by group)
- which can result in subtle bugs, if the underlying storage mechanisms
  change

This patch adds a lower-level implementation SimpleFillHV for FillHV and
SimpleFillBE for FillBE, and adds a completely new SimpleFillNIC (all
use cases until now hardcoded cluster.nicparams[constant.PP_DEFAULT]
directly); it then uses these new functions in cmdlib.py.

A side effect is that _CheckNicsBridgesExist loses the 'profile'
parameter, which was unused. If it's needed, we should add it later via
a proper profile parameter to SimpleFillNIC.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMerge branch 'devel-2.1' into master
Iustin Pop [Mon, 14 Jun 2010 18:11:38 +0000 (20:11 +0200)]
Merge branch 'devel-2.1' into master

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years agoFix a bug in instance startup with custom hvparams
Iustin Pop [Sun, 13 Jun 2010 05:19:37 +0000 (07:19 +0200)]
Fix a bug in instance startup with custom hvparams

Since the introduction of OS-specific hvparams, we shouldn't ever use
objects.FillDict directly for instances, but instead go via the cluster
object. Otherwise the os_hvp will be ignored.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
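
Why calling FillDict directly loses the OS-specific layer can be seen
in a small sketch. The layering order (cluster defaults, then os_hvp,
then instance values) follows the description above; the function
signatures and the dict layout are assumptions made for illustration.

```python
def fill_dict(defaults, custom):
    """Return a copy of defaults overridden by custom."""
    result = dict(defaults)
    result.update(custom)
    return result


def simple_fill_hv(cluster_hvparams, os_hvp, hv_name, os_name, instance_hvparams):
    """Layer cluster, OS-specific and instance hypervisor parameters."""
    cluster_level = cluster_hvparams.get(hv_name, {})
    os_level = os_hvp.get(os_name, {}).get(hv_name, {})
    return fill_dict(fill_dict(cluster_level, os_level), instance_hvparams)
```

A caller that only does fill_dict(cluster_level, instance_hvparams)
skips the middle layer, which is the bug fixed here.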

14 years agoFix unsafe variant initializer in _TryOSFromDisk
Iustin Pop [Mon, 14 Jun 2010 01:34:41 +0000 (03:34 +0200)]
Fix unsafe variant initializer in _TryOSFromDisk

In case an OS has inconsistent declarations, we might get into a case
where one node reports a valid variants list (with OS API >=15), and
another node has OS API < 15, in which case its supported_variants gets
the default value of None. This leads to the same variable having
inconsistent data types, which leads to subtle bugs later: instead of
reporting something like "Inconsistent OS API versions", the LU exits
with a run-time exception. Furthermore, in another datapath, variants is
initialized to '[]' in case of OS diagnose failures.

The patch changes _TryOSFromDisk to initialize variants to '[]' for
OS api level below 15, and changes the variants calculation in
DiagnoseOS to be more readable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMakefile: Add check for DIRS consistency
Michael Hanselmann [Mon, 14 Jun 2010 16:52:09 +0000 (18:52 +0200)]
Makefile: Add check for DIRS consistency

It's easy to forget to add a new directory to DIRS. This check should
report such inconsistencies.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDisallow DES for SSL connections
Michael Hanselmann [Mon, 14 Jun 2010 15:37:47 +0000 (17:37 +0200)]
Disallow DES for SSL connections

Older OpenSSL versions include DES-CBC3-* ciphers when specifying the
HIGH group of ciphers. Removing potentially weak ciphers from the list
of allowed ciphers ensures only strong ciphers are considered for SSL
connections.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoStart instance after creating snapshots for export
Michael Hanselmann [Mon, 14 Jun 2010 14:37:51 +0000 (16:37 +0200)]
Start instance after creating snapshots for export

This restores functionality lost in commit 387794f8. Found during
tests using QA scripts. An instance should be started after it
has been temporarily shut down for an export.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUse import/export magic for backup/import and inter-cluster moves
Michael Hanselmann [Fri, 11 Jun 2010 17:04:51 +0000 (19:04 +0200)]
Use import/export magic for backup/import and inter-cluster moves

This should prevent bugs in our code from accidentally overwriting
disks.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDisable compression for all intra-cluster imports/exports
Michael Hanselmann [Fri, 11 Jun 2010 17:01:16 +0000 (19:01 +0200)]
Disable compression for all intra-cluster imports/exports

Tests have shown that usually we're CPU-bound for intra-cluster
imports/exports. Disabling compression will help with this.

Some versions of OpenSSL, depending on the build options, also
compress transparently. This will need further work in Ganeti.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoqa_rapi: Test inter-cluster instance move script
Michael Hanselmann [Fri, 11 Jun 2010 16:57:29 +0000 (18:57 +0200)]
qa_rapi: Test inter-cluster instance move script

This test moves an instance on the same cluster and, if successful,
moves it back. While not testing a real move between two clusters,
this is certainly better than nothing.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agobackend: Add support for import/export magic
Michael Hanselmann [Fri, 11 Jun 2010 16:03:42 +0000 (18:03 +0200)]
backend: Add support for import/export magic

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Add support for a magic prefix
Michael Hanselmann [Fri, 11 Jun 2010 15:14:44 +0000 (17:14 +0200)]
import/export daemon: Add support for a magic prefix

This “magic” value will be used to ensure that we don't accidentally
connect to the wrong daemon (e.g. due to a bug), comparable to DRBD's
per-disk secret. Just depending on the SSL certificate isn't enough
as it's always per instance and not per disk.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
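
The principle can be sketched as follows, assuming a made-up wire
format in which the magic value is sent as the first line of the
stream. The real daemon's format is not described here, and both
function names are hypothetical.

```python
import io


def send_with_magic(stream, magic, payload):
    """Write the magic value as a first line, then the data."""
    stream.write(magic.encode("ascii") + b"\n")
    stream.write(payload)


def receive_checked(stream, expected_magic):
    """Read the first line and refuse the data if the magic differs."""
    line = stream.readline()
    if line != expected_magic.encode("ascii") + b"\n":
        raise ValueError("Magic value mismatch, refusing to accept data")
    return stream.read()
```

The check happens before any payload is consumed, so a connection to
the wrong daemon never writes to a disk.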

14 years agoimport/export daemon: Simplify command building
Michael Hanselmann [Fri, 11 Jun 2010 14:18:12 +0000 (16:18 +0200)]
import/export daemon: Simplify command building

Instead of appending strings, stage parts in a list. Building the "dd"
command is moved to a separate function.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
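
Staging command parts in a list might look like the sketch below. The
helper names and the choice of "dd" arguments are illustrative
assumptions, not the daemon's actual code.

```python
import shlex


def build_dd_command(input_path=None, output_path=None, block_size=1024 * 1024):
    """Return the "dd" invocation as a list of arguments."""
    parts = ["dd", "bs=%d" % block_size]
    if input_path is not None:
        parts.append("if=%s" % input_path)
    if output_path is not None:
        parts.append("of=%s" % output_path)
    return parts


def to_shell(parts):
    """Join the staged parts into one shell-safe string."""
    return " ".join(shlex.quote(part) for part in parts)
```

Keeping the parts in a list until the last moment means quoting is done
in exactly one place.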

14 years agoimport/export: Limit max length of socat options
Michael Hanselmann [Fri, 11 Jun 2010 13:17:45 +0000 (15:17 +0200)]
import/export: Limit max length of socat options

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Validate remote host/port
Michael Hanselmann [Fri, 11 Jun 2010 12:07:23 +0000 (14:07 +0200)]
import/export: Validate remote host/port

The hostname and port received from the remote cluster should
be validated, just in case.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoutils: Add function to validate service name
Michael Hanselmann [Fri, 11 Jun 2010 11:52:19 +0000 (13:52 +0200)]
utils: Add function to validate service name

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoHandle ESRCH when sending signals
Michael Hanselmann [Mon, 14 Jun 2010 12:10:56 +0000 (14:10 +0200)]
Handle ESRCH when sending signals

Upon sending signals, ESRCH can be reported when the target no
longer exists.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
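
A minimal sketch of such handling, with a hypothetical helper name. The
"kill" parameter exists only so the error path can be exercised
without a real process.

```python
import errno
import os
import signal


def send_signal(pid, signum, kill=os.kill):
    """Send a signal; return False if the target no longer exists."""
    try:
        kill(pid, signum)
    except OSError as err:
        if err.errno != errno.ESRCH:
            raise  # e.g. EPERM is still a real error
        return False
    return True
```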

14 years agoAdd missing directory from Makefile.am
Guido Trotter [Mon, 14 Jun 2010 16:47:25 +0000 (17:47 +0100)]
Add missing directory from Makefile.am

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd example gnt-debug submit-job json files
Guido Trotter [Mon, 14 Jun 2010 15:16:30 +0000 (16:16 +0100)]
Add example gnt-debug submit-job json files

These files are being used to test the job queue performance with
various changes and conditions. Adding them here for posterity.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix RpcResult.Raise error code
Iustin Pop [Sat, 12 Jun 2010 00:29:01 +0000 (02:29 +0200)]
Fix RpcResult.Raise error code

A typo in the Raise() method of rpc.RpcResult means that any remote
errors will lack an appropriate error code; this will confuse e.g. RAPI
users.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoCache a few bits of status in jqueue
Guido Trotter [Fri, 11 Jun 2010 11:25:59 +0000 (12:25 +0100)]
Cache a few bits of status in jqueue

Currently each time we submit a job we check the job queue size, and the
drained file. With this change we keep these pieces of information in
memory and don't read them from the filesystem each time.

Significant changes include:
  - The drained value can only be properly set by calling the
    appropriate cluster command "gnt-cluster queue drain/undrain" and
    not by removing/creating the file in the job queue directory. Not
    that anybody would have done it in this undocumented way before.
  - We get rid of the soft limit for the job queue, which we haven't
    ever used anyway.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojstore._ReadNumericFile: use utils.ReadFile
Guido Trotter [Wed, 9 Jun 2010 13:27:26 +0000 (14:27 +0100)]
jstore._ReadNumericFile: use utils.ReadFile

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue: Rename _queue_lock to _queue_filelock
Guido Trotter [Fri, 4 Jun 2010 15:51:33 +0000 (16:51 +0100)]
jqueue: Rename _queue_lock to _queue_filelock

The name clarifies the difference between this and the internal lock.
Also explain a bit better what it is.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoOptimize _GetJobIDsUnlocked
Guido Trotter [Fri, 11 Jun 2010 11:17:52 +0000 (12:17 +0100)]
Optimize _GetJobIDsUnlocked

Currently we sort the list of job queue files twice (once in
utils.ListVisibleFiles with sort and then later with NiceSort). We apply
the _RE_JOB_FILE regular expression twice (once in _ListJobFiles and
once in _ExtractJobID). This simplifies the code a little, and a couple
of functions performing basically the same job are collapsed.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
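
The single-pass approach can be sketched like this. The regular
expression and the numeric sort are assumptions made for the example
(the message above mentions NiceSort, which this sketch replaces with
a plain integer sort).

```python
import re

# Assumed file name pattern; the real _RE_JOB_FILE may differ.
_RE_JOB_FILE = re.compile(r"^job-([1-9]\d*)$")


def get_job_ids(filenames):
    """Extract job IDs from file names: one regex pass, one sort."""
    ids = []
    for name in filenames:
        match = _RE_JOB_FILE.match(name)
        if match:
            ids.append(int(match.group(1)))
    ids.sort()
    return ids
```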

14 years agoRemove unused parameter from function
Guido Trotter [Fri, 11 Jun 2010 11:11:11 +0000 (12:11 +0100)]
Remove unused parameter from function

This also removes the relevant pylint disable.
No point in keeping unused parameters around: if/when we need them, it's
easy to add them back.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix a TODO in _QueuedJob
Guido Trotter [Fri, 11 Jun 2010 10:34:34 +0000 (11:34 +0100)]
Fix a TODO in _QueuedJob

Rather than raising Exception use GenericError and explain a bit better
what happened.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoListVisibleFiles: do optional sorting
Guido Trotter [Wed, 9 Jun 2010 17:12:35 +0000 (18:12 +0100)]
ListVisibleFiles: do optional sorting

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoImprove import-export unittest a bit
Michael Hanselmann [Thu, 10 Jun 2010 17:02:15 +0000 (19:02 +0200)]
Improve import-export unittest a bit

- Increase timeouts from 10 to 30 seconds (this still breaks when the
  machine is busy, e.g. using bonnie++)
- Depend on only one timeout per test instead of three
- Reset variables before each test

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoTest client timeout for import-export daemon
Michael Hanselmann [Thu, 10 Jun 2010 14:25:28 +0000 (16:25 +0200)]
Test client timeout for import-export daemon

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoGenerate import-export unittest certs in parallel
Michael Hanselmann [Thu, 10 Jun 2010 14:12:30 +0000 (16:12 +0200)]
Generate import-export unittest certs in parallel

Generating certificates can be slow.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoEnforce consistency in disks and nics input dicts
Guido Trotter [Tue, 8 Jun 2010 16:40:40 +0000 (17:40 +0100)]
Enforce consistency in disks and nics input dicts

With this change unknown disk and nic parameters will be refused, rather
than silently ignored, so that one can't pass them in by mistake and not
realize what went wrong.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
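
Refusing unknown keys could be sketched as below. The set of allowed
disk parameters and the function name are illustrative assumptions,
not the list used in the code.

```python
# Hypothetical whitelist; the real set of disk parameters may differ.
ALLOWED_DISK_PARAMS = frozenset(["size", "mode", "adopt"])


def check_params(params, allowed):
    """Raise ValueError if params contains keys outside of allowed."""
    unknown = set(params) - set(allowed)
    if unknown:
        raise ValueError("Unknown parameters: %s" % ", ".join(sorted(unknown)))
```

A side effect of strict checking is that every legitimate key has to be
listed, otherwise valid input is refused too.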

14 years agoTLMigrateInstance: pass lu to _Check*
Guido Trotter [Thu, 10 Jun 2010 16:43:59 +0000 (17:43 +0100)]
TLMigrateInstance: pass lu to _Check*

The various _Check* helper functions expect an lu to be passed in, but
the TL is passed instead. This works... sometimes! :)

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove locking._CountingCondition
Guido Trotter [Wed, 9 Jun 2010 19:01:27 +0000 (20:01 +0100)]
Remove locking._CountingCondition

This class is unused and untested. We must have forgotten to remove it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRemove the job queue drain rpc call
Guido Trotter [Wed, 9 Jun 2010 11:07:25 +0000 (12:07 +0100)]
Remove the job queue drain rpc call

This call was introduced but never used. In two years.
Since it's just creating/removing a file it can also be done in simpler ways,
without a special rpc call, if/when we need it again. In the meantime,
let's give it to history.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMove fake hypervisor run dir under ganeti
Iustin Pop [Tue, 20 Apr 2010 11:35:27 +0000 (13:35 +0200)]
Move fake hypervisor run dir under ganeti

This makes it uniform with the other hypervisors.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years ago_BaseCondition: allow saving/restoring state
Guido Trotter [Wed, 9 Jun 2010 18:35:57 +0000 (19:35 +0100)]
_BaseCondition: allow saving/restoring state

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoSharedLock _acquire_restore and _release_save
Guido Trotter [Wed, 9 Jun 2010 18:18:24 +0000 (19:18 +0100)]
SharedLock _acquire_restore and _release_save

If a shared lock is used inside a condition, we need to make sure that
it's reacquired in the same way as it was originally, after the wait.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoSubmit[*each*]Pending job
Guido Trotter [Wed, 9 Jun 2010 15:32:29 +0000 (16:32 +0100)]
Submit[*each*]Pending job

This is useful so we can test both SubmitJob and SubmitManyJobs.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd unittest for ganeti-cleaner
Michael Hanselmann [Wed, 9 Jun 2010 11:40:42 +0000 (13:40 +0200)]
Add unittest for ganeti-cleaner

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoStart to prepare documentation for 2.2 release
Michael Hanselmann [Tue, 8 Jun 2010 17:05:42 +0000 (19:05 +0200)]
Start to prepare documentation for 2.2 release

- Update NEWS file
- Remove dependency on OpenSSL (pyOpenSSL remains)
- Update manpages, fix typos and other things

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agocfgupgrade: Local variable for cluster-domain-secret filename
Michael Hanselmann [Tue, 8 Jun 2010 09:25:48 +0000 (11:25 +0200)]
cfgupgrade: Local variable for cluster-domain-secret filename

This is necessary to allow cfgupgrade to work on a non-standard directory.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agognt-job auto-completion: suggest "all" too
Iustin Pop [Tue, 8 Jun 2010 18:27:35 +0000 (20:27 +0200)]
gnt-job auto-completion: suggest "all" too

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoShow formatted ETA for disk sync and import/export
Michael Hanselmann [Thu, 3 Jun 2010 17:52:46 +0000 (19:52 +0200)]
Show formatted ETA for disk sync and import/export

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agobackend: Enable export size prediction
Michael Hanselmann [Thu, 3 Jun 2010 17:51:42 +0000 (19:51 +0200)]
backend: Enable export size prediction

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Allow script to predict size
Michael Hanselmann [Thu, 3 Jun 2010 17:50:20 +0000 (19:50 +0200)]
import/export: Allow script to predict size

Once we have a size for an export (in the context of the
import/export daemon), we can provide the user with a
percentage and ETA.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Show progress updates to user
Michael Hanselmann [Wed, 2 Jun 2010 11:06:56 +0000 (13:06 +0200)]
import/export: Show progress updates to user

With this patch, we show progress updates approx. once per minute.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Record amount of data transferred
Michael Hanselmann [Wed, 26 May 2010 18:57:42 +0000 (20:57 +0200)]
import/export daemon: Record amount of data transferred

This reports the amount of data transferred and the throughput (averaged
over 60 seconds) to the master daemon. While not yet fully implemented,
once the export scripts report the expected data size, we can even provide
an ETA and percentage.
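
A minimal sketch of averaging throughput over a 60-second window, assuming the daemon periodically samples a running byte counter. The class and method names are hypothetical and not taken from the patch.

```python
import collections


class ThroughputWindow(object):
    """Keeps (timestamp, total_bytes) samples covering a sliding window."""

    def __init__(self, window=60.0):
        self._window = window
        self._samples = collections.deque()

    def add(self, timestamp, total_bytes):
        """Record the running byte total observed at the given time."""
        self._samples.append((timestamp, total_bytes))
        cutoff = timestamp - self._window
        # Drop the oldest sample while the next one still covers the window
        # start, so one baseline sample at or before the cutoff is kept.
        while len(self._samples) > 1 and self._samples[1][0] <= cutoff:
            self._samples.popleft()

    def rate(self):
        """Return bytes per second over the window, or None if unknown."""
        if len(self._samples) < 2:
            return None
        (t0, b0) = self._samples[0]
        (t1, b1) = self._samples[-1]
        if t1 <= t0:
            return None
        return (b1 - b0) / float(t1 - t0)
```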

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoensure-dirs: don't fail if no rapi log is present
Guido Trotter [Fri, 4 Jun 2010 16:20:43 +0000 (17:20 +0100)]
ensure-dirs: don't fail if no rapi log is present

Sometimes a node has never been a master or run rapi. In that case we
need to create the file (because if rapi is started later, it won't be
able to create it itself).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoIntroduce hardcoded timeouts for each RPC call
Iustin Pop [Fri, 4 Jun 2010 09:18:33 +0000 (11:18 +0200)]
Introduce hardcoded timeouts for each RPC call

This patch adds a table with per-opcode timeouts. They were chosen in an
empirical, rather than scientific, way - see the comments in lib/rpc.py.

The patch also shows how custom timeouts can be used - call_test_delay
explicitly overrides the timeout with one computed from the delay
parameters.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agohttp client: support per-request read timeout
Iustin Pop [Fri, 4 Jun 2010 08:37:05 +0000 (10:37 +0200)]
http client: support per-request read timeout

Currently, the read timeout is hardcoded in the
HttpClientRequestExecutor class. The patch changes the timeout so that
it's a per-request property, and makes the rpc.Client class pass one
explicitly in. Furthermore, we modify the rpc.RpcRunner class to support
per-call explicit timeouts.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoLet daemon-utils fix the owners for ganeti-rapi
René Nussbaumer [Thu, 3 Jun 2010 08:11:35 +0000 (10:11 +0200)]
Let daemon-utils fix the owners for ganeti-rapi

This is a workaround until we have fully switched to user separation. It fixes
the owners of directories and log files so that ganeti-rapi starts without
problems. It currently runs for every daemon, but as it operates on a
relatively small subset its impact is small.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoModify ganeti-masterd to set permission and owner of masterd-socket
René Nussbaumer [Thu, 3 Jun 2010 12:17:15 +0000 (14:17 +0200)]
Modify ganeti-masterd to set permission and owner of masterd-socket

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoLet ganeti-rapi run under a different user/group
René Nussbaumer [Wed, 2 Jun 2010 11:29:18 +0000 (13:29 +0200)]
Let ganeti-rapi run under a different user/group

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMake it possible to call utils.Daemonize with uid and gid to run as
René Nussbaumer [Wed, 2 Jun 2010 08:34:15 +0000 (10:34 +0200)]
Make it possible to call utils.Daemonize with uid and gid to run as

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdding customized user/group as configure flags
René Nussbaumer [Tue, 18 May 2010 13:03:17 +0000 (15:03 +0200)]
Adding customized user/group as configure flags

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMerge branch 'devel-2.1'
Guido Trotter [Fri, 4 Jun 2010 14:00:59 +0000 (15:00 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  _ExecuteKVMRuntime: fix hv parameter fun
  Update FinalizeMigration docstring
  LUGrowDisk: fix operation on down instances
  Allow disk operation to act on a subset of disks
  NEWS: add release date for 2.1.3
  Bump up version for the 2.1.3 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago_ExecuteKVMRuntime: fix hv parameter fun
Guido Trotter [Fri, 4 Jun 2010 13:03:43 +0000 (14:03 +0100)]
_ExecuteKVMRuntime: fix hv parameter fun

When executing the kvm runtime we were accessing a mix of the
parameters currently configured on the instance and the ones it was
started with. We were doing so without precise criteria, but quite by
chance we got it *almost* right. The only remaining issue was that when
ganeti was upgraded and some parameters were added, trying to access
them from the "old" ones caused a KeyError, since they weren't present
back when the instance was started.

To fix this:
  - We fill the startup-time dict with any new parameter
  - We provide a clear guideline on which version of the parameters to
    access, and about the fact that new parameters must have an
    instance-migration backwards compatible default
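
The first point can be sketched as follows. This is an illustration under assumed names, not the hypervisor code itself: parameters saved at startup take precedence, and anything added by a later upgrade is filled in from the defaults.

```python
def fill_startup_params(startup_params, defaults):
    """Return the startup-time parameters with new defaults filled in.

    Hypothetical helper: 'startup_params' is the dict saved when the
    instance was started, 'defaults' holds the current defaults,
    including parameters that did not exist back then.
    """
    filled = dict(defaults)
    # Values recorded at startup win over defaults.
    filled.update(startup_params)
    return filled
```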

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoUpdate FinalizeMigration docstring
Guido Trotter [Fri, 4 Jun 2010 11:05:56 +0000 (12:05 +0100)]
Update FinalizeMigration docstring

This is used not only for aborted migrations, so the docstring should
reflect that.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoLUGrowDisk: fix operation on down instances
Guido Trotter [Fri, 4 Jun 2010 10:12:41 +0000 (11:12 +0100)]
LUGrowDisk: fix operation on down instances

Currently it's impossible to grow a disk if an instance is shut down,
because the disk cannot be assembled. Now we take care of assembling
it, and shutting it down afterwards.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAllow disk operation to act on a subset of disks
Guido Trotter [Fri, 4 Jun 2010 09:27:40 +0000 (10:27 +0100)]
Allow disk operation to act on a subset of disks

If the disks= parameter is passed, we can assemble/wait for
sync/shutdown only some disks belonging to an instance, rather than all.

This is useful to only activate/sync/shutdown the affected disk when
growing it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoNEWS: add release date for 2.1.3
Guido Trotter [Thu, 3 Jun 2010 13:39:33 +0000 (14:39 +0100)]
NEWS: add release date for 2.1.3

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoutils: Add function to format seconds
Michael Hanselmann [Thu, 3 Jun 2010 18:10:05 +0000 (20:10 +0200)]
utils: Add function to format seconds

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoBump up version for the 2.1.3 release v2.1.3
Guido Trotter [Wed, 2 Jun 2010 11:13:09 +0000 (12:13 +0100)]
Bump up version for the 2.1.3 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMerge branch 'devel-2.1'
Guido Trotter [Thu, 3 Jun 2010 12:36:09 +0000 (13:36 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  TestAsyncUDPSocket: remove dead code and add test
  TestAsyncUDPSocket: test for oversized sends
  Document the check-man change
  Update NEWS for Ganeti 2.1.3
  Second attempt at fixing check-man
  Fix check-man for newer man-db
  Add RemoveDir utility function

Conflicts:
NEWS
  - trivial
test/ganeti.daemon_unittest.py
  - trivial

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoimport/export unittest: Improve logging and fix one race condition
Michael Hanselmann [Thu, 27 May 2010 15:11:13 +0000 (17:11 +0200)]
import/export unittest: Improve logging and fix one race condition

Apart from improved logging, one race condition is fixed. If
the destination's status file became available, the port would
be returned immediately, even if it was still “None”. Most of
the time it worked, but not always. Now an additional check
ensures the port evaluates to True.
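
The additional check could look roughly like this. It is a sketch with hypothetical names: `read_status` stands in for whatever loads the destination's status file and returns None until the file exists.

```python
import time


def wait_for_port(read_status, timeout=5.0, interval=0.1,
                  clock=time.time, sleep=time.sleep):
    """Poll until the status contains a usable port, then return it."""
    deadline = clock() + timeout
    while True:
        status = read_status()
        # The status file existing is not enough: the port may still be
        # None, so only return once it evaluates to True.
        if status is not None and status.get("port"):
            return status["port"]
        if clock() >= deadline:
            raise RuntimeError("Timed out waiting for listening port")
        sleep(interval)
```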

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export unittest: Test large(r) transfer
Michael Hanselmann [Wed, 26 May 2010 19:00:20 +0000 (21:00 +0200)]
import/export unittest: Test large(r) transfer

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agodaemon.AsyncAwaker
Guido Trotter [Tue, 18 May 2010 12:20:46 +0000 (13:20 +0100)]
daemon.AsyncAwaker

This new asyncore dispatcher can be used to force a thread running the
asyncore loop to awake from the select, by signaling it on one of its
selected sockets.
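
The underlying technique can be shown without asyncore, using only a socket pair and select. This is a simplified sketch with hypothetical names, not the AsyncAwaker class itself.

```python
import select
import socket


class Awaker(object):
    """Wakes up a select() call by writing to one end of a socket pair."""

    def __init__(self):
        (self._in, self._out) = socket.socketpair()

    def fileno(self):
        # Lets the object be passed directly to select.select().
        return self._in.fileno()

    def signal(self):
        """Called from another thread to interrupt the select."""
        self._out.send(b"\0")

    def drain(self):
        """Consume pending wake-up bytes so the next select blocks again."""
        self._in.recv(4096)

    def close(self):
        self._in.close()
        self._out.close()


def wait_for_wakeup(awaker, timeout):
    """Block in select until signalled; return False on timeout."""
    (readable, _, _) = select.select([awaker], [], [], timeout)
    if readable:
        awaker.drain()
        return True
    return False
```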

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoConvert ganeti-masterd's main thread to mainloop
Guido Trotter [Thu, 13 May 2010 17:35:20 +0000 (18:35 +0100)]
Convert ganeti-masterd's main thread to mainloop

Not much changes with this patch. The main loop for the IOServer is
replaced by mainloop.Run() and the main thread now uses asyncore to
handle connections to the master socket. Once it accepts them, though,
it just pushes them to the current infrastructure, and everything
proceeds as before.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoTest the new streaming daemon classes
Guido Trotter [Mon, 24 May 2010 09:36:45 +0000 (10:36 +0100)]
Test the new streaming daemon classes

Unittests cover AsyncStreamServer and AsyncTerminatedMessageStream with
both tcp and unix sockets.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agodaemon.AsyncTerminatedMessageStream
Guido Trotter [Mon, 24 May 2010 16:24:20 +0000 (17:24 +0100)]
daemon.AsyncTerminatedMessageStream

This is the counterpart of the AsyncStreamServer and can be used to handle
connected sockets returned from connected clients if the protocol is a
terminator-separated message stream. Nothing in this class is server
specific though: it can be used as a client as well, if the client is
implemented inside an asyncore daemon.
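
For illustration, splitting a receive buffer into terminator-separated messages can be done as below. The function name and the newline terminator are assumptions made for this sketch; the class itself is not reproduced here.

```python
def split_messages(buf, terminator=b"\n"):
    """Split a receive buffer into complete messages and a remainder.

    Returns (messages, rest): 'messages' are the fully received messages
    without their terminator, 'rest' is the incomplete tail that must be
    kept until more data arrives.
    """
    parts = buf.split(terminator)
    return (parts[:-1], parts[-1])
```

A reader would append each chunk received from the socket to the saved remainder, call this function, handle the complete messages and keep the new remainder for the next read.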

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agodaemon.AsyncStreamServer
Guido Trotter [Thu, 13 May 2010 17:32:25 +0000 (18:32 +0100)]
daemon.AsyncStreamServer

This is a new asyncore server which handles listening stream sockets by
calling a non-implemented function for each connection it accepts. It's
the stream-oriented cousin of the AsyncUDPSocket.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>