ganeti-local
14 years agoAdd OS parameters to cluster and instance objects
Iustin Pop [Sat, 12 Jun 2010 02:17:20 +0000 (04:17 +0200)]
Add OS parameters to cluster and instance objects

The patch also modifies the instance RPC calls to fill the osparameters
correctly with the cluster defaults, and exports the OS parameters in
the instance/OS environment.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoIntroduce an RPC call for OS parameters validation
Iustin Pop [Sat, 12 Jun 2010 02:13:29 +0000 (04:13 +0200)]
Introduce an RPC call for OS parameters validation

While we only support the 'parameters' check today, the RPC call is
generic enough that it will be able to support other checks in the future.
The backend function will both validate the parameters list (so as to
make sure we don't pass in extra parameters that the OS validation
doesn't care about) and the parameter values, via the OS verify script.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd reading of OS parameters from disk
Iustin Pop [Sat, 12 Jun 2010 02:02:06 +0000 (04:02 +0200)]
Add reading of OS parameters from disk

The patch also modifies the internal methods in LUDiagnoseOS and gnt-os
to deal with the format change of call_os_diagnose.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd os api v20 and related fields to the OS object
Iustin Pop [Sat, 12 Jun 2010 01:55:59 +0000 (03:55 +0200)]
Add os api v20 and related fields to the OS object

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoSilence a pylint warning
Iustin Pop [Wed, 16 Jun 2010 02:22:51 +0000 (04:22 +0200)]
Silence a pylint warning

The OS parameters code will bump the number of lines over 10K, and thus
we need to silence this (no, we don't want any other module to become
this big…, so we use a targeted silence only).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate the 2.2 design doc with OS parameters
Iustin Pop [Wed, 23 Jun 2010 04:29:57 +0000 (06:29 +0200)]
Update the 2.2 design doc with OS parameters

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRemove job object condition
Guido Trotter [Tue, 22 Jun 2010 09:46:05 +0000 (11:46 +0200)]
Remove job object condition

We don't need it anymore, since nobody waits on it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoParallelize WaitForJobChanges
Guido Trotter [Tue, 15 Jun 2010 16:39:30 +0000 (17:39 +0100)]
Parallelize WaitForJobChanges

As with QueryJobs, we rely on file updates rather than condition
notification to acquire job changes. In order to do that we use the
pyinotify module to watch files. This might make the client a bit slower
(pending planned improvements, such as subscription-based
WaitForJobChanges) but detaches it from the job execution.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoUpdate the job file on feedback
Guido Trotter [Tue, 22 Jun 2010 09:02:10 +0000 (11:02 +0200)]
Update the job file on feedback

This is needed to convert waitforjobchanges to use inotify and the
on-disk version and decouple it from the job queue lock. No replication
to remote nodes is done, to keep the operation fast.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoDon't lock on QueryJobs, by using the disk version
Guido Trotter [Mon, 14 Jun 2010 10:23:52 +0000 (11:23 +0100)]
Don't lock on QueryJobs, by using the disk version

We move from querying the in-memory version to loading all jobs from the
disk. Since the jobs are written/deleted on disk in an atomic manner, we
don't need to lock at all. Also, since we're just looking at the
contents of a directory, we don't need to check that the job queue is
"open".

If some jobs are removed between the time we list them and the time we
load them, we need to be able to cope: if we were asked to load those
jobs specifically, we must report the failure, but if we were just asked
to "load all" we simply do not consider them part of the "all" set,
since they were deleted.
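
A minimal sketch of the two cases described above, not the actual
jqueue code. The "job-<id>" file naming, the JSON format and all
function names here are assumptions made for illustration only.

```python
import json
import os
import tempfile


def load_job(queue_dir, job_id):
    """Return the parsed job, or None if the file vanished or is unreadable."""
    path = os.path.join(queue_dir, "job-%s" % job_id)
    try:
        with open(path) as fh:
            return json.load(fh)
    except (IOError, OSError, ValueError):
        return None


def list_job_ids(queue_dir):
    """List job IDs by looking at the directory contents only (no locking)."""
    ids = []
    for name in os.listdir(queue_dir):
        if name.startswith("job-") and name[4:].isdigit():
            ids.append(int(name[4:]))
    return sorted(ids)


def load_jobs(queue_dir, job_ids, requested_explicitly):
    """Load jobs; None marks a failure only for explicitly requested jobs."""
    jobs = [load_job(queue_dir, job_id) for job_id in job_ids]
    if requested_explicitly:
        return jobs
    # "load all": jobs deleted after listing are simply not part of "all"
    return [job for job in jobs if job is not None]


# Small demonstration: job 2 disappears between listing and loading.
queue_dir = tempfile.mkdtemp()
for job_id in (1, 2):
    with open(os.path.join(queue_dir, "job-%d" % job_id), "w") as fh:
        json.dump({"id": job_id}, fh)
listed = list_job_ids(queue_dir)
os.remove(os.path.join(queue_dir, "job-2"))
all_jobs = load_jobs(queue_dir, listed, requested_explicitly=False)
explicit_jobs = load_jobs(queue_dir, [1, 2], requested_explicitly=True)
```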

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd JobQueue.SafeLoadJobFromDisk
Guido Trotter [Tue, 22 Jun 2010 09:35:32 +0000 (11:35 +0200)]
Add JobQueue.SafeLoadJobFromDisk

This will be used to read a job file without having to deal with
exceptions from _LoadJobFromDisk.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agojqueue._LoadJobFromDisk: remove safety archival
Guido Trotter [Tue, 22 Jun 2010 09:19:47 +0000 (11:19 +0200)]
jqueue._LoadJobFromDisk: remove safety archival

Currently _LoadJobFromDisk archives job files it finds corrupted. Since
we want to use it to load files without holding locks, this could cause
a conflict: we just move the feature to _LoadJobUnlocked which is always
called with the lock held.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoAdd repetition count to the TestDelay opcode
Guido Trotter [Wed, 23 Jun 2010 07:50:26 +0000 (09:50 +0200)]
Add repetition count to the TestDelay opcode

If the repetition count is not passed or is passed as 0 we sleep exactly
one time, otherwise we sleep "repeat" times and log in between.
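
The repetition rule can be sketched as follows. This is only an
illustration: the function name and the injectable sleep/log
parameters are hypothetical, and here a message is logged before each
repeated sleep.

```python
import time


def run_test_delay(duration, repeat=0, sleep=time.sleep, log=print):
    """Sleep once when repeat is 0, otherwise sleep `repeat` times.

    Returns the number of sleeps performed.
    """
    if repeat == 0:
        sleep(duration)
        return 1
    for i in range(repeat):
        # One log line per iteration, so progress is visible between sleeps
        log("Test delay iteration %d/%d" % (i, repeat - 1))
        sleep(duration)
    return repeat
```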

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMerge branch 'devel-2.1'
Iustin Pop [Tue, 22 Jun 2010 13:25:55 +0000 (15:25 +0200)]
Merge branch 'devel-2.1'

* devel-2.1:
  Add "adopt" to the allowed disk parameters
  Improve pylintrc for pylint 0.21+
  Fix warnings with Python 2.6
  Fix a small bug introduced in cf26a87a
  Fix the type of 'valid' attribute in LUDiagnoseOS

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd "adopt" to the allowed disk parameters
Apollon Oikonomopoulos [Fri, 18 Jun 2010 14:52:05 +0000 (17:52 +0300)]
Add "adopt" to the allowed disk parameters

"adopt" was missing from bd061c3, thus breaking disk adoption.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoImprove pylintrc for pylint 0.21+
Iustin Pop [Tue, 22 Jun 2010 09:48:36 +0000 (11:48 +0200)]
Improve pylintrc for pylint 0.21+

While we'll need to update the source files too, at least this change
makes pylint 0.21 not fail on the current source tree.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix warnings with Python 2.6
Iustin Pop [Tue, 22 Jun 2010 09:38:23 +0000 (11:38 +0200)]
Fix warnings with Python 2.6

'format' is a new built-in function, and 'bytes' is a new built-in type.
We rename these to make pylint happy (and remove potential bugs).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix a small bug introduced in cf26a87a
Iustin Pop [Fri, 18 Jun 2010 12:30:48 +0000 (14:30 +0200)]
Fix a small bug introduced in cf26a87a

Commit cf26a87a added a tiny typo, which would break non-FQDN arguments
to modify node storage.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix the type of 'valid' attribute in LUDiagnoseOS
Iustin Pop [Mon, 14 Jun 2010 20:09:23 +0000 (22:09 +0200)]
Fix the type of 'valid' attribute in LUDiagnoseOS

The update of the valid status in LUDiagnoseOS says:

  valid = valid and osl and osl[0][1]

However, in Python, “True and []” evaluates to “[]” (and '[]' is what
we get for an invalid OS), and thus the valid field for an OS will be
either True or an empty list. Which is not what we want…
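
The behaviour is easy to reproduce outside of Ganeti. The helper names
below are hypothetical, and wrapping the expression in bool() is one
possible fix, not necessarily the one this patch applies.

```python
def os_valid_buggy(valid, osl):
    # "and" returns the first falsy operand, not a boolean
    return valid and osl and osl[0][1]


def os_valid_fixed(valid, osl):
    # Coerce the result so the field is always True or False
    return bool(valid and osl and osl[0][1])
```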

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd "adopt" to the allowed disk parameters
Apollon Oikonomopoulos [Fri, 18 Jun 2010 14:52:05 +0000 (17:52 +0300)]
Add "adopt" to the allowed disk parameters

"adopt" was missing from bd061c3, thus breaking disk adoption.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMerge branch 'stable-2.1'
Guido Trotter [Fri, 18 Jun 2010 10:41:30 +0000 (11:41 +0100)]
Merge branch 'stable-2.1'

* stable-2.1:
  Bump up version for the 2.1.4 release
  Update NEWS about the latest 2.1 change
  Fix handling of errors from socket.gethostbyname
  Update a comment in qa-sample.json
  RAPI client: Add support for Python 2.6
  Update NEWS for Ganeti 2.1.4

Conflicts:
NEWS: keep both
configure.ac: keep the 2.2 version
qa/qa-sample.json: merge nearby changes

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump up version for the 2.1.4 release v2.1.4
Guido Trotter [Thu, 17 Jun 2010 15:09:26 +0000 (16:09 +0100)]
Bump up version for the 2.1.4 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoUpdate NEWS about the latest 2.1 change
Guido Trotter [Thu, 17 Jun 2010 17:06:22 +0000 (18:06 +0100)]
Update NEWS about the latest 2.1 change

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix handling of errors from socket.gethostbyname
Iustin Pop [Wed, 16 Jun 2010 03:16:05 +0000 (05:16 +0200)]
Fix handling of errors from socket.gethostbyname

Socket functions can raise more than just gaierror. Most of the time,
socket.gethostbyname_ex will raise gaierror, but in rare cases it will
raise herror instead. For completeness, we catch all socket exceptions
with data of type (code, description).
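
A sketch of catching both exception types, assuming a hypothetical
ResolveError wrapper and an injectable lookup function (neither is
part of Ganeti); only socket.gaierror and socket.herror are real names
from the standard library.

```python
import socket


class ResolveError(Exception):
    """Hypothetical error carrying the (code, description) pair."""

    def __init__(self, code, description):
        Exception.__init__(self, code, description)
        self.code = code
        self.description = description


def resolve(hostname, lookup=socket.gethostbyname_ex):
    """Resolve a hostname, translating gaierror and herror alike."""
    try:
        return lookup(hostname)
    except (socket.gaierror, socket.herror) as err:
        # Both exception types carry (code, description) in err.args
        raise ResolveError(err.args[0], err.args[1])
```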

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate a comment in qa-sample.json
Guido Trotter [Thu, 17 Jun 2010 16:53:56 +0000 (17:53 +0100)]
Update a comment in qa-sample.json

Fix the sentence to say what it means.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agognt-debug: remove @todo from GenericOpCodes
Guido Trotter [Thu, 17 Jun 2010 14:46:09 +0000 (15:46 +0100)]
gnt-debug: remove @todo from GenericOpCodes

- the function is not broken, and we're using it nowadays
- we have example json files and all, which show its usage
=> the todo is incorrect

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue.AddManyJobs: use AddManyTasks
Guido Trotter [Thu, 17 Jun 2010 13:02:20 +0000 (14:02 +0100)]
jqueue.AddManyJobs: use AddManyTasks

Rather than adding the jobs to the worker pool one at a time, we add
them all together, which is slightly faster, and ensures they don't get
started while we loop.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoWorkerpool.AddManyTasks: check tasks type
Guido Trotter [Thu, 17 Jun 2010 13:32:55 +0000 (14:32 +0100)]
Workerpool.AddManyTasks: check tasks type

Each task has to be a sequence, or the RunTask call will fail.
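
A rough sketch of such a check, using a hypothetical TinyPool class
rather than the real WorkerPool: every task is validated first, and
only then are all tasks queued under a single lock acquisition.

```python
import threading
from collections import deque


class TinyPool(object):
    """Hypothetical stand-in for a worker pool's task queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = deque()

    def add_many_tasks(self, tasks):
        tasks = list(tasks)
        # Validate everything before queueing anything
        for args in tasks:
            if not isinstance(args, (tuple, list)):
                raise TypeError("Each task must be a sequence, got %r" % (args,))
        # One lock acquisition for the whole batch
        with self._lock:
            self._tasks.extend(tuple(args) for args in tasks)

    def pending(self):
        with self._lock:
            return len(self._tasks)
```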

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agocount the number of tasks done in the wp unittest
Guido Trotter [Thu, 17 Jun 2010 13:02:32 +0000 (14:02 +0100)]
count the number of tasks done in the wp unittest

Currently there's no way to know if something actually gets done.
After this check we actually test that the threads do their job.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRAPI client: Add support for Python 2.6
Michael Hanselmann [Thu, 17 Jun 2010 14:48:43 +0000 (16:48 +0200)]
RAPI client: Add support for Python 2.6

The httplib module used by urllib2 requires its sockets to have a
makefile() method to provide a file-like interface (or rather
file-in-Python-like) to the socket. PyOpenSSL doesn't implement
makefile() as the semantics require files to call dup(2) on the
underlying file descriptors, something not easily done on SSL sockets.

Python versions up to and including 2.5 have a class to simulate
makefile(), httplib.FakeSocket. With the addition of SSL support in
Python 2.6, this class was deprecated and no longer functions.

This patch adds a new, simpler wrapper class which is used in Python 2.6
and above only. It's good enough for this use.

There are general problems in these generic wrapper classes--none of
them handles SSL I/O properly. They break, for example, when the server
requests a renegotiation. This will need more work.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRAPI client: Add support for Python 2.6
Michael Hanselmann [Thu, 17 Jun 2010 14:48:43 +0000 (16:48 +0200)]
RAPI client: Add support for Python 2.6

The httplib module used by urllib2 requires its sockets to have a
makefile() method to provide a file-like interface (or rather
file-in-Python-like) to the socket. PyOpenSSL doesn't implement
makefile() as the semantics require files to call dup(2) on the
underlying file descriptors, something not easily done on SSL sockets.

Python versions up to and including 2.5 have a class to simulate
makefile(), httplib.FakeSocket. With the addition of SSL support in
Python 2.6, this class was deprecated and no longer functions.

This patch adds a new, simpler wrapper class which is used in Python 2.6
and above only. It's good enough for this use.

There are general problems in these generic wrapper classes--none of
them handles SSL I/O properly. They break, for example, when the server
requests a renegotiation. This will need more work.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoBump RPC protocol version to 40
Michael Hanselmann [Thu, 17 Jun 2010 12:14:19 +0000 (14:14 +0200)]
Bump RPC protocol version to 40

Many RPC calls have changed in Ganeti 2.2, hence bumping the RPC protocol
version.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoChange ganeti-cleaner unittest to not use random values
Michael Hanselmann [Thu, 17 Jun 2010 12:12:16 +0000 (14:12 +0200)]
Change ganeti-cleaner unittest to not use random values

Using random values in unittests isn't good. This one broke exactly
when building the 2.2.0~beta0 release. I suspect there were duplicate
job IDs generated (due to $large being not so large).

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUpdate NEWS for Ganeti 2.1.4
Guido Trotter [Thu, 17 Jun 2010 11:06:36 +0000 (12:06 +0100)]
Update NEWS for Ganeti 2.1.4

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoBump version to 2.2.0~beta0 v2.2.0beta0
Michael Hanselmann [Thu, 17 Jun 2010 09:40:03 +0000 (11:40 +0200)]
Bump version to 2.2.0~beta0

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix parameter names in SimpleFillBE/NIC docstrings
Guido Trotter [Thu, 17 Jun 2010 10:08:53 +0000 (11:08 +0100)]
Fix parameter names in SimpleFillBE/NIC docstrings

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAsyncAwaker: use shutdown on the socketpair
Guido Trotter [Thu, 17 Jun 2010 08:42:36 +0000 (09:42 +0100)]
AsyncAwaker: use shutdown on the socketpair

This makes sure the out_socket can only be used for writing, and the
in_socket for reading.
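
The idea can be sketched with the standard socket module. The helper
name is hypothetical; the behaviour of shutdown() on a socketpair is
platform dependent, so this is an illustration rather than a
guarantee.

```python
import socket


def make_one_way_pair():
    """Return (in_socket, out_socket) restricted to one direction each."""
    in_socket, out_socket = socket.socketpair()
    # The reading end gives up its ability to write...
    in_socket.shutdown(socket.SHUT_WR)
    # ...and the writing end gives up its ability to read.
    out_socket.shutdown(socket.SHUT_RD)
    return in_socket, out_socket
```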

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoWorkerPool.AddManyTasks
Guido Trotter [Thu, 17 Jun 2010 08:15:17 +0000 (09:15 +0100)]
WorkerPool.AddManyTasks

Useful if we want to add many tasks at once, without contention with the
previous one we added starting.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue: make replication on job update optional
Guido Trotter [Thu, 17 Jun 2010 07:44:25 +0000 (08:44 +0100)]
jqueue: make replication on job update optional

Sometimes it's useful to write to the local filesystem, but immediate
replication to all master candidates is not needed.

The _WriteAndReplicateFileUnlocked function gets renamed to
_UpdateJobQueueFile, as calling "write and replicate, but don't
replicate" seemed a bit strange.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agos/queue._GetJobInfoUnlocked/job.GetInfo/
Guido Trotter [Tue, 15 Jun 2010 11:08:41 +0000 (12:08 +0100)]
s/queue._GetJobInfoUnlocked/job.GetInfo/

The job queue currently has a static _GetJobInfoUnlocked method.
Changing it to be a normal method of _QueuedJob, which makes more sense.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAbstract loading job file from disk
Guido Trotter [Tue, 15 Jun 2010 10:17:24 +0000 (11:17 +0100)]
Abstract loading job file from disk

Move the work from _LoadJobUnlocked to _LoadJobFileFromDisk, which can
then be used in other contexts as well. Also, if we fail to deserialize
the job, archive it as well (before, we archived it only if we failed to
create the related object, but kept it there if deserialization failed).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoMakefile: Add support for local Makefile additions
Michael Hanselmann [Thu, 17 Jun 2010 09:25:56 +0000 (11:25 +0200)]
Makefile: Add support for local Makefile additions

With the recent addition of a check for directories listed in Makefile,
local custom directories are always reported as unlisted. This patch
adds support for a “Makefile.local” file, which can adjust settings in
Makefile. Example: “DIRCHECK_EXCLUDE += xyz .mydata doc/manhtml”.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoListVisibleFiles: do not sort output
Guido Trotter [Fri, 11 Jun 2010 20:23:33 +0000 (21:23 +0100)]
ListVisibleFiles: do not sort output

Among all users, it turns out just one *may* need the output to be sorted.
All the others can cope without.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agojqueue: simplify removal from _nodes
Guido Trotter [Mon, 14 Jun 2010 12:17:33 +0000 (13:17 +0100)]
jqueue: simplify removal from _nodes

In some places we do try/del/except and in others just pop. Using pop
everywhere saves lines of code.
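
For illustration, the two equivalent forms on a plain dict (the
function names are hypothetical):

```python
def remove_node_verbose(nodes, name):
    # Four lines to ignore a missing key
    try:
        del nodes[name]
    except KeyError:
        pass


def remove_node(nodes, name):
    # Same effect: the default value suppresses the KeyError
    nodes.pop(name, None)
```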

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoImprove gnt-debug man page
Manuel Franceschini [Mon, 14 Jun 2010 11:59:15 +0000 (13:59 +0200)]
Improve gnt-debug man page

Signed-off-by: Manuel Franceschini <livewire@google.com>
Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove a TODO
Iustin Pop [Sun, 13 Jun 2010 06:05:24 +0000 (08:05 +0200)]
Remove a TODO

Since OS objects are not stored in the configuration, we cannot put
os_hvp there, therefore the TODO is obsolete…

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoRework LUSetInstanceParams._GetUpdatedParams
Iustin Pop [Sun, 13 Jun 2010 05:45:27 +0000 (07:45 +0200)]
Rework LUSetInstanceParams._GetUpdatedParams

Currently, this function does three things:
- special handling of constants.VALUE_DEFAULT
- type enforcing of the resulting dict
- filling the dictionary with defaults

However, the last two do not belong in this function:
- in the future, not all parameter dictionaries will be able to be
  enforced
- filling the dictionary with defaults cannot be done via a defaults
  dict in all cases, and should be done by the specialized functions
  (ideally we'd pass a partial function instance here, but we don't have
  that yet…)

As such, we remove the last two items, and move them to the callers; this is
overall the same complexity, as we were calling this function in just
three places and constructing the many arguments was also complicated.

Furthermore, we move the function out of LUSetInstanceParams, as in the
future it will be used by LUSetClusterParams too.
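
What remains after the rework is only the special handling of the
default marker. A sketch under the assumption that
constants.VALUE_DEFAULT is the string "default"; the function name and
signature are illustrative, not the real ones.

```python
VALUE_DEFAULT = "default"  # assumed value of constants.VALUE_DEFAULT


def get_updated_params(old_params, update_dict):
    """Apply updates; VALUE_DEFAULT removes the key so defaults apply again.

    No type enforcement and no filling with defaults happens here; both
    are left to the caller.
    """
    params = dict(old_params)
    for key, val in update_dict.items():
        if val == VALUE_DEFAULT:
            params.pop(key, None)
        else:
            params[key] = val
    return params
```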

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoSplit the core-OS and instance-specific env
Iustin Pop [Fri, 11 Jun 2010 00:30:11 +0000 (02:30 +0200)]
Split the core-OS and instance-specific env

Since we'll need to be able to generate the OS-specific environment
separately from the instance one, we move it to a separate function. We
also add a new OS_NAME env. var which is identical to the INSTANCE_OS
one (which won't exist for OS-only environments).

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd cluster.SimpleFill*() functions
Iustin Pop [Sun, 13 Jun 2010 05:22:07 +0000 (07:22 +0200)]
Add cluster.SimpleFill*() functions

Currently, the existing cluster.Fill* functions take as argument an
instance. This means that in any case where we don't have an actual
instance object, we have to resort to calling the low-level
objects.FillDict function.

This is bad for two reasons:
- we have to know of, and we hardcode, the cluster object internals
  (e.g. that the nicparams are stored in a dict indexed by group)
- which can result in subtle bugs, if the underlying storage mechanisms
  change

This patch adds a lower-level implementation SimpleFillHV for FillHV and
SimpleFillBE for FillBE, and adds a completely new SimpleFillNIC (all
use cases until now hardcoded cluster.nicparams[constant.PP_DEFAULT]
directly); it then uses these new functions in cmdlib.py.

A side effect is that _CheckNicsBridgesExist loses the 'profile'
parameter, which was unused. If it's needed, we should add it later via
a proper profile parameter to SimpleFillNIC.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMerge branch 'devel-2.1' into master
Iustin Pop [Mon, 14 Jun 2010 18:11:38 +0000 (20:11 +0200)]
Merge branch 'devel-2.1' into master

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years agoFix a bug in instance startup with custom hvparams
Iustin Pop [Sun, 13 Jun 2010 05:19:37 +0000 (07:19 +0200)]
Fix a bug in instance startup with custom hvparams

Since the introduction of OS-specific hvparams, we shouldn't ever use
objects.FillDict directly for instances, but instead go via the cluster
object. Otherwise the os_hvp will be ignored.
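
A sketch of why the direct call loses the OS-specific layer. The
layout of os_hvp (OS name, then hypervisor name, then parameters) and
the function names are assumptions for this example only.

```python
def fill_dict(defaults, custom):
    """Return a copy of defaults overridden by custom."""
    result = dict(defaults)
    result.update(custom)
    return result


def simple_fill_hv(cluster_hvparams, os_hvp, hv_name, os_name, instance_hvparams):
    """Layer cluster defaults, then OS-specific values, then instance values."""
    base = fill_dict(cluster_hvparams.get(hv_name, {}),
                     os_hvp.get(os_name, {}).get(hv_name, {}))
    return fill_dict(base, instance_hvparams)
```

Calling fill_dict directly with the cluster defaults and the instance
values skips the middle layer, which is the kind of bug described
above.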

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoFix unsafe variant initializer in _TryOSFromDisk
Iustin Pop [Mon, 14 Jun 2010 01:34:41 +0000 (03:34 +0200)]
Fix unsafe variant initializer in _TryOSFromDisk

In case an OS has inconsistent declarations, we might get into a case
where one node reports a valid variants list (with OS API >=15), and
another node has OS API < 15, in which case its supported_variants gets
the default value of None. This leads to the same variable having
inconsistent data types, which leads to subtle bugs later: instead of
reporting something like "Inconsistent OS API versions", the LU exits
with a run-time exception. Furthermore, in another datapath, variants is
initialized to '[]' in case of OS diagnose failures.

The patch changes _TryOSFromDisk to initialize variants to '[]' for
OS api level below 15, and changes the variants calculation in
DiagnoseOS to be more readable.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoMakefile: Add check for DIRS consistency
Michael Hanselmann [Mon, 14 Jun 2010 16:52:09 +0000 (18:52 +0200)]
Makefile: Add check for DIRS consistency

It's easy to forget to add a new directory to DIRS. This check should
report such inconsistencies.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDisallow DES for SSL connections
Michael Hanselmann [Mon, 14 Jun 2010 15:37:47 +0000 (17:37 +0200)]
Disallow DES for SSL connections

Older OpenSSL versions include DES-CBC3-* ciphers when specifying the
HIGH group of ciphers. Removing potentially weak ciphers from the list
of allowed ciphers ensures only strong ciphers are considered for SSL
connections.
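
As an illustration only, the same restriction expressed with Python's
standard ssl module instead of the PyOpenSSL API used by Ganeti. The
cipher string below is an assumption, not the one from this patch.

```python
import ssl

# Assumed cipher string: start from HIGH and explicitly drop DES/3DES
CIPHER_STRING = "HIGH:!DES:!3DES"


def make_context(cipher_string=CIPHER_STRING):
    """Create a client context limited to the given cipher list."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.set_ciphers(cipher_string)
    return ctx


def enabled_cipher_names(ctx):
    """Return the names of all ciphers the context would offer."""
    return [cipher["name"] for cipher in ctx.get_ciphers()]
```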

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoStart instance after creating snapshots for export
Michael Hanselmann [Mon, 14 Jun 2010 14:37:51 +0000 (16:37 +0200)]
Start instance after creating snapshots for export

This restores functionality lost in commit 387794f8. Found during
tests using QA scripts. An instance should be started after it
has been temporarily shut down for an export.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoUse import/export magic for backup/import and inter-cluster moves
Michael Hanselmann [Fri, 11 Jun 2010 17:04:51 +0000 (19:04 +0200)]
Use import/export magic for backup/import and inter-cluster moves

This should prevent bugs in our code from accidentally overwriting
disks.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoDisable compression for all intra-cluster imports/exports
Michael Hanselmann [Fri, 11 Jun 2010 17:01:16 +0000 (19:01 +0200)]
Disable compression for all intra-cluster imports/exports

Tests have shown that usually we're CPU-bound for intra-cluster
imports/exports. Disabling compression will help with this.

Some versions of OpenSSL, depending on the build options, also
compress transparently. This will need further work in Ganeti.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoqa_rapi: Test inter-cluster instance move script
Michael Hanselmann [Fri, 11 Jun 2010 16:57:29 +0000 (18:57 +0200)]
qa_rapi: Test inter-cluster instance move script

This test moves an instance on the same cluster and, if successful,
moves it back. While not testing a real move between two clusters,
this is certainly better than nothing.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agobackend: Add support for import/export magic
Michael Hanselmann [Fri, 11 Jun 2010 16:03:42 +0000 (18:03 +0200)]
backend: Add support for import/export magic

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Add support for a magic prefix
Michael Hanselmann [Fri, 11 Jun 2010 15:14:44 +0000 (17:14 +0200)]
import/export daemon: Add support for a magic prefix

This “magic” value will be used to ensure that we don't accidentally
connect to the wrong daemon (e.g. due to a bug), comparable to DRBD's
per-disk secret. Just depending on the SSL certificate isn't enough
as it's always per instance and not per disk.
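
A minimal sketch of the idea, assuming the magic value is sent as a
single line ahead of the data. The wire format and the function names
are hypothetical and need not match what the daemon actually does.

```python
import io


def send_with_magic(stream, magic, payload):
    """Write the magic value on its own line, followed by the payload."""
    stream.write(magic.encode("ascii") + b"\n")
    stream.write(payload)


def receive_with_magic(stream, expected_magic):
    """Read and verify the magic line before accepting any data."""
    line = stream.readline()
    if line.rstrip(b"\n") != expected_magic.encode("ascii"):
        raise ValueError("Unexpected magic value, refusing to continue")
    return stream.read()


# Demonstration over an in-memory stream
_buf = io.BytesIO()
send_with_magic(_buf, "disk0-secret", b"disk data")
_buf.seek(0)
received = receive_with_magic(_buf, "disk0-secret")
```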

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Simplify command building
Michael Hanselmann [Fri, 11 Jun 2010 14:18:12 +0000 (16:18 +0200)]
import/export daemon: Simplify command building

Instead of appending strings, stage parts in a list. Building the "dd"
command is moved to a separate function.
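
The approach can be sketched like this. The helper names and the dd
arguments chosen are illustrative only and are not taken from the
daemon's code.

```python
import shlex


def build_dd_command(output_path, block_size):
    """Return the dd invocation as a list of arguments."""
    return ["dd", "of=%s" % output_path, "bs=%d" % block_size]


def build_pipeline(stages):
    """Join staged commands into one shell pipeline, quoting each part."""
    return " | ".join(" ".join(shlex.quote(arg) for arg in stage)
                      for stage in stages)
```

Keeping each stage as a list until the very end means quoting happens
in exactly one place.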

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Limit max length of socat options
Michael Hanselmann [Fri, 11 Jun 2010 13:17:45 +0000 (15:17 +0200)]
import/export: Limit max length of socat options

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Validate remote host/port
Michael Hanselmann [Fri, 11 Jun 2010 12:07:23 +0000 (14:07 +0200)]
import/export: Validate remote host/port

The hostname and port received from the remote cluster should
be validated, just in case.
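
One possible shape for such a check. The regular expression and the
function name are assumptions; the real validation rules may be
stricter or different.

```python
import re

# Assumed pattern: letters, digits, dots and dashes, not starting or
# ending with a separator
_HOST_RE = re.compile(r"^[a-zA-Z0-9]([a-zA-Z0-9.-]{0,253}[a-zA-Z0-9])?\Z")


def validate_remote_endpoint(host, port):
    """Return (host, port) with port as int, or raise ValueError."""
    if not _HOST_RE.match(host):
        raise ValueError("Invalid host name %r" % (host,))
    port = int(port)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError("Invalid port %r" % (port,))
    return host, port
```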

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoutils: Add function to validate service name
Michael Hanselmann [Fri, 11 Jun 2010 11:52:19 +0000 (13:52 +0200)]
utils: Add function to validate service name

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoHandle ESRCH when sending signals
Michael Hanselmann [Mon, 14 Jun 2010 12:10:56 +0000 (14:10 +0200)]
Handle ESRCH when sending signals

Upon sending signals, ESRCH can be reported when the target no
longer exists.
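
A sketch of the pattern using only the standard library; the function
name is hypothetical. The demonstration uses signal 0, which only
checks whether the process exists.

```python
import errno
import os
import subprocess
import sys


def send_signal_if_alive(pid, signum):
    """Send a signal; return False instead of failing if the target is gone."""
    try:
        os.kill(pid, signum)
    except OSError as err:
        if err.errno != errno.ESRCH:
            raise
        return False
    return True


# Demonstration: a child that has already exited and been reaped
_child = subprocess.Popen([sys.executable, "-c", "pass"])
_child.wait()
child_alive = send_signal_if_alive(_child.pid, 0)
self_alive = send_signal_if_alive(os.getpid(), 0)
```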

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd missing directory from Makefile.am
Guido Trotter [Mon, 14 Jun 2010 16:47:25 +0000 (17:47 +0100)]
Add missing directory from Makefile.am

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd example gnt-debug submit-job json files
Guido Trotter [Mon, 14 Jun 2010 15:16:30 +0000 (16:16 +0100)]
Add example gnt-debug submit-job json files

These files are being used to test the job queue performance with
various changes and conditions. Adding them here for posterity.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix RpcResult.Raise error code
Iustin Pop [Sat, 12 Jun 2010 00:29:01 +0000 (02:29 +0200)]
Fix RpcResult.Raise error code

A typo in the Raise() method of rpc.RpcResult means that any remote
errors will lack an appropriate error code; this will confuse e.g. RAPI
users.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoCache a few bits of status in jqueue
Guido Trotter [Fri, 11 Jun 2010 11:25:59 +0000 (12:25 +0100)]
Cache a few bits of status in jqueue

Currently each time we submit a job we check the job queue size, and the
drained file. With this change we keep these pieces of information in
memory and don't read them from the filesystem each time.

Significant changes include:
  - The drained value can only be properly set by calling the
    appropriate cluster command "gnt-cluster queue drain/undrain" and
    not by removing/creating the file in the job queue directory. Not
    that anybody would have done it in this undocumented way before.
  - We get rid of the soft limit for the job queue, which we haven't
    ever used anyway.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojstore._ReadNumericFile: use utils.ReadFile
Guido Trotter [Wed, 9 Jun 2010 13:27:26 +0000 (14:27 +0100)]
jstore._ReadNumericFile: use utils.ReadFile

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agojqueue: Rename _queue_lock to _queue_filelock
Guido Trotter [Fri, 4 Jun 2010 15:51:33 +0000 (16:51 +0100)]
jqueue: Rename _queue_lock to _queue_filelock

The name clarifies the difference between this and the internal lock.
Also explain a bit better what it is.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoOptimize _GetJobIDsUnlocked
Guido Trotter [Fri, 11 Jun 2010 11:17:52 +0000 (12:17 +0100)]
Optimize _GetJobIDsUnlocked

Currently we sort the list of job queue files twice (once in
utils.ListVisibleFiles with sort and then later with NiceSort). We also
apply the _RE_JOB_FILE regular expression twice (once in _ListJobFiles
and once in _ExtractJobID). This patch simplifies the code a little, and
collapses a couple of functions that did basically the same job.
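
A rough sketch of the single-pass approach, under the assumption that job files are named "job-<number>". The pattern and the function name below are illustrative and may differ from the real _RE_JOB_FILE and _GetJobIDsUnlocked.

```python
import re

# Assumed file name pattern; the real regular expression may differ.
JOB_FILE_RE = re.compile(r"^job-(\d+)$")


def get_job_ids(filenames, sort=True):
    """Extract job IDs, matching each file name against the regex once."""
    job_ids = []
    for name in filenames:
        match = JOB_FILE_RE.match(name)
        if match:
            job_ids.append(int(match.group(1)))
    if sort:
        # A single numeric sort, so that job 10 follows job 9.
        job_ids.sort()
    return job_ids
```

Filtering and ID extraction share one regex match, and the only sort happens at the end on the extracted IDs.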

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRemove unused parameter from function
Guido Trotter [Fri, 11 Jun 2010 11:11:11 +0000 (12:11 +0100)]
Remove unused parameter from function

This also removes the relevant pylint disable.
No point in keeping unused parameters around: if/when we need them it's
easy to add them back.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoFix a TODO in _QueuedJob
Guido Trotter [Fri, 11 Jun 2010 10:34:34 +0000 (11:34 +0100)]
Fix a TODO in _QueuedJob

Rather than raising Exception, use GenericError and explain a bit better
what happened.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoListVisibleFiles: do optional sorting
Guido Trotter [Wed, 9 Jun 2010 17:12:35 +0000 (18:12 +0100)]
ListVisibleFiles: do optional sorting

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoImprove import-export unittest a bit
Michael Hanselmann [Thu, 10 Jun 2010 17:02:15 +0000 (19:02 +0200)]
Improve import-export unittest a bit

- Increase timeouts from 10 to 30 seconds (this still breaks when the
  machine is busy, e.g. using bonnie++)
- Depend on only one timeout per test instead of three
- Reset variables before each test

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoTest client timeout for import-export daemon
Michael Hanselmann [Thu, 10 Jun 2010 14:25:28 +0000 (16:25 +0200)]
Test client timeout for import-export daemon

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoGenerate import-export unittest certs in parallel
Michael Hanselmann [Thu, 10 Jun 2010 14:12:30 +0000 (16:12 +0200)]
Generate import-export unittest certs in parallel

Generating certificates can be slow.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoEnforce consistency in disks and nics input dicts
Guido Trotter [Tue, 8 Jun 2010 16:40:40 +0000 (17:40 +0100)]
Enforce consistency in disks and nics input dicts

With this change unknown disk and nic parameters will be refused, rather
than silently ignored, so that one can't pass them in by mistake and not
realize what went wrong.
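
As an illustration of refusing unknown keys instead of ignoring them, here is a small sketch. The key sets and the helper name are hypothetical, and the real code may raise a different error type.

```python
# Hypothetical key sets; the parameters actually accepted may differ.
DISK_KEYS = frozenset(["size", "mode"])
NIC_KEYS = frozenset(["mac", "ip", "mode", "link"])


def check_keys(params, allowed, kind):
    """Return params unchanged, or raise if it contains unknown keys."""
    unknown = sorted(set(params) - allowed)
    if unknown:
        raise ValueError("Unknown %s parameter(s): %s"
                         % (kind, ", ".join(unknown)))
    return params
```

With this check, a misspelled key such as "sise" is reported immediately rather than silently dropped.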

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoTLMigrateInstance: pass lu to _Check*
Guido Trotter [Thu, 10 Jun 2010 16:43:59 +0000 (17:43 +0100)]
TLMigrateInstance: pass lu to _Check*

The various _Check* helper functions expect an lu to be passed in, but
the TL is passed instead. This works... sometimes! :)

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoRemove locking._CountingCondition
Guido Trotter [Wed, 9 Jun 2010 19:01:27 +0000 (20:01 +0100)]
Remove locking._CountingCondition

This class is unused and untested. We must have forgotten to remove it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRemove the job queue drain rpc call
Guido Trotter [Wed, 9 Jun 2010 11:07:25 +0000 (12:07 +0100)]
Remove the job queue drain rpc call

This call was introduced but has never been used. In two years.
Since it just creates/removes a file, that can also be done in simpler
ways, without a special rpc call, if/when we need it again. In the
meantime, let's consign it to history.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMove fake hypervisor run dir under ganeti
Iustin Pop [Tue, 20 Apr 2010 11:35:27 +0000 (13:35 +0200)]
Move fake hypervisor run dir under ganeti

This makes it uniform with the other hypervisors.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years ago_BaseCondition: allow saving/restoring state
Guido Trotter [Wed, 9 Jun 2010 18:35:57 +0000 (19:35 +0100)]
_BaseCondition: allow saving/restoring state

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoSharedLock _acquire_restore and _release_save
Guido Trotter [Wed, 9 Jun 2010 18:18:24 +0000 (19:18 +0100)]
SharedLock _acquire_restore and _release_save

If a shared lock is used inside a condition, we need to make sure that
it's reacquired in the same way as it was originally, after the wait.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoSubmit[*each*]Pending job
Guido Trotter [Wed, 9 Jun 2010 15:32:29 +0000 (16:32 +0100)]
Submit[*each*]Pending job

This is useful so we can test both SubmitJob and SubmitManyJobs.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoAdd unittest for ganeti-cleaner
Michael Hanselmann [Wed, 9 Jun 2010 11:40:42 +0000 (13:40 +0200)]
Add unittest for ganeti-cleaner

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoStart to prepare documentation for 2.2 release
Michael Hanselmann [Tue, 8 Jun 2010 17:05:42 +0000 (19:05 +0200)]
Start to prepare documentation for 2.2 release

- Update NEWS file
- Remove dependency on OpenSSL (pyOpenSSL remains)
- Update manpages, fix typos and other things

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agocfgupgrade: Local variable for cluster-domain-secret filename
Michael Hanselmann [Tue, 8 Jun 2010 09:25:48 +0000 (11:25 +0200)]
cfgupgrade: Local variable for cluster-domain-secret filename

This is necessary to allow cfgupgrade to work on a non-standard directory.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agognt-job auto-completion: suggest "all" too
Iustin Pop [Tue, 8 Jun 2010 18:27:35 +0000 (20:27 +0200)]
gnt-job auto-completion: suggest "all" too

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoShow formatted ETA for disk sync and import/export
Michael Hanselmann [Thu, 3 Jun 2010 17:52:46 +0000 (19:52 +0200)]
Show formatted ETA for disk sync and import/export

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agobackend: Enable export size prediction
Michael Hanselmann [Thu, 3 Jun 2010 17:51:42 +0000 (19:51 +0200)]
backend: Enable export size prediction

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Allow script to predict size
Michael Hanselmann [Thu, 3 Jun 2010 17:50:20 +0000 (19:50 +0200)]
import/export: Allow script to predict size

Once we have a size for an export (in the context of the
import/export daemon), we can provide the user with a
percentage and ETA.
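
The arithmetic behind a percentage and ETA can be sketched as follows. The function name, the units and the handling of missing values are assumptions for illustration, not the daemon's actual code.

```python
def progress(transferred_mib, total_mib, throughput_mib_s):
    """Return (percent, eta_seconds); entries are None when unknown."""
    if not total_mib or total_mib <= 0:
        # No predicted size, so neither value can be computed.
        return (None, None)
    percent = min(100.0, 100.0 * transferred_mib / total_mib)
    remaining = max(0.0, total_mib - transferred_mib)
    if throughput_mib_s <= 0:
        return (percent, None)
    return (percent, remaining / throughput_mib_s)
```

For example, 250 MiB out of a predicted 1000 MiB at 25 MiB/s gives 25% done and 30 seconds remaining.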

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export: Show progress updates to user
Michael Hanselmann [Wed, 2 Jun 2010 11:06:56 +0000 (13:06 +0200)]
import/export: Show progress updates to user

With this patch, we show progress updates approx. once per minute.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Record amount of data transferred
Michael Hanselmann [Wed, 26 May 2010 18:57:42 +0000 (20:57 +0200)]
import/export daemon: Record amount of data transferred

This reports the amount of data transferred and the throughput (averaged
over 60 seconds) to the master daemon. While this is not yet fully
implemented, once the export scripts report the expected data size, we can
even provide an ETA and percentage.
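
One way to average throughput over a 60-second window is sketched below. The class and method names are hypothetical, and the daemon may compute its average differently.

```python
from collections import deque


class ThroughputMeter:
    """Hypothetical helper averaging throughput over a sliding window."""

    def __init__(self, window=60.0):
        self._window = window
        self._samples = deque()  # (timestamp, total_bytes) pairs

    def add_sample(self, timestamp, total_bytes):
        """Record the cumulative byte count seen at the given time."""
        self._samples.append((timestamp, total_bytes))
        # Drop samples older than the window, keeping at least one.
        while (len(self._samples) > 1 and
               timestamp - self._samples[0][0] > self._window):
            self._samples.popleft()

    def throughput(self):
        """Return bytes per second over the window, or None if unknown."""
        if len(self._samples) < 2:
            return None
        (t0, b0), (t1, b1) = self._samples[0], self._samples[-1]
        if t1 <= t0:
            return None
        return (b1 - b0) / (t1 - t0)
```

Timestamps are passed in explicitly, which keeps the sketch easy to test without real clocks.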

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoensure-dirs: don't fail if no rapi log is present
Guido Trotter [Fri, 4 Jun 2010 16:20:43 +0000 (17:20 +0100)]
ensure-dirs: don't fail if no rapi log is present

Sometimes a node has never been a master or run rapi. In that case we
need to create the file (because if rapi gets started later, it won't be
able to create it itself).
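
A minimal sketch of the "create if missing" step, assuming a plain file and a fixed mode. The function name and the default mode are illustrative, and ownership changes (which need root) are left out.

```python
import os


def ensure_file(path, mode=0o600):
    """Create an empty file if it is missing, then enforce its mode."""
    if not os.path.exists(path):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, mode)
        os.close(fd)
    # Apply the mode explicitly, since the umask may have altered it.
    os.chmod(path, mode)
```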

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoIntroduce hardcoded timeouts for each RPC call
Iustin Pop [Fri, 4 Jun 2010 09:18:33 +0000 (11:18 +0200)]
Introduce hardcoded timeouts for each RPC call

This patch adds a table with per-opcode timeouts. They were chosen in an
empirical, rather than scientific, way - see the comments in lib/rpc.py.

The patch also shows how custom timeouts can be used - call_test_delay
explicitly overrides the timeout with one computed from the delay
parameters.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agohttp client: support per-request read timeout
Iustin Pop [Fri, 4 Jun 2010 08:37:05 +0000 (10:37 +0200)]
http client: support per-request read timeout

Currently, the read timeout is hardcoded in the
HttpClientRequestExecutor class. The patch changes the timeout so that
it's a per-request property, and makes the rpc.Client class pass one
explicitly in. Furthermore, we modify the rpc.RpcRunner class to support
per-call explicit timeouts.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoLet daemon-utils fix the owners for ganeti-rapi
René Nussbaumer [Thu, 3 Jun 2010 08:11:35 +0000 (10:11 +0200)]
Let daemon-utils fix the owners for ganeti-rapi

This is a workaround until we have fully switched to user separation. It fixes
the owners of directories/log files so that ganeti-rapi will start flawlessly.
Right now this is run for every daemon, but as it operates on a relatively small
subset its impact is small.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoModify ganeti-masterd to set permission and owner of masterd-socket
René Nussbaumer [Thu, 3 Jun 2010 12:17:15 +0000 (14:17 +0200)]
Modify ganeti-masterd to set permission and owner of masterd-socket

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>