ganeti-local
14 years ago Handle ESRCH when sending signals
Michael Hanselmann [Mon, 14 Jun 2010 12:10:56 +0000 (14:10 +0200)]
Handle ESRCH when sending signals

Upon sending signals, ESRCH can be reported when the target no
longer exists.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
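
A minimal sketch of the pattern described in the commit above, not the actual patch. The helper name `send_signal_if_alive` is hypothetical. It relies only on the fact that `os.kill` raises `OSError` with `errno.ESRCH` when the target process no longer exists:

```python
import errno
import os


def send_signal_if_alive(pid, signum):
    """Send signum to pid; return False if the process no longer exists."""
    try:
        os.kill(pid, signum)
    except OSError as err:
        if err.errno == errno.ESRCH:
            # The target exited before the signal could be delivered.
            return False
        # Any other error (e.g. EPERM) is still reported to the caller.
        raise
    return True
```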

14 years ago Add missing directory from Makefile.am
Guido Trotter [Mon, 14 Jun 2010 16:47:25 +0000 (17:47 +0100)]
Add missing directory from Makefile.am

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Add example gnt-debug submit-job json files
Guido Trotter [Mon, 14 Jun 2010 15:16:30 +0000 (16:16 +0100)]
Add example gnt-debug submit-job json files

These files are being used to test the job queue performance with
various changes and conditions. Adding them here for posterity.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Cache a few bits of status in jqueue
Guido Trotter [Fri, 11 Jun 2010 11:25:59 +0000 (12:25 +0100)]
Cache a few bits of status in jqueue

Currently each time we submit a job we check the job queue size, and the
drained file. With this change we keep these pieces of information in
memory and don't read them from the filesystem each time.

Significant changes include:
  - The drained value can only be properly set by calling the
    appropriate cluster command "gnt-cluster queue drain/undrain" and
    not by removing/creating the file in the job queue directory. Not
    that anybody would have done it in this undocumented way before.
  - We get rid of the soft limit for the job queue, which we haven't
    ever used anyway.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
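
The caching idea in the commit above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name, the `drain` file name and the `job-` prefix are hypothetical and not taken from the Ganeti source. It also shows why the drain flag must be changed through the proper command: a file created behind the cache's back is not noticed.

```python
import os


class QueueStatusCache:
    """Keeps the drain flag and the queue size in memory (hypothetical)."""

    def __init__(self, queue_dir, drain_filename="drain"):
        self._drain_path = os.path.join(queue_dir, drain_filename)
        # Read the filesystem once at startup, then trust the cached values.
        self.drained = os.path.exists(self._drain_path)
        self.queue_size = sum(
            1 for name in os.listdir(queue_dir) if name.startswith("job-")
        )

    def set_drained(self, drained):
        """Update the on-disk flag and the cached value together."""
        if drained:
            with open(self._drain_path, "w"):
                pass
        elif os.path.exists(self._drain_path):
            os.remove(self._drain_path)
        self.drained = drained

    def may_submit(self, hard_limit):
        """Decide from memory only; no filesystem access per submission."""
        return not self.drained and self.queue_size < hard_limit
```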

14 years ago jstore._ReadNumericFile: use utils.ReadFile
Guido Trotter [Wed, 9 Jun 2010 13:27:26 +0000 (14:27 +0100)]
jstore._ReadNumericFile: use utils.ReadFile

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago jqueue: Rename _queue_lock to _queue_filelock
Guido Trotter [Fri, 4 Jun 2010 15:51:33 +0000 (16:51 +0100)]
jqueue: Rename _queue_lock to _queue_filelock

The name clarifies the difference between this and the internal lock.
Also explain a bit better what it is.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Optimize _GetJobIDsUnlocked
Guido Trotter [Fri, 11 Jun 2010 11:17:52 +0000 (12:17 +0100)]
Optimize _GetJobIDsUnlocked

Currently we sort the list of job queue files twice (once in
utils.ListVisibleFiles with sort and then later with NiceSort), and we
apply the _RE_JOB_FILE regular expression twice (once in _ListJobFiles
and once in _ExtractJobID). This patch does each only once, which
simplifies the code a little and collapses a couple of functions that
performed basically the same job.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
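
As an illustration of the single-pass approach described above, here is a sketch under stated assumptions: the regular expression and the function name are guesses, and a plain numeric sort stands in for NiceSort. Each file name is matched once and the resulting IDs are sorted once:

```python
import re

# Assumed pattern; the real _RE_JOB_FILE may differ.
_RE_JOB_FILE = re.compile(r"^job-(\d+)$")


def extract_job_ids(filenames):
    """Return the sorted job IDs found in an unsorted list of file names."""
    job_ids = []
    for name in filenames:
        match = _RE_JOB_FILE.match(name)  # the only regex application
        if match:
            job_ids.append(int(match.group(1)))
    job_ids.sort()  # the only sort
    return job_ids
```

The input would typically come from an unsorted directory listing such as `os.listdir(queue_dir)`.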

14 years ago Remove unused parameter from function
Guido Trotter [Fri, 11 Jun 2010 11:11:11 +0000 (12:11 +0100)]
Remove unused parameter from function

This also removes the relevant pylint disable.
No point in keeping unused parameters around: if/when we need them it's
easy to add them back.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Fix a TODO in _QueuedJob
Guido Trotter [Fri, 11 Jun 2010 10:34:34 +0000 (11:34 +0100)]
Fix a TODO in _QueuedJob

Rather than raising Exception use GenericError and explain a bit better
what happened.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago ListVisibleFiles: do optional sorting
Guido Trotter [Wed, 9 Jun 2010 17:12:35 +0000 (18:12 +0100)]
ListVisibleFiles: do optional sorting

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Improve import-export unittest a bit
Michael Hanselmann [Thu, 10 Jun 2010 17:02:15 +0000 (19:02 +0200)]
Improve import-export unittest a bit

- Increase timeouts from 10 to 30 seconds (this still breaks when the
  machine is busy, e.g. using bonnie++)
- Depend on only one timeout per test instead of three
- Reset variables before each test

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Test client timeout for import-export daemon
Michael Hanselmann [Thu, 10 Jun 2010 14:25:28 +0000 (16:25 +0200)]
Test client timeout for import-export daemon

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Generate import-export unittest certs in parallel
Michael Hanselmann [Thu, 10 Jun 2010 14:12:30 +0000 (16:12 +0200)]
Generate import-export unittest certs in parallel

Generating certificates can be slow.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Remove locking._CountingCondition
Guido Trotter [Wed, 9 Jun 2010 19:01:27 +0000 (20:01 +0100)]
Remove locking._CountingCondition

This class is unused and untested. We must have forgotten to remove it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Remove the job queue drain rpc call
Guido Trotter [Wed, 9 Jun 2010 11:07:25 +0000 (12:07 +0100)]
Remove the job queue drain rpc call

This call was introduced but has not been used in two years.
Since it just creates/removes a file, the same can also be done in simpler
ways, without a special rpc call, if/when we need it again. In the meantime,
let's give it to history.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago _BaseCondition: allow saving/restoring state
Guido Trotter [Wed, 9 Jun 2010 18:35:57 +0000 (19:35 +0100)]
_BaseCondition: allow saving/restoring state

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago SharedLock _acquire_restore and _release_save
Guido Trotter [Wed, 9 Jun 2010 18:18:24 +0000 (19:18 +0100)]
SharedLock _acquire_restore and _release_save

If a shared lock is used inside a condition, we need to make sure that
it's reacquired in the same way as it was originally, after the wait.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Submit[*each*]Pending job
Guido Trotter [Wed, 9 Jun 2010 15:32:29 +0000 (16:32 +0100)]
Submit[*each*]Pending job

This is useful so we can test both SubmitJob and SubmitManyJobs.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Add unittest for ganeti-cleaner
Michael Hanselmann [Wed, 9 Jun 2010 11:40:42 +0000 (13:40 +0200)]
Add unittest for ganeti-cleaner

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Start to prepare documentation for 2.2 release
Michael Hanselmann [Tue, 8 Jun 2010 17:05:42 +0000 (19:05 +0200)]
Start to prepare documentation for 2.2 release

- Update NEWS file
- Remove dependency on OpenSSL (pyOpenSSL remains)
- Update manpages, fix typos and other things

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago cfgupgrade: Local variable for cluster-domain-secret filename
Michael Hanselmann [Tue, 8 Jun 2010 09:25:48 +0000 (11:25 +0200)]
cfgupgrade: Local variable for cluster-domain-secret filename

This is necessary to allow cfgupgrade to work on a non-standard directory.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago gnt-job auto-completion: suggest "all" too
Iustin Pop [Tue, 8 Jun 2010 18:27:35 +0000 (20:27 +0200)]
gnt-job auto-completion: suggest "all" too

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Show formatted ETA for disk sync and import/export
Michael Hanselmann [Thu, 3 Jun 2010 17:52:46 +0000 (19:52 +0200)]
Show formatted ETA for disk sync and import/export

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago backend: Enable export size prediction
Michael Hanselmann [Thu, 3 Jun 2010 17:51:42 +0000 (19:51 +0200)]
backend: Enable export size prediction

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago import/export: Allow script to predict size
Michael Hanselmann [Thu, 3 Jun 2010 17:50:20 +0000 (19:50 +0200)]
import/export: Allow script to predict size

Once we have a size for an export (in the context of the
import/export daemon), we can provide the user with a
percentage and ETA.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago import/export: Show progress updates to user
Michael Hanselmann [Wed, 2 Jun 2010 11:06:56 +0000 (13:06 +0200)]
import/export: Show progress updates to user

With this patch, we show progress updates approx. once per minute.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago import/export daemon: Record amount of data transferred
Michael Hanselmann [Wed, 26 May 2010 18:57:42 +0000 (20:57 +0200)]
import/export daemon: Record amount of data transferred

This reports the amount of data transferred and the throughput (averaged
over 60 seconds) to the master daemon. While not yet fully implemented,
once the export scripts report the expected data size, we can even provide
an ETA and percentage.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
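
The reporting described above (throughput averaged over 60 seconds, plus percentage and ETA once the expected size is known) can be sketched as follows. This is an illustrative model only; the class and method names are hypothetical and the daemon's real bookkeeping may differ:

```python
from collections import deque


class TransferStats:
    """Tracks transferred bytes and averages throughput over a time window."""

    def __init__(self, window=60.0):
        self._window = window
        self._samples = deque()  # (timestamp, total_bytes_so_far)

    def record(self, timestamp, total_bytes):
        self._samples.append((timestamp, total_bytes))
        # Drop samples that fell out of the window, keeping one as baseline.
        while (len(self._samples) > 1 and
               self._samples[1][0] <= timestamp - self._window):
            self._samples.popleft()

    def throughput(self):
        """Bytes per second over the window, or None if not yet known."""
        if len(self._samples) < 2:
            return None
        (t0, b0), (t1, b1) = self._samples[0], self._samples[-1]
        if t1 <= t0:
            return None
        return (b1 - b0) / (t1 - t0)

    def progress(self, expected_size):
        """Return (percent, eta_seconds), or None if it cannot be computed."""
        rate = self.throughput()
        if not expected_size or not rate:
            return None
        done = self._samples[-1][1]
        percent = 100.0 * done / expected_size
        eta = max(expected_size - done, 0) / rate
        return percent, eta
```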

14 years ago ensure-dirs: don't fail if no rapi log is present
Guido Trotter [Fri, 4 Jun 2010 16:20:43 +0000 (17:20 +0100)]
ensure-dirs: don't fail if no rapi log is present

Sometimes a node has never been a master, or has never run rapi. In that
case we need to create the file (because if rapi gets started later, it
won't be able to create it itself).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Introduce harcdoded timeouts for each RPC call
Iustin Pop [Fri, 4 Jun 2010 09:18:33 +0000 (11:18 +0200)]
Introduce harcdoded timeouts for each RPC call

This patch adds a table with per-opcode timeouts. They were chosen in an
empirical, rather than scientific, way - see the comments in lib/rpc.py.

The patch also shows how custom timeouts can be used - call_test_delay
explicitly overrides the timeout with one computed from the delay
parameters.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago http client: support per-request read timeout
Iustin Pop [Fri, 4 Jun 2010 08:37:05 +0000 (10:37 +0200)]
http client: support per-request read timeout

Currently, the read timeout is hardcoded in the
HttpClientRequestExecutor class. The patch changes the timeout so that
it's a per-request property, and makes the rpc.Client class pass one
explicitly in. Furthermore, we modify the rpc.RpcRunner class to support
per-call explicit timeouts.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Let daemon-utils fix the owners for ganeti-rapi
René Nussbaumer [Thu, 3 Jun 2010 08:11:35 +0000 (10:11 +0200)]
Let daemon-utils fix the owners for ganeti-rapi

This is a workaround until we have fully switched to user separation. It fixes
the owners of directories/log files so ganeti-rapi will start flawlessly. Right
now this is run for every daemon, but as it operates on a relatively small
subset its impact is small.

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Modify ganeti-masterd to set permission and owner of masterd-socket
René Nussbaumer [Thu, 3 Jun 2010 12:17:15 +0000 (14:17 +0200)]
Modify ganeti-masterd to set permission and owner of masterd-socket

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Let ganeti-rapi run under a different user/group
René Nussbaumer [Wed, 2 Jun 2010 11:29:18 +0000 (13:29 +0200)]
Let ganeti-rapi run under a different user/group

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Make it possible to call utils.Daemonize with uid and gid to run as
René Nussbaumer [Wed, 2 Jun 2010 08:34:15 +0000 (10:34 +0200)]
Make it possible to call utils.Daemonize with uid and gid to run as

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Adding customized user/group as configure flags
René Nussbaumer [Tue, 18 May 2010 13:03:17 +0000 (15:03 +0200)]
Adding customized user/group as configure flags

Signed-off-by: René Nussbaumer <rn@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Merge branch 'devel-2.1'
Guido Trotter [Fri, 4 Jun 2010 14:00:59 +0000 (15:00 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  _ExecuteKVMRuntime: fix hv parameter fun
  Update FinalizeMigration docstring
  LUGrowDisk: fix operation on down instances
  Allow disk operation to act on a subset of disks
  NEWS: add release date for 2.1.3
  Bump up version for the 2.1.3 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago _ExecuteKVMRuntime: fix hv parameter fun
Guido Trotter [Fri, 4 Jun 2010 13:03:43 +0000 (14:03 +0100)]
_ExecuteKVMRuntime: fix hv parameter fun

When executing the kvm runtime we were accessing a mix of the
parameters as currently configured on the instance and the ones it was
started with. We were doing it without a precise criterion, but quite by
chance we got it *almost* right. The only remaining issue was that when
ganeti was upgraded and some parameters were added, trying to access
them from the "old" ones caused a KeyError, since they weren't present
back when the instance was started.

To fix this:
  - We fill the startup-time dict with any new parameter
  - We provide a clear guideline on which version of the parameters to
    access, and about the fact that new parameters must have an
    instance-migration backwards compatible default

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
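
The first part of the fix above, filling the startup-time dict with any new parameter, amounts to a defaults merge. A minimal sketch, with a hypothetical function name and example parameter names chosen only for illustration:

```python
def fill_startup_params(startup_params, defaults):
    """Return the startup-time parameters, completed with new defaults.

    Values recorded when the instance was started always win; parameters
    added by a later upgrade get their (backwards compatible) default, so
    looking them up no longer raises KeyError.
    """
    filled = dict(defaults)
    filled.update(startup_params)
    return filled
```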

14 years ago Update FinalizeMigration docstring
Guido Trotter [Fri, 4 Jun 2010 11:05:56 +0000 (12:05 +0100)]
Update FinalizeMigration docstring

This is used not only for aborted migrations, so the docstring should
reflect that.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago LUGrowDisk: fix operation on down instances
Guido Trotter [Fri, 4 Jun 2010 10:12:41 +0000 (11:12 +0100)]
LUGrowDisk: fix operation on down instances

Currently it's impossible to grow a disk if an instance is shut down,
because the disk could not be assembled. Now we take care of assembling
it, and shutting it down afterwards.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Allow disk operation to act on a subset of disks
Guido Trotter [Fri, 4 Jun 2010 09:27:40 +0000 (10:27 +0100)]
Allow disk operation to act on a subset of disks

If the disks= parameter is passed, we can assemble/wait for
sync/shutdown only some disks belonging to an instance, rather than all.

This is useful to only activate/sync/shutdown the affected disk when
growing it.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago NEWS: add release date for 2.1.3
Guido Trotter [Thu, 3 Jun 2010 13:39:33 +0000 (14:39 +0100)]
NEWS: add release date for 2.1.3

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago utils: Add function to format seconds
Michael Hanselmann [Thu, 3 Jun 2010 18:10:05 +0000 (20:10 +0200)]
utils: Add function to format seconds

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
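
The commit above does not show the output format, so the following is only a guess at what a seconds formatter for ETA display could look like. The function name and the "1h 2m 5s" style are assumptions:

```python
def format_seconds(secs):
    """Format a duration in seconds as e.g. '1h 2m 5s' (assumed style)."""
    secs = int(round(secs))
    parts = []
    for unit, size in (("d", 86400), ("h", 3600), ("m", 60)):
        count, secs = divmod(secs, size)
        if count:
            parts.append("%d%s" % (count, unit))
    # Seconds are always shown, so that zero formats as '0s'.
    parts.append("%ds" % secs)
    return " ".join(parts)
```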

14 years ago Bump up version for the 2.1.3 release v2.1.3
Guido Trotter [Wed, 2 Jun 2010 11:13:09 +0000 (12:13 +0100)]
Bump up version for the 2.1.3 release

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Merge branch 'devel-2.1'
Guido Trotter [Thu, 3 Jun 2010 12:36:09 +0000 (13:36 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  TestAsyncUDPSocket: remove dead code and add test
  TestAsyncUDPSocket: test for oversized sends
  Document the check-man change
  Update NEWS for Ganeti 2.1.3
  Second attempt at fixing check-man
  Fix check-man for newer man-db
  Add RemoveDir utility function

Conflicts:
NEWS
  - trivial
test/ganeti.daemon_unittest.py
  - trivial

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago import/export unittest: Improve logging and fix one race condition
Michael Hanselmann [Thu, 27 May 2010 15:11:13 +0000 (17:11 +0200)]
import/export unittest: Improve logging and fix one race condition

Apart from improved logging, one race condition is fixed. If
the destination's status file became available, the port would
be returned immediately, even if it was still “None”. Most of
the time it worked, but not always. Now an additional check
ensures the port evaluates to True.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago import/export unittest: Test large(r) transfer
Michael Hanselmann [Wed, 26 May 2010 19:00:20 +0000 (21:00 +0200)]
import/export unittest: Test large(r) transfer

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago daemon.AsyncAwaker
Guido Trotter [Tue, 18 May 2010 12:20:46 +0000 (13:20 +0100)]
daemon.AsyncAwaker

This new asyncore dispatcher can be used to force a thread running the
asyncore loop to awake from the select, by signaling it on one of its
selected sockets.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
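
The wake-up technique described above can be sketched without asyncore (which newer Python versions no longer ship) by using a socket pair and `select` directly. The class below is a hypothetical stand-in, not the AsyncAwaker implementation: one end of the pair is watched by the select loop, and another thread writes a byte to the other end to make the select return.

```python
import select
import socket


class Awaker:
    """Lets another thread wake up a loop blocked in select()."""

    def __init__(self):
        self._recv, self._send = socket.socketpair()
        self._recv.setblocking(False)

    def fileno(self):
        # Allows passing the object itself to select.select().
        return self._recv.fileno()

    def signal(self):
        """Called from any thread to wake up the selecting thread."""
        self._send.send(b"\0")

    def drain(self):
        """Called by the loop after waking up, to consume pending bytes."""
        try:
            while self._recv.recv(4096):
                pass
        except BlockingIOError:
            pass

    def close(self):
        self._recv.close()
        self._send.close()
```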

14 years ago Convert ganeti-masterd's main thread to mainloop
Guido Trotter [Thu, 13 May 2010 17:35:20 +0000 (18:35 +0100)]
Convert ganeti-masterd's main thread to mainloop

Not much changes with this patch. The main loop for the IOServer is
replaced by mainloop.Run() and the main thread now uses asyncore to
handle connections to the master socket. Once it accepts them, though,
it just pushes them to the current infrastructure, and everything
proceeds as before.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Test the new streaming daemon classes
Guido Trotter [Mon, 24 May 2010 09:36:45 +0000 (10:36 +0100)]
Test the new streaming daemon classes

Unittests cover AsyncStreamServer and AsyncTerminatedMessageStream with
both tcp and unix sockets.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago daemon.AsyncTerminatedMessageStream
Guido Trotter [Mon, 24 May 2010 16:24:20 +0000 (17:24 +0100)]
daemon.AsyncTerminatedMessageStream

This is the counterpart of the AsyncStreamServer. It can be used to handle
connected sockets returned from connected clients if the protocol is a
terminator-separated message stream. Nothing in this class is server
specific though: it can be used as a client as well, if the client is
implemented inside an asyncore daemon.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
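
The core of handling a terminator-separated message stream is buffering partial reads and splitting off complete messages. The sketch below shows only that part, with a hypothetical class name; the socket and asyncore plumbing of the real class is left out:

```python
class TerminatedMessageBuffer:
    """Accumulates stream data and yields complete, terminated messages."""

    def __init__(self, terminator):
        self._terminator = terminator
        self._buffer = b""

    def feed(self, data):
        """Add received bytes; return the list of messages now complete."""
        self._buffer += data
        # Everything before the last terminator is complete; the remainder
        # (possibly empty) stays buffered until more data arrives.
        *messages, self._buffer = self._buffer.split(self._terminator)
        return messages
```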

14 years ago daemon.AsyncStreamServer
Guido Trotter [Thu, 13 May 2010 17:32:25 +0000 (18:32 +0100)]
daemon.AsyncStreamServer

This is a new asyncore server which handles listening stream sockets by
calling a function (to be implemented by subclasses) for each connection
it accepts. It's the stream-oriented cousin of the AsyncUDPSocket.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago ganeti-watcher should attempt to fix ganeti-rapi
Tom Limoncelli [Wed, 2 Jun 2010 15:06:37 +0000 (11:06 -0400)]
ganeti-watcher should attempt to fix ganeti-rapi

Update ganeti-watcher so that it tests the master's RAPI port with a
simple test (in this case GetVersion). If it fails, make one attempt
at restarting ganeti-rapi and retest.

- daemons/ganeti-watcher: Test rapi and make one attempt at restarting it.
- lib/utils.py: add StopDaemon() function.

Signed-off-by: Tom Limoncelli <tlim@google.com>
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
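
The check, restart once, retest logic described above can be expressed independently of how the check and the restart are performed. In this sketch both are passed in as callables; the function name is hypothetical, and the real watcher uses a RAPI GetVersion call and the daemon utilities instead:

```python
def ensure_rapi_running(check, restart):
    """Return True if the service answers, restarting it at most once.

    check:   callable returning True when the service responds
    restart: callable that stops and starts the service
    """
    if check():
        return True
    # Exactly one restart attempt, then a final retest.
    restart()
    return check()
```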

14 years ago TestAsyncUDPSocket: remove dead code and add test
Guido Trotter [Wed, 2 Jun 2010 17:30:04 +0000 (18:30 +0100)]
TestAsyncUDPSocket: remove dead code and add test

- _ThreadedClient was added with the idea of making this unittest
  concurrent, which was never actually done (we could test everything
  well without it)
- handle_write() was never called without filling the send queue, and
  this caused me trouble now that I have learned to look at coverage

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago TestAsyncUDPSocket: test for oversized sends
Guido Trotter [Wed, 2 Jun 2010 17:19:52 +0000 (18:19 +0100)]
TestAsyncUDPSocket: test for oversized sends

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Luca Bigliardi <shammash@google.com>

14 years ago Document the check-man change
Guido Trotter [Wed, 2 Jun 2010 16:35:33 +0000 (17:35 +0100)]
Document the check-man change

Since this affects developers' systems, document it in NEWS and
devnotes.rst

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Update NEWS for Ganeti 2.1.3
Guido Trotter [Wed, 2 Jun 2010 11:04:07 +0000 (12:04 +0100)]
Update NEWS for Ganeti 2.1.3

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Second attempt at fixing check-man
Iustin Pop [Wed, 2 Jun 2010 14:48:54 +0000 (16:48 +0200)]
Second attempt at fixing check-man

I was wrong, actually LANG-vs-LC_ALL only fixed one case, by mistake. To
get proper UTF-8 encoding, we need to enforce any UTF-8 locale. We
choose the 'default' of en_US.UTF-8.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Fix check-man for newer man-db
Iustin Pop [Wed, 2 Jun 2010 13:51:55 +0000 (15:51 +0200)]
Fix check-man for newer man-db

Again, check-man :)

Commit 5fa1642226 removed LC_ALL=C, since that breaks the check.
However, with no LANG/LC_* variables, man-db is still broken.

We import the new lintian behaviour, i.e. LANG=C (which seems to differ
from LC_ALL=C, even with empty environment). I'm not sure of the
difference, though.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Add RemoveDir utility function
Balazs Lecz [Wed, 26 May 2010 15:52:27 +0000 (16:52 +0100)]
Add RemoveDir utility function

Backported from master, 72087dcd5b06c0127e2ec3bf8c80f7f54da3fb01

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Merge remote branch 'origin/devel-2.1'
Guido Trotter [Tue, 1 Jun 2010 17:23:28 +0000 (18:23 +0100)]
Merge remote branch 'origin/devel-2.1'

* origin/devel-2.1:
  Explicitely return None from IgnoreSignals
  AsyncUDPSocket: fix IgnoreSignals usage and test
  Add KVM chroot feature
  Fix and Improve TryToRoman unittest

Conflicts:
test/ganeti.daemon_unittest.py
  - trivial

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Explicitely return None from IgnoreSignals
Guido Trotter [Tue, 1 Jun 2010 17:14:54 +0000 (18:14 +0100)]
Explicitely return None from IgnoreSignals

Same result, but what happens is clearer.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago AsyncUDPSocket: fix IgnoreSignals usage and test
Guido Trotter [Tue, 1 Jun 2010 16:56:52 +0000 (17:56 +0100)]
AsyncUDPSocket: fix IgnoreSignals usage and test

This bug was found in the asyncore master patch series, but actually
applies to 2.1 for AsyncUDPSocket as well.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago Add KVM chroot feature
Balazs Lecz [Wed, 26 May 2010 15:53:02 +0000 (16:53 +0100)]
Add KVM chroot feature

This patch adds a new boolean hypervisor parameter to the KVM hypervisor,
named 'use_chroot'.
If it's turned on for an instance, then KVM is started in "chroot mode":
Ganeti creates an empty directory for the instance and passes the path
of this dir to KVM via the -chroot flag.
KVM changes its root to this directory after starting up.

It also adds a "quarantine" feature for moving any unexpected files to
a separate directory for later analysis.

This has been backported from master,
commit 84c08e4ee04cadbaf7d7be8bc1c3b9023918e276

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years ago utils: Add function to check whether process handles a signal
Michael Hanselmann [Wed, 26 May 2010 18:58:40 +0000 (20:58 +0200)]
utils: Add function to check whether process handles a signal

This will be used to avoid a race condition between starting a program (dd
for import/export) and sending signals to it.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
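
One way to implement such a check on Linux is to read the `SigCgt` field of `/proc/<pid>/status`, a hexadecimal bitmask of caught signals in which bit `signum - 1` stands for signal `signum`. Whether the commit does exactly this is an assumption, and the function names below are hypothetical:

```python
import re

_SIGCGT_RE = re.compile(r"^SigCgt:\s*([0-9a-fA-F]+)\s*$", re.MULTILINE)


def is_signal_handled(status_text, signum):
    """Check the SigCgt mask in the text of a /proc/<pid>/status file."""
    match = _SIGCGT_RE.search(status_text)
    if match is None:
        raise ValueError("no SigCgt field found")
    caught_mask = int(match.group(1), 16)
    return bool(caught_mask & (1 << (signum - 1)))


def process_handles_signal(pid, signum):
    """Linux only: read the status file of a running process."""
    with open("/proc/%d/status" % pid) as status_file:
        return is_signal_handled(status_file.read(), signum)
```

Polling this until the handler is installed is how a caller could avoid signalling a freshly started program too early.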

14 years ago Fix and Improve TryToRoman unittest
Guido Trotter [Tue, 1 Jun 2010 10:45:30 +0000 (11:45 +0100)]
Fix and Improve TryToRoman unittest

1) Don't break when the roman module is not found
2) Test that not finding the roman module doesn't make TryToRoman fail
(currently that is the case)

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
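
The behaviour under test, conversion that quietly degrades when the optional `roman` module is missing, can be sketched like this. The function is a simplified stand-in for TryToRoman, not its actual code; it assumes the third-party `roman` package with its `toRoman` function and `RomanError` exception:

```python
try:
    import roman
except ImportError:
    # The module is optional; without it values are returned unchanged.
    roman = None


def try_to_roman(value, convert=True):
    """Return value as a roman numeral if requested and possible."""
    if not convert or roman is None:
        return value
    try:
        return roman.toRoman(value)
    except roman.RomanError:
        # Out of range or not an integer: fall back to the original value.
        return value
```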

14 years ago move-instance: Use error message instead of multiple state variables
Michael Hanselmann [Mon, 31 May 2010 16:25:57 +0000 (18:25 +0200)]
move-instance: Use error message instead of multiple state variables

Until now, move-instance used different status variables: “success”,
“abort” and “error_message”. With this patch, everything is changed
to use “error_message” only. This simplifies the code a bit.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Distribute cluster domain secret
Michael Hanselmann [Mon, 31 May 2010 15:53:30 +0000 (17:53 +0200)]
Distribute cluster domain secret

The cluster domain secret file was not distributed to other nodes.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Merge branch 'devel-2.1'
Guido Trotter [Tue, 1 Jun 2010 10:32:44 +0000 (11:32 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  Convert gnt-instance list and info to use roman
  gnt-cluster info --roman
  FormatUidPool: provide optional roman conversion
  gnt-node: remove latinfriendlyfields
  Move roman conversion to compat
  Add a new opcode timestamp field
  Fix IgnoreSignals on socket.error
  RAPI client should convert urllib2.URLError to GanetiApiError
  KVM: Migration bandwidth and downtime control
  Make utils.EnsureDirs() ignore umask
  Fix two race conditions in reboot instance
  Support for latin friendly output in node list

Conflicts:
man/gnt-instance.sgml
  - trivial

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Convert gnt-instance list and info to use roman
Guido Trotter [Tue, 1 Jun 2010 09:23:25 +0000 (10:23 +0100)]
Convert gnt-instance list and info to use roman

Finally gnt-instance has roman support as well.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago gnt-cluster info --roman
Guido Trotter [Thu, 27 May 2010 10:28:04 +0000 (11:28 +0100)]
gnt-cluster info --roman

Convert to roman (if the user so wishes) the following:
  - cluster candidate size
  - uid pool
  - any integer be or hv parameter

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago FormatUidPool: provide optional roman conversion
Guido Trotter [Thu, 27 May 2010 10:26:16 +0000 (11:26 +0100)]
FormatUidPool: provide optional roman conversion

The convert= option of compat.TryToRoman is used to do optional
conversion without duplicating formatting code.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago gnt-node: remove latinfriendlyfields
Guido Trotter [Thu, 27 May 2010 10:25:01 +0000 (11:25 +0100)]
gnt-node: remove latinfriendlyfields

Rather than relying on a static list of fields, we opportunistically
convert all integers.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Move roman conversion to compat
Guido Trotter [Thu, 27 May 2010 10:21:17 +0000 (11:21 +0100)]
Move roman conversion to compat

The new TryToRoman function provides optional easy to use roman
conversion. Nunc cum demonstrationi unitati.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago ssconf: error out when writing oversized files
Guido Trotter [Mon, 31 May 2010 16:45:46 +0000 (18:45 +0200)]
ssconf: error out when writing oversized files

Since we impose a maximum limit when reading ssconf files, let's error
out when trying to write them too big, so we don't pretend everything is
ok, and make mistakes when we actually read partial files.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
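
The write-side check described above mirrors the limit applied when reading. A sketch, in which the limit value, the exception class and the function name are all assumptions made for illustration:

```python
# Assumed limit; the real constant is defined in the Ganeti source.
MAX_SSCONF_SIZE = 128 * 1024


class OversizedValueError(ValueError):
    """Raised instead of silently writing a file readers would truncate."""


def check_ssconf_size(name, value, max_size=MAX_SSCONF_SIZE):
    """Return the encoded value, or raise if it exceeds the read limit."""
    data = value.encode("utf-8")
    if len(data) > max_size:
        raise OversizedValueError(
            "ssconf file %s is %d bytes, limit is %d"
            % (name, len(data), max_size)
        )
    return data
```

Only data that passes this check would then be written to disk.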

14 years ago Add a new opcode timestamp field
Iustin Pop [Tue, 1 Jun 2010 07:48:04 +0000 (09:48 +0200)]
Add a new opcode timestamp field

Since the current start_timestamp opcode attribute refers to the initial
start time, before locks are acquired, it's not useful to determine the
actual execution order of two opcodes/jobs competing for the same lock.

This patch adds a new field, exec_timestamp, that is updated when the
opcode moves from OP_STATUS_WAITLOCK to OP_STATUS_RUNNING, thus allowing
a clear view of the execution history. The new field is visible in the
job output via the 'opexec' field.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years ago Fix IgnoreSignals on socket.error
Guido Trotter [Mon, 31 May 2010 09:36:14 +0000 (11:36 +0200)]
Fix IgnoreSignals on socket.error

Some confusion arose handling EINTR on this function: in python 2.6
socket.error is an IOError, and thus:
  - It's an EnvironmentError
  - It has an .errno member

In 2.4 and 2.5 it's not, and so its errno variable must be extracted
from the args tuple. This patch fixes both the function, and the
unittests.

This is a cherry-pick of master commit
965d0e5ba37f3e88aa38230177ad1c66814bf927 with the portions not relevant
to 2.1 removed (changes to the RetryOnSignals function).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years ago Fix {Ignore, RetryOn}Signals on socket.error
Guido Trotter [Mon, 31 May 2010 09:36:14 +0000 (11:36 +0200)]
Fix {Ignore, RetryOn}Signals on socket.error

Some confusion arose handling EINTR on those functions: in python 2.6
socket.error is an IOError, and thus:
  - It's an EnvironmentError
  - It has an .errno member

In 2.4 and 2.5 it's not, and so its errno variable must be extracted
from the args tuple. This patch fixes both functions, and the unittests.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
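
A sketch of the errno extraction described above, written so that it runs on current Python. The helper names get_errno, retry_on_signal and ignore_signal are illustrative and are not the signatures of Ganeti's IgnoreSignals or RetryOnSignals. On Python 3 socket.error is simply an alias of OSError, so the args fallback only matters for exceptions shaped like the 2.4/2.5 socket.error.

```python
import errno
import socket


def get_errno(err):
    """Return the errno carried by an exception, or None if there is none."""
    if isinstance(err, EnvironmentError):
        # IOError/OSError (and socket.error from Python 2.6 on) expose .errno.
        return err.errno
    # Older-style exceptions keep the errno as the first item of args.
    if err.args:
        return err.args[0]
    return None


def retry_on_signal(fn, *args, **kwargs):
    """Call fn again for as long as it is interrupted by a signal (EINTR)."""
    while True:
        try:
            return fn(*args, **kwargs)
        except (EnvironmentError, socket.error) as err:
            if get_errno(err) != errno.EINTR:
                raise


def ignore_signal(fn, *args, **kwargs):
    """Call fn once; return None if it was interrupted by a signal."""
    try:
        return fn(*args, **kwargs)
    except (EnvironmentError, socket.error) as err:
        if get_errno(err) != errno.EINTR:
            raise
        return None
```

Since Python 3.5 the interpreter retries most system calls on EINTR by itself, so wrappers like these are mainly of historical interest.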

14 years agoMove hash functions to the compat module
Guido Trotter [Fri, 28 May 2010 18:48:17 +0000 (19:48 +0100)]
Move hash functions to the compat module

Since the hash functions changed their module name between python 2.4
and 2.6, and we have to do a try/import/except trick, we'll do it just
once, for both hash functions, and in compat.py. This also fixes a use
of md5 in the utils unittests which didn't use the trick before, and
generated a deprecation warning under 2.6.

In compat we keep both a ganeti-wide non-version-specific version to be
used by other ganeti modules, and a python-version-specific one that can be
passed to python modules which expect a hash function for their input
but call it differently under different versions of python (hmac, for
example).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
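
The try/import/except trick could look roughly like this. The names md5_hash, sha1_hash, sha1 and sign are chosen for illustration and may not match compat.py exactly. On any Python that ships hashlib (2.5 and later) only the first branch runs.

```python
import hmac

try:
    # hashlib is available: use its constructors directly.
    from hashlib import md5 as md5_hash
    from hashlib import sha1 as sha1_hash
    # Version-specific object for modules such as hmac, which accept a
    # hash constructor here.
    sha1 = sha1_hash
except ImportError:
    # Python 2.4 fallback: the hashes live in their own modules.
    from md5 import new as md5_hash
    import sha
    sha1 = sha           # hmac expects the module itself in this case
    sha1_hash = sha.new


def sign(key, message):
    """HMAC-SHA1 of message, using whichever sha1 object the import chose."""
    return hmac.new(key, message, sha1).hexdigest()
```

Callers that only need a digest use md5_hash or sha1_hash; callers that hand a hash to another module pass sha1, so the version check stays in one place.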

14 years agoreraise exceptions in async tests' error handlers
Guido Trotter [Wed, 26 May 2010 13:12:19 +0000 (14:12 +0100)]
reraise exceptions in async tests' error handlers

This makes sure that any unforeseen error raises an exception rather
than just increasing a counter. It makes unittest debugging a lot
easier.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agodesign-2.2: job queue lock analysis/remediation
Guido Trotter [Thu, 27 May 2010 15:19:36 +0000 (16:19 +0100)]
design-2.2: job queue lock analysis/remediation

This builds upon the "Master core scalability design doc", detailing the
critical situations in the job queue and proposing how to fix them. The
bulleted list at the beginning is changed to subparagraphs, as the job
queue part is considerably longer and more detailed. The remediation
section is then updated, replacing the generic "we'll fix it somehow"
paragraph with a real solution.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMaster core scalability design doc
Guido Trotter [Tue, 18 May 2010 15:38:25 +0000 (16:38 +0100)]
Master core scalability design doc

This initial design still lacks information about the job queue lock
contention decrease.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoRAPI client should convert urllib2.URLError to GanetiApiError
Tom Limoncelli [Mon, 31 May 2010 17:09:00 +0000 (13:09 -0400)]
RAPI client should convert urllib2.URLError to GanetiApiError

Signed-off-by: Tom Limoncelli <tlim@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>

14 years agoKVM: Migration bandwidth and downtime control
Apollon Oikonomopoulos [Mon, 31 May 2010 11:28:47 +0000 (14:28 +0300)]
KVM: Migration bandwidth and downtime control

Introduce 2 new hypervisor options, migration_bandwidth and migration_downtime
and implement KVM migration bandwidth and downtime control.

migration_bandwidth controls KVM's maximal bandwidth during migration, in
MiB/s. Default value is 32 MiB/s, same as KVM's internal default. This option
is a global hypervisor option.

migration_downtime sets the amount of time (in ms) a KVM instance is allowed to
freeze while copying memory pages. This is useful when migrating busy guests,
as KVM's internal default of 30ms is too low for the page-copying algorithm to
converge. This is a per-instance option, with a default of 30ms, same as KVM's
internal default.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years agoMake utils.EnsureDirs() ignore umask
Balazs Lecz [Fri, 28 May 2010 12:31:21 +0000 (13:31 +0100)]
Make utils.EnsureDirs() ignore umask

EnsureDirs() should create directories with the exact mode requested
in the arguments, but it currently applies the umask.
This patch makes it independent from the umask.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>
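
One way to make directory creation independent of the umask is to follow mkdir() with an explicit chmod(). The function below is a sketch under that assumption; its name and its list-of-(path, mode) argument are illustrative and not necessarily the signature of utils.EnsureDirs().

```python
import os


def ensure_dirs(dirs):
    """Create each (path, mode) pair and force exactly the requested mode."""
    for path, mode in dirs:
        try:
            os.mkdir(path, mode)
        except FileExistsError:
            # An existing directory is fine; anything else is an error.
            if not os.path.isdir(path):
                raise
        # mkdir() filters the mode through the umask, chmod() does not.
        os.chmod(path, mode)
```

With a umask of 077, os.mkdir(path, 0o755) alone would produce mode 0700; the chmod() call restores the requested 0755.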

14 years agoimport/export daemon: Move some I/O processing code to module
Michael Hanselmann [Fri, 21 May 2010 16:26:00 +0000 (18:26 +0200)]
import/export daemon: Move some I/O processing code to module

The code parsing the child process' output is moved to a separate
class in the impexpd module. As more programs are added, it'll
become more complex and should be separated.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Move command building into separate module
Michael Hanselmann [Fri, 21 May 2010 14:07:34 +0000 (16:07 +0200)]
import/export daemon: Move command building into separate module

The import/export daemon code is already large. Moving some code
to a separate module will make it smaller and easier to test.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoimport/export daemon: Move command building into class
Michael Hanselmann [Fri, 21 May 2010 11:50:45 +0000 (13:50 +0200)]
import/export daemon: Move command building into class

Instead of passing around many variables for building the executed
command, they're now kept as instance variables.

Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Guido Trotter <ultrotter@google.com>

14 years agoAdd KVM chroot feature
Balazs Lecz [Wed, 26 May 2010 15:53:02 +0000 (16:53 +0100)]
Add KVM chroot feature

This patch adds a new boolean hypervisor parameter to the KVM hypervisor,
named 'use_chroot'.
If it's turned on for an instance, then KVM is started in "chroot mode":
Ganeti creates an empty directory for the instance and passes the path
of this dir to KVM via the -chroot flag.
KVM changes its root to this directory after starting up.

It also adds a "quarantine" feature for moving any unexpected files to
a separate directory for later analysis.

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
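
A rough sketch of the two pieces described above: passing the directory via -chroot, and moving unexpected files aside. The function names and the quarantine directory layout are assumptions, and when exactly Ganeti runs the quarantine step is not stated in this log.

```python
import os
import shutil
import tempfile


def chroot_args(chroot_dir):
    """Extra KVM command-line arguments for chroot mode."""
    return ["-chroot", chroot_dir]


def quarantine_unexpected_files(chroot_dir, quarantine_root):
    """Move everything found in chroot_dir into a new quarantine directory.

    The chroot directory is expected to be empty, so any entry counts as
    unexpected. Returns the quarantine directory, or None if nothing moved.
    """
    leftovers = os.listdir(chroot_dir)
    if not leftovers:
        return None
    # A fresh directory per call keeps separate incidents apart.
    target = tempfile.mkdtemp(prefix="quarantine-", dir=quarantine_root)
    for name in leftovers:
        shutil.move(os.path.join(chroot_dir, name),
                    os.path.join(target, name))
    return target
```

Moving rather than deleting keeps the files available for later analysis, which is the stated purpose of the quarantine feature.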

14 years agoAdd RemoveDir utility function
Balazs Lecz [Wed, 26 May 2010 15:52:27 +0000 (16:52 +0100)]
Add RemoveDir utility function

Signed-off-by: Balazs Lecz <leczb@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoFix two race conditions in reboot instance
Iustin Pop [Thu, 27 May 2010 11:37:55 +0000 (13:37 +0200)]
Fix two race conditions in reboot instance

If the instance crashes between the time backend.InstanceReboot checks the
list of running instances and the execution of hv_xen.RebootInstance,
ini_info will be None. And if the instance doesn't reboot fast enough,
new_info will be None. Both cases lead to “TypeError: unsubscriptable
object”. Too bad pylint doesn't detect such cases.

Signed-off-by: Iustin Pop <iustin@google.com>
Reviewed-by: Michael Hanselmann <hansmi@google.com>
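
The two guards can be sketched as below. The callables get_info and do_reboot, the RebootError class and the tuple layout (first item changes after a reboot) are hypothetical; only the variable names ini_info and new_info come from the commit message. A real implementation would also wait between attempts.

```python
class RebootError(Exception):
    """Raised when the reboot cannot be started or confirmed."""


def reboot_instance(get_info, do_reboot, attempts=5):
    """Reboot and wait until the instance info shows a new incarnation.

    get_info() returns a tuple, or None when the instance is not running.
    """
    ini_info = get_info()
    if ini_info is None:
        # First race: the instance vanished after the caller's own check.
        raise RebootError("instance is not running, cannot reboot")
    do_reboot()
    for _ in range(attempts):
        new_info = get_info()
        # Second race: new_info is None while the instance is still down,
        # so it must be checked before being subscripted.
        if new_info is not None and new_info[0] != ini_info[0]:
            return new_info
    raise RebootError("instance did not come back after reboot")
```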

14 years agoSupport for latin friendly output in node list
Guido Trotter [Tue, 25 May 2010 15:30:52 +0000 (16:30 +0100)]
Support for latin friendly output in node list

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoMerge branch 'devel-2.1'
Guido Trotter [Tue, 25 May 2010 11:20:41 +0000 (12:20 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  Test for errors during inotify callback
  SingleFileEventHandler: Remove try/except blocks
  ErrorLoggingAsyncNotifier
  daemon.GanetiBaseAsyncoreDispatcher

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoTest for errors during inotify callback
Guido Trotter [Fri, 21 May 2010 13:54:54 +0000 (14:54 +0100)]
Test for errors during inotify callback

- Create a new _MyErrorLoggingAsyncNotifier class which registers
  error counts, rather than logging them
- Add an additional ERR notifier to test with
- Check that no error was returned for tests that weren't supposed to
  produce one
- Add a new test case for a callback that's supposed to raise an
  exception

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoSingleFileEventHandler: Remove try/except blocks
Guido Trotter [Fri, 21 May 2010 13:28:40 +0000 (14:28 +0100)]
SingleFileEventHandler: Remove try/except blocks

Since we now use the SingleFileEventHandler together with an error
handling asyncore dispatcher, we don't need the internal try/except
anymore.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agoErrorLoggingAsyncNotifier
Guido Trotter [Fri, 21 May 2010 13:27:04 +0000 (14:27 +0100)]
ErrorLoggingAsyncNotifier

This mixes AsyncNotifier with GanetiBaseAsyncoreDispatcher to provide an
AsyncNotifier which will log errors, rather than bail out.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>

14 years agodaemon.GanetiBaseAsyncoreDispatcher
Guido Trotter [Thu, 13 May 2010 17:32:25 +0000 (18:32 +0100)]
daemon.GanetiBaseAsyncoreDispatcher

Abstract a few common functionalities between all ganeti asyncore
dispatchers:
  - Handle errors by logging them, and then continue
  - By default check sockets only for readability

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
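
The two shared behaviours can be illustrated with a small mixin that overrides the handle_error() and writable() hooks of an asyncore.dispatcher-style class. The class name is made up, and the sketch deliberately does not import asyncore, which recent Python releases no longer ship.

```python
import logging


class ErrorLoggingDispatcherMixin(object):
    """Overrides meant to be mixed into an asyncore-style dispatcher."""

    def handle_error(self):
        """Log the exception currently being handled, then carry on."""
        # Invoked from inside an except block, so the active exception
        # and its traceback are picked up by logging.exception().
        logging.exception("Error while handling asyncore request")

    def writable(self):
        """Poll the socket for readability only, unless a subclass overrides."""
        return False
```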

14 years agoMerge branch 'devel-2.1'
Guido Trotter [Mon, 24 May 2010 10:05:53 +0000 (11:05 +0100)]
Merge branch 'devel-2.1'

* devel-2.1:
  TestSingleFileEventHandler: abstract notifier type
  Mainloop: handle SIGINT as well (and terminate)
  SingleFileEventHandler: update comments

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>

14 years agoTestSingleFileEventHandler: abstract notifier type
Guido Trotter [Fri, 21 May 2010 14:12:05 +0000 (15:12 +0100)]
TestSingleFileEventHandler: abstract notifier type

Rather than hardcode that we have two notifiers, and notifier 0 is the
terminating one, we abstract this with class level constants. This makes
it easier to add more, with different features.

The only real change is that now the callback class takes as input the
whole test object, rather than just the notified array, to have access
to those constants.

The rest is just replacing hardcoded 0s and 1s with
self.NOTIFIER_TERM and self.NOTIFIER_NORM, and of notifier_count with
len(self.NOTIFIERS).

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Balazs Lecz <leczb@google.com>
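
The shape of this change, as a sketch: the constant names NOTIFIER_TERM, NOTIFIER_NORM and NOTIFIERS follow the commit message, while the two classes and their attributes are simplified stand-ins for the real test case and callback.

```python
class NotifierTestSketch(object):
    """Stand-in for the test case; holds the notifier constants."""

    NOTIFIER_TERM = 0  # the notifier that ends the test
    NOTIFIER_NORM = 1  # an ordinary notifier
    NOTIFIERS = [NOTIFIER_TERM, NOTIFIER_NORM]

    def __init__(self):
        # One slot per notifier instead of a hardcoded count of two.
        self.notified = [False] * len(self.NOTIFIERS)
        self.terminated = False


class CallbackSketch(object):
    """Callback that receives the whole test object, not just its array."""

    def __init__(self, testobj, notifier_id):
        self.testobj = testobj
        self.notifier_id = notifier_id

    def __call__(self):
        self.testobj.notified[self.notifier_id] = True
        if self.notifier_id == self.testobj.NOTIFIER_TERM:
            self.testobj.terminated = True
```

Adding a third notifier then only requires a new constant and an extra entry in NOTIFIERS.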

14 years agoMainloop: handle SIGINT as well (and terminate)
Guido Trotter [Fri, 21 May 2010 10:24:54 +0000 (11:24 +0100)]
Mainloop: handle SIGINT as well (and terminate)

This is needed if daemons are in the foreground, and get ctrl+c-ed by
the user. Also add unittests to make sure the correct signals terminate
the mainloop.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
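
A simplified picture of a main loop that stops on either signal. MainloopSketch and its methods are illustrative and do not reflect the structure of Ganeti's Mainloop class.

```python
import signal


class MainloopSketch(object):
    """Runs a step function until a terminating signal arrives."""

    # SIGINT is what a foreground daemon receives on ctrl+c.
    TERMINATING_SIGNALS = (signal.SIGTERM, signal.SIGINT)

    def __init__(self):
        self.running = True

    def _handle_signal(self, signum, frame):
        if signum in self.TERMINATING_SIGNALS:
            self.running = False

    def install_handlers(self):
        """Register the handler; must be called from the main thread."""
        for signum in self.TERMINATING_SIGNALS:
            signal.signal(signum, self._handle_signal)

    def run(self, step):
        while self.running:
            step()
```

A unittest can call _handle_signal() directly with each signal number and check that run() returns, without delivering real signals.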

14 years agoSingleFileEventHandler: update comments
Guido Trotter [Fri, 21 May 2010 11:45:56 +0000 (12:45 +0100)]
SingleFileEventHandler: update comments

The comments in the SingleFileEventHandler are still confd-specific.
Update them to make them generic for any single-file monitoring.

Signed-off-by: Guido Trotter <ultrotter@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>