ganeti-local
15 years agoUse new RPC call in “gnt-node list”
Michael Hanselmann [Wed, 6 Aug 2008 08:28:06 +0000 (08:28 +0000)]
Use new RPC call in “gnt-node list”

Reviewed-by: iustinp

15 years agoImplement query for nodes
Michael Hanselmann [Wed, 6 Aug 2008 08:26:01 +0000 (08:26 +0000)]
Implement query for nodes

Reviewed-by: iustinp

15 years agoUse new query RPC call in “gnt-instance list”
Michael Hanselmann [Wed, 6 Aug 2008 08:25:31 +0000 (08:25 +0000)]
Use new query RPC call in “gnt-instance list”

Reviewed-by: iustinp

15 years agoImplement query for instances
Michael Hanselmann [Wed, 6 Aug 2008 08:25:03 +0000 (08:25 +0000)]
Implement query for instances

Queries don't create jobs and are more efficient. Log messages
are not yet stored anywhere.

Reviewed-by: iustinp

15 years agojqueue: Replicate jobs to all nodes
Michael Hanselmann [Tue, 5 Aug 2008 10:33:08 +0000 (10:33 +0000)]
jqueue: Replicate jobs to all nodes

Newly added nodes are not yet taken care of. Queue locking on
non-master nodes is not yet correct.

Reviewed-by: iustinp

15 years agojqueue: Use new jstore module
Michael Hanselmann [Mon, 4 Aug 2008 12:27:18 +0000 (12:27 +0000)]
jqueue: Use new jstore module

Reviewed-by: iustinp

15 years agojstore: Add queue helper functions
Michael Hanselmann [Mon, 4 Aug 2008 12:27:01 +0000 (12:27 +0000)]
jstore: Add queue helper functions

This will be used to move common code out of jqueue.

Reviewed-by: iustinp

15 years agoImplement job submission for scripts
Iustin Pop [Mon, 4 Aug 2008 09:47:08 +0000 (09:47 +0000)]
Implement job submission for scripts

This patch adds the infrastructure for executing a job in background,
instead of foreground, via a new “--submit” option. The behaviour is
that the job ID is printed and the script will immediately exit.

The patch also converts gnt-node list to this model (yes, this will be a
query in the future).

Reviewed-by: imsnah

15 years agoAnother typo in the install doc
Iustin Pop [Mon, 4 Aug 2008 09:14:31 +0000 (09:14 +0000)]
Another typo in the install doc

Reviewed-by: imsnah

15 years agoUpdate the module build section of install doc
Iustin Pop [Mon, 4 Aug 2008 09:14:16 +0000 (09:14 +0000)]
Update the module build section of install doc

Reviewed-by: imsnah

15 years agojqueue: Move assert into decorator
Michael Hanselmann [Thu, 31 Jul 2008 15:03:13 +0000 (15:03 +0000)]
jqueue: Move assert into decorator

This reduces code duplication. A later patch will modify the job queue
a bit more and will need a change of this assert. The assertion is
also removed from all class-internal functions.

Reviewed-by: iustinp

15 years agoSplit cli.SubmitOpCode in two parts
Iustin Pop [Thu, 31 Jul 2008 14:52:04 +0000 (14:52 +0000)]
Split cli.SubmitOpCode in two parts

The current SubmitOpCode function is not flexible enough to be used for
submitters that don't want to wait for the job to finish.

The patch splits this in two, a SendJob function and a PollJob one, and
the old SubmitOpCode becomes a wrapper. Note that the new SendJob takes
a list of opcodes (and not a single opcode anymore).
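
The split can be sketched roughly as follows. This is a minimal illustration, not the actual cli module: FakeClient and the lower-case function names are hypothetical stand-ins, and opcodes are plain strings here.

```python
class FakeClient:
    """Hypothetical stand-in for the master daemon client (not Ganeti's API)."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def submit(self, ops):
        job_id = str(self._next_id)
        self._next_id += 1
        # Pretend every opcode succeeds immediately.
        self._jobs[job_id] = ["done: %s" % op for op in ops]
        return job_id

    def results(self, job_id):
        return self._jobs[job_id]


def send_job(ops, client):
    """Submit a list of opcodes and return the job ID without waiting."""
    return client.submit(ops)


def poll_job(job_id, client):
    """Wait for the job and return the per-opcode results."""
    return client.results(job_id)


def submit_op_code(op, client):
    """Old behaviour as a thin wrapper: one opcode in, its result out."""
    job_id = send_job([op], client)
    return poll_job(job_id, client)[0]
```

A caller that does not want to wait would call only send_job and keep the returned job ID.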

Reviewed-by: imsnah

15 years agoAllow job queue files to be uploaded through ganeti-noded
Michael Hanselmann [Thu, 31 Jul 2008 14:42:11 +0000 (14:42 +0000)]
Allow job queue files to be uploaded through ganeti-noded

This is needed for job queue replication.

Reviewed-by: iustinp

15 years agoAdd FileLock utility class
Michael Hanselmann [Thu, 31 Jul 2008 14:33:58 +0000 (14:33 +0000)]
Add FileLock utility class

This class is a wrapper around fcntl.flock and abstracts opening and
closing the lockfile. It'll be used for the job queue.
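
A wrapper of this kind might look like the sketch below. The method names (exclusive, unlock, close) are assumptions made for illustration and may not match the class added by the patch; only fcntl.flock itself is taken from the description.

```python
import fcntl


class FileLock:
    """Sketch of a lock-file wrapper around fcntl.flock (names are assumed)."""

    def __init__(self, filename):
        # Opening the file here hides the open/close details from callers.
        self._fd = open(filename, "w")

    def exclusive(self, blocking=False):
        """Take an exclusive lock; raises OSError if non-blocking and busy."""
        flags = fcntl.LOCK_EX
        if not blocking:
            flags |= fcntl.LOCK_NB
        fcntl.flock(self._fd, flags)

    def unlock(self):
        """Release the lock but keep the file open."""
        fcntl.flock(self._fd, fcntl.LOCK_UN)

    def close(self):
        # Closing the file also drops any lock still held through it.
        self._fd.close()
```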

(The patch also removes a duplicate import of tempfile into the unittest)

Reviewed-by: iustinp

15 years agojqueue: Store context in job queue instead of worker pool
Michael Hanselmann [Thu, 31 Jul 2008 14:33:36 +0000 (14:33 +0000)]
jqueue: Store context in job queue instead of worker pool

The job queue will need access to the configuration, which is provided
through the context object, to get a list of nodes.

Reviewed-by: iustinp

15 years agoRAPI Implement DELETE for tags
Oleksiy Mishchenko [Thu, 31 Jul 2008 12:58:55 +0000 (12:58 +0000)]
RAPI Implement DELETE for tags

Reviewed-by: imsnah

15 years agoFirst write operation (add tag) for Ganeti RAPI
Oleksiy Mishchenko [Thu, 31 Jul 2008 09:06:37 +0000 (09:06 +0000)]
First write operation (add tag) for Ganeti RAPI

Add instance tag handling, improved error logging.
...oh, yes adopt instance listing for RAPI2!

Reviewed-by: iustinp

15 years agoFix cluster destroy
Iustin Pop [Wed, 30 Jul 2008 15:58:28 +0000 (15:58 +0000)]
Fix cluster destroy

With the recent startup/shutdown changes (and with the master daemon in
place), the cluster destroy needs some fixing.

This patch moves the finalization of the destroy out from cmdlib into
bootstrap, so we can nicely shut down the rapi and master daemons.

Reviewed-by: ultrotter

15 years agoXen: remove two end-of-line semicolons
Guido Trotter [Wed, 30 Jul 2008 15:49:05 +0000 (15:49 +0000)]
Xen: remove two end-of-line semicolons

It's python, isn't it?

Reviewed-by: iustinp

15 years agoFix cluster init
Iustin Pop [Wed, 30 Jul 2008 15:17:58 +0000 (15:17 +0000)]
Fix cluster init

With the recent changes, I forgot the extra parameter to this rpc call.
Also the rpc call needs to be done after we set up the config data, for
the master daemon to be able to start, so we move it after all other
init steps.

Reviewed-by: ultrotter

15 years agoMake gnt-* commands fail nicely on non-masters
Iustin Pop [Wed, 30 Jul 2008 15:06:01 +0000 (15:06 +0000)]
Make gnt-* commands fail nicely on non-masters

This patch adds a check that we are on the master after failing to
connect to the socket, and logs the master name nicely.

Reviewed-by: ultrotter

15 years agoParallelize LUFailoverInstance
Guido Trotter [Wed, 30 Jul 2008 15:04:48 +0000 (15:04 +0000)]
Parallelize LUFailoverInstance

Reviewed-by: iustinp

15 years agoChainOpCode is still BGL-only
Guido Trotter [Wed, 30 Jul 2008 15:04:27 +0000 (15:04 +0000)]
ChainOpCode is still BGL-only

Prevent mistakes with an assert.

Reviewed-by: iustinp

15 years agoFix a misuse of exc_info in logging.info
Iustin Pop [Wed, 30 Jul 2008 15:00:54 +0000 (15:00 +0000)]
Fix a misuse of exc_info in logging.info

This is my fault, sorry.

Reviewed-by: imsnah

15 years agoFix pylint-detected issues
Iustin Pop [Wed, 30 Jul 2008 14:04:36 +0000 (14:04 +0000)]
Fix pylint-detected issues

This is mostly:
  - whitespace fix (space at EOL in some files, not all, broken
    indentation, etc)
  - variable names overriding others (one is a real bug in there)
  - too-long-lines
  - cleanup of most unused imports (not all)

Reviewed-by: ultrotter

15 years agoFix some errors detected by pylint
Iustin Pop [Wed, 30 Jul 2008 13:27:25 +0000 (13:27 +0000)]
Fix some errors detected by pylint

Reviewed-by: imsnah

15 years agoUnify SetupDaemon/SetupLogging
Iustin Pop [Wed, 30 Jul 2008 12:32:42 +0000 (12:32 +0000)]
Unify SetupDaemon/SetupLogging

The 'old-style' info, error, debug logs do not make much sense. This
patch unifies the SetupLogging and SetupDaemon functions. As a result,
all the commands log to a 'commands.log' file.

The patch also changes the log setup to keep going if there's an error
in setting up the file logging but we're logging to stderr.

Also, burnin now logs to its own file (burnin.log).

Reviewed-by: ultrotter

15 years agoSimplify the log constants and add another one
Iustin Pop [Wed, 30 Jul 2008 12:29:07 +0000 (12:29 +0000)]
Simplify the log constants and add another one

The patch changes the log constants by moving the slash to the end of
the log dir instead of at the beginning of *each* log file name.

It also adds a new LOG_COMMANDS constant (to be used in a next patch).

Reviewed-by: ultrotter

15 years agoFix gnt-cluster getmaster
Iustin Pop [Wed, 30 Jul 2008 12:27:48 +0000 (12:27 +0000)]
Fix gnt-cluster getmaster

This is special in the sense that it can run on any node. As such, we
just instantiate ssconf and read the data from it.

Reviewed-by: ultrotter

15 years agoParallelize {Startup,Shutdown,Reboot}Instance
Guido Trotter [Wed, 30 Jul 2008 11:31:09 +0000 (11:31 +0000)]
Parallelize {Startup,Shutdown,Reboot}Instance

Reviewed-by: iustinp

15 years agoParallelize LUReinstallInstance
Guido Trotter [Wed, 30 Jul 2008 11:30:49 +0000 (11:30 +0000)]
Parallelize LUReinstallInstance

self.recalculate_locks[locking.LEVEL_NODE] could have any value and
everything would work anyway. We'll use the string 'replace' by
convention because in the future we might want an 'append' mode.

Reviewed-by: iustinp

15 years agoLogicalUnit._LockInstancesNodes helper function
Guido Trotter [Wed, 30 Jul 2008 11:30:29 +0000 (11:30 +0000)]
LogicalUnit._LockInstancesNodes helper function

This function is used to lock instances' primary and secondary nodes
after locking instances themselves.

Reviewed-by: iustinp

15 years agoMake sharing locks possible
Guido Trotter [Wed, 30 Jul 2008 11:30:10 +0000 (11:30 +0000)]
Make sharing locks possible

LUs can declare which locks they need by populating the
self.needed_locks dictionary, but those locks are always acquired as
exclusive. Make it possible to acquire shared locks as well, by
declaring a particular level as shared in the self.share_locks
dictionary. By default this dictionary is populated so that all locks
are acquired exclusively.
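
As a rough sketch of how the two dictionaries interact: the level constants and the acquisition_plan helper below are hypothetical, and only needed_locks and share_locks come from the description above.

```python
# Hypothetical level constants; the real values live in the locking module.
LEVEL_INSTANCE = 0
LEVEL_NODE = 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]


class LogicalUnitSketch:
    def __init__(self):
        self.needed_locks = {}
        # Default: 0 (exclusive) for every level.
        self.share_locks = dict.fromkeys(LEVELS, 0)


def acquisition_plan(lu):
    """Return (level, names, mode) tuples in locking order."""
    plan = []
    for level in LEVELS:
        if level in lu.needed_locks:
            mode = "shared" if lu.share_locks[level] else "exclusive"
            plan.append((level, lu.needed_locks[level], mode))
    return plan
```

An LU that only reads node data could set share_locks[LEVEL_NODE] = 1 and leave the instance level exclusive.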

Reviewed-by: iustinp

15 years agoAdd LogicalUnit.DeclareLocks
Guido Trotter [Wed, 30 Jul 2008 11:29:51 +0000 (11:29 +0000)]
Add LogicalUnit.DeclareLocks

This additional LogicalUnit function is optional to implement, but lets
you change your locking needs for one level just before locking it, but
after the previous levels have been already locked. It is useful for
example to calculate what nodes to lock after locking an instance.
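
The idea can be illustrated with a small sketch. The class, the level constants and the lock_all driver are hypothetical; the config is reduced to a dict mapping instance names to node names, and no real locking takes place.

```python
# Hypothetical level constants, in locking order.
LEVEL_INSTANCE, LEVEL_NODE = 0, 1


class FailoverSketch:
    def __init__(self, instance, config):
        self.config = config  # instance name -> list of node names
        self.needed_locks = {LEVEL_INSTANCE: [instance], LEVEL_NODE: []}

    def declare_locks(self, level):
        # Called just before `level` is locked; earlier levels are held.
        if level == LEVEL_NODE:
            nodes = []
            for inst in self.needed_locks[LEVEL_INSTANCE]:
                nodes.extend(self.config[inst])
            self.needed_locks[LEVEL_NODE] = nodes


def lock_all(lu):
    """Walk the levels in order, letting the LU adjust each one first."""
    acquired = []
    for level in (LEVEL_INSTANCE, LEVEL_NODE):
        lu.declare_locks(level)
        acquired.append((level, list(lu.needed_locks[level])))
    return acquired
```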

Reviewed-by: iustinp

15 years agoLURenameInstance, add/remove relevant locks
Guido Trotter [Wed, 30 Jul 2008 11:29:31 +0000 (11:29 +0000)]
LURenameInstance, add/remove relevant locks

LURenameInstance forgot to remove the old lock name and add the new one,
making it impossible for parallel LUs to act on the instance (without a
master daemon restart). This also fixes burnin+rename with the
parallelization of {Start,Stop}Instance.

Reviewed-by: iustinp

15 years agoRewrite job queue
Michael Hanselmann [Wed, 30 Jul 2008 10:02:02 +0000 (10:02 +0000)]
Rewrite job queue

We found several issues in the old job queue implementation. It had race
conditions, deadlocks and other deficiencies.

Short summary:
- _QueuedOpCode and _QueuedJob are now more or less data structures with a few
  utility functions. __Setup is gone.
- DiskJobStorage and JobQueue classes merged into one to reduce code complexity.
- One lock in JobQueue for almost everything. There's also a lock per opcode
  for log messages.

Reviewed-by: iustinp

15 years agoworkerpool: Log when waiting for a thread
Michael Hanselmann [Wed, 30 Jul 2008 08:56:38 +0000 (08:56 +0000)]
workerpool: Log when waiting for a thread

Reviewed-by: iustinp

15 years agoRework master startup/shutdown/failover
Iustin Pop [Wed, 30 Jul 2008 08:43:31 +0000 (08:43 +0000)]
Rework master startup/shutdown/failover

This (big) patch reworks the master startup/shutdown and fixes the
master failover.

What does the patch do?

For master start/stop:
  - removes the old ganeti-master script and its associated man page
  - moves the ip start/stop directly into the backend.(Start|Stop)Master
  - adds start/stop of the master/rapi daemon into these functions,
    selectively based on the start/stop arguments
  - makes the master call via rpc StartMaster(start_daemons=False) to
    the local node so that the master IP is started
  - and finally changes the example init.d script to directly start and
    stop all three daemons, since they do the right thing (depending on
    master/not master role)

For master failover:
  - moves the code from LUMasterFailover into bootstrap.MasterFailover,
    since we need to start/stop the master during this operation and
    thus it can't be executed from the master
  - removes the LUMasterFailover and its associated opcode

Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
master' are not seen during startup on non-master nodes.

Reviewed-by: ultrotter

15 years agoExpose utils.DaemonPidFileName
Iustin Pop [Wed, 30 Jul 2008 08:34:55 +0000 (08:34 +0000)]
Expose utils.DaemonPidFileName

Since we need to compute this from outside utils.py, we change this to a
public function.

Reviewed-by: ultrotter

15 years agoImplement checking for the master role in rapi
Iustin Pop [Wed, 30 Jul 2008 08:33:49 +0000 (08:33 +0000)]
Implement checking for the master role in rapi

This patch moves the CheckMaster function from ganeti-masterd to ssconf
(most logical place, it cannot go in utils since we would have recursive
imports between ssconf and utils) and changes ganeti-rapi to also call
this function.

This is needed so that starting ganeti-rapi on a non-master node does
the right thing.

Reviewed-by: ultrotter

15 years agoAdd a new parameter to backend.(Start|Stop)Master
Iustin Pop [Wed, 30 Jul 2008 08:32:38 +0000 (08:32 +0000)]
Add a new parameter to backend.(Start|Stop)Master

This patch adds a new, unused for now, parameter to the start and stop
master operations in backend. The idea behind it is that we need to be
able to control whether the IP (de)activation is coupled with daemon
startup/shutdown.

The callers are also modified to pass this parameter (even if unused for
now).

Reviewed-by: ultrotter

15 years agoLog thread name when debug output is enabled
Michael Hanselmann [Tue, 29 Jul 2008 14:07:46 +0000 (14:07 +0000)]
Log thread name when debug output is enabled

Reviewed-by: iustinp

15 years agojqueue: Fix error logging
Michael Hanselmann [Tue, 29 Jul 2008 14:07:16 +0000 (14:07 +0000)]
jqueue: Fix error logging

The passed parameters were not correct.

Reviewed-by: iustinp, ultrotter

15 years agoFix constants typo
Iustin Pop [Tue, 29 Jul 2008 10:42:46 +0000 (10:42 +0000)]
Fix constants typo

Reviewed-by: imsnah

15 years agoUse constants for the pid file stems
Iustin Pop [Tue, 29 Jul 2008 09:06:16 +0000 (09:06 +0000)]
Use constants for the pid file stems

Reviewed-by: imsnah

15 years agoAdd a KillProcess function
Iustin Pop [Tue, 29 Jul 2008 08:49:50 +0000 (08:49 +0000)]
Add a KillProcess function

We cannot depend on all environments to have a start-stop-daemon or
similar tool. We instead implement a KillProcess function that behaves
similarly to “start-stop-daemon --retry”.

Note that the attached unittest can hang in foreground if the child
misbehaves (doesn't write to the internal pipe). Since unittests are
either run in the foreground or are run with a timeout from an automated
framework, I think this is an acceptable trade-off (as opposed to using
hardcoded timeouts in the test).

Reviewed-by: imsnah

15 years agoChange IsPidFileAlive into ReadPidFile
Iustin Pop [Tue, 29 Jul 2008 08:49:34 +0000 (08:49 +0000)]
Change IsPidFileAlive into ReadPidFile

We already have a function to test if a PID is alive, so it makes more
sense to use function composition than to force calling (since we need to
read PIDs from files in other places too). Now IsProcessAlive returns
False for PIDs <= 0, since this is the error return from ReadPidFile.

The patch also adds a unittest for checking that WriteFile raises the
correct exception, and checks that an invalid or missing file causes
ReadPidFile to return zero. The unittest tearDown method will try to
clean up the temp directory too (otherwise it leaves stuff behind).
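
The composition described above can be sketched like this. It is written for current Python rather than the version in use at the time, and the lower-case function names are illustrative, not the utils API.

```python
import os


def read_pid_file(path):
    """Return the PID stored in path, or 0 if the file is missing or invalid."""
    try:
        with open(path) as fd:
            return int(fd.read().strip())
    except (OSError, ValueError):
        return 0


def is_process_alive(pid):
    """Check for a live process; PIDs <= 0 are never considered alive."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Callers that previously wanted "is the pidfile alive" can write is_process_alive(read_pid_file(path)), while other callers can use the PID directly.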

Reviewed-by: ultrotter

15 years agoMake the rapi daemon create a pidfile
Iustin Pop [Tue, 29 Jul 2008 08:48:23 +0000 (08:48 +0000)]
Make the rapi daemon create a pidfile

This is needed for controlling it cleanly with start-stop-daemon.

Reviewed-by: ultrotter

15 years agoFix unittests for ganeti-rapi
Michael Hanselmann [Mon, 28 Jul 2008 10:35:06 +0000 (10:35 +0000)]
Fix unittests for ganeti-rapi

The RESTHTTPServer module went the way of the dodo.

Reviewed-by: iustinp

15 years agoImplement signal handling in ganeti-rapi
Michael Hanselmann [Mon, 28 Jul 2008 10:17:29 +0000 (10:17 +0000)]
Implement signal handling in ganeti-rapi

Reviewed-by: iustinp

15 years agoMove ganeti-rapi core code to daemon
Michael Hanselmann [Mon, 28 Jul 2008 10:17:13 +0000 (10:17 +0000)]
Move ganeti-rapi core code to daemon

All other daemons have their main code in themselves and not in a module.
This patch does the same to ganeti-rapi by moving the code from
lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.

Reviewed-by: iustinp

15 years agoReplace httperror module with ganeti.http
Michael Hanselmann [Mon, 28 Jul 2008 10:16:51 +0000 (10:16 +0000)]
Replace httperror module with ganeti.http

The generic HTTP server doesn't know about httperror based exceptions
and would treat them as unknown exceptions, thereby not doing the right
thing with HTTP errors.

Reviewed-by: iustinp

15 years agoImplement “gnt-job cancel”
Michael Hanselmann [Mon, 28 Jul 2008 10:13:53 +0000 (10:13 +0000)]
Implement “gnt-job cancel”

Reviewed-by: ultrotter

15 years agoImplement job canceling on server side
Michael Hanselmann [Mon, 28 Jul 2008 10:13:37 +0000 (10:13 +0000)]
Implement job canceling on server side

Locking is not completely right due to a deadlock when the job calls
UpdateJob after changing its status.

Reviewed-by: ultrotter

15 years agoFix exception class name in utils.WritePidFile
Michael Hanselmann [Mon, 28 Jul 2008 09:16:57 +0000 (09:16 +0000)]
Fix exception class name in utils.WritePidFile

Reviewed-by: iustinp

15 years agoAdd “canceled” status for opcodes
Michael Hanselmann [Mon, 28 Jul 2008 09:16:39 +0000 (09:16 +0000)]
Add “canceled” status for opcodes

Reviewed-by: ultrotter

15 years agoMake “gnt-debug delay” work again
Michael Hanselmann [Mon, 28 Jul 2008 09:16:17 +0000 (09:16 +0000)]
Make “gnt-debug delay” work again

The old API is no longer working.

Reviewed-by: ultrotter

15 years agoMove code extracting job ID into function
Michael Hanselmann [Fri, 25 Jul 2008 12:47:24 +0000 (12:47 +0000)]
Move code extracting job ID into function

It might come in handy at some point and makes the code a bit easier
to read.

Reviewed-by: iustinp

15 years agoConvert set to a list in LUGetTags
Oleksiy Mishchenko [Fri, 25 Jul 2008 12:32:43 +0000 (12:32 +0000)]
Convert set to a list in LUGetTags

The set triggers exception on a list-tags command and RAPI calls for tags
since it is not serializable by JSON.
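
A small sketch of the failure and the fix, using the standard library json module as a stand-in for whatever serializer the code used at the time (an assumption); encode_tags is a hypothetical helper.

```python
import json


def encode_tags(tags):
    """Serialize a collection of tags; a set is converted to a list first."""
    return json.dumps(list(tags))
```

Passing the set directly to json.dumps raises TypeError, which is the kind of exception the patch avoids.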

Reviewed-by: iustinp

15 years agoSwitch RAPI to ganeti.http module
Oleksiy Mishchenko [Thu, 24 Jul 2008 16:34:49 +0000 (16:34 +0000)]
Switch RAPI to ganeti.http module

Reviewed-by: imsnah

15 years agoImplement “gnt-job archive” to archive jobs
Michael Hanselmann [Thu, 24 Jul 2008 15:04:09 +0000 (15:04 +0000)]
Implement “gnt-job archive” to archive jobs

Reviewed-by: iustinp

15 years agoImplement job archiving on the server side
Michael Hanselmann [Thu, 24 Jul 2008 11:32:58 +0000 (11:32 +0000)]
Implement job archiving on the server side

So far no error reporting to the client is done. Clients don't get
notified if a job doesn't exist or couldn't be archived because of
its current status.

The internal cache is always cleaned when the preconditions didn't
fail to make sure that the actual disk status will be reread next
time.

Reviewed-by: iustinp

15 years agoAdd directory for archived jobs
Michael Hanselmann [Thu, 24 Jul 2008 11:32:46 +0000 (11:32 +0000)]
Add directory for archived jobs

Reviewed-by: iustinp

15 years agoFix RPC parameters for {Cancel,Archive}Job
Michael Hanselmann [Thu, 24 Jul 2008 11:32:30 +0000 (11:32 +0000)]
Fix RPC parameters for {Cancel,Archive}Job

They aren't tuples on the client side.

Reviewed-by: iustinp

15 years agoAdd utils unittests for new functions
Guido Trotter [Thu, 24 Jul 2008 08:46:01 +0000 (08:46 +0000)]
Add utils unittests for new functions

The submitted WritePidFile, RemovePidFile and IsPidFileAlive functions
lack unit tests. Adding a simple one which covers their basic
functionality.

Reviewed-by: iustinp

15 years agoMove code formatting job ID into a base class
Michael Hanselmann [Wed, 23 Jul 2008 16:56:13 +0000 (16:56 +0000)]
Move code formatting job ID into a base class

A later patch will add a memory based job storage class, hence this
code is going into a separate class. It also changes the number format
to always use at least 10 digits, allowing up to 9'999'999'999 jobs to
be sorted without using a custom function.
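
The effect of the padding can be shown with a short sketch; format_job_id is a hypothetical name, and only the minimum width of 10 digits comes from the description.

```python
def format_job_id(serial):
    """Zero-pad the serial to at least 10 digits."""
    return "%010d" % serial


# Plain string sorting of padded IDs matches numeric order.
padded = sorted(format_job_id(n) for n in (2, 10, 1))

# Without padding, "10" sorts before "2".
unpadded = sorted(str(n) for n in (2, 10, 1))
```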

Reviewed-by: iustinp

15 years agoUse pidfiles in example init script
Guido Trotter [Wed, 23 Jul 2008 14:24:08 +0000 (14:24 +0000)]
Use pidfiles in example init script

Rather than searching for the ganeti daemons by name we'll use the
pidfile they create to stop them. This change also adds the --oknodo
option to start-stop-daemon when stopping ganeti (which means it won't
give an error if it wasn't started).

Reviewed-by: iustinp

15 years agoganeti-masterd: write and remove pidfile
Guido Trotter [Wed, 23 Jul 2008 14:23:55 +0000 (14:23 +0000)]
ganeti-masterd: write and remove pidfile

Reviewed-by: iustinp

15 years agoganeti-noded: write and remove pid file
Guido Trotter [Wed, 23 Jul 2008 14:23:43 +0000 (14:23 +0000)]
ganeti-noded: write and remove pid file

Reviewed-by: iustinp

15 years agoAdd utils.{Write,Remove}PidFile
Guido Trotter [Wed, 23 Jul 2008 14:23:31 +0000 (14:23 +0000)]
Add utils.{Write,Remove}PidFile

WritePidFile is a helper function that writes the current pid in a
pidfile within the ganeti run directory. RemovePidFile tries to delete
it.

Reviewed-by: iustinp

15 years agoAdd utils.IsPidFileAlive function
Guido Trotter [Wed, 23 Jul 2008 14:23:18 +0000 (14:23 +0000)]
Add utils.IsPidFileAlive function

This helper function reads a pid from a file containing it and checks
whether it refers to a live process.

Reviewed-by: iustinp

15 years agoInvert nodes/instances locking order
Guido Trotter [Wed, 23 Jul 2008 14:23:05 +0000 (14:23 +0000)]
Invert nodes/instances locking order

An implementation mistake from the original design caused nodes to be
locked before instances, rather than after. This patch inverts the level
numbering, changing also the relevant unittests and the recursive
locking function starting point.

Reviewed-by: iustinp

15 years agoGeneralization of bulk output mapping
Oleksiy Mishchenko [Wed, 23 Jul 2008 14:16:53 +0000 (14:16 +0000)]
Generalization of bulk output mapping

Reviewed-by: iustinp

15 years agoRename JobStorage to DiskJobStorage
Michael Hanselmann [Wed, 23 Jul 2008 13:30:15 +0000 (13:30 +0000)]
Rename JobStorage to DiskJobStorage

Reviewed-by: iustinp

15 years agognt-job: Don't treat job IDs as numbers
Michael Hanselmann [Wed, 23 Jul 2008 13:30:03 +0000 (13:30 +0000)]
gnt-job: Don't treat job IDs as numbers

Reviewed-by: iustinp

15 years agoFix logging with string job IDs
Michael Hanselmann [Wed, 23 Jul 2008 12:25:38 +0000 (12:25 +0000)]
Fix logging with string job IDs

The job ID is now a string, hence logging must use %s instead of %d.

Reviewed-by: iustinp

15 years agoSimplify rapi.baserlib.MapFields()
Iustin Pop [Wed, 23 Jul 2008 12:13:16 +0000 (12:13 +0000)]
Simplify rapi.baserlib.MapFields()

We can use zip for simplifying this function. Actually, at this point
I'm not sure if it needs to be a separate function at all.
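
A zip-based version could be as short as the sketch below. The signature, the length check and the lower-case name are assumptions, not the actual baserlib code.

```python
def map_fields(names, data):
    """Pair each field name with the value at the same position."""
    if len(names) != len(data):
        raise ValueError("Names and values must have the same length")
    return dict(zip(names, data))
```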

Reviewed-by: imsnah

15 years agoMake job ID a string
Michael Hanselmann [Wed, 23 Jul 2008 11:34:18 +0000 (11:34 +0000)]
Make job ID a string

The docstring says that _NewSerialUnlocked returns “a string
representing the job identifier”. Until now it returned an
integer and this patch changes it.

Reviewed-by: iustinp

15 years agoDistribute the queue serial file after each update
Iustin Pop [Wed, 23 Jul 2008 10:06:19 +0000 (10:06 +0000)]
Distribute the queue serial file after each update

This patch adds distribution of the queue serial file after each write
to it (but before a new job is created and written with that ID, and
before a response is returned, so we should be safe from crashes in
between).

Currently it only logs if a node cannot be contacted; it should abort if
> 50% errors are seen.

Reviewed-by: imsnah

15 years agoMake the job storage init reuse a serial file
Iustin Pop [Wed, 23 Jul 2008 10:06:08 +0000 (10:06 +0000)]
Make the job storage init reuse a serial file

This will be needed for master failover. If we don't have a valid queue
directory, we need to reinitialize it, but we should keep the existing
serial number.

As such, we abstract the reading of the serial and if we find a valid
serial, we do not reset it.

Reviewed-by: imsnah

15 years agoMove BDEV_CACHE_DIR to RUN_GANETI_DIR/bdev-cache
Guido Trotter [Wed, 23 Jul 2008 08:22:06 +0000 (08:22 +0000)]
Move BDEV_CACHE_DIR to RUN_GANETI_DIR/bdev-cache

This was a TODO for 2.0

Reviewed-by: iustinp

15 years agoConvert SetInstanceParams to concurrency
Guido Trotter [Tue, 22 Jul 2008 14:25:34 +0000 (14:25 +0000)]
Convert SetInstanceParams to concurrency

Grab a lock for the instance we're working on, and update its params.

Reviewed-by: iustinp

15 years agoUse Update in SetInstanceParams
Guido Trotter [Tue, 22 Jul 2008 14:25:21 +0000 (14:25 +0000)]
Use Update in SetInstanceParams

When we set the instance params we're not adding a new instance, but
just updating an existing one, so why use AddInstance?

Reviewed-by: iustinp

15 years agoConvert LUConnectConsole to concurrency
Guido Trotter [Tue, 22 Jul 2008 14:25:08 +0000 (14:25 +0000)]
Convert LUConnectConsole to concurrency

For ConnectConsole we just need to lock the instance we're connecting
to. We make a few rpcs to its primary node, but node daemons can now
handle multiple queries and nodes cannot be removed while they have
instances on them anyway. Note that since we return the ssh command, and
that's executed outside of the ganeti daemon, without any locks held,
the instance can then be subject to operations while we're connected to
it, but that was the previous behavior as well.

Reviewed-by: iustinp

15 years agoAdd _ExpandAndLockInstance auxiliary function.
Guido Trotter [Tue, 22 Jul 2008 14:24:33 +0000 (14:24 +0000)]
Add _ExpandAndLockInstance auxiliary function.

LUs that take an instance name as input and need to expand its name and
lock it can use it to simplify their ExpandNames call. Possibly, an
_ExpandAndLockNode will come as well.

Reviewed-by: iustinp

15 years agoConvert two (simple) LUs to be concurrent
Guido Trotter [Tue, 22 Jul 2008 14:24:19 +0000 (14:24 +0000)]
Convert two (simple) LUs to be concurrent

LUQueryClusterInfo and LUDumpClusterConfig can be made concurrent and
don't need to acquire any locks. In fact they don't interact with the
cluster at all, but just with its configuration, which is thread-safe by
design.

Reviewed-by: iustinp

15 years agoAdd missing empty line
Guido Trotter [Tue, 22 Jul 2008 14:23:53 +0000 (14:23 +0000)]
Add missing empty line

Two top level definitions were separated only by one empty line.
Fixing this.

Reviewed-by: imsnah

15 years agoPut the proper RAPI baserlib
Oleksiy Mishchenko [Tue, 22 Jul 2008 14:12:30 +0000 (14:12 +0000)]
Put the proper RAPI baserlib

Reviewed-by: imsnah

15 years agoMake argument to CleanCacheUnlocked mandatory
Michael Hanselmann [Tue, 22 Jul 2008 14:05:08 +0000 (14:05 +0000)]
Make argument to CleanCacheUnlocked mandatory

Not passing the argument means it has the value None. Iterating None
doesn't work:
  >>> "123" in None
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: iterable argument required

Hence I rename it to "exclude" instead of "exceptions", which may be
confusing, and make it mandatory. If one wants to clean all cache
entries, an empty list can be passed.
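
With a mandatory argument the function reduces to something like this sketch, where the cache is assumed to be a plain dict keyed by job ID and the function name is illustrative:

```python
def clean_cache(cache, exclude):
    """Drop every cached job whose ID is not listed in exclude."""
    for job_id in list(cache):
        if job_id not in exclude:
            del cache[job_id]
```

Passing an empty list clears the whole cache, as described above.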

Reviewed-by: iustinp

15 years agoSplit RAPI resources to pieces
Oleksiy Mishchenko [Tue, 22 Jul 2008 13:33:13 +0000 (13:33 +0000)]
Split RAPI resources to pieces

Reviewed-by: iustinp

15 years agoSplit conditions in worker pool
Michael Hanselmann [Tue, 22 Jul 2008 08:17:52 +0000 (08:17 +0000)]
Split conditions in worker pool

This patch splits the single threading.Condition object used in the
worker pool for synchronization into three.

- worker_to_pool: Notified if a worker wants to notify the pool
- pool_to_worker: Notified if the pool wants to notify a single
  or all workers
- pool_to_pool: Used for synchronization in Quiesce

Reviewed-by: ultrotter

15 years agoHandle signals in node daemon
Michael Hanselmann [Mon, 21 Jul 2008 15:32:54 +0000 (15:32 +0000)]
Handle signals in node daemon

This also fixes a TODO added by ultrotter by killing the parent
process when QuitGanetiException is raised.

Reviewed-by: ultrotter

15 years agoUse new signal handler class in master daemon
Michael Hanselmann [Mon, 21 Jul 2008 15:32:43 +0000 (15:32 +0000)]
Use new signal handler class in master daemon

Reviewed-by: ultrotter

15 years agoAdd signal handler class
Michael Hanselmann [Mon, 21 Jul 2008 15:32:25 +0000 (15:32 +0000)]
Add signal handler class

This signal handler class abstracts some of the code previously
used in other places. It also uninstalls its handler when Reset()
is called or the class is destructed, thereby restoring the
previous behaviour.
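
A handler class along these lines might look like the following sketch. The attribute and method names are assumptions and may differ from the class added by the patch; only the restore-on-reset and restore-on-destruction behaviour comes from the description.

```python
import signal


class SignalHandler:
    """Sketch: record that a signal arrived and restore old handlers later."""

    def __init__(self, signums):
        self.called = False
        self._previous = {}
        for signum in signums:
            # signal.signal returns the handler that was installed before.
            self._previous[signum] = signal.signal(signum, self._handle)

    def _handle(self, signum, frame):
        self.called = True

    def reset(self):
        """Reinstall the handlers that were active before this object."""
        for signum, previous in self._previous.items():
            signal.signal(signum, previous)
        self._previous.clear()

    def __del__(self):
        self.reset()
```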

Reviewed-by: iustinp

16 years agoImplement jobs resource in RAPI
Oleksiy Mishchenko [Thu, 17 Jul 2008 12:51:38 +0000 (12:51 +0000)]
Implement jobs resource in RAPI

Reviewed-by: imsnah

16 years agoBreathe life into RAPI for trunk
Oleksiy Mishchenko [Wed, 16 Jul 2008 12:17:32 +0000 (12:17 +0000)]
Breathe life into RAPI for trunk

Reviewed-by: imsnah

16 years agoFork ganeti-noded
Guido Trotter [Wed, 16 Jul 2008 09:48:20 +0000 (09:48 +0000)]
Fork ganeti-noded

Create a new ForkingHTTPServer in ganeti-noded by deriving both from
NodeDaemonHttpServer and ForkingMixin. This will allow us to process
concurrent requests.

Reviewed-by: imsnah

16 years agoDocumentation updates
Iustin Pop [Tue, 15 Jul 2008 15:47:21 +0000 (15:47 +0000)]
Documentation updates

Reviewed-by: imsnah

16 years agoMigrate RAPI QA to trunk.
Oleksiy Mishchenko [Tue, 15 Jul 2008 13:36:16 +0000 (13:36 +0000)]
Migrate RAPI QA to trunk.

Reviewed-by: imsnah

16 years agoAdd apidoc makefile target
Iustin Pop [Tue, 15 Jul 2008 13:23:14 +0000 (13:23 +0000)]
Add apidoc makefile target

The patch adds the apidoc target and the epydoc config file for it. Note
that this is for epydoc 3.0 and that it will put the docs into
./doc/api/.

The patch also adds a new .gitignore rule for the auto-generated rapi
fragment.

Reviewed-by: imsnah