Michael Hanselmann [Thu, 31 Jul 2008 14:33:58 +0000 (14:33 +0000)]
Add FileLock utility class
This class is a wrapper around fcntl.flock and abstracts opening and
closing the lockfile. It'll be used for the job queue.
(The patch also removes a duplicate import of tempfile into the unittest)
Reviewed-by: iustinp
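As an illustration of the FileLock commit above, here is a minimal sketch of a lock-file wrapper around fcntl.flock. The method names (Exclusive, Shared, Unlock, Close) and the constructor signature are assumptions made for this example; the commit message does not spell out the class's interface.

```python
import fcntl


class FileLock(object):
    """Hypothetical sketch: open a lock file and lock it via fcntl.flock."""

    def __init__(self, filename):
        self.filename = filename
        # Append mode creates the file if needed and never truncates it.
        self.fd = open(filename, "a")

    def _flock(self, flag, blocking):
        if not blocking:
            # LOCK_NB makes flock raise instead of waiting for the lock.
            flag |= fcntl.LOCK_NB
        fcntl.flock(self.fd, flag)

    def Exclusive(self, blocking=True):
        self._flock(fcntl.LOCK_EX, blocking)

    def Shared(self, blocking=True):
        self._flock(fcntl.LOCK_SH, blocking)

    def Unlock(self):
        fcntl.flock(self.fd, fcntl.LOCK_UN)

    def Close(self):
        # Closing the file also releases any lock held through it.
        if self.fd is not None:
            self.fd.close()
            self.fd = None
```

Callers never touch the file descriptor directly, which is the kind of abstraction the commit message describes.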
Michael Hanselmann [Thu, 31 Jul 2008 14:33:36 +0000 (14:33 +0000)]
jqueue: Store context in job queue instead of worker pool
The job queue will need access to the configuration, which is provided
through the context object, to get a list of nodes.
Reviewed-by: iustinp
Oleksiy Mishchenko [Thu, 31 Jul 2008 12:58:55 +0000 (12:58 +0000)]
RAPI: Implement DELETE for tags
Reviewed-by: imsnah
Oleksiy Mishchenko [Thu, 31 Jul 2008 09:06:37 +0000 (09:06 +0000)]
First write operation (add tag) for Ganeti RAPI
Add instance tag handling, improved error logging.
...oh, yes adopt instance listing for RAPI2!
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:58:28 +0000 (15:58 +0000)]
Fix cluster destroy
With the recent startup/shutdown changes (and with the master daemon in
place), the cluster destroy needs some fixing.
This patch moves the finalization of the destroy out from cmdlib into
bootstrap, so we can nicely shutdown the rapi and master daemons.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 15:49:05 +0000 (15:49 +0000)]
Xen: remove two end-of-line semicolons
It's python, isn't it?
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:17:58 +0000 (15:17 +0000)]
Fix cluster init
With the recent changes, I forgot the extra parameter to this rpc call.
Also the rpc call needs to be done after we setup the config data, for
the master daemon to be able to start, so we move it after all other
init steps.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 15:06:01 +0000 (15:06 +0000)]
Make gnt-* commands fail nicely on non-masters
This patch adds a check that we are on the master after failing to
connect to the socket, and logs the master name nicely.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 15:04:48 +0000 (15:04 +0000)]
Parallelize LUFailoverInstance
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 15:04:27 +0000 (15:04 +0000)]
ChainOpCode is still BGL-only
Prevent mistakes with an assert.
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:00:54 +0000 (15:00 +0000)]
Fix a misuse of exc_info in logging.info
This is my fault, sorry.
Reviewed-by: imsnah
Iustin Pop [Wed, 30 Jul 2008 14:04:36 +0000 (14:04 +0000)]
Fix pylint-detected issues
This is mostly:
- whitespace fix (space at EOL in some files, not all, broken
indentation, etc)
- variable names overriding others (one is a real bug in there)
- too-long-lines
- cleanup of most unused imports (not all)
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 13:27:25 +0000 (13:27 +0000)]
Fix some errors detected by pylint
Reviewed-by: imsnah
Iustin Pop [Wed, 30 Jul 2008 12:32:42 +0000 (12:32 +0000)]
Unify SetupDaemon/SetupLogging
The 'old-style' info, error, debug logs do not make much sense. This
patch unifies the SetupLogging and SetupDaemon functions. As a result,
all the commands log to a 'commands.log' file.
The patch also changes the log setup to keep going if there's an error
in setting up the file logging but we're logging to stderr.
Also, burnin now logs to its own file (burnin.log).
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 12:29:07 +0000 (12:29 +0000)]
Simplify the log constants and add another one
The patch changes the log constants by moving the slash to the end of
the log dir instead of at the beginning of *each* log file name.
It also adds a new LOG_COMMANDS constant (to be used in a next patch).
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 12:27:48 +0000 (12:27 +0000)]
Fix gnt-cluster getmaster
This is special in the sense that it can run on any node. As such, we
just instantiate ssconf and read the data from it.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 11:31:09 +0000 (11:31 +0000)]
Parallelize {Startup,Shutdown,Reboot}Instance
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:49 +0000 (11:30 +0000)]
Parallelize LUReinstallInstance
self.recalculate_locks[locking.LEVEL_NODE] could have any value and
everything would work anyway. We'll use the string 'replace' by
convention because in the future we might want an 'append' mode.
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:29 +0000 (11:30 +0000)]
LogicalUnit._LockInstancesNodes helper function
This function is used to lock instances' primary and secondary nodes
after locking instances themselves.
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:10 +0000 (11:30 +0000)]
Make sharing locks possible
LUs can declare which locks they need by populating the
self.needed_locks dictionary, but those locks are always acquired as
exclusive. Make it possible to acquire shared locks as well, by
declaring a particular level as shared in the self.share_locks
dictionary. By default this dictionary is populated so that all locks
are acquired exclusively.
Reviewed-by: iustinp
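The shared-lock declaration described in the commit above can be sketched as follows. Only needed_locks and share_locks come from the commit message; the level constants, the QueryLU class and the PlanLocks helper are hypothetical and exist only to make the example self-contained.

```python
# Hypothetical lock levels, listed in acquisition order.
LEVEL_INSTANCE = 0
LEVEL_NODE = 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]


class LogicalUnit(object):
    def __init__(self):
        self.needed_locks = {}
        # By default every level is acquired exclusively (0 = not shared).
        self.share_locks = dict((level, 0) for level in LEVELS)


class QueryLU(LogicalUnit):
    """A read-only LU that only needs shared node locks."""

    def __init__(self):
        LogicalUnit.__init__(self)
        self.needed_locks[LEVEL_NODE] = ["node1", "node2"]
        self.share_locks[LEVEL_NODE] = 1


def PlanLocks(lu):
    """Return (level, names, shared) tuples in acquisition order."""
    return [(level, lu.needed_locks[level], bool(lu.share_locks[level]))
            for level in LEVELS if level in lu.needed_locks]
```

An LU that never touches share_locks keeps the old behaviour, since all levels stay exclusive.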
Guido Trotter [Wed, 30 Jul 2008 11:29:51 +0000 (11:29 +0000)]
Add LogicalUnit.DeclareLocks
This additional LogicalUnit function is optional to implement, but lets
you change your locking needs for one level just before locking it, but
after the previous levels have been already locked. It is useful for
example to calculate what nodes to lock after locking an instance.
Reviewed-by: iustinp
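A rough sketch of how a DeclareLocks hook like the one in the commit above might be used: the node list is computed only once the instance level has been locked. The INSTANCE_NODES table, the StartupLU class and the LockAll driver are hypothetical stand-ins, not Ganeti code.

```python
LEVEL_INSTANCE, LEVEL_NODE = 0, 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]

# Hypothetical configuration: instance name -> primary and secondary nodes.
INSTANCE_NODES = {"inst1": ["nodeA", "nodeB"]}


class StartupLU(object):
    def __init__(self, instance_name):
        self.needed_locks = {LEVEL_INSTANCE: [instance_name], LEVEL_NODE: []}
        self.acquired = {}

    def DeclareLocks(self, level):
        # Called just before 'level' is locked; earlier levels are held.
        if level == LEVEL_NODE:
            nodes = []
            for inst in self.acquired[LEVEL_INSTANCE]:
                nodes.extend(INSTANCE_NODES[inst])
            self.needed_locks[LEVEL_NODE] = nodes


def LockAll(lu):
    """Walk the levels in order, letting the LU adjust its needs first."""
    for level in LEVELS:
        lu.DeclareLocks(level)
        # Stand-in for real lock acquisition: just record the names.
        lu.acquired[level] = list(lu.needed_locks.get(level, []))
    return lu.acquired
```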
Guido Trotter [Wed, 30 Jul 2008 11:29:31 +0000 (11:29 +0000)]
LURenameInstance, add/remove relevant locks
LURenameInstance forgot to remove the old lock name and add the new one,
making it impossible for parallel LUs to act on the instance (without a
master daemon restart). This also fixes burnin+rename with the
parallelization of {Start,Stop}Instance.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 30 Jul 2008 10:02:02 +0000 (10:02 +0000)]
Rewrite job queue
We found several issues in the old job queue implementation. It had race
conditions, deadlocks and other deficiencies.
Short summary:
- _QueuedOpCode and _QueuedJob are now more or less data structures with a few
utility functions. __Setup is gone.
- DiskJobStorage and JobQueue classes merged into one to reduce code complexity.
- One lock in JobQueue for almost everything. There's also a lock per opcode
for log messages.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 30 Jul 2008 08:56:38 +0000 (08:56 +0000)]
workerpool: Log when waiting for a thread
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 08:43:31 +0000 (08:43 +0000)]
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the
master failover.
What does the patch do?
For master start/stop:
- removes the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master
- adds start/stop of the master/rapi daemon into these functions,
selectively based on the start/stop arguments
- makes the master call via rpc StartMaster(start_daemons=False) to
the local node so that the master IP is started
- and finally changes the example init.d script to directly start and
stop all three daemons, since they do the right thing (depending on
master/not master role)
For master failover:
- moves the code from LUMasterFailover into bootstrap.MasterFailover,
since we need to start/stop the master during this operation and
thus it can't be executed from the master
- removes the LUMasterFailover and its associated opcode
Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
master' are not seen during startup on non-master nodes.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:34:55 +0000 (08:34 +0000)]
Expose utils.DaemonPidFileName
Since we need to compute this from outside utils.py, we change this to a
public function.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:33:49 +0000 (08:33 +0000)]
Implement checking for the master role in rapi
This patch moves the CheckMaster function from ganeti-masterd to ssconf
(most logical place, it cannot go in utils since we would have recursive
imports between ssconf and utils) and changes ganeti-rapi to also call
this function.
This is needed so that starting ganeti-rapi on a non-master node does
the right thing.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:32:38 +0000 (08:32 +0000)]
Add a new parameter to backend.(Start|Stop)Master
This patch adds a new, unused for now, parameter to the start and stop
master operations in backend. The idea behind it is that we need to be
able to control whether the IP (de)activation is coupled with daemon
startup/shutdown.
The callers are also modified to pass this parameter (even if unused for
now).
Reviewed-by: ultrotter
Michael Hanselmann [Tue, 29 Jul 2008 14:07:46 +0000 (14:07 +0000)]
Log thread name when debug output is enabled
Reviewed-by: iustinp
Michael Hanselmann [Tue, 29 Jul 2008 14:07:16 +0000 (14:07 +0000)]
jqueue: Fix error logging
The passed parameters were not correct.
Reviewed-by: iustinp, ultrotter
Iustin Pop [Tue, 29 Jul 2008 10:42:46 +0000 (10:42 +0000)]
Fix constants typo
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 09:06:16 +0000 (09:06 +0000)]
Use constants for the pid file stems
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 08:49:50 +0000 (08:49 +0000)]
Add a KillProcess function
We cannot depend on all environments to have a start-stop-daemon or
similar tool. We instead implement a KillProcess function that behaves
similarly to “start-stop-daemon --retry”.
Note that the attached unittest can hang in foreground if the child
misbehaves (doesn't write to the internal pipe). Since unittests are
either run in the foreground or are run with a timeout from an automated
framework, I think this is an acceptable trade-off (as opposed to using
hardcoded timeouts in the test).
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 08:49:34 +0000 (08:49 +0000)]
Change IsPidFileAlive into ReadPidFile
We already have a function to test if a PID is alive, so it makes more
sense to use function composition than force calling (since we need to
read PIDs from files in other places too). Now IsProcessAlive returns
False for PIDs <= 0, since this is the error return from ReadPidFile.
The patch also adds a unittest for checking that WriteFile raises the
correct exception, and checks that an invalid or missing file causes
ReadPidFile to return zero. The unittest tearDown method will try to
cleanup the temp directory too (otherwise it leaves stuff after it).
Reviewed-by: ultrotter
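The function composition mentioned in the commit above could look roughly like this. The two function names and their return conventions (zero for a missing or invalid file, False for PIDs <= 0) come from the commit message; the bodies are assumptions, in particular the use of signal 0 to probe for a process.

```python
import errno
import os


def ReadPidFile(pidfile):
    """Return the PID stored in pidfile, or 0 if it is missing or invalid."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (EnvironmentError, ValueError):
        return 0


def IsProcessAlive(pid):
    """Check whether a process with the given PID exists."""
    if pid <= 0:
        # Covers the error return of ReadPidFile.
        return False
    try:
        os.kill(pid, 0)  # Signal 0 performs the checks but sends nothing.
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True
```

The old IsPidFileAlive(path) then becomes IsProcessAlive(ReadPidFile(path)), and ReadPidFile stays reusable elsewhere.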
Iustin Pop [Tue, 29 Jul 2008 08:48:23 +0000 (08:48 +0000)]
Make the rapi daemon create a pidfile
This is needed for controlling it cleanly with start-stop daemon.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 10:35:06 +0000 (10:35 +0000)]
Fix unittests for ganeti-rapi
The RESTHTTPServer module went the way of the dodo.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:17:29 +0000 (10:17 +0000)]
Implement signal handling in ganeti-rapi
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:17:13 +0000 (10:17 +0000)]
Move ganeti-rapi core code to daemon
All other daemons have their main code in themselves and not in a module.
This patch does the same to ganeti-rapi by moving the code from
lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:16:51 +0000 (10:16 +0000)]
Replace httperror module with ganeti.http
The generic HTTP server doesn't know about httperror based exceptions
and would treat them as unknown exceptions, thereby not doing the right
thing with HTTP errors.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:13:53 +0000 (10:13 +0000)]
Implement “gnt-job cancel”
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 10:13:37 +0000 (10:13 +0000)]
Implement job canceling on server side
Locking is not completely right due to a deadlock when the job calls
UpdateJob after changing its status.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 09:16:57 +0000 (09:16 +0000)]
Fix exception class name in utils.WritePidFile
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 09:16:39 +0000 (09:16 +0000)]
Add “canceled” status for opcodes
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 09:16:17 +0000 (09:16 +0000)]
Make “gnt-debug delay” work again
The old API is no longer working.
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 25 Jul 2008 12:47:24 +0000 (12:47 +0000)]
Move code extracting job ID into function
It might come in handy at some point and makes the code a bit easier
to read.
Reviewed-by: iustinp
Oleksiy Mishchenko [Fri, 25 Jul 2008 12:32:43 +0000 (12:32 +0000)]
Convert set to a list in LUGetTags
The set triggers exception on a list-tags command and RAPI calls for tags
since it is not serializable by JSON.
Reviewed-by: iustinp
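The failure fixed by the commit above is easy to reproduce with the standard json module, which may differ from the serializer Ganeti used at the time. SerializeTags and CanSerialize are hypothetical helpers, and the sorting is an addition of this sketch to make the output deterministic.

```python
import json


def CanSerialize(value):
    """Return True if json.dumps accepts the value."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


def SerializeTags(tags):
    # A set is rejected by the JSON encoder, so convert it to a list first.
    return json.dumps(sorted(tags))
```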
Oleksiy Mishchenko [Thu, 24 Jul 2008 16:34:49 +0000 (16:34 +0000)]
Switch RAPI to ganeti.http module
Reviewed-by: imsnah
Michael Hanselmann [Thu, 24 Jul 2008 15:04:09 +0000 (15:04 +0000)]
Implement “gnt-job archive” to archive jobs
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:58 +0000 (11:32 +0000)]
Implement job archiving on the server side
So far no error reporting to the client is done. Clients don't get
notified if a job doesn't exist or couldn't be archived because of
its current status.
The internal cache is always cleaned when the preconditions didn't
fail, to make sure that the actual disk status will be reread next
time.
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:46 +0000 (11:32 +0000)]
Add directory for archived jobs
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:30 +0000 (11:32 +0000)]
Fix RPC parameters for {Cancel,Archive}Job
They aren't tuples on the client side.
Reviewed-by: iustinp
Guido Trotter [Thu, 24 Jul 2008 08:46:01 +0000 (08:46 +0000)]
Add utils unittests for new functions
The submitted WritePidFile, RemovePidfile and IsPidFileAlive functions
lack unit tests. Adding a simple one which covers their basic
functionality.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 16:56:13 +0000 (16:56 +0000)]
Move code formatting job ID into a base class
A later patch will add a memory based job storage class, hence this
code is going into a separate class. It also changes the number format
to always use at least 10 digits, allowing up to 9'999'999'999 jobs to
be sorted without using a custom function.
Reviewed-by: iustinp
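The effect of the fixed-width format in the commit above can be shown with a short sketch. FormatJobID is a hypothetical name; only the "at least 10 digits" rule comes from the commit message.

```python
def FormatJobID(serial):
    """Zero-pad the serial number to at least 10 digits."""
    return "%010d" % serial


# Plain string sorting now matches numeric order.
padded = [FormatJobID(n) for n in (10, 2, 9999999999)]
unpadded = [str(n) for n in (10, 2, 9999999999)]
```

Here sorted(padded) yields the IDs for 2, 10 and 9999999999 in that order, while sorted(unpadded) puts "10" before "2".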
Guido Trotter [Wed, 23 Jul 2008 14:24:08 +0000 (14:24 +0000)]
Use pidfiles in example init script
Rather than searching for the ganeti daemons by name we'll use the
pidfile they create to stop them. This change also adds the --oknodo
option to start-stop-daemon when stopping ganeti (which means it won't
give an error if it wasn't started).
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:55 +0000 (14:23 +0000)]
ganeti-masterd: write and remove pidfile
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:43 +0000 (14:23 +0000)]
ganeti-noded: write and remove pid file
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:31 +0000 (14:23 +0000)]
Add utils.{Write,Remove}PidFile
WritePidFile is a helper function that writes the current pid in a
pidfile within the ganeti run directory. RemovePidFile tries to delete
it.
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:18 +0000 (14:23 +0000)]
Add utils.IsPidFileAlive function
This helper function reads a pid from a file containing it and checks
whether it refers to a live process.
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:05 +0000 (14:23 +0000)]
Invert nodes/instances locking order
An implementation mistake from the original design caused nodes to be
locked before instances, rather than after. This patch inverts the level
numbering, changing also the relevant unittests and the recursive
locking function starting point.
Reviewed-by: iustinp
Oleksiy Mishchenko [Wed, 23 Jul 2008 14:16:53 +0000 (14:16 +0000)]
Generalization of bulk output mapping
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 13:30:15 +0000 (13:30 +0000)]
Rename JobStorage to DiskJobStorage
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 13:30:03 +0000 (13:30 +0000)]
gnt-job: Don't treat job IDs as numbers
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 12:25:38 +0000 (12:25 +0000)]
Fix logging with string job IDs
The job ID is now a string, hence logging must use %s instead of %d.
Reviewed-by: iustinp
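The reason for the %s change in the commit above can be demonstrated with plain string formatting, which is used here instead of the logging module so that the error is directly visible. Both helpers are hypothetical.

```python
def FormatJobMessage(job_id):
    # %s accepts any object, so it works for integer and string job IDs.
    return "Processing job %s" % job_id


def FailsWithPercentD(job_id):
    """Return True if formatting the job ID with %d raises TypeError."""
    try:
        "Processing job %d" % job_id
    except TypeError:
        return True
    return False
```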
Iustin Pop [Wed, 23 Jul 2008 12:13:16 +0000 (12:13 +0000)]
Simplify rapi.baserlib.MapFields()
We can use zip for simplifying this function. Actually, at this point
I'm not sure if it needs to be a separate function at all.
Reviewed-by: imsnah
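A zip-based mapping like the one mentioned in the commit above might look as follows. The signature and the length check are assumptions; the commit message only says that zip simplifies the function.

```python
def MapFields(names, data):
    """Pair field names with their values and return them as a dict."""
    if len(names) != len(data):
        raise ValueError("Names and data must have the same length")
    return dict(zip(names, data))
```

With the check removed, the function is a one-liner, which may be why the author wonders whether it needs to exist at all.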
Michael Hanselmann [Wed, 23 Jul 2008 11:34:18 +0000 (11:34 +0000)]
Make job ID a string
The docstring says that _NewSerialUnlocked returns “a string
representing the job identifier”. Until now it returned an
integer and this patch changes it.
Reviewed-by: iustinp
Iustin Pop [Wed, 23 Jul 2008 10:06:19 +0000 (10:06 +0000)]
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write
to it (but before a new job is created and written with that ID, and
before a response is returned, so we should be safe from crashes in
between).
Currently it only logs if a node cannot be contacted; it should abort if
> 50% errors are seen.
Reviewed-by: imsnah
Iustin Pop [Wed, 23 Jul 2008 10:06:08 +0000 (10:06 +0000)]
Make the job storage init reuse a serial file
This will be needed for master failover. If we don't have a valid queue
directory, we need to reinitialize it, but we should keep the existing
serial number.
As such, we abstract the reading of the serial and if we find a valid
serial, we do not reset it.
Reviewed-by: imsnah
Guido Trotter [Wed, 23 Jul 2008 08:22:06 +0000 (08:22 +0000)]
Move BDEV_CACHE_DIR to RUN_GANETI_DIR/bdev-cache
This was a TODO for 2.0
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:34 +0000 (14:25 +0000)]
Convert SetInstanceParams to concurrency
Grab a lock for the instance we're working on, and update its params.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:21 +0000 (14:25 +0000)]
Use Update in SetInstanceParams
When we set the instance params we're not adding a new instance, but
just updating an existing one, so why use AddInstance?
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:08 +0000 (14:25 +0000)]
Convert LUConnectConsole to concurrency
For ConnectConsole we just need to lock the instance we're connecting
to. We make a few rpcs to its primary node, but node daemons can now
handle multiple queries and nodes cannot be removed while they have
instances on them anyway. Note that since we return the ssh command, and
that's executed outside of the ganeti daemon, without any locks held,
the instance can then be subject to operations while we're connected to
it, but that was the previous behavior as well.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:24:33 +0000 (14:24 +0000)]
Add _ExpandAndLockInstance auxiliary function.
LUs that take an instance name as input and need to expand its name and
lock it can use it to simplify their ExpandNames call. Possibly, an
_ExpandAndLockNode will come as well.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:24:19 +0000 (14:24 +0000)]
Convert two (simple) LUs to be concurrent
LUQueryClusterInfo and LUDumpClusterConfig can be made concurrent and
don't need to acquire any locks. In fact they don't interact with the
cluster at all, but just with its configuration, which is thread-safe by
design.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:23:53 +0000 (14:23 +0000)]
Add missing empty line
Two top level definitions were separated only by one empty line.
Fixing this.
Reviewed-by: imsnah
Oleksiy Mishchenko [Tue, 22 Jul 2008 14:12:30 +0000 (14:12 +0000)]
Put the proper RAPI baserlib
Reviewed-by: imsnah
Michael Hanselmann [Tue, 22 Jul 2008 14:05:08 +0000 (14:05 +0000)]
Make argument to CleanCacheUnlocked mandatory
Not passing the argument means it has the value None. Iterating None
doesn't work:
>>> "123" in None
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: iterable argument required
Hence I rename it to "exclude" instead of "exceptions", which may be
confusing, and make it mandatory. If one wants to clean all cache
entries, an empty list can be passed.
Reviewed-by: iustinp
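The mandatory "exclude" argument from the commit above can be sketched like this. CleanCache is a hypothetical free function standing in for the CleanCacheUnlocked method, and the cache is modelled as a plain dict keyed by job ID.

```python
def CleanCache(cache, exclude):
    """Remove every cache entry whose key is not listed in exclude."""
    # Iterate over a copy of the keys because the dict is modified.
    for job_id in list(cache):
        if job_id not in exclude:
            del cache[job_id]
    return cache
```

Passing an empty list clears the whole cache, whereas passing None would fail on the membership test, as the traceback in the commit message shows.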
Oleksiy Mishchenko [Tue, 22 Jul 2008 13:33:13 +0000 (13:33 +0000)]
Split RAPI resources to pieces
Reviewed-by: iustinp
Michael Hanselmann [Tue, 22 Jul 2008 08:17:52 +0000 (08:17 +0000)]
Split conditions in worker pool
This patch splits the single threading.Condition object used in the
worker pool for synchronization into three.
- worker_to_pool: Notified if a worker wants to notify the pool
- pool_to_worker: Notified if the pool wants to notify a single
or all workers
- pool_to_pool: Used for synchronization in Quiesce
Reviewed-by: ultrotter
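To illustrate the three-condition split from the commit above, here is a deliberately small pool. The three condition names, Quiesce and TerminateWorkers appear in the log; everything else (TinyPool, AddTask, sharing one lock between the conditions) is an assumption of this sketch rather than the actual workerpool code.

```python
import collections
import threading


class TinyPool(object):
    def __init__(self, num_workers):
        self._lock = threading.Lock()
        # Three conditions on one lock, one per direction of notification.
        self._pool_to_pool = threading.Condition(self._lock)
        self._pool_to_worker = threading.Condition(self._lock)
        self._worker_to_pool = threading.Condition(self._lock)
        self._tasks = collections.deque()
        self._busy = 0
        self._quiescing = False
        self._quit = False
        self._workers = [threading.Thread(target=self._Run)
                         for _ in range(num_workers)]
        for worker in self._workers:
            worker.daemon = True
            worker.start()

    def AddTask(self, fn):
        with self._lock:
            while self._quiescing:
                self._pool_to_pool.wait()   # Hold new tasks during Quiesce
            self._tasks.append(fn)
            self._pool_to_worker.notify()   # Wake a single worker

    def _Run(self):
        while True:
            with self._lock:
                while not self._tasks and not self._quit:
                    self._pool_to_worker.wait()
                if not self._tasks:
                    return                  # Asked to quit, nothing left
                fn = self._tasks.popleft()
                self._busy += 1
            try:
                fn()
            finally:
                with self._lock:
                    self._busy -= 1
                    self._worker_to_pool.notify_all()

    def Quiesce(self):
        """Block until all queued and running tasks have finished."""
        with self._lock:
            self._quiescing = True
            try:
                while self._tasks or self._busy:
                    self._worker_to_pool.wait()
            finally:
                self._quiescing = False
                self._pool_to_pool.notify_all()

    def TerminateWorkers(self):
        with self._lock:
            self._quit = True
            self._pool_to_worker.notify_all()  # Wake all workers
        for worker in self._workers:
            worker.join()
```

With separate conditions, a worker finishing a task wakes only threads waiting in Quiesce, and adding a task wakes only workers, instead of everything waking on every notification.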
Michael Hanselmann [Mon, 21 Jul 2008 15:32:54 +0000 (15:32 +0000)]
Handle signals in node daemon
This also fixes a TODO added by ultrotter by killing the parent
process when QuitGanetiException is raised.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 21 Jul 2008 15:32:43 +0000 (15:32 +0000)]
Use new signal handler class in master daemon
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 21 Jul 2008 15:32:25 +0000 (15:32 +0000)]
Add signal handler class
This signal handler class abstracts some of the code previously
used in other places. It also uninstalls its handler when Reset()
is called or the class is destructed, thereby restoring the
previous behaviour.
Reviewed-by: iustinp
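A minimal sketch of a signal handler class in the spirit of the commit above. Only Reset() and the idea of restoring the previous behaviour come from the commit message; the constructor argument, the "called" flag and Clear() are assumptions. The sketch covers only the explicit Reset() path, not the restore-on-destruction behaviour the commit also mentions.

```python
import signal


class SignalHandler(object):
    """Install a flag-setting handler and remember what it replaced."""

    def __init__(self, signums):
        self.called = False
        self._previous = {}
        try:
            for signum in signums:
                # signal.signal returns the handler that was installed before.
                self._previous[signum] = signal.signal(signum,
                                                       self._HandleSignal)
        except Exception:
            # Undo partial installation before propagating the error.
            self.Reset()
            raise

    def Reset(self):
        """Restore the handlers that were active before this object."""
        for signum, previous in list(self._previous.items()):
            if previous is not None:
                signal.signal(signum, previous)
            del self._previous[signum]

    def Clear(self):
        self.called = False

    def _HandleSignal(self, signum, frame):
        # Only record the event; the main loop decides what to do with it.
        self.called = True
```

A daemon main loop could then poll handler.called between requests and shut down cleanly when it becomes true.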
Oleksiy Mishchenko [Thu, 17 Jul 2008 12:51:38 +0000 (12:51 +0000)]
Implement jobs resource in RAPI
Reviewed-by: imsnah
Oleksiy Mishchenko [Wed, 16 Jul 2008 12:17:32 +0000 (12:17 +0000)]
Breathe life into RAPI for trunk
Reviewed-by: imsnah
Guido Trotter [Wed, 16 Jul 2008 09:48:20 +0000 (09:48 +0000)]
Fork ganeti-noded
Create a new ForkingHTTPServer in ganeti-noded by deriving both from
NodeDaemonHttpServer and ForkingMixin. This will allow us to process
concurrent requests.
Reviewed-by: imsnah
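The mix-in pattern named in the commit above can be shown with the Python 3 standard library. http.server.HTTPServer stands in for NodeDaemonHttpServer, which is not reproduced here, and the module names are the Python 3 ones rather than those available in 2008.

```python
import http.server
import socketserver


class ForkingHTTPServer(socketserver.ForkingMixIn, http.server.HTTPServer):
    """HTTP server that forks a child process for every request.

    The mix-in must come first so that its process_request overrides the
    sequential one inherited from the base server.
    """
```

Because each request runs in its own child process, a slow request no longer blocks the others, which is the concurrency the commit is after.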
Iustin Pop [Tue, 15 Jul 2008 15:47:21 +0000 (15:47 +0000)]
Documentation updates
Reviewed-by: imsnah
Oleksiy Mishchenko [Tue, 15 Jul 2008 13:36:16 +0000 (13:36 +0000)]
Migrate RAPI QA to trunk.
Reviewed-by: imsnah
Iustin Pop [Tue, 15 Jul 2008 13:23:14 +0000 (13:23 +0000)]
Add apidoc makefile target
The patch adds the apidoc target and the epydoc config file for it. Note
that this is for epydoc 3.0 and that it will put the docs into
./doc/api/.
The patch also adds a new .gitignore rule for the auto-generated rapi
fragment.
Reviewed-by: imsnah
Iustin Pop [Tue, 15 Jul 2008 11:56:18 +0000 (11:56 +0000)]
Rename BaseJO to BaseOpCode
Since we don't have for now a job definition object anymore, we rename
this class to BaseOpCode. It's still useful (and not merged with OpCode)
since it holds all the 'pure' logic (no custom field handling, etc.)
whereas OpCode holds opcode specific data (OP_ID handling, etc).
The patch also fixes the module's docstring.
Reviewed-by: imsnah
Iustin Pop [Tue, 15 Jul 2008 10:49:55 +0000 (10:49 +0000)]
Sort the job list in _GetJobIDsUnlocked
Since the IDs are integers, we can simply sort them.
Reviewed-by: imsnah
Michael Hanselmann [Mon, 14 Jul 2008 15:52:18 +0000 (15:52 +0000)]
Fix previous patch using workerpool in masterd
The function to stop a worker pool is TerminateWorkers(), not Shutdown().
Reviewed-by: iustinp
Iustin Pop [Mon, 14 Jul 2008 15:43:05 +0000 (15:43 +0000)]
Fix a syntax error in gnt-backup
I broke gnt-backup in rev 1035, sorry :(
Reviewed-by: imsnah
Michael Hanselmann [Mon, 14 Jul 2008 15:42:00 +0000 (15:42 +0000)]
Use workerpool in master daemon
Reusing threads instead of starting one for each request is more efficient.
Reviewed-by: iustinp
Iustin Pop [Mon, 14 Jul 2008 15:22:45 +0000 (15:22 +0000)]
Further fixes to enable RAPI startup
Note that since RAPI itself doesn't use luxi.Client yet, nothing works,
but at least it can startup now.
Reviewed-by: imsnah
Iustin Pop [Mon, 14 Jul 2008 15:04:57 +0000 (15:04 +0000)]
Add forgotten RAPI constant
This was forgotten during the forward-porting of RAPI.
Reviewed-by: imsnah
Iustin Pop [Mon, 14 Jul 2008 13:38:14 +0000 (13:38 +0000)]
Improve cli.SubmitOpCode
Currently, the feedback_fn argument to SubmitOpCode is no longer used.
We still need it in burnin, so we re-enable it by making the code call
that function with the msg argument in case feedback_fn is callable. The
patch also modifies burnin to accept the new argument format (msg is now
a triple instead of a string).
The patch also removes the ‘proc’ argument as it's obsolete; instead we
can accept a luxi.Client instance (no one uses this right now).
Reviewed-by: ultrotter
Iustin Pop [Mon, 14 Jul 2008 13:15:58 +0000 (13:15 +0000)]
First version of user feedback fixes
This patch contains a raw version for fixing feedback_fn.
The new mechanism works as follows:
- instead of a per-Processor feedback_fn, there's one for each
ExecOpCode, so that feedback for different opcodes goes via possibly
different functions
- each _QueuedOpCode gets a message buffer, a method for adding
feedback and a method for retrieving (parts of) the feedback
- the _QueuedJob object gets a new attribute that is equal to the
index of the currently executing opcode
- job queries get an extra parameter called 'ticker' that will return
the latest message on the currently executing opcode
- the cli.py job completion poll will show the new status if different
from the old one
Of course, quick messages will be lost, as currently only the latest one
is available. Also changes between opcodes are not represented at all.
Reviewed-by: imsnah
Iustin Pop [Mon, 14 Jul 2008 11:27:40 +0000 (11:27 +0000)]
Cache some jobs in memory
This patch adds a caching mechanism to the JobStorage. Note that it
does not make the memory cache authoritative.
The algorithm is:
- all jobs loaded from disks are entered in the cache
- all new jobs are entered in the cache
- at each job save (in UpdateJobUnlocked), jobs which are not
executing or queued are removed from the cache
The end effect is that running jobs will always be in the cache (which
will fix the opcode log changes) and finished jobs will be kept for a
while in the cache after being loaded.
Reviewed-by: imsnah
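The caching algorithm listed in the commit above can be sketched as follows. A dict stands in for the on-disk job files, and the class, method and status names are hypothetical; only the three rules (cache on load, cache on creation, evict non-queued and non-running jobs on each save) are taken from the commit message.

```python
STATUS_QUEUED = "queued"
STATUS_RUNNING = "running"
STATUS_SUCCESS = "success"


class JobStorage(object):
    def __init__(self, disk):
        self._disk = disk        # Stand-in for the job files on disk
        self._memcache = {}      # Non-authoritative in-memory cache

    def LoadJob(self, job_id):
        job = self._memcache.get(job_id)
        if job is None:
            # Every job loaded from disk is entered in the cache.
            job = dict(self._disk[job_id])
            self._memcache[job_id] = job
        return job

    def AddJob(self, job_id, job):
        # Every new job is entered in the cache, then saved.
        self._memcache[job_id] = job
        self.UpdateJob(job_id, job)

    def UpdateJob(self, job_id, job):
        # The disk copy remains the authoritative one.
        self._disk[job_id] = dict(job)
        # On each save, drop jobs that are neither queued nor running.
        for cached_id in list(self._memcache):
            status = self._memcache[cached_id]["status"]
            if status not in (STATUS_QUEUED, STATUS_RUNNING):
                del self._memcache[cached_id]
```

A finished job loaded for a query therefore stays cached only until the next save, while a running job is never evicted.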
Iustin Pop [Mon, 14 Jul 2008 11:12:50 +0000 (11:12 +0000)]
Fix JobStorage._GetJobIDsUnlocked
The job ID returned must be an integer (and the regex enforces that),
but we didn't convert it manually.
Reviewed-by: imsnah
Iustin Pop [Mon, 14 Jul 2008 10:08:52 +0000 (10:08 +0000)]
Change JobStorage to work with ids not filenames
Currently some of the functions in JobStorage work with filenames (which
is an implementation detail and should only be used when dealing with
the storage) and not with job IDs. We need to change this in order to
implement a job cache.
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 11 Jul 2008 16:17:35 +0000 (16:17 +0000)]
Add experimental persistency to job queue
It's not perfect and it's not finished, but it's a start.
- Serial number is read only once, but written on each update
- Jobs are kept only on disk (caching will be implemented)
Reviewed-by: iustinp