Michael Hanselmann [Thu, 7 Aug 2008 13:03:45 +0000 (13:03 +0000)]
Use API instead of command line utilities in watcher
Reviewed-by: iustinp
Michael Hanselmann [Thu, 7 Aug 2008 09:07:27 +0000 (09:07 +0000)]
Fix cli.PollJob
feedback_fn wasn't passed to it.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 14:56:26 +0000 (14:56 +0000)]
Notify job queue about added/removed nodes
The job queue maintains its own node list and must be notified
when nodes are added/removed.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 14:56:10 +0000 (14:56 +0000)]
Implement {Add,Readd,Remove}Node in GanetiContext
By doing this we've a central place which coordinates what needs to be
done when adding or removing nodes. Another patch will add calls into
the job queue.
Two log messages move to config.py.
When removing a node, node_leave_cluster is now called after it has
been removed from the configuration and job manager. That way we're
sure not to access the node again after files have been removed.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 13:36:23 +0000 (13:36 +0000)]
jqueue: Implement {Add,Remove}Node
These functions will be used to notify the queue about newly added
or removed nodes.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 13:35:59 +0000 (13:35 +0000)]
jqueue: Don't pass the list of nodes to SubmitJob anymore
The job queue now maintains its own list and is updated when
nodes are added or removed from the cluster.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 13:35:38 +0000 (13:35 +0000)]
Maintain node list in job queue
The code makes sure not to include the master in the list.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 13:35:22 +0000 (13:35 +0000)]
masterd: Move job queue into context object
The job queue must be called from cmdlib when adding or removing
nodes to the cluster. Moving it to the context objects makes
this possible.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 11:27:41 +0000 (11:27 +0000)]
Clean job queue directories when leaving cluster
Old job files shouldn't be left on nodes removed from a cluster.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 08:28:06 +0000 (08:28 +0000)]
Use new RPC call in “gnt-node list”
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 08:26:01 +0000 (08:26 +0000)]
Implement query for nodes
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 08:25:31 +0000 (08:25 +0000)]
Use new query RPC call in “gnt-instance list”
Reviewed-by: iustinp
Michael Hanselmann [Wed, 6 Aug 2008 08:25:03 +0000 (08:25 +0000)]
Implement query for instances
Queries don't create jobs and are more efficient. Log messages
are not yet stored anywhere.
Reviewed-by: iustinp
Michael Hanselmann [Tue, 5 Aug 2008 10:33:08 +0000 (10:33 +0000)]
jqueue: Replicate jobs to all nodes
Newly added nodes are not yet taken care of. Queue locking on
non-master nodes is not yet correct.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 4 Aug 2008 12:27:18 +0000 (12:27 +0000)]
jqueue: Use new jstore module
Reviewed-by: iustinp
Michael Hanselmann [Mon, 4 Aug 2008 12:27:01 +0000 (12:27 +0000)]
jstore: Add queue helper functions
This will be used to move common code out of jqueue.
Reviewed-by: iustinp
Iustin Pop [Mon, 4 Aug 2008 09:47:08 +0000 (09:47 +0000)]
Implement job submission for scripts
This patch adds the infrastructure for executing a job in background,
instead of foreground, via a new “--submit” option. The behaviour is
that the job ID is printed and the script will immediately exit.
The patch also converts gnt-node list to this model (yes, this will be a
query in the future).
Reviewed-by: imsnah
Iustin Pop [Mon, 4 Aug 2008 09:14:31 +0000 (09:14 +0000)]
Another typo in the install doc
Reviewed-by: imsnah
Iustin Pop [Mon, 4 Aug 2008 09:14:16 +0000 (09:14 +0000)]
Update the module build section of install doc
Reviewed-by: imsnah
Michael Hanselmann [Thu, 31 Jul 2008 15:03:13 +0000 (15:03 +0000)]
jqueue: Move assert into decorator
This reduces code duplication. A later patch will modify the job queue
a bit more and will need a change of this assert. The assertion is
also removed from all class-internal functions.
Reviewed-by: iustinp
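As an illustration of the idea in the entry above, a precondition assert can be hoisted into a decorator roughly as follows. This is only a sketch: the decorator name, the `_lock` attribute and the `DemoQueue` class are hypothetical, and the actual assertion used in jqueue is not shown in this log.

```python
import functools
import threading


def require_lock_held(fn):
    """Assert that self._lock is held before running the wrapped method."""
    @functools.wraps(fn)
    def wrapper(self, *args, **kwargs):
        # Hypothetical invariant: public methods are only called while the
        # lock is held. Lock.locked() reports that *someone* holds the
        # lock, not necessarily the calling thread.
        assert self._lock.locked(), "queue lock must be held"
        return fn(self, *args, **kwargs)
    return wrapper


class DemoQueue:
    """Toy queue; stands in for the real job queue class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = []

    @require_lock_held
    def add(self, job):
        self._jobs.append(job)
        return len(self._jobs)
```

The check lives in one place, so changing the invariant later means touching only the decorator.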
Iustin Pop [Thu, 31 Jul 2008 14:52:04 +0000 (14:52 +0000)]
Split cli.SubmitOpCode in two parts
The current SubmitOpCode function is not flexible enough to be used for
submitters that don't want to wait for the job to finish.
The patch splits this in two, a SendJob function and a PollJob one, and
the old SubmitOpCode becomes a wrapper. Note that the new SendJob takes
a list of opcodes (and not a single opcode anymore).
Reviewed-by: imsnah
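The split described in the entry above might look roughly like the sketch below. The `FakeClient` class and all lower-case function names are hypothetical stand-ins, not the signatures in cli.py; the only points taken from the commit message are that sending takes a list of opcodes and that the old entry point becomes a wrapper around send plus poll.

```python
import time


class FakeClient:
    """Stand-in for the real client; every method here is hypothetical."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def submit(self, opcodes):
        job_id = str(self._next_id)
        self._next_id += 1
        # Pretend each job needs two status polls before it finishes.
        self._jobs[job_id] = {"polls_left": 2, "ops": list(opcodes)}
        return job_id

    def status(self, job_id):
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return "running", None
        return "success", ["done: %s" % op for op in job["ops"]]


def send_job(ops, client):
    """Submit a list of opcodes and return the job ID without waiting."""
    return client.submit(ops)


def poll_job(job_id, client, interval=0.0):
    """Block until the job finishes, then return its per-opcode results."""
    while True:
        state, results = client.status(job_id)
        if state == "success":
            return results
        time.sleep(interval)


def submit_opcode(op, client):
    """Wrapper keeping the old behaviour: one opcode in, one result out."""
    job_id = send_job([op], client)
    return poll_job(job_id, client)[0]
```

A caller that does not want to wait uses `send_job` alone and keeps the returned job ID.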
Michael Hanselmann [Thu, 31 Jul 2008 14:42:11 +0000 (14:42 +0000)]
Allow job queue files to be uploaded through ganeti-noded
This is needed for job queue replication.
Reviewed-by: iustinp
Michael Hanselmann [Thu, 31 Jul 2008 14:33:58 +0000 (14:33 +0000)]
Add FileLock utility class
This class is a wrapper around fcntl.flock and abstracts opening and
closing the lockfile. It'll be used for the job queue.
(The patch also removes a duplicate import of tempfile into the unittest)
Reviewed-by: iustinp
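A wrapper of the kind described in the entry above could be sketched as follows. The class and method names are assumptions made for illustration and are not taken from the patch; only `fcntl.flock` and its `LOCK_*` flags are the real standard-library API. The sketch is Unix-only.

```python
import fcntl
import os


class FileLockSketch:
    """Open a lock file and guard it with fcntl.flock (hypothetical names)."""

    def __init__(self, filename):
        self.filename = filename
        # Create the lock file if it does not exist yet.
        self._fd = os.open(filename, os.O_RDWR | os.O_CREAT, 0o600)

    def _flock(self, flag, blocking):
        if not blocking:
            flag |= fcntl.LOCK_NB  # raise OSError instead of waiting
        fcntl.flock(self._fd, flag)

    def exclusive(self, blocking=True):
        self._flock(fcntl.LOCK_EX, blocking)

    def shared(self, blocking=True):
        self._flock(fcntl.LOCK_SH, blocking)

    def unlock(self):
        fcntl.flock(self._fd, fcntl.LOCK_UN)

    def close(self):
        # Closing the descriptor also releases any lock held through it.
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None
```

Callers never touch the file descriptor directly, which is the abstraction the commit message refers to.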
Michael Hanselmann [Thu, 31 Jul 2008 14:33:36 +0000 (14:33 +0000)]
jqueue: Store context in job queue instead of worker pool
The job queue will need access to the configuration, which is provided
through the context object, to get a list of nodes.
Reviewed-by: iustinp
Oleksiy Mishchenko [Thu, 31 Jul 2008 12:58:55 +0000 (12:58 +0000)]
RAPI Implement DELETE for tags
Reviewed-by: imsnah
Oleksiy Mishchenko [Thu, 31 Jul 2008 09:06:37 +0000 (09:06 +0000)]
First write operation (add tag) for Ganeti RAPI
Add instance tag handling, improved error logging.
...oh, yes adopt instance listing for RAPI2!
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:58:28 +0000 (15:58 +0000)]
Fix cluster destroy
With the recent startup/shutdown changes (and with the master daemon in
place), the cluster destroy needs some fixing.
This patch moves the finalization of the destroy out from cmdlib into
bootstrap, so we can nicely shutdown the rapi and master daemons.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 15:49:05 +0000 (15:49 +0000)]
Xen: remove two end-of-line semicolons
It's python, isn't it?
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:17:58 +0000 (15:17 +0000)]
Fix cluster init
With the recent changes, I forgot the extra parameter to this rpc call.
Also the rpc call needs to be done after we setup the config data, for
the master daemon to be able to start, so we move it after all other
init steps.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 15:06:01 +0000 (15:06 +0000)]
Make gnt-* commands fail nicely on non-masters
This patch adds a check that we are on the master after failing to
connect to the socket, and logs the master name nicely.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 15:04:48 +0000 (15:04 +0000)]
Parallelize LUFailoverInstance
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 15:04:27 +0000 (15:04 +0000)]
ChainOpCode is still BGL-only
Prevent mistakes with an assert.
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 15:00:54 +0000 (15:00 +0000)]
Fix a misuse of exc_info in logging.info
This is my fault, sorry.
Reviewed-by: imsnah
Iustin Pop [Wed, 30 Jul 2008 14:04:36 +0000 (14:04 +0000)]
Fix pylint-detected issues
This is mostly:
- whitespace fix (space at EOL in some files, not all, broken
indentation, etc)
- variable names overriding others (one is a real bug in there)
- too-long-lines
- cleanup of most unused imports (not all)
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 13:27:25 +0000 (13:27 +0000)]
Fix some errors detected by pylint
Reviewed-by: imsnah
Iustin Pop [Wed, 30 Jul 2008 12:32:42 +0000 (12:32 +0000)]
Unify SetupDaemon/SetupLogging
The 'old-style' info, error, debug logs do not make much sense. This
patch unifies the SetupLogging and SetupDaemon functions. As a result,
all the commands log to a 'commands.log' file.
The patch also changes the log setup to keep going if there's an error
in setting up the file logging but we're logging to stderr.
Also, burnin now logs to its own file (burnin.log).
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 12:29:07 +0000 (12:29 +0000)]
Simplify the log constants and add another one
The patch changes the log constants by moving the slash to the end of
the log dir instead of at the beginning of *each* log file name.
It also adds a new LOG_COMMANDS constant (to be used in a next patch).
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 12:27:48 +0000 (12:27 +0000)]
Fix gnt-cluster getmaster
This is special in the sense that it can run on any node. As such, we
just instantiate ssconf and read the data from it.
Reviewed-by: ultrotter
Guido Trotter [Wed, 30 Jul 2008 11:31:09 +0000 (11:31 +0000)]
Parallelize {Startup,Shutdown,Reboot}Instance
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:49 +0000 (11:30 +0000)]
Parallelize LUReinstallInstance
self.recalculate_locks[locking.LEVEL_NODE] could have any value and
everything would work anyway. We'll use the string 'replace' by
convention because in the future we might want an 'append' mode.
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:29 +0000 (11:30 +0000)]
LogicalUnit._LockInstancesNodes helper function
This function is used to lock instances' primary and secondary nodes
after locking instances themselves.
Reviewed-by: iustinp
Guido Trotter [Wed, 30 Jul 2008 11:30:10 +0000 (11:30 +0000)]
Make sharing locks possible
LUs can declare which locks they need by populating the
self.needed_locks dictionary, but those locks are always acquired as
exclusive. Make it possible to acquire shared locks as well, by
declaring a particular level as shared in the self.share_locks
dictionary. By default this dictionary is populated so that all locks
are acquired exclusively.
Reviewed-by: iustinp
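To make the entry above concrete, the interplay of needed_locks and share_locks can be sketched like this. The level constants, the `DemoLU` class and `plan_acquisitions` are hypothetical; the sketch only computes which mode each lock would be requested in and does not acquire anything.

```python
# Hypothetical level constants, listed in locking order.
LEVEL_INSTANCE = 0
LEVEL_NODE = 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]


class DemoLU:
    """Toy logical unit; only the two dictionaries matter here."""

    def __init__(self):
        self.needed_locks = {}
        # Default: every level is acquired exclusively (0 = exclusive,
        # 1 = shared), mirroring the default described in the entry.
        self.share_locks = dict.fromkeys(LEVELS, 0)


def plan_acquisitions(lu):
    """Return (level, name, mode) tuples in the order locks are taken."""
    plan = []
    for level in LEVELS:
        mode = "shared" if lu.share_locks[level] else "exclusive"
        for name in lu.needed_locks.get(level, []):
            plan.append((level, name, mode))
    return plan
```

An LU that only reads node state would set `share_locks[LEVEL_NODE] = 1` and leave the instance level exclusive.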
Guido Trotter [Wed, 30 Jul 2008 11:29:51 +0000 (11:29 +0000)]
Add LogicalUnit.DeclareLocks
This additional LogicalUnit function is optional to implement, but lets
you change your locking needs for one level just before locking it, but
after the previous levels have been already locked. It is useful for
example to calculate what nodes to lock after locking an instance.
Reviewed-by: iustinp
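The hook described in the entry above can be pictured with the following sketch. Everything except the method name DeclareLocks is an assumption: the level constants, the toy configuration (a plain dict mapping instance names to node names) and the `lock_all` driver are invented for illustration, and no real locks are taken.

```python
# Hypothetical level constants, listed in locking order.
LEVEL_INSTANCE = 0
LEVEL_NODE = 1
LEVELS = [LEVEL_INSTANCE, LEVEL_NODE]


class DemoInstanceLU:
    """Toy logical unit that learns its node locks from its instance."""

    def __init__(self, instance_name, config):
        self.config = config  # hypothetical: {instance name: [node names]}
        self.needed_locks = {LEVEL_INSTANCE: [instance_name], LEVEL_NODE: []}

    def DeclareLocks(self, level):
        # Called just before `level` is locked; earlier levels are
        # already held, so the instance data can be read safely.
        if level == LEVEL_NODE:
            nodes = []
            for inst in self.needed_locks[LEVEL_INSTANCE]:
                nodes.extend(self.config[inst])
            self.needed_locks[LEVEL_NODE] = nodes


def lock_all(lu):
    """Walk the levels in order, letting the LU adjust each one first."""
    acquired = []
    for level in LEVELS:
        lu.DeclareLocks(level)
        acquired.extend((level, name) for name in lu.needed_locks[level])
    return acquired
```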
Guido Trotter [Wed, 30 Jul 2008 11:29:31 +0000 (11:29 +0000)]
LURenameInstance, add/remove relevant locks
LURenameInstance forgot to remove the old lock name and add the new one,
making it impossible for parallel LUs to act on the instance (without a
master daemon restart). This also fixes burnin+rename with the
parallelization of {Start,Stop}Instance.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 30 Jul 2008 10:02:02 +0000 (10:02 +0000)]
Rewrite job queue
We found several issues in the old job queue implementation. It had race
conditions, deadlocks and other deficiencies.
Short summary:
- _QueuedOpCode and _QueuedJob are now more or less data structures with a few
utility functions. __Setup is gone.
- DiskJobStorage and JobQueue classes merged into one to reduce code complexity.
- One lock in JobQueue for almost everything. There's also a lock per opcode
for log messages.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 30 Jul 2008 08:56:38 +0000 (08:56 +0000)]
workerpool: Log when waiting for a thread
Reviewed-by: iustinp
Iustin Pop [Wed, 30 Jul 2008 08:43:31 +0000 (08:43 +0000)]
Rework master startup/shutdown/failover
This (big) patch reworks the master startup/shutdown and fixes the
master failover.
What does the patch do?
For master start/stop:
- remove the old ganeti-master script and its associated man page
- moves the ip start/stop directly into the backend.(Start|Stop)Master
- adds start/stop of the master/rapi daemon into these functions,
selectively based on the start/stop arguments
- makes the master call via rpc StartMaster(start_daemons=False) to
the local node so that the master IP is started
- and finally changes the example init.d script to directly start and
stop all three daemons, since they do the right thing (depending on
master/not master role)
For master failover:
- moves the code from LUMasterFailover into bootstrap.MasterFailover,
since we need to start/stop the master during this operation and
thus it can't be executed from the master
- removes the LUMasterFailover and its associated opcode
Notes: ubuntu's /etc/lsb-base-logging.sh is dumb, so the messages 'not
master' are not seen during startup on non-master nodes.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:34:55 +0000 (08:34 +0000)]
Expose utils.DaemonPidFileName
Since we need to compute this from outside utils.py, we change this to a
public function.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:33:49 +0000 (08:33 +0000)]
Implement checking for the master role in rapi
This patch moves the CheckMaster function from ganeti-masterd to ssconf
(most logical place, it cannot go in utils since we would have recursive
imports between ssconf and utils) and changes ganeti-rapi to also call
this function.
This is needed so that starting ganeti-rapi on a non-master node does
the right thing.
Reviewed-by: ultrotter
Iustin Pop [Wed, 30 Jul 2008 08:32:38 +0000 (08:32 +0000)]
Add a new parameter to backend.(Start|Stop)Master
This patch adds a new, unused for now, parameter to the start and stop
master operations in backend. The idea behind it is that we need to be
able to control whether the IP (de)activation is coupled with daemon
startup/shutdown.
The callers are also modified to pass this parameter (even if unused for
now).
Reviewed-by: ultrotter
Michael Hanselmann [Tue, 29 Jul 2008 14:07:46 +0000 (14:07 +0000)]
Log thread name when debug output is enabled
Reviewed-by: iustinp
Michael Hanselmann [Tue, 29 Jul 2008 14:07:16 +0000 (14:07 +0000)]
jqueue: Fix error logging
The passed parameters were not correct.
Reviewed-by: iustinp, ultrotter
Iustin Pop [Tue, 29 Jul 2008 10:42:46 +0000 (10:42 +0000)]
Fix constants typo
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 09:06:16 +0000 (09:06 +0000)]
Use constants for the pid file stems
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 08:49:50 +0000 (08:49 +0000)]
Add a KillProcess function
We cannot depend on all environments to have a start-stop-daemon or
similar tool. We instead implement a KillProcess function that behaves
similarly to “start-stop-daemon --retry”.
Note that the attached unittest can hang in foreground if the child
misbehaves (doesn't write to the internal pipe). Since unittests are
either run in the foreground or are run with a timeout from an automated
framework, I think this is an acceptable trade-off (against using
hardcoded timeouts in the test).
Reviewed-by: imsnah
Iustin Pop [Tue, 29 Jul 2008 08:49:34 +0000 (08:49 +0000)]
Change IsPidFileAlive into ReadPidFile
We already have a function to test if a PID is alive, so it makes more
sense to use function composition than to force calling (since we need to
read PIDs from files in other places too). Now IsProcessAlive returns
False for PIDs <= 0, since this is the error return from ReadPidFile.
The patch also adds a unittest for checking that WriteFile raises the
correct exception, and checks that an invalid or missing file causes
ReadPidFile to return zero. The unittest tearDown method will try to
cleanup the temp directory too (otherwise it leaves stuff after it).
Reviewed-by: ultrotter
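The composition described in the entry above could look roughly like this. The lower-case names are hypothetical, and probing with `os.kill(pid, 0)` is an assumption chosen for brevity, not necessarily how utils.IsProcessAlive is implemented; what is taken from the commit message is that the reader returns zero on a missing or invalid file and that the liveness check returns False for PIDs <= 0.

```python
import os


def read_pid_file(path):
    """Return the PID stored in `path`, or 0 if it is missing or invalid."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return 0


def is_process_alive(pid):
    """Return True if a process with this PID appears to exist."""
    if pid <= 0:
        # 0 is the error value from read_pid_file, so the two functions
        # compose without an extra check at the call site.
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def is_pid_file_alive(path):
    """The old single-purpose check, expressed as a composition."""
    return is_process_alive(read_pid_file(path))
```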
Iustin Pop [Tue, 29 Jul 2008 08:48:23 +0000 (08:48 +0000)]
Make the rapi daemon create a pidfile
This is needed for controlling it cleanly with start-stop daemon.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 10:35:06 +0000 (10:35 +0000)]
Fix unittests for ganeti-rapi
The RESTHTTPServer module went the way of the dodo.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:17:29 +0000 (10:17 +0000)]
Implement signal handling in ganeti-rapi
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:17:13 +0000 (10:17 +0000)]
Move ganeti-rapi core code to daemon
All other daemons have their main code in themselves and not in a module.
This patch does the same to ganeti-rapi by moving the code from
lib/rapi/RESTHTTPServer.py to daemons/ganeti-rapi.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:16:51 +0000 (10:16 +0000)]
Replace httperror module with ganeti.http
The generic HTTP server doesn't know about httperror based exceptions
and would treat them as unknown exceptions, thereby not doing the right
thing with HTTP errors.
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 10:13:53 +0000 (10:13 +0000)]
Implement “gnt-job cancel”
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 10:13:37 +0000 (10:13 +0000)]
Implement job canceling on server side
Locking is not completely right due to a deadlock when the job calls
UpdateJob after changing its status.
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 09:16:57 +0000 (09:16 +0000)]
Fix exception class name in utils.WritePidFile
Reviewed-by: iustinp
Michael Hanselmann [Mon, 28 Jul 2008 09:16:39 +0000 (09:16 +0000)]
Add “canceled” status for opcodes
Reviewed-by: ultrotter
Michael Hanselmann [Mon, 28 Jul 2008 09:16:17 +0000 (09:16 +0000)]
Make “gnt-debug delay” work again
The old API is no longer working.
Reviewed-by: ultrotter
Michael Hanselmann [Fri, 25 Jul 2008 12:47:24 +0000 (12:47 +0000)]
Move code extracting job ID into function
It might come in handy at some point and makes the code a bit easier
to read.
Reviewed-by: iustinp
Oleksiy Mishchenko [Fri, 25 Jul 2008 12:32:43 +0000 (12:32 +0000)]
Convert set to a list in LUGetTags
The set triggers exception on a list-tags command and RAPI calls for tags
since it is not serializable by JSON.
Reviewed-by: iustinp
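The failure mode in the entry above is easy to reproduce with the standard json module (used here as a stand-in; which serializer Ganeti used at the time is not stated in this log). The function name is hypothetical, and the sketch sorts the tags only to get a deterministic order, whereas the commit message just mentions conversion to a list.

```python
import json


def encode_tags(tags):
    """Serialize a collection of tags; sets must become lists first."""
    return json.dumps(sorted(tags))


def encode_tags_broken(tags):
    """What happens without the conversion: TypeError for a set."""
    return json.dumps(tags)
```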
Oleksiy Mishchenko [Thu, 24 Jul 2008 16:34:49 +0000 (16:34 +0000)]
Switch RAPI to ganeti.http module
Reviewed-by: imsnah
Michael Hanselmann [Thu, 24 Jul 2008 15:04:09 +0000 (15:04 +0000)]
Implement “gnt-job archive” to archive jobs
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:58 +0000 (11:32 +0000)]
Implement job archiving on the server side
So far no error reporting to the client is done. Clients don't get
notified if a job doesn't exist or couldn't be archived because of
its current status.
The internal cache is always cleaned when the preconditions didn't
fail to make sure that the actual disk status will be reread next
time.
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:46 +0000 (11:32 +0000)]
Add directory for archived jobs
Reviewed-by: iustinp
Michael Hanselmann [Thu, 24 Jul 2008 11:32:30 +0000 (11:32 +0000)]
Fix RPC parameters for {Cancel,Archive}Job
They aren't tuples on the client side.
Reviewed-by: iustinp
Guido Trotter [Thu, 24 Jul 2008 08:46:01 +0000 (08:46 +0000)]
Add utils unittests for new functions
The submitted WritePidFile, RemovePidFile and IsPidFileAlive functions
lack unit tests. Adding a simple one which covers their basic
functionality.
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 16:56:13 +0000 (16:56 +0000)]
Move code formatting job ID into a base class
A later patch will add a memory based job storage class, hence this
code is going into a separate class. It also changes the number format
to always use at least 10 digits, allowing up to 9'999'999'999 jobs to
be sorted without using a custom function.
Reviewed-by: iustinp
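The point about sorting in the entry above can be seen in a few lines. The function name is hypothetical; the only detail taken from the commit message is the minimum width of 10 digits.

```python
def format_job_number(serial):
    """Zero-pad a job number to at least 10 digits."""
    return "%010d" % serial


def sorted_plain(serials):
    """Sort unpadded decimal strings; the order is lexicographic."""
    return sorted(str(s) for s in serials)


def sorted_padded(serials):
    """Sort padded strings; lexicographic order matches numeric order."""
    return sorted(format_job_number(s) for s in serials)
```

With unpadded strings, "100" sorts before "9"; with padding, plain string comparison is enough until the number grows past 10 digits.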
Guido Trotter [Wed, 23 Jul 2008 14:24:08 +0000 (14:24 +0000)]
Use pidfiles in example init script
Rather than searching for the ganeti daemons by name we'll use the
pidfile they create to stop them. This change also adds the --oknodo
option to start-stop-daemon when stopping ganeti (which means it won't
give an error if it wasn't started).
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:55 +0000 (14:23 +0000)]
ganeti-masterd: write and remove pidfile
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:43 +0000 (14:23 +0000)]
ganeti-noded: write and remove pid file
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:31 +0000 (14:23 +0000)]
Add utils.{Write,Remove}PidFile
WritePidFile is a helper function that writes the current pid in a
pidfile within the ganeti run directory. RemovePidFile tries to delete
it.
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:18 +0000 (14:23 +0000)]
Add utils.IsPidFileAlive function
This helper function reads a pid from a file containing it and checks
whether it refers to a live process.
Reviewed-by: iustinp
Guido Trotter [Wed, 23 Jul 2008 14:23:05 +0000 (14:23 +0000)]
Invert nodes/instances locking order
An implementation mistake from the original design caused nodes to be
locked before instances, rather than after. This patch inverts the level
numbering, changing also the relevant unittests and the recursive
locking function starting point.
Reviewed-by: iustinp
Oleksiy Mishchenko [Wed, 23 Jul 2008 14:16:53 +0000 (14:16 +0000)]
Generalization of bulk output mapping
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 13:30:15 +0000 (13:30 +0000)]
Rename JobStorage to DiskJobStorage
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 13:30:03 +0000 (13:30 +0000)]
gnt-job: Don't treat job IDs as numbers
Reviewed-by: iustinp
Michael Hanselmann [Wed, 23 Jul 2008 12:25:38 +0000 (12:25 +0000)]
Fix logging with string job IDs
The job ID is now a string, hence logging must use %s instead of %d.
Reviewed-by: iustinp
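The formatting problem behind the entry above can be shown with the plain % operator (the function names and message text are made up for illustration). Note that the logging module itself would report such a mismatch on stderr instead of raising, so the failure is easy to miss.

```python
def describe_job(job_id):
    """Format a message for a job whose ID is a string."""
    return "Processing job %s" % job_id


def describe_job_old(job_id):
    """The old pattern: %d raises TypeError when given a string."""
    return "Processing job %d" % job_id
```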
Iustin Pop [Wed, 23 Jul 2008 12:13:16 +0000 (12:13 +0000)]
Simplify rapi.baserlib.MapFields()
We can use zip for simplifying this function. Actually, at this point
I'm not sure if it needs to be a separate function at all.
Reviewed-by: imsnah
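A zip-based mapping of the kind mentioned in the entry above can be as small as the sketch below. The lower-case name, the length check and the choice of exception are assumptions, not a copy of rapi.baserlib.MapFields.

```python
def map_fields(names, data):
    """Pair field names with values and return them as a dictionary."""
    if len(names) != len(data):
        # zip() would silently drop the surplus items, so check first.
        raise ValueError("names and data must have the same length")
    return dict(zip(names, data))
```

At this size the body is a single expression, which is presumably why the commit message questions whether a separate function is needed.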
Michael Hanselmann [Wed, 23 Jul 2008 11:34:18 +0000 (11:34 +0000)]
Make job ID a string
The docstring says that _NewSerialUnlocked returns “a string
representing the job identifier”. Until now it returned an
integer and this patch changes it.
Reviewed-by: iustinp
Iustin Pop [Wed, 23 Jul 2008 10:06:19 +0000 (10:06 +0000)]
Distribute the queue serial file after each update
This patch adds distribution of the queue serial file after each write
to it (but before a new job is created and written with that ID, and
before a response is returned, so we should be safe from crashes in
between).
Currently it only logs if a node cannot be contacted, it should abort if
> 50% errors are seen.
Reviewed-by: imsnah
Iustin Pop [Wed, 23 Jul 2008 10:06:08 +0000 (10:06 +0000)]
Make the job storage init reuse a serial file
This will be needed for master failover. If we don't have a valid queue
directory, we need to reinitialize it, but we should keep the existing
serial number.
As such, we abstract the reading of the serial and if we find a valid
serial, we do not reset it.
Reviewed-by: imsnah
Guido Trotter [Wed, 23 Jul 2008 08:22:06 +0000 (08:22 +0000)]
Move BDEV_CACHE_DIR to RUN_GANETI_DIR/bdev-cache
This was a TODO for 2.0
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:34 +0000 (14:25 +0000)]
Convert SetInstanceParams to concurrency
Grab a lock for the instance we're working on, and update its params.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:21 +0000 (14:25 +0000)]
Use Update in SetInstanceParams
When we set the instance params we're not adding a new instance, but
just updating an existing one, so why use AddInstance?
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:25:08 +0000 (14:25 +0000)]
Convert LUConnectConsole to concurrency
For ConnectConsole we just need to lock the instance we're connecting
to. We make a few rpcs to its primary node, but node daemons can now
handle multiple queries and nodes cannot be removed till they have
instances on them anyway. Note that since we return the ssh command, and
that's executed outside of the ganeti daemon, without any locks held,
the instance can then be subject to operations while we're connected to
it, but that was the previous behavior as well.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:24:33 +0000 (14:24 +0000)]
Add _ExpandAndLockInstance auxiliary function.
LUs that take an instance name as input and need to expand its name and
lock it can use it to simplify their ExpandNames call. Possibly, an
_ExpandAndLockNode will come as well.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:24:19 +0000 (14:24 +0000)]
Convert two (simple) LUs to be concurrent
LUQueryClusterInfo and LUDumpClusterConfig can be made concurrent and
don't need to acquire any locks. In fact they don't interact with the
cluster at all, but just with its configuration, which is thread-safe by
design.
Reviewed-by: iustinp
Guido Trotter [Tue, 22 Jul 2008 14:23:53 +0000 (14:23 +0000)]
Add missing empty line
Two top level definitions were separated only by one empty line.
Fixing this.
Reviewed-by: imsnah
Oleksiy Mishchenko [Tue, 22 Jul 2008 14:12:30 +0000 (14:12 +0000)]
Put the proper RAPI baserlib
Reviewed-by: imsnah
Michael Hanselmann [Tue, 22 Jul 2008 14:05:08 +0000 (14:05 +0000)]
Make argument to CleanCacheUnlocked mandatory
Not passing the argument means it has the value None. Iterating None
doesn't work:
  >>> "123" in None
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: iterable argument required
Hence I rename it to "exclude" instead of "exceptions", which may be
confusing, and make it mandatory. If one wants to clean all cache
entries, an empty list can be passed.
Reviewed-by: iustinp
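The reasoning in the entry above (a mandatory "exclude" argument, with an empty list meaning "clean everything") can be sketched as follows. The function name and the dict-based cache are hypothetical and only illustrate the calling convention.

```python
def clean_cache(cache, exclude):
    """Drop every cache entry whose key is not listed in `exclude`.

    `exclude` is mandatory; pass an empty list to drop all entries.
    """
    # Iterate over a copy of the keys because the dict is modified.
    for job_id in list(cache):
        if job_id not in exclude:
            del cache[job_id]
```

Because the argument has no default, a caller can no longer end up evaluating `job_id in None` by accident.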
Oleksiy Mishchenko [Tue, 22 Jul 2008 13:33:13 +0000 (13:33 +0000)]
Split RAPI resources to pieces
Reviewed-by: iustinp
Michael Hanselmann [Tue, 22 Jul 2008 08:17:52 +0000 (08:17 +0000)]
Split conditions in worker pool
This patch splits the single threading.Condition object used in the
worker pool for synchronization into three.
- worker_to_pool: Notified if a worker wants to notify the pool
- pool_to_worker: Notified if the pool wants to notify a single
or all workers
- pool_to_pool: Used for synchronization in Quiesce
Reviewed-by: ultrotter