root / lib / jqueue.py @ a194dc28

# Date Author Comment
aa66c183 12/22/2011 07:16 pm Michael Hanselmann

Merge branch 'devel-2.5'

  • devel-2.5:
    jqueue: Factorize checking job processor's result
    jqueue unittest: Rename simple fake-job class
    jqueue: Fix epylint errors introduced in 37d76f1e4
    jqueue: Fix deadlock between job queue and dependency manager...

4f44e311 12/22/2011 06:41 pm Michael Hanselmann

Merge branch 'stable-2.5' into devel-2.5

  • stable-2.5:
    jqueue: Fix epylint errors introduced in 37d76f1e4
    jqueue: Fix deadlock between job queue and dependency manager

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

df5a5730 12/22/2011 03:19 pm Michael Hanselmann

jqueue: Factorize checking job processor's result

This allows for more unittesting.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

1316ebc2 12/21/2011 06:04 pm Michael Hanselmann

jqueue: Fix epylint errors introduced in 37d76f1e4

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

a182a3ed 12/21/2011 04:55 pm Michael Hanselmann

serializer: Remove JSON indentation and dict key sorting

Serializing to JSON using “simplejson” is significantly slower when
indentation and/or sorting of dictionary keys is used. In simplejson 1.x
the difference isn't that big, but with simplejson 2.x the difference...

37d76f1e 12/21/2011 04:35 pm Michael Hanselmann

jqueue: Fix deadlock between job queue and dependency manager

When an opcode is about to be processed its dependencies are
evaluated using “_JobDependencyManager.CheckAndRegister”. Due
to its nature that function requires a lock on the manager's
internal structures. All of this happens while the job queue...

6d5ea385 11/21/2011 09:36 am Michael Hanselmann

jqueue: Add code to prepare for queue shutdown

Doing so will prevent job submissions (similar to a drained queue),
but won't affect currently running jobs. No further jobs will be
executed.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

0e82dcf9 11/18/2011 01:48 pm Andrea Spadaccini

Merge branch 'devel-2.5'

  • devel-2.5: (24 commits)
    LUInstanceCreate: Release unused node locks
    htools: rework message display construction
    hbal: handle empty node groups
    Document OpNodeMigrate's result for RAPI
    Ensure unused ports return to the free port pool...

c8d0be94 11/17/2011 03:39 pm Michael Hanselmann

jqueue: Factorize code checking for drained queue

This is in preparation for a clean(er) shutdown of masterd.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

719f8fba 10/27/2011 04:26 pm Michael Hanselmann

jqueue: Allow zero jobs to be submitted at once

If cmdlib.LUNodeMigrate was called for a node without primary instances
it would try to submit an empty list of jobs. This was never visible via
CLI as there we check the list of primary instances first.

Signed-off-by: Michael Hanselmann <>...

fb1ffbca 10/26/2011 11:53 am Michael Hanselmann

Convert job queue's RPC to generated code

With these changes job queue RPC will finally show up on the lock
monitor. See below for an example. A job queue-specific class is used to
restrict the use of a static list for name resolution to the job queue.
Further improvements can be made to not re-create the whole RPC client...

17385bd2 08/30/2011 02:01 pm Andrea Spadaccini

Fixes to errors/warnings raised by pylint 0.24

Running pylint 0.24.0 revealed 2 errors and 1 warning. Here is how I
fixed them:

  • jqueue.py: silenced E1101
  • netutils.py: rewrote the list comprehension using extend()
  • watcher/__init__.py: fixed a missing format string parameter...

b459a848 08/30/2011 11:24 am Andrea Spadaccini

DeprecationWarning fixes for pylint

In version 0.21, pylint unified all the disable-* (and enable-*)
directives to disable (resp. enable). This leads to a lot of
DeprecationWarning being emitted even if one uses the recommended
version of pylint (0.21.1, as stated in devnotes.rst)....

cb66225d 08/19/2011 03:11 pm Michael Hanselmann

ensure-dirs: Set permissions on job files in queue

This was a regression from 2.4.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

dfc8824a 08/02/2011 12:56 pm Michael Hanselmann

jqueue: Add short delay before detecting job changes

By sleeping for 100ms after receiving a notification for a changed job
file the job is given some additional time to change again. This
significantly reduces the number of LUXI calls for WaitForJobChanges...

fcb21ad7 07/21/2011 02:55 pm Michael Hanselmann

Export job dependencies through lock monitor

This makes them visible to the user. Example:

$ gnt-debug locks -o name,pending
Name Pending
job/890 job:891,892
job/892 job:894

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

47099cd1 07/21/2011 02:20 pm Michael Hanselmann

Rename *_STATUS_WAITLOCK to …_WAITING

This patch renames the {JOB,OP}_STATUS_WAITLOCK constants to {JOB,OP}_STATUS_WAITING, as per design document for chained jobs.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

75d81fc8 07/21/2011 11:58 am Michael Hanselmann

Fix locking issue with job dependencies

When jobs waiting for a dependency are notified, they're re-added to the
queue. This would require owning the queue lock in exclusive mode, but
since the function doing so is called from within the job/opcode
processor, it only holds the lock in shared mode....

f8a4adfa 07/21/2011 11:56 am Michael Hanselmann

jqueue: Read-only jobs don't need processor lock

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

b247c6fc 07/21/2011 08:33 am Michael Hanselmann

jqueue: Implement submitting multiple jobs with dependencies

With this change users of the “SubmitManyJobs” interface can use
relative job dependencies. Relative job IDs in dependencies are resolved
before handing the job off to the workerpool.

Signed-off-by: Michael Hanselmann <>...

c0f6d0d8 07/20/2011 03:11 pm Michael Hanselmann

jqueue: Add “writable” flag to memory objects

Basically only one instance of the job, the one being processed,
should be serialized to disk and replicated to other nodes. With
this flag assertions can be added in various places.

Signed-off-by: Michael Hanselmann <>...

b95479a5 07/20/2011 03:11 pm Michael Hanselmann

Implement chained jobs

An overview is available in the design document for this change,
doc/design-chained-jobs.rst.

When a job enters the job processor, the current opcode's dependencies
are evaluated. If a referenced job has not yet reached the desired...

45df0793 07/15/2011 08:13 pm Michael Hanselmann

Fix assertion error on unclean master shutdown

Commit 66bd7445 added an assertion to ensure a finalized job has its
“end_timestamp” attribute set. Unfortunately it didn't cover a case when
the queue is recovering from an unclean master shutdown.

Signed-off-by: Michael Hanselmann <>...

b795a775 07/11/2011 05:53 pm Michael Hanselmann

Merge branch 'devel-2.4'

  • devel-2.4:
    ht: Add new check for numbers
    Fix off-by-one bug in job serial generation
    Shorten some unbreakable lines in man pages
    Correct some spelling mistakes
    Fix bug in recreate-disks for DRBD instances
    Fix a lint warning...

3c88bf36 07/11/2011 05:28 pm Michael Hanselmann

Fix off-by-one bug in job serial generation

Commit 009e73d0 (September 2009) changed the job queue to generate
multiple job serials at once. Ever since it would return one more than
requested.

The “serial” file in the job queue directory is defined to contain the...
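
The off-by-one described above can be illustrated with a minimal sketch. This is not the code from jqueue.py: the function names are hypothetical, and it assumes the stored value is the last serial already handed out, so the first new serial is that value plus one.

```python
def new_serials(last_serial, count):
    """Return exactly `count` new serials following `last_serial`.

    Assumption: `last_serial` is the last serial already handed out.
    """
    if count < 0:
        raise ValueError("count must not be negative")
    first = last_serial + 1
    return list(range(first, first + count))


def new_serials_off_by_one(last_serial, count):
    """Illustrative bug: an inclusive range yields one value too many."""
    return list(range(last_serial, last_serial + count + 1))


if __name__ == "__main__":
    print(new_serials(10, 3))
    print(len(new_serials_off_by_one(10, 3)))
```

Requesting three serials after serial 10 should give 11, 12 and 13; the buggy variant returns four values for the same request.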

194c8ca4 06/10/2011 06:48 pm Michael Hanselmann

jqueue: Allow loading of archived jobs

Chained jobs need to look at previous jobs, including archived ones. A
nice side-effect of this change is the ability to look at archived jobs
using “gnt-job info <id>” as long as the ID is known.

Signed-off-by: Michael Hanselmann <>...

07346f28 06/01/2011 08:08 pm Michael Hanselmann

Merge branch 'devel-2.4'

  • devel-2.4:
    jqueue: Fix potential race condition when cancelling queued jobs
    Fix argument order in ReserveLV and ReserveMAC

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

66bd7445 05/31/2011 07:17 pm Michael Hanselmann

jqueue: Fix potential race condition when cancelling queued jobs

When a job was cancelled, its status would be changed and the file
written again. Since this was a final status, the job file could be
moved anytime for archival. If the job was still in the queue, however,...

0aeeb6e3 05/10/2011 06:32 pm Michael Hanselmann

jqueue: Update worker thread name to include opcode summary

With this patch, the worker thread name is updated to include a short
summary of the opcode (basically its OP_ID). The base name of job queue
threads is shortened from “JobQueue” to “Jq”. Logs and the lock monitor...

6a373640 03/25/2011 03:53 pm Michael Hanselmann

Implement submitting jobs from logical units

The design details can be seen in the design document
(doc/design-lu-generated-jobs.rst).

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

98ed5092 03/23/2011 06:17 pm Michael Hanselmann

Add opcode summary to SubmitManyJobs errors

Requested-by: Iustin Pop <>
Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

ff699aa9 02/28/2011 05:26 pm Michael Hanselmann

gnt-cluster master-failover: Undrain queue

- Move functions for drain status (tracked via file) from jqueue to jstore
- Undrain queue on master failover if necessary
- Add QA test

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

30c945d0 12/29/2010 05:48 pm Michael Hanselmann

jqueue: Fix cancelling while in waitlock in queue

Since the recent change to leave jobs in the “waitlock” status (commit
5fd6b6947), cancelling a job while it's back in the queue would break.
This patch handles these cases and adds a unittest.

Signed-off-by: Michael Hanselmann <>...

5fd6b694 12/15/2010 03:42 pm Michael Hanselmann

jqueue: Keep jobs in “waitlock” while returning to queue

Iustin Pop reported that a job's file is updated many times while it
waits for locks held by other thread(s). After an investigation it was
concluded that the reason was a design decision for job priorities to...

9e49dfc5 10/12/2010 03:48 pm Michael Hanselmann

jqueue: Fix bug when cancelling jobs

If a job was cancelled while it was waiting for locks, an assertion
would've failed. This patch fixes the problem and provides a unit
test to check for this situation.

Signed-off-by: Michael Hanselmann <>...

b8802cc4 10/12/2010 03:48 pm Michael Hanselmann

jqueue/gnt-job: Add job priority fields for display

These fields can help with debugging.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

320d1daf 10/12/2010 03:48 pm Michael Hanselmann

jqueue: Resume jobs from “waitlock” status (2nd try)

Commit 5ef699a0e had to roll back an earlier attempt at implementing
this. With the improved job queue processor, this is finally possible.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

86b16e9d 10/07/2010 06:10 pm Michael Hanselmann

jqueue, CancelJob: Check status only once per call

This simplifies the code a bit--the status is only checked once.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

a38e8674 09/24/2010 07:21 pm Michael Hanselmann

Fix docstring typo in jqueue._JobProcessor._MarkWaitlock

epydoc complained:
“File …/ganeti/jqueue.py, line 886, in
ganeti.jqueue._JobProcessor._MarkWaitlock
Warning: Redefinition of type for job”

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

f23db633 09/24/2010 06:18 pm Michael Hanselmann

jqueue: Use priority for acquiring locks

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

26d3fd2f 09/24/2010 06:18 pm Michael Hanselmann

jqueue: Use timeout when acquiring locks

As already noted in the design document, an opcode's priority is
increased when the lock(s) can't be acquired within a certain amount of
time, except at the highest priority, where in such a case a blocking
acquire is used....
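
A rough sketch of the behaviour described above, written for Python 3 and not taken from jqueue.py. The helper name, the constant and the retry loop are assumptions; in particular it assumes that a numerically lower value means a higher priority and that -20 is the highest one. The real job queue may hand the job back to the queue between attempts instead of looping.

```python
import threading

# Assumption for this sketch: lower number means higher priority.
OP_PRIO_HIGHEST = -20


def acquire_with_priority(lock, priority, timeout=0.01,
                          highest=OP_PRIO_HIGHEST):
    """Acquire `lock`, raising the priority after each timed-out attempt.

    Returns the priority at which the lock was finally acquired.
    """
    while True:
        if priority <= highest:
            # At the highest priority, fall back to a blocking acquire.
            lock.acquire()
            return priority
        if lock.acquire(timeout=timeout):
            return priority
        # Timed out: increase the priority (numerically decrease it).
        priority -= 1


if __name__ == "__main__":
    free_lock = threading.Lock()
    print(acquire_with_priority(free_lock, 0))
```

With an uncontended lock the first attempt succeeds and the priority is returned unchanged.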

b80cc518 09/23/2010 04:06 pm Michael Hanselmann

jqueue: Introduce per-opcode context object

This is better to group per-opcode data.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

03b63608 09/23/2010 12:07 pm Michael Hanselmann

jqueue: Rename current_op to better reflect what it actually is

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

fa4aa6b4 09/23/2010 12:07 pm Michael Hanselmann

jqueue: Separate function for in-memory variables

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

be760ba8 09/20/2010 03:41 pm Michael Hanselmann

jqueue: Change model from per-job to per-opcode processing

In order to support priorities, the processing of jobs needs to be
changed. Instead of processing jobs as a whole, the code is changed to
process one opcode at a time and then return to the queue. See the...

7b5c4a69 09/20/2010 03:41 pm Michael Hanselmann

jqueue: Use priority for worker pool

A small helper function is added to make this easier. Priorities are not
yet used in all necessary places.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

a0d2fe2c 09/20/2010 11:11 am Michael Hanselmann

jqueue: Add missing docstring to _QueuedJob.Cancel

This was forgotten in commit 099b2870b.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

099b2870 09/16/2010 01:25 pm Michael Hanselmann

jqueue: Move CancelJob logic to separate function

Moving the internals of this function will allow it to be used from
unittests in the future. Splitting this into a pure, side-effect free
function and an impure one makes the pure function easily testable....

42e32075 09/16/2010 10:25 am Iustin Pop

Merge branch 'devel-2.2'

  • devel-2.2:
    Fix case of MAC special-values
    Remove mcpu's ReportLocks callback
    Revert "jqueue: Resume jobs from “waitlock” status"

(no conflicts, took LGTM from original commit)

Signed-off-by: Iustin Pop <>...

e71c8147 09/13/2010 06:35 pm Michael Hanselmann

jqueue: Ensure only accepted priorities are allowed for submitting jobs

Quoting the design document: “Submitted opcodes can have one of the priorities
listed below. Other priorities are reserved for internal use”. Submitting jobs
at priority -20 should not be allowed....

8f5c488d 09/13/2010 06:35 pm Michael Hanselmann

Add support for job priority to opcodes and job queue objects

This allows clients to submit opcodes with a priority. Except for being
tracked by the job queue, it is not yet used by any code.

Unittests for jqueue._QueuedOpCode and jqueue._QueuedJob are provided for...

acf931b7 09/13/2010 05:46 pm Michael Hanselmann

Remove mcpu's ReportLocks callback

This is no longer needed with the new lock monitor. One callback is kept to
check for cancelled jobs.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

5ef699a0 09/13/2010 05:39 pm Michael Hanselmann

Revert "jqueue: Resume jobs from “waitlock” status"

This reverts commit 4008c8edae31a3971fa8c4b200238afc8005d3d4.

While it worked in my initial tests, I've now found cases where this doesn't
work properly as it is. More work is needed and will be done as part of the...

a68fe106 09/10/2010 03:11 pm Michael Hanselmann

Merge branch 'devel-2.2'

  • devel-2.2:
    Fix pylint warning in http/__init__.py
    Allow SSL ciphers to be overridden in HTTP server
    jqueue: Resume jobs from “waitlock” status
    jqueue: Move queue inspection into separate function
    jqueue: Don't update file in MarkUnfinishedOps...

4008c8ed 09/10/2010 02:23 pm Michael Hanselmann

jqueue: Resume jobs from “waitlock” status

After an unclean restart of ganeti-masterd, jobs in the “waitlock” status can
be safely restarted. They hadn't modified the cluster yet.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

de9d02c7 09/10/2010 02:23 pm Michael Hanselmann

jqueue: Move queue inspection into separate function

This makes the init function a lot smaller while not changing
functionality.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: René Nussbaumer <>

747f6113 09/10/2010 02:23 pm Michael Hanselmann

jqueue: Don't update file in MarkUnfinishedOps

This reduced the number of updates to the job files. It's used in two places
while processing a job and the file is updated just afterwards.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

82b22e19 09/07/2010 02:26 pm René Nussbaumer

Move job queue to new ganeti.runtime

Signed-off-by: René Nussbaumer <>
Reviewed-by: Michael Hanselmann <>

ae8419a2 09/07/2010 01:07 pm Michael Hanselmann

Merge branch 'devel-2.2'

  • devel-2.2:
    cli: Use list of options shared between commands
    jqueue: Use separate function for encoding errors
    Fix some epydoc warnings
    Fix breakage introduced by commit 8044bf655
    Remove “dry_run” from opcodes.OpCreateInstance...

6760e4ed 09/07/2010 12:44 pm Michael Hanselmann

jqueue: Use separate function for encoding errors

Comes with unittest.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

c30421e0 08/25/2010 12:56 pm René Nussbaumer

Merge branch 'devel-2.2'

hansmi helped me with merging the conflict. Thanks

Conflicts:
lib/workerpool.py

Signed-off-by: René Nussbaumer <>
Reviewed-by: Iustin Pop <>

daba67c7 08/24/2010 05:27 pm Michael Hanselmann

workerpool: Allow setting task name

With this patch, the task name is added to the thread name and will show up in
logs. Log messages from jobs will look like “pid=578/JobQueue14/Job13 mcpu:289
DEBUG LU locks acquired/cluster/BGL/shared”.

Signed-off-by: Michael Hanselmann <>...

8f9069e5 08/23/2010 01:39 pm Iustin Pop

Merge branch 'devel-2.2'

  • devel-2.2:
    setup-ssh: fix updating of authorized_keys
    setup-ssh: Also use keys from the ssh-agent
    setup-ssh: try to use key auth first
    setup-ssh: redo the logging levels
    setup-ssh: only read the ssh port once
    setup-ssh: fix the logging error message...

9bdab621 08/19/2010 06:30 pm Michael Hanselmann

jqueue: Remove lock status field

With the job queue changes for Ganeti 2.2, watched and queried jobs are
loaded directly from disk, rendering the in-memory “lock_status” field
useless. Writing it to disk would be possible, but has a huge cost at
runtime (when tested, processing 1'000 opcodes involved 4'000 additional...

0f979a34 08/18/2010 07:59 pm Guido Trotter

Merge branch 'devel-2.2'

  • devel-2.2:
    RAPI client: Support modifying instances
    RAPI: Allow modifying instance
    Small fixes for instance creation via RAPI documentation
    gnt-debug: Extend job queue tests
    jqueue: Mark opcodes following failed ones as failed, too...

963a068b 08/18/2010 02:21 pm Michael Hanselmann

jqueue: Mark opcodes following failed ones as failed, too

When an opcode fails, the job queue would leave following opcodes as “queued”,
which can be quite confusing. With this patch, they're all marked as failed and
assertions are added to check this.

Signed-off-by: Michael Hanselmann <>...
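
The change described above could look roughly like the following sketch. The status strings and the helper name are illustrative assumptions, not the constants or functions used in jqueue.py.

```python
# Illustrative status values; the real constants may differ.
OP_STATUS_QUEUED = "queued"
OP_STATUS_SUCCESS = "success"
OP_STATUS_ERROR = "error"


def fail_remaining(statuses, failed_index):
    """Return a copy in which the failed opcode and all later ones are errors."""
    result = list(statuses)
    for idx in range(failed_index, len(result)):
        result[idx] = OP_STATUS_ERROR
    # Mirror the kind of assertion the commit message mentions.
    assert all(s == OP_STATUS_ERROR for s in result[failed_index:])
    return result


if __name__ == "__main__":
    ops = [OP_STATUS_SUCCESS, OP_STATUS_QUEUED, OP_STATUS_QUEUED]
    print(fail_remaining(ops, 1))
```

Opcodes before the failed one keep their status; nothing after it is left as “queued”.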

6ea72e43 08/18/2010 02:21 pm Michael Hanselmann

jqueue: Work around race condition between job processing and archival

This is a simplified version of a patch I sent earlier to make sure the job
file is only written once with a finalized status.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

b705c7a6 08/18/2010 11:27 am Manuel Franceschini

Support for resolving hostnames to IPv6 addresses

This patch enables IPv6 name resolution by using socket.getaddrinfo
instead of socket.gethostbyname_ex.

It renames the HostInfo class to Hostname and unifies its use throughout
the code. This is achieved by using static calls where no object is...

dc1e2262 08/17/2010 04:53 pm Michael Hanselmann

jqueue: More checks for cancelling queued job

We can also check when the lock status is updated. This will
improve job cancelling.

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

e35344b4 08/17/2010 04:25 pm Michael Hanselmann

jqueue: Add more debug output

Signed-off-by: Michael Hanselmann <>
Reviewed-by: Iustin Pop <>

aa9f8167 07/30/2010 04:43 am Iustin Pop

Fix a few job archival issues

This patch fixes two issues with job archival. First, the
LoadJobFromDisk can return 'None' for no-such-job, and we shouldn't add
None to the job list; we can't anyway, as this raises an exception:

node1# gnt-job archive foo...

599ee321 07/30/2010 12:52 am Iustin Pop

Change handling of non-Ganeti errors in jqueue

Currently, if a job execution raises a Ganeti-specific error (i.e.
subclass of GenericError), then we encode it as (error class, [error
args]). This matches the RAPI documentation.

However, if we get a non-Ganeti error, then we encode it as simply...

b2e8a4d9 07/29/2010 04:05 pm Michael Hanselmann

workerpool: Change signature of AddTask function to not use *args

By changing it to a normal parameter, which must be a sequence, we can
start using keyword parameters.

Before this patch all arguments to “AddTask(self, *args)” were passed as
arguments to the worker's “RunTask” method. Priorities, which should be...
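
The signature change can be sketched as below. Both classes are hypothetical stand-ins for the worker pool, and the “priority” keyword with its default of 0 is an assumption used only to show why a plain sequence parameter leaves room for keyword parameters.

```python
class OldStylePool:
    """Hypothetical pool: every positional argument belongs to the task."""

    def __init__(self):
        self.tasks = []

    def AddTask(self, *args):
        # No room for extra options; everything is a task argument.
        self.tasks.append(args)


class NewStylePool:
    """Hypothetical pool: task arguments travel as a single sequence."""

    def __init__(self):
        self.tasks = []

    def AddTask(self, args, priority=0):
        # `priority` is an assumed keyword parameter for illustration.
        self.tasks.append((priority, tuple(args)))


if __name__ == "__main__":
    old = OldStylePool()
    old.AddTask("job", 13)
    new = NewStylePool()
    new.AddTask(("job", 13), priority=-5)
    print(old.tasks, new.tasks)
```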

7f93570a 07/16/2010 04:56 pm Iustin Pop

Implement lock names for debugging purposes

This patch adds lock names to SharedLocks and LockSets, that can be used
later for displaying the actual locks being held/used in places where we
only have the lock, and not the entire context of the locking operation....

989a8bee 07/15/2010 05:37 pm Michael Hanselmann

jqueue: Factorize code waiting for job changes

By splitting the _WaitForJobChangesHelper class into multiple smaller
classes, we gain in several places:

- Simpler code, less interaction between functions and variables
- Easy to unittest (close to 100% coverage)...

2034c70d 07/12/2010 05:27 pm Michael Hanselmann

jqueue: Setup inotify before checking for any job changes

Since the code waiting for job changes was modified to use inotify,
a race condition between checking for changes the first time and
setting up inotify occurs. If the job is modified after the check...

a744b676 07/09/2010 04:37 pm Manuel Franceschini

Introduce lib/netutils.py

This patch moves network utility functions to a dedicated module.

Signed-off-by: Manuel Franceschini <>
Reviewed-by: Iustin Pop <>

271daef8 07/06/2010 07:05 pm Iustin Pop

Fix opcode transition from WAITLOCK to RUNNING

With the recent changes in the job queue, an old bug surfaced: we never
serialized the status change when in NotifyStart, thus a crash of the
master would have left the job queue oblivious to the fact that the job...

ebb80afa 06/28/2010 02:04 pm Guido Trotter

jqueue: remove the _big_jqueue_lock module global

By using ssynchronized in the new way, we can remove the module-global
_big_jqueue_lock and revert back to an internal _lock inside the jqueue.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

3c0d60d0 06/28/2010 02:04 pm Guido Trotter

Share the jqueue lock on job-local changes

We can share the jqueue lock when we do per-job updates. These only
conflict with updates/checks on the same job from another thread (e.g.
CancelJob, ArchiveJob, which keep the lock unshared, since they are less
frequent)....

9bf5e01f 06/28/2010 02:04 pm Guido Trotter

_OpExecCallbacks abstract _AppendFeedback

Move some code to a decorated function rather than explicitly
acquiring/releasing the lock in AppendFeedback.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

99bd4f0a 06/28/2010 02:04 pm Guido Trotter

jqueue: convert to a SharedLock()

Remove the jqueue _lock member and convert to a _big_jqueue_lock
sharedlock. This allows smooth transition from the old single lock to a
more granular approach.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

39ed3a98 06/28/2010 02:04 pm Guido Trotter

MarkUnfinishedOps: update job file on disk

Every time we call MarkUnfinishedOps we do it in a try/finally block
that updates the job file. With this patch we move the try/finally
inside. CancelJobUnlocked is removed, because it just becomes a wrapper
over MarkUnfinishedOps with two constant values....

a1bfdeb1 06/28/2010 02:04 pm Guido Trotter

Remove spurious empty line

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

41593f6b 06/23/2010 01:32 pm Guido Trotter

Remove job object condition

We don't need it anymore, since nobody waits on it.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

6c2549d6 06/23/2010 01:32 pm Guido Trotter

Parallelize WaitForJobChanges

As for QueryJobs we rely on file updates rather than condition
notification to acquire job changes. In order to do that we use the
pyinotify module to watch files. This might make the client a bit slower
(pending planned improvements, such as subscription-based...

b3855790 06/23/2010 01:32 pm Guido Trotter

Update the job file on feedback

This is needed to convert waitforjobchanges to use inotify and the
on-disk version and decouple it from the job queue lock. No replication
to remote nodes is done, to keep the operation fast.

Signed-off-by: Guido Trotter <>...

9f7b4967 06/23/2010 01:32 pm Guido Trotter

Don't lock on QueryJobs, by using the disk version

We move from querying the in-memory version to loading all jobs from the
disk. Since the jobs are written/deleted on disk in an atomic manner, we
don't need to lock at all. Also, since we're just looking at the...

0f9c08dc 06/23/2010 01:32 pm Guido Trotter

Add JobQueue.SafeLoadJobFromDisk

This will be used to read a job file without having to deal with
exceptions from _LoadJobFromDisk.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

3d6c5566 06/23/2010 01:32 pm Guido Trotter

jqueue._LoadJobFromDisk: remove safety archival

Currently _LoadJobFromDisk archives job files it finds corrupted. Since
we want to use it to load files without holding locks, this could cause
a conflict: we just move the feature to _LoadJobUnlocked which is always...

7beb1e53 06/17/2010 08:25 pm Guido Trotter

jqueue.AddManyJobs: use AddManyTasks

Rather than adding the jobs to the worker pool one at a time, we add
them all together, which is slightly faster, and ensures they don't get
started while we loop.

Signed-off-by: Guido Trotter <>
Reviewed-by: Michael Hanselmann <>

4c36bdf5 06/17/2010 01:00 pm Guido Trotter

jqueue: make replication on job update optional

Sometimes it's useful to write to the local filesystem, but immediate
replication to all master candidates is not needed.

The _WriteAndReplicateFileUnlocked function gets renamed to
_UpdateJobQueueFile, as calling "write and replicate, but don't...

6a290889 06/17/2010 12:53 pm Guido Trotter

s/queue._GetJobInfoUnlocked/job.GetInfo/

The job queue currently has a static _GetJobInfoUnlocked method.
Changing it to be a normal method of _QueuedJob, which makes more sense.

Signed-off-by: Guido Trotter <>
Reviewed-by: Michael Hanselmann <>

162c8636 06/17/2010 12:53 pm Guido Trotter

Abstract loading job file from disk

Move the work from _LoadJobUnlocked to _LoadJobFileFromDisk, which can
then be used in other contexts as well. Also, if we fail to deserialize
the job, archive it as well (before we archived it only if we failed to
create the related object, but kept it there if deserialization failed....

d8e0dc17 06/15/2010 12:49 pm Guido Trotter

jqueue: simplify removal from _nodes

Somewhere we do try/del/except and somewhere just pop. Using pop
everywhere saves lines of code.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>
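
The two removal patterns mentioned above compare as follows. This is a minimal sketch: the dictionary and the function names are hypothetical stand-ins, not the actual “_nodes” handling in jqueue.py.

```python
def remove_with_del(mapping, name):
    """Longer pattern: delete the key and ignore it if missing."""
    try:
        del mapping[name]
    except KeyError:
        pass


def remove_with_pop(mapping, name):
    """Shorter pattern: pop with a default never raises KeyError."""
    mapping.pop(name, None)


if __name__ == "__main__":
    # Hypothetical node mapping for illustration.
    nodes = {"node1": "192.0.2.1", "node2": "192.0.2.2"}
    remove_with_del(nodes, "node1")
    remove_with_pop(nodes, "node2")
    remove_with_pop(nodes, "missing")  # no error for unknown names
    print(nodes)
```

Both helpers leave the mapping in the same state; the pop variant does it in one line.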

b5b8309d 06/15/2010 12:49 pm Guido Trotter

ListVisibleFiles: do not sort output

Among all users, turns out just one may need the output to be sorted.
All the others can cope without.

Signed-off-by: Guido Trotter <>
Reviewed-by: Iustin Pop <>

20571a26 06/11/2010 07:06 pm Guido Trotter

Cache a few bits of status in jqueue

Currently each time we submit a job we check the job queue size, and the
drained file. With this change we keep these pieces of information in
memory and don't read them from the filesystem each time.

Significant changes include:...

69b03fd7 06/11/2010 05:06 pm Guido Trotter

Remove unused parameter from function

This also removes the relevant pylint disable.
No point in keeping unused parameters around: if/when we need them it's
easy to add them back.

Signed-off-by: Guido Trotter <>
Reviewed-by: Michael Hanselmann <>

85a1c57d 06/11/2010 05:06 pm Guido Trotter

Optimize _GetJobIDsUnlocked

Currently we sort the list of job queue files twice (once in
utils.ListVisibleFiles with sort and then later with NiceSort). We apply
the _RE_JOB_FILE regular expression twice (once in _ListJobFiles and
once in _ExtractJobID). This simplifies the code a little, and a couple...

a71f9c7d 06/11/2010 05:06 pm Guido Trotter

jqueue: Rename _queue_lock to _queue_filelock

The name clarifies the difference between this and the internal lock.
Also explain a bit better what it is.

Signed-off-by: Guido Trotter <>
Reviewed-by: Michael Hanselmann <>