============
Chained jobs
============

.. contents:: :depth: 4

This is a design document about the innards of Ganeti's job processing.
Readers are advised to study previous design documents on the topic:

- :ref:`Original job queue <jqueue-original-design>`
- :ref:`Job priorities <jqueue-job-priority-design>`
- :doc:`LU-generated jobs <design-lu-generated-jobs>`


Current state and shortcomings
==============================

Ever since the introduction of the job queue with Ganeti 2.0 there have
been situations where we wanted to run several jobs in a specific order.
Due to the job queue's current design, such a guarantee cannot be
given. Jobs are run according to their priority, their ability to
acquire all necessary locks, and other factors.

One way to work around this limitation is to do some kind of job
grouping in the client code. Once all jobs of a group have finished, the
next group is submitted and waited for. There are different kinds of
clients for Ganeti, some of which don't share code (e.g. Python clients
vs. htools). This design proposes a solution which would be implemented
as part of the job queue in the master daemon.


Proposed changes
================

With the implementation of :ref:`job priorities
<jqueue-job-priority-design>` the processing code was re-architected
and became a lot more versatile. It now returns jobs to the queue in
case the locks for an opcode can't be acquired, allowing other
jobs/opcodes to be run in the meantime.

The proposal is to add a new, optional property to opcodes to define
dependencies on other jobs. Job X could define opcodes with a dependency
on the success of job Y and would only be run once job Y is finished. If
there's a dependency on success and job Y failed, job X would fail as
well. Since such dependencies would use job IDs, the jobs still need to
be submitted in the right order.
47 |
|
48 |
.. pyassert:: |
49 |
|
50 |
# Update description below if finalized job status change |
51 |
constants.JOBS_FINALIZED == frozenset([ |
52 |
constants.JOB_STATUS_CANCELED, |
53 |
constants.JOB_STATUS_SUCCESS, |
54 |
constants.JOB_STATUS_ERROR, |
55 |
]) |

The new attribute's value would be a list of two-valued tuples. Each
tuple contains a job ID and a list of requested statuses for the job
depended upon. Only final statuses are accepted
(:pyeval:`utils.CommaJoin(constants.JOBS_FINALIZED)`). An empty list is
equivalent to specifying all final statuses (except
:pyeval:`constants.JOB_STATUS_CANCELED`, which is treated specially).
An opcode runs only once all its dependency requirements have been
fulfilled.

Any job referring to a cancelled job is also cancelled unless it
explicitly lists :pyeval:`constants.JOB_STATUS_CANCELED` as a requested
status.
69 |
|
70 |
In case a referenced job can not be found in the normal queue or the |
71 |
archive, referring jobs fail as the status of the referenced job can't |
72 |
be determined. |
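
The rules above can be summarized in a short sketch. This is purely
illustrative — the function name, the numeric result constants and the
status-lookup callable are inventions of this example, not Ganeti's
actual implementation:

```python
# Hypothetical sketch of the dependency evaluation described above.
# All names here are illustrative, not Ganeti's actual code.

# Final job statuses (mirrors constants.JOBS_FINALIZED).
JOB_STATUS_CANCELED = "canceled"
JOB_STATUS_SUCCESS = "success"
JOB_STATUS_ERROR = "error"
JOBS_FINALIZED = frozenset([
  JOB_STATUS_CANCELED,
  JOB_STATUS_SUCCESS,
  JOB_STATUS_ERROR,
  ])

# Possible outcomes of evaluating one dependency.
(WAIT, CONTINUE, WRONGSTATUS, CANCEL, ERROR) = range(5)

def evaluate_dependency(find_job_status, dep_job_id, dep_statuses):
  """Evaluate a single ``[job ID, [requested statuses]]`` dependency.

  ``find_job_status`` returns the referenced job's status, or ``None``
  if the job exists in neither the queue nor the archive.
  """
  status = find_job_status(dep_job_id)

  if status is None:
    # Referenced job can't be found, so its status can't be determined
    return ERROR
  if status not in JOBS_FINALIZED:
    # Referenced job hasn't reached a final status yet; keep waiting
    return WAIT
  if status in dep_statuses:
    # One of the requested statuses was reached; dependency fulfilled
    return CONTINUE
  if not dep_statuses and status != JOB_STATUS_CANCELED:
    # Empty list means all final statuses except "canceled"
    return CONTINUE
  if status == JOB_STATUS_CANCELED:
    # Cancellation propagates unless "canceled" was explicitly requested
    return CANCEL
  return WRONGSTATUS
```

An opcode would only run once every one of its dependencies evaluates
to the "continue" outcome.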

With this change, clients can submit all wanted jobs in the right order
and proceed to wait for changes on all these jobs (see
``cli.JobExecutor``). The master daemon will take care of executing them
in the right order, while still presenting the client with a simple
interface.

Clients using the ``SubmitManyJobs`` interface can use relative job IDs
(negative integers) to refer to jobs in the same submission.
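
How the queue might rewrite such relative references at submission time
can be sketched as follows. The helper name is hypothetical, and the
interpretation of ``-1`` as "the job submitted immediately before the
current one in this submission" is an assumption of this example:

```python
# Illustrative only: resolve relative job IDs (negative integers) in a
# multi-job submission to the absolute IDs assigned at submit time.
# Assumption: -1 refers to the immediately preceding job in the same
# submission, -2 to the one before that, and so on.

def resolve_relative_deps(jobs, first_job_id):
  """Assign sequential job IDs and rewrite negative dependency IDs.

  ``jobs`` is a list of jobs, each a list of opcode dicts which may
  carry a "depend" attribute of ``[job ID, [statuses]]`` pairs;
  ``first_job_id`` is the absolute ID given to the first job.
  """
  assigned = []  # Absolute IDs of jobs already processed
  result = []
  for (idx, ops) in enumerate(jobs):
    job_id = first_job_id + idx
    new_ops = []
    for op in ops:
      op = dict(op)
      if "depend" in op:
        deps = []
        for (dep_id, statuses) in op["depend"]:
          if dep_id < 0:
            # Relative reference into this submission; since the
            # current job isn't in "assigned" yet, -1 indexes the
            # job submitted just before it
            dep_id = assigned[dep_id]
          deps.append([dep_id, statuses])
        op["depend"] = deps
      new_ops.append(op)
    assigned.append(job_id)
    result.append((job_id, new_ops))
  return result
```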

.. highlight:: javascript

Example data structures::

  # First job
  {
    "job_id": "6151",
    "ops": [
      { "OP_ID": "OP_INSTANCE_REPLACE_DISKS", ..., },
      { "OP_ID": "OP_INSTANCE_FAILOVER", ..., },
      ],
  }

  # Second job, runs in parallel with first job
  {
    "job_id": "7687",
    "ops": [
      { "OP_ID": "OP_INSTANCE_MIGRATE", ..., },
      ],
  }

  # Third job, depending on success of previous jobs
  {
    "job_id": "9218",
    "ops": [
      { "OP_ID": "OP_NODE_SET_PARAMS",
        "depend": [
          [6151, ["success"]],
          [7687, ["success"]],
          ],
        "offline": True, },
      ],
  }


Other discussed solutions
=========================

Job-level attribute
-------------------

At first glance it might seem better to put dependencies on previous
jobs at the job level. However, it turns out that having the option of
defining only a single opcode in a job as having such a dependency can
be useful as well. The code complexity in the job queue is equivalent,
if not lower.
129 |
|
130 |
Since opcodes are guaranteed to run in order, clients can just define |
131 |
the dependency on the first opcode. |
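
For instance, a job whose first opcode carries the dependency is held
back as a whole, because later opcodes cannot start before earlier
ones. A sketch in the same spirit as the data structures above (the job
and opcode values are made up for illustration):

```python
# Hypothetical job definition: only the first opcode declares the
# dependency, yet the entire job waits, since the second opcode can
# only run after the first one has finished.
job = {
  "job_id": "9911",
  "ops": [
    {"OP_ID": "OP_INSTANCE_REPLACE_DISKS",
     "depend": [[6151, ["success"]]]},
    # No "depend" needed here; this opcode runs after the one above
    {"OP_ID": "OP_INSTANCE_FAILOVER"},
    ],
  }
```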

Another reason for the choice of an opcode-level attribute is that the
current LUXI interface for submitting jobs is a bit restricted and would
need to be changed to allow the addition of job-level attributes,
potentially requiring changes in all LUXI clients and/or breaking
backwards compatibility.


Client-side logic
-----------------

There's at least one implementation of a batched job executor twisted
into the ``burnin`` tool's code. While certainly possible, a client-side
solution should be avoided due to the different clients already in use.
For one, the :doc:`remote API <rapi>` client shouldn't import
non-standard modules. htools are written in Haskell and can't use Python
modules. A batched job executor contains a fair amount of logic. Even if
cleanly abstracted in a (Python) library, sharing code between different
clients is difficult if not impossible.


.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
.. fill-column: 72
.. End: