Revision 2915335f
b/doc/design-chained-jobs.rst | ||
---|---|---|
115 | 115 |
} |
116 | 116 |
|
117 | 117 |
|
118 |
Implementation details |
|
119 |
---------------------- |
|
120 |
|
|
121 |
Status while waiting for dependencies |
|
122 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
123 |
|
|
124 |
Jobs waiting for dependencies are certainly not in the queue anymore and |
|
125 |
therefore need to change their status from "queued". While waiting for |
|
126 |
opcode locks the job is in the "waiting" status (the constant is named |
|
127 |
``JOB_STATUS_WAITLOCK``, but the actual value is ``waiting``). There are the |
|
128 |
following possibilities: |
|
129 |
|
|
130 |
#. Introduce a new status, e.g. "waitdeps". |
|
131 |
|
|
132 |
Pro: |
|
133 |
|
|
134 |
- Clients know for sure a job is waiting for dependencies, not locks |
|
135 |
|
|
136 |
Con: |
|
137 |
|
|
138 |
- Code and tests would have to be updated/extended for the new status |
|
139 |
- List of possible state transitions certainly wouldn't get simpler |
|
140 |
- Breaks backwards compatibility, older clients might get confused |
|
141 |
|
|
142 |
#. Use existing "waiting" status. |
|
143 |
|
|
144 |
Pro: |
|
145 |
|
|
146 |
- No client changes necessary, less code churn (note that there are |
|
147 |
clients which don't live in Ganeti core) |
|
148 |
- Clients don't need to know the difference between waiting for a job |
|
149 |
and waiting for a lock; it doesn't make a difference |
|
150 |
- Fewer state transitions (see commit ``5fd6b69479c0``, which removed |
|
151 |
many state transitions and disk writes) |
|
152 |
|
|
153 |
Con: |
|
154 |
|
|
155 |
- Not immediately visible what a job is waiting for, but it's the |
|
156 |
same issue with locks; this is the reason why the lock monitor |
|
157 |
(``gnt-debug locks``) was introduced; job dependencies can be shown |
|
158 |
as "locks" in the monitor |
|
159 |
|
|
160 |
Based on these arguments, the proposal is to do the following: |
|
161 |
|
|
162 |
- Rename ``JOB_STATUS_WAITLOCK`` constant to ``JOB_STATUS_WAITING`` to |
|
163 |
reflect its actual meaning: the job is waiting for something |
|
164 |
- While waiting for dependencies and locks, jobs are in the "waiting" |
|
165 |
status |
|
166 |
- Export dependency information in lock monitor; example output:: |
|
167 |
|
|
168 |
Name Mode Owner Pending |
|
169 |
job/27491 - - success:job/34709,job/21459 |
|
170 |
job/21459 - - success,error:job/14513 |
|
171 |
|
|
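The example output above can be reproduced with a small formatter. This is a minimal sketch, not Ganeti's implementation: it assumes that the ``Pending`` column holds the awaited statuses, a colon, and then the jobs waiting on the named job. ``format_dependency_row`` is a hypothetical helper name.

```python
def format_dependency_row(job_id, statuses, waiting_job_ids):
    """Render one lock-monitor-style row for a job that other jobs depend on.

    Mode and Owner are rendered as "-", as in the example output, because a
    job dependency has neither. The layout of the pending field is an
    assumption derived from that example.
    """
    pending = "%s:%s" % (
        ",".join(statuses),
        ",".join("job/%d" % waiter for waiter in waiting_job_ids),
    )
    return " ".join(["job/%d" % job_id, "-", "-", pending])


if __name__ == "__main__":
    print(format_dependency_row(27491, ["success"], [34709, 21459]))
    print(format_dependency_row(21459, ["success", "error"], [14513]))
```

Column alignment is left out on purpose; a real monitor would pad the fields to a common width.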
172 |
|
|
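The proposed status handling can be illustrated as follows. This is a sketch under stated assumptions: ``status_for_wait`` and the reason strings are hypothetical, and only the constant name ``JOB_STATUS_WAITING`` and the values ``queued`` and ``waiting`` come from the text above.

```python
JOB_STATUS_QUEUED = "queued"
# Proposed new name for JOB_STATUS_WAITLOCK; the value stays "waiting".
JOB_STATUS_WAITING = "waiting"

# Hypothetical labels for what a job may be blocked on.
WAIT_REASONS = frozenset(["locks", "dependencies"])


def status_for_wait(reason):
    """Return the status a job reports while blocked on the given reason.

    Both reasons map to the same status, so a client that already handles
    "waiting" needs no changes.
    """
    if reason not in WAIT_REASONS:
        raise ValueError("unknown wait reason: %r" % (reason,))
    return JOB_STATUS_WAITING
```

A client that only compares the status string with ``waiting`` behaves identically in both cases; what the job is waiting for is exposed through the lock monitor instead.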
173 |
Cost of deserialization |
|
174 |
~~~~~~~~~~~~~~~~~~~~~~~ |
|
175 |
|
|
176 |
To determine the status of a dependency job the job queue must have |
|
177 |
access to its data structure. Other queue operations already do this, |
|
178 |
e.g. archiving, watching a job's progress and querying jobs. |
|
179 |
|
|
180 |
Initially (Ganeti 2.0/2.1) the job queue shared the job objects |
|
181 |
in memory and protected them using locks. Ganeti 2.2 (see :doc:`design |
|
182 |
document <design-2.2>`) changed the queue to read and deserialize jobs |
|
183 |
from disk. This significantly reduced locking and code complexity. |
|
184 |
Nowadays inotify is used to wait for changes on job files when watching |
|
185 |
a job's progress. |
|
186 |
|
|
187 |
Reading from disk and deserializing certainly has some cost associated |
|
188 |
with it, but it's a significantly simpler architecture than |
|
189 |
synchronizing in memory with locks. At the stage where dependencies are |
|
190 |
evaluated the queue lock is held in shared mode, so different workers |
|
191 |
can read at the same time (deliberately ignoring CPython's interpreter |
|
192 |
lock). |
|
193 |
|
|
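The read-and-deserialize approach described above can be sketched as follows. This is an illustration only: the ``job-<id>`` file name, the JSON layout and the ``status`` field are assumptions made for the example, not the actual on-disk format, and both function names are hypothetical.

```python
import json
import os


def read_job_status(queue_dir, job_id):
    """Read one job file from disk and return its status.

    Returns None if the file does not exist. File name and field name are
    hypothetical.
    """
    path = os.path.join(queue_dir, "job-%d" % job_id)
    try:
        with open(path) as job_file:
            job = json.load(job_file)
    except FileNotFoundError:
        return None
    return job.get("status")


def dependency_satisfied(queue_dir, dep_job_id, wanted_statuses):
    """Check whether the dependency job has reached one of the wanted statuses."""
    return read_job_status(queue_dir, dep_job_id) in wanted_statuses
```

Since the functions only read, several workers could call them concurrently while the queue lock is held in shared mode. Each call pays for one file read and one JSON parse, which is the cost discussed in this section.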
194 |
It is expected that the majority of executed jobs won't use |
|
195 |
dependencies and therefore won't be affected. |
|
196 |
|
|
197 |
|
|
118 | 198 |
Other discussed solutions |
119 | 199 |
========================= |
120 | 200 |
|