Job Queue
=========

.. contents::

Overview
--------

In Ganeti 1.2, operations in a cluster have to be serialized. Virtually every
operation locks the whole cluster by grabbing the global lock, and other
commands cannot return until all work has been done.

By implementing a job queue and granular locking, we can lower the latency of
command execution inside a Ganeti cluster.


Detailed Design
---------------

Job execution: “Life of a Ganeti job”
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Job gets submitted by the client. A new job identifier is generated and
   assigned to the job. The job is then automatically replicated to all nodes
   in the cluster. The identifier is returned to the client.
#. A pool of worker threads waits for new jobs. If all are busy, the job has
   to wait and the first worker finishing its work will grab it. Otherwise any
   of the waiting threads will pick up the new job.
#. Client waits for job status updates by calling a waiting RPC function.
   Log messages may be shown to the user. Until the job is started, it can
   also be canceled.
#. As soon as the job is finished, its final result and status can be retrieved
   from the server.
#. If the client archives the job, it gets moved to a history directory.
   This could also be done regularly using a cron script.
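
The worker pool from step 2 can be sketched roughly as follows. This is a
minimal illustration, not Ganeti's actual implementation: ``run_jobs``, the
counter-based job identifiers, the sentinel-based shutdown and the in-memory
result dictionary are all assumptions made for the example, and replication to
other nodes is left out.

```python
import queue
import threading


def run_jobs(jobs, num_workers=2):
    """Run callables on a fixed pool of worker threads.

    Returns a dict mapping job ID to a (status, result) tuple.
    All names here are hypothetical and only serve the illustration.
    """
    pending = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()  # blocks until a job (or sentinel) arrives
            if item is None:      # sentinel: no more jobs for this worker
                break
            job_id, func = item
            try:
                outcome = ("success", func())
            except Exception as err:
                outcome = ("error", str(err))
            with results_lock:
                results[job_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()

    # Assumed ID scheme: a simple counter starting at 1.
    for job_id, func in enumerate(jobs, start=1):
        pending.put((job_id, func))

    # One sentinel per worker so every thread terminates.
    for _ in threads:
        pending.put(None)
    for thread in threads:
        thread.join()
    return results
```

If more jobs are submitted than there are workers, the extra jobs simply stay
in ``pending`` until a worker becomes free, which mirrors the waiting behaviour
described in step 2.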


Queue structure
~~~~~~~~~~~~~~~

All file operations have to be done atomically by writing to a temporary file
and then renaming it to its final name. Except for log messages, every change
in a job is stored and replicated to other nodes.

::

  /var/lib/ganeti/queue/
    job-1 (JSON encoded job description and status)
    […]
    job-37
    job-38
    job-39
    lock (Queue managing process opens this file in exclusive mode)
    serial (Last job ID used)
    version (Queue format version)
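
The write-then-rename pattern described above might look like the following
sketch. The function name ``write_file_atomically`` and the ``.tmp-`` prefix
are assumptions for illustration, not names taken from the Ganeti code base.

```python
import os
import tempfile


def write_file_atomically(path, data):
    """Write bytes to ``path`` via a temporary file and a rename.

    The temporary file is created in the same directory as the target so
    that the final rename stays on one file system.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data reached the disk
        # Readers see either the old or the new content, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave stale temporary files behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A job file would then be updated with a call such as
``write_file_atomically("/var/lib/ganeti/queue/job-1", serialized_job)``,
where ``serialized_job`` stands for the JSON-encoded job as bytes.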


Locking
~~~~~~~

Locking in the job queue is a complicated topic. The job queue is called from
more than one thread and must therefore be thread-safe. For simplicity, a
single lock is used for the whole job queue.

A more detailed description can be found in doc/locking.txt.
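
As a rough illustration of the single-lock approach, every public method of
the queue can be wrapped so that it runs while holding the same lock. The
class ``JobQueueSketch`` and the ``locked`` decorator below are hypothetical
and only show the idea.

```python
import functools
import threading


def locked(method):
    """Run the decorated method while holding the queue-wide lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper


class JobQueueSketch:
    """In-memory stand-in for a job queue protected by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._serial = 0   # last job ID used
        self._jobs = {}

    @locked
    def submit(self, ops):
        # Incrementing the serial and storing the job happen atomically
        # with respect to all other locked methods.
        self._serial += 1
        self._jobs[self._serial] = list(ops)
        return self._serial

    @locked
    def job_count(self):
        return len(self._jobs)
```

Because ``threading.Lock`` is not reentrant, a locked method in this sketch
must not call another locked method on the same object; that restriction is
the price paid for the simplicity of a single lock.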


Internal RPC
~~~~~~~~~~~~

RPC calls available between Ganeti master and node daemons:

jobqueue_update(file_name, content)
  Writes a file in the job queue directory.
jobqueue_purge()
  Cleans the job queue directory completely, including archived jobs.
jobqueue_rename(old, new)
  Renames a file in the job queue directory.
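
On the node side, these three calls could be backed by plain file operations,
as in the sketch below. The class ``NodeQueueDir``, the path check and the
decision to remove every entry on purge are assumptions made for this
example; the RPC transport itself is omitted.

```python
import os
import shutil
import tempfile


class NodeQueueDir:
    """Hypothetical node-side handlers operating on one queue directory."""

    def __init__(self, queue_dir):
        self.queue_dir = queue_dir
        os.makedirs(queue_dir, exist_ok=True)

    def _path(self, file_name):
        # Refuse names that would resolve outside the queue directory.
        root = os.path.realpath(self.queue_dir)
        path = os.path.realpath(os.path.join(root, file_name))
        if path == root or os.path.commonpath([root, path]) != root:
            raise ValueError("file name outside queue directory: %r" % file_name)
        return path

    def jobqueue_update(self, file_name, content):
        """Write ``content`` (bytes) using a temporary file and a rename."""
        path = self._path(file_name)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path))
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(content)
        os.replace(tmp_path, path)

    def jobqueue_purge(self):
        """Remove everything below the queue directory (assumed semantics)."""
        for name in os.listdir(self.queue_dir):
            path = os.path.join(self.queue_dir, name)
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                os.unlink(path)

    def jobqueue_rename(self, old, new):
        """Rename a file, creating the target directory if needed."""
        new_path = self._path(new)
        os.makedirs(os.path.dirname(new_path), exist_ok=True)
        os.replace(self._path(old), new_path)
```

With handlers like these, archiving a job on a node reduces to a call such as
``jobqueue_rename("job-1", "archive/job-1")``.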


Client RPC
~~~~~~~~~~

RPC between Ganeti clients and the Ganeti master daemon supports the following
operations:

SubmitJob(ops)
  Submits a list of opcodes and returns the job identifier. The identifier is
  guaranteed to be unique during the lifetime of a cluster.
WaitForJobChange(job_id, fields, […], timeout)
  This function waits until a job changes or a timeout expires. Whether a job
  counts as changed is determined by the fields passed and the last log
  message received.
QueryJobs(job_ids, fields)
  Returns field values for the job identifiers passed.
CancelJob(job_id)
  Cancels the job specified by identifier. This operation may fail if the job
  is already running, canceled or finished.
ArchiveJob(job_id)
  Moves a job into the …/archive/ directory. This operation will fail if the
  job has not been canceled or finished.


Job and opcode status
~~~~~~~~~~~~~~~~~~~~~

Each job and each opcode has, at any time, one of the following states:

Queued
  The job/opcode was submitted, but has not yet started.
Running
  The job/opcode is running.
Canceled
  The job/opcode was canceled before it started.
Success
  The job/opcode ran and finished successfully.
Error
  The job/opcode was aborted with an error.

If the master is aborted while a job is running, the job will be set to the
Error status once the master has started again.
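
The status rules above, together with the conditions stated for ``CancelJob``
and ``ArchiveJob``, can be summarized in a small sketch. The constant values
and helper names are assumptions, as is the reading of "finished" as either
Success or Error.

```python
QUEUED = "queued"
RUNNING = "running"
CANCELED = "canceled"
SUCCESS = "success"
ERROR = "error"

# Assumption: a job counts as finished once it ended in Success or Error.
FINISHED = frozenset([SUCCESS, ERROR])


def can_cancel(status):
    """A job can only be canceled before it has started."""
    return status == QUEUED


def can_archive(status):
    """Only canceled or finished jobs may be archived."""
    return status == CANCELED or status in FINISHED


def recover_after_master_restart(statuses):
    """Return a copy of {job_id: status} with running jobs set to Error.

    Jobs that were running when the master was aborted cannot be resumed,
    so they are marked as failed once the master is back.
    """
    return {
        job_id: (ERROR if status == RUNNING else status)
        for job_id, status in statuses.items()
    }
```

For example, ``recover_after_master_restart({1: RUNNING, 2: QUEUED})`` leaves
job 2 queued and marks job 1 as failed.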


History
~~~~~~~

Archived jobs are kept in a separate directory, /var/lib/ganeti/queue/archive/.
Keeping them out of the main queue directory is intended to speed up queue
handling.


Ganeti updates
~~~~~~~~~~~~~~

The queue has to be completely empty before a Ganeti update that changes the
job queue structure.