Job Queue
=========

.. contents::

Overview
--------

In Ganeti 1.2, operations in a cluster have to be done in a serialized way.
Virtually any operation locks the whole cluster by grabbing the global lock.
Other commands can't return before all work has been done.

By implementing a job queue and granular locking, we can lower the latency of
command execution inside a Ganeti cluster.

    
Detailed Design
---------------

Job execution: “Life of a Ganeti job”
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    
#. The job is submitted by the client. A new job identifier is generated and
   assigned to the job. The job is then automatically replicated [#replic]_
   to all nodes in the cluster. The identifier is returned to the client.
#. A pool of worker threads waits for new jobs. If all workers are busy, the
   job has to wait and the first worker to finish its work will grab it.
   Otherwise any of the waiting threads will pick up the new job.
#. The client waits for job status updates by calling a waiting RPC function.
   Log messages may be shown to the user. Until the job is started, it can
   also be canceled.
#. As soon as the job is finished, its final result and status can be retrieved
   from the server.
#. If the client archives the job, it gets moved to a history directory.
   There will be a method to archive all jobs older than a given age.

    
.. [#replic] We need replication in order to maintain consistency across
   all nodes in the system. The master node differs from the others only in
   that it currently runs the master daemon. If it fails and we do a master
   failover, the jobs are still visible on the new master (even though they
   will be marked as failed).

    
Failures to replicate a job to other nodes are flagged as errors in the
master daemon log only if more than half of the nodes failed. Otherwise
we ignore the failure and rely on the next update (for jobs that are
still running) to retry the replication. For finished jobs, it is less
of a problem.

    
Future improvements will look into checking the consistency of the job
list and jobs themselves at master daemon startup.

    
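The submit, pick-up and wait steps of the job life cycle above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the Ganeti implementation: the names Job, JobQueueSketch, submit and wait are hypothetical, opcodes are modelled as plain callables, and replication and on-disk storage are left out.

```python
import itertools
import queue
import threading


class Job:
    """In-memory stand-in for a job (hypothetical, not a Ganeti class)."""

    def __init__(self, job_id, ops):
        self.id = job_id
        self.ops = ops              # assumption: opcodes modelled as callables
        self.status = "queued"
        self.result = None
        self.finished = threading.Event()


class JobQueueSketch:
    def __init__(self, num_workers=2):
        self._ids = itertools.count(1)
        self._lock = threading.Lock()
        self._pending = queue.Queue()
        self._jobs = {}
        for _ in range(num_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, ops):
        """Assign a new identifier, queue the job and return the identifier."""
        with self._lock:
            job = Job(next(self._ids), list(ops))
            self._jobs[job.id] = job
        # Replication to the other nodes would happen here; omitted.
        self._pending.put(job)
        return job.id

    def _worker(self):
        while True:
            job = self._pending.get()   # idle workers block here
            job.status = "running"
            try:
                job.result = [op() for op in job.ops]
                job.status = "success"
            except Exception as err:
                job.result = str(err)
                job.status = "error"
            job.finished.set()

    def wait(self, job_id, timeout=None):
        """Block until the job has finished or the timeout has expired."""
        with self._lock:
            job = self._jobs[job_id]
        job.finished.wait(timeout)
        return job.status, job.result
```

If all workers are busy, a submitted job simply stays in the pending queue until the first worker returns to the blocking get call, which matches the behaviour described in the second step.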

    
Job storage
~~~~~~~~~~~

Jobs are stored in the filesystem as individual files, serialized
using JSON (the standard serialization mechanism in Ganeti).

    
The choice of storing each job in its own file was made because:

- a file can be atomically replaced
- a file can easily be replicated to other nodes
- checking consistency across nodes can be implemented very easily, since
  all job files should be (at a given moment in time) identical

    
The other possible choices that were discussed and discounted were:

- single big file with all job data: not feasible due to difficult updates
- in-process databases: hard to replicate the entire database to the
  other nodes, and replicating individual operations does not mean we keep
  consistency

    
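Because every node should hold identical job files, a consistency check can be reduced to comparing per-file digests. The following is a minimal sketch of that idea, assuming job files are named job-N as in the queue layout; the function names and the use of SHA-256 are illustrative choices, and how a remote node would report its digests is not covered.

```python
import hashlib
from pathlib import Path


def job_file_digests(queue_dir):
    """Map each job file name in queue_dir to the SHA-256 of its content."""
    digests = {}
    for path in sorted(Path(queue_dir).glob("job-*")):
        if path.is_file():
            digests[path.name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_job_files(reference, other):
    """Return the names of job files that are missing or differ on one side."""
    names = set(reference) | set(other)
    return sorted(n for n in names if reference.get(n) != other.get(n))
```

An empty result means both directories hold the same set of job files with the same content.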

    
Queue structure
~~~~~~~~~~~~~~~

All file operations have to be done atomically by writing to a temporary file
and subsequent renaming. Except for log messages, every change in a job is
stored and replicated to other nodes.

    
::

  /var/lib/ganeti/queue/
    job-1 (JSON encoded job description and status)
    […]
    job-37
    job-38
    job-39
    lock (Queue managing process opens this file in exclusive mode)
    serial (Last job ID used)
    version (Queue format version)

    
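The write-to-temporary-file-then-rename rule, together with the serial and job-N files from the layout above, can be sketched as follows. This is a sketch under assumptions rather than the actual Ganeti code: the helper names atomic_write, next_job_id and write_job are hypothetical, and the serial file is assumed to hold the last used job ID as a decimal number.

```python
import json
import os
import tempfile


def atomic_write(path, text):
    """Write text to a temporary file in the same directory, then rename it."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old or the new file, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def next_job_id(queue_dir):
    """Read the last used ID from the serial file, store and return the next."""
    serial_path = os.path.join(queue_dir, "serial")
    try:
        with open(serial_path) as serial:
            last_id = int(serial.read())
    except FileNotFoundError:
        last_id = 0                 # assumption: a missing file means no jobs yet
    atomic_write(serial_path, "%d\n" % (last_id + 1))
    return last_id + 1


def write_job(queue_dir, job_id, job):
    """Serialize a job as JSON into its own job-N file."""
    atomic_write(os.path.join(queue_dir, "job-%d" % job_id), json.dumps(job))
```

The temporary file is created in the queue directory itself so that the final rename stays within one filesystem.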

    
Locking
~~~~~~~

Locking in the job queue is a complicated topic. The queue is called from
more than one thread and must be thread-safe. For simplicity, a single lock
is used for the whole job queue.

A more detailed description can be found in doc/locking.txt.

    
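One common way to apply a single lock to a whole object is a decorator that wraps every public method. The sketch below illustrates that pattern only; the names locked, LockedJobQueue, add_job and job_count are hypothetical and do not describe the scheme documented in doc/locking.txt.

```python
import functools
import threading


def locked(method):
    """Run the wrapped method while holding the queue-wide lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper


class LockedJobQueue:
    def __init__(self):
        self._lock = threading.Lock()   # one lock for the whole queue
        self._last_id = 0
        self._jobs = {}

    @locked
    def add_job(self, ops):
        self._last_id += 1
        self._jobs[self._last_id] = list(ops)
        return self._last_id

    @locked
    def job_count(self):
        return len(self._jobs)
```

Since threading.Lock is not reentrant, decorated methods in this sketch must not call each other; shared logic would have to live in undecorated private helpers.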

    
Internal RPC
~~~~~~~~~~~~

RPC calls available between Ganeti master and node daemons:

jobqueue_update(file_name, content)
  Writes a file in the job queue directory.
jobqueue_purge()
  Cleans the job queue directory completely, including archived jobs.
jobqueue_rename(old, new)
  Renames a file in the job queue directory.

    
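On the node side, the three calls above map naturally onto plain file operations. The following sketch shows one possible shape, with several assumptions that are not part of this design: the class name NodeQueueDir, the check that rejects paths outside the queue directory, the ".new" temporary suffix, and the reading of "completely" as removing every entry in the directory.

```python
import os
import shutil


class NodeQueueDir:
    def __init__(self, queue_dir):
        self.queue_dir = os.path.abspath(queue_dir)

    def _path(self, file_name):
        """Resolve file_name and refuse anything outside the queue directory."""
        path = os.path.abspath(os.path.join(self.queue_dir, file_name))
        if os.path.commonpath([self.queue_dir, path]) != self.queue_dir:
            raise ValueError("path outside job queue directory: %r" % file_name)
        return path

    def jobqueue_update(self, file_name, content):
        path = self._path(file_name)
        tmp_path = path + ".new"
        with open(tmp_path, "w") as tmp:
            tmp.write(content)
        os.replace(tmp_path, path)      # temporary file, then rename

    def jobqueue_rename(self, old, new):
        new_path = self._path(new)
        os.makedirs(os.path.dirname(new_path), exist_ok=True)
        os.replace(self._path(old), new_path)

    def jobqueue_purge(self):
        for name in os.listdir(self.queue_dir):
            path = os.path.join(self.queue_dir, name)
            if os.path.isdir(path):
                shutil.rmtree(path)     # e.g. the archive directory
            else:
                os.unlink(path)
```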

    
Client RPC
~~~~~~~~~~

RPC between Ganeti clients and the Ganeti master daemon supports the following
operations:

SubmitJob(ops)
  Submits a list of opcodes and returns the job identifier. The identifier is
  guaranteed to be unique during the lifetime of a cluster.
WaitForJobChange(job_id, fields, […], timeout)
  This function waits until a job changes or a timeout expires. The condition
  for when a job changed is defined by the fields passed and the last log
  message received.
QueryJobs(job_ids, fields)
  Returns field values for the job identifiers passed.
CancelJob(job_id)
  Cancels the job specified by identifier. This operation may fail if the job
  is already running, canceled or finished.
ArchiveJob(job_id)
  Moves a job into the …/archive/ directory. This operation will fail if the
  job has not been canceled or finished.

    
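A client could combine these operations into a submit, wait and archive loop along the following lines. This is a hedged sketch, not a real client library: the client object, the lowercase status strings, the shape of the QueryJobs result (one list of field values per job) and the omission of the elided WaitForJobChange parameters are all assumptions made for illustration.

```python
# Assumption: statuses are reported as lowercase strings.
FINAL_STATUSES = {"canceled", "success", "error"}


def run_job(client, ops, poll_timeout=10.0):
    """Submit ops as one job, wait for a final status, then archive the job."""
    job_id = client.SubmitJob(ops)
    while True:
        # Blocks until the job changes or the timeout expires.
        client.WaitForJobChange(job_id, ["status"], timeout=poll_timeout)
        # Assumption: one list of field values is returned per job identifier.
        [[status]] = client.QueryJobs([job_id], ["status"])
        if status in FINAL_STATUSES:
            break
    client.ArchiveJob(job_id)
    return job_id, status
```

Archiving happens only after a final status is seen, since ArchiveJob fails for jobs that have not been canceled or finished.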

    
Job and opcode status
~~~~~~~~~~~~~~~~~~~~~

Each job and each opcode has, at any time, one of the following states:

Queued
  The job/opcode was submitted, but has not started yet.
Waiting
  The job/opcode is waiting for a lock to proceed.
Running
  The job/opcode is running.
Canceled
  The job/opcode was canceled before it started.
Success
  The job/opcode ran and finished successfully.
Error
  The job/opcode was aborted with an error.

If the master is aborted while a job is running, the job will be set to the
Error status once the master has started again.

    
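The six states and the restart rule above can be written down as a small enumeration. The sketch below is illustrative only: the names Status, can_archive and recover_after_restart are hypothetical, and it changes only Running jobs on restart because the text does not say what happens to Waiting ones.

```python
import enum


class Status(enum.Enum):
    QUEUED = "queued"
    WAITING = "waiting"
    RUNNING = "running"
    CANCELED = "canceled"
    SUCCESS = "success"
    ERROR = "error"


# States in which a job will not change any more.
FINAL_STATES = frozenset({Status.CANCELED, Status.SUCCESS, Status.ERROR})


def can_archive(status):
    """Archiving is allowed only for canceled or finished jobs."""
    return status in FINAL_STATES


def recover_after_restart(jobs):
    """Set jobs that were running when the master died to ERROR.

    jobs maps job identifiers to Status values and is updated in place;
    the identifiers of the changed jobs are returned.
    """
    failed = []
    for job_id, status in jobs.items():
        if status is Status.RUNNING:
            jobs[job_id] = Status.ERROR
            failed.append(job_id)
    return failed
```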

    
History
~~~~~~~

Archived jobs are kept in a separate directory,
/var/lib/ganeti/queue/archive/. This is done in order to speed up the
queue handling: by default, the jobs in the archive are not touched by
any functions. Only the current (unarchived) jobs are parsed, loaded,
and verified (if implemented) by the master daemon.

    
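The split between current and archived jobs can be illustrated with two small helpers: one moves a job file into the archive subdirectory, the other loads only the job files that remain at the top level. This is a minimal sketch assuming the job-N naming and JSON encoding described earlier; the function names are hypothetical.

```python
import json
import os


def archive_job(queue_dir, job_id):
    """Move a job file into the archive subdirectory of the queue."""
    archive_dir = os.path.join(queue_dir, "archive")
    os.makedirs(archive_dir, exist_ok=True)
    name = "job-%d" % job_id
    os.rename(os.path.join(queue_dir, name), os.path.join(archive_dir, name))


def load_current_jobs(queue_dir):
    """Parse only unarchived job files; the archive is never opened."""
    jobs = {}
    for name in os.listdir(queue_dir):
        path = os.path.join(queue_dir, name)
        if name.startswith("job-") and os.path.isfile(path):
            with open(path) as job_file:
                jobs[int(name[len("job-"):])] = json.load(job_file)
    return jobs
```

The cost of loading the queue then depends only on the number of current jobs, not on how much history has accumulated.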

    
Ganeti updates
~~~~~~~~~~~~~~

The queue has to be completely empty for Ganeti updates with changes
in the job queue structure. In order to allow this, there will be a
way to prevent new jobs from entering the queue.