Ganeti locking
==============

Introduction
------------

This document describes lock order dependencies in Ganeti. It is
divided into functional sections.


Opcode Execution Locking
------------------------

These locks are declared by Logical Units (LUs) (in cmdlib.py) and
acquired by the Processor (in mcpu.py) with the aid of the Ganeti
Locking Library (locking.py). They are acquired in the following order:

* BGL: this is the Big Ganeti Lock, which exists for backward
  compatibility. New LUs acquire it in shared mode and can all execute
  together (barring other lock waits), while old LUs acquire it
  exclusively and can only execute one at a time, and never at the
  same time as new LUs.
* Instance locks: can be declared in ExpandNames() or DeclareLocks()
  by an LU, and have the same name as the instance itself. They are
  acquired as a set. Internally the locking library acquires them in
  alphabetical order.
* Node locks: can be declared in ExpandNames() or DeclareLocks() by an
  LU, and have the same name as the node itself. They are acquired as
  a set. Internally the locking library acquires them in alphabetical
  order. Given this order it's possible to safely acquire a set of
  instances, and then the nodes they reside on.
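
The ordering described above can be sketched as follows. This is a
minimal illustration and not the Ganeti locking library: ``LockRegistry``,
``LEVELS`` and ``acquire_order`` are hypothetical names, and every lock
is taken exclusively to keep the example short (the real BGL can also be
shared).

```python
import threading
from contextlib import ExitStack

# Hypothetical level names, listed in the order given in the text:
# BGL first, then instance locks, then node locks.
LEVELS = ("bgl", "instance", "node")


class LockRegistry:
    """Toy registry of named locks grouped by level (not Ganeti's API)."""

    def __init__(self):
        self._locks = {level: {} for level in LEVELS}
        self._guard = threading.Lock()  # protects the registry dicts only

    def _get(self, level, name):
        # Create the named lock on first use.
        with self._guard:
            return self._locks[level].setdefault(name, threading.Lock())

    def acquire_order(self, wanted):
        """Return (level, name) pairs in the order they must be taken."""
        return [(level, name)
                for level in LEVELS
                for name in sorted(wanted.get(level, ()))]

    def acquire(self, wanted):
        """Take all requested locks; closing the stack releases them."""
        stack = ExitStack()
        for level, name in self.acquire_order(wanted):
            stack.enter_context(self._get(level, name))
        return stack


registry = LockRegistry()
wanted = {"node": {"node2", "node1"}, "instance": {"web", "db"}}
with registry.acquire(wanted):
    pass  # work on the instances and the nodes they reside on
```

Because every caller walks the levels in the same order and sorts the
names within a level, two callers can never each hold a lock the other
one is waiting for.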

The ConfigWriter (in config.py) is also protected by a SharedLock, which
is shared by functions that read the config and acquired exclusively by
functions that modify it. Since the ConfigWriter calls
rpc.call_upload_file to all nodes to distribute the config without
holding the node locks, this call must be able to execute on the nodes
in parallel with other operations (but not necessarily concurrently with
itself on the same file, as inside the ConfigWriter this is called with
the internal config lock held).
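
The shared/exclusive pattern used for the config can be illustrated with
a small sketch. It is not Ganeti's ``SharedLock`` implementation:
``SimpleSharedLock`` and ``ToyConfig`` are hypothetical, and the
``uploads`` list merely stands in for distributing the file to the
nodes.

```python
import threading
from contextlib import contextmanager


class SimpleSharedLock:
    """Toy shared/exclusive lock (writers may starve under heavy reads)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of current shared holders
        self._writer = False   # True while held exclusively

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


class ToyConfig:
    """Hypothetical config store: readers share, writers exclude."""

    def __init__(self):
        self._lock = SimpleSharedLock()
        self._data = {}
        self.uploads = []  # stand-in for pushing the file to all nodes

    def get(self, key):
        with self._lock.shared():
            return self._data.get(key)

    def set(self, key, value):
        with self._lock.exclusive():
            self._data[key] = value
            # Distribution runs with the config lock held, so two
            # uploads of the same file never overlap each other.
            self.uploads.append(dict(self._data))
```

Note that no node lock appears anywhere in ``set()``: the upload is
serialized only by the config lock, which is why it has to be safe to
run alongside other operations on the nodes.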


Job Queue Locking
-----------------

The job queue is designed to be thread-safe. This means that its public
functions can be called from any thread. The job queue can be called
from functions called by the queue itself (e.g. logical units), but
special attention must be paid not to create deadlocks or an invalid
state.

A single queue lock is used by all classes involved in queue handling.
During development we tried to split locks, but deemed it to be too
dangerous and difficult at the time. Job queue functions acquiring the
lock can be safely called from all the rest of the code, as the lock is
released before leaving the job queue again. Unlocked functions should
only be called from job queue related classes (e.g. in jqueue.py) and
the lock must be acquired beforehand.

In the job queue worker (``_JobQueueWorker``), the lock must be released
before calling the LU processor. Otherwise a deadlock can occur when log
messages are added to opcode results.
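
The rules above can be sketched with a toy queue. This is an
illustration only: ``ToyJobQueue``, its methods and the processor
callback are hypothetical and do not mirror the code in jqueue.py. A
plain, non-reentrant ``threading.Lock`` is assumed so that the deadlock
risk is visible.

```python
import threading


class ToyJobQueue:
    """Hypothetical queue with one non-reentrant lock for all state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}     # job id -> {"status": str, "log": list}
        self._next_id = 1

    # Public, locked entry points: safe to call from any thread.
    def submit(self):
        with self._lock:
            job_id = self._next_id
            self._next_id += 1
            self._jobs[job_id] = {"status": "queued", "log": []}
            return job_id

    def append_log(self, job_id, message):
        with self._lock:
            self._jobs[job_id]["log"].append(message)

    def status(self, job_id):
        with self._lock:
            return self._jobs[job_id]["status"]

    def log(self, job_id):
        with self._lock:
            return list(self._jobs[job_id]["log"])

    # Unlocked helper: the caller must already hold self._lock.
    def _set_status_unlocked(self, job_id, status):
        self._jobs[job_id]["status"] = status

    def run_job(self, job_id, processor):
        """Worker step: the lock is not held while the processor runs."""
        with self._lock:
            self._set_status_unlocked(job_id, "running")
        # The processor may call append_log(), which takes the lock
        # itself. Holding the lock across this call would deadlock.
        processor(self, job_id)
        with self._lock:
            self._set_status_unlocked(job_id, "success")


def fake_processor(queue, job_id):
    # Stands in for an LU adding a log message to the opcode result.
    queue.append_log(job_id, "opcode started")


queue = ToyJobQueue()
job = queue.submit()
queue.run_job(job, fake_processor)
```

If ``run_job()`` kept the lock while calling ``processor``, the call to
``append_log()`` would block forever on the same lock.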


Node Daemon Locking
-------------------

The node daemon contains a lock for the job queue. To avoid conflicts
and/or corruption when a master daemon or another node daemon may be
running, it must be held for all job queue operations.

There's one special case for the node daemon running on the master node.
If grabbing the lock in exclusive mode fails on startup, the code
assumes all checks have been done by the process keeping the lock.
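
The startup special case can be sketched as below. The text does not say
how the lock is implemented, so this example assumes a lock file taken
with ``fcntl.flock``; ``try_exclusive_lock``, ``startup`` and the
``run_checks`` callback are hypothetical names, not node daemon code.

```python
import fcntl
import os


def try_exclusive_lock(path):
    """Try to take an exclusive, non-blocking lock on ``path``.

    Returns the open file descriptor on success, or None if another
    holder already has the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def startup(path, run_checks):
    """Run the startup checks only if we obtained the lock ourselves."""
    fd = try_exclusive_lock(path)
    if fd is None:
        # Per the text: assume the process keeping the lock has
        # already done all the checks.
        return None
    run_checks()
    return fd  # keep the descriptor open for as long as the lock is needed
```

The non-blocking flag matters here: a blocking request would make the
node daemon on the master node wait for the other process instead of
continuing with its startup.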

.. vim: set textwidth=72 :