Revision 9725b53d

b/Makefile.am
	doc/iallocator.rst \
	doc/index.rst \
	doc/install.rst \
+	doc/locking.rst \
	doc/rapi.rst \
	doc/security.rst
......
	doc/examples/dumb-allocator \
	doc/examples/hooks/ethers \
	doc/examples/hooks/ipsec.in \
-	doc/locking.txt \
	test/testutils.py \
	test/mocks.py \
	$(dist_TESTS) \
b/doc/design-2.0.rst
one thread and must be thread-safe. For simplicity, a single lock is used for
the whole job queue.

-A more detailed description can be found in doc/locking.txt.
+A more detailed description can be found in doc/locking.rst.


Internal RPC
b/doc/index.rst
   security.rst
   design-2.0.rst
   design-2.1.rst
+   locking.rst
   hooks.rst
   iallocator.rst
   rapi.rst
b/doc/locking.rst (new file)
Ganeti locking
==============

Introduction
------------

This document describes lock order dependencies in Ganeti.
It is divided into functional sections.


Opcode Execution Locking
------------------------

These locks are declared by Logical Units (LUs) (in cmdlib.py) and acquired by
the Processor (in mcpu.py) with the aid of the Ganeti Locking Library
(locking.py). They are acquired in the following order:

  * BGL: the Big Ganeti Lock; it exists for backward compatibility. New LUs
    acquire it in a shared fashion and are able to execute all together
    (barring other lock waits), while old LUs acquire it exclusively, can only
    execute one at a time, and cannot run at the same time as new LUs.
  * Instance locks: can be declared in ExpandNames() or DeclareLocks() by an LU,
    and have the same name as the instance itself. They are acquired as a set.
    Internally the locking library acquires them in alphabetical order.
  * Node locks: can be declared in ExpandNames() or DeclareLocks() by an LU, and
    have the same name as the node itself. They are acquired as a set.
    Internally the locking library acquires them in alphabetical order. Given
    this order it is possible to safely acquire a set of instances, and then the
    nodes they reside on.
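The alphabetical set-wise acquisition described above can be sketched in a few lines of Python. This is a minimal illustration of the deadlock-avoidance idea, not Ganeti's actual locking.py API (the `LockSet` class and its methods here are hypothetical):

```python
import threading

class LockSet:
    """Sketch of set-wise lock acquisition: every caller takes the
    locks in a set in alphabetical order of their names, so two
    threads acquiring overlapping sets cannot deadlock each other."""

    def __init__(self, names):
        # One lock per resource name (e.g. instance or node names).
        self._locks = {name: threading.Lock() for name in names}

    def acquire(self, names):
        # Sorting gives every caller the same global acquisition order.
        for name in sorted(names):
            self._locks[name].acquire()

    def release(self, names):
        # Release order does not matter for deadlock avoidance.
        for name in names:
            self._locks[name].release()

lockset = LockSet(["node1", "node2", "node3"])
lockset.acquire({"node3", "node1"})   # internally takes node1, then node3
lockset.release({"node3", "node1"})
```

A second thread asking for, say, `{"node1", "node2"}` would also start with node1, so the two threads queue up instead of waiting on each other in a cycle.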

The ConfigWriter (in config.py) is also protected by a SharedLock, which is
acquired in shared mode by functions that read the config and exclusively by
functions that modify it. Since the ConfigWriter calls rpc.call_upload_file on
all nodes to distribute the config without holding the node locks, this call
must be able to execute on the nodes in parallel with other operations (but not
necessarily concurrently with itself on the same file, as inside the
ConfigWriter it is called with the internal config lock held).
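The shared/exclusive behaviour can be illustrated with a bare-bones reader/writer lock. Ganeti's real SharedLock (in locking.py) is considerably more elaborate; this is only a sketch of the semantics the paragraph above relies on:

```python
import threading

class SharedLock:
    """Minimal reader/writer lock sketch: many readers may hold the
    lock at once, a writer holds it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire(self, shared=True):
        with self._cond:
            if shared:
                while self._writer:
                    self._cond.wait()
                self._readers += 1
            else:
                while self._writer or self._readers:
                    self._cond.wait()
                self._writer = True

    def release(self):
        with self._cond:
            if self._writer:
                self._writer = False
            else:
                self._readers -= 1
            self._cond.notify_all()

config_lock = SharedLock()
config_lock.acquire(shared=True)    # a config reader
config_lock.acquire(shared=True)    # a second reader may enter concurrently
config_lock.release()
config_lock.release()
config_lock.acquire(shared=False)   # a config writer needs it exclusively
config_lock.release()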


Job Queue Locking
-----------------

The job queue is designed to be thread-safe. This means that its public
functions can be called from any thread. The job queue can be called from
functions called by the queue itself (e.g. logical units), but special
attention must be paid not to create deadlocks or an invalid state.

The single queue lock is used by all classes involved in queue handling.
During development we tried to split the lock, but deemed it too dangerous and
difficult at the time. Job queue functions that acquire the lock can be safely
called from the rest of the code, as the lock is released before leaving the
job queue again. Unlocked functions should only be called from job queue
related classes (e.g. in jqueue.py), and the lock must be acquired beforehand.
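The locked-public / unlocked-internal convention above can be sketched as follows. The class and method names here are hypothetical, chosen only to mirror the pattern, not jqueue.py's real interface:

```python
import threading

class JobQueue:
    """Public methods take the queue lock and release it before
    returning; *Unlocked helpers assume the caller already holds it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = []

    def SubmitJob(self, job):
        # Public entry point: safe to call from anywhere in the code,
        # because the lock is released again before we return.
        with self._lock:
            self._SubmitJobUnlocked(job)

    def _SubmitJobUnlocked(self, job):
        # Internal helper: must only run with self._lock held.
        assert self._lock.locked(), "queue lock must be held"
        self._jobs.append(job)

queue = JobQueue()
queue.SubmitJob("job-1")
```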

In the job queue worker (``_JobQueueWorker``), the lock must be released before
calling the LU processor. Otherwise a deadlock can occur when log messages are
added to opcode results.
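The deadlock in question is a simple re-entrancy problem, sketched below with a plain (non-reentrant) lock and hypothetical function names. If the worker still held the queue lock when the processor's log callback re-entered the queue, the callback would block on the lock forever:

```python
import threading

queue_lock = threading.Lock()

def add_log_entry(message):
    # Called back by the LU processor; re-acquires the queue lock to
    # append the message to the opcode result.
    with queue_lock:
        pass

def run_opcode(processor):
    with queue_lock:
        pass            # update job state while holding the lock
    # Lock released here, BEFORE the processor runs, so its log
    # callbacks can take the lock without deadlocking.
    processor()

run_opcode(lambda: add_log_entry("step done"))
```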


Node Daemon Locking
-------------------

The node daemon contains a lock for the job queue. In order to avoid conflicts
and/or corruption when an eventual master daemon or another node daemon is
running, it must be held for all job queue operations.

There is one special case for the node daemon running on the master node: if
grabbing the lock in exclusive mode fails on startup, the code assumes all
checks have been done by the process holding the lock.
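The try-exclusive-then-back-off startup check can be sketched with a non-blocking `flock` attempt. An fcntl file lock stands in for the node daemon's queue lock here; the file path is a stand-in for demonstration, not the daemon's real lock file:

```python
import errno
import fcntl
import tempfile

# Stand-in lock file for the demonstration (hypothetical path).
lock_file = open(tempfile.gettempdir() + "/queue-lock-demo", "w")
try:
    # Non-blocking exclusive attempt: fail immediately if held.
    fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    holder = True    # we got it: perform the startup checks ourselves
except OSError as err:
    if err.errno not in (errno.EACCES, errno.EAGAIN):
        raise
    holder = False   # someone else holds it: assume checks were done
```

With no other contender, the attempt succeeds; on a master node where another process already holds the lock, the `EAGAIN` branch is taken and startup continues without redoing the checks.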
/dev/null (old doc/locking.txt deleted; its content duplicated the new doc/locking.rst above, apart from two small "o"/"or" typos, and is omitted here)
