Revision 5ee09f03
doc/design-2.1.rst
changing too much of the core code, while addressing issues and adding new
features and improvements over 2.0, in a timely fashion.

.. contents:: :depth: 4


Objective
=========

...

this is a lightweight framework for abstracting the different storage
operations, not for modelling the storage hierarchy.


Locking improvements
~~~~~~~~~~~~~~~~~~~~

Current State and shortcomings
++++++++++++++++++++++++++++++

The class ``LockSet`` (see ``lib/locking.py``) is a container for one or more
``SharedLock`` instances. It provides an interface to add/remove locks and to
acquire and subsequently release any number of the locks contained in it.

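This interface can be pictured with a toy model. This is a sketch only: the
real ``LockSet`` also handles shared/exclusive modes, errors and concurrent
modification, none of which is modelled here.

```python
import threading

class MiniLockSet:
    """Toy model of the ``LockSet`` interface: add/remove locks,
    acquire a subset, release it again.  Illustrative only; the real
    implementation lives in ``lib/locking.py``."""

    def __init__(self):
        self._locks = {}  # lock name -> threading.Lock

    def add(self, name):
        self._locks[name] = threading.Lock()

    def remove(self, name):
        del self._locks[name]

    def acquire(self, names):
        # Locks are always taken in alphabetical order (see below).
        for name in sorted(names):
            self._locks[name].acquire()

    def release(self, names):
        for name in names:
            self._locks[name].release()
```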
Locks in a ``LockSet`` are always acquired in alphabetical order. Due to the
way we're using locks for nodes and instances (the single cluster lock isn't
affected by this issue), this can lead to long delays when acquiring locks if
another operation tries to acquire multiple locks but has to wait for yet
another operation.

In the following demonstration we assume we have the instance locks
``inst1``, ``inst2``, ``inst3`` and ``inst4``.

#. Operation A grabs the lock for instance ``inst4``.
#. Operation B wants to acquire all instance locks in alphabetical order, but
   it has to wait for ``inst4``.
#. Operation C tries to lock ``inst1``, but it has to wait until Operation B
   (which is trying to acquire all locks) releases the lock again.
#. Operation A finishes and releases the lock on ``inst4``. Operation B can
   continue and eventually releases all locks.
#. Operation C can get the ``inst1`` lock and finishes.

Technically there's no need for Operation C to wait for Operation A, and
subsequently Operation B, to finish. Operation B can't continue until
Operation A is done (it has to wait for ``inst4``) anyway.

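The chain of dependencies in the demonstration can be traced with a small,
purely illustrative model: no real threads, and ``try_acquire`` only records
which operation would block on which lock.

```python
held = {}      # lock name -> operation currently holding it
blocked = []   # (operation, first lock it would block on)

def try_acquire(op, names):
    """Grab locks in alphabetical order; record the first lock that blocks."""
    for name in sorted(names):
        if name in held and held[name] != op:
            blocked.append((op, name))
            return False
        held[name] = op
    return True

assert try_acquire("A", ["inst4"])                                  # step 1
assert not try_acquire("B", ["inst1", "inst2", "inst3", "inst4"])   # step 2
assert not try_acquire("C", ["inst1"])                              # step 3
# C is stuck behind B (which holds inst1..inst3), and B behind A.
assert blocked == [("B", "inst4"), ("C", "inst1")]
```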
Proposed changes
++++++++++++++++

Non-blocking lock acquiring
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Acquiring locks for OpCode execution is always done in blocking mode. The
calls won't return until the lock has successfully been acquired (or an error
occurred, although we won't cover that case here).

``SharedLock`` and ``LockSet`` must be able to be acquired in a non-blocking
way. They must support a timeout and abort trying to acquire the lock(s)
after the specified amount of time.

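A lock with such a timeout could be sketched as follows, using a condition
variable and a deadline. This is an illustration only, not the actual
``SharedLock`` code; the proposal also covers shared/exclusive modes, which
this sketch omits.

```python
import threading
import time

class TimedLock:
    """Minimal exclusive lock whose acquire() gives up after a timeout."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False

    def acquire(self, timeout=None):
        """Return True once acquired, False if ``timeout`` seconds elapsed."""
        with self._cond:
            deadline = None if timeout is None else time.monotonic() + timeout
            while self._held:
                remaining = (None if deadline is None
                             else deadline - time.monotonic())
                if remaining is not None and remaining <= 0:
                    return False  # timed out without getting the lock
                self._cond.wait(remaining)
            self._held = True
            return True

    def release(self):
        with self._cond:
            self._held = False
            self._cond.notify()
```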
Retry acquiring locks
^^^^^^^^^^^^^^^^^^^^^

To prevent other operations from waiting for a long time, as described in the
demonstration above, ``LockSet`` must not keep locks for a prolonged period
of time when trying to acquire two or more locks. Instead, if it fails to
acquire all requested locks within an increasing timeout, it should release
all acquired locks again and sleep for some time before retrying.

A good timeout value needs to be determined. In any case, ``LockSet`` should
proceed to acquire locks in blocking mode after a few unsuccessful attempts
to acquire all requested locks.

One proposal for the timeout is to use ``2**tries`` seconds, where ``tries``
is the number of unsuccessful tries.

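The retry loop could then look roughly like this. ``acquire_all`` and its
``timeout`` keyword are hypothetical stand-ins for whatever interface
``LockSet`` ends up exposing, and a real implementation would also sleep
briefly between attempts.

```python
def acquire_with_backoff(lockset, names, max_tries=4):
    """Try to grab all of ``names`` with a ``2**tries``-second timeout per
    attempt; after ``max_tries`` failures, fall back to blocking mode.
    Sketch only: ``lockset.acquire_all(names, timeout=...)`` is assumed to
    return a true value on success and a false one on timeout."""
    for tries in range(max_tries):
        if lockset.acquire_all(names, timeout=2 ** tries):
            return
    lockset.acquire_all(names)  # final attempt: block until all locks held
```

Between any two attempts all locks are released, which is exactly the window
that lets a waiter like Operation C slip in.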
In the demonstration above, this would allow Operation C to continue after
Operation B unsuccessfully tried to acquire all locks and released the
already acquired locks (``inst1``, ``inst2`` and ``inst3``) again.

Other solutions discussed
+++++++++++++++++++++++++

There was also some discussion of going one step further and extending the
job queue (see ``lib/jqueue.py``) to select the next task for a worker
depending on whether it can acquire the necessary locks. While this may
reduce the number of necessary worker threads and/or increase throughput on
large clusters with many jobs, it also brings with it many potential
problems, such as contention and increased memory usage. As this would be an
extension of the changes proposed above, it could be implemented at a later
point in time, but we decided to stay with the simpler solution for now.


Feature changes
---------------
