Increase the lock timeouts before we block-acquire
This has been observed to cause problems on real clusters via the following mechanism:
- a long job (e.g. a replace-disks) is keeping an exclusive lock on an instance
- the watcher starts and submits its query instances opcode which...
Generalize the OpCode-should-be-in-mcpu test
Currently, the TestDispatchTable unittest for mcpu uses a hard-coded approach to test whether an opcode should be included in the mcpu.Processor dispatch table or not. This is not flexible, so we replace it with two changes:...
Add consistency test for mcpu dispatch table
Signed-off-by: Michael Hanselmann <hansmi@google.com>
Reviewed-by: Iustin Pop <iustin@google.com>
mcpu: Adjust lock acquire strategy
The changes to job queue processing require some changes on this class' interface. LockAttemptTimeoutStrategy might move to another place, but that'll be done in a later patch.
Signed-off-by: Michael Hanselmann <hansmi@google.com>...
Ignore log messages in unittests
mcpu: Use new timeout class for timeout
mcpu: Change lock attempt timeout calculation
With this patch all timeouts are pre-calculated. The interface of the _LockTimeoutStrategy class is also changed a bit; NextAttempt now returns a new instance.
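The interface change described in this message can be illustrated with a small sketch. It is not the Ganeti implementation: the class name LockTimeoutStrategySketch, the timeout property, and the convention that None means "no timeout left, block" are assumptions made for illustration. Only the method name NextAttempt and the idea of pre-calculated timeouts come from the message.

```python
class LockTimeoutStrategySketch(object):
    """Hypothetical immutable strategy over pre-calculated timeouts."""

    def __init__(self, timeouts, index=0):
        # The full list of timeouts is computed up front and never modified.
        self._timeouts = tuple(timeouts)
        self._index = index

    @property
    def timeout(self):
        """Timeout for the current attempt, or None once the list is used up.

        Treating None as "do a blocking acquire" is an assumption.
        """
        if self._index < len(self._timeouts):
            return self._timeouts[self._index]
        return None

    def NextAttempt(self):
        """Return a new instance for the next attempt; self stays unchanged."""
        return LockTimeoutStrategySketch(self._timeouts, self._index + 1)
```

Because NextAttempt returns a fresh object instead of mutating state, a caller that still holds the earlier instance keeps seeing the earlier timeout.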
mcpu: Implement lock timeouts
The timeout is always between ~0.1 and ~10.0 seconds. A small variation of ±5% is added to prevent different jobs from fighting each other. After 10 attempts to acquire the locks with a timeout, a blocking acquire is made.
Lock status reporting will be improved in a separate patch....
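The retry scheme in this message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function names, the use of threading.Lock, and the geometric growth from the lower to the upper bound are assumptions. The message itself only states the ~0.1 to ~10.0 second range, the ±5% variation, and the blocking acquire after 10 timed attempts.

```python
import random
import threading

MIN_TIMEOUT = 0.1   # seconds, lower bound from the commit message
MAX_TIMEOUT = 10.0  # seconds, upper bound from the commit message
MAX_ATTEMPTS = 10   # timed attempts before falling back to blocking
VARIATION = 0.05    # +/-5% jitter so jobs do not retry in lockstep


def calc_timeout(attempt, rand=random.random):
    """Return the timeout for a 0-based attempt, or None for blocking."""
    if attempt >= MAX_ATTEMPTS:
        return None
    # Assumption: timeouts grow geometrically from MIN to MAX.
    ratio = (MAX_TIMEOUT / MIN_TIMEOUT) ** (1.0 / (MAX_ATTEMPTS - 1))
    base = MIN_TIMEOUT * ratio ** attempt
    # rand() is in [0, 1), so the factor lies in [0.95, 1.05).
    jitter = 1.0 + VARIATION * (2.0 * rand() - 1.0)
    return base * jitter


def acquire_with_backoff(lock):
    """Acquire lock, returning the number of timed attempts that failed."""
    attempt = 0
    while True:
        timeout = calc_timeout(attempt)
        if timeout is None:
            lock.acquire()  # blocking acquire after all timed attempts
            return attempt
        if lock.acquire(timeout=timeout):
            return attempt
        attempt += 1


if __name__ == "__main__":
    demo_lock = threading.Lock()
    print(acquire_with_backoff(demo_lock))
    demo_lock.release()
```

The jitter is what keeps several jobs that started waiting at the same moment from all retrying at the same instants.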