Bug #993

Transaction deadlock, exception in callbacks.py

Added by Vangelis Koukis over 12 years ago. Updated almost 11 years ago.

Status: Closed
Start date: 08/03/2011
Priority: Medium
Due date: 11/11/2011
Assignee: Christos Stavrakakis
% Done: 0%
Category: logic
Spent time: -
Target version: v0.9.0

Description

This exception has been reported from the deployment:

Aug  1 20:50:14 worker1 2011-08-01 17:50:14,672 - synnefo.dispatcher[19468] - ERROR - Unexpected error
Traceback (most recent call last):
  File "/srv/okeanos/synnefo/logic/callbacks.py", line 61, in update_db
    msg["status"], msg["logmsg"])
  File "/usr/lib/pymodules/python2.6/django/db/transaction.py", line 299, in _commit_on_success
    res = func(*args, **kw)
  File "/srv/okeanos/synnefo/logic/backend.py", line 92, in process_op_status
    vm.save()
  File "/usr/lib/pymodules/python2.6/django/db/models/base.py", line 434, in save
    self.save_base(using=using, force_insert=force_insert, force_update=force_update)
  File "/usr/lib/pymodules/python2.6/django/db/models/base.py", line 500, in save_base
    rows = manager.using(using).filter(pk=pk_val)._update(values)
  File "/usr/lib/pymodules/python2.6/django/db/models/query.py", line 491, in _update
    return query.get_compiler(self.db).execute_sql(None)
  File "/usr/lib/pymodules/python2.6/django/db/models/sql/compiler.py", line 861, in execute_sql
    cursor = super(SQLUpdateCompiler, self).execute_sql(result_type)
  File "/usr/lib/pymodules/python2.6/django/db/models/sql/compiler.py", line 727, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/pymodules/python2.6/django/db/backends/mysql/base.py", line 86, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

Need to reproduce it and understand its cause.
Interaction with the transaction consistency model?
Switch to manual transaction handling instead of relying on Django for transaction management?


Related issues

related to Synnefo - Bug #1029: Non-transactional processing of requests (Status: Closed, Start date: 08/31/2011, Due date: 11/11/2011)

History

#1 Updated by Giorgos Gousios over 12 years ago

It looks like another dispatcher instance holds an exclusive row lock (e.g. through a SELECT ... FOR UPDATE statement) on the VM entry whose status the current dispatcher instance is trying to update. This could happen when two dispatchers start processing status update messages for the same VM at the same time. My intuition is that this does not happen very often and, in any case, it is expected behavior in transactional systems. Manual transaction handling will not help in this case. The reported problem should not affect the consistency of the system: an exception is thrown, the message being processed is not acknowledged, and no change is made to the database.

I suggest we first investigate whether the problem happens often (e.g. using innotop or MySQL's log). If it does, we should check MySQL's InnoDB status (the LATEST DETECTED DEADLOCK section of the SHOW ENGINE INNODB STATUS output) to see which queries/transactions caused the deadlock. Then, we have three options:

  • Handle the specific exception and restart the transactions that fail
  • Only handle deadlock errors that affect user experience (e.g. in logic functions that update VM keywords), if such errors ever occur. This will need manual transaction handling in the API layer.
  • Just ignore the problem, as the transactions will be rerun either automatically or through user intervention anyway, but document the cases where this problem occurs.
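
As an illustration of the first option, here is a minimal Python sketch of retrying a failed transaction. It is not taken from the Synnefo code base. The names retry_on_deadlock and is_deadlock are hypothetical, and the OperationalError class below is only a stand-in for MySQLdb.OperationalError so that the sketch runs without a database. The only detail taken from the report is the error code 1213. The sketch assumes the callable passed in wraps the whole transaction (for example, the function decorated with commit_on_success), since retrying is only safe if the failed attempt was rolled back completely.

```python
import time

MYSQL_ER_LOCK_DEADLOCK = 1213  # error code from the traceback in the description


class OperationalError(Exception):
    """Stand-in for MySQLdb.OperationalError (assumption, keeps the sketch self-contained)."""


def is_deadlock(exc):
    """Return True if the exception carries MySQL's deadlock error code."""
    return bool(exc.args) and exc.args[0] == MYSQL_ER_LOCK_DEADLOCK


def retry_on_deadlock(func, max_attempts=3, delay=0.0):
    """Call func(); on a deadlock error, call it again, up to max_attempts times.

    func is assumed to run one complete transaction per call.
    Any other error, or a deadlock on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except OperationalError as exc:
            if not is_deadlock(exc) or attempt == max_attempts:
                raise
            # Back off a little longer after each failed attempt.
            time.sleep(delay * attempt)
```

In the dispatcher, a wrapper like this would sit around the call made in update_db, so that a deadlocked status update is retried a few times before the message is left unacknowledged. Whether that is preferable to simply letting the message be redelivered is exactly the trade-off between the first and third options.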

#2 Updated by Vangelis Koukis over 12 years ago

  • Target version changed from v0.5.5 to v0.6

#3 Updated by Vangelis Koukis over 12 years ago

  • Target version changed from v0.6 to v0.6.1

#4 Updated by Vangelis Koukis over 12 years ago

  • Target version changed from v0.6.1 to v0.6.2

#5 Updated by Vangelis Koukis over 12 years ago

  • Target version changed from v0.6.2 to 67

#6 Updated by Vangelis Koukis over 12 years ago

  • Target version changed from 67 to v0.8.0

#7 Updated by Vangelis Koukis over 12 years ago

  • Due date set to 11/11/2011

#8 Updated by Vangelis Koukis about 12 years ago

  • Assignee changed from Giorgos Gousios to Christos Stavrakakis
  • Target version changed from v0.8.0 to v0.9.0

We haven't been able to reproduce this yet. Moving to v0.9; we need to revise the way we do transactions on the DB (#1029).

#9 Updated by Christos Stavrakakis almost 11 years ago

  • Status changed from Assigned to Closed
