The case is as follows:
When using Show innodb status to check the engine status, a deadlock problem was discovered:
*** (1) TRANSACTION:
TRANSACTION 0 677833455, ACTIVE 0 sec, process no 11393, OS thread id 278546 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 320
MySQL thread id 83, query id 162348740 dcnet03 dcnet Searching rows for update
update TSK_TASK set STATUS_ID=1064,UPDATE_TIME=now () where STATUS_ID=1061 and MON_TIME
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 849384 n bits 208 index `PRIMARY` of table `dcnet_db/TSK_TASK` trx id 0 677833455 lock_mode X locks rec but not gap waiting
Record lock , heap no 92 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 8; hex 800000000097629c; asc b ;; 1: len 6; hex 00002866eaee; asc (f ;; 2: len 7; hex 00000d40040110; @ ;; 3: len 8; hex 80000000000050b2; asc P ;; 4: len 8; hex 800000000000502a; asc P*;; 5: len 8; hex 8000000000005426; 2c66d29c; asc A ,f ;; 7: len 23; hex 75706c6f6164666972652e636f6d2f6 8616e642e706870; asc xxx.com/;; 8: len 8; hex 800000000000042b; asc +;; 9: len 4; hex 474bfa 2b; asc GK +;; 10: len 8; hex 8000000000004e24; asc N$;;
*** (2) TRANSACTION:
TRANSACTION 0 677833454, ACTIVE 0 sec, process no 11397, OS thread id 344086 updating or deleting, thread declared inside InnoDB 499
mysql tables in use 1, locked 1
3 lock struct(s), heap size 320, undo log entries 1
MySQL thread id 84, query id 162348739 dcnet03 dcnet Updating
update TSK_TASK set STATUS_ID=1067,UPDATE_TIME=now () where ID in (9921180)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 849384 n bits 208 index `PRIMARY` of table `dcnet_db/TSK_TASK` trx id 0 677833454 lock_mode X locks rec but not gap
Record lock, heap no 92 PHYSICAL RECORD: n_fields 11 ; compact format; info bits 0
0: len 8; hex 800000000097629c; asc b ;; 1: len 6; hex 00002866eaee; asc (f ;; 2: len 7; hex 00000d40040110; asc @ ;; 3: len 8; hex 80000000000050b2; asc P ;; 4: len 8; hex 800000000000502a; asc P*;; 5: len 8; hex 8000000000005426; f ;; 7:len 23; hex 75706c6f6164666972652e636f6d2f6 8616e642e706870; asc uploadfire.com/hand.php;; 8: len 8; hex 800000000000042b; 10: len 8; hex 8000000000004e24; asc N$ ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 843102 n bits 600 index `KEY_TSKTASK_MONTIME2` of table `dcnet_db/TSK_TASK` trx id 0 677833454 lock_mode X locks rec but not gap waiting
Record lock, heap no 395 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 8; hex 8000000000000425; asc %;; 1: len 8; hex 800012412c66d29c; 800000000097629c; asc b ;;
*** WE ROLL BACK TRANSACTION (1)
This deadlock problem involves the TSK_TASK table, which is used to save system monitoring tasks. The following are the relevant fields and indexes:
ID: primary key;
MON_TIME: monitoring time;
STATUS_ID :Task status;
index: KEY_TSKTASK_MONTIME2 (STATUS_ID, MON_TIME).
Analysis shows that the two statements involved should not involve the same TSK_TASK record, so why does it cause a deadlock?
After querying the MySQL official website documentation, I found that this is related to MySQL's indexing mechanism. MySQL's InnoDB engine uses row-level locks. My original understanding was that records are locked directly, but this is not actually the case.
The key points are as follows:
instead of locking records, the index is locked;
during UPDATE and DELETE operations, MySQL not only locks all index records scanned by the WHERE condition, but also locks adjacent key values, the so-called next-key locking;
for example, the statement UPDATE TSK_TASK SET UPDATE_TIME = NOW() WHERE ID > 10000 will lock all records with a primary key greater than or equal to 1000. Before the statement is completed, you cannot operate on records with a primary key equal to 10000;
when the non-cluster index ( When a non-cluster index record is locked, the related cluster index record also needs to be locked to complete the corresponding operation.
Analyzing the two SQL statements where the problem occurred, it is not difficult to find the problem:
when "update TSK_TASK set STATUS_ID=1064,UPDATE_TIME=now () where STATUS_ID=1061 and MON_TIME
Assuming that "update TSK_TASK set STATUS_ID=1067,UPDATE_TIME=now () where ID in (9921180)" is executed almost simultaneously, this statement first locks the cluster index (primary key). Since the value of STATUS_ID needs to be updated, it is also necessary to lock a certain part of KEY_TSKTASK_MONTIME2. some index records.
In this way, the first statement locks the record of KEY_TSKTASK_MONTIME2 and waits for the primary key index, while the second statement locks the record of the primary key index and waits for the record of KEY_TSKTASK_MONTIME2. In this case, a deadlock occurs.
The author solved the deadlock problem by splitting the first statement:
first find the qualified ID: select ID from TSK_TASK where STATUS_ID=1061 and MON_TIME < date_sub(now(), INTERVAL 30 minute); then update the status: update TSK_TASK set STATUS_ID=1064 where ID in (….)
At this point, the deadlock problem is completely solved.