一、問題提出 問題是由姜大師提出的、問題如下: 表: mysql> show create table c \G *************************** 1. row *************************** Table: c Create Table: CREATE TABLE `c` ( `a` int(11) NOT NULL AUTO_INCREMENT, `b` int(11) DEFAULT NULL, PRIMARY KEY (`a`), UNIQUE KEY `b` (`b`) ) ENGINE=InnoDB 1 row in set (0.01 sec) 開啟兩個會話不斷的執行 replace into c values(NULL,1); 會觸發死鎖。問死鎖觸發的原因。
我使用的環境: MYSQL 5.7.14 debug版本、隔離級別RR、自動提交,很顯然這里的c表中的可以select出來的記錄始終是1條 只是a列不斷的增大,但是這里實際存儲空間確不止1條,因為從heap no來看二級索引中,heap no 已經到了 7,也就是有至少7(7-1)條記錄,只是其他記錄標記為del并且被purge線程放到了page free_list中。
MySQL thread id 2, OS thread handle 140737154311936, query id 642 localhost root cleaning up
---lock strcut(1):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
TABLE LOCK table `test`.`c4` trx id 184771 lock mode IX
---lock strcut(2):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 413 page no 4 n bits 72 index id2 of table `test`.`c4` trx id 184771 lock_mode X(LOCK_X)
Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0:len 4;hex 80000014;asc;;
1:len 4;hex 80000014;asc;;
---lock strcut(3):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 413 page no 3 n bits 72 index PRIMARY of table `test`.`c4` trx id 184771 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP)
Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0:len 4;hex 80000014;asc;;
1:len 6;hex 00000002d1bd;asc;;
2:len 7;hex a600000e230110;asc # ;;
3:len 4;hex 80000014;asc;;
---lock strcut(4):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 413 page no 4 n bits 72 index id2 of table `test`.`c4` trx id 184771 lock_mode X(LOCK_X) locks gap before rec(LOCK_GAP)
Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0:len 4;hex 8000001e;asc;;
1:len 4;hex 8000001e;asc;;
正常的版本只有
點擊(此處)折疊或打開
---TRANSACTION 184771, ACTIVE 45 sec
4 lock struct(s), heap size 1160, 3 row lock(s)
MySQL thread id 2, OS thread handle 140737154311936, query id 642 localhost root cleaning up
部分后面的都是我加上的,其實修改很簡單,innodb其實自己寫好了只是沒有開啟,我開啟后加上了序號來表示順序。 上面是一個 select * from c where id2= 20 for update; b列為輔助索引的所有4 lock struct(s),可以看到有了這些信息分析 不那么難了。 這里稍微分析一下 表結構為: mysql> show create table c4; +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+ | c4 | CREATE TABLE `c4` ( `id1` int(11) NOT NULL, `id2` int(11) DEFAULT NULL, PRIMARY KEY (`id1`), KEY `id2` (`id2`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | +-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) 數據為: mysql> select * from c4; +-----+------+ | id1 | id2 | +-----+------+ | 1 | 1 | | 10 | 10 | | 20 | 20 | | 30 | 30 | +-----+------+ 4 rows in set (0.00 sec) 語句為: select * from c where id2= 20 for update; RR模式 從鎖結構鏈表來看,這個語句在輔助索引分別鎖定了 id2:20 id1:20 LOCK_X|LOCK_ORDINARY 也就是NEXT KEY LOCK 同時鎖定了 id2:30 id1:30 LOCK_X|LOCK_GAP也就是gap lock不包含這一列 那么畫個圖容易理解黃色部分為鎖定部分:
修改出現的問題:修改源碼打印出所有lock struct 在線上顯然是不能用的。因為打印出來后show engine innodb status 會非常 長,甚至引發其他問題,但是測試是可以,其次修改了打印死鎖事物鎖鏈表到日志后,每次只要遇到死鎖信息可以打印 到日志,但是每次MYSQLD都會掛掉,但是不影響分析了。
三、預備知識(自我理解) 1、 Precise modes: #define LOCK_ORDINARY 0 /*!< this flag denotes an ordinary next-key lock in contrast to LOCK_GAP or LOCK_REC_NOT_GAP */ 默認是LOCK_ORDINARY及普通的next_key_lock,鎖住行及以前的間隙 #define LOCK_GAP 512 /*!< when this bit is set, it means that the lock holds only on the gap before the record; for instance, an x-lock on the gap does not give permission to modify the record on which the bit is set; locks of this type are created when records are removed from the index chain of records */ 間隙鎖,鎖住行以前的間隙,不鎖住本行 #define LOCK_REC_NOT_GAP 1024 /*!< this bit means that the lock is only on the index record and does NOT block inserts to the gap before the index record; this is used in the case when we retrieve a record with a unique key, and is also used in locking plain SELECTs (not part of UPDATE or DELETE) when the user has set the READ COMMITTED isolation level */ 行鎖,鎖住行而不鎖住任何間隙 #define LOCK_INSERT_INTENTION 2048 /*!< this bit is set when we place a waiting gap type record lock request in order to let an insert of an index record to wait until there are no conflicting locks by other transactions on the gap; note that this flag remains set when the waiting lock is granted, or if the lock is inherited record */ 插入意向鎖,如果插入的記錄在某個已經鎖定的間隙內為這個鎖 2、參數innodb_autoinc_lock_mode的值為1,也許不能保證replace into的順序。 3、infimum和supremum 一個page中包含這兩個偽列,頁中所有的行未刪除(刪除未purge)的行都連接到這兩個虛列之間,其中 supremum偽列的鎖始終為next_key_lock。 4、heap no 此行在page中的heap no heap no存儲在fixed_extrasize 中,heap no 為物理存儲填充的序號,頁的空閑空間掛載在page free鏈表中(頭插法)可以重用, 但是重用此heap no不變,如果一直是insert 則heap no 不斷增加,并非按照KEY大小排序的邏輯鏈表順序,而是物理填充順序 5、n bits 和這個page相關的鎖位圖的大小如果我的表有9條數據 還包含2個infimum和supremum虛擬列 及 64+11 bits,及75bits但是必須被8整除為一個字節就是 80 bits 6、隱含鎖(Implicit lock)和顯示鎖(explict) 鎖有隱含和顯示之分。隱含鎖通常發生在 insert 的時候對cluster index和second index 都加隱含鎖,如果是UPDATE(DELETE)對cluster index加顯示鎖 輔助 索引加隱含鎖。目的在于減少鎖結構的內存開銷,如果有事務需要和這個隱含鎖而不兼容,這個事務需要幫助 insert或者update(delete)事物將隱含 鎖變為顯示鎖,然后給自己加鎖,通常insert主鍵檢查會給自己加上S鎖,REPLACE、delete、update通常會給自己加上X鎖。
我們可以隱隱約約看到row_ins_sec_index_entry_low和row_ins_clust_index_entry_low回檢查是否有重復的行 分別代表是二級索引和聚集索引的相關檢查,因為就這個案例主鍵不可能出現重復值,而二級索引這個例子中肯定是 重復的,索引row_ins_sec_index_entry_low觸發了等待,其實我們知道這里的鎖方式如下列子: ---lock strcut(2):(Add by gaopeng) In modify Version I force check all REC_LOCK/TAB_LOCK chain! for this Trx RECORD LOCKS space id 406 page no 4 n bits 72 index b of table `test`.`c` trx id 177891 lock_mode X(LOCK_X) waiting(LOCK_WAIT) Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 80000001; asc ;; 1: len 4; hex 80000006; asc ;; LOCK_X|LOCK_ORDINARY|LOCK_WAIT:需要X的next_key lock處于等待狀態他需要鎖定(infimum,{1,6}]這個區間。 這也是死鎖發生的關鍵一個環節。
MySQL thread id 5, OS thread handle 140734658983680, query id 4646 localhost root update
replace into c values(null,1)
---lock strcut(1):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
TABLE LOCK table `mysqlslap`.`c` trx id 289636 lock mode IX
---lock strcut(2):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 4 n bits 104 index b of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X)
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0:len 8;hex 73757072656d756d;asc supremum;;
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0:len 4;hex 80000001;asc;;
1:len 4;hex 80001221;asc!;;
---lock strcut(3):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 4 n bits 104 index b of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X) locks gap before rec(LOCK_GAP)
Record lock, heap no 20 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0:len 4;hex 80000001;asc;;
1:len 4;hex 80001220;asc;;
---lock strcut(4):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 4 n bits 104 index b of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X)
---lock strcut(5):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 4 n bits 104 index b of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X)
Record lock, heap no 20 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0:len 4;hex 80000001;asc;;
1:len 4;hex 80001220;asc;;
---lock strcut(6):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 3 n bits 104 index PRIMARY of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X) locks rec butnot gap(LOCK_REC_NOT_GAP)
Record lock, heap no 8 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
0:len 4;hex 80001221;asc!;;
1:len 6;hex 000000046b64;asc kd;;
2:len 7;hex 30000001f00c97;asc 0 ;;
3:len 4;hex 80000001;asc;;
---lock strcut(7):(Add by gaopeng)In modify Version I force check all REC_LOCK/TAB_LOCK for this Trx
RECORD LOCKS space id 407 page no 4 n bits 104 index b of table `mysqlslap`.`c` trx id 289636 lock_mode X(LOCK_X) locks gap before rec(LOCK_GAP) insert intention(LOCK_INSERT_INTENTION) waiting(LOCK_WAIT)
Record lock, heap no 20 PHYSICAL RECORD: n_fields 2; compact format; info bits 32