RAC与单实例关于锁冲突表现的区别

上周五,在一套生产库的RAC上,发现大量的enq: TM – contention等待,表现为:

                                                                                    SQL Child
       SID USERNAME        EVENT                                    SQL_ID             Number                     P1               P2           P3 LAST_CALL_ET
---------- --------------- ---------------------------------------- --------------- --------- ---------------------- ---------------- ------------ ------------
......
      6215 XXXXX           enq: TM - contention                     08gtqt7qv8dyp           0             1414332419           254724            0          401
      4709 XXXXX           enq: TM - contention                     0ndqw20zktps2           0             1414332418           254724            0          261
      3721 XXXXX           enq: TM - contention                     0ndqw20zktps2           0             1414332418           254724            0          332
      5536 XXXXX           enq: TM - contention                     0ndqw20zktps2           0             1414332418           254724            0            6
      5113 XXXXX           enq: TM - contention                     0ndqw20zktps2           0             1414332418           254724            0          435
      4374 XXXXX           enq: TM - contention                     0ndqw20zktps2           0             1414332418           254724            0          130
      3827 XXXXX           enq: TM - contention                     0ps4av8r3yktg           0             1414332419           254724            0          477
      4118 XXXXX           enq: TM - contention                     17560ynh15fc4           0             1414332419           254724            0          343
      5638 XXXXX           enq: TM - contention                     20mx1b2rk64k5           0             1414332419           254724            0          393
      3355 XXXXX           enq: TM - contention                     284fa880r2cn4           0             1414332419           254724            0          373
      5358 XXXXX           enq: TM - contention                     2yctkhay5xkhh           0             1414332419           254724            0          273
      5705 XXXXX           enq: TM - contention                     2yn8v99vhr6sa           1             1414332418           254724            0          478
.....

问题发生在object_id=254724这个对象的争用上,查看v$lock的情况,发现这个对象的争用全部在节点1上,节点2上没有(因为这套库做了业务隔离,所以只在一个节点)。v$lock的情况表现如下:

ADDR             KADDR                   SID TY        ID1        ID2      LMODE    REQUEST      CTIME      BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
......
C00000199849BB58 C00000199849BB80       3354 TM     254724          0          0          2        564          0
C0000019984A4E10 C0000019984A4E38       6103 TM     254724          0          0          3        564          0
C0000019984CD160 C0000019984CD188       3817 TM     254724          0          0          2        564          0
C0000019984B4140 C0000019984B4168       4085 TM     254724          0          0          2        566          0
C0000019984A3508 C0000019984A3530       4199 TM     254724          0          0          2        567          0
C0000019984A3BE0 C0000019984A3C08       6419 TM     254724          0          0          2        567          0
C0000019984B8C58 C0000019984B8C80       4127 TM     254724          0          0          2        567          0
C0000019984B90D8 C0000019984B9100       3558 TM     254724          0          0          2        567          0
C0000019984BE748 C0000019984BE770       4166 TM     254724          0          0          2        567          0
C0000019984C0038 C0000019984C0060       5796 TM     254724          0          0          2        567          0
C0000019984C2A50 C0000019984C2A78       5851 TM     254724          0          0          3        567          0
C0000019984C58E8 C0000019984C5910       4388 TM     254724          0          0          2        567          0
C0000019984A5A88 C0000019984A5AB0       3973 TM     254724          0          0          2        568          0
C0000019984910E8 C000001998491110       3566 TM     254724          0          0          2        571          0
C000001998494658 C000001998494680       4754 TM     254724          0          0          2        571          0
C0000019984BA548 C0000019984BA570       3827 TM     254724          0          0          3        571          0
C0000019984CBF30 C0000019984CBF58       3591 TM     254724          0          0          2        571          0
C0000019984CC3C8 C0000019984CC3F0       5481 TM     254724          0          0          2        571          0
C0000019984CC728 C0000019984CC750       4827 TM     254724          0          0          2        571          0
C0000019984D7768 C0000019984D7790       6031 TM     254724          0          0          3        571          0
C000001998491A00 C000001998491A28       3900 TM     254724          0          0          4        572          0
C00000199849AA60 C00000199849AA88       4094 TM     254724          0          0          2        572          0
C0000019984A63A0 C0000019984A63C8       5768 TM     254724          0          0          2        572          0
C0000019984AE650 C0000019984AE678       5705 TM     254724          0          0          2        572          0
C0000019984AF400 C0000019984AF428       6358 TM     254724          0          0          2        572          0
C0000019984BA1E8 C0000019984BA210       5038 TM     254724          0          0          2        572          0
C000001998491688 C0000019984916B0       3193 TM     254724          0          3          0        841          1

1066 rows selected.

当然解决问题比较简单,把enq: TM – contention的会话杀了。
这里SID=3193 持有了TM-3,阻塞SID=3900 TM-4的请求,但是为什么很多TM-2的请求都被阻塞了,这就很难理解。因为TM-2只和TM-6冲突,当时并没有TM-6在被请求或持有。

于是我在一台单实例做测试如下:

会话一:

SQL> select distinct sid from v$mystat;
 
SID
----------
456
 
SQL> lock table t1 in ROW EXCLUSIVE mode ;
 
Table(s) Locked.

会话二:

SQL> select distinct sid from v$mystat;
 
SID
----------
880
 
SQL> lock table t1 in SHARE mode ;

挂起

会话三:

SQL> select distinct sid from v$mystat;
 
SID
----------
863
 
SQL> lock table t1 in ROW SHARE mode ;
 
Table(s) Locked.

单实例的测试结果表明:会话一持有TM-3,会话二请求TM-4被挂起,会话三继续请求TM-2不会被挂起。
但这个过程到RAC(会话在相同节点)下去测试就变成了:会话一持有TM-3,会话二请求TM-4被挂起,会话三继续请求TM-2被挂起,表现完全不一样。
这样在RAC上,TM锁冲突确实会更加严重,如果子表外键无索引,父表操作了一行记录需要请求子表的TM-4被阻塞后,虽然子表其它外键有索引,其它父表操作请求子表的TM-2却依然被阻塞。


看看某个业务的外键设计,想想都可怕。
外键无索引的详细说明,请参考:外键无索引造成锁问题的详细说明

关于紫砂壶

感悟技术人生
此条目发表在Oracle性能问题分类目录,贴了, 标签。将固定链接加入收藏夹。