事务队列等待深入分析:记录锁


在发现TX锁时,我们如何鉴别是哪一种情况导致的呢?当存在TX等待队列时,如何找到锁所在的对象呢?

  记录锁

  这类锁是事务插入/删除/更新数据记录时加在记录的锁。对于插入操作,数据在未提交之前对其他事务是“不可见”的,因而不会导致TX等待。这一类的TX锁是比较容易鉴别的——只有这类锁的模式(mode)是6(即排它锁,exclusive)。通过v$lock很容易鉴定出来:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> create table tx_lock_tab (a number, b char(1), c varchar2(20));

  Table created.

  HELLODBA.COM> insert into tx_lock_tab (a, b, c) values (99, 'C', 'AOAKAPSOD');

  1 row created.

  HELLODBA.COM> insert into tx_lock_tab (a, b, c) values (100, 'C', 'AOAKAPSOD');

  1 row created.

  HELLODBA.COM> insert into tx_lock_tab (a, b, c) values (101, 'd', 'AOAKAPSOD');

  1 row created.

  HELLODBA.COM> commit;

  Commit complete.

  session 1中:

  HELLODBA.COM> delete from tx_lock_tab where a=100;

  1 row deleted.

  HELLODBA.COM> select * from v$lock where type='TX';

  ADDR KADDR SID TYPE ID1 ID2 LMODE REQUEST CTIME BLOCK

  -------- -------- ---------- ---- ---------- ---------- ---------- ---------- ---------- ----------

  1E200F98 1E2010B4 135 TX 458781 104223 6 0 36 0
  可以看到,Lock Mode为6.

  等待队列分析

  当发现系统中由于此类锁导致的等待而致使会话hung住时,可以通过以下过程来找到导致阻塞的会话、语句和对象。

  续上,在Session 2中执行:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> delete from tx_lock_tab where a=100;
  此时,锁等待队列形成。从V$ENQUEUE_LOCK中可以查询到这一锁队列:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> select * from V$ENQUEUE_LOCK where type='TX';

  ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK

  -------- -------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------

  1EC34448 1EC3445C 127 TX 458781 104223 0 6 213 0
  注意:V$ENQUEUE_LOCK的结构与V$LOCK结构很相似,但是,V$LOCK查询到的是被持有的锁及其SID,V$ENQUEUE_LOCK查询到的导致队列等待的锁以及请求锁的SID。

  同时,通过v$session_event可以查询到会话等待事件是"enq: TX - row lock contention":


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> select s.sid, s.event

  2 from v$session_event s, v$enqueue_lock l

  3 where s.sid = l.sid

  4 and s.event like 'enq: TX%';

  SID EVENT

  ---------- -----------------------------------------------------

  127 enq: TX - row lock contention
  这一类锁是代码逻辑造成的——第一个获取锁的事务没有提交或回滚,导致其他会话等待。因此,找到逻辑代码是解决此类锁等待的关键。对于等待事务,由于其语句被锁阻塞住,因此从V$SESSION中得到的SQL信息就是其当前正在执行的语句,也就是发生等待的语句:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> select w.waiting_session,

  2 s2.username waiting_user,

  3 q2.sql_text waiting_sql

  4 from dba_waiters w,

  5 v$session s2,

  6 v$sqlarea q2

  7 where w.waiting_session = s2.sid

  8 and s2.sql_address = q2.address;

  WAITING_SESSION WAITING_USER WAITING_SQL

  --------------- ------------------------------ ---------------------------------------------

  127 DEMO delete from tx_lock_tab where a=100
  对于持有锁的会话,由于其在申请到锁资源后还可能会执行其他语句,因此不能通过v$session直接查询到发生锁的语句,而需要通过v$open_cursor来找到该语句:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

--> HELLODBA.COM> select w.holding_session,

  2 s1.username holding_user,

  3 q1.sql_text holding_sql

  4 from dba_waiters w,

  5 v$session s1,

  6 v$open_cursor q1,

  7 v$locked_object l1,

  8 dba_objects o1

  9 where w.holding_session = s1.sid

  10 and s1.sid = q1.sid(+)

  11 and l1.session_id = s1.sid

  12 and l1.object_id = o1.object_id

  13 and (upper(q1.sql_text) like '%DELETE%' or upper(q1.sql_text) like '%UPDATE%')

  14 and upper(q1.sql_text) like '%'||o1.object_name||'%';

  HOLDING_SESSION HOLDING_USER HOLDING_SQL

  --------------- ------------------------------ ------------------------------------------------------------

  129 DEMO delete from tx_lock_tab where a=100
  这一查询结果可能会存在多条记录,那我们就需要结合等待会话的语句,从逻辑上分析是哪一条语句产生的锁。

  除了语句,我们还可以通过会话信息中找到发生等待的具体数据记录:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> select s.sid, o.object_name wait_object

  2 s.row_wait_obj#,

  3 s.row_wait_file#,

  4 s.row_wait_block#,

  5 s.row_wait_row#,

  6 dbms_rowid.rowid_create(1,

  7 s.row_wait_obj#,

  8 s.row_wait_file#,

  9 s.row_wait_block#,

  10 s.row_wait_row#) l_rowid

  11 from v$session s, v$enqueue_lock l, dba_objects o

  12 where s.sid = l.sid and s.row_wait_obj# = o.object_id(+)

  13 and s.sid = 127;

  SID WAIT_OBJECT ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# L_ROWID

  ---------- ----------- ------------- -------------- --------------- ------------- ------------------

  127 TX_LOCK_TAB 198074 5 67366 1 AAAwW6AAFAAAQcmAAB
  通过得到的rowid,可以查询到具体记录: 


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->HELLODBA.COM> select * from tx_lock_tab where rowid = chartorowid('AAAwW6AAFAAAQcmAAB');

  A B C

  ---------- - --------------------

  100 C AOAKAPSOD
  死锁分析

  对于此类锁所引发的死锁问题,通过Trace文件可以很快定位出来。首先,我们可以通过锁的请求、持有模式(x)可以知道这是行级锁:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

-->---------Blocker(s)-------- ---------Waiter(s)---------

  Resource Name process session holds waits process session holds waits

  TX-000a005e-0016ce18 231 197 X 295 305 X

  TX-000e001d-00496580 295 305 X 591 569 X

  TX-00040016-001d820b 591 569 X 574 510 X

  TX-00030002-001fadc2 574 510 X 231 197 X
  同时,我们还可以找到导致死锁的语句、执行语句的用户、客户端及模块信息:


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

--> Current SQL statement for this session:

  UPDATE CSS_CARRIER SET REC_UPD_DT = TO_TIMESTAMP (:"SYS_B_0", :"SYS_B_1") WHERE (CARRIER_ID = :"SYS_B_2")

  ...

  Information on the OTHER waiting sessions:

  Session 305:

  pid=295 serial=186 audsid=902537392 user: 33/CSSJAVA

  O/S info: user: oracle, term: unknown, ospid: , machine: as04.cargosmart.com

  program: JDBC Thin Client

  application name: JDBC Thin Client, hash value=0

  Current SQL Statement:

  UPDATE CSS_CARRIER SET REC_UPD_DT = TO_TIMESTAMP (:"SYS_B_0", :"SYS_B_1") WHERE (CARRIER_ID = :"SYS_B_2")

  ...

  End of information on OTHER waiting sessions.
  此外,还能知道锁所在的对象和数据记录(ROWID):


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

--> Rows waited on:

  Session 305: obj - rowid = 00005B9A - AAAFuaABBAAAAaKAAI

  (dictionary objn - 23450, file - 65, block - 1674, slot - 8)

  Session 569: obj - rowid = 00005B9A - AAAFuaABBAAAAaKABU

  (dictionary objn - 23450, file - 65, block - 1674, slot - 84)

  Session 510: obj - rowid = 00005B9A - AAAFuaABBAAAAaKAAQ

  (dictionary objn - 23450, file - 65, block - 1674, slot - 16)

  Session 197: obj - rowid = 00005B9A - AAAFuaABBAAAAaKAAF

  (dictionary objn - 23450, file - 65, block - 1674, slot - 5)
  然后再通过以上语句及会话的其它信息,找到相应代码,结合数据对象及产生锁的记录分析逻辑过程,修正会导致死锁的代码。

  解决方法

  要解决等待会话被“僵死”的问题,关键要看导致阻塞的会话正在做什么、或者正在等待什么,找到其事务长时间不提交的根本原因: 


Code highlighting produced by Actipro CodeHighlighter (freeware)
http://www.CodeHighlighter.com/

--> select s.sid, s.event, s.wait_time, q.sql_text

  from v$session s, v$sqlarea q

  where s.sql_address = q.address(+)

  and s.sql_hash_value = q.hash_value(+)

  and s.sid = 129;
  根本解决方法就是要调整应用逻辑,避免死锁。

  例如,我们有一个案例:一个应用是多线程服务的,当收到请求事件后,服务进程会打开一个游标,对游标中数据逐一进行业务处理、统一更新。由于不同进程打开游标的时间不同,游标查询语句获取的数据顺序也会不同,因此这个应用经常抛出死锁错误。我们给出的解决方案就是在游标查询语句中加入排序,使更新数据按某一特定顺序进行,从而避免死锁。


« 
» 
快速导航

Copyright © 2016 phpStudy | 豫ICP备2021030365号-3