一个enqueue等待事件的分析


 数据库:Oracle 9.2.0.8

  平台:HP-UX 11.11

  在对客户的数据库进行健康检查时发现:

  TOP 5等待事件中最高的是enqueue,其次是PX Deq: Execute Reply,这两个等待事件占了所有事件的65%以上。

  等待事件“enqueue”所占比例最高,一共等待了14,487s的时间,但从后面的enqueue的统计信息来看,所统计的SQ enqueue和TX enqueue的等待时间总和只有49s,其他统计的enqueue等待时间基本为0s,所以还有其他未统计进statspack的enqueue存在,并占据了大量了等待时间。

  Top 5 Timed Events

  ~~~~~~~~~~~~~~~~~~                                                     % Total

  Event                                               Waits    Time (s) Ela Time

  -------------------------------------------- ------------ ----------- --------

  enqueue                                             6,339      14,487    44.02

  PX Deq: Execute Reply                               2,773       7,222    21.94

  CPU time                                                        5,314    16.15

  db file sequential read                           982,418       3,574    10.86

  row cache lock                                     80,167         691     2.10

  -------------------------------------------------------------

  Enqueue activity for DB: IBSS  Instance: ibss2  Snaps: 311 -312

  -> Enqueue stats gathered prior to 9i should not be compared with 9i data

  -> ordered by Wait Time desc, Waits desc

  Avg Wt         Wait

  Eq     Requests    Succ Gets Failed Gets       Waits   Time (ms)     Time (s)

  -- ------------ ------------ ----------- ----------- ------------- ------------

  SQ        5,589        5,589           0         206        181.50           37

  TX      189,789      189,398         391         163         76.10           12

  HW        1,601        1,599           2         423           .55            0

  US        1,949        1,949           0         339           .67            0

  TM      912,315      912,254          56         155           .36            0

  TT          638          638           0          48           .92            0

  FB           39           39           0          26           .31            0

  TD           16           16           0          16           .50            0

  IR           10           10           0           8           .63            0

  WL            8            6           2           6           .67            0

  DR            4            4           0           4           .50            0

  RO            3            3           0           3           .67            0

  TA           26           26           0           2           .50            0

  DL          123          123           0           1          1.00            0

  TS            6            6           0           5           .00            0

  CF       12,113       12,113           0           1           .00            0

  -------------------------------------------------------------

  采样查询等待事件为enqueue的session:

  SQL> select sid,status,program,sql_hash_value from v$session where sid in (select sid from v$session_wait where event='enqueue');

  SID STATUS   PROGRAM                   SQL_HASH_VALUE

  ---------- -------- ------------------------- --------------

  194 ACTIVE   ORACLE.EXE                    4166735688

  311 ACTIVE   oracle@IBSSDB2 (P000)         1611783128

  在多个时间点采样发现,enqueue等待事件始终只有两个这session和这两个sql_hash_value。

  查找sid=194的sql:

  SQL> select sql_text from v$sql where hash_value=4166735688;

  no rows selected

  sid:194找不到相应的sql。

  查找sid=311的sql:

  SQL> select sql_text from v$sql where hash_value=1611783128;

  SQL_TEXT

  --------------------------------------------------------------------------------

  SELECT /*+ Q2447000 */ A1."INST_ID",A1."FORMATID",A1."GLOBALID",A1."BRANCHID",A1

  ."REFCOUNT" FROM "SYS"."GV$GLOBAL_TRANSACTION" A1 WHERE A1."FORMATID"=:B1 AND (R

  AWTOHEX(A1."GLOBALID")=:B2 OR A1."GLOBALID"=:B3) AND (RAWTOHEX(A1."BRANCHID")=:B

  4 OR A1."BRANCHID"=:B5)

  通过V$LOCK分析锁:

  SESS                    ID1        ID2      LMODE    REQUEST TY

  ---------------- ---------- ---------- ---------- ---------- --

  Holder: 217              18          0          4          0 DX

  Waiter: 311              18          0          0          4 DX

  Waiter: 194              18          0          0          6 DX

  在锁的等待中,持有者是217,等待者是311和192,等待的锁为DX锁。

  DX锁:Distributed transaction entry

  通过用oradebug分析,同样发现session 311被session 194阻塞,而session 194被session 217阻塞。

  Open chains found:

  Chain 1 : <cnode/sid/sess_srno/ospid/wait_event> :

  <1/217/952/11188/SQL*Net more data to client>

  -- <1/194/20578/13616/enqueue> -- <1/311/44354/24637/enqueue>

  关于这三个session的信息:

  SQL> select sid,USERNAME,STATUS,MACHINE,OSUSER,PROGRAM,MODULE from v$session where sid in (217,194,311);

  SPID    SID USERNAME          STATUS   MACHINE    OSUSER           PROGRAM                MODULE

  ------ ----- ----------------- -------- ---------- ---------------- ---------------------- ------------

  11188    194 IBSSCX            ACTIVE   DL380G4    Administrator    ORACLE.EXE             ORACLE.EXE

  13616    217 IBSSCX            ACTIVE   DL380G4    Administrator    ORACLE.EXE             ORACLE.EXE

  24637    311 ORA_RECO_070361   ACTIVE   DL380G4    SYSTEM           oracle@IBSSDB2 (P000)

  SID

  ---------- ---------

  217

  194

  311

  oracle9i 11188     1  0  Jul 19  ?         0:52 oracleibss2 (LOCAL=NO)

  oracle9i 13616     1  0  Jul 19  ?         8:55 oracleibss2 (LOCAL=NO)

  oracle9i 24637     1  0  Aug 16  ?         6:42 ora_p000_ibss2

  检查session 217:

  SQL> select sid,status,program,sql_hash_value from v$session where sid=217;

  SID STATUS   PROGRAM                   SQL_HASH_VALUE

  ---------- -------- ------------------------- --------------

  217 ACTIVE   ORACLE.EXE                    4198533449

  SQL> select event from v$session_wait where sid=217;

  EVENT

  ----------------------------------------------------------------

  SQL*Net more data to client

  SQL> select sql_text from v$sql where hash_value=4198533449;

  SQL_TEXT

  --------------------------------------------------------------------------------

  SELECT "MSINFO_ID","DISC_ID" FROM "TB_CM_MSINFO" "AA"

  SQL> select count(*) from IBSSCX.TB_CM_MSINFO;

  COUNT(*)

  ----------

  1780102

  检查session 217后发现,该session的status始终为ACTIVE,等待事件为:SQL*Net more data to client

  sql为一条简单的select语句,并且查询的表只有1780102行数据,怀疑该进程僵死。

  相关的"PX Deq: Execute Reply"等待事件分析:

  select * from

  (select p.spid,s.sid,s.program,s.sql_hash_value,s.status,

  to_char(s.LOGON_TIME,'yyyy-mm-dd HH24:MI:SS'),s.LAST_CALL_ET,

  w.event

  from v$session s,v$session_wait w,v$process p where s.type='USER'

  and s.sid=w.sid and p.addr=s.paddr

  order by s.LOGON_TIME)

  where rownum<=10;

  SPID     SID PROGRAM                           SQL_HASH_VALUE STATUS   TO_CHAR(S.LOGON_TIM LAST_CALL_ET EVENT

  ------ ----- --------------------------------- -------------- -------- ------------------- ------------ ------------------------------

  19312     14 oracle@IBSSDB2 (CJQ0)                          0 ACTIVE   2009-07-09 23:01:37     16121061 rdbms ipc message

  26267    182 ORACLE.EXE                            3477857362 INACTIVE 2009-07-19 14:30:33     15284881 SQL*Net message from client

  11188    217 ORACLE.EXE                            4198533449 ACTIVE   2009-07-19 15:17:55     15284427 SQL*Net more data to client

  13616    194 ORACLE.EXE                            4166735688 ACTIVE   2009-07-19 15:25:37     15284414 enqueue

  24637    311 oracle@IBSSDB2 (P000)                 1611783128 ACTIVE   2009-08-16 10:18:07     12883664 enqueue

  24632    304 ORACLE.EXE                            1963934439 ACTIVE   2009-08-16 10:18:07     12883665 PX Deq: Execute Reply

  3261      68 JDBC Thin Client                      3314230268 INACTIVE 2009-11-18 08:54:56      1125959 SQL*Net message from client

  2848      40 JDBC Thin Client                       588438348 INACTIVE 2009-11-20 11:35:25         4548 SQL*Net message from client

  4147     335 SAPP2">rmrescheck@IBSSAPP2 (TNS V1-V3)                0 INACTIVE 2009-12-01 11:53:59         2491 SQL*Net message from client

  4159      52 rmibssengine@IBSSAPP2 (TNS V1-V3)      3206692553 INACTIVE 2009-12-01 11:54:04           21 SQL*Net message from client

  从session状态、运行时间及等待事件来看,存在问题的session有:

  SPID                SID

  ------------ ----------

  11188               217

  13616               194

  24637               311

  24632               304

  已知 217、194、311在一条等待链上,而session 304的等待事件是"PX Deq: Execute Reply",在等待所有其他执行子进程完成,而session 311正是并行执行子进程,在整个等待事件统计中,只有1个"PX Deq: Execute Reply"等待事件,所以该等待事件也是由enqueue引发,根本原因也是session 217 hang住导致的。

  建议检查session 217是否确实僵死,若僵死,建议将其kill


« 
» 
快速导航

Copyright © 2016 phpStudy | 豫ICP备2021030365号-3