在awr报告中的top wait中显示了’latch: row cache objects’竞争
Top 5 Timed Foreground Events
Event Waits Time(s) Avg wait (ms) % DB time Wait Class latch: row cache objects 11,509,891 291,065 25 71.65 Concurrency DB CPU 37,417 9.21 db file sequential read 2,221,943 19,023 9 4.68 User I/O read by other session 1,464,324 9,340 6 2.30 User I/O enq: TX - row lock contention 19,750 4,928 250 1.21 Application Per Second Per Transaction Per Exec Per Call DB Time(s): 112.9 2.4 0.02 0.05 DB CPU(s): 10.4 0.2 0.00 0.00 Redo size: 580,567.8 12,152.7 Logical reads: 417,007.5 8,729.0 Block changes: 2,931.4 61.4 Physical reads: 34,146.3 714.8 Physical writes 288.6 6.0 User calls: 2,507.1 52.5 Parses: 1,023.5 21.4 Hard parses: 528.4 11.1 W/A MB processed: 7.6 0.4 Logons: 0.3 0.0 Executes: 5,231.8 109.5 Rollbacks: 0.7 0.0 Transactions: 47.8
SQL ordered by Elapsed Time
Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
% Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
%Total – Elapsed Time as a percentage of Total DB time
%CPU – CPU Time as a percentage of Elapsed Time
%IO – User I/O Time as a percentage of Elapsed Time
Captured SQL account for 5.1% of Total DB Time (s): 406,212
Captured PL/SQL account for 0.3% of Total DB Time (s): 406,212
Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO SQL Id SQL Module SQL Text 7,316.50 3 2,438.83 1.80 0.74 97.96 4ux9hyv2xrxqp JDBC Thin Client select * from (select page$tem... 3,240.46 0 0.80 0.00 0.00 3jkhph1czbx52 JDBC Thin Client update (select c.paf045 c045, ... 3,092.00 41,263 0.07 0.76 1.08 98.03 44njhv5k4r2bx JDBC Thin Client select a.akb020 hospital_id, a... 1,405.07 4 351.27 0.35 2.50 93.19 92st5dkx3b59d JDBC Thin Client select count(1) from ( SELECT ... 1,399.77 1 1,399.77 0.34 0.00 0.00 0q7cfvguf04k9 JDBC Thin Client update (select c.paf045 c045, ... 1,037.16 11,147 0.09 0.26 29.24 35.89 1w68qv0t71aak siserv@el01cn04.exalogic.gzsi.cn (TNS V1-V3) begin PKG_SERVICES . PRC_BIZ_S... 707.83 169 4.19 0.17 38.56 23.96 b29wgyggbpj29 JDBC Thin Client select * from ( select b.akb02... 615.84 10,590,959 0.00 0.15 40.13 0.00 grwydz59pu6mc select text from view$ where r... 408.78 4 102.20 0.10 13.63 66.25 5xtsfsftqxj7v JDBC Thin Client select v.aac001, (select c.aac... 359.40 78,750 0.00 0.09 2.95 0.00 7dsyr136296ty JDBC Thin Client select cur_num_no from SYS_SER...
SQL ordered by CPU Time
Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
%Total – CPU Time as a percentage of Total DB CPU
%CPU – CPU Time as a percentage of Elapsed Time
%IO – User I/O Time as a percentage of Elapsed Time
Captured SQL account for 3.3% of Total CPU Time (s): 37,417
Captured PL/SQL account for 0.8% of Total CPU Time (s): 37,417
CPU Time (s) Executions CPU per Exec (s) %Total Elapsed Time (s) %CPU %IO SQL Id SQL Module SQL Text 303.26 11,147 0.03 0.81 1,037.16 29.24 35.89 1w68qv0t71aak siserv@el01cn04.exalogic.gzsi.cn (TNS V1-V3) begin PKG_SERVICES . PRC_BIZ_S... 272.93 169 1.61 0.73 707.83 38.56 23.96 b29wgyggbpj29 JDBC Thin Client select * from ( select b.akb02... 247.13 10,590,959 0.00 0.66 615.84 40.13 0.00 grwydz59pu6mc select text from view$ where r... 68.86 4,972 0.01 0.18 131.43 52.40 1.14 df3fssxryx4g5 JDBC Thin Client select nvl(ac01.aac012, ac01.B... 55.86 18 3.10 0.15 120.11 46.51 0.01 admybmxruwxkz PL/SQL Developer select * from v$open_cursor wh... 55.70 4 13.93 0.15 408.78 13.63 66.25 5xtsfsftqxj7v JDBC Thin Client select v.aac001, (select c.aac... 54.33 3 18.11 0.15 7,316.50 0.74 97.96 4ux9hyv2xrxqp JDBC Thin Client select * from (select page$tem... 37.54 4 9.38 0.10 113.28 33.14 30.75 453gakrhkhmnw JDBC Thin Client with rs as(select m.aac001, m.... 37.35 4 9.34 0.10 210.48 17.74 61.40 5w5gchz2ymag5 JDBC Thin Client with rs as(select m.aac001, m.... 35.17 4 8.79 0.09 1,405.07 2.50 93.19 92st5dkx3b59d JDBC Thin Client select count(1) from ( SELECT ...
在oracle11.2.0.3中,会动态地增加’latch:row cache objects’的等待可以查看awr报告的’Latch Miss Sources’部分,top location calling部分是’row cache objects kqrpre: find obj’:
Latch Miss Sources only latches with sleeps are shown ordered by name, sleeps desc Latch Name Where NoWait Misses Sleeps Waiter Sleeps .......... row cache objects kqrpre: find obj 0 6,049,715 4,177,088 row cache objects kqreqd: reget 0 3,227,630 3,311,481 row cache objects kqreqd 0 2,286,734 4,076,892 row cache objects kqrso 0 8,571 7,388 row cache objects kqrbgl 0 428 408 row cache objects kqrpre: init complete 0 310 323 row cache objects kqrigt 0 232 210 row cache objects kqrigt2 0 227 237 row cache objects kqrbip 0 220 114 row cache objects kqrbtm: pop parent 0 209 222 row cache objects kqrbpr: KQRRSNRL 0 103 198 row cache objects kqrbfr 0 15 47 row cache objects kqrigt3 0 12 3 row cache objects ksucallcbksafely: kqrhngc 0 8 0 row cache objects kqrsrd 0 3 19 row cache objects kqrcmt: while loop 0 2 3
‘row cache objects’闩锁被调用的最频繁的是’kqrpre: find obj’.这个模式尝试在行缓存中找到正被解析对象的详细信息.在解析时行缓存会被搜索.进程搜索是在缓存中通过一个由’row cache objects’闩锁所保护的链表来找到它所需要的信息.当另外的进程也在解析并通过行缓存进行查找时就可能会发生竞争,特别是解析活过于频繁或执行一些不必要的解析时产生竞争的机会就会增大.
从系统状态跟踪文件中通常可以看到以下相关进程:
()+492< -kslges()+2152<-kqreqd()+256<-kkoiqb()+9436<-kkooqb()+684<-kkoqbc()+2256<-apakkoqb()+140
<-apaqbdDescendents()+460<-apadrv()+2020<-opitca()+2112 <-kksFullTypeCheck()+72<-rpiswu2()
在这个跟踪堆中大部分时间花在KKOIQB函数.这个函数使用行缓存以顺序调用来找索引名字因此这个表的索引会以索引名字字母的顺序的来排序.在这种情况下在不同执行计划之间优化器需要一个一致的方法来选择一个索引.通过按索引名字字母来排序来完成.
造成这个问题的原因是因为在oracle11.2中优化器的结构的改变.在oracle11.2.0.3中对于使用视图或子查询的查询会有更多的查询转换.这就会产生更多的优化操作也就意味着会有更多的row cache latch调用.
IOStat by Function summary 'Data' columns suffixed with M,G,T,P are in multiples of 1024 other columns suffixed with K,M,G,T,P are in multiples of 1000 ordered by (Data Read + Write) desc Function Name Reads:Data Reqs per sec Data per sec Writes:Data Reqs per sec Data per sec Waits:Count Avg Tm(ms) Direct Reads 223.3G 119.22 63.6844 0M 0.00 0M 0 RMAN 62.2G 64.06 17.7467 216M 2.10 .060147 113.5K 0.27 Buffer Cache Reads 37.3G 1270.31 10.6290 0M 0.00 0M 2609.5K 6.30 Others 3.8G 17.59 1.08626 3.1G 7.79 .884942 65.7K 0.59 DBWR 0M 0.00 0M 5.3G 73.69 1.50228 0 LGWR 2M 0.04 .000556 2.4G 71.72 .683616 234.4K 0.11 Direct Writes 0M 0.00 0M 687M 0.65 .191301 0 Streams AQ 0M 0.00 0M 0M 0.00 0M 2 8.00 TOTAL: 326.7G 1471.23 93.1470 11.7G 155.95 3.32229 3023.2K 5.47
解决方法:
1.检查程序使用绑定变量来降低每秒528的硬解析次数
2.在业务高峰期间停止rman备份
3.优化消耗大量cpu时间和执行时间长的sql语句(执行时间长和消耗cpu时间大的sql有些是重复的)
在使用以上解决方法后在第二天业务高峰期间恢复正常了