Oracle响应时间分析报告分为系统级与会话级,报告相比awr报告更加直观清楚有助于快速分析定位性能问题,这里使用OSM工具来生成这两种类型的报告,该工具是由Craig Shallahamer所写。
在数据库中创建osm用户并安装osm脚本所需要使用的对象
[oracle@db1 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.4.0 Production on Wed Feb 27 15:43:54 2019 Copyright (c) 1982, 2013, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP, Data Mining and Real Application Testing options SQL> create user osm identified by "osm" default tablespace sx temporary tablespace temp; User created. SQL> grant connect,resource,dba to osm; Grant succeeded. SQL> conn osm/osm Connected. SQL> exec sys.dbms_lock.sleep(5); PL/SQL procedure successfully completed. SQL> @osmprep.sql OraPub System Monitor - Interactive (OSM-I) installation script. (c)1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008 by OraPub, Inc. (c)2009,2010,2011,2012,2013,2014,2015 by OraPub, Inc. There is absolutely no guarantee with this software. You may use this software at your own risk, not OraPub's risk. No value is implied or stated. You may need to run $ORACLE_HOME/rdbms/admin/catblock.sql Connect as the user who will be using the OSM. Press ENTER to continue. ..... To categorize wait events for OSM reports, run: For pre-10g systems, run event_type_nc.sql For 10g and beyond, run event_type.sql Once you cateogrize the wait events, the installation is complete. Menu is osm.sql ENJOY!! SQL> @event_type.sql file: event_type.sql for Oracle 10g and beyond... About to categorize wait events for OSM reports. Press ENTER to re-create the o$event_type table. ..... OraPub Categorization Summary ----------------------------------------------- TYPE COUNT(*) -------------------- ---------- bogus 126 ior 20 iow 59 other 1162 COUNT(*) ---------- 1367 Oracle Categorization Summary ----------------------------------------------- WAIT_CLASS COUNT(*) -------------------- ---------- Administrative 55 Application 17 Cluster 50 Commit 2 Concurrency 33 Configuration 24 Idle 96 Network 35 Other 958 Queueing 9 Scheduler 8 WAIT_CLASS COUNT(*) -------------------- ---------- System I/O 32 User I/O 48 13 rows selected. COUNT(*) ---------- 1367
OSM工具包中的rtsess.sql是用来生成会话级报告,rtsysx.sql,rtpctx.sql用来生成实例级报告
实例级Oracle响应时间分析报告是基于rtsysx.sql脚本,它将捕获指定时间间隔内实例范围内关于响应时间方面的详细信息。这个脚本将对实例级统计信息(v$sysstat,v$sys_time_model)与实例级等待事件统计信息(v$system_event)生成快照。下面的例子在120秒的时间间隔内,脚本每10秒被唤醒一次,从v$session视图中查询活动的SQL并存储当前正在运行的SQL_ID。在报告生命周期结束后,其它的统计住处快照会生成,计算出时间差异并生成报告。几乎报告中的所有信息都可以从Statspack或AWR报告中进行收集。使用rtsysx.sql脚可以生成格式化的输出可以快速的执行Oracle响应时间分析。使用脚本rtsysx.sql脚本生成的报告包括以下几个组成部分:
.第一部分是关注工作量负载情况
.第二部分是高级别的响应时间分类信息
.第三部分是IO与非IO情况
.第四部分是没有使用绑定变量的SQL语句
.对于Oracle 10g及以上版本,第五部分是关于操作系统CPU利用率
下面执行rtsysx.sql脚本,执行生命周期是120秒,脚本每10秒被唤醒一次。
SQL> @rtsysx.sql 120 10 OraPub's Response Time Analysis (RTA) interactive system level delta report Initializing response time delta objects... Sleeping and probing active SQL for next 120 seconds... Done sleeping...gathering and storing current values... *** Response Time Ratio and Workload Metrics RT Ratio Ora Trx/s Block Changes/s User Calls/s Execs/s -------- ---------- --------------- ------------ ------------ 0.325 0.54 90.63 27.41 100.00 *** Response Time System Summary (delta - interactive - system level) Tot CPU CPU SP CPU BG CPU Parse CPU Recur Tot Wait IO Wait Other Wait Time Time Time Time Time Ora CPU Time Time Time % % (sec) (sec) (sec) (sec) (sec) Util % (sec) (sec) (sec) IO Wait Other Wait ---------- -------- -------- --------- --------- ------- -------- -------- ---------- ------- ---------- 34 26 8 2 15 0.4 17 15 1 92 8 *** I/O Wait Time Summary w/Event Details (delta - interactive - system level) IO Wait IO WRITE IO READ Time Wait Time Wait Time % IO % IO (sec) (sec) (sec) Write Read -------- --------- ---------- ----- ---- 15 10 5 66 34 Tot Call Avg Call Wait Time Wait Time IO Wait Event R,W % (sec) (ms) Tot Waits ------------------------------------------------------------ --- ----- ----------- ----------- --------- LGWR real time apply sync W 57 8.69 65.83 132 db file sequential read R 31 4.67 4.97 940 LGWR-LNS wait on channel W 5 0.71 1.74 408 *** Other Wait Time (non-I/O) Event Detail (delta - interactive - system level) Tot Call Avg Call Wait Time Wait Time Non IO (other) Wait Event % (sec) (ms) Tot Waits ------------------------------------------------------------ ----- ----------- ----------- --------- gc current block 2-way 26 0.35 0.62 569 gc cr grant 2-way 17 0.23 0.48 484 reliable message 10 0.14 1.21 116 os thread startup 10 0.14 23.33 6 enq: US - contention 6 0.08 0.49 163 library cache pin 6 0.08 0.56 144 library cache lock 4 0.06 0.61 98 gc current grant 2-way 4 0.05 0.45 111 *** SQL Activity Details During Probe Phys Rds Log Rds Tot Time CPU Time Rows Stmt SQL ID Sec/EXE (k) (k) (sec) (sec) Sec/PIO Sec/LIO Runs (k) Sorts Type ---------------- --------- -------- -------- --------- ---------- ------- ------- -------- ------- ------ ----- gz5bfrcjq060u 0.01 0 0 0.3 0.3 ####### 0.001 24 0 23 INSER c77k33u5u7zgc 0.06 0 17 0.1 0.1 ####### 0.000 2 0 2 SELEC 8fb44rrg8a5rh 0.13 0 15 0.1 0.1 ####### 0.000 1 0 2 SELEC 98564h3vavfcm -25.78 -0 -0 -51.6 -0.6 51.552 0.276 2 0 0 inser *** SQL Similar Statements During Delta SQL Statement (shown if first 10 chars) Count ---------------------------------------------------------------------- -------- SELECT NVL 2 *** OS CPU Breakdown During Delta Category Percent ----------------------------------- -------- Idle 96.51 IO Wait 0.44 Nice 0.00 System 0.38 User 2.47 Delta is 123.53 seconds Number of CPU cores is 80
报告的第一部分:Response Time Ratio and Workload Metrics
报告的第一部分提供了与Statspack与AWR中Load Profile部分相同的Workload Metrics。这部分信息在比较响应时间快照之间的差异时非常有用。如果工作负载减少那么可以预期响应时间减少。
*** Response Time Ratio and Workload Metrics RT Ratio Ora Trx/s Block Changes/s User Calls/s Execs/s -------- ---------- --------------- ------------ ------------ 0.325 0.54 90.63 27.41 100.00
报告的第二部分:Response Time System Summary
这部分信息显示总的CPU时间为34秒,总的等待时间为17秒,也可以说是在120秒的时间间隔内,Oracle进程消耗的CPU时间只有34秒,消耗的等待时间只有17秒。而且还可以看到17秒的等待时间中IO等待时间为15秒,非IO等待时间为1秒。在120秒的时间间隔内,Oracle进程只消耗了总CPU可用时间的0.4%,这个数据是使用Oracle进程消耗的总CPU时间除以主机可用CPU时间。在120秒时间间隔的主机的CPU可用时间为CPU的内核数量乘以报告时间间隔。在这里主机的CPU内核数量为80,报告时间间隔为120秒,所以Oracle所消耗的CPU时间为34/(120*80)=0.4%。如果主机上只运行该实例,那么它也提供了操作系统CPU利用率给我们,因此也不用执行操作系统命令来查看CPU利用情况了。
*** Response Time System Summary (delta - interactive - system level) Tot CPU CPU SP CPU BG CPU Parse CPU Recur Tot Wait IO Wait Other Wait Time Time Time Time Time Ora CPU Time Time Time % % (sec) (sec) (sec) (sec) (sec) Util % (sec) (sec) (sec) IO Wait Other Wait ---------- -------- -------- --------- --------- ------- -------- -------- ---------- ------- ---------- 34 26 8 2 15 0.4 17 15 1 92 8
报告的第三部分:I/O Wait Time Summary w/Event Details
如果IO有问题,那么你肯定想知道是读还是写有问题,管理员可以从了解IO负载类型来给出相关的解决方案。比如一个IO读问题可以通过将常被访问的数据块保存在Oracle Cache中来使用IO读的影响降低到最小,如果一个IO写问题可以通过配置,比如联机重做日志文件的数量与大小来使IO写的影响降低到最小。从报告中可以看到IO总等待时间为15秒,其中IO写为10秒,IO读为5秒。其中LGWR real time apply sync事件平均等待一次的时间是65.83毫秒,这是因为配置了ADG,对于同城异地容灾来说这个等待时间也还是正常的,db file sequential read事件平均等待一次的时间为4.97毫秒也是正常的。
*** I/O Wait Time Summary w/Event Details (delta - interactive - system level) IO Wait IO WRITE IO READ Time Wait Time Wait Time % IO % IO (sec) (sec) (sec) Write Read -------- --------- ---------- ----- ---- 15 10 5 66 34 Tot Call Avg Call Wait Time Wait Time IO Wait Event R,W % (sec) (ms) Tot Waits ------------------------------------------------------------ --- ----- ----------- ----------- --------- LGWR real time apply sync W 57 8.69 65.83 132 db file sequential read R 31 4.67 4.97 940 LGWR-LNS wait on channel W 5 0.71 1.74 408
报告的第四部分:Other Wait Time (non-I/O) Event Detail
这部分显示了非IO等待事件的汇总与底层相关的等待事件详细信息,因为非IO等待时间总共才只有1秒,这并不影响性能。所以相关的等待事件我们也就不用查看了。
*** Other Wait Time (non-I/O) Event Detail (delta - interactive - system level) Tot Call Avg Call Wait Time Wait Time Non IO (other) Wait Event % (sec) (ms) Tot Waits ------------------------------------------------------------ ----- ----------- ----------- --------- gc current block 2-way 26 0.35 0.62 569 gc cr grant 2-way 17 0.23 0.48 484 reliable message 10 0.14 1.21 116 os thread startup 10 0.14 23.33 6 enq: US - contention 6 0.08 0.49 163 library cache pin 6 0.08 0.56 144 library cache lock 4 0.06 0.61 98 gc current grant 2-way 4 0.05 0.45 111
报告的第五部分:SQL Activity Details During Probe
为了帮助分析应用程序,报告捕获了直接影响响应时间的SQL语句并显示了资源消耗情况,以下面的数据来看,在捕获的SQL语句所消耗的资源都是很少的不会影响性能,其中语句的物理读为0,逻辑读总大小也才32K。
*** SQL Activity Details During Probe Phys Rds Log Rds Tot Time CPU Time Rows Stmt SQL ID Sec/EXE (k) (k) (sec) (sec) Sec/PIO Sec/LIO Runs (k) Sorts Type ---------------- --------- -------- -------- --------- ---------- ------- ------- -------- ------- ------ ----- gz5bfrcjq060u 0.01 0 0 0.3 0.3 ####### 0.001 24 0 23 INSER c77k33u5u7zgc 0.06 0 17 0.1 0.1 ####### 0.000 2 0 2 SELEC 8fb44rrg8a5rh 0.13 0 15 0.1 0.1 ####### 0.000 1 0 2 SELEC 98564h3vavfcm -25.78 -0 -0 -51.6 -0.6 51.552 0.276 2 0 0 inser
报告的第六部分:SQL Similar Statements During Delta
在执行rtsysx.sql脚本所指定的第二个参数就与查找类似SQL语句相关,类似SQL语句是除了where子句中的过滤与连接条件不同之外其它完全相同的语句。第二个参数我们指定的是10,也就是说类似语句会被统计且统计数大于1的语句的前10个字符才会被显示。
*** SQL Similar Statements During Delta SQL Statement (shown if first 10 chars) Count ---------------------------------------------------------------------- -------- SELECT NVL 2
报告的第七部分:Operating System CPU Utilization
这部分显示了操作系统使用的详细情况。从Oracle 10g开始,Oracle捕获操作系统CPU的使用的详细信息并且这些信息可以通过v$osstat视图来查看。
*** OS CPU Breakdown During Delta Category Percent ----------------------------------- -------- Idle 96.51 IO Wait 0.44 Nice 0.00 System 0.38 User 2.47 Delta is 123.53 seconds Number of CPU cores is 80
会话级Oracle响应时间分析报告
执行脚本rtsess9.sql来对指定会话1110来生成会话级Oracle响应时间分析报告,从下面的报告中可以看到会话的响应时间为699.29秒,其中队列时间为608.20秒,非计数时间为91.09秒,而队列时间中IO队列时间只有0.3秒,Net+Client队列时间占了607.73秒。这说明会话一直在等待客户端程序进行调用。
SQL> @rtsess9 1110 =================================================================== Session Level Response Time Profile Oracle session 1110 CPU statistics number is 12 ...... Session level response time details for SID 1110 *** Response Time Summary Response Service Queue Unaccount % CPU % Queue % UAT Time(sec) Time(sec) Time(sec) Time(sec) RT RT RT [rt=st+qt+uat] [st] [qt] [uat] [st/rt] [qt/rt] [uat/rt] -------------- --------- --------- --------- ------- ------- -------- 699.29 0.00 608.20 91.09 0.00 86.97 13.03 *** Queue Time Summary QT QT QT Queue Time(sec) I/O(sec) Net+Client(sec) Other(sec) [qio+qnc+qot] [qio] [qnc] [qot] --------------- -------- --------------- ---------- 608.20 0.03 607.73 0.44 *** Queue Time IO Timing Detail QT QT QT I/O(sec) Write I/O(sec) Read I/O(sec) % Writes Time % Read Time [tio=wio+rio] [wio] [rio] [wio/tio] [rio/tio] ------------- -------------- ------------- ------------- ----------- 0.03 0.03 0.00 99.97 0.00 *** Queue Time IO Event Timing Detail Wait Time Wait Event Name (sec) ---------------------------------------- --------- direct path write 0.01 log file sync 0.02 *** Queue Time Other Event Timing Detail Wait Time Wait Event Name (sec) ---------------------------------------- --------- gc cr block 2-way 0.08 library cache pin 0.01 gc current block congested 0.01 gc current block 2-way 0.31 row cache lock 0.01 events in waitclass Other 0.01 library cache lock 0.01 *** Wait Event Time Not Categorized (for QA) ......
如果应用程序用户与Oracle服务器进程都在等待这是不正常的。如果用户已经执行了命令并且正等待命令执行结束,同时,相关的Oracle服务器进程正等待从客户端进程接收信息,那么在这两者之间存在问题。那么大概问题区域就是网络与客户端进程了。
一如既往的表示支持!