Oracle ASM异常dismount磁盘组

朋友反馈他们单位Oracle RAC中的一个节点异常关闭了,操作环境是IBM小机720,EMC存储,Oracle是11.2.0.4。收集了一些信息给我,查看该节点实例的alert.log文件发现了如下信息:

Thu Mar 31 14:29:18 2016
NOTE:Waiting for all pending writes to complete before de-registering: grpnum 3
Thu Mar 31 14:29:20 2016
WARNING: group 3 dismounted: failed to write virtual extent 263 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 263 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 265 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 246 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 260 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 261 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 244 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 257 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 259 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 254 of file 264 
WARNING: group 3 dismounted: failed to write virtual extent 254 of file 264 
Thu Mar 31 14:29:21 2016
WARNING: requested mirror side 1 of virtual extent 0 logical extent 0 offset 16384 is not allocated; I/O request failed
WARNING: requested mirror side 2 of virtual extent 0 logical extent 1 offset 16384 is not allocated; I/O request failed
WARNING: requested mirror side 3 of virtual extent 0 logical extent 2 offset 16384 is not allocated; I/O request failed
Errors in file /opt/app/oracle/diag/rdbms/jtpsoft/jtpsoft1/trace/jtpsoft1_arc1_10551306.trc:
ORA-00202: control file: '+GJJ_DG/jtpsoft/controlfile/current.260.841581789'
ORA-15079: ASM file is closed
ORA-15079: ASM file is closed
ORA-15079: ASM file is closed
System state dump requested by (instance=1, osid=15007960 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /opt/app/oracle/diag/rdbms/jtpsoft/jtpsoft1/trace/jtpsoft1_diag_4784254_20160331142927.trc
Thu Mar 31 14:29:28 2016
ORA-1092 : opitsk aborting process
Thu Mar 31 14:29:34 2016
ORA-1092 : opitsk aborting process
Thu Mar 31 14:29:34 2016
License high water mark = 181
Thu Mar 31 14:29:37 2016
Instance terminated by CKPT, pid = 15007960
USER (ospid: 15073490): terminating the instance
Instance terminated by USER, pid = 15073490

从上面信息可以看到磁盘组3(也就是磁盘组+GJJ_DG)已经dismounted了并且ckpt进程终止了实例。如是查看跟踪文件/opt/app/oracle/diag/rdbms/jtpsoft/jtpsoft1/trace/jtpsoft1_arc1_10551306.trc,可以看到如下内容:

Trace file /opt/app/oracle/diag/rdbms/jtpsoft/jtpsoft1/trace/jtpsoft1_arc1_10551306.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /opt/app/oracle/product/11.2.0/db_1
System name:	AIX
Node name:	p720a
Release:	1
Version:	6
Machine:	00C855104C00
Instance name: jtpsoft1
Redo thread mounted by this instance: 1
Oracle process number: 41
Unix process pid: 10551306, image: oracle@p720a (ARC1)


*** 2016-03-31 14:29:21.527
*** SESSION ID:(641.1) 2016-03-31 14:29:21.527
*** CLIENT ID:() 2016-03-31 14:29:21.527
*** SERVICE NAME:(SYS$BACKGROUND) 2016-03-31 14:29:21.527
*** MODULE NAME:() 2016-03-31 14:29:21.527
*** ACTION NAME:() 2016-03-31 14:29:21.527
 

*** TRACE FILE RECREATED AFTER BEING REMOVED ***

WARNING:failed xlate 2 
WARNING: requested mirror side 1 of virtual extent 0 logical extent 0 offset 16384 is not allocated; I/O request failed
WARNING:failed xlate 2 
WARNING: requested mirror side 2 of virtual extent 0 logical extent 1 offset 16384 is not allocated; I/O request failed
WARNING:failed xlate 2 
WARNING: requested mirror side 3 of virtual extent 0 logical extent 2 offset 16384 is not allocated; I/O request failed
DDE rules only execution for: ORA 202
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
Successfully dispatched
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (SUCCESS, 0 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----

*** 2016-03-31 14:29:21.533
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=1, mask=0x0)
----- Error Stack Dump -----
ORA-00202: control file: '+GJJ_DG/jtpsoft/controlfile/current.260.841581789'
ORA-15079: ASM file is closed
ORA-15079: ASM file is closed
ORA-15079: ASM file is closed
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+40        bl       0000000109B3BDC0     FFFFFFFFFFF6D60 ? 000000001 ?
                                                   000000001 ? 000000000 ?
                                                   000000000 ? 000000001 ?
                                                   000000001 ? 000000000 ?
ksedst1()+112        call     skdstdst()           FFFFFFFFFFF6E38 ? 000002004 ?
                                                   110720F48 ? 10A6D6A74 ?
                                                   10A6D5F50 ? FFFFFFFFFFF7190 ?
                                                   FFFFFFFFFFF6F40 ?
                                                   2050033FFFF6E18 ?
ksedst()+40          call     ksedst1()            10A6D6A68 ? 7000000000291 ?
                                                   10A6D6A3C ? B000000000000 ?
                                                   10A6D5F50 ? 000000000 ?
                                                   400000000 ? 1D25C0C474A0C ?
dbkedDefDump()+1516  call     ksedst()             000000001 ? FFFFFFFFFFF72F0 ?
                                                   000000001 ? 000000000 ?
                                                   000000003 ? 00001ED50 ?
                                                   000000000 ? 000000000 ?
ksedmp()+72          call     dbkedDefDump()       1A006FE88 ? 9001000A0000E00 ?
                                                   FFFFFFFFFFF7CD0 ? 004C3009B ?
                                                   000000000 ? 000000001 ?
                                                   000000001 ? 0000000CC ?
kcccsi()+416         call     ksedmp()             CA000000CA ? 000000001 ?
                                                   000000031 ? 7000104BCDE1D56 ?
                                                   000000000 ? 7000104C8CC4478 ?
                                                   000000006 ? 000000030 ?
kccrhd()+1488        call     kcccsi()             000000020 ? 000000000 ?
                                                   110738B50 ? 10A344FE0 ?
                                                   FFFFFFFFFFF7FA0 ? 004C3009B ?
                                                   102D0B800 ? 11298BF30 ?
kccgft_refresh_hdr(  call     kccrhd()             FFFFFFFFFFF8100 ? 004C3009B ?
)+124                                              11297B0B0 ? FFFFFFFFFFFA5E0 ?
                                                   6B736C636D617266 ?
                                                   112939850 ? 10A38F248 ?
                                                   10A38F258 ?
kcc_noenq_refresh()  call     kccgft_refresh_hdr(  110E9DA20 ? 000000004 ?
+40                           )                    000000018 ? 11012D7C8 ?
                                                   000000021 ? 00000374E ?
                                                   000000001 ? FFFFFFFFFFF81F0 ?
kccocx_noenq()+292   call     kcc_noenq_refresh()  20B6CECD0 ? 000000102 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFFFFF8070 ?
                                                   44244288000537D0 ?
                                                   9000000000051AC ?
                                                   FFFFFFFFFFF8030 ?
...省略...

上面的内容也就是不能对磁盘组执行IO操作,并没有得到更有用的信息。如是查看asm的alert.log文件,可以看到如下信息:

Thu Mar 31 14:29:18 2016
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 0 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 1 in group 3.
WARNING: Waited 15 secs for write IO to PST disk 2 in group 3.
Thu Mar 31 14:29:18 2016
NOTE: process _b000_+asm1 (22544528) initiating offline of disk 0.3043853018 (GJJ_DG_0000) with mask 0x7e in group 3
NOTE: process _b000_+asm1 (22544528) initiating offline of disk 1.3043853019 (GJJ_DG_0001) with mask 0x7e in group 3
NOTE: process _b000_+asm1 (22544528) initiating offline of disk 2.3043853020 (GJJ_DG_0002) with mask 0x7e in group 3
NOTE: checking PST: grp = 3
GMON checking disk modes for group 3 at 10 for pid 30, osid 22544528
ERROR: no read quorum in group: required 2, found 0 disks
NOTE: checking PST for grp 3 done.
NOTE: initiating PST update: grp = 3, dsk = 0/0xb56d82da, mask = 0x6a, op = clear
NOTE: initiating PST update: grp = 3, dsk = 1/0xb56d82db, mask = 0x6a, op = clear
NOTE: initiating PST update: grp = 3, dsk = 2/0xb56d82dc, mask = 0x6a, op = clear
GMON updating disk modes for group 3 at 11 for pid 30, osid 22544528
ERROR: no read quorum in group: required 2, found 0 disks
Thu Mar 31 14:29:18 2016
NOTE: cache dismounting (not clean) group 3/0x46ED7239 (GJJ_DG) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 30277648, image: oracle@p720a (B001)
Thu Mar 31 14:29:18 2016
NOTE: halting all I/Os to diskgroup 3 (GJJ_DG)
Thu Mar 31 14:29:18 2016
NOTE: LGWR doing non-clean dismount of group 3 (GJJ_DG)
NOTE: LGWR sync ABA=22.4306 last written ABA 22.4306
WARNING: Offline for disk GJJ_DG_0000 in mode 0x7f failed.
WARNING: Offline for disk GJJ_DG_0001 in mode 0x7f failed.
WARNING: Offline for disk GJJ_DG_0002 in mode 0x7f failed.
Thu Mar 31 14:29:18 2016
kjbdomdet send to inst 2
detach from dom 3, sending detach message to inst 2
Thu Mar 31 14:29:18 2016
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 20)
 Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 3 invalid = TRUE 
 305 GCS resources traversed, 0 cancelled
Thu Mar 31 14:29:18 2016
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Dirty Detach Reconfiguration complete
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "GJJ_DG" is being dismounted
Thu Mar 31 14:29:18 2016
WARNING: dirty detached from domain 3
NOTE: cache dismounted group 3/0x46ED7239 (GJJ_DG) 
NOTE: cache deleting context for group GJJ_DG 3/0x46ed7239
SQL> alter diskgroup GJJ_DG dismount force /* ASM SERVER:1189966393 */ 
Thu Mar 31 14:29:18 2016
NOTE: No asm libraries found in the system
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
Thu Mar 31 14:29:27 2016
NOTE: ASM client jtpsoft1:jtpsoft disconnected unexpectedly.
NOTE: check client alert log.
NOTE: Trace records dumped in trace file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_14024932.trc
ASM Health Checker found 1 new failures
Thu Mar 31 14:29:30 2016
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
ERROR: ORA-15130 in COD recovery for diskgroup 3/0x46ed7239 (GJJ_DG)
ERROR: ORA-15130 thrown in RBAL for group number 3
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_12386526.trc:
ORA-15130: diskgroup "" is being dismounted
Thu Mar 31 14:29:37 2016
GMON dismounting group 3 at 12 for pid 31, osid 30277648
Thu Mar 31 14:29:37 2016
NOTE: Disk GJJ_DG_0000 in mode 0x7f marked for de-assignment
NOTE: Disk GJJ_DG_0001 in mode 0x7f marked for de-assignment
NOTE: Disk GJJ_DG_0002 in mode 0x7f marked for de-assignment
SUCCESS: diskgroup GJJ_DG was dismounted
SUCCESS: alter diskgroup GJJ_DG dismount force /* ASM SERVER:1189966393 */
SUCCESS: ASM-initiated MANDATORY DISMOUNT of group GJJ_DG
Thu Mar 31 14:29:37 2016
NOTE: diskgroup resource ora.GJJ_DG.dg is offline
Thu Mar 31 14:29:44 2016
Errors in file /opt/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15073504.trc:
ORA-17503: ksfdopn:2 Failed to open file +GJJ_DG/jtpsoft/spfilejtpsoft.ora
ORA-15001: diskgroup "GJJ_DG" does not exist or is not mounted
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576    614400   606078           204800          200639              0             N  BACK_DG/
MOUNTED  NORMAL  N         512   4096  1048576     10240     9310             2048            3631              0             Y  CRS_DG/
MOUNTED  NORMAL  N         512   4096  1048576   1228800   858311           409600          224355              0             N  GJJ_DG/

从上面的信息’Thu Mar 31 14:29:18 2016’与’WARNING: Waited 15 secs for write IO to PST disk 1 in group 3’可知在2016-03-31 14:29:18这个时间点,由于对ASM正常或高冗余磁盘所执行的ASM PST心跳检测出现了延迟,而且延迟时间超过了15秒,因此ASM实例dismount了ASM磁盘组。这种心跳方式会忽略对外部冗余磁盘组的检测,在ASM PST重新验证之前ASM实例会停止执行更多的PST心跳检测,但PST心跳延迟不会dismount外部冗余磁盘组。可能出现这种情况有以下几种原因:
1.由于多路径设备的某些物理路径脱机或丢失
2.多路径执行路径故障转移
3.服务器负载或存储/多路径/操作系统的维护,但这种情况可以排除因为负载不高,也没进行任何维护操作。

那么在出现这种情况时,可以通过以下几种方式来解决:
1.检查操作系统和存储对磁盘管理的响应时间
2.尽可能把磁盘的响应时间维持在15秒以下,但这将处于了操作系统,多路径软件与内核参数等多种因素,需要一一排查。
3.如果不能保证磁盘的响应时间维持在15秒以下,那么可以对ASM实例设置隐含参数_asm_hbetaiowait,对于这个参值数如果遇到bug 17274537可以设置120,并且这个bug在12.1.0.2中被修复了。

Thu Mar 31 14:30:05 2016
SQL> ALTER DISKGROUP GJJ_DG MOUNT  /* asm agent *//* {0:23:23972} */ 
NOTE: cache registered group GJJ_DG number=3 incarn=0x46ed72a8
NOTE: cache began mount (not first) of group GJJ_DG number=3 incarn=0x46ed72a8
NOTE: Assigning number (3,0) to disk (/dev/rhdiskpower5)
NOTE: Assigning number (3,1) to disk (/dev/rhdiskpower6)
NOTE: Assigning number (3,2) to disk (/dev/rhdiskpower7)
Thu Mar 31 14:30:05 2016
GMON querying group 3 at 14 for pid 27, osid 29163580
NOTE: cache opening disk 0 of grp 3: GJJ_DG_0000 path:/dev/rhdiskpower5
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 3: GJJ_DG_0001 path:/dev/rhdiskpower6
NOTE: F1X0 found on disk 1 au 2 fcn 0.0
NOTE: cache opening disk 2 of grp 3: GJJ_DG_0002 path:/dev/rhdiskpower7
NOTE: F1X0 found on disk 2 au 2 fcn 0.0
NOTE: cache mounting (not first) normal redundancy group 3/0x46ED72A8 (GJJ_DG)
Thu Mar 31 14:30:05 2016
kjbdomatt send to inst 2
Thu Mar 31 14:30:05 2016
NOTE: attached to recovery domain 3
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Thu Mar 31 14:30:05 2016
NOTE: LGWR attempting to mount thread 2 for diskgroup 3 (GJJ_DG)
NOTE: LGWR found thread 2 closed at ABA 22.4306
NOTE: LGWR mounted thread 2 for diskgroup 3 (GJJ_DG)
NOTE: LGWR opening thread 2 at fcn 0.383934 ABA 23.4307
NOTE: cache mounting group 3/0x46ED72A8 (GJJ_DG) succeeded
NOTE: cache ending mount (success) of group GJJ_DG number=3 incarn=0x46ed72a8
Thu Mar 31 14:30:05 2016
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
SUCCESS: diskgroup GJJ_DG was mounted
SUCCESS: ALTER DISKGROUP GJJ_DG MOUNT  /* asm agent *//* {0:23:23972} */

从上面的信息可以看到在 2016-03-31 14:30:05这个时间点asm_agent在执行mount磁盘组(GJJ_DG),在14:29:18 dismount磁盘组到14:30:05 mount磁盘之间的时间间隔是47秒。所以等我检查时ASM磁盘组(GJJ_DG)已经mount成功了,于是只能让客户去检查操作系统,多路径软件和光纤。

Oracle 12c DBCA出现PRCR-1079 ORA-12547 CRS-5017

Oracle 12c用dbca创建数据库时出现了PRCR-1079 ORA-12547 CRS-5017不能启动数据库。
12c

因为这里安装了Oracle Restart,所以尝试使用srvctl start database命令来手动启动数据库,但是结果还是一样不能启动。
[grid@oracle12c 12.1]$ srvctl start database -db jycs
PRCR-1079 : Failed to start resource ora.jycs.db
ORA-12547: TNS:lost contact
CRS-5017: The resource action “ora.jycs.db start” encountered the following error:
ORA-12547: TNS:lost contact
. For details refer to “(:CLSN00107:)” in “/u01/app/grid/diag/crs/oracle12c/crs/trace/ohasd_oraagent_grid.trc”.

CRS-2674: Start of ‘ora.jycs.db’ on ‘oracle12c’ failed
ORA-12547: TNS:lost contact
[grid@oracle12c 12.1]$ crs_stat -t
CRS-275: This command is not supported in Oracle Restart environment.

检查grid的GRID_BASE与GRID_HOME目录与oracle的ORACLE_BASE与ORACLE_HOME目录,发现用户与组权限都是对的。
[grid@oracle12c 12.1]$ ls -lrt
total 8
drwxr-x—. 72 grid oinstall 4096 Mar 30 22:50 grid
drwxrwxr-x. 70 oracle oinstall 4096 Mar 31 12:59 db

[grid@oracle12c app]$ ls -lrt
total 8
drwxr-x—. 72 grid oinstall 4096 Mar 30 22:50 grid
drwxrwxr-x. 70 oracle oinstall 4096 Mar 31 12:59 oracle

在有些情况下由于某些不确定的原因会造成oracle二进制文件的权限,由于检查GRID_HOME/bin与ORACLE_HOME/bin目录下的oracle文件的权限。
[oracle@oracle12c bin]$ ls -lrt oracle
-rwxr-x–x. 1 oracle oinstall 291255038 Mar 30 22:47 oracle

[grid@oracle12c bin]$ ls -lrt oracle
-rwxr-x–x. 1 grid oinstall 291255038 Mar 30 22:47 oracle

[oracle@oracle12c bin]$ cd $ORACLE_HOME/bin
[oracle@oracle12c bin]$ chmod 6751 oracle

[grid@oracle12c bin]$ cd $ORACLE_HOME/bin
[grid@oracle12c bin]$ chmod 6751 oracle
[grid@oracle12c bin]$ srvctl start database -db jycs
在对oracle文件修改权限后成功启动。

RMAN跨小版本跨平台与字节序传输表空间

将Linux平台上的源数据库中的tspitr与test表空间传输到AIX平台上。并在源主机上使用目录/u02/transport来存储被转换的数据文件。操作步骤如下:
1.将要被传输的表空间tspitr与test设置为只读

SQL> alter tablespace tspitr read only;

Tablespace altered.

SQL> alter tablespace test read only;

Tablespace altered.

2.检查源平台与目标平台信息支不支持传输操作
数据库所支持的平台信息:

SQL> select platform_name,endian_format from v$transportable_platform;

PLATFORM_NAME                                                                    ENDIAN_FORMAT
-------------------------------------------------------------------------------- --------------
Solaris[tm] OE (32-bit)                                                          Big
Solaris[tm] OE (64-bit)                                                          Big
Microsoft Windows IA (32-bit)                                                    Little
Linux IA (32-bit)                                                                Little
AIX-Based Systems (64-bit)                                                       Big
HP-UX (64-bit)                                                                   Big
HP Tru64 UNIX                                                                    Little
HP-UX IA (64-bit)                                                                Big
Linux IA (64-bit)                                                                Little
HP Open VMS                                                                      Little
Microsoft Windows IA (64-bit)                                                    Little
IBM zSeries Based Linux                                                          Big
Linux x86 64-bit                                                                 Little
Apple Mac OS                                                                     Big
Microsoft Windows x86 64-bit                                                     Little
Solaris Operating System (x86)                                                   Little
IBM Power Based Linux                                                            Big
Solaris Operating System (x86-64)                                                Little
HP IA Open VMS                                                                   Little

源平台:

SQL> select d.platform_name,endian_format from v$transportable_platform tp,v$database d where tp.platform_name=d.platform_name ;

PLATFORM_NAME                                                                                         ENDIAN_FORMAT
----------------------------------------------------------------------------------------------------- --------------
Linux 64-bit for AMD                                                                                  Little

目标平台


SQL> select d.platform_name,endian_format from v$transportable_platform tp,v$database d where tp.platform_name=d.platform_name ;

PLATFORM_NAME                                                                    ENDIAN_FORMAT
-------------------------------------------------------------------------------- --------------
AIX-Based Systems (64-bit)                                                       Big

3.确认要被传输的表空间是否是自包含表空间(TSPITR,TEST):

SQL> exec sys.dbms_tts.transport_set_check('TSPITR',true);

PL/SQL procedure successfully completed.


SQL> exec sys.dbms_tts.transport_set_check('TEST',true);

PL/SQL procedure successfully completed.

SQL> select * from sys.transport_set_violations;

no rows selected

如果没有行选择,表示该表空间只包含表数据,可以传输。

4.记录表空间传输前表tspitr与test中的记录:


SQL> select count(*) from tspitr.tspitr;

  COUNT(*)
----------
     50315

SQL> select count(*) from test.test;

  COUNT(*)
----------
     50316

5.使用RMAN将源数据库中的表空间tspitr,test转换为目标平台字节序格式,使用format参数来控制被转换后数据文件的文件名和存储目录.

[oracle@oracle11g ~]export NLS_DATE_FORMAT='yyyy-mm-dd hh24:mi:ss'
[oracle@oracle11g ~]rman target/

Recovery Manager: Release 10.2.0.5.0 - Production on Mon Mar 30 10:35:41 2015

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: TEST (DBID=2168949517)


RMAN> convert tablespace "TSPITR","TEST"
2> to platform 'AIX-Based Systems (64-bit)'
3> format ='/u02/transport/%U';

Starting backup at 2015-03-30 10:37:27
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=141 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00006 name=/u01/app/oracle/oradata/test/tspitr01.dbf
converted datafile=/u02/transport/data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00007 name=/u01/app/oracle/oradata/test/test01.dbf
converted datafile=/u02/transport/data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
Finished backup at 2015-03-30 10:37:45


[oracle@oracle11g transport]$ ls -lrt
total 112776
-rw-r----- 1 oracle oinstall 104865792 Mar 30 10:37 data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8
-rw-r----- 1 oracle oinstall  10493952 Mar 30 10:37 data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo

6.使用导出工具创建传输表空间元数据dump文件

SQL> create or replace directory test_dump as '/u02/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.


[oracle@oracle11g dump_test]$ expdp \'sys/zzh_2046@test as sysdba\' directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST

Export: Release 10.2.0.5.0 - Production on Monday, 30 March, 2015 10:57:37

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "sys/********@test AS SYSDBA" directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /u02/dump_test/tspitr_test.dmp
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at 10:57:55

[oracle@oracle11g dump_test]ls -lrt

-rw-r----- 1 oracle oinstall      960 Mar 30 15:23 tspitr_test.log
-rw-r----- 1 oracle oinstall    90112 Mar 30 15:23 tspitr_test.dmp

7.将转换后存储在/u02/transport目录中的数据文件与导出的元数据文件tspitr_test.dmp传输到目标主机的目录/yb_oradata/transport中

ftp> put data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8
200 PORT command successful.
150 Opening data connection for data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v
8.
226 Transfer complete.
ftp: 发送 104865792 字节,用时 8.86秒 11839.88千字节/秒。
ftp> put data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo
200 PORT command successful.
150 Opening data connection for data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo.

226 Transfer complete.
ftp: 发送 10493952 字节,用时 0.90秒 11659.95千字节/秒。
ftp> put tspitr_test.dmp
200 PORT command successful.
150 Opening data connection for tspitr_test.dmp.
226 Transfer complete.

[IBMP740-1:oracle:/yb_oradata]$ls -lrt
-rwxrwxrwx    1 oracle   dba       104865792 Mar 30 12:42 data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8
-rwxrwxrwx    1 oracle   dba        10493952 Mar 30 12:42 data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo
-rwxrwxrwx    1 oracle   dba           98304 Mar 30 12:42 tspitr_test.dmp

8.将要被传输的表空间附加到目标数据库中

[IBMP740-1:oracle:/yb_oradata]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 30 11:31:47 2015

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create user tspitr identified by "tspitr";

User created.

SQL> grant dba,connect,resource to tspitr;

Grant succeeded.

SQL> create user test identified by "test";

User created.

SQL> grant dba,connect,resource to test;

Grant succeeded.



SQL> create or replace directory test_dump as '/yb_oradata/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.


[IBMP740-1:oracle]$impdp system/system directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo,/yb_oradata/data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8

Import: Release 10.2.0.4.0 - 64bit Production on Monday, 30 March, 2015 11:45:48

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01":  system/******** directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-TEST_I-2168949517_TS-TEST_FNO-7_3bq349vo,/yb_oradata/transport/data_D-TEST_I-2168949517_TS-TSPITR_FNO-6_3aq349v8
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
ORA-39123: Data Pump transportable tablespace job aborted
ORA-00721: changes by release 10.2.0.5.0 cannot be used by release 10.2.0.3.0

Job "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" stopped due to fatal error at 11:45:53

出错原因是因为源数据库的compatible=10.2.0.5.0,目标数据库的compatible=10.2.0.3.0这里在导出使用version=10.2.0.3.0导出元数据后再执行导出也是同样会报这个错误,因为这里不是逻辑导出,而是传输表空间,所以version参数不起作用。所以想修改源数据库的compatible参数为10.2.0.3.0,但在10g以后,compatible参数只能增大不能减少。
SQL> show parameter compatible

NAME TYPE VALUE
———————————— ———– ——————————
compatible string 10.2.0.5.0

SQL> alter system set compatible=’10.2.0.3.0′ scope=spfile;

System altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 167772160 bytes
Fixed Size 1272600 bytes
Variable Size 109053160 bytes
Database Buffers 54525952 bytes
Redo Buffers 2920448 bytes
ORA-00201: control file version 10.2.0.5.0 incompatible with ORACLE version
10.2.0.3.0
ORA-00202: control file: ‘/u01/app/oracle/oradata/test/control01.ctl’

在将源数据库的compatible参数修改为10.2.0.3.0后无法启动数据库。

这里是从10.2.0.5(源数据库的compatible参数为10.2.0.5.0)向10.2.0.4(这里目标数据库中的compatible参数却为10.2.0.3.0,不是10.2.0.4.0)跨平台传输表空间不会成功,也就是从高版本向低版本传输表空间不能成功。但从低版本向高版本传输表空间是可以成功,例如下面的示例:
我们要将Linux平台上的源数据库中的tspitr与test表空间传输到AIX平台上。并在源主机上使用目录/u02/transport来存储被转换的数据文件。操作步骤如下:
1.将要被传输的表空间tspitr与test设置为只读

SQL> alter tablespace tspitr read only;

Tablespace altered.

SQL> alter tablespace test read only;

Tablespace altered.

2.检查源平台与目标平台信息支不支持传输操作
数据库所支持的平台信息:

SQL> select platform_name,endian_format from v$transportable_platform;

PLATFORM_NAME                                                                    ENDIAN_FORMAT
-------------------------------------------------------------------------------- --------------
Solaris[tm] OE (32-bit)                                                          Big
Solaris[tm] OE (64-bit)                                                          Big
Microsoft Windows IA (32-bit)                                                    Little
Linux IA (32-bit)                                                                Little
AIX-Based Systems (64-bit)                                                       Big
HP-UX (64-bit)                                                                   Big
HP Tru64 UNIX                                                                    Little
HP-UX IA (64-bit)                                                                Big
Linux IA (64-bit)                                                                Little
HP Open VMS                                                                      Little
Microsoft Windows IA (64-bit)                                                    Little
IBM zSeries Based Linux                                                          Big
Linux x86 64-bit                                                                 Little
Apple Mac OS                                                                     Big
Microsoft Windows x86 64-bit                                                     Little
Solaris Operating System (x86)                                                   Little
IBM Power Based Linux                                                            Big
Solaris Operating System (x86-64)                                                Little
HP IA Open VMS                                                                   Little

源平台:

SQL> select d.platform_name,endian_format from v$transportable_platform tp,v$database d where tp.platform_name=d.platform_name ;

PLATFORM_NAME                                                                                         ENDIAN_FORMAT
----------------------------------------------------------------------------------------------------- --------------
Linux 64-bit for AMD                                                                                  Little

目标平台


SQL> select d.platform_name,endian_format from v$transportable_platform tp,v$database d where tp.platform_name=d.platform_name ;

PLATFORM_NAME                                                                    ENDIAN_FORMAT
-------------------------------------------------------------------------------- --------------
AIX-Based Systems (64-bit)                                                       Big

3.确认要被传输的表空间是否是自包含表空间(TSPITR,TEST):

SQL> exec sys.dbms_tts.transport_set_check('TSPITR',true);

PL/SQL procedure successfully completed.


SQL> exec sys.dbms_tts.transport_set_check('TEST',true);

PL/SQL procedure successfully completed.

SQL> select * from sys.transport_set_violations;

no rows selected

如果没有行选择,表示该表空间只包含表数据,可以传输。

4.记录表空间传输前表tspitr与test中的记录:


SQL> select count(*) from tspitr.tspitr;

  COUNT(*)
----------
     50315

SQL> select count(*) from test.test;

  COUNT(*)
----------
     50316

5.使用RMAN将源数据库中的表空间tspitr,test转换为目标平台字节序格式,使用format参数来控制被转换后数据文件的文件名和存储目录.

[oracle@weblogic28 ~]$ export NLS_DATE_FORMAT='yyyy-mm-dd hh24:mi:ss'
[oracle@weblogic28 ~]$ rman target/

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Mar 30 15:16:38 2015

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

connected to target database: JYTEST (DBID=3911337604)

RMAN> convert tablespace "TSPITR","TEST"
to platform 'AIX-Based Systems (64-bit)'
2> 3> format ='/u02/transport/%U';

Starting backup at 2015-03-30 15:17:09
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=134 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00006 name=/u01/app/oracle/oradata/jytest/tspitr01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00007 name=/u01/app/oracle/oradata/jytest/test01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
Finished backup at 2015-03-30 15:17:12

[root@weblogic28 transport]# ls -lrt
total 102528
-rw-r----- 1 oracle oinstall 52436992 Mar 30 15:17 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
-rw-r----- 1 oracle oinstall 52436992 Mar 30 15:17 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn

6.使用导出工具创建传输表空间元数据dump文件

SQL> create or replace directory test_dump as '/u02/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.


[oracle@weblogic28 ~]$ expdp \'sys/system as sysdba\' directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST

Export: Release 10.2.0.1.0 - 64bit Production on Monday, 30 March, 2015 15:23:19

Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  'sys/******** AS SYSDBA' directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /u02/transport/tspitr_test.dmp
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at 15:23:29


[root@weblogic28 transport]# ls -lrt
total 102624
-rw-r----- 1 oracle oinstall 52436992 Mar 30 15:17 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
-rw-r----- 1 oracle oinstall 52436992 Mar 30 15:17 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
-rw-r----- 1 oracle oinstall      960 Mar 30 15:23 tspitr_test.log
-rw-r----- 1 oracle oinstall    90112 Mar 30 15:23 tspitr_test.dmp

7.将转换后存储在/u02/transport目录中的数据文件与导出的元数据文件tspitr_test.dmp传输到目标主机的目录/yb_oradata/transport中

ftp> get data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4497 seconds (1.139e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm remote: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
ftp> get data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4518 seconds (1.133e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn remote: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
ftp> get tspitr_test.dmp
200 PORT command successful.
150 Opening BINARY mode data connection for tspitr_test.dmp (90112 bytes).
226 Transfer complete.
90112 bytes received in 0.001825 seconds (4.822e+04 Kbytes/s)
local: tspitr_test.dmp remote: tspitr_test.dmp


[IBMP740-1:oracle:/yb_oradata/transport]$ls -lrt
total 205008
-rwxrwxrwx    1 oracle   dba        52436992 Mar 30 15:06 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm
-rwxrwxrwx    1 oracle   dba        52436992 Mar 30 15:06 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
-rwxrwxrwx    1 oracle   dba           90112 Mar 30 15:07 tspitr_test.dmp

8.将要被传输的表空间附加到目标数据库中

[IBMP740-1:oracle:/yb_oradata]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 30 11:31:47 2015

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create user tspitr identified by "tspitr";

User created.

SQL> grant dba,connect,resource to tspitr;

Grant succeeded.

SQL> create user test identified by "test";

User created.

SQL> grant dba,connect,resource to test;

Grant succeeded.



SQL> create or replace directory test_dump as '/yb_oradata/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.



IBMP740-1:oracle:/yb_oradata/transport]impdp system/system directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
Import: Release 10.2.0.4.0 - 64bit Production on Monday, 30 March, 2015 15:09:38

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01":  system/******** directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_01q34qbm,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_02q34qbn
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at 15:09:42



[IBMP740-1:oracle:/yb_oradata/transport]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 30 16:36:19 2015

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> show parameter compatible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
compatible                           string      10.2.0.3.0
SQL> select count(*) from tspitr.tspitr;

  COUNT(*)
----------
     50315


SQL> select count(*) from test.test;

  COUNT(*)
----------
     50316

可以看到,从10.2.0.1(compatible=10.2.0.1.0)向10.2.0.4(compatible=10.2.0.3.0)跨平台和字节序传输表空间是可以执行成功。

为了从10.2.0.5向10.2.0.4(compatible=10.2.0.3.0)跨平台传输表空间,我这里测试一下将10.2.0.1源数据库升级到 10.2.0.5但compatible设置为10.2.0.1.0或小于目标数据库的compatible=10.2.0.3.0来进行测试。

[oracle@weblogic28 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Mar 30 19:17:59 2015

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select version from v$instance;

VERSION
-----------------
10.2.0.5.0

SQL> show parameter compatible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
compatible                           string      10.2.0.1.0

将源数据库从10.2.0.1升级到10.2.0.5后compatible参数还是为10.2.0.1.0

使用RMAN将源数据库中的表空间tspitr,test转换为目标平台字节序格式,使用format参数来控制被转换后数据文件的文件名和存储目录.

[oracle@weblogic28 ~]$ export NLS_DATE_FORMAT='yyyy-mm-dd hh24:mi:ss'
[oracle@weblogic28 ~]$ rman target/

Recovery Manager: Release 10.2.0.5.0 - Production on Mon Mar 30 19:21:39 2015

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: JYTEST (DBID=3911337604)

RMAN> convert tablespace "TSPITR","TEST"
2> to platform 'AIX-Based Systems (64-bit)'
3> format ='/u02/transport/%U';

Starting backup at 2015-03-30 19:22:01
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=148 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00006 name=/u01/app/oracle/oradata/jytest/tspitr01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:03
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00007 name=/u01/app/oracle/oradata/jytest/test01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:03
Finished backup at 2015-03-30 19:22:08

使用导出工具创建传输表空间元数据dump文件

SQL> create or replace directory test_dump as '/u02/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.


[oracle@weblogic28 ~]$ expdp \'sys/system as sysdba\' directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST

Export: Release 10.2.0.5.0 - 64bit Production on Monday, 30 March, 2015 19:23:47

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "sys/******** AS SYSDBA" directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /u02/transport/tspitr_test.dmp
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at 19:24:00

[oracle@weblogic28 transport]$ ls -lrt
total 102624
-rw-r----- 1 oracle oinstall 52436992 Mar 30 19:22 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
-rw-r----- 1 oracle oinstall 52436992 Mar 30 19:22 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt
-rw-r--r-- 1 oracle oinstall      986 Mar 30 19:24 tspitr_test.log
-rw-r----- 1 oracle oinstall    90112 Mar 30 19:24 tspitr_test.dmp

将转换后存储在/u02/transport目录中的数据文件与导出的元数据文件tspitr_test.dmp传输到目标主机的目录/yb_oradata/transport中

ftp> get data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4462 seconds (1.148e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt remote: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt
ftp> get data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4458 seconds (1.149e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq remote: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
ftp> get tspitr_test.dmp
200 PORT command successful.
150 Opening BINARY mode data connection for tspitr_test.dmp (90112 bytes).
226 Transfer complete.
90112 bytes received in 0.00183 seconds (4.809e+04 Kbytes/s)
local: tspitr_test.dmp remote: tspitr_test.dmp

[IBMP740-1:oracle:/yb_oradata/transport]$rm tspitr_test.log
[IBMP740-1:oracle:/yb_oradata/transport]$ls -lrt
total 205008
-rw-r--r--    1 oracle   dba        52436992 Mar 30 19:08 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt
-rw-r--r--    1 oracle   dba        52436992 Mar 30 19:08 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
-rw-r--r--    1 oracle   dba           90112 Mar 30 19:08 tspitr_test.dmp

将要被传输的表空间附加到目标数据库中

[IBMP740-1:oracle:/yb_oradata]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 30 11:31:47 2015

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create user tspitr identified by "tspitr";

User created.

SQL> grant dba,connect,resource to tspitr;

Grant succeeded.

SQL> create user test identified by "test";

User created.

SQL> grant dba,connect,resource to test;

Grant succeeded.



SQL> create or replace directory test_dump as '/yb_oradata/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.



[IBMP740-1:oracle:/yb_oradata/transport]$impdp system/system directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
Import: Release 10.2.0.4.0 - 64bit Production on Monday, 30 March, 2015 19:12:27

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01":  system/******** directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_04q358mt,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_03q358mq
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at 19:12:33

SQL> select count(*) from tspitr.tspitr;

  COUNT(*)
----------
     50315

SQL> select count(*) from test.test;

  COUNT(*)
----------
     50316

可以看到执行成功。虽然源数据库是10.2.0.5,但其compatible为10.2.0.1.0,而目标数据库是10.2.0.4,其compatible参数为10.2.0.3.0,传输表空间是能成功执行的。

下面将源数据库的compatible修改为10.2.0.3.0(与目标数据库的compatible一样)

[oracle@weblogic28 dbs]$ vi initjytest.ora
省....
*.compatible='10.2.0.3.0'

SQL> startup pfile='/u01/app/oracle/product/10.2.0/db/dbs/initjytest.ora';
ORACLE instance started.

Total System Global Area 1073741824 bytes
Fixed Size                  2101736 bytes
Variable Size             335547928 bytes
Database Buffers          729808896 bytes
Redo Buffers                6283264 bytes
Database mounted.
Database opened.
SQL> show parameter compatible

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
compatible                           string      10.2.0.3.0

使用RMAN将源数据库中的表空间tspitr,test转换为目标平台字节序格式,使用format参数来控制被转换后数据文件的文件名和存储目录.

[oracle@weblogic28 dbs]$ rman target/

Recovery Manager: Release 10.2.0.5.0 - Production on Mon Mar 30 19:47:07 2015

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

connected to target database: JYTEST (DBID=3911337604)


RMAN> convert tablespace "TSPITR","TEST"
2> to platform 'AIX-Based Systems (64-bit)'
3> format ='/u02/transport/%U';

Starting backup at 2015-03-30 19:47:28
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=141 devtype=DISK
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00006 name=/u01/app/oracle/oradata/jytest/tspitr01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
channel ORA_DISK_1: starting datafile conversion
input datafile fno=00007 name=/u01/app/oracle/oradata/jytest/test01.dbf
converted datafile=/u02/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i
channel ORA_DISK_1: datafile conversion complete, elapsed time: 00:00:01
Finished backup at 2015-03-30 19:47:31

使用导出工具创建传输表空间元数据dump文件

[oracle@weblogic28 dbs]$ expdp \'sys/system as sysdba\' directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST

Export: Release 10.2.0.5.0 - 64bit Production on Monday, 30 March, 2015 19:48:36

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "sys/******** AS SYSDBA" directory=test_dump dumpfile=tspitr_test.dmp logfile=tspitr_test.log transport_tablespaces=TSPITR,TEST
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /u02/transport/tspitr_test.dmp
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at 19:48:50


[oracle@weblogic28 transport]$ ls -lrt
total 102624
-rw-r----- 1 oracle oinstall 52436992 Mar 30 19:47 data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h
-rw-r----- 1 oracle oinstall 52436992 Mar 30 19:47 data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i
-rw-r--r-- 1 oracle oinstall      986 Mar 30 19:48 tspitr_test.log
-rw-r----- 1 oracle oinstall    90112 Mar 30 19:48 tspitr_test.dmp

将转换后存储在/u02/transport目录中的数据文件与导出的元数据文件tspitr_test.dmp传输到目标主机的目录/yb_oradata/transport中

ftp> get data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4461 seconds (1.148e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h remote: data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h
ftp> get data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i
200 PORT command successful.
150 Opening BINARY mode data connection for data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i (52436992 bytes).
226 Transfer complete.
52436992 bytes received in 0.4481 seconds (1.143e+05 Kbytes/s)
local: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i remote: data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i
ftp> get tspitr_test.dmp
200 PORT command successful.
150 Opening BINARY mode data connection for tspitr_test.dmp (90112 bytes).
226 Transfer complete.
90112 bytes received in 0.001821 seconds (4.833e+04 Kbytes/s)
local: tspitr_test.dmp remote: tspitr_test.dmp

将要被传输的表空间附加到目标数据库中

[IBMP740-1:oracle:/yb_oradata]$sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 30 11:31:47 2015

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create user tspitr identified by "tspitr";

User created.

SQL> grant dba,connect,resource to tspitr;

Grant succeeded.

SQL> create user test identified by "test";

User created.

SQL> grant dba,connect,resource to test;

Grant succeeded.



SQL> create or replace directory test_dump as '/yb_oradata/transport';

Directory created.

SQL> grant read,write on directory test_dump to public;

Grant succeeded.



[IBMP740-1:oracle:/yb_oradata/transport]$impdp system/system directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i


Import: Release 10.2.0.4.0 - 64bit Production on Monday, 30 March, 2015 19:32:30

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01":  system/******** directory=test_dump dumpfile=tspitr_test.dmp transport_datafiles=/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TSPITR_FNO-6_05q35a6h,/yb_oradata/transport/data_D-JYTEST_I-3911337604_TS-TEST_FNO-7_06q35a6i
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYSTEM"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at 19:32:32

可以看到,将源数据库10.2.0.5的compatible设置为10.2.0.3.0与目标数据库10.2.0.4的compatible一样,执行传输表空间是可以执行成功的。

之前使用pfile参数文件参源数据库的compatible参数从10.2.0.1.0修改为10.2.0.3.0了,现在将其修改为10.2.0.1.0,并重新使用pfile文件启动源数据库会报错。

[oracle@weblogic28 dbs]$ vi initjytest.ora
省....
*.compatible='10.2.0.1.0'

SQL> startup pfile='/u01/app/oracle/product/10.2.0/db/dbs/initjytest.ora';
ORACLE instance started.

Total System Global Area 1073741824 bytes
Fixed Size                  2101736 bytes
Variable Size             335547928 bytes
Database Buffers          729808896 bytes
Redo Buffers                6283264 bytes
ORA-00201: control file version 10.2.0.3.0 incompatible with ORACLE version
10.2.0.1.0
ORA-00202: control file: '/u01/app/oracle/oradata/jytest/control01.ctl'

这也就证明从oracle 10g开始,compatible参数只能增大,不能缩小。

总结:
1.跨小版本执行跨平台与字节序的表空间传输时,源数据库的compatible参数必须小于或等于目标数据库的compatible参数
2.当源数据库的compatible参数大于目标数据库的compatible参数时,要执行跨平台与字节序的表空间传输只有将目标数据库升级为与源数据库相同的版本号且compatible参数要大于或等于源数据库的compatible参数。
3.从oracle 10g开始,compatible参数只能增大,不能缩小。

nls_timestamp_format参数在11.2中的变化

nls_timestamp_format参数在11.2.0.2及以后版本通过pfile或spfile或都不能进行修改了,在会话级还是能进行修改,Oracle提供若干NLS参数定制数据库和客户机以适应本地格式,例如有NLS_LANGUAGE,NLS_DATE_FORMAT,NLS_CALENDER等,可以通过查询以下数据字典或v$视图查看。
NLS_DATABASE_PARAMETERS:显示数据库当前NLS参数取值,包括数据库字符集取值
NLS_SESSION_PARAMETERS:显示由NLS_LANG设置的参数,或经过alter session改变后的参数值(不包括由NLS_LANG设置的客户端字符集)
NLS_INSTANCE_PARAMETE:显示由参数文件init.ora定义的参数
V$NLS_PARAMETERS:显示数据库当前NLS参数取值

使用下列方法可以修改NLS参数
(1)更新props$
(2)修改实例启动时使用的初始化参数文件
(3)修改环境变量NLS_LANG
(4)使用ALTER SESSION语句,在oracle会话中修改
(5)使用某些SQL函数

NLS作用优先级别:Sql function > alter session >环境变量或注册表>参数文件>数据库默认参数。如果会话级与实例级别和数据库级别参数不一致,就会以会话级的为准,因为会话级别的参数优先级高于实例级别和数据库级别的参数。客户端的环境变量或注册表会对会话参数产生影响,比如NLS_LANG参数
NLS_LANG==language_territory.client character set
Language:显示oracle消息,校验,日期命名
Territory:指定默认日期、数字、货币等格式
Client character set:指定客户端将使用的字符集
例如:NLS_LANG=AMERICAN_AMERICA.US7ASCII
AMERICAN是语言,AMERICA是地区,US7ASCII是客户端字符集,那么就会影响 nls_date_format,nls_timestamp_foramt等日期的格式。

下面通过操作来进行验证。
在会话级进行修改

SQL> select version from v$instance;

VERSION
-----------------
11.2.0.4.0

SQL> select a.name,a.VALUE from v$parameter a where a.name='nls_timestamp_format';

NAME                           VALUE
------------------------------ ----------------------------------------
nls_timestamp_format           DD-MON-RR HH.MI.SSXFF AM

将nls_timestamp_format参数设置成yyyy-mm-dd hh24:mi:ssxff格式

SQL> alter session set NLS_TIMESTAMP_FORMAT='yyyy-mm-dd hh24:mi:ssxff';

Session altered.

查看是否在会话级修改成功

SQL>  select * from NLS_SESSION_PARAMETERS;

PARAMETER                                                    VALUE
------------------------------------------------------------ ----------------------------------------
NLS_LANGUAGE                                                 AMERICAN
NLS_TERRITORY                                                AMERICA
NLS_CURRENCY                                                 $
NLS_ISO_CURRENCY                                             AMERICA
NLS_NUMERIC_CHARACTERS                                       .,
NLS_CALENDAR                                                 GREGORIAN
NLS_DATE_FORMAT                                              DD-MON-RR
NLS_DATE_LANGUAGE                                            AMERICAN
NLS_SORT                                                     BINARY
NLS_TIME_FORMAT                                              HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT                                         yyyy-mm-dd hh24:mi:ssxff

PARAMETER                                                    VALUE
------------------------------------------------------------ ----------------------------------------
NLS_TIME_TZ_FORMAT                                           HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT                                      DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY                                            $
NLS_COMP                                                     BINARY
NLS_LENGTH_SEMANTICS                                         BYTE
NLS_NCHAR_CONV_EXCP                                          FALSE

17 rows selected.

SQL> show parameter nls_timestamp_format

NAME                                 TYPE                           VALUE
------------------------------------ ------------------------------ ------------------------------
nls_timestamp_format                 string                         yyyy-mm-dd hh24:mi:ssxff

从上面的查询可以看到在会话级nls_timestamp_format参数设置成了yyyy-mm-dd hh24:mi:ssxff格式。

下面通过alter system语句与spfile参数文件来在系统级别进行修改

SQL> alter system set NLS_TIMESTAMP_FORMAT='yyyy-mm-dd hh24:mi:ssxff' scope=spfile;

System altered.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             306185532 bytes
Database Buffers          356515840 bytes
Redo Buffers                5513216 bytes
Database mounted.
Database opened.

重新启动实例后,查看nls_timestamp_format参数的值没有被修改

SQL> show parameter nls_timestamp_format

NAME                                 TYPE                           VALUE
------------------------------------ ------------------------------ ------------------------------
nls_timestamp_format                 string                         DD-MON-RR HH.MI.SSXFF AM

通过pfile来检查spfile文件中nls_timestamp_format参数的设置可以发现,alter system对nls_timestamp_format的修改并没有存储到spfile文件中。

SQL> create pfile='$ORACLE_BASE/inittemp1.ora' from spfile; 

File created.
[oracle@rac2 oracle]$ cat inittemp.ora
rac2.__db_cache_size=419430400
rac1.__db_cache_size=419430400
rac1.__java_pool_size=4194304
rac2.__java_pool_size=4194304
rac2.__large_pool_size=8388608
rac1.__large_pool_size=8388608
rac2.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
rac1.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
rac1.__pga_aggregate_target=125829120
rac2.__pga_aggregate_target=125829120
rac1.__sga_target=671088640
rac2.__sga_target=671088640
rac1.__shared_io_pool_size=0
rac2.__shared_io_pool_size=0
rac1.__shared_pool_size=230686720
rac2.__shared_pool_size=230686720
rac1.__streams_pool_size=0
rac2.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/rac/adump'
*.audit_trail='db'
*.cluster_database=true
*.compatible='11.2.0.4.0'
*.control_files='+DATADG/rac/controlfile/current.265.864929297'
*.db_block_size=8192
*.db_create_file_dest='+DATADG'
*.db_domain=''
*.db_name='rac'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=racXDB)'
rac2.instance_number=2
rac1.instance_number=1
*.open_cursors=300
*.pga_aggregate_target=125829120
*.processes=150
*.remote_listener='rac-scan:1521'
*.remote_login_passwordfile='exclusive'
*.sga_max_size=671088640
*.sga_target=671088640
rac2.thread=2
rac1.thread=1
#rac1.undo_tablespace='UNDOTBS1'
#rac2.undo_tablespace='UNDOTBS2'

手工向pfile参数文件中增加nls_timestamp_format参数设置

[oracle@rac2 oracle]$ cat inittemp.ora
rac2.__db_cache_size=419430400
rac1.__db_cache_size=419430400
rac1.__java_pool_size=4194304
rac2.__java_pool_size=4194304
rac2.__large_pool_size=8388608
rac1.__large_pool_size=8388608
rac2.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
rac1.__oracle_base='/u01/app/oracle'#ORACLE_BASE set from environment
rac1.__pga_aggregate_target=125829120
rac2.__pga_aggregate_target=125829120
rac1.__sga_target=671088640
rac2.__sga_target=671088640
rac1.__shared_io_pool_size=0
rac2.__shared_io_pool_size=0
rac1.__shared_pool_size=230686720
rac2.__shared_pool_size=230686720
rac1.__streams_pool_size=0
rac2.__streams_pool_size=0
*.audit_file_dest='/u01/app/oracle/admin/rac/adump'
*.audit_trail='db'
*.cluster_database=true
*.compatible='11.2.0.4.0'
*.control_files='+DATADG/rac/controlfile/current.265.864929297'
*.db_block_size=8192
*.db_create_file_dest='+DATADG'
*.db_domain=''
*.db_name='rac'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=racXDB)'
rac2.instance_number=2
rac1.instance_number=1
*.open_cursors=300
*.pga_aggregate_target=125829120
*.processes=150
*.remote_listener='rac-scan:1521'
*.remote_login_passwordfile='exclusive'
*.sga_max_size=671088640
*.sga_target=671088640
rac2.thread=2
rac1.thread=1
nls_timestamp_format='yyyy-mm-dd hh24:mi:ssxff'

使用增加nls_timestamp_format参数的pfile文件来启动数据库,报错不能启动数据库。

SQL> startup pfile='$ORACLE_BASE/inittemp.ora'
ORACLE instance started.

Total System Global Area  669581312 bytes
Fixed Size                  1366724 bytes
Variable Size             306185532 bytes
Database Buffers          356515840 bytes
Redo Buffers                5513216 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-30012: undo tablespace 'hh24:mi:ssxff'' does not exist or of wrong type
Process ID: 7669
Session ID: 1 Serial number: 5

尝试通过props$来进行修改

SQL> update sys.props$ set value$='yyyy-mm-dd hh24:mi:ssxff' where name='NLS_TIMESTAMP_FORMAT'; 

1 row updated.

SQL> commit;

Commit complete.

SQL> select name, value$ from sys.props$ where name='NLS_TIMESTAMP_FORMAT';

NAME
------------------------------
VALUE$
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff

修改之后查看数据库级别nls_timestamp_format参数值已经修改成功

SQL> select parameter,value from nls_database_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff

修改之后查看实例级别nls_timestamp_format参数值已经修改成功

SQL> select parameter,value from nls_instance_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff

修改之后查看会话级别nls_timestamp_format参数值没有修改成功

SQL> select parameter,value from nls_session_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
DD-MON-RR HH.MI.SSXFF AM

查看客户端环境变量nls_lang= AMERICAN_AMERICA.ZHS16GBK,所以在会话级nls_timestamp_format的格式仍然为DD-MON-RR HH.MI.SSXFF AM

[oracle@rac2 ~]$ vi .bash_profile
# .bash_profile

# Get the aliases and functions
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH

TEMP=/u01/tmp
TMPDIR=/u01/tmp
export TEMP TMPDIR
export LD_ASSUME_KERNEL=2.6.9
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db
export ORACLE_SID=rac2
export ORACLE_UNQNAME=rac
export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
export PATH=$PATH:$ORACLE_HOME/bin
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
export CLASSPATH

修改环境变量,这里选择去掉nls_lang

[oracle@rac2 ~]$ vi .bash_profile
# .bash_profile

# Get the aliases and functions
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH

TEMP=/u01/tmp
TMPDIR=/u01/tmp
export TEMP TMPDIR
export LD_ASSUME_KERNEL=2.6.9
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db
export ORACLE_SID=rac2
export ORACLE_UNQNAME=rac
#export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
export PATH=$PATH:$ORACLE_HOME/bin
CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
export CLASSPATH

再次通过客户端连接到数据库查看nls_timestamp_foramt参数

[oracle@rac2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Tue Aug 18 09:01:23 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

数据库级别nls_timestamp_foramt参数为修改后的yyyy-mm-dd hh24:mi:ssxff

SQL> select parameter,value from nls_database_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff

实例级别nls_timestamp_foramt参数为修改后的yyyy-mm-dd hh24:mi:ssxff

SQL> select parameter,value from nls_instance_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff

会话级别nls_timestamp_foramt参数为修改后的yyyy-mm-dd hh24:mi:ssxff,说明nls_lang环境变量没有对会话级产生影响了。

SQL>  select parameter,value from nls_session_parameters where parameter='NLS_TIMESTAMP_FORMAT';

PARAMETER
------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
NLS_TIMESTAMP_FORMAT
yyyy-mm-dd hh24:mi:ssxff


SQL> show parameter nls

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_calendar                         string
nls_comp                             string      BINARY
nls_currency                         string
nls_date_format                      string
nls_date_language                    string
nls_dual_currency                    string
nls_iso_currency                     string
nls_language                         string      AMERICAN
nls_length_semantics                 string      BYTE
nls_nchar_conv_excp                  string      FALSE
nls_numeric_characters               string

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_sort                             string
nls_territory                        string      AMERICA
nls_time_format                      string
nls_time_tz_format                   string
nls_timestamp_format                 string      yyyy-mm-dd hh24:mi:ssxff
nls_timestamp_tz_format              string

SQL> select name,value from v$parameter where name='nls_timestamp_format';

NAME
--------------------------------------------------------------------------------
VALUE
--------------------------------------------------------------------------------
nls_timestamp_format
yyyy-mm-dd hh24:mi:ssxff

用远程客户机(windows)用plsql连接需要设置环境变量nls_timestamp_foramt,将nls_timestamp_format设置成与props$中nls_timestamp_format参数相同的参数值,以确保会话级与实例级和数据库级一致,避免不一致所产生的问题。在设置完环境变后,用plsql连接数据库来进行检查

Connected to Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 
Connected as SYS
 
SQL> show parameter nls
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_calendar                         string      GREGORIAN
nls_comp                             string      BINARY
nls_currency                         string      $
nls_date_format                      string      DD-MON-RR
nls_date_language                    string      AMERICAN
nls_dual_currency                    string      $
nls_iso_currency                     string      AMERICA
nls_language                         string      AMERICAN
nls_length_semantics                 string      BYTE
nls_nchar_conv_excp                  string      FALSE
nls_numeric_characters               string      .,
nls_sort                             string      BINARY
nls_territory                        string      AMERICA
nls_time_format                      string      HH.MI.SSXFF AM
nls_time_tz_format                   string      HH.MI.SSXFF AM TZR
nls_timestamp_format                 string      yyyy-mm-dd hh24:mi:ssxff
nls_timestamp_tz_format              string      DD-MON-RR HH.MI.SSXFF AM TZR
 
SQL> select name,value from v$parameter where name='nls_timestamp_format';
 
NAME                                                                             VALUE
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
nls_timestamp_format                                                             yyyy-mm-dd hh24:mi:ssxff
 
SQL> select to_timestamp('2015-08-18 09:39:01.19','yyyy-mm-dd hh24:mi:ssxff') from dual;
 
TO_TIMESTAMP('2015-08-1809:39:
--------------------------------------------------------------------------------
2015-08-18 09:39:01.190000000

总结:
Nls_timestamp_format在数在11.2.0.2及以后版本不能通过pfile或spfile来在实例级别进行修改。要想在实例级修改nls_timestamp_format可以通过props$来进行修改,这是在数据库级别进行修改,但由于不能在实例级别设置,那么实例级会继承数据库级别的设置,因此只要在数据库级别设置了nls_timestamp_format,那么实例级别的nls_timestamp_format也就等于设置了。

客户端的环境变量(UNIX/Linux)nls_lang会在会话级别影响nls_timestamp_format。

客户端的环境变量(windows)nls_timestamp_format会在会话级别影响nls_timestamp_format。

 

_awr_sql_child_limit是否能控制awr记录sql执行次数的问题

在ACOUG中国行长沙站的活动上分亨了一个关于high version count的主题,在活动中老盖提到在AWR报告的SQL ordered by Version Count部分,当SQL的version count大于等于200时就不会记录SQL的Executions,但是今天在优化SQL时仔细观察了一下情况并不是这样。以下是10.2.0.4的一个AWR报告的SQL ordered by Version Count
_awr_sql_child_limit_2

下面是11.2.0.1.0的一个AWR报告的SQL ordered by Version Count
_awr_sql_child_limit_3
如上图所展现的一样,在oracle 10g,11g中,虽然_awr_sql_child_limit参数设置200,但还是记录了version count大于200的SQL语句的执行次数。而在有的awr报告中又存在version count大于等于200的sql在awr报告中不会记录它的执行次数,我查询了与awr相关的所有参数并没有找到其它用来控制的参数,在MOS上也没有找到与之有关的文章。但这应该是BUG,但其对Oracle的运行没有影响。

执行时间在1秒以下的SQL同样也会引发性能问题

某客户的操作人员反应很慢不能操作,管理人员登录小机系统后发现CPU使用到了95%以上。而且这种情况持续了几个月。小机是IBM的P520,配置是2颗4核的CPU,内存是32G,Oracle是10.2.0.5。topas与生成的AWR报告如下:
10

3

4

从上面的load profile部分可以看到每秒执行的sql与事务数并不高,因为是周末并没有太多人使用系统。
5

从上面的top 等待事件来看主要是CPU time。如是查看这个时间段生成的ADDM报告:

          DETAILED ADDM REPORT FOR TASK 'TASK_72988' WITH ID 72988
          --------------------------------------------------------

              Analysis Period: 20-DEC-2015 from 09:00:33 to 10:00:34
         Database ID/Instance: 1329630138/1
      Database/Instance Names: ORCL/orcl
                    Host Name: dbserv
             Database Version: 10.2.0.5.0
               Snapshot Range: from 26245 to 26246
                Database Time: 16644 seconds
        Average Database Load: 4.6 active sessions

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

FINDING 1: 93% impact (15444 seconds)
-------------------------------------
SQL statements consuming significant database time were found.

   RECOMMENDATION 1: SQL Tuning, 100% benefit (38004 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         "fb44z8kbnu8wg".
         RELEVANT OBJECT: SQL statement with SQL_ID fb44z8kbnu8wg and
         PLAN_HASH 1055903002
         SELECT COUNT(EmailDetail.F_EMAIL_DETAIL_ID) AS UnReadCount FROM
         T_LK_EMAIL_EMAILGROUP EmailGroup LEFT OUTER JOIN T_LK_EMAIL_DETAIL
         EmailDetail ON EmailGroup.F_EMAIL_GROUP_ID=EmailDetail.F_EMAIL_GROUP_
         ID AND (EmailDetail.F_RECEIVER_ID=:param0 OR
         (EmailDetail.F_SENDER_ID=:param0 AND EmailDetail.F_RECEIVER_ID IS
         NULL)) AND EmailDetail.F_EMAIL_STATE='0' AND
         EmailDetail.F_RECORD_STATE='0' WHERE EmailGroup.F_PARENT_ID=0 AND
         (EmailGroup.F_EMAIL_GROUP_ID!=4 OR (EmailGroup.F_GROUP_TYPE='USER'
         AND EmailGroup.F_USER_ID=:param0))
      ACTION: Investigate the SQL statement with SQL_ID "fb44z8kbnu8wg" for
         possible performance improvements.
         RELEVANT OBJECT: SQL statement with SQL_ID fb44z8kbnu8wg and
         PLAN_HASH 1055903002
         SELECT COUNT(EmailDetail.F_EMAIL_DETAIL_ID) AS UnReadCount FROM
         T_LK_EMAIL_EMAILGROUP EmailGroup LEFT OUTER JOIN T_LK_EMAIL_DETAIL
         EmailDetail ON EmailGroup.F_EMAIL_GROUP_ID=EmailDetail.F_EMAIL_GROUP_
         ID AND (EmailDetail.F_RECEIVER_ID=:param0 OR
         (EmailDetail.F_SENDER_ID=:param0 AND EmailDetail.F_RECEIVER_ID IS
         NULL)) AND EmailDetail.F_EMAIL_STATE='0' AND
         EmailDetail.F_RECORD_STATE='0' WHERE EmailGroup.F_PARENT_ID=0 AND
         (EmailGroup.F_EMAIL_GROUP_ID!=4 OR (EmailGroup.F_GROUP_TYPE='USER'
         AND EmailGroup.F_USER_ID=:param0))
      RATIONALE: SQL statement with SQL_ID "fb44z8kbnu8wg" was executed 16029
         times and had an average elapsed time of 0.93 seconds.

  

FINDING 2: 76% impact (12602 seconds)
-------------------------------------
Time spent on the CPU by the instance was responsible for a substantial part
of database time.

   RECOMMENDATION 1: SQL Tuning, 100% benefit (38004 seconds)
      ACTION: Run SQL Tuning Advisor on the SQL statement with SQL_ID
         "fb44z8kbnu8wg".
         RELEVANT OBJECT: SQL statement with SQL_ID fb44z8kbnu8wg and
         PLAN_HASH 1055903002
         SELECT COUNT(EmailDetail.F_EMAIL_DETAIL_ID) AS UnReadCount FROM
         T_LK_EMAIL_EMAILGROUP EmailGroup LEFT OUTER JOIN T_LK_EMAIL_DETAIL
         EmailDetail ON EmailGroup.F_EMAIL_GROUP_ID=EmailDetail.F_EMAIL_GROUP_
         ID AND (EmailDetail.F_RECEIVER_ID=:param0 OR
         (EmailDetail.F_SENDER_ID=:param0 AND EmailDetail.F_RECEIVER_ID IS
         NULL)) AND EmailDetail.F_EMAIL_STATE='0' AND
         EmailDetail.F_RECORD_STATE='0' WHERE EmailGroup.F_PARENT_ID=0 AND
         (EmailGroup.F_EMAIL_GROUP_ID!=4 OR (EmailGroup.F_GROUP_TYPE='USER'
         AND EmailGroup.F_USER_ID=:param0))
      ACTION: Investigate the SQL statement with SQL_ID "fb44z8kbnu8wg" for
         possible performance improvements.
         RELEVANT OBJECT: SQL statement with SQL_ID fb44z8kbnu8wg and
         PLAN_HASH 1055903002
         SELECT COUNT(EmailDetail.F_EMAIL_DETAIL_ID) AS UnReadCount FROM
         T_LK_EMAIL_EMAILGROUP EmailGroup LEFT OUTER JOIN T_LK_EMAIL_DETAIL
         EmailDetail ON EmailGroup.F_EMAIL_GROUP_ID=EmailDetail.F_EMAIL_GROUP_
         ID AND (EmailDetail.F_RECEIVER_ID=:param0 OR
         (EmailDetail.F_SENDER_ID=:param0 AND EmailDetail.F_RECEIVER_ID IS
         NULL)) AND EmailDetail.F_EMAIL_STATE='0' AND
         EmailDetail.F_RECORD_STATE='0' WHERE EmailGroup.F_PARENT_ID=0 AND
         (EmailGroup.F_EMAIL_GROUP_ID!=4 OR (EmailGroup.F_GROUP_TYPE='USER'
         AND EmailGroup.F_USER_ID=:param0))
      RATIONALE: SQL statement with SQL_ID "fb44z8kbnu8wg" was executed 16029
         times and had an average elapsed time of 0.93 seconds.
      RATIONALE: Average CPU used per execution was 0.76 seconds.

  
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

          ADDITIONAL INFORMATION
          ----------------------

Wait class "Application" was not consuming significant database time.
Wait class "Commit" was not consuming significant database time.
Wait class "Concurrency" was not consuming significant database time.
Wait class "Configuration" was not consuming significant database time.
Wait class "Network" was not consuming significant database time.
Wait class "User I/O" was not consuming significant database time.
Session connect and disconnect calls were not consuming significant database
time.
Hard parsing of SQL statements was not consuming significant database time.

The database's maintenance windows were active during 99% of the analysis
period.

The analysis of I/O performance is based on the default assumption that the
average read time for one database block is 10000 micro-seconds.

An explanation of the terminology used in this report is available when you
run the report with the 'ALL' level of detail.

task_name
---------
TASK_72988

6

7

从上面的信息ADDM报告与top sql部分可以看到在快照26245到26246之间Database Time为16644秒。而找到的一条SQL消耗了76%的CPU时间。如果对这两条SQL执行优化应该可以将CPU消耗显著降低。而该SQL虽然每执行一次的时间是0.93秒,消耗的CPU时间只有0.76秒,但在周末的时间内一个小时都执行了16,029次,消耗的CPU时间是12249秒而且小机的CPU数量只有2颗(6核),那么每秒该SQL的执行次数就是=16029/3600=4.5次,所以大部分的CPU被该SQL所消耗了。这还是周末,如果上班时间该SQL执行的次数会以倍数增加,那么CPU的消耗就会更高。

SQL语句如下:

SELECT COUNT(EmailDetail.F_EMAIL_DETAIL_ID) AS UnReadCount
  FROM T_LK_EMAIL_EMAILGROUP EmailGroup
  LEFT OUTER JOIN T_LK_EMAIL_DETAIL EmailDetail ON EmailGroup.F_EMAIL_GROUP_ID =
                                                   EmailDetail.F_EMAIL_GROUP_ID
                                               AND (EmailDetail.F_RECEIVER_ID =3440
                                                    OR
                                                   (EmailDetail.F_SENDER_ID =3418
                                                    AND
                                                   EmailDetail.F_RECEIVER_ID IS NULL))
                                               AND EmailDetail.F_EMAIL_STATE = '0'
                                               AND EmailDetail.F_RECORD_STATE = '0'
 WHERE EmailGroup.F_PARENT_ID = 0
   AND (EmailGroup.F_EMAIL_GROUP_ID != 4 OR
       (EmailGroup.F_GROUP_TYPE = 'USER' AND
       EmailGroup.F_USER_ID = 23402));

其执行计划如下:
8
从执行计划来看该SQL的cost也不高(执行时间是0.93秒,cpu时间是0.76秒),从SQL的执行计划来看见两个表是使用的嵌套循环,而驱动表T_LK_EMAIL_EMAILGROUP的数据量是1w多行,T_LK_EMAIL_DETAIL表的数量是20w行左右。而表T_LK_EMAIL_EMAILGROUP执行全表扫描后满足查询条件的记录有4条,所以就得对表T_LK_EMAIL_DETAIL中的记录遍历4次来找到与驱动表相匹配的记录,虽然每次执行时间不长,但是在并发执行次数高,而物理CPU数量不足的情况下还是会引发性能问题。而这两个表有等值连接条件F_EMAIL_GROUP_ID,而且在驱动表中F_EMAIL_GROUP_ID列创建了索引,所以这里选择在表T_LK_EMAIL_DETAIL表的F_EMAIL_GROUP_ID列上创建索引之后执行计划如下所示:
9
SQL执行计划的COST显示增加了,但SQL执行时间只有0.1s提高了10倍。在对该SQL优化之后,业务系统恢复正常,CPU使用率也维持在20%左右。
11

所以在优化时,不能简单的根据SQL执行时间来判断该SQL是否会引发性能问题,要具体问题具体分析。

AIX filesystemcache引发的Oracle事故

今天上班同事说医保数据库expdp导出没有完成,同时医保业务人员报告登录系统有时能登录,有时不能登录。Expdp导出日志如下:

[IBMP740-1:root:/yb_oradata/RLZYbak/dpdump]#cat insur_changde_150921_2330.log
Export: Release 10.2.0.4.0 - 64bit Production on Monday, 21 September, 2015 23:30:00

Copyright (c) 2003, 2007, Oracle.  All rights reserved.
;;; 
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "INSUR_CHANGDE"."SYS_EXPORT_SCHEMA_07":  insur_changde/******** directory=dump_RLZY dumpfile=insur_changde_150921_2330.dmp logfile=insur_changde_150921_2330.log 
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 492.0 GB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/SYNONYM/SYNONYM
Processing object type SCHEMA_EXPORT/TYPE/TYPE_SPEC
Processing object type SCHEMA_EXPORT/DB_LINK
Processing object type SCHEMA_EXPORT/SEQUENCE/SEQUENCE
Processing object type SCHEMA_EXPORT/SEQUENCE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/FUNCTION
Processing object type SCHEMA_EXPORT/FUNCTION/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/PROCEDURE/PROCEDURE
Processing object type SCHEMA_EXPORT/PROCEDURE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/PACKAGE/COMPILE_PACKAGE/PACKAGE_SPEC/ALTER_PACKAGE_SPEC
Processing object type SCHEMA_EXPORT/FUNCTION/ALTER_FUNCTION
Processing object type SCHEMA_EXPORT/PROCEDURE/ALTER_PROCEDURE
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/VIEW/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/PACKAGE/PACKAGE_BODY
Processing object type SCHEMA_EXPORT/TYPE/TYPE_BODY
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/TRIGGER
Processing object type SCHEMA_EXPORT/TABLE/INDEX/FUNCTIONAL_AND_BITMAP/INDEX
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/JOB
. . exported "INSUR_CHANGDE"."MT_BIZ_SCENE_FIN"          51.86 GB 1111873243 rows
. . exported "INSUR_CHANGDE"."MT_FEE_FIN"                22.76 GB 133817090 rows

从上面的expdp日志信息来看并没有错误,更像理导出进程停止,如果查看dba_datapump_job视图来查看有没有被异常终止的epxdp导出job
1

从上面的信息可以看到insur_changde用户出现了多个expdp导出异常终止的job。从job命名规则可以看到最近的异常终止job是sys_export_schema_07,而且状态是空闲的。那么重新连接sys_export_schema_07这个job来查看job状态。

[IBMP740-1:oracle:/yb_oradata/RLZYbak]$expdp 'insur_changde/"power$20140224"' attach=SYS_EXPORT_SCHEMA_07

Export: Release 10.2.0.4.0 - 64bit Production on Tuesday, 22 September, 2015 16:51:51

Copyright (c) 2003, 2007, Oracle.  All rights reserved.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Job: SYS_EXPORT_SCHEMA_07
  Owner: INSUR_CHANGDE                  
  Operation: EXPORT                         
  Creator Privs: FALSE                          
  GUID: 20448E2327C5015EE053C0A80201015E
  Start Time: Tuesday, 22 September, 2015 16:51:56
  Mode: SCHEMA                         
  Instance: RLZY
  Max Parallelism: 1
  EXPORT Job Parameters:
  Parameter Name      Parameter Value:
     CLIENT_COMMAND        insur_changde/******** directory=dump_RLZY dumpfile=insur_changde_150921_2330.dmp logfile=insur_changde_150921_2330.log 
  State: IDLING                         
  Bytes Processed: 80,139,523,792
  Percent Done: 41
  Current Parallelism: 1
  Job Error Count: 0
  Dump File: /yb_oradata/RLZYbak/dpdump/insur_changde_150921_2330.dmp
    bytes written: 80,145,354,752
  
Worker 1 Status:
  State: UNDEFINED                      
  Object Schema: INSUR_CHANGDE
  Object Name: LV_INDIPAR
  Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
  Completed Objects: 3
  Total Objects: 1,225
  Completed Rows: 288,824,659
  Worker Parallelism: 1
  
Worker 1 Status:
  State: UNDEFINED                      
  Object Schema: INSUR_CHANGDE
  Object Name: LV_INDIPAR
  Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
  Completed Objects: 3
  Total Objects: 1,225
  Completed Rows: 288,824,659
  Worker Parallelism: 1

从上面的信息可以看出现在expdp job正在导出的表为LV_INDIPAR表,状态为UNDEFINED,也没有其它有用信息。那么为什么expdp job会异常终止了。检查alert.log文件在执行expdp导出时出现了以下错误信息。

Starting control autobackup
Control autobackup written to SBT_TAPE device
	comment 'API Version 2.0,MMS Version 1.2.0.0',
	media 'backup_nw.023.RO'
	handle 'c-1589671076-20150921-00'
Mon Sep 21 23:30:02 2015
The value (30) of MAXTRANS parameter ignored.
kupprdp: master process DM00 started with pid=188, OS id=23527444
         to execute - SYS.KUPM$MCP.MAIN('SYS_EXPORT_SCHEMA_07', 'INSUR_CHANGDE', 'KUPC$C_1_20150921233002', 'KUPC$S_1_20150921233002', 0);
kupprdp: worker process DW01 started with worker id=1, pid=189, OS id=1704856
         to execute - SYS.KUPW$WORKER.MAIN('SYS_EXPORT_SCHEMA_07', 'INSUR_CHANGDE');
Tue Sep 22 00:24:18 2015
ksvcreate: Process(q001) creation failed
Tue Sep 22 00:24:38 2015
Process startup failed, error stack:
Tue Sep 22 00:24:39 2015
Errors in file /oracle/admin/RLZY/bdump/rlzy_psp0_7471450.trc:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3
Tue Sep 22 00:24:39 2015
Process q001 died, see its trace file
Tue Sep 22 00:24:39 2015
ksvcreate: Process(q001) creation failed
Tue Sep 22 00:24:51 2015

从上面信息可以看到expdp job是在21号的23:30开始执行,在22号的00:24:39出现了故障并在/oracle/admin/RLZY/bdump/rlzy_psp0_7471450.trc文件中生成了错误信息如下。

*** 2015-09-20 00:24:36.347
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3

根据MOS文章Troubleshooting ORA-27300 ORA-27301 ORA-27302 errors (Doc ID 579365.1),出现这种错误信息主要是因为内存或交换区被用尽的原因,如是检查系统内存与交换区的使用情况

[IBMP740-1:root:/]#topas_nmon
lqtopas_nmonqqh=HelpqqqqqqqqqqqqqHost=IBMP740-1qqqqqqRefresh=2 secsqqq16:58.25
x Memory x
x          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache                                                                                       x
x% Used       99.8%     68.0%  | to Paging Space   0.0    0.0 | (numperm) 49.6%                                                                                       x
x% Free        0.2%     32.0%  | to File System  586.6   11.7 | Process   42.7%                                                                                       x
xMB Used   63572.0MB 11142.9MB | Page Scans      126.5        | System     7.6%                                                                                       x
xMB Free     108.0MB  5241.1MB | Page Cycles       0.0        | Free       0.2%                                                                                       x
xTotal(MB) 63680.0MB 16384.0MB | Page Steals     126.5        |           ------                                                                                      x
x                              | Page Faults    1317.1        | Total    100.0%                                                                                       x
x------------------------------------------------------------ | numclient 49.6%                                                                                       x
xMin/Maxperm     1853MB(  3%)  55589MB( 90%) < --% of RAM      | maxclient 90.0%                                                                                       x
xMin/Maxfree     960   1088       Total Virtual   78.2GB      | User      89.3%                                                                                       x
xMin/Maxpgahead    2      8    Accessed Virtual   33.6GB 43.0%| Pinned     9.4%                                                                                       x
x                                                             | lruable pages   15811872.0  

从上面的信息可以看到物理内存为63680.0MB,交换区为16384.0M了,物理内存使用了63572.0M,交换区使用了11142.9M,物理内存了可用内存只有108.0M占总物理内存的0.2%,交换区是5241.1M占总交换区的32%。FileSystemCache (numperm) 49.6% 说明AIX 文件系统缓存占用了物理内存的49.6%,Process 42.7%说明进程占用了物理内存的42.7%,System 7.6%说明系统占用了物理内存的7.6%, Free 0.2%说明了可用的物理内存只有0.2%。并且可以看到Maxperm=90%,maxclient=90%,说明文件系统缓存使用物理内存的最大限制为物理内存的90%。

检查AIX系统中消耗内存前10的进程,如下所示大部分是Oracle相关进程

[IBMP740-1:root:/]#ps -ealf | head -1 ; ps -ealf | sort -rn +9 | head 
       F S      UID      PID     PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD
  240001 A   oracle  6553662        1   0  60 20 a31123590 115936            Jun 27      - 22:05 ora_lgwr_RLZY
  240001 A   oracle 57671750        1   0  60 20 c41744590 111768 f1000e0004ee48c8   Sep 16      - 28:48 oracleRLZY (LOCAL=NO)
  240001 A   oracle 61735218        1   0  60 20 c44fc4590 109912 f1000e00100440c8   Sep 16      - 31:40 oracleRLZY (LOCAL=NO)
  240001 A   oracle 58982776        1   0  60 20 fb447b590 109528 f1000e0004a0b8c8   Sep 16      - 12:57 oracleRLZY (LOCAL=NO)
  240001 A   oracle 26935684        1   0  60 20 f416f4590 108264            Jun 27      -  2:07 ora_arc1_RLZY
  240001 A   oracle 26870144        1   0  60 20 cf16cf590 108264            Jun 27      -  2:37 ora_arc0_RLZY
  240001 A   oracle  7536818        1   0  60 20 a71127590 108248            Jun 27      - 15:59 ora_cjq0_RLZY
  240001 A   oracle  7733430        1   0  60 20 8a0e0a590 106096            Jun 27      -  8:54 ora_dbw0_RLZY
  240001 A   oracle  8913722        1  24  72 20 864c86590 104764            Sep 16      - 18:14 oracleRLZY (LOCAL=NO)
  240001 A   oracle 26214712        1   0  60 20 944194590 104584          16:51:55      -  0:00 ora_dm00_RLZY

[IBMP740-1:root:/]#topas -M
Topas Monitor for host:    IBMP740-1   Interval:   2    Tue Sep 22 17:13:05 2015
================================================================================
REF1    SRAD  TOTALMEM  INUSE    FREE    FILECACHE  HOMETHRDS  CPUS
--------------------------------------------------------------------------------
   0     0     60.4G    60.3G    106.5    30.8G        748      0-31
   1     1       0.0      0.0      0.0      0.0        625      32-63
================================================================================
CPU     SRAD  TOTALDISP   LOCALDISP%  NEARDISP%   FARDISP%
------------------------------------------------------------
  36       1       439      100.0         0.0        0.0
  60       1       345      100.0         0.0        0.0
  56       1       184      100.0         0.0        0.0
   0       0       144      100.0         0.0        0.0
  32       1        93      100.0         0.0        0.0
  16       0        88      100.0         0.0        0.0
   8       0        54      100.0         0.0        0.0
  40       1        43      100.0         0.0        0.0
  12       0        36      100.0         0.0        0.0
  20       0        28      100.0         0.0        0.0
   4       0        28      100.0         0.0        0.0
  28       0        21      100.0         0.0        0.0
  44       1        18      100.0         0.0        0.0
  24       0        12      100.0         0.0        0.0
  52       1        11      100.0         0.0        0.0
  48       1         1      100.0         0.0        0.0
  17       0         0      0.0           0.0        0.0
  18       0         0      0.0           0.0        0.0
  19       0         0      0.0           0.0        0.0
  10       0         0      0.0           0.0        0.0
  21       0         0      0.0           0.0        0.0
  22       0         0      0.0           0.0        0.0
  23       0         0      0.0           0.0        0.0
   9       0         0      0.0           0.0        0.0
  25       0         0      0.0           0.0        0.0
  26       0         0      0.0           0.0        0.0
  27       0         0      0.0           0.0        0.0
   7       0         0      0.0           0.0        0.0
  29       0         0      0.0           0.0        0.0
  30       0         0      0.0           0.0        0.0
  31       0         0      0.0           0.0        0.0
   6       0         0      0.0           0.0        0.0
  33       1         0      0.0           0.0        0.0
   5       0         0      0.0           0.0        0.0

从上面的信息可知除了系统所用的物理内存之外,总的可用物理内存是60.4G,使用了60.3G,可用106.5M,文件系统缓存是30.8G。
使用操作系统命令vmo -a –F来查看操作系统参数

[IBMP740-1:root:/]#vmo -a -F
             ame_cpus_per_pool = n/a
               ame_maxfree_mem = n/a
           ame_min_ucpool_size = n/a
               ame_minfree_mem = n/a
               ams_loan_policy = n/a
  enhanced_affinity_affin_time = 1
enhanced_affinity_vmpool_limit = 10
                esid_allocator = 0
           force_relalias_lite = 0
             kernel_heap_psize = 65536
                  lgpg_regions = 0
                     lgpg_size = 0
               low_ps_handling = 1
                       maxfree = 1088
                       maxperm = 14230680
                        maxpin = 13137354
                       maxpin% = 80
                 memory_frames = 16302080
                 memplace_data = 0
          memplace_mapped_file = 0
        memplace_shm_anonymous = 0
            memplace_shm_named = 0
                memplace_stack = 0
                 memplace_text = 0
        memplace_unmapped_file = 0
                       minfree = 960
                       minperm = 474353
                      minperm% = 3
                     nokilluid = 0
                       npskill = 32768
                       npswarn = 131072
           num_locks_per_semid = 1
                     numpsblks = 4194304
               pinnable_frames = 14750156
           relalias_percentage = 0
                         scrub = 0
                      v_pinshm = 0
              vmm_default_pspa = 0
                vmm_klock_mode = 1
            wlm_memlimit_nonpg = 1
##Restricted tunables
               ame_sys_memview = n/a
                cpu_scale_memp = 8
         data_stagger_interval = 161
                         defps = 1
enhanced_affinity_attach_limit = 100
     enhanced_affinity_balance = 100
     enhanced_affinity_private = 40
      enhanced_memory_affinity = 1
                     framesets = 2
                     htabscale = n/a
                  kernel_psize = 65536
          large_page_heap_size = 0
               lru_file_repage = 0
             lru_poll_interval = 10
                     lrubucket = 131072
                    maxclient% = 90
                      maxperm% = 90
               mbuf_heap_psize = 65536
               memory_affinity = 1
          multiple_semid_lists = 0
                 munmap_npages = 16384
                     npsrpgmax = 262144
                     npsrpgmin = 196608
                   npsscrubmax = 262144
                   npsscrubmin = 196608
            num_sem_undo_lists = 0
             num_sems_per_lock = 1
              num_spec_dataseg = 0
                numperm_global = 1
             page_steal_method = 1
          psm_timeout_interval = 20000
             relalias_lockmode = 1
                      rpgclean = 0
                    rpgcontrol = 2
                    scrubclean = 0
                shm_1tb_shared = 12
           shm_1tb_unsh_enable = 1
              shm_1tb_unshared = 256
         soft_min_lgpgs_vmpool = 0
              spec_dataseg_int = 512
              strict_maxclient = 1
                strict_maxperm = 0
                   sync_npages = 0
                 thrpgio_inval = 1024
                thrpgio_npages = 1024
               vm_mmap_areload = 0
          vm_modlist_threshold = -1
              vm_pvlist_dohard = 0
              vm_pvlist_szpcnt = 0
               vmm_fork_policy = 1
            vmm_mpsize_support = 2
               vmm_vmap_policy = 0
                  vtiol_avg_ms = 200
                  vtiol_minreq = 25
            vtiol_minth_active = 1
                    vtiol_mode = 0
               vtiol_pgin_mode = 2
              vtiol_pgout_mode = 2
               vtiol_q_cpu_pct = 2500
          vtiol_thread_cpu_pct = 5000

主要是maxclient% = 90,maxperm% = 90参数,说明文件系统缓存使用物理内存的最大限制为物理内存的90%。所以这里只需要将maxclient%与maxperm%参数调小,让系统有空闲内存来分配给新产生的进程来执行特定操作。调整maxclient%与maxperm%参数。

[IBMP740-1:root:/]#vmo -p -o maxclient%=20
Modification to restricted tunable maxclient%, confirmation required yes/no yes
Setting maxclient% to 20 in nextboot file
Setting maxclient% to 20
Warning: a restricted tunable has been modified
[IBMP740-1:root:/]#vmo -p -o maxperm%=20
Modification to restricted tunable maxperm%, confirmation required yes/no yes
Setting maxperm% to 20 in nextboot file
Setting maxperm% to 20
Warning: a restricted tunable has been modified

调整后再次查看操作系统参数

[IBMP740-1:root:/]#vmo -a -F
             ame_cpus_per_pool = n/a
               ame_maxfree_mem = n/a
           ame_min_ucpool_size = n/a
               ame_minfree_mem = n/a
               ams_loan_policy = n/a
  enhanced_affinity_affin_time = 1
enhanced_affinity_vmpool_limit = 10
                esid_allocator = 0
           force_relalias_lite = 0
             kernel_heap_psize = 65536
                  lgpg_regions = 0
                     lgpg_size = 0
               low_ps_handling = 1
                       maxfree = 1088
                       maxperm = 3162370
                        maxpin = 13137354
                       maxpin% = 80
                 memory_frames = 16302080
                 memplace_data = 0
          memplace_mapped_file = 0
        memplace_shm_anonymous = 0
            memplace_shm_named = 0
                memplace_stack = 0
                 memplace_text = 0
        memplace_unmapped_file = 0
                       minfree = 960
                       minperm = 790590
                      minperm% = 5
                     nokilluid = 0
                       npskill = 32768
                       npswarn = 131072
           num_locks_per_semid = 1
                     numpsblks = 4194304
               pinnable_frames = 14770780
           relalias_percentage = 0
                         scrub = 0
                      v_pinshm = 0
              vmm_default_pspa = 0
                vmm_klock_mode = 1
            wlm_memlimit_nonpg = 1
##Restricted tunables
               ame_sys_memview = n/a
                cpu_scale_memp = 8
         data_stagger_interval = 161
                         defps = 1
enhanced_affinity_attach_limit = 100
     enhanced_affinity_balance = 100
     enhanced_affinity_private = 40
      enhanced_memory_affinity = 1
                     framesets = 2
                     htabscale = n/a
                  kernel_psize = 65536
          large_page_heap_size = 0
               lru_file_repage = 0
             lru_poll_interval = 10
                     lrubucket = 131072
                    maxclient% = 20
                      maxperm% = 20
               mbuf_heap_psize = 65536
               memory_affinity = 1
          multiple_semid_lists = 0
                 munmap_npages = 16384
                     npsrpgmax = 262144
                     npsrpgmin = 196608
                   npsscrubmax = 262144
                   npsscrubmin = 196608
            num_sem_undo_lists = 0
             num_sems_per_lock = 1
              num_spec_dataseg = 0
                numperm_global = 1
             page_steal_method = 1
          psm_timeout_interval = 20000
             relalias_lockmode = 1
                      rpgclean = 0
                    rpgcontrol = 2
                    scrubclean = 0
                shm_1tb_shared = 12
           shm_1tb_unsh_enable = 1
              shm_1tb_unshared = 256
         soft_min_lgpgs_vmpool = 0
              spec_dataseg_int = 512
              strict_maxclient = 1
                strict_maxperm = 0
                   sync_npages = 0
                 thrpgio_inval = 1024
                thrpgio_npages = 1024
               vm_mmap_areload = 0
          vm_modlist_threshold = -1
              vm_pvlist_dohard = 0
              vm_pvlist_szpcnt = 0
               vmm_fork_policy = 1
            vmm_mpsize_support = 2
               vmm_vmap_policy = 0
                  vtiol_avg_ms = 200
                  vtiol_minreq = 25
            vtiol_minth_active = 1
                    vtiol_mode = 0
               vtiol_pgin_mode = 2
              vtiol_pgout_mode = 2
               vtiol_q_cpu_pct = 2500
          vtiol_thread_cpu_pct = 5000

从上面的结果看到修改生效了,文件系统缓存最大可以使用20%的物理内存。

[IBMP740-1:root:/]#topas_nmon
lqtopas_nmonqqh=HelpqqqqqqqqqqqqqHost=IBMP740-1qqqqqqRefresh=2 secsqqq17:44.52
x Memory x
x          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache                                                                                       x
x% Used       69.8%     67.2%  | to Paging Space   0.5    0.0 | (numperm) 19.3%                                                                                       x
x% Free       30.2%     32.8%  | to File System 9455.5    8.4 | Process   42.9%                                                                                       x
xMB Used   44476.3MB 11010.5MB | Page Scans     9562.1        | System     7.6%                                                                                       x
xMB Free   19203.7MB  5373.5MB | Page Cycles       0.0        | Free      30.2%                                                                                       x
xTotal(MB) 63680.0MB 16384.0MB | Page Steals    9510.6        |           ------                                                                                      x
x                              | Page Faults    7478.9        | Total    100.0%                                                                                       x
x------------------------------------------------------------ | numclient 19.3%                                                                                       x
xMin/Maxperm     3088MB(  5%)  12353MB( 20%) < --% of RAM      | maxclient 20.0%                                                                                       x
xMin/Maxfree     960   1088       Total Virtual   78.2GB      | User      59.3%                                                                                       x
xMin/Maxpgahead    2      8    Accessed Virtual   33.6GB 43.0%| Pinned     9.4%                                                                                       x
x                                                             | lruable pages   15811872.0

可以看到现在空闲物理内存为30.2%,文件系统缓存(FileSystemCache) 19.2%,maxperm,maxclient为20%。现在执行expdp导出正常。

通过这个问题可以看出,AIX为了提高系统IO能力将空闲的物理内存作为文件系统缓存来使用,而且缺省参数可以使用物理内存的90%,这个缺省值在实际的生产环境中是很容易将内存耗尽的,所以AIX推荐的缺省值也是有问题的。

谓词条件的数据类型随意书写对SQL性能造成巨大的影响

最近在优化某系统中发现许多SQL语句在书写谓词条件(wheret条件)时完全不根据表结构定义的字段数据类型来,而是随意书写谓词条件,这样造成原来能走正确索引的结果不能使用该索引,其结果就是查询语句的性能很差,这里将我所遇到的两种情况介绍一下.

第一种情况是谓词条件进行了数据类型的转换转换使得CBO无法使用索引:
其SQL语句如下所示,该SQL的功能是统计一年社保中心一年内由于各种伤害或骨折所发生的医疗费用

select a.hospital_id,
       c.hospital_name,
       count(distinct a.serial_no) rc,
       round(sum(b.real_pay), 2) ylfyze,
       round(sum(case
                   when b.fund_id in ('001') then
                    b.real_pay
                   else
                    0
                 end),
             2) tczc,
       round(sum(case
                   when b.fund_id in ('201') then
                    b.real_pay
                   else
                    0
                 end),
             2) zffy,
       round(sum(case
                   when b.fund_id in ('003', '999') then
                    b.real_pay
                   else
                    0
                 end),
             2) yyzf
  from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
 where a.hospital_id = b.hospital_id
   and a.serial_no = b.serial_no
   and a.hospital_id = c.hospital_id
   and a.fin_disease = d.icd
   and d.center_id = a.center_id
   and a.valid_flag = 1
   and b.valid_flag = 1
   and a.biz_type = 12
   and a.pers_type in (1, 2)
   and (d.disease like '%伤%' or d.disease like '%骨折%')
   and a.center_id = '430740'
   and to_char(a.fin_date, 'yyyymmdd') >= '20140101'
   and to_char(a.fin_date, 'yyyymmdd') < = '20141231'
 group by a.hospital_id, c.hospital_name
 order by a.hospital_id

上述SQL执行情况如下,其执行时间为4分40秒

SQL> set timing on
SQL> set autotrace traceonly
SQL> select c.hospital_id,
  2         c.hospital_name,
  3         count(distinct a.serial_no) rc,
  4         round(sum(b.real_pay), 2) ylfyze,
  5         round(sum(case
  6                     when b.fund_id in ('001') then
  7                      b.real_pay
  8                     else
  9                      0
 10                   end),
 11               2) tczc,
 12         round(sum(case
 13                     when b.fund_id in ('201') then
 14                      b.real_pay
 15                     else
 16                      0
 17                   end),
 18               2) zffy,
 19         round(sum(case
 20                     when b.fund_id in ('003', '999') then
 21                      b.real_pay
 22                     else
 23                      0
 24                   end),
 25               2) yyzf
 26    from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
 27   where a.hospital_id = b.hospital_id
 28     and a.serial_no = b.serial_no
 29     and a.hospital_id = c.hospital_id
 30     and a.fin_disease = d.icd
 31     and d.center_id = a.center_id
 32     and a.valid_flag = 1
 33     and b.valid_flag = 1
 34     and a.biz_type = 12
 35     and a.pers_type in (1, 2)
 36     and (d.disease like '%伤%' or d.disease like '%骨折%')
 37     and a.center_id = '430740'
 38     and to_char(a.fin_date, 'yyyymmdd') >= '20140101'
 39     and to_char(a.fin_date, 'yyyymmdd') < = '20141231'
 40   group by c.hospital_id, c.hospital_name
 41   order by c.hospital_id
 42  ;

Elapsed: 00:04:39.59

Execution Plan
----------------------------------------------------------
Plan hash value: 1467084556

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                      |     1 |   148 |  4254  (20)| 00:00:04 |
|   1 |  SORT GROUP BY                   |                      |     1 |   148 |  4254  (20)| 00:00:04 |
|*  2 |   TABLE ACCESS BY INDEX ROWID    | MT_PAY_RECORD_FIN    |     1 |    31 |     1   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                  |                      |     1 |   148 |  4252  (20)| 00:00:04 |
|   4 |     NESTED LOOPS                 |                      |     1 |   117 |  4251  (20)| 00:00:04 |
|   5 |      NESTED LOOPS                |                      |     3 |   252 |  4250  (20)| 00:00:04 |
|   6 |       INDEX FULL SCAN            | IDX_BS_HOSPITAL_NAME |  1227 | 39264 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN           |     1 |    52 |     3   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN          | PK_MT_BIZ_FIN        |     1 |       |     3   (0)| 00:00:01 |
|*  9 |      TABLE ACCESS BY INDEX ROWID | BS_DISEASE           |     1 |    33 |     1   (0)| 00:00:01 |
|* 10 |       INDEX RANGE SCAN           | INX_BS_DISEASE_01    |     1 |       |     1   (0)| 00:00:01 |
|* 11 |     INDEX RANGE SCAN             | I_MT_PAY_RECORD_FIN_1|     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("B"."VALID_FLAG")=1)
   7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR
              TO_NUMBER("A"."PERS_TYPE")=2))
   8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."CENTER_ID"='430740')
       filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12 AND
              TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')>='20140101' AND
              TO_CHAR(INTERNAL_FUNCTION("A"."FIN_DATE"),'yyyymmdd')< ='20141231')
   9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%')
  10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD")
  11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     161233  consistent gets
      83048  physical reads
        624  redo size
       1197  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

上述SQL对于表BS_HOSPITAL只查询了hospital_name列,而在BS_HOSPITAL表中存在索引IDX_BS_HOSPITAL_NAME(hospital_name,hospital_id)所以首先对IDX_BS_HOSPITAL_NAME索引全扫描这样就不用再回表查询从索引中就是得到hospital_name列的值作为结果集1。再通过对MT_BIZ_FIN表执行索引(PK_MT_BIZ_FIN)范围扫描,再回表查询返回其记录作为结果集2,再以结果集1作为驱动表进行嵌套循环连接。再与表BS_DISEASE,I_MT_PAY_RECORD_FIN_1执行嵌套循环连接,再执行分组排序。其实在MT_BIZ_FIN表中存在复合索引INDI_MT_BIZ_FIN_F_H(FIN_DATE,HOSPITAL_ID,BIZ_TYPE, TREATMENT_TYPE, CENTER_ID),而查询条件中用到了find_date,hospital_id,biz_type,center_id,只是这里因为谓词条件中对于fin_date条件是to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231',而fin_date(费用完成时间)是日期类型,这里将find_date转换成字符型所以没有办法使用索引INDI_MT_BIZ_FIN_F_H。 将to_char(a.fin_date, 'yyyymmdd') >= '20140101' and to_char(a.fin_date, 'yyyymmdd') < = '20141231'条件改写成 a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd') ,改写后其SQL语句如下所示:

select  c.hospital_id,
       c.hospital_name,
       count(distinct a.serial_no) rc,
       round(sum(b.real_pay), 2) ylfyze,
       round(sum(case
                   when b.fund_id in ('001') then
                    b.real_pay
                   else
                    0
                 end),
             2) tczc,
       round(sum(case
                   when b.fund_id in ('201') then
                    b.real_pay
                   else
                    0
                 end),
             2) zffy,
       round(sum(case
                   when b.fund_id in ('003', '999') then
                    b.real_pay
                   else
                    0
                 end),
             2) yyzf
  from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
 where a.hospital_id = b.hospital_id
   and a.serial_no = b.serial_no
   and a.hospital_id = c.hospital_id
   and a.fin_disease = d.icd
   and d.center_id = a.center_id
   and a.valid_flag = 1
   and b.valid_flag = 1
   and a.biz_type = 12
   and a.pers_type in (1, 2)
   and (d.disease like '%伤%' or d.disease like '%骨折%')
   and a.center_id = '430740'
   and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd')
group by c.hospital_id, c.hospital_name
 order by c.hospital_id

来实际执行一次,其执行结果如下所示,现在执行时间稳定在1-2秒之间,能满足客户要求。

SQL> set autotrace traceonly
SQL> select  c.hospital_id,
  2         c.hospital_name,
  3         count(distinct a.serial_no) rc,
  4         round(sum(b.real_pay), 2) ylfyze,
  5         round(sum(case
  6                     when b.fund_id in ('001') then
  7                      b.real_pay
  8                     else
  9                      0
 10                   end),
 11               2) tczc,
 12         round(sum(case
 13                     when b.fund_id in ('201') then
 14                      b.real_pay
 15                     else
 16                      0
 17                   end),
 18               2) zffy,
 19         round(sum(case
 20                     when b.fund_id in ('003', '999') then
 21                      b.real_pay
 22                     else
 23                      0
 24                   end),
 25               2) yyzf
 26    from mt_biz_fin a, mt_pay_record_fin b, bs_hospital c, bs_disease d
 27   where a.hospital_id = b.hospital_id
 28     and a.serial_no = b.serial_no
 29     and a.hospital_id = c.hospital_id
 30     and a.fin_disease = d.icd
 31     and d.center_id = a.center_id
 32     and a.valid_flag = 1
 33     and b.valid_flag = 1
 34     and a.biz_type = 12
 35     and a.pers_type in (1, 2)
 36     and (d.disease like '%伤%' or d.disease like '%骨折%')
 37     and a.center_id = '430740'
 38     and a.fin_date between to_date('20140101','yyyymmdd') and to_date('20141231','yyyymmdd')
 39  group by c.hospital_id, c.hospital_name
 40   order by c.hospital_id
 41  ;

Elapsed: 00:00:01.02

Execution Plan
----------------------------------------------------------
Plan hash value: 1467084556

---------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                      |    17 |  2516 |  1529  (15)| 00:00:02 |
|   1 |  SORT GROUP BY                   |                      |    17 |  2516 |  1529  (15)| 00:00:02 |
|*  2 |   TABLE ACCESS BY INDEX ROWID    | MT_PAY_RECORD_FIN    |     1 |    31 |     1   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                  |                      |    17 |  2516 |  1528  (15)| 00:00:02 |
|   4 |     NESTED LOOPS                 |                      |    33 |  3861 |  1521  (15)| 00:00:02 |
|   5 |      NESTED LOOPS                |                      |   354 | 29736 |  1450  (16)| 00:00:02 |
|   6 |       INDEX FULL SCAN            | IDX_BS_HOSPITAL_NAME |  1227 | 39264 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN           |     1 |    52 |     1   (0)| 00:00:01 |
|*  8 |        INDEX RANGE SCAN          | INDI_MT_BIZ_FIN_F_H  |     1 |       |     1   (0)| 00:00:01 |
|*  9 |      TABLE ACCESS BY INDEX ROWID | BS_DISEASE           |     1 |    33 |     1   (0)| 00:00:01 |
|* 10 |       INDEX RANGE SCAN           | INX_BS_DISEASE_01    |     1 |       |     1   (0)| 00:00:01 |
|* 11 |     INDEX RANGE SCAN             | I_MT_PAY_RECORD_FIN_1|     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("B"."VALID_FLAG")=1)
   7 - filter(TO_NUMBER("A"."VALID_FLAG")=1 AND (TO_NUMBER("A"."PERS_TYPE")=1 OR
              TO_NUMBER("A"."PERS_TYPE")=2))
   8 - access("A"."HOSPITAL_ID"="C"."HOSPITAL_ID" AND "A"."FIN_DATE">=TO_DATE(' 2014-01-01
              00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "A"."CENTER_ID"='430740' AND "A"."FIN_DATE"< =TO_DATE('
              2014-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("A"."CENTER_ID"='430740' AND TO_NUMBER("A"."BIZ_TYPE")=12)
   9 - filter("D"."DISEASE" LIKE '%伤%' OR "D"."DISEASE" LIKE '%骨折%')
  10 - access("D"."CENTER_ID"='430740' AND "A"."FIN_DISEASE"="D"."ICD")
  11 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      71411  consistent gets
          0  physical reads
          0  redo size
       1197  bytes sent via SQL*Net to client
        492  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed


第二种谓词条件的数据类型隐式转换无法使用索引的情况,其原始SQL语句如下所示,查询一个医疗机构的费用支出情况

select  a.hospital_id,
       count(distinct a.serial_no) rc,
       round(sum(b.real_pay), 2) ylfyze,
       round(sum(case
                   when b.fund_id in ('001') then
                    b.real_pay
                   else
                    0
                 end),
             2) tczc,
       round(sum(case
                   when b.fund_id in ('201') then
                    b.real_pay
                   else
                    0
                 end),
             2) zffy,
       round(sum(case
                   when b.fund_id in ('003', '999') then
                    b.real_pay
                   else
                    0
                 end),
             2) yyzf
  from mt_biz_fin a, mt_pay_record_fin b
 where a.hospital_id = b.hospital_id
   and a.serial_no = b.serial_no
   and a.valid_flag = '1'
   and b.valid_flag = '1'
   and a.biz_type = '12'
   and a.pers_type in ('1', '2')    
   and b.hospital_id=4307000231
group by a.hospital_id

该SQL的执行计划如下所示,执行了1分22秒:

SQL> set autotrace traceonly
SQL> select  a.hospital_id,
  2         count(distinct a.serial_no) rc,
  3         round(sum(b.real_pay), 2) ylfyze,
  4         round(sum(case
  5                     when b.fund_id in ('001') then
  6                      b.real_pay
  7                     else
  8                      0
  9                   end),
 10               2) tczc,
 11         round(sum(case
 12                     when b.fund_id in ('201') then
 13                      b.real_pay
 14                     else
 15                      0
 16                   end),
 17               2) zffy,
 18         round(sum(case
 19                     when b.fund_id in ('003', '999') then
 20                      b.real_pay
 21                     else
 22                      0
 23                   end),
 24               2) yyzf
 25    from mt_biz_fin a, mt_pay_record_fin b
 26   where a.hospital_id = b.hospital_id
 27     and a.serial_no = b.serial_no
 28     and a.valid_flag = ‘1’
 29     and b.valid_flag = ‘1’
 30     and a.biz_type = ‘12’
 31     and a.pers_type in ('1', '2')    
 32     and b.hospital_id=4307000231
 33  group by a.hospital_id
 34  ;

no rows selected

Elapsed: 00:01:22.20

Execution Plan
----------------------------------------------------------
Plan hash value: 3673479381

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |     1 |    61 |   127K (16)| 00:01:56 |
|   1 |  SORT GROUP BY               |                   |     1 |    61 |   127K (16)| 00:01:56 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN        |     1 |    30 |     1   (0)| 00:00:01 |
|   3 |    NESTED LOOPS              |                   |    45 |  2745 |   127K (16)| 00:01:56 |
|*  4 |     TABLE ACCESS FULL        | MT_PAY_RECORD_FIN |  8327 |   252K|   123K (16)| 00:01:53 |
|*  5 |     INDEX RANGE SCAN         | PK_MT_BIZ_FIN     |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."BIZ_TYPE"='12’ AND "A"."VALID_FLAG"='1’ AND
              ("A"."PERS_TYPE"='1’ OR "A"."PERS_TYPE"='2’)
   4 - filter(TO_NUMBER("B"."HOSPITAL_ID")=4307000231 AND "B"."VALID_FLAG"='1')
   5 - access("A"."HOSPITAL_ID"="B"."HOSPITAL_ID" AND "A"."SERIAL_NO"="B"."SERIAL_NO")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
     572386  consistent gets
     383935  physical reads
          0  redo size
        638  bytes sent via SQL*Net to client
        481  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed

从执行计划中可以看到在访问表MT_PAY_RECORD_FIN时使用的全表扫描,而在表MT_PAY_RECORD_FIN上存在索引PK_MT_PAY_RECORD_FIN(HOSPITAL_ID, SERIAL_NO)为什么没有使用该索引了,查询条件中的谓词条件是b.hospital_id=4307000231而从Predicate Information信息中的4 – filter(TO_NUMBER(“B”.”HOSPITAL_ID”)=4307000231
可知hospital_id在表中是字符型,而在书写查询条件时使用的是数字类型,这里CBO进行数据类型的隐式转换。所以使用不了索引。我们需要写成b.hospital_id=’4307000231′,修改后的SQL如下所示:

select  a.hospital_id,
       count(distinct a.serial_no) rc,
       round(sum(b.real_pay), 2) ylfyze,
       round(sum(case
                   when b.fund_id in ('001') then
                    b.real_pay
                   else
                    0
                 end),
             2) tczc,
       round(sum(case
                   when b.fund_id in ('201') then
                    b.real_pay
                   else
                    0
                 end),
             2) zffy,
       round(sum(case
                   when b.fund_id in ('003', '999') then
                    b.real_pay
                   else
                    0
                 end),
             2) yyzf
  from mt_biz_fin a, mt_pay_record_fin b
 where a.hospital_id = b.hospital_id
   and a.serial_no = b.serial_no
   and a.valid_flag = '1'
   and b.valid_flag = '1'
   and a.biz_type = '12'
   and a.pers_type in ('1', '2')    
   and b.hospital_id='4307000231'
group by a.hospital_id

来真实执行一次,现在能使用索引之后执行时间只要0.1秒

SQL> select  a.hospital_id,
  2         count(distinct a.serial_no) rc,
  3         round(sum(b.real_pay), 2) ylfyze,
  4         round(sum(case
  5                     when b.fund_id in ('001') then
  6                      b.real_pay
  7                     else
  8                      0
  9                   end),
 10               2) tczc,
 11         round(sum(case
 12                     when b.fund_id in ('201') then
 13                      b.real_pay
 14                     else
 15                      0
 16                   end),
 17               2) zffy,
 18         round(sum(case
 19                     when b.fund_id in ('003', '999') then
 20                      b.real_pay
 21                     else
 22                      0
 23                   end),
 24               2) yyzf
 25    from mt_biz_fin a, mt_pay_record_fin b
 26   where a.hospital_id = b.hospital_id
 27     and a.serial_no = b.serial_no
 28     and a.valid_flag = '1'
 29     and b.valid_flag = '1'
 30     and a.biz_type = '12'
 31     and a.pers_type in ('1', '2')    
 32     and b.hospital_id='4307000231'
 33  group by a.hospital_id
 34  ;

no rows selected

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3142857175

------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                     |     1 |    61 |   115   (1)| 00:00:01 |
|   1 |  SORT GROUP BY                 |                     |     1 |    61 |   115   (1)| 00:00:01 |
|*  2 |   TABLE ACCESS BY INDEX ROWID  | MT_PAY_RECORD_FIN   |     1 |    31 |     1   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |                     |   139 |  8479 |   115   (1)| 00:00:01 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| MT_BIZ_FIN          |   139 |  4170 |    87   (2)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | INDI_MT_BIZ_FIN_H_F |   371 |       |    19   (6)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN           | PK_MT_PAY_RECORD_FIN|     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."VALID_FLAG"='1')
   4 - filter("A"."VALID_FLAG"='1' AND ("A"."PERS_TYPE"='1' OR
             A"."PERS_TYPE"='2'))
   5 - access("A"."HOSPITAL_ID"='4307000231')
       filter("A"."BIZ_TYPE"='12')
   6 - access("B"."HOSPITAL_ID"='4307000231' AND "A"."SERIAL_NO"="B"."SERIAL_NO")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        203  consistent gets
          0  physical reads
          0  redo size
        638  bytes sent via SQL*Net to client
        481  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          0  rows processed

从上面的执行计划可以看到现在访问表MT_PAY_RECORD_FIN能正确使用索引PK_MT_PAY_RECORD_FIN,但这里CBO并不是先访问表MT_PAY_RECORD_FIN,这里执行了谓词传递,从Predicate Information 中的 5 – access(“A”.”HOSPITAL_ID”=’4307000231′)可知是先对索引INDI_MT_BIZ_FIN_H_F执行索引范围,但是在查询条件中并没有写a.hospital_id=’4307000231’这个条件,这就是谓词传递的结果,因为有b.hospital_id=’4307000231′ and a.hospital_id=b.hospital_id,所以CBO推导出a.hospital_id=’4307000231’。

在优化这个系统时发现好多类似这两种情况的SQL,都是因为在书写SQL语句时根本就没有注意字段的类型,不同的开发人员书写的SQL语句,有的人谓词数据类型书写正确,有的人谓词数据类型书写不正确。希望开发人员在书写SQL谓词条件时注意数据类型,一定要书写正确。

kksfbc child completion与ksdxexeotherwait引发CPU使用异常

某客户操作人员反应很慢不能操作,管理人员登录小机系统后发现CPU使用到了96%。而且这种情况持续了几个月。以下是登录后小机后载取的topas图,而且是周末,并没有人使用系统。小机是IBM的550,配置是2颗6核的CPU,内存是48G。
1

如是登录数据库执行以下脚本来查看当前数据库消耗CPU最多的进程在执行什么

Connected to Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 
Connected as gtp2
 


SQL> select s.sid,p.SPID,s.username,s.event,s.wait_time,s.state,s.seconds_in_wait,p.PROGRAM,s.MACHINE,
  2  (select  c.SQL_FULLTEXT from v$sqlarea c where c.SQL_ID=s.SQL_ID) sql_fulltext,
  3  (select  c.BIND_DATA from v$sqlarea c where c.SQL_ID=s.SQL_ID) BIND_DATA,s.SQL_ID
  4  from v$session s,v$process p
  5  where p.SPID in(491720,90116,127336,529102,987524,331990)
  6  and s.event not like'%SQL*Net%' and s.USERNAME='GTP2'
  7  order by s.wait_time desc
  8  ;
 
       SID SPID         USERNAME                       EVENT                                                             WAIT_TIME STATE               SECONDS_IN_WAIT PROGRAM                                          MACHINE                                                          SQL_FULLTEXT                                                                     BIND_DATA                                                                        SQL_ID
---------- ------------ ------------------------------ ---------------------------------------------------------------- ---------- ------------------- --------------- ------------------------------------------------ ---------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------
      1020 90116        GTP2                           kksfbc child completion                                                  -1 WAITED SHORT TIME             53742 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 987524       GTP2                           kksfbc child completion                                                  -1 WAITED SHORT TIME             53742 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 331990       GTP2                           kksfbc child completion                                                  -1 WAITED SHORT TIME             53742 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 491720       GTP2                           kksfbc child completion                                                  -1 WAITED SHORT TIME             53742 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
 
4 rows selected
 
SQL> select s.sid,p.SPID,s.username,s.event,s.wait_time,s.state,s.seconds_in_wait,p.PROGRAM,s.MACHINE,
  2  (select  c.SQL_FULLTEXT from v$sqlarea c where c.SQL_ID=s.SQL_ID) sql_fulltext,
  3  (select  c.BIND_DATA from v$sqlarea c where c.SQL_ID=s.SQL_ID) BIND_DATA,s.SQL_ID
  4  from v$session s,v$process p
  5  where p.SPID in(491720,90116,127336,529102,987524,331990)
  6  and s.event not like'%SQL*Net%' and s.USERNAME='GTP2'
  7  order by s.wait_time desc
  8  ;
 
       SID SPID         USERNAME                       EVENT                                                             WAIT_TIME STATE               SECONDS_IN_WAIT PROGRAM                                          MACHINE                                                          SQL_FULLTEXT                                                                     BIND_DATA                                                                        SQL_ID
---------- ------------ ------------------------------ ---------------------------------------------------------------- ---------- ------------------- --------------- ------------------------------------------------ ---------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------
      1020 90116        GTP2                           ksdxexeotherwait                                                         -1 WAITED SHORT TIME              3342 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 987524       GTP2                           ksdxexeotherwait                                                         -1 WAITED SHORT TIME              3342 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 331990       GTP2                           ksdxexeotherwait                                                         -1 WAITED SHORT TIME              3342 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
      1020 491720       GTP2                           ksdxexeotherwait                                                         -1 WAITED SHORT TIME              3342 oracleorcl@dbserv                                WORKGROUP\WIN-AUQ43P0UU9L                                                                                                                                                                                                          063cu7y841kmc
 
4 rows selected
 

从上面的信息可以看到这些进程的等待事件为kksfbc child completion,ksdxexeotherwait。当看到这种情况时第一反应是不是遇到的BUG,以KKSFBC CHILD COMPLETION为关键字到MOS查询可以找到,该Bug的症状为进程不断spin且hang住、出现’KKSFBC CHILD COMPLETION’等待事件、还可能伴有’Waits for “cursor: pin S”‘等待事件,直接影响的版本有11.1.0.6、10.2.0.3和10.2.0.4。而我这里的版本是10.2.0.1。
2

对于该Bug的描述是在发生’kksfbc child completion’等待事件后会话陷入无休止的自旋(spins)中,这种自旋(spins)发生在由堆栈调用(stack call)kksSearchChildList->kkshgnc陷入对kksSearchChildList函数的无限循环中。需要更详细的stack call,如是对系统进程90116进行跟踪。

SQL> oradebug setospid 90116
Oracle pid: 40, Unix process pid: 90116, image: oracleorcl@dbserv
SQL> oradebug unlimit;
Statement processed.
SQL> oradebug short_stack;
ksdxfstk+002c< -ksdxcb+04e4<-sspuser+0068<-00004750<-kksfbc+0bb0<-kkspsc0+0f3c<-kksParseCursor+00d4<-opiosq0+0b10<-kpooprx+0168<-kpoal8+0400<-opiodr+0adc<-ttcpip+1004<-opitsk+1000<-opiino+0990<-opiodr+0adc<-opidrv+0474<-sou2o+0090<-opimai_real+01bc<-main+0098<-__start+0070
SQL> oradebug dump processstate 10;
Statement processed.
SQL>  oradebug dump systemstate 266;
Statement processed.
SQL> oradebug tracefile_name
/oracle/admin/orcl/udump/orcl_ora_90116.trc

查看生成的跟踪文件orcl_ora_90116.trc有如下内容:

SO: 7000001486ab188, type: 4, owner: 70000014346c5a8, flag: INIT/-/-/0x00
    (session) sid: 1020 trans: 0, creator: 70000014346c5a8, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0000-0000-00000000, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 0, prv: 0, sql: 7000001473dcf10, psql: 7000001225ac0c8, user: 82/GTP2
    O/S info: user: gtp-default, term: WIN-AUQ43P0UU9L, ospid: 6708:12196, machine: WORKGROUP\WIN-AUQ43P0UU9L
              program: w3wp.exe
    last wait for 'kksfbc child completion' blocking sess=0x0 seq=2831 wait_time=48850 seconds since wait started=572057
                =0, =0, =0
    Dumping Session Wait History
     for 'kksfbc child completion' count=1 wait_time=48850
                =0, =0, =0

可以从以上trace中看到会话确实曾长时间处于’kksfbc child completion’等待中,之后陷入无限自旋(spins)中消耗了大量CPU时间。但这里实际的表现又存有差异,引发无限循环的函数是kksfbc而不是kksSearchChildList(常规的调用序列是:kksParseCursor->kkspsc0->kksfbc ->kksSearchChildList->kkshgnc)。kksfbc意为K[Kernel]K[Kompile]S[Shared]F[Find]B[Best]C[Child]该函数用以在软解析时找寻合适的子游标,在10.2.0.2以后引入了mutex互斥体来取代原有的Cursor Pin机制,Mutex较Latch更为轻量级。虽然mutex的引入改变了众多cursor pin的内部机制,但kksfbc仍需要持有library cache latches才能扫描library cache hash chains。另一方面当kksfbc函数针对某个parent cursor找到合适child cursor后,可能使用KKSCHLPINx方法将该child cursor pin住,这个时候就需要exclusive地持有该child cursor相应的mutex。Oracle在10.2.0.4上提供了该Bug的one-off Patch
8575528,其在10.2.0.4 psu4以后的等价补丁为(Equivalent patch)为merge patch 9696904:8557428 9696904 7527908 Both fixes are needed. 6795880 superceded by 8575528 in 9696904 which includes extra files so may cause new conflicts。但merge patch 9696904目前仅有Linux x86/64平台上的版本,而问题数据库所在平台为IBM AIX on POWER Systems (64-bit),而且版本是10.2.0.1。那么要解决这个问题是不是没有办法了,其实不然,我们可以将数据库从10.2.0.1升级到10.2.0.5来解决这个BUG,在升级到10.2.0.5之后确实解决这个问题。

Oracle 11R2 Grid Infrastructure执行root.sh脚本rootcrs.pl execution failed的处理

Oracle 11.2.0.4在Redhat Linux 6.1上执行/u01/app/product/11.2.0/crs/root.sh脚本时报以下错误信息:

/u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed
FirstNode configuration failed at /u01/app/product/11.2.0/crs/crs/install/crsconfig_lib.pm line 9379.
/u01/app/product/11.2.0/crs/perl/bin/perl -I/u01/app/product/11.2.0/crs/perl/lib -I/u01/app/product/11.2.0/crs/crs/install /u01/app/product/11.2.0/crs/crs/install/rootcrs.pl execution failed

从上面的错误信息可以看到在执行srvctl start nodeapps -n bieku1时失败,尝试手动执行这个命令

[grid@beiku1 bin]$ ./srvctl start nodeapps -n beiku1
PRCR-1013 : Failed to start resource ora.ons
PRCR-1064 : Failed to start resource ora.ons on node beiku1
CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-2674: Start of 'ora.ons' on 'beiku1' failed

错误信息是Start of ‘ora.ons’ on ‘beiku1’ failed,那么来检查$ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_$HOSTNAME.log日志文件

[grid@beiku1 crs]$ cd $ORACLE_HOME/cfgtoollogs/crsconfig/
[grid@beiku1 crsconfig]$ ls -lrt
total 332
-rwxrwxr-x 1 grid oinstall  81336 Aug 26 15:36 srvmcfg0.log
-rwxrwxr-x 1 grid oinstall  18719 Aug 26 15:36 srvmcfg1.log
-rwxrwxr-x 1 grid oinstall  23213 Aug 26 15:36 srvmcfg2.log
-rwxrwxr-x 1 grid oinstall  24700 Aug 26 15:36 srvmcfg3.log
-rwxrwxr-x 1 grid oinstall  10705 Aug 26 15:36 srvmcfg4.log
-rwxrwxr-x 1 grid oinstall  25594 Aug 26 15:37 srvmcfg5.log
-rwxrwxr-x 1 grid oinstall 132771 Aug 26 15:37 rootcrs_beiku1.log
[grid@beiku1 crsconfig]$ cat rootcrs_beiku1.log
2015-08-26 15:36:52: J2EE (OC4J) Container Resource Add Wallet ... passed ...
2015-08-26 15:36:52: Running as user grid: /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate
2015-08-26 15:36:52: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate '
2015-08-26 15:36:54: Removing file /tmp/fileoriV8Q
2015-08-26 15:36:54: Successfully removed file: /tmp/fileoriV8Q
2015-08-26 15:36:54: /bin/su successfully executed

2015-08-26 15:36:54: qosctl output: User qosadmin added successfully.

User oc4jadmin added successfully.

2015-08-26 15:36:54: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin
2015-08-26 15:36:54: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin '
2015-08-26 15:36:55: Removing file /tmp/fileHsIIY7
2015-08-26 15:36:55: Successfully removed file: /tmp/fileHsIIY7
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin
2015-08-26 15:36:55: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin '
2015-08-26 15:36:55: Removing file /tmp/fileQXtLZo
2015-08-26 15:36:55: Successfully removed file: /tmp/fileQXtLZo
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: trace file=/u01/app/product/11.2.0/crs/cfgtoollogs/crsconfig/srvmcfg5.log
2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/srvctl add cvu
2015-08-26 15:36:55:   Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu" as user "grid"
2015-08-26 15:36:55: Executing /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: Executing cmd: /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:57: add cvu ... success
2015-08-26 15:36:57: starting nodeapps...
2015-08-26 15:36:57: DHCP_flag=0
2015-08-26 15:36:57: nodes_to_start=beiku1
2015-08-26 15:37:18: exit value of start nodeapps/vip is 1
2015-08-26 15:37:18: output for start nodeapps is  PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: output of startnodeapp after removing already started mesgs is PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: /u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed

检查I $GRID_HOME/opmn/logs/ons.log.*文件,看是否有以下错误:
1.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [internal] getaddrinfo(::0, 6200, 1) failed (Hostname and service name not provided or found): Connection timed out

如果存在上面的错误信息,那么原因就是/etc/hosts文件中localhost对应的IP地址不是127.0.0.1。解决方法如就是确保DNS和/etc/hosts文件正确设置了localhost,DNS或/etc/hosts文件依赖于(/etc/nsswitch.conf, or /etc/netsvc.conf depend on platform),这些配置文件中的命名解决方案的设置,可以参考MOS中的ID 942166.1 or ID 969254.1文档来进行处理。

2.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:37:02+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:39:42+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:39:42+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)

原因是有其它的进程占用的ONS服务的端口

[grid@beiku1 logs]$ grep port $ORACLE_HOME/opmn/conf/ons.config
localport=6100          # line added by Agent
remoteport=6200         # line added by Agent

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)

可以看到进程ID16413的ons进程占用了6200端口,解决方法是确保这个端口不被其它进行所占用,如果是在执行 rootupgrade.sh脚本进行升级之前被占用,那么可能的原因是旧版本的ons进程还在运行。

3.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] 0000:0000:0000:0000:0000:0000:0000:0001,6100: BIND (Cannot assign requested address)

这种情况可能是IPV6被部分配置了,11gR2 Grid Infrastructure不支持IPv6。解决方法就是在$GRID_HOME/opmn/conf/ons.config and ons.config.文件中设置下面的参数:
interface=ipv4

这里出现的错误是第2种,进程ID16413的ons进程占用了6200端口,解决方法是确保这个端口不被其它进行所占用

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)
[root@beiku1 /]# kill -9 16413

再重新执行root.sh脚本

[root@beiku1 /]# ./u01/app/product/11.2.0/crs/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/product/11.2.0/crs

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/product/11.2.0/crs/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
PRKO-2190 : VIP exists for node beiku1, VIP name beiku1-vip
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

在kill掉占用6200端口的进程之后,root.sh脚本可以成功执行。