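A RAC database that stores archive records would not start. The production environment is Linux RAC 11.2.0.4. The cause: a disk I/O testing tool damaged the disk headers of the ASM disks behind both the disk group that holds the OCR and the disk group that holds the data. The recovery went as follows: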
1. Check the CRS status:
[grid@darac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@darac1 crsd]# ps -ef|grep crs
root 3126 1 1 10:34 ? 00:00:31 /u01/app/product/11.2.0/crs/bin/ohasd.bin reboot
grid 3514 1 0 10:34 ? 00:00:07 /u01/app/product/11.2.0/crs/bin/oraagent.bin
grid 3525 1 0 10:34 ? 00:00:00 /u01/app/product/11.2.0/crs/bin/mdnsd.bin
grid 3537 1 0 10:34 ? 00:00:16 /u01/app/product/11.2.0/crs/bin/gpnpd.bin
grid 3549 1 1 10:34 ? 00:00:33 /u01/app/product/11.2.0/crs/bin/gipcd.bin
root 4128 1 0 10:54 ? 00:00:02 /u01/app/product/11.2.0/crs/bin/cssdmonitor
root 4144 1 0 10:54 ? 00:00:01 /u01/app/product/11.2.0/crs/bin/cssdagent
grid 4167 1 2 10:55 ? 00:00:14 /u01/app/product/11.2.0/crs/bin/ocssd.bin
root 4354 3680 0 11:04 pts/1 00:00:00 grep crs
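The status check in step 1 can be scripted. The following is a minimal sketch, assuming `crsctl check crs` prints one CRS message per line as it does in the transcript above. The helper name `crs_failed_checks` is hypothetical; it simply drops the lines that report a component as online and prints the rest.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Oracle's tooling): read the output of
# `crsctl check crs` on stdin and print only the checks that did NOT
# report "is online".
crs_failed_checks() {
  grep -E '^CRS-[0-9]+:' | grep -v 'is online'
}

# Intended usage on a cluster node (not executed here):
#   crsctl check crs | crs_failed_checks
```

On the node in this incident it would print the CRS-4535, CRS-4530 and CRS-4534 lines and suppress CRS-4638.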
2. Force-stop CRS:
[root@darac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'darac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'darac1'
CRS-2673: Attempting to stop 'ora.gipcd' on 'darac1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'darac1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'darac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'darac1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'darac1'
CRS-2677: Stop of 'ora.gpnpd' on 'darac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'darac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
3. Start CRS in exclusive mode (-excl -nocrs, so that CRSD itself is not started):
[root@darac1 bin]# ./crsctl start crs -excl -nocrs CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'darac1' CRS-2676: Start of 'ora.mdnsd' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'darac1' CRS-2676: Start of 'ora.gpnpd' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'darac1' CRS-2672: Attempting to start 'ora.gipcd' on 'darac1' CRS-2676: Start of 'ora.gipcd' on 'darac1' succeeded CRS-2676: Start of 'ora.cssdmonitor' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'darac1' CRS-2672: Attempting to start 'ora.diskmon' on 'darac1' CRS-2676: Start of 'ora.diskmon' on 'darac1' succeeded CRS-2676: Start of 'ora.cssd' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.drivers.acfs' on 'darac1' CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'darac1' CRS-2672: Attempting to start 'ora.ctssd' on 'darac1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'darac1' CRS-2676: Start of 'ora.ctssd' on 'darac1' succeeded CRS-2676: Start of 'ora.drivers.acfs' on 'darac1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'darac1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'darac1' CRS-2676: Start of 'ora.asm' on 'darac1' succeeded
4. Check what the GI alert.log says:
[ohasd(5040)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running). 2016-10-13 11:20:47.302: [gpnpd(5215)]CRS-2328:GPNPD started on node darac1. 2016-10-13 11:20:58.388: [ohasd(5040)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE 2016-10-13 11:21:00.608: [cssd(5318)]CRS-1713:CSSD daemon is started in clustered mode 2016-10-13 11:21:01.521: [/u01/app/product/11.2.0/crs/bin/orarootagent.bin(5304)]CRS-5013:Agent "/u01/app/product/11.2.0/crs/bin/orarootagent.bin" failed to start process "/u01/app/product/11.2.0/crs/bin/osysmond" for action "start": details at "(:CLSN00008:)" in "/u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/orarootagent_root//orarootagent_root.log" 2016-10-13 11:21:03.585: [ohasd(5040)]CRS-2878:Failed to restart resource 'ora.crf' 2016-10-13 11:21:05.399: [/u01/app/product/11.2.0/crs/bin/orarootagent.bin(5340)]CRS-5013:Agent "/u01/app/product/11.2.0/crs/bin/orarootagent.bin" failed to start process "/u01/app/product/11.2.0/crs/bin/osysmond" for action "start": details at "(:CLSN00008:)" in "/u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/orarootagent_root//orarootagent_root.log" 2016-10-13 11:21:10.703: [ohasd(5040)]CRS-2878:Failed to restart resource 'ora.crf' 2016-10-13 11:21:23.464: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:21:38.698: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:21:53.925: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:22:09.463: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 
seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:22:24.804: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:22:40.252: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:22:56.722: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:23:12.009: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:23:27.290: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:23:42.872: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:23:58.198: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:24:13.500: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:24:28.786: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:24:43.488: [client(5394)]CRS-1013:The OCR location in an ASM disk group is inaccessible. 
Details in /u01/app/product/11.2.0/crs/log/darac1/client/ocrcheck_5394.log. 2016-10-13 11:24:43.959: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:24:51.823: [client(5424)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/product/11.2.0/crs/log/darac1/client/crsctl_grid.log. 2016-10-13 11:24:59.345: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:25:14.526: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:25:29.696: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:25:44.860: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:26:00.042: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:26:15.218: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:26:30.409: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:26:45.577: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in 
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:26:49.031: [client(5460)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/product/11.2.0/crs/log/darac1/client/ocrconfig_5460.log. 2016-10-13 11:27:00.766: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:27:15.951: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:27:31.142: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:27:46.339: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:28:01.530: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:28:16.733: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:28:32.008: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:28:47.191: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:29:02.389: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in 
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:29:17.610: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:29:32.832: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:29:48.035: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:30:03.229: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:30:18.434: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:30:33.679: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:30:48.876: [cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:31:01.534: [/u01/app/product/11.2.0/crs/bin/cssdagent(5284)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/oracssdagent_root//oracssdagent_root.log. 2016-10-13 11:31:01.540: [cssd(5318)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log 2016-10-13 11:31:01.541: [cssd(5318)]CRS-1603:CSSD on node darac1 shutdown by user.
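The log above shows that CSSD cannot find any voting files (CRS-1714) and that the OCR location in the ASM disk group is inaccessible (CRS-1013).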
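With a log this repetitive, a per-code count makes the dominant failure obvious. The sketch below is one way to get it, assuming the alert log is a plain text file. The function name `summarize_crs_codes` is hypothetical, and the log path in the usage comment is inferred from the paths in the messages above, so treat it as an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how often each CRS message code appears in a
# Grid Infrastructure alert log and print "CODE COUNT", most frequent first.
summarize_crs_codes() {
  grep -oE 'CRS-[0-9]+' "$1" \
    | sort \
    | uniq -c \
    | sort -k1,1nr -k2,2 \
    | awk '{print $2, $1}'
}

# Intended usage (path is an assumption, not executed here):
#   summarize_crs_codes /u01/app/product/11.2.0/crs/log/darac1/alertdarac1.log
```

For the excerpt above, CRS-1714 would top the list by a wide margin, with CRS-1013 appearing a handful of times.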
5. The ASM alert.log still contains the statements that originally created the CRSDG and DATADG disk groups. For CRSDG:
Wed Dec 02 16:09:01 2015
SQL> CREATE DISKGROUP CRSDG EXTERNAL REDUNDANCY DISK '/dev/raw/raw1' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
6. Check the disk header:
[grid@darac1 ~]$ kfed read /dev/raw/raw1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 300392945 ; 0x00c: 0x11e7a1f1
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
B7F46200 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
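Note that in this output `kfbh.type` still reads KFBTYP_DISKHEAD even though the block is damaged, so the type field alone is not a reliable health signal. A minimal sketch of a stricter check follows; the function name `kfed_header_ok` is hypothetical, and the rule it applies (any `KFED-` error line means the header is bad) is an assumption drawn from this one transcript.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kfed read <disk>` output on stdin.
# Returns 0 only if the block is typed as a disk header AND kfed printed
# no KFED-nnnnn error; returns 1 otherwise.
kfed_header_ok() {
  local out
  out=$(cat)
  case $out in
    *KFED-[0-9]*) return 1 ;;   # kfed reported an error while parsing
  esac
  case $out in
    *KFBTYP_DISKHEAD*) return 0 ;;
  esac
  return 1
}

# Intended usage (not executed here):
#   kfed read /dev/raw/raw1 2>&1 | kfed_header_ok || echo "header damaged"
```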
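7. Try to repair the CRSDG disk header with kfed. The repair fails because the automatic header backup on the disk was damaged as well, and no manual backup had been taken:
[grid@darac1 ~]$ kfed repair /dev/raw/raw1
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]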
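Since the disk header could not be recovered from its automatic backup, the only remaining option was to recreate the disk group and restore the OCR from its automatic backup, as follows.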
8. Recreate the CRSDG disk group:
[grid@darac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 13 13:00:42 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select * from v$asm_diskgroup;
no rows selected
SQL> create diskgroup CRSDG external redundancy disk '/dev/raw/raw1' attribute 'COMPATIBLE.ASM' = '11.2.0.0.0';
Diskgroup created.
9. List the automatic OCR backups:
[root@darac1 bin]# ./ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
darac2 2016/10/13 06:29:53 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
darac2 2016/10/13 02:29:45 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup01.ocr
darac2 2016/10/12 22:29:37 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup02.ocr
darac2 2016/10/12 02:27:20 /u01/app/product/11.2.0/crs/cdata/darac-cluster/day.ocr
darac2 2016/10/11 22:27:10 /u01/app/product/11.2.0/crs/cdata/darac-cluster/week.ocr
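Picking the newest backup from this listing can be automated. The sketch below assumes each backup line has the form `node YYYY/MM/DD HH:MM:SS path`, as in the output above; the function name `latest_ocr_backup` is hypothetical. All of the backups here were written on darac2, so the chosen file must be readable from the node where the restore runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ocrconfig -showbackup` output on stdin and
# print the path of the most recent backup. Lines that do not carry a
# YYYY/MM/DD date in the second field (such as PROT-26) are ignored.
latest_ocr_backup() {
  awk 'NF >= 4 && $2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {print $2, $3, $4}' \
    | LC_ALL=C sort \
    | tail -n 1 \
    | awk '{print $3}'
}

# Intended usage as root (not executed here):
#   ./ocrconfig -showbackup | latest_ocr_backup
```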
10. Restore the OCR from the most recent backup:
[root@darac1 bin]# ./ocrconfig -restore /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
11. Recreate the voting disk in the new disk group:
[root@darac1 bin]# ./ocrconfig -restore /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
[root@darac1 bin]# ./crsctl replace votedisk +CRSDG
Successful addition of voting disk 44eaf86504ea4f76bfb43cb7931a3fc7.
Successfully replaced voting disk group with +CRSDG.
CRS-4266: Voting file(s) successfully replaced
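The sequence so far (steps 2, 3, 8, 10 and 11) can be collected into a dry-run checklist. The sketch below only prints the commands used in this post and executes nothing. The function name `print_ocr_recovery_plan` and its two parameters are hypothetical, and the order is simply the one followed here, not an official procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) the command sequence used in
# this post to rebuild the OCR/voting disk group.
#   $1 = disk group name without the leading '+', e.g. CRSDG
#   $2 = path of the OCR backup file to restore
print_ocr_recovery_plan() {
  local dg=$1 backup=$2
  cat <<EOF
crsctl stop crs -f
crsctl start crs -excl -nocrs
# as grid in sqlplus / as sysasm: create diskgroup ${dg} ...
ocrconfig -restore ${backup}
crsctl replace votedisk +${dg}
EOF
}

# Example:
#   print_ocr_recovery_plan CRSDG /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
```

Printing the plan first makes it easier to review the disk group name and backup file before anything destructive is run as root.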
12. Create the ASM spfile:
[grid@darac1 ~]$ vi /tmp/asm.txt
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile= 'EXCLUSIVE'
asm_diskstring = '/dev/raw/raw*'
asm_power_limit =1
[grid@darac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 13 13:40:02 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create spfile='+CRSDG' FROM pfile='/tmp/asm.txt';
File created.
13. Restart CRS:
[root@darac1 bin]# ./crsctl stop crs -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'darac1' CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'darac1' CRS-2673: Attempting to stop 'ora.ctssd' on 'darac1' CRS-2673: Attempting to stop 'ora.asm' on 'darac1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'darac1' CRS-2677: Stop of 'ora.ctssd' on 'darac1' succeeded CRS-2677: Stop of 'ora.mdnsd' on 'darac1' succeeded CRS-2677: Stop of 'ora.asm' on 'darac1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'darac1' CRS-2677: Stop of 'ora.drivers.acfs' on 'darac1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'darac1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'darac1' CRS-2677: Stop of 'ora.cssd' on 'darac1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'darac1' CRS-2677: Stop of 'ora.gipcd' on 'darac1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'darac1' CRS-2677: Stop of 'ora.gpnpd' on 'darac1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'darac1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@darac1 bin]# ./crsctl start crs CRS-4123: Oracle High Availability Services has been started. 
[grid@darac1 ~]$ crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.CRSDG.dg ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.DATADG.dg ONLINE OFFLINE darac1 ONLINE OFFLINE darac2 ora.LISTENER.lsnr ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.asm ONLINE ONLINE darac1 Started ONLINE ONLINE darac2 Started ora.gsd OFFLINE OFFLINE darac1 OFFLINE OFFLINE darac2 ora.net1.network ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.ons ONLINE ONLINE darac1 ONLINE OFFLINE darac2 ora.registry.acfs ONLINE ONLINE darac1 ONLINE ONLINE darac2 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE darac1 ora.cvu 1 ONLINE ONLINE darac1 ora.darac.db 1 ONLINE OFFLINE Corrupted Controlfi le 2 ONLINE OFFLINE Corrupted Controlfi le ora.darac1.vip 1 ONLINE ONLINE darac1 ora.darac2.vip 1 ONLINE ONLINE darac2 ora.darac3.vip 1 ONLINE OFFLINE ora.oc4j 1 ONLINE OFFLINE STARTING ora.scan1.vip 1 ONLINE ONLINE darac1
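The output above shows that the DATADG disk group is not mounted and that the darac database has not started; its state detail reads "Corrupted Controlfile". alert_asm1.log contains the statement that originally created the disk group: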
Wed Dec 02 18:27:46 2015 SQL> CREATE DISKGROUP DATADG EXTERNAL REDUNDANCY DISK '/dev/raw/raw3' SIZE 10240M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
14. Check the disk group status (DATADG is missing from the list):
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
ARCH MOUNTED
15. Mounting the DATADG disk group manually fails:
SQL> alter diskgroup DATADG mount;
alter diskgroup DATADG mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATADG" cannot be mounted
ORA-15040: diskgroup is incomplete
16. Check the header status of the disks; /dev/raw/raw3 shows as CANDIDATE and has no disk name:
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
/dev/raw/raw3 CANDIDATE
ARCH_0000 /dev/raw/raw2 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
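When there are many disks, the ones whose header is no longer MEMBER can be filtered out of this query output. The sketch below assumes the three-column layout shown above, where the last two whitespace-separated fields are the path and the header status. The function name `non_member_disks` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the output of
#   select name,path,header_status from v$asm_disk;
# on stdin and print "PATH STATUS" for every disk whose header status is
# not MEMBER. Header, separator and prompt lines are skipped because
# their second-to-last field does not start with '/'.
non_member_disks() {
  awk 'NF >= 2 && $(NF-1) ~ /^\// && $NF != "MEMBER" {print $(NF-1), $NF}'
}
```

Fed the output of step 16, it prints a single line: `/dev/raw/raw3 CANDIDATE`.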
17. Try to repair the disk header from its automatic backup. For the DATADG disk the repair succeeds:
[grid@darac1 ~]$ kfed repair /dev/raw/raw3
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
DATADG DISMOUNTED
ARCH MOUNTED
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
/dev/raw/raw3 MEMBER
ARCH_0000 /dev/raw/raw2 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
18. Mount the DATADG disk group manually again; this time it succeeds:
SQL> alter diskgroup DATADG mount;
Diskgroup altered.
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
DATADG MOUNTED
ARCH MOUNTED
19. Check the header status of the disks again; /dev/raw/raw3 is now MEMBER and carries the name DATADG_0000:
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
ARCH_0000 /dev/raw/raw2 MEMBER
DATADG_0000 /dev/raw/raw3 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
20. Start the darac database:
[grid@darac1 ~]$ srvctl start database -d darac [grid@darac1 ~]$ crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.ARCH.dg ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.CRSDG.dg ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.DATADG.dg ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.LISTENER.lsnr ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.asm ONLINE ONLINE darac1 Started ONLINE ONLINE darac2 Started ora.gsd OFFLINE OFFLINE darac1 OFFLINE OFFLINE darac2 ora.net1.network ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.ons ONLINE ONLINE darac1 ONLINE ONLINE darac2 ora.registry.acfs ONLINE ONLINE darac1 ONLINE ONLINE darac2 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE darac1 ora.cvu 1 ONLINE ONLINE darac1 ora.darac.db 1 ONLINE ONLINE darac1 Open 2 ONLINE ONLINE darac2 Open ora.darac1.vip 1 ONLINE ONLINE darac1 ora.darac2.vip 1 ONLINE ONLINE darac2 ora.darac3.vip 1 ONLINE OFFLINE ora.oc4j 1 ONLINE ONLINE darac1 ora.scan1.vip 1 ONLINE ONLINE darac1
At this point the database has been recovered successfully.
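1 In step 13, DATADG is recreated. Doesn't that risk wiping out the business data that was already on the disk group?
2 In step 8, after recreating the CRSDG disk group, couldn't you simply run kfed repair /dev/raw/raw1? In other words, what is the point of step 9?
Or is it that kfed only works while the CRS disk group is healthy, and once the CRS disk group has failed you have to use this command instead: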
./crsctl replace votedisk +CRSDG
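3 In step 7, kfed repair /dev/raw/raw1 failed, yet in step 17 the automatically backed-up disk header could be used for the recovery. Why?
In step 7, kfed repair /dev/raw/raw1 failed, yet in step 17 the automatically backed-up disk header could be used for the recovery. Why?
Because for the DATADG disk group repaired in step 17, the I/O testing tool had not destroyed the automatic backup copy of the disk header on /dev/raw/raw3.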
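Some background that may help here: to my knowledge, from 11.1.0.7 onward ASM keeps a backup copy of each disk header in the second-to-last metadata block of allocation unit 1, and `kfed repair` restores block 0 from that copy. The sketch below computes where that block sits so it can be inspected before attempting a repair. The function names are hypothetical, and the 4096-byte metadata block size is an assumed default; the 1M allocation unit comes from the CREATE DISKGROUP statements in the ASM alert.log.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. Assumption: the backup header lives in the
# second-to-last metadata block of allocation unit 1.
#   $1 = allocation unit size in bytes
#   $2 = ASM metadata block size in bytes (assumed default 4096)
backup_header_blkn() {
  local au_bytes=$1 blk_bytes=${2:-4096}
  echo $(( au_bytes / blk_bytes - 2 ))
}

# Print (do not run) the kfed command that reads the backup header.
#   $1 = disk path, $2 = AU size in bytes, $3 = optional block size
kfed_backup_read_cmd() {
  echo "kfed read $1 aun=1 blkn=$(backup_header_blkn "$2" "${3:-4096}")"
}

# Example for the 1M allocation unit used in this post:
#   kfed_backup_read_cmd /dev/raw/raw3 1048576
```

If reading that block shows a valid KFBTYP_DISKHEAD, `kfed repair` has something to restore from; on /dev/raw/raw1 the backup block had evidently been overwritten too.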
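In step 8, after recreating the CRSDG disk group, couldn't you simply run kfed repair /dev/raw/raw1? In other words, what is the point of step 9?
Or is it that kfed only works while the CRS disk group is healthy, and once the CRS disk group has failed you have to use this command instead: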
./crsctl replace votedisk +CRSDG
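Step 9 lists the automatic OCR backups. The OCR is backed up every four hours, and the restore needs the name of a specific backup file.
After the CRSDG disk group has been recreated, the OCR can only be restored from a backup file.
As for step 13: the CREATE DISKGROUP DATADG statement shown there is an entry from the ASM log file, which I looked up because the disk group would not mount. It is a log record, not one of my recovery operations.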