数据守护实时主备搭建
下列机器事先都安装了DM,安装路径为’/dm8’,执行程序保存在’/dm8/bin’目录中,数据存放路径为’/dm8/data’。
各主备库的实例名建议采用“组名_守护环境_序号”的方式命名,方便按组区分不同实例,注意总长度不能超过16。本示例中组名为“GRP1”,配置为实时主备,主库命名为“GRP1_RT_01”,备库命名为“GRP1_RT_02”。
机器名 IP地址 初始状态 操作系统 dm209 10.10.13.209(对外) 主库 dm1 redhat 7.8 11.11.11.209(mal对内) dm210 10.10.13.210(对外) 11.11.11.210(mal对内) 备库 dm2 redhat 7.8 dm211 11.11.11.211 确认监视器 redhat 7.8 实例名 port_num mal_inst_dw_port mal_host mal_port mal_dw_port dm1 5236 5239 11.11.11.209 5237 5238 dm2 5236 5239 11.11.11.210 5237 5238
数据准备
在主库机器上初始化数据库到目录/dm8/data:
[dmdba@dm209 dm8]$ dminit path=/dm8/data db_name=dm instance_name=dm1 port_num=5236 page_size=8 charset=0 initdb V8 db version: 0x7000c file dm.key not found, use default license! License will expire on 2022-10-21 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL log file path: /dm8/data/dm/dm01.log log file path: /dm8/data/dm/dm02.log write to dir [/dm8/data/dm]. create dm database success. 2022-01-13 10:28:01
注册服务用于启动数据库
[root@dm209 root]# ./dm_service_installer.sh -dm_ini /dm8/data/dm/dm.ini -p dm1 -t dmserver Created symlink from /etc/systemd/system/multi-user.target.wants/DmServicedm1.service to /usr/lib/systemd/system/DmServicedm1.service. Finished to create the service (DmServicedm1)
正常启动数据库并正常关闭
[root@dm209 root]# service DmServicedm1 start Redirecting to /bin/systemctl start DmServicedm1.service [root@dm209 root]# ps -ef | grep dmserver dmdba 18402 1 17 10:33 ? 00:00:04 /dm8/bin/dmserver path=/dm8/data/dm/dm.ini -noconsole root 18487 4030 0 10:34 pts/1 00:00:00 grep --color=auto dmserver [root@dm209 ~]# su - dmdba Last login: Thu Jan 13 10:16:30 CST 2022 on pts/1 [dmdba@dm209 ~]$ disql SYSDBA/SYSDBA@localhost:5236 Server[localhost:5236]:mode is normal, state is open login used time : 7.108(ms) disql V8 SQL> exit [root@dm209 ~]# service DmServicedm1 stop Redirecting to /bin/systemctl stop DmServicedm1.service
注册服务用于将数据库启动到mount状态
[root@dm209 root]# ./dm_service_installer.sh -dm_ini /dm8/data/dm/dm.ini -p dm1mount -t dmserver -m mount Created symlink from /etc/systemd/system/multi-user.target.wants/DmServicedm1mount.service to /usr/lib/systemd/system/DmServicedm1mount.service. Finished to create the service (DmServicedm1mountT)
备份主库
使用联机备份:
[root@dm209 root]# ./dm_service_installer.sh -t dmap -dm_ini /dm8/data/dm/dm.ini Created symlink from /etc/systemd/system/multi-user.target.wants/DmAPService.service to /usr/lib/systemd/system/DmAPService.service. Finished to create the service (DmAPService) [root@dm209 root]# service DmAPService start Redirecting to /bin/systemctl start DmAPService.service SQL> BACKUP DATABASE FULL BACKUPSET '/dm8/data/dm_full_01'; executed successfully used time: 00:00:02.521. Execute id is 501.
创建备库备库并恢复
[dmdba@dm210 data]$ dminit path=/dm8/data db_name=dm instance_name=dm2 port_num=5236 page_size=8 charset=0 initdb V8 db version: 0x7000c file dm.key not found, use default license! License will expire on 2022-10-21 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL log file path: /dm8/data/dm/dm01.log log file path: /dm8/data/dm/dm02.log write to dir [/dm8/data/dm]. create dm database success. 2022-01-13 16:02:51
将主库的备份复制备库:
[root@dm209 data]# scp -r dm_full_01 10.10.13.210:/dm8/data root@10.10.13.210's password: dm_full_01.bak 100% 15MB 30.7MB/s 00:00 dm_full_01_1.bak 100% 164KB 6.2MB/s 00:00 dm_full_01.meta 100% 85KB 3.6MB/s 00:00 [root@dm209 data]#
恢复备库:
[dmdba@dm210 data]$ dmrman CTLSTMT="RESTORE DATABASE '/dm8/data/dm/dm.ini' FROM BACKUPSET '/dm8/data/dm_full_01'" dmrman V8 RESTORE DATABASE '/dm8/data/dm/dm.ini' FROM BACKUPSET '/dm8/data/dm_full_01' file dm.key not found, use default license! Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL [Percent:100.00%][Speed:0.00M/s][Cost:00:00:06][Remaining:00:00:00] restore successfully. time used: 00:00:06.629 [dmdba@dm210 ~]$ dmrman CTLSTMT="RECOVER DATABASE '/dm8/data/dm/dm.ini' FROM BACKUPSET '/dm8/data/dm_full_01'" dmrman V8 RECOVER DATABASE '/dm8/data/dm/dm.ini' FROM BACKUPSET '/dm8/data/dm_full_01' file dm.key not found, use default license! Database mode = 0, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[25922], file_lsn[25922] [Percent:100.00%][Speed:0.00PKG/s][Cost:00:00:00][Remaining:00:00:00] recover successfully! time used: 00:00:02.895 [dmdba@dm210 ~]$ dmrman CTLSTMT="RECOVER DATABASE '/dm8/data/dm/dm.ini' UPDATE DB_MAGIC" dmrman V8 RECOVER DATABASE '/dm8/data/dm/dm.ini' UPDATE DB_MAGIC file dm.key not found, use default license! Database mode = 0, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[26568], file_lsn[26568] recover successfully! time used: 00:00:01.093
配置操作
一.主库
配置dm.ini文件,配置以下参数:
DW_INACTIVE_INTERVAL = 60 #接收守护进程消息超时时间 ALTER_MODE_STATUS = 0 #不允许手工方式修改实例模式/状态 ENABLE_OFFLINEi_TS = 2 #不允许备库OFFLINE表空间 MAL_INI = 1 #打开MAL系统 ARCH_INI = 1 #打开归档配置 RLOG_SEND_APPLY_MON = 64 #统计最近64次的日志发送信息
配置dmmal.ini文件
配置MAL系统,各主备库的dmmal.ini配置必须完全一致,MAL_HOST使用内部网络IP,MAL_PORT与dm.ini中PORT_NUM使用不同的端口值,MAL_DW_PORT是各实例对应的守护进程之间,以及守护进程和监视器之间的通信端口,配置如下:
MAL_CHECK_INTERVAL = 5 #MAL链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 5 #判定MAL链路断开的时间 [MAL_INST1] MAL_INST_NAME = dm1 #实例名,和dm.ini中的INSTANCE_NAME一致 MAL_HOST = 11.11.11.209 #MAL系统监听TCP连接的IP地址 MAL_PORT = 5237 #MAL系统监听TCP连接的端口 MAL_INST_HOST = 10.10.13.209 #实例的对外服务IP地址 MAL_INST_PORT = 5236 #实例的对外服务端口,和dm.ini中的PORT_NUM一致 MAL_DW_PORT = 5238 #实例本地的守护进程监听TCP连接的端口 MAL_INST_DW_PORT = 5239 #实例监听守护进程TCP连接的端口 [MAL_INST2] MAL_INST_NAME = dm2 #实例名,和dm.ini中的INSTANCE_NAME一致 MAL_HOST = 11.11.11.210 #MAL系统监听TCP连接的IP地址 MAL_PORT = 5237 #MAL系统监听TCP连接的端口 MAL_INST_HOST = 10.10.13.210 #实例的对外服务IP地址 MAL_INST_PORT = 5236 #实例的对外服务端口,和dm.ini中的PORT_NUM一致 MAL_DW_PORT = 5238 #实例本地的守护进程监听TCP连接的端口 MAL_INST_DW_PORT = 5239 #实例监听守护进程TCP连接的端口
[dmdba@dm209 dm]$ vi dmmal.ini MAL_CHECK_INTERVAL = 5 MAL_CONN_FAIL_INTERVAL = 5 [MAL_INST1] MAL_INST_NAME = dm1 MAL_HOST = 11.11.11.209 MAL_PORT = 5237 MAL_INST_HOST = 10.10.13.209 MAL_INST_PORT = 5236 MAL_DW_PORT = 5238 MAL_INST_DW_PORT = 5239 [MAL_INST2] MAL_INST_NAME = dm2 MAL_HOST = 11.11.11.210 MAL_PORT = 5237 MAL_INST_HOST = 10.10.13.210 MAL_INST_PORT = 5236 MAL_DW_PORT = 5238 MAL_INST_DW_PORT = 5239
配置dmarch.ini
修改dmarch.ini,配置本地归档和实时归档。除了本地归档外,其他归档配置项中的ARCH_DEST表示实例是Primary模式时,需要同步归档数据的目标实例名。当前实例dm1是主库,需要向dm2(实时备库)同步数据,因此实时归档的ARCH_DEST配置为DM2。
[ARCHIVE_REALTIME] #实时归档 ARCH_TYPE = REALTIME #实时归档类型 ARCH_DEST = dm2 #实时归档目标实例名 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL #本地归档类型 ARCH_DEST = /dm8/data/dm/arch #本地归档文件存放路径 ARCH_FILE_SIZE = 128 #单位Mb,本地单个归档文件最大值 ARCH_SPACE_LIMIT = 0 #单位Mb,0表示无限制,范围1024~4294967294M [dmdba@dm209 dm]$ vi dmarch.ini [ARCHIVE_REALTIME] ARCH_TYPE = REALTIME ARCH_DEST = dm2 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ARCH_DEST = /dm8/data/dm/arch ARCH_FILE_SIZE = 128 ARCH_SPACE_LIMIT = 0
配置dmwatcher.ini
修改dmwatcher.ini配置守护进程,配置为全局守护类型,使用自动切换模式。
[GRP1] DW_TYPE = GLOBAL #全局守护类型 DW_MODE = AUTO #自动切换模式 生产建议设置成手动 DW_ERROR_TIME = 10 #远程守护进程故障认定时间 INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间 INST_ERROR_TIME = 10 #本地实例故障认定时间 INST_OGUID = 111111 #守护系统唯一OGUID值 INST_INI = /dm8/data/dm/dm.ini #dm.ini配置文件路径 INST_AUTO_RESTART = 1 #打开实例的自动启动功能(建议设置成0) INST_STARTUP_CMD = /dm8/bin/dmserver #命令行方式启动 RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阀值,默认关闭 RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阀值,默认关闭 [dmdba@dm209 dm]$ vi dmwatcher.ini [GRP1] DW_TYPE = GLOBAL DW_MODE = AUTO DW_ERROR_TIME = 10 INST_RECOVER_TIME = 60 INST_ERROR_TIME = 10 INST_OGUID = 111111 INST_INI = /dm8/data/dm/dm.ini INST_AUTO_RESTART = 1 INST_STARTUP_CMD = /dm8/bin/dmserver RLOG_SEND_THRESHOLD = 0 RLOG_APPLY_THRESHOLD = 0
以mount模式启动主库
[root@dm209 root]# service DmServicedm1 stop Redirecting to /bin/systemctl stop DmServicedm1.service [root@dm209 root]# service DmServicedm1mountT start Redirecting to /bin/systemctl start DmServicedm1mountT.service
设置主库 OGUID
SQL> sp_set_oguid(111111); DMSQL executed successfully used time: 115.044(ms). Execute id is 0.
修改主数据库模式
SQL> alter database primary; executed successfully used time: 80.002(ms). Execute id is 0.
将主库的dm.ini,dmmal.ini,dmarch.ini,dmwatcher.ini文件复制到备库
[root@dm209 dm]# scp dm*.ini 10.10.13.210:/dm8/data/dm root@10.10.13.210's password: dmarch.ini 100% 346 18.3KB/s 00:00 dm.ini 100% 52KB 29.9MB/s 00:00 dmmal.ini 100% 547 16.9KB/s 00:00 dmwatcher.ini 100% 568 332.1KB/s 00:00 [root@dm209 dm]#
二备库
备机修改相关配置
修改dm.ini
INSTANCE_NAME = DM2
dmmal.ini与主库一致不用修改
修改dmarch.ini
ARCH_DEST = DM1
dmwatcher.ini 和主库一致
注册服务用于启动数据库到mount状态
[root@dm210 root]# ./dm_service_installer.sh -dm_ini /dm8/data/dm/dm.ini -p dm2mount -t dmserver -m mount Created symlink from /etc/systemd/system/multi-user.target.wants/DmServicedm2mount.service to /usr/lib/systemd/system/DmServicedm2mount.service. Finished to create the service (DmServicedm2mount)
注册服务用于启动数据库
[root@dm210 root]# ./dm_service_installer.sh -dm_ini /dm8/data/dm/dm.ini -p dm2 -t dmserver Created symlink from /etc/systemd/system/multi-user.target.wants/DmServicedm2.service to /usr/lib/systemd/system/DmServicedm2.service. Finished to create the service (DmServicedm2)
以mount方式启动备库
[root@dm210 root]# service DmServicedm2mount start Redirecting to /bin/systemctl start DmServicedm2mount.service
设置备库 OGUID
SQL> sp_set_oguid(111111); DMSQL executed successfully used time: 115.044(ms). Execute id is 0.
修改备数据库模式
如果当前数据库不是normal模式,需要先修改dm.ini中ALTER_MODE_STATUS值为1,允许修改数据库模式,修改Standby模式成功后再改回为0。如果是normal模式,请忽略下面的第1步和第3步。
SQL> alter database standby; alter database standby; [-720]:Error in line: 1 Dmwatcher is active, or current configuration(ALTER_MODE_STATUS) not allowed to alter database. used time: 0.636(ms). Execute id is 0. SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1); DMSQL executed successfully used time: 7.791(ms). Execute id is 1. SQL> alter database standby; executed successfully used time: 94.298(ms). Execute id is 0. SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0); DMSQL executed successfully used time: 8.414(ms). Execute id is 2.
注册并启动守护进程
主库
[root@dm209 root]# ./dm_service_installer.sh -watcher_ini /dm8/data/dm/dmwatcher.ini -p dw1 -t dmwatcher Created symlink from /etc/systemd/system/multi-user.target.wants/DmWatcherServicedw1.service to /usr/lib/systemd/system/DmWatcherServicedw1.service. Finished to create the service (DmWatcherServicedw1) [root@dm209 root]# service DmWatcherServicedw1 start Redirecting to /bin/systemctl start DmWatcherServicedw1.service
备库
[root@dm210 root]# ./dm_service_installer.sh -watcher_ini /dm8/data/dm/dmwatcher.ini -p dw2 -t dmwatcher Created symlink from /etc/systemd/system/multi-user.target.wants/DmWatcherServicedw2.service to /usr/lib/systemd/system/DmWatcherServicedw2.service. Finished to create the service (DmWatcherServicedw2) [root@dm210 root]# service DmWatcherServicedw2 start Redirecting to /bin/systemctl start DmWatcherServicedw2.service
查看file_lsn与cur_lsn主备库是否一致
主库
[dmdba@dm209 arch]$ disql SYSDBA/SYSDBA Server[LOCALHOST:5236]:mode is primary, state is mount login used time : 4.992(ms) disql V8 SQL> select file_LSN, cur_LSN from v$rlog; LINEID FILE_LSN CUR_LSN ---------- -------------------- -------------------- 1 28059 28059 used time: 4.592(ms). Execute id is 700.
备库
[dmdba@dm210 ~]$ disql SYsDBA/SYSDBA Server[LOCALHOST:5236]:mode is standby, state is open login used time : 5.676(ms) disql V8 SQL> select file_LSN, cur_LSN from v$rlog; LINEID FILE_LSN CUR_LSN ---------- -------------------- -------------------- 1 28059 28059 used time: 3.306(ms). Execute id is 200.
配置监视器(基本要求,安装dm8的软件)
由于主库和实时备库的守护进程配置为自动切换模式,因此这里选择配置确认监视器。和普通监视器相比,确认监视器除了相同的命令支持外,在主库发生故障时,能够自动通知实时备库接管为新的主库,具有自动故障处理的功能。
修改dmmonitor.ini配置确认监视器,其中MON_DW_IP中的IP和PORT和dmmal.ini中的MAL_HOST和MAL_DW_PORT配置项保持一致。
MON_DW_CONFIRM = 1 #确认监视器模式 MON_LOG_PATH = /dm8/data/log #监视器日志文件存放路径 MON_LOG_INTERVAL = 60 #每隔60s定时记录系统信息到日志文件 MON_LOG_FILE_SIZE = 32 #每个日志文件最大32M MON_LOG_SPACE_LIMIT = 0 #不限定日志文件总占用空间 [GRP1] MON_INST_OGUID = 111111 #组GRP1的唯一OGUID值 #以下配置为监视器到组GRP1的守护进程的连接信息,以“IP:PORT”的形式配置 #IP对应dmmal.ini中的MAL_HOST,PORT对应dmmal.ini中的MAL_DW_PORT MON_DW_IP = 11.11.11.209:5238 MON_DW_IP = 11.11.11.210:5238 [dmdba@dm211 data]$ vi dmmonitor.ini MON_DW_CONFIRM = 1 MON_LOG_PATH = /dm8/data/log MON_LOG_INTERVAL = 60 MON_LOG_FILE_SIZE = 32 MON_LOG_SPACE_LIMIT = 0 [GRP1] MON_INST_OGUID = 111111 MON_DW_IP = 11.11.11.209:5238 MON_DW_IP = 11.11.11.210:5238
启动监视器:
[dmdba@dm211 data]$ dmmonitor /dm8/data/dmmonitor.ini [monitor] 2022-01-14 10:43:58: DMMONITOR[4.0] V8 [monitor] 2022-01-14 10:43:58: DMMONITOR[4.0] IS READY. [monitor] 2022-01-14 10:43:58: Received message from(DM1) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 10:43:57 OPEN OK DM1 OPEN PRIMARY VALID 3 28059 28059 [monitor] 2022-01-14 10:43:58: Received message from(DM2) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 10:43:57 OPEN OK DM2 OPEN STANDBY VALID 3 28059 28059 show 2022-01-14 10:44:26 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 111111 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.209 5238 2022-01-14 10:44:24 GLOBAL VALID OPEN DM1 OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.209 5236 OK DM1 OPEN PRIMARY 0 0 REALTIME VALID 4264 28059 4264 28059 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.210 5238 2022-01-14 10:44:25 GLOBAL VALID OPEN DM2 OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.210 5236 OK DM2 OPEN STANDBY 0 0 REALTIME VALID 4210 28059 4210 28059 NONE DATABASE(DM2) APPLY INFO FROM (DM1), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4264, 4264, 4264], (RLSN, SLSN, KLSN)[28059, 28059, 28059], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (28059) #================================================================================#
或者[root@ora19c root]# ./dm_service_installer.sh -i /dm8/data/dmmonitor.ini -p DM -t dmmonitor
ln -s '/usr/lib/systemd/system/DmMonitorServiceDM.service' '/etc/systemd/system/multi-user.target.wants/DmMonitorServiceDM.service' Finished to create the service (DmMonitorServiceDM)
使用服务方式启动不能执行相关命令来显示监控信息所以一般不使用
主备同步测试
主库:
[dmdba@dm209 arch]$ disql SYSDBA/SYSDBA@localhost:5236 Server[localhost:5236]:mode is primary, state is open login used time : 6.172(ms) disql V8 SQL> create table t1(id int); executed successfully used time: 60.028(ms). Execute id is 800. SQL> insert into t1 values(1); affect rows 1 used time: 1.253(ms). Execute id is 801. SQL> commit; executed successfully used time: 2.770(ms). Execute id is 802. SQL> select * from t1; LINEID ID ---------- ----------- 1 1 used time: 1.224(ms). Execute id is 803.
备库:
[dmdba@dm210 ~]$ disql SYSDBA/SYSDBA@localhost:5236 Server[localhost:5236]:mode is standby, state is open login used time : 5.946(ms) disql V8 SQL> select * from t1; LINEID ID ---------- ----------- 1 1 used time: 7.131(ms). Execute id is 300.
主库:
SQL> update t1 set id=2; affect rows 1 used time: 1.081(ms). Execute id is 804. SQL> commit; executed successfully used time: 2.552(ms). Execute id is 805. SQL> select * from t1; LINEID ID ---------- ----------- 1 2 used time: 0.435(ms). Execute id is 806.
备库:
SQL> select * from t1; LINEID ID ---------- ----------- 1 2 used time: 2.014(ms). Execute id is 301.
主备切换测试
操作直接在监控器里执行。
Switchover 切换
show 2022-01-14 11:15:30 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 111111 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.209 5238 2022-01-14 11:15:29 GLOBAL VALID OPEN DM1 OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.209 5236 OK DM1 OPEN PRIMARY 0 0 REALTIME VALID 4288 28136 4288 28136 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.210 5238 2022-01-14 11:15:29 GLOBAL VALID OPEN DM2 OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.210 5236 OK DM2 OPEN STANDBY 0 0 REALTIME VALID 4210 28136 4210 28136 NONE DATABASE(DM2) APPLY INFO FROM (DM1), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4288, 4288, 4288], (RLSN, SLSN, KLSN)[28136, 28136, 28136], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (28136) #================================================================================# choose switchover grp1 Can choose one of the following instances to do switchover: 1: DM2 switchover grp1.dm2 [monitor] 2022-01-14 11:16:27: Not login dmmonitor or server public key changed, please try to login again! login username:SYSDBA password: [monitor] 2022-01-14 11:17:03: Login dmmonitor success! switchover grp1.dm2 [monitor] 2022-01-14 11:17:23: Start to switchover instance DM2 [monitor] 2022-01-14 11:17:23: Notify dmwatcher(DM1) switch to SWITCHOVER status [monitor] 2022-01-14 11:17:23: Dmwatcher process DM1 status switching [OPEN-->SWITCHOVER] [monitor] 2022-01-14 11:17:23: Switch dmwatcher DM1 to SWITCHOVER status success [monitor] 2022-01-14 11:17:23: Notify dmwatcher(DM2) switch to SWITCHOVER status [monitor] 2022-01-14 11:17:23: Dmwatcher process DM2 status switching [OPEN-->SWITCHOVER] [monitor] 2022-01-14 11:17:25: Switch dmwatcher DM2 to SWITCHOVER status success [monitor] 2022-01-14 11:17:25: Instance DM1 start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) [monitor] 2022-01-14 11:17:25: Instance DM1 execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success [monitor] 2022-01-14 11:17:25: Instance DM2 start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) [monitor] 2022-01-14 11:17:25: Instance DM2 execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success [monitor] 2022-01-14 11:17:25: Instance DM1 start to execute sql ALTER DATABASE MOUNT [monitor] 2022-01-14 11:17:25: Instance DM1 execute sql ALTER DATABASE MOUNT success [monitor] 2022-01-14 11:17:25: Instance DM2 start to execute sql SP_APPLY_KEEP_PKG() [monitor] 2022-01-14 11:17:25: Instance DM2 execute sql SP_APPLY_KEEP_PKG() success [monitor] 2022-01-14 11:17:25: Instance DM2 start to execute sql ALTER DATABASE MOUNT [monitor] 2022-01-14 11:17:25: Instance DM2 execute sql ALTER DATABASE MOUNT success [monitor] 2022-01-14 11:17:25: Instance DM1 start to execute sql ALTER DATABASE STANDBY [monitor] 2022-01-14 11:17:25: Instance DM1 execute sql ALTER DATABASE STANDBY success [monitor] 2022-01-14 11:17:25: Instance DM2 start to execute sql ALTER DATABASE PRIMARY [monitor] 2022-01-14 11:17:26: Instance DM2 execute sql ALTER DATABASE PRIMARY success [monitor] 2022-01-14 11:17:26: Notify instance DM2 to change all arch status to be invalid [monitor] 2022-01-14 11:17:26: Succeed to change all instances arch status to be invalid [monitor] 2022-01-14 11:17:26: Instance DM1 start to execute sql ALTER DATABASE OPEN FORCE [monitor] 2022-01-14 11:17:26: Instance DM1 execute sql ALTER DATABASE OPEN FORCE success [monitor] 2022-01-14 11:17:26: Instance DM2 start to execute sql ALTER DATABASE OPEN FORCE [monitor] 2022-01-14 11:17:26: Instance DM2 execute sql ALTER DATABASE OPEN FORCE success [monitor] 2022-01-14 11:17:26: Instance DM1 start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) [monitor] 2022-01-14 11:17:26: Instance DM1 execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success [monitor] 2022-01-14 11:17:26: Instance DM2 start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) [monitor] 2022-01-14 11:17:26: Instance DM2 execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success [monitor] 2022-01-14 11:17:26: Notify dmwatcher(DM1) switch to OPEN status [monitor] 2022-01-14 11:17:26: Dmwatcher process DM1 status switching [SWITCHOVER-->OPEN] [monitor] 2022-01-14 11:17:27: Switch dmwatcher DM1 to OPEN status success [monitor] 2022-01-14 11:17:27: Notify dmwatcher(DM2) switch to OPEN status [monitor] 2022-01-14 11:17:28: Dmwatcher process DM2 status switching [SWITCHOVER-->OPEN] [monitor] 2022-01-14 11:17:28: Switch dmwatcher DM2 to OPEN status success [monitor] 2022-01-14 11:17:28: Notify group(GRP1)'s dmwatcher to do clear [monitor] 2022-01-14 11:17:29: Clean request of dmwatcher processer DM1 success [monitor] 2022-01-14 11:17:29: Clean request of dmwatcher processer DM2 success [monitor] 2022-01-14 11:17:29: Switchover instance DM2 success 2022-01-14 11:17:29 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 111111 TRUE AUTO FALSE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.210 5238 2022-01-14 11:17:28 GLOBAL VALID OPEN DM2 OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.210 5236 OK DM2 OPEN PRIMARY 0 0 REALTIME VALID 4289 29494 4289 29494 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.209 5238 2022-01-14 11:17:28 GLOBAL VALID OPEN DM1 OK 1 1 OPEN STANDBY DSC_OPEN REALTIME INVALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.209 5236 OK DM1 OPEN STANDBY 0 0 REALTIME INVALID 4288 28136 4288 28136 NONE DATABASE(DM1) APPLY INFO FROM (DM2), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4288, 4288, 4288], (RLSN, SLSN, KLSN)[28136, 28136, 28136], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (28136) #================================================================================# [monitor] 2022-01-14 11:17:30: Dmwatcher process DM2 status switching [OPEN-->RECOVERY] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 11:17:30 RECOVERY OK DM2 OPEN PRIMARY VALID 4 29494 29494 [monitor] 2022-01-14 11:17:33: Dmwatcher process DM2 status switching [RECOVERY-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 11:17:32 OPEN OK DM2 OPEN PRIMARY VALID 4 29494 29494
Takeover 接管
choose takeover grp1 Group(grp1) has active primary instance, not get standby instance list any more!
将主库网络直接中断:
[root@dm210 data]# systemctl stop network #此时可以接管了: [monitor] 2022-01-14 14:36:50: Received message timeout from(DM2) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:36:39 ERROR OK DM2 OPEN PRIMARY VALID 4 29501 29501 [monitor] 2022-01-14 14:36:51: Check primary instance error in group(GRP1), start to auto takeover [monitor] 2022-01-14 14:36:51: Notify group(GRP1)'s active dmwatcher to set MID [monitor] 2022-01-14 14:36:51: Notify group(GRP1)'s active dmwatcher to set MID success [monitor] 2022-01-14 14:36:51: Start to takeover use instance DM1 [monitor] 2022-01-14 14:36:51: Notify dmwatcher(DM1) switch to TAKEOVER status [monitor] 2022-01-14 14:36:51: Dmwatcher process DM1 status switching [OPEN-->TAKEOVER] [monitor] 2022-01-14 14:36:52: Switch dmwatcher DM1 to TAKEOVER status success [monitor] 2022-01-14 14:36:52: Instance DM1 start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) [monitor] 2022-01-14 14:36:52: Instance DM1 execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) success [monitor] 2022-01-14 14:36:52: Instance DM1 start to execute sql SP_APPLY_KEEP_PKG() [monitor] 2022-01-14 14:36:52: Instance DM1 execute sql SP_APPLY_KEEP_PKG() success [monitor] 2022-01-14 14:36:52: Instance DM1 start to execute sql ALTER DATABASE MOUNT [monitor] 2022-01-14 14:36:52: Instance DM1 execute sql ALTER DATABASE MOUNT success [monitor] 2022-01-14 14:36:52: Instance DM1 start to execute sql ALTER DATABASE PRIMARY [monitor] 2022-01-14 14:36:53: Instance DM1 execute sql ALTER DATABASE PRIMARY success [monitor] 2022-01-14 14:36:53: Notify instance DM1 to change all arch status to be invalid [monitor] 2022-01-14 14:36:53: Succeed to change all instances arch status to be invalid [monitor] 2022-01-14 14:36:53: Instance DM1 start to execute sql ALTER DATABASE OPEN FORCE [monitor] 2022-01-14 14:36:53: Instance DM1 execute sql ALTER DATABASE OPEN FORCE success [monitor] 2022-01-14 14:36:53: Instance DM1 start to execute sql SP_SET_GLOBAL_DW_STATUS(7, 0) [monitor] 2022-01-14 14:36:53: Instance DM1 execute sql SP_SET_GLOBAL_DW_STATUS(7, 0) success [monitor] 2022-01-14 14:36:53: Notify dmwatcher(DM1) switch to OPEN status [monitor] 2022-01-14 14:36:54: Dmwatcher process DM1 status switching [TAKEOVER-->OPEN] [monitor] 2022-01-14 14:36:54: Switch dmwatcher DM1 to OPEN status success [monitor] 2022-01-14 14:36:54: Notify group(GRP1)'s dmwatcher to do clear [monitor] 2022-01-14 14:36:54: Clean request of dmwatcher processer DM1 success [monitor] 2022-01-14 14:36:54: Success to takeover use instance DM1 [monitor] 2022-01-14 14:36:54: Group(GRP1) use instance DM1 auto takeover success
恢复DW环境
恢复原主库的网络后,监视器自动将原主库切换成了备库:
[monitor] 2022-01-14 14:42:19: Dmwatcher process DM2 status switching [NONE-->MON CONFIRM] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:19 MON CONFIRM OK DM2 SUSPEND PRIMARY VALID 4 29501 29501 [monitor] 2022-01-14 14:42:21: Dmwatcher process DM2 status switching [MON CONFIRM-->STARTUP] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:20 STARTUP OK DM2 SUSPEND PRIMARY VALID 4 29501 29501 [monitor] 2022-01-14 14:42:23: Instance DM2[PRIMARY, SUSPEND, ISTAT_SAME:TRUE] error WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:22 STARTUP ERROR DM2 SUSPEND PRIMARY VALID 4 29501 29501 [monitor] 2022-01-14 14:42:35: Instance DM2[STANDBY, MOUNT, ISTAT_SAME:TRUE] recover to OK WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:35 STARTUP OK DM2 MOUNT STANDBY INVALID 4 29501 29501 [monitor] 2022-01-14 14:42:36: Dmwatcher process DM2 status switching [STARTUP-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:36 OPEN OK DM2 OPEN STANDBY INVALID 4 29501 29501 [monitor] 2022-01-14 14:42:40: Dmwatcher process DM1 status switching [OPEN-->RECOVERY] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:39 RECOVERY OK DM1 OPEN PRIMARY VALID 5 30859 30859 [monitor] 2022-01-14 14:42:43: Dmwatcher process DM1 status switching [RECOVERY-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 14:42:42 OPEN OK DM1 OPEN PRIMARY VALID 5 30859 30859 show 2022-01-14 14:43:05 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 111111 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.209 5238 2022-01-14 14:43:03 GLOBAL VALID OPEN DM1 OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.209 5236 OK DM1 OPEN PRIMARY 0 0 REALTIME VALID 4359 30859 4359 30859 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 11.11.11.210 5238 2022-01-14 14:43:03 GLOBAL VALID OPEN DM2 OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 10.10.13.210 5236 OK DM2 OPEN STANDBY 0 0 REALTIME VALID 4356 30859 4356 30859 NONE DATABASE(DM2) APPLY INFO FROM (DM1), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4359, 4359, 4359], (RLSN, SLSN, KLSN)[30859, 30859, 30859], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (30859) #================================================================================#
DW 环境启动与关闭测试
因为Global 守护类型的守护进程,会自动将数据库实例切换到 Open 状态,并将守护进程状态也切换为 Open。因此在关闭DW系统时,必须按照一定的顺序来关闭守护进程和实例。
可以在监视器中执行 Stop Instance 命令关闭数据守护系统,命令执行成功后,数据库实例正常关闭。但守护进程并没有真正退出,而是将状态切换为Shutdown 状态。
如果使用手动方式关闭数据守护系统,请严格按照以下顺序:
1) 如果启动了确认监视器,先关闭确认监视器(防止自动接管)
2) 关闭主库守护进程(防止重启实例)
3) 关闭备库守护进程(防止重启实例)
在关闭守护进程时会自动关闭对应的DM实例。 所以我们这里只需要关闭对应的守护进程即可。
关闭DW 环境
我们这里使用手工方式关闭DW.
1.关闭监视器进程
直接ctrl + c 结束命令即可。
2.关闭主库守护进程
[root@dm209 root]# systemctl stop DmWatcherServicedw1 [root@dm209 root]# ps -ef|grep dm.ini root 4515 18565 0 15:16 pts/1 00:00:00 grep --color=auto dm.ini
3.关闭备库守护进程
[root@dm210 log]# service DmWatcherServicedw2 stop Redirecting to /bin/systemctl stop DmWatcherServicedw2.service [root@dm210 log]# ps -ef | grep dm.ini root 3725 20825 0 15:16 pts/2 00:00:00 grep --color=auto dm.ini
启动DW环境
启动备库守护进程
[root@dm210 log]# service DmWatcherServicedw2 start Redirecting to /bin/systemctl start DmWatcherServicedw2.service [root@dm210 log]# ps -ef | grep dm.ini dmdba 3795 1 1 15:18 ? 00:00:00 /dm8/bin/dmserver /dm8/data/dm/dm.ini mount root 3887 20825 0 15:18 pts/2 00:00:00 grep --color=auto dm.ini
启动守护进程时会自动启动DM 实例。
启动主库守护进程
[root@dm209 root]# service DmWatcherServicedw1 start Redirecting to /bin/systemctl start DmWatcherServicedw1.service [root@dm209 root]# ps -ef|grep dm.ini dmdba 4598 1 0 15:20 ? 00:00:00 /dm8/bin/dmserver /dm8/data/dm/dm.ini mount root 4713 18565 0 15:21 pts/1 00:00:00 grep --color=auto dm.ini
启动监视器
[dmdba@dm211 data]$ dmmonitor /dm8/data/dmmonitor.ini [monitor] 2022-01-14 15:43:58: DMMONITOR[4.0] V8 [monitor] 2022-01-14 15:43:58: DMMONITOR[4.0] IS READY. [monitor] 2022-01-14 15:43:58: Received message from(DM1) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 15:43:57 OPEN OK DM1 OPEN PRIMARY VALID 3 30859 30859 [monitor] 2022-01-14 15:43:58: Received message from(DM2) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-01-14 15:43:57 OPEN OK DM2 OPEN STANDBY VALID 3 30859 30859