Oracle ASM ACFS disk group rebalance

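Starting with Oracle 11.2, an ASM disk group can be used to create one or more cluster file systems. This is the Oracle ASM Cluster File System, or Oracle ACFS. ASM implements it by creating special volume files inside the disk group and exposing them to the operating system as block devices, on which the file systems are then created. This article looks at rebalance, mirroring, and extent management for ACFS volume files.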

The test environment is as follows:
.64-bit Oracle Linux 5.4
.Oracle Restart and ASM version 11.2.0.4.0 – 64bit

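Setting up ACFS volumes
On a single-instance (Oracle Restart) system, load the ADVM/ACFS drivers with the commands below. This is not needed in a RAC environment, where the drivers are loaded by default.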

[root@jyrac1 bin]# ./acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9118: oracleadvm.ko driver in use - cannot unload.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9118: oracleadvm.ko driver in use - cannot unload.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@jyrac1 bin]#  ./acfsload  start
ACFS-9391: Checking for existing ADVM/ACFS installation.
ACFS-9392: Validating ADVM/ACFS installation files for operating system.
ACFS-9393: Verifying ASM Administrator setup.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9322: completed
[root@jyrac1 bin]# ./acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(x86_64).
ACFS-9326:     Driver Oracle version = 130707.

Create a disk group that will host the ASM cluster file systems

SQL> create diskgroup acfs disk '/dev/raw/raw5','/dev/raw/raw6' attribute 'COMPATIBLE.ASM' = '11.2', 'COMPATIBLE.ADVM' = '11.2'; 

Diskgroup created.

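A single disk group can hold both database files and ACFS volume files, but a dedicated disk group for ACFS volumes is recommended. This separates roles and functions and can benefit database file performance.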

Check the AU size of all disk groups

SQL> select group_number "Group#", name "Name", allocation_unit_size "AU size" from v$asm_diskgroup_stat;

    Group# Name                                                            AU size
---------- ------------------------------------------------------------ ----------
         1 ARCHDG                                                          1048576
         2 CRSDG                                                           1048576
         3 DATADG                                                          1048576
         4 ACFS                                                            1048576

All disk groups use the default AU size of 1 MB. The AU size matters later, when we look at the extent size of a volume file.

Create three volumes in disk group ACFS

[grid@jyrac1 ~]$ asmcmd volcreate -G ACFS -s 1G ACFS_VOL1
[grid@jyrac1 ~]$ asmcmd volcreate -G ACFS -s 1G ACFS_VOL2
[grid@jyrac1 ~]$ asmcmd volcreate -G ACFS -s 1G ACFS_VOL3

Check the volume information

[grid@jyrac1 ~]$ asmcmd volinfo -a
Diskgroup Name: ACFS

         Volume Name: ACFS_VOL1
         Volume Device: /dev/asm/acfs_vol1-10
         State: ENABLED
         Size (MB): 1024
         Resize Unit (MB): 32
         Redundancy: MIRROR
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage: 
         Mountpath: 

         Volume Name: ACFS_VOL2
         Volume Device: /dev/asm/acfs_vol2-10
         State: ENABLED
         Size (MB): 1024
         Resize Unit (MB): 32
         Redundancy: MIRROR
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage: 
         Mountpath: 

         Volume Name: ACFS_VOL3
         Volume Device: /dev/asm/acfs_vol3-10
         State: ENABLED
         Size (MB): 1024
         Resize Unit (MB): 32
         Redundancy: MIRROR
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage: 
         Mountpath: 

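A volume is enabled automatically once it has been created. After a server restart you may need to load the ADVM/ACFS drivers manually (acfsload start) and enable the volumes (asmcmd volenable -a).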

For each volume, ASM creates a volume file. In a redundant disk group, each volume also has a dirty region logging (DRL) file.

SQL> select file_number "File#", volume_name "Volume", volume_device "Device", size_mb "MB", drl_file_number "DRL#" from v$asm_volume;

File# Volume                                   Device                                           MB       DRL#
----- ---------------------------------------- ---------------------------------------- ---------- ----------
  257 ACFS_VOL1                                /dev/asm/acfs_vol1-10                          1024        256
  259 ACFS_VOL2                                /dev/asm/acfs_vol2-10                          1024        258
  261 ACFS_VOL3                                /dev/asm/acfs_vol3-10                          1024        260

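Besides the volume name, device name, and size, the output shows that ASM file numbers 257, 259, and 261 are used for the volume devices and ASM file numbers 256, 258, and 260 for the DRL files.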

Query the AU distribution of a volume file

SQL> select 
  2  xnum_kffxp,            -- virtual extent number
  3  pxn_kffxp,             -- physical extent number
  4  disk_kffxp,            -- disk number
  5  au_kffxp               -- allocation unit number
  6  from x$kffxp
  7  where number_kffxp=261 -- ASM file number 261 (volume file of ACFS_VOL3)
  8  and group_kffxp=4      -- disk group number 4 (ACFS)
  9  order by 1,2,3;

XNUM_KFFXP  PXN_KFFXP DISK_KFFXP   AU_KFFXP
---------- ---------- ---------- ----------
         0          0          0       2160
         0          1          1       2160
         1          2          1       2168
         1          3          0       2168
         2          4          0       2176
         2          5          1       2176
         3          6          1       2184
         3          7          0       2184
         4          8          0       2192
         4          9          1       2192
         5         10          1       2200
         5         11          0       2200
         6         12          0       2208
         6         13          1       2208
......

       124        248          0       3152
       124        249          1       3152
       125        250          1       3160
       125        251          0       3160
       126        252          0       3168
       126        253          1       3168
       127        254          1       3176
       127        255          0       3176
2147483648          0          0       2156
2147483648          1          1       2156
2147483648          2      65534 4294967294

259 rows selected.

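When a volume is created in a normal redundancy disk group, every extent of the volume file is mirrored as well. Volume file 261 has 128 virtual extents (numbered 0 to 127), each with two physical copies on different disks. The volume size is 1 GB, so each extent is 8 MB, or 8 AUs, which is also why the AU numbers in the output increase in steps of 8. A volume file has its own extent size, unlike a standard ASM file, whose initial extent size is inherited from the disk group AU size.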
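The arithmetic above can be checked with a small Python sketch. It is only an illustration: the figures are copied from the asmcmd volinfo, v$asm_diskgroup_stat, and x$kffxp output in this article, and the variable names are made up for the example.

```python
# Sketch: re-derive the volume-file extent size from figures quoted in this article.
MB = 1024 * 1024

volume_size_bytes = 1024 * MB   # "Size (MB): 1024" from asmcmd volinfo
au_size_bytes = 1048576         # AU size reported by v$asm_diskgroup_stat
virtual_extents = 128           # XNUM_KFFXP runs from 0 to 127

extent_size_bytes = volume_size_bytes // virtual_extents
aus_per_extent = extent_size_bytes // au_size_bytes

# First rows of the x$kffxp output: (xnum, pxn, disk, au)
rows = [
    (0, 0, 0, 2160), (0, 1, 1, 2160),
    (1, 2, 1, 2168), (1, 3, 0, 2168),
    (2, 4, 0, 2176), (2, 5, 1, 2176),
]

# Each virtual extent should have its two copies on two different disks.
disks_by_extent = {}
for xnum, _pxn, disk, _au in rows:
    disks_by_extent.setdefault(xnum, set()).add(disk)
mirrored = all(len(disks) == 2 for disks in disks_by_extent.values())

# Distance between consecutive extents on disk 0, measured in AUs.
disk0_aus = sorted(au for _x, _p, disk, au in rows if disk == 0)
strides = {b - a for a, b in zip(disk0_aus, disk0_aus[1:])}

# 128 mirrored extents plus the 3 rows with XNUM_KFFXP = 2147483648.
expected_rows = virtual_extents * 2 + 3

print(extent_size_bytes // MB)  # 8
print(aus_per_extent)           # 8
print(mirrored)                 # True
print(strides)                  # {8}
print(expected_rows)            # 259
```

The row count of 259 matches the "259 rows selected" line in the query output.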

Create ASM cluster file systems (ACFS) on the volume devices

[grid@jyrac1 ~]$ /sbin/mkfs -t acfs /dev/asm/acfs_vol1-10
mkfs.acfs: version                   = 11.2.0.4.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/acfs_vol1-10
mkfs.acfs: volume size               = 1073741824
mkfs.acfs: Format complete.
[grid@jyrac1 ~]$ /sbin/mkfs -t acfs /dev/asm/acfs_vol2-10
mkfs.acfs: version                   = 11.2.0.4.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/acfs_vol2-10
mkfs.acfs: volume size               = 1073741824
mkfs.acfs: Format complete.
[grid@jyrac1 ~]$ /sbin/mkfs -t acfs /dev/asm/acfs_vol3-10
mkfs.acfs: version                   = 11.2.0.4.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/acfs_vol3-10
mkfs.acfs: volume size               = 1073741824
mkfs.acfs: Format complete.

[root@jyrac1 /]# mkdir /acfs1
[root@jyrac1 /]# mkdir /acfs2
[root@jyrac1 /]# mkdir /acfs3
[root@jyrac1 /]# chown -R grid:oinstall /acfs1
[root@jyrac1 /]# chown -R grid:oinstall /acfs2
[root@jyrac1 /]# chown -R grid:oinstall /acfs3
[root@jyrac1 /]# chmod -R 777 /acfs1
[root@jyrac1 /]# chmod -R 777 /acfs2
[root@jyrac1 /]# chmod -R 777 /acfs3


[root@jyrac1 /]# mount -t acfs /dev/asm/acfs_vol1-10 /acfs1
[root@jyrac1 /]# mount -t acfs /dev/asm/acfs_vol2-10 /acfs2
[root@jyrac1 /]# mount -t acfs /dev/asm/acfs_vol3-10 /acfs3
[root@jyrac1 /]# mount | grep acfs
/dev/asm/acfs_vol1-10 on /acfs1 type acfs (rw)
/dev/asm/acfs_vol2-10 on /acfs2 type acfs (rw)
/dev/asm/acfs_vol3-10 on /acfs3 type acfs (rw)

Copy some files into the new file systems

[grid@jyrac1 +asm]$ cp $ORACLE_BASE/diag/asm/+asm/+ASM1/trace/* /acfs1
[grid@jyrac1 +asm]$ cp $ORACLE_BASE/diag/asm/+asm/+ASM1/trace/* /acfs2
[grid@jyrac1 +asm]$ cp $ORACLE_BASE/diag/asm/+asm/+ASM1/trace/* /acfs3

Check the space used

[root@jyrac1 /]# df -h /acfs?
Filesystem            Size  Used Avail Use% Mounted on
/dev/asm/acfs_vol1-10
                      1.0G  105M  920M  11% /acfs1
/dev/asm/acfs_vol2-10
                      1.0G  105M  920M  11% /acfs2
/dev/asm/acfs_vol3-10
                      1.0G  105M  920M  11% /acfs3

Now add a disk to the ACFS disk group and monitor the rebalance operation

SQL> alter diskgroup ACFS add disk '/dev/raw/raw7';

Diskgroup altered.

The alert_+ASM1.log file shows that the ARB0 process was started with OS process ID 1074

[grid@jyrac1 trace]$ tail -f alert_+ASM1.log
SQL> alter diskgroup ACFS add disk '/dev/raw/raw7' 
NOTE: GroupBlock outside rolling migration privileged region
NOTE: Assigning number (4,2) to disk (/dev/raw/raw7)
NOTE: requesting all-instance membership refresh for group=4
NOTE: initializing header on grp 4 disk ACFS_0002
NOTE: requesting all-instance disk validation for group=4
Thu Jan 12 14:54:45 2017
NOTE: skipping rediscovery for group 4/0xd98640a (ACFS) on local instance.
NOTE: requesting all-instance disk validation for group=4
NOTE: skipping rediscovery for group 4/0xd98640a (ACFS) on local instance.
Thu Jan 12 14:54:45 2017
GMON updating for reconfiguration, group 4 at 249 for pid 27, osid 18644
NOTE: group 4 PST updated.
NOTE: initiating PST update: grp = 4
GMON updating group 4 at 250 for pid 27, osid 18644
NOTE: group ACFS: updated PST location: disk 0000 (PST copy 0)
NOTE: group ACFS: updated PST location: disk 0001 (PST copy 1)
NOTE: group ACFS: updated PST location: disk 0002 (PST copy 2)
NOTE: PST update grp = 4 completed successfully 
NOTE: membership refresh pending for group 4/0xd98640a (ACFS)
GMON querying group 4 at 251 for pid 18, osid 5012
NOTE: cache opening disk 2 of grp 4: ACFS_0002 path:/dev/raw/raw7
GMON querying group 4 at 252 for pid 18, osid 5012
SUCCESS: refreshed membership for 4/0xd98640a (ACFS)
NOTE: starting rebalance of group 4/0xd98640a (ACFS) at power 1
SUCCESS: alter diskgroup ACFS add disk '/dev/raw/raw7'
Starting background process ARB0
Thu Jan 12 14:54:48 2017
ARB0 started with pid=40, OS id=1074 
NOTE: assigning ARB0 to group 4/0xd98640a (ACFS) with 1 parallel I/O
cellip.ora not found.
NOTE: F1X0 copy 3 relocating from 65534:4294967294 to 2:2 for diskgroup 4 (ACFS)
Thu Jan 12 14:55:00 2017
NOTE: Attempting voting file refresh on diskgroup ACFS
NOTE: Refresh completed on diskgroup ACFS. No voting file found.
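If you want to script this lookup, the following Python sketch pulls the OS process ID out of the "ARB0 started" line and builds the matching trace file name. The function names are hypothetical, and the file name pattern is only inferred from the single trace file used in this article (+ASM1_arb0_1074.trc), so treat it as an assumption.

```python
import re

# Line copied from the alert log excerpt above.
ALERT_LINE = "ARB0 started with pid=40, OS id=1074 "


def arb0_os_id(line):
    """Return the OS process id from an 'ARB0 started' line, or None if absent."""
    match = re.search(r"ARB0 started with pid=\d+, OS id=(\d+)", line)
    return int(match.group(1)) if match else None


def arb0_trace_name(instance, os_id):
    """Build the trace file name; the pattern is assumed from this article's example."""
    return f"{instance}_arb0_{os_id}.trc"


os_id = arb0_os_id(ALERT_LINE)
print(os_id)                            # 1074
print(arb0_trace_name("+ASM1", os_id))  # +ASM1_arb0_1074.trc
```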

Monitor the rebalance by running tail -f +ASM1_arb0_1074.trc

*** 2017-01-12 14:55:18.731
ARB0 relocating file +ACFS.259.933075367 (86 entries)

*** 2017-01-12 14:55:38.599
ARB0 relocating file +ACFS.259.933075367 (1 entries)
ARB0 relocating file +ACFS.260.933075373 (17 entries)

*** 2017-01-12 14:55:39.617
ARB0 relocating file +ACFS.261.933075373 (86 entries)

*** 2017-01-12 14:55:59.106
ARB0 relocating file +ACFS.261.933075373 (1 entries)

*** 2017-01-12 14:55:59.274
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.258.933075367 (1 entries)
ARB0 relocating file +ACFS.257.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)
ARB0 relocating file +ACFS.256.933075361 (1 entries)

*** 2017-01-12 14:56:00.201
ARB0 relocating file +ACFS.1.1 (1 entries)
ARB0 relocating file +ACFS.7.1 (1 entries)
ARB0 relocating file +ACFS.5.1 (1 entries)
ARB0 relocating file +ACFS.8.1 (1 entries)
ARB0 relocating file +ACFS.9.1 (1 entries)
ARB0 relocating file +ACFS.6.1 (1 entries)
ARB0 relocating file +ACFS.4.1 (1 entries)
ARB0 relocating file +ACFS.4.1 (1 entries)
ARB0 relocating file +ACFS.3.1 (1 entries)

.....

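The trace shows the rebalance progressing one ASM file at a time. This is the same behavior as with database files: ASM rebalances on a per-file basis. In the excerpt above, ARB0 relocates extents of the volume files (257, 259, 261) and the DRL files (256, 258, 260), and at 14:56:00 it moves on to the ASM metadata files (numbers 1 to 9).
Note that rebalancing a volume file, like any other ASM file, does not operate on the individual user files stored in the associated file system. ASM rebalances each volume file as a whole.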
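To see how much work ARB0 did per file, the trace lines can be totaled with a short Python sketch like the one below. The sample text is a subset of the trace shown in this article, and the regular expression and function name are assumptions made for the illustration, not part of any Oracle tool.

```python
import re
from collections import Counter

# A few lines copied from the ARB0 trace excerpt in this article.
TRACE_SAMPLE = """\
ARB0 relocating file +ACFS.259.933075367 (86 entries)
ARB0 relocating file +ACFS.259.933075367 (1 entries)
ARB0 relocating file +ACFS.260.933075373 (17 entries)
ARB0 relocating file +ACFS.261.933075373 (86 entries)
ARB0 relocating file +ACFS.261.933075373 (1 entries)
ARB0 relocating file +ACFS.1.1 (1 entries)
"""

# Captures the ASM file number and the entry count of each relocation line.
LINE_RE = re.compile(r"ARB0 relocating file \+\w+\.(\d+)\.\d+ \((\d+) entries\)")


def entries_per_file(trace_text):
    """Sum the relocated entries reported for each ASM file number."""
    totals = Counter()
    for line in trace_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)


print(entries_per_file(TRACE_SAMPLE))
# {259: 87, 260: 17, 261: 87, 1: 1}
```

The totals are keyed by ASM file number, which matches the per-file nature of the rebalance: nothing in the trace refers to the user files inside /acfs1, /acfs2, or /acfs3.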

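Bringing a disk online in an ACFS disk group
When an ASM disk goes offline, ASM creates the staleness registry and staleness directory to track the extents that must be updated once the disk is back online. When the disk comes online, ASM uses this information to perform a fast mirror resync. In ASM 11.2 this feature is not available for volume files. Instead, ASM rebuilds the entire content of the disk being brought online. This is why onlining a disk performs worse in a disk group that stores volume files than in one that stores standard database files. Fast mirror resync for volume files is available in ASM 12.1 and later.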

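Summary:
An ASM disk group can be used to create general-purpose cluster file systems. ASM does this by creating volume files in the disk group and exposing them to the operating system as block devices. The existing ASM disk group mirroring (normal and high redundancy) protects user files at the file system level. ASM achieves this by mirroring the extents of the volume files, just as it does for any other ASM file. A volume file has its own extent size, unlike a standard database file, whose initial extent size is inherited from the disk group AU size. Rebalancing an ASM disk group that stores ASM cluster file system volumes actually rebalances each volume file, not the individual user files stored in the associated file systems.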
