Friday, December 18, 2015

Convert RAW Devices to ASMLib, Move OCR, Voting to ASM Disk Groups





A fresh Oracle 11gR2 RAC installation requires the OCR and Voting disks to live in ASM disk groups; there is no option to put them on raw devices. However, OCR and Voting disks are still allowed to stay on raw devices after an upgrade to 11gR2 RAC. Many 10g RAC installations keep OCR and Voting disks on raw devices, so these raw devices keep working after the upgrade.

RAW devices are a big "NO!!!" these days, mostly because the OS (starting with OEL 5 or RHEL 5) no longer supports raw devices natively. Raw devices came into existence for direct I/O, which bypasses the OS cache and helps write-heavy workloads. These days, Linux kernels support direct I/O even on a cooked file system, which eliminates the need for raw devices.

After upgrading from 10g RAC to 11gR2 RAC, you may want to move away from raw devices and eventually shut down the "rawdevices" service. This article explains how to move the OCR, the Voting disks, and regular ASM disk groups away from raw devices. The cluster needs to access the OCR and Voting disks even before the ASM instance starts, so it needs ASMLib to support OCR and Voting disks on ASM disk groups.

Install ASMLib first:
[root@rac1 ~]# yum install oracleasm-support oracleasmlib oracleasm-`uname -r`
[root@rac1 ~]# rpm -qa |grep asm
oracleasmlib-2.0.4-1.el4
oracleasm-support-2.1.3-1.el4
oracleasm-2.6.9-89.0.0.0.1.ELxenU-2.0.5-1.el4
[root@rac1 ~]#

If yum cannot find any one of these RPMs, download it from OTN ( http://www.oracle.com/technetwork/topics/linux/asmlib/index-101839.html ) and install it manually.
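
The kernel driver RPM has to match the running kernel exactly, so it may be worth checking that before going further. Here is a minimal sketch, assuming the package naming seen in the rpm -qa output above (oracleasm-<kernel release>-<version>). The helper name has_asm_kernel_pkg is my own and is not part of ASMLib.

```shell
#!/bin/bash
# Hypothetical helper (not part of ASMLib).
# $1    = kernel release, as printed by `uname -r`
# stdin = package list, as printed by `rpm -qa`
# Succeeds when a package named oracleasm-<kernel release>-... is listed.
has_asm_kernel_pkg() {
    grep -qF -- "oracleasm-$1-"
}

# Intended use on a real node:
#   rpm -qa | has_asm_kernel_pkg "$(uname -r)" || echo "kernel driver RPM missing"
```

The check is purely textual; it does not prove that the driver actually loads.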

Once ASMLib is installed, configure it:

/usr/sbin/oracleasm configure -i -e -u oracle -g dba -o "xvd"
There is no grid user, because this cluster was upgraded from Oracle 10g RAC. Reload ASMLib once it is configured:

[root@rac1 ~]# /etc/init.d/oracleasm disable
Writing Oracle ASM library driver configuration: done
Dropping Oracle ASMLib disks:                              [  OK  ]
Shutting down the Oracle ASMLib driver:                    [  OK  ]
[root@rac1 ~]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver:                     [  OK  ]
Scanning the system for Oracle ASMLib disks:               [  OK  ]
[root@rac1 ~]#

ASMLib must be installed and configured on all nodes of the RAC. Of course, you can skip both steps on any node where ASMLib is already configured.

My current RAW Disks:

[root@rac1 ~]# cat /etc/sysconfig/rawdevices
# This file and interface are deprecated.
# Applications needing raw device access should open regular
# block devices with O_DIRECT.
# raw device bindings
# format:  <rawdev> <major> <minor>
#          <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
#          /dev/raw/raw2 8 5
#OCR
/dev/raw/raw1 /dev/xvdc1
/dev/raw/raw2 /dev/xvdd1
#Voting
/dev/raw/raw3 /dev/xvde1
/dev/raw/raw4 /dev/xvdf1
/dev/raw/raw5 /dev/xvdg1
#Data
/dev/raw/raw6 /dev/xvdh1
/dev/raw/raw7 /dev/xvdi1
[root@rac1 ~]#
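
To pull just the active bindings out of that file (for example, to compare them across nodes), something like the following sketch could be used. The function name list_raw_bindings is my own, and it only handles the two-field "<rawdev> <blockdev>" form used here.

```shell
#!/bin/bash
# Hypothetical helper: print the active "<rawdev> <blockdev>" bindings
# from a rawdevices-style file given as $1.
# Comment lines and blank lines are skipped; the three-field
# "<rawdev> <major> <minor>" form is ignored by this sketch.
list_raw_bindings() {
    awk '!/^[[:space:]]*#/ && NF == 2 { print $1, $2 }' "$1"
}

# Intended use on a real node:
#   list_raw_bindings /etc/sysconfig/rawdevices
```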

[oracle@rac2 ~]$/oracle/product/11.2.0/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     196544
         Used space (kbytes)      :       6484
         Available space (kbytes) :     190060
         ID                       : 1685195531
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw2
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user

[oracle@rac2 ~]$/oracle/product/11.2.0/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   9a6f2f7562087f13ff949943f3d59d91 (/dev/raw/raw3) []
 2. ONLINE   0044a128314e4f36bf503691da15d1c2 (/dev/raw/raw4) []
 3. ONLINE   909c318c6c6a4fdfff042e45b79b8288 (/dev/raw/raw5) []
Located 3 voting disk(s).
[oracle@rac2 ~]$

We allocated one new 400MB partition for OCR and another for Voting. Earlier, in 10g RAC, my OCR disk was just 100MB and my Voting disk 20MB, but Oracle 11g RAC needs 300MB for OCR and 300MB for Voting disks. The new devices are:

OCR: /dev/xvdj1
Voting: /dev/xvdk1

Existing ASM DATA disk group, which holds all my datafiles:
/dev/xvdh1
/dev/xvdi1
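
Before labeling the new partitions, their size can be checked against the 300MB requirement mentioned above. This is a rough sketch; big_enough is a name I made up, and on a real node the byte count would come from blockdev --getsize64 run as root.

```shell
#!/bin/bash
# Hypothetical helper: big_enough BYTES MIN_MB
# Succeeds when BYTES is at least MIN_MB megabytes (1 MB = 1024 * 1024 bytes).
big_enough() {
    [ $(( $1 / 1024 / 1024 )) -ge "$2" ]
}

# Intended use on a real node (as root):
#   big_enough "$(blockdev --getsize64 /dev/xvdj1)" 300 || echo "partition too small"
```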

Create ASM disks on these devices: 

[root@rac1 ~]#  /etc/init.d/oracleasm createdisk OCRDISK  /dev/xvdj1
Marking disk "OCRDISK" as an ASM disk:                     [  OK  ]
[root@rac1 dev]# /etc/init.d/oracleasm createdisk VOTINGISK  /dev/xvdk1
Marking disk "VOTINGISK" as an ASM disk:                   [  OK  ]
[root@rac1 ~]# /etc/init.d/oracleasm  listdisks
OCRDISK
VOTINGISK
[root@rac1 dev]#

Connect to the ASM instance as sysasm and create the disk groups:

SQL> CREATE DISKGROUP OCR EXTERNAL REDUNDANCY DISK 'ORCL:OCRDISK' ATTRIBUTE 'compatible.asm' = '11.2' ;

Diskgroup created.

SQL> CREATE DISKGROUP VOTING EXTERNAL REDUNDANCY DISK 'ORCL:VOTINGISK' ATTRIBUTE 'compatible.asm' = '11.2';

Diskgroup created.

SQL> SELECT NAME,USABLE_FILE_MB,COMPATIBILITY,DATABASE_COMPATIBILITY,VOTING_FILES FROM V$ASM_DISKGROUP;

NAME   USABLE_FILE_MB    COMPATIBILITY   DATABASE_COMPATIBILITY   V
-----------------------------------------------------------------
DATA        16006     10.1.0.0.0         10.1.0.0.0         N
OCR           340     11.2.0.0.0         10.1.0.0.0         N
VOTING        340     11.2.0.0.0         10.1.0.0.0         N

SQL>

Then connect to the ASM instance on the other RAC node and mount the new disk groups there:

SQL> alter diskgroup voting mount ;

Diskgroup altered.

SQL> alter diskgroup ocr mount ;

Diskgroup altered.

SQL>

Moving OCR to ASM Disk Group

OCR Devices before the migration:
[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     196544
         Used space (kbytes)      :       6484
         Available space (kbytes) :     190060
         ID                       : 1685195531
         Device/File Name         : /dev/raw/raw1
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw2
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user

[root@rac2 ~]#

Migration:

[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrconfig -replace /dev/raw/raw1 -replacement +OCR
[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     196544
         Used space (kbytes)      :       6476
         Available space (kbytes) :     190068
         ID                       : 1685195531
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/raw/raw2
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check bypassed due to non-privileged user

[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrconfig -delete /dev/raw/raw2

After migration: 

[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     196544
         Used space (kbytes)      :       6476
         Available space (kbytes) :     190068
         ID                       : 1685195531
         Device/File Name         :       +OCR
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

[root@rac2 ~]#

The replace operation succeeds only when at least two OCR locations are configured. If you have just one, add the ASM disk group as a second OCR location first and then delete the raw device:

[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrconfig -add +OCR

Remember that a replace is not possible with only one OCR location. Also, all of these ocrconfig operations have to be performed as the root user.
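
To confirm that no OCR location is left on a raw device, the ocrcheck output can be filtered. Here is a minimal sketch, assuming the output layout shown in the transcripts above. The function names ocr_locations and ocr_still_on_raw are my own.

```shell
#!/bin/bash
# Hypothetical helpers that read `ocrcheck` output on stdin.

# Print each configured OCR location, one per line.
ocr_locations() {
    awk -F: '/Device\/File Name/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }'
}

# Succeed when at least one OCR location is still a raw device.
ocr_still_on_raw() {
    ocr_locations | grep -q '^/dev/raw/'
}

# Intended use on a real node (path as used in this article):
#   /oracle/product/11.2.0/11.2.0/grid/bin/ocrcheck | ocr_still_on_raw && echo "OCR still on raw"
```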

Moving Voting disk to ASM Disk Group

Voting disks are easier: a single crsctl replace votedisk command moves them, and you do not need root access for it.

[oracle@rac2 ~]$/oracle/product/11.2.0/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   9a6f2f7562087f13ff949943f3d59d91 (/dev/raw/raw3) []
 2. ONLINE   0044a128314e4f36bf503691da15d1c2 (/dev/raw/raw4) []
 3. ONLINE   909c318c6c6a4fdfff042e45b79b8288 (/dev/raw/raw5) []
Located 3 voting disk(s).
[oracle@rac2 ~]$crsctl replace votedisk  +VOTING
CRS-4256: Updating the profile
Successful addition of voting disk a001ae72c2504fa6bfd75bb06827edf1.
Successful deletion of voting disk 9a6f2f7562087f13ff949943f3d59d91.
Successful deletion of voting disk 0044a128314e4f36bf503691da15d1c2.
Successful deletion of voting disk 909c318c6c6a4fdfff042e45b79b8288.
Successfully replaced voting disk group with +VOTING.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

[oracle@rac2 ~]$/oracle/product/11.2.0/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a001ae72c2504fa6bfd75bb06827edf1 (ORCL:VOTINGISK) [VOTING]
Located 1 voting disk(s).
[oracle@rac2 ~]$
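
The same kind of check works for the voting disks. This sketch extracts the file names from the crsctl query css votedisk output, assuming the layout shown above. The name votedisk_paths is my own.

```shell
#!/bin/bash
# Hypothetical helper: read `crsctl query css votedisk` output on stdin
# and print the file name shown in parentheses on each numbered line.
votedisk_paths() {
    sed -n 's/^ *[0-9][0-9]*\. .*(\(.*\)) \[.*/\1/p'
}

# Intended use on a real node:
#   crsctl query css votedisk | votedisk_paths | grep '^/dev/raw/' && echo "voting disk still on raw"
```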

Moving raw devices to ASMlib disks for regular data disk groups

Regular ASM disk paths before the migration:

SQL> select path from v$asm_disk;
/dev/raw/raw6
/dev/raw/raw7
..
..

SQL>

Finally, you want to move the data disk groups from raw devices to ASMLib disks. There are multiple options for this:
1. Add extra ASMLib disks to the group and drop the raw devices once the rebalance is complete.
2. Back up the databases, drop the old disk groups, recreate the disk groups with ASMLib names, and restore the backups.
3. Rename the existing ASM disks on the raw devices to use ASMLib names (ORCL:*). As each disk is already an Oracle disk, ASM recognizes it by its header. Of course, this is the riskiest option, but also the fastest.
I would like to show the third option here:
a. Dismount the disk groups on these disks, or shut down the ASM instance completely.
   Comment out the disks in /etc/sysconfig/rawdevices and bounce the rawdevices service (service rawdevices restart).
b. Make sure the disks are no longer listed in /dev/raw/.
c. Then, you're all set to rename the disks. A plain renamedisk fails because the disk is already marked as an ORCLDISK. Use force-renamedisk!

[root@rac1 dev]# od -c /dev/xvdh1 |head -10
0000000 001 202 001 001  \0  \0  \0  \0  \0  \0  \0 200 035 355   D 274
0000020   p 002  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040   O   R   C   L   D   I   S   K  \0  \0  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000100  \0  \0 020  \n  \0  \0 001 003   D   A   T   A   _   0   0   0
0000120   0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000140  \0  \0  \0  \0  \0  \0  \0  \0   D   A   T   A  \0  \0  \0  \0
0000160  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000200  \0  \0  \0  \0  \0  \0  \0  \0   D   A   T   A   _   0   0   0
0000220   0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
[root@rac1 dev]#  od -c /dev/xvdi1 |head -10
0000000 001 202 001 001  \0  \0  \0  \0 001  \0  \0 200 241 355 360   b
0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040   O   R   C   L   D   I   S   K  \0  \0  \0  \0  \0  \0  \0  \0
0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000100  \0  \0 020  \n 001  \0 001 003   D   A   T   A   _   0   0   0
0000120   1  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000140  \0  \0  \0  \0  \0  \0  \0  \0   D   A   T   A  \0  \0  \0  \0
0000160  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000200  \0  \0  \0  \0  \0  \0  \0  \0   D   A   T   A   _   0   0   0
0000220   1  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
[root@rac1 dev]# /etc/init.d/oracleasm renamedisk /dev/xvdh1 DATA1
WARNING: Changing the label of an disk marked for ASM is a very dangerous
         operation.  If this is really what you mean to do, you must
         ensure that all Oracle and ASM instances have ceased using
         this disk.  Otherwise, you may LOSE DATA.
If you really wish to change the label, rerun with the force-renamedisk command.
Renaming disk "/dev/xvdh1" to "DATA1":                     [FAILED]
[root@rac1 dev]# /etc/init.d/oracleasm force-renamedisk /dev/xvdh1 DATA1
Renaming disk "/dev/xvdh1" to "DATA1":                     [  OK  ]
[root@rac1 dev]# /etc/init.d/oracleasm renamedisk /dev/xvdi1 DATA2
WARNING: Changing the label of an disk marked for ASM is a very dangerous
         operation.  If this is really what you mean to do, you must
         ensure that all Oracle and ASM instances have ceased using
         this disk.  Otherwise, you may LOSE DATA.
If you really wish to change the label, rerun with the force-renamedisk command.
Renaming disk "/dev/xvdi1" to "DATA2":                     [FAILED]
[root@rac1 dev]# /etc/init.d/oracleasm force-renamedisk /dev/xvdi1 DATA2
Renaming disk "/dev/xvdi1" to "DATA2":                     [  OK  ]
[root@rac1 ~]# /usr/sbin/oracleasm-discover 'ORCL:*'
Using ASMLib from /opt/oracle/extapi/32/asm/orcl/1/libasm.so
[ASM Library - Generic Linux, version 2.0.4 (KABI_V2)]
Discovered disk: ORCL:DATA1 [20964762 blocks (10733958144 bytes), maxio 64]
Discovered disk: ORCL:DATA2 [20964762 blocks (10733958144 bytes), maxio 64]
Discovered disk: ORCL:OCRDISK [803187 blocks (411231744 bytes), maxio 64]
Discovered disk: ORCL:VOTINGISK [803187 blocks (411231744 bytes), maxio 64]
[root@rac1 ~]#
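
The od dumps above show the ORCLDISK tag at octal offset 0000040, which is byte 32 of the device. Before forcing a rename, it may be worth confirming that tag with a small check like this sketch, so that the label is only changed on a device that really is an ASM disk. The function name has_orcldisk_header is my own.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the file or device given as $1
# carries the 8-byte tag "ORCLDISK" at byte offset 32.
has_orcldisk_header() {
    # Read 8 bytes starting at offset 32; drop NUL bytes so the shell
    # can hold the result in a variable.
    tag=$(dd if="$1" bs=1 skip=32 count=8 2>/dev/null | tr -d '\0')
    [ "$tag" = "ORCLDISK" ]
}

# Intended use on a real node (as root):
#   has_orcldisk_header /dev/xvdh1 || echo "no ASM header found"
```

This only reads from the device, so it should be safe to run while you are still deciding whether to go ahead.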

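On the other RAC nodes, run /etc/init.d/oracleasm scandisks so that they pick up the new labels. Then start up all the services and you're all set!! No changes are necessary to the disk groups. ASM recognizes the disks correctly and all the contents of the disks should be intact.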

ASM disk paths after the migration:

SQL> select path from v$asm_disk;
ORCL:DATA1
ORCL:DATA2
..
..
SQL>

Some errors I have noticed

If OCR Diskgroup is not mounted on all RAC nodes:

[root@rac2 ~]# /oracle/product/11.2.0/11.2.0/grid/bin/ocrconfig -replace /dev/raw/raw1 -replacement +OCR
PROT-30: The Oracle Cluster Registry location to be added is not accessible
PROC-8: Cannot perform cluster registry operation because one of the parameters is invalid.
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +OCR.255.1
ORA-15001: diskgroup "OCR" does not exist or is not mounted
ORA-06512: at line 4

[root@rac2 ~]#

ASM compatibility (compatible.asm) has to be set to 11.2 on the disk groups that hold the OCR and Voting disks.

If ASMLib is missing or the wrong version of the ASMLib RPMs is installed, the ASM instance cannot find disks by their "ORCL:*" names. Of course, the pseudo devices under /dev/oracleasm/ can be used as a workaround, but that is not a good solution; it is way too dangerous.

Once all devices are migrated and verified, you can stop rawdevices for good!!!
