11g New Features : Top 5 Features In 11g
1.ASM Fast Mirror Resync :
ASM takes a disk offline whenever it is unable to write to an extent allocated to that disk. In Oracle Database 10g, ASM no longer reads from offlined disks. Shortly after a disk is taken offline, ASM drops it from the disk group by re-creating the extents allocated to the disk on the remaining disks in the disk group, using the redundant extent copies. This is a relatively costly operation and can take hours to complete. 11g introduces the ASM Fast Mirror Resync feature, which allows quick resynchronization of a failed ASM disk. This feature significantly reduces the time it takes to repair a failed disk, potentially from hours to minutes. When a disk goes offline following a transient failure, ASM tracks the extents that are modified during the outage. When the transient failure is repaired, ASM can quickly resynchronize only the ASM disk extents that were affected during the outage. This tracking mechanism is very efficient, as it uses one bit for each modified extent. DISK_REPAIR_TIME, an attribute of the corresponding disk group, determines the duration of disk outage that ASM will tolerate while still being able to resynchronize after you complete the repair.
The default DISK_REPAIR_TIME attribute value is 3.6h.
The REPAIR_TIMER column of V$ASM_DISK shows the amount of time left (in seconds) before an offline disk is dropped.
You can override this attribute with an ALTER DISKGROUP DISK OFFLINE statement and the DROP AFTER clause.
Note:
To use this feature, the disk group compatibility attributes must be set to 11.1 or higher. Advancing the compatible.asm and compatible.rdbms attributes from 10.1 to 11.1 enables the following key features:
- Fast mirror resync
- Preferred mirror read
- Variable size extents
Please see Note 429098.1, "11g ASM New Feature: The Disk Group Compatibility feature", for additional information.
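As a minimal sketch of how this is used (the disk group name DATA, the disk name DATA_0001, and the 4.5h/6h values below are illustrative, not taken from this note):
SQL> ALTER DISKGROUP data SET ATTRIBUTE 'compatible.asm' = '11.1';
SQL> ALTER DISKGROUP data SET ATTRIBUTE 'compatible.rdbms' = '11.1';
SQL> ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '4.5h';
SQL> ALTER DISKGROUP data OFFLINE DISK data_0001 DROP AFTER 6h; --> overrides DISK_REPAIR_TIME for this outage
SQL> ALTER DISKGROUP data ONLINE DISK data_0001; --> after the transient failure is repaired
SQL> SELECT name, mode_status, repair_timer FROM v$asm_disk; --> REPAIR_TIMER counts down in seconds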
2. Snapshot Standby Database:
A snapshot standby database is a fully updatable standby database that is created by converting a physical standby database into a snapshot standby database. A snapshot standby database receives and archives, but does not apply, redo data from a primary database. A physical standby database is converted into a snapshot standby database with the following command:
SQL> ALTER DATABASE CONVERT TO SNAPSHOT STANDBY ;
The database is dismounted during conversion and must be restarted before it is mounted. Once a physical standby database has been converted into a snapshot standby database, it can be opened in read-write mode and it is fully updatable. A snapshot standby database continues to receive and archive redo data from the primary database, and this redo data will be automatically applied when the snapshot standby database is converted back into a physical standby database. A snapshot standby database is converted back into a physical standby database with the following SQL command:
SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
Once a snapshot standby database has been converted back into a physical standby database and restarted, Redo Apply can be started and all redo received by the snapshot standby database will be applied.
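A minimal sketch of the full round trip (assuming the Data Guard broker is not managing the configuration, so Redo Apply is stopped and started manually):
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; --> stop Redo Apply on the physical standby
SQL> ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP; --> opens read-write for testing
SQL> SHUTDOWN IMMEDIATE; --> when testing is finished
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION; --> redo received during the outage is now applied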
3.Database Replay:
Large business-critical applications are complex and subject to varying workloads. Testing the impact of changes to such a system (software or hardware changes) becomes very difficult, given the high-availability requirement and the expected level of performance. Customers need a certain level of confidence before implementing changes in production. The only way to achieve this is to simulate an environment identical to production on a test machine, which is very difficult and also expensive. Database Replay allows you to test the impact of a system change by replaying a real-world workload on the test system before it is exposed to production. The production workload of the database server (including transactions and their dependencies) is recorded over a representative period of time, which could be a peak period. This recorded data is then used to replay the workload on a system that has been appropriately configured. This gives a high degree of confidence before shipping changes to production.
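A minimal sketch of the capture side (the directory path and capture name below are illustrative; the capture is later preprocessed and replayed on the test system using DBMS_WORKLOAD_REPLAY and the wrc replay clients):
SQL> CREATE DIRECTORY capture_dir AS '/u01/capture'; --> illustrative path
SQL> EXEC DBMS_WORKLOAD_CAPTURE.START_CAPTURE(name => 'peak_capture', dir => 'CAPTURE_DIR');
(let the production workload run over the chosen window, then)
SQL> EXEC DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE;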
4.Data Recovery Advisor:
Data Recovery Advisor automatically diagnoses data failures, determines and presents appropriate repair options, and executes repairs at the user's request. The purpose of Data Recovery Advisor is to reduce the mean time to recover (MTTR) and improve the manageability and reliability of Oracle Database by providing a centralized tool for automated data repair. In the context of Data Recovery Advisor, a data integrity check is a diagnostic procedure run by the Health Monitor to assess the state of the database or its components. Data integrity checks are invoked reactively when an error occurs; you can also invoke checks manually. The recovery process begins when you either suspect or discover a failure. You can discover failures in many ways, including error messages, alerts, trace files, and data integrity checks. You can then use Data Recovery Advisor to gain information and advice about failures and repair them automatically. A failure is a persistent data corruption detected by a data integrity check. A repair is an action that fixes one or more failures.
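A minimal sketch of the RMAN workflow (the failures listed and the repairs chosen depend entirely on what the data integrity checks have recorded):
RMAN> LIST FAILURE;
RMAN> ADVISE FAILURE;
RMAN> REPAIR FAILURE PREVIEW; --> review the generated repair script
RMAN> REPAIR FAILURE; --> execute the repair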
5.Improved Block Corruption Detection:
In earlier releases, RMAN-detected block corruptions were recorded in V$DATABASE_BLOCK_CORRUPTION. In Oracle Database 11g, live block corruptions are also recorded in that view. This view is automatically updated when corruptions are repaired; for example, using block media recovery or data file recovery. The VALIDATE command is enhanced with many new options such as VALIDATE ... BLOCK and VALIDATE DATABASE. This feature shortens the time to discover block corruptions.
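A minimal sketch (the datafile and block numbers are illustrative):
RMAN> VALIDATE DATABASE;
RMAN> VALIDATE DATAFILE 4 BLOCK 10;
SQL> SELECT file#, block#, blocks, corruption_type FROM v$database_block_corruption;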
Monday, June 1, 2009
RAC Configuration
Topic : Real Application Clusters
Version : Oracle 10g Release 2
Platform : RHEL AS 4 Update 4
=========== * ============ * ============ * ============ * ============ * ==========
CONTENTS
Oracle 10g RAC
1. REQUIREMENTS
2. OPERATING SYSTEM INSTALLATION
3. CONFIGURING OPERATING SYSTEM
4. CONFIGURING USER and SHARED DISK
5. CONFIGURING OCFS2 (Oracle Clustered File System)
6. INSTALL AND CONFIGURE ORACLE CLUSTERWARE
7. INSTALL ORACLE DATABASE SOFTWARE
8. CONFIGURING DISKS FOR ASM
9. CREATE ORACLE DATABASE
10.COMMAND LINE UTILITIES
=========== * ============ * ============ * ============ * ============ * ==========
Oracle 10g - REAL APPLICATION CLUSTERS
------------------------------------------------------------------------------------------------------------------
1. REQUIREMENTS
------------------------------------------------------------------------------------------------------------------
HARDWARE:
Servers / Nodes : Min 2 nodes
Processor : PIV and above
RAM : Min 1 GB
Hard Disk : 15 GB for Operating System and 4 GB for Oracle Software
Network Cards : 2 cards in each node (1 for Public IP, 1 for Private IP)
: 1 Cross cable / Network switch for Private Interconnect
Shared Disk : Shared Disk (FireWire ohci 1394 Compatible)#
: FireWire cables to connect shared disk from each node
: One FireWire card in each node to connect to FireWire port
#The 1394 Open Host Controller Interface (Open HCI) Revision 1.1 is the latest implementation specification of the link layer protocol of the 1394 Serial Bus for use in PC systems. The link layer provides additional features to support the transaction and bus management layers. The 1394 Open HCI also includes DMA engines for high-performance data transfer and a host bus interface.
SOFTWARE:
Operating System : RedHat Linux AS 4 Update 4
Cluster Software : Oracle 10g Release 2 Clusterware
Database Software : Oracle 10g Release 2 Enterprise Edition for Linux
Modules : OCFS Modules:
ocfs2-2.6.9-42.ELsmp-1.2.7-1.el4.i686.rpm (292.2KB)
ocfs2console-1.2.7-1.el4.i386.rpm (169KB)
ocfs2-tools-1.2.7-1.el4.i386.rpm (1019KB)
These modules are required to configure OCFS (oracle clustered file system) on the shared storage
Network card drivers : To recognize network cards.
Note:
Details of installation can also be obtained from http://otn.oracle.com
OCFS modules can be downloaded from: http://oss.oracle.com/projects/ocfs/
------------------------------------------------------------------------------------------------------------------
2. OPERATING SYSTEM INSTALLATION
------------------------------------------------------------------------------------------------------------------
Partitions (select manual disk partitioning):
/ - 2 GB - Fixed Size - for root
/boot - 100M - Fixed Size - for Boot Loader
/usr - 6 to 8 GB - Fixed Size - for selected packages
[ select all packages while installation]
swap - Twice of RAM - Fixed Size - swap / virtual memory
/tmp - 1GB - Fixed Size - temporary files area
/var - 500 M - Fixed Size - O/S log files
/opt - 500 M - Fixed Size - for optional packages
/home - Fill to maximum available size - for storing user files
Host Name : rac1 and rac2 respectively for both nodes
Public IP : Configure now or Configure using DHCP
Firewall : Select “nofirewall”
SE Linux : Select “Disable”
Packages : Select “Customize” and select “Everything”
------------------------------------------------------------------------------------------------------------------
3. CONFIGURING OPERATING SYSTEM
------------------------------------------------------------------------------------------------------------------
a. Hostname and IP address
Use 'neat' command to assign IP address and Hostname
# neat
RAC1 RAC2
Public IP : 192.9.200.226 192.9.200.245 Mask: 255.255.255.0
Private IP : 10.0.0.4 10.0.0.5 Mask: 255.0.0.0
[Activate both network cards]
Verify your configurations by (on both nodes):
# ifconfig --> for ip addresses
# hostname --> for hostname
b. Setup the Hosts file (all nodes)
# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
#PUBLIC IP
192.9.200.171 rac1 rac1.qsoft.com
192.9.200.172 rac2 rac2.qsoft.com
#PRIVATE IP
10.0.0.3 racp1 racp1.qsoft.com
10.0.0.4 racp2 racp2.qsoft.com
#VIRTUAL IP
192.9.200.70 racvip1 racvip1.qsoft.com
192.9.200.71 racvip2 racvip2.qsoft.com
Note: Ensure that the node names are not included for the loopback address in the /etc/hosts file. If the machine name is listed in the loopback address entry as below:
127.0.0.1 rac1 localhost.localdomain localhost
it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost
If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
Ping each other node to check the connectivity:
[rac1]# ping rac2
[rac1]# ping racp2
[rac2]# ping rac1
[rac2]# ping racp1
Note: Virtual IP will not ping until the clusterware is installed
c. Setup the kernel parameters (all nodes)
Edit /etc/sysctl.conf
[root@rac1 ~]# vi /etc/sysctl.conf
kernel.shmmax=570000000
kernel.sem=250 32000 100 128
net.core.rmem_default=262144
net.core.rmem_max=262144
net.core.wmem_default=262144
net.core.wmem_max=262144
fs.file-max=65536
net.ipv4.ip_local_port_range=1024 65000
kernel.hostname = rac1 # change the hostname in other nodes
kernel.domainname = qsoft.com
Load the sysctl settings (without reboot) – on all nodes
[root@rac1 ~]# sysctl -p
Check the hostname and domain name (on all nodes):
[root@rac1 ~]# hostname
rac2
[root@rac1 ~]# domainname
qsoft.com
d. Check Firewall is disabled
[root@rac1 ~]# /etc/rc.d/init.d/iptables status
Firewall is stopped.
e. Disable SE Linux (if enabled) – on all nodes:
[root@rac1 root]# /usr/bin/system-config-securitylevel &
f. Enable/Disable services (both nodes)
# chkconfig sendmail off --> turn off the sendmail configuration
# chkconfig cups off --> turn off the printer service (optional)
# chkconfig xinetd on --> for telnet service
# chkconfig telnet on --> enable telnet
# chkconfig vsftpd on --> for ftp service
# service xinetd restart --> restart the services
# service vsftpd restart
g. Apply compat patches (all nodes)
[root@rac1 root]# ls
compat-gcc-7.3-2.96.128.i386.rpm
compat-gcc-c++-7.3-2.96.128.i386.rpm
compat-libstdc++-7.3-2.96.128.i386.rpm
compat-libstdc++-devel-7.3-2.96.128.i386.rpm
[root@rac1 root]# rpm -ivh *.rpm --force
h. Configure Shared Storage:
# ping 192.9.200.121
# vi /etc/iscsi.conf
discovery address = 192.9.200.121 ----->>uncomment this line and specify the IP
:wq
# chkconfig iscsi on -->for install & load iscsi Modules
# service iscsi restart (generate initiatorname file)
#cat /etc/initiatorname.iscsi --->>>Do not edit this file
provide macid of shared storage
# fdisk -l --> list the disks and partitions
------------------------------------------------------------------------------------------------------------------
4. CONFIGURING USER and SHARED DISK
------------------------------------------------------------------------------------------------------------------
a. Create Oracle user and Directories
You will be using OCFS2 to store the files required to be shared for the Oracle Clusterware software. When using OCFS2, the UID of the UNIX user oracle and GID of the UNIX group dba should be identical on all machines in the cluster. If either the UID or GID are different, the files on the OCFS file system may show up as "unowned" or may even be owned by a different user.
Execute the following commands on all nodes:
# groupadd -g 2000 dba
# groupadd -g 1000 oinstall
# useradd -u 600 -g oinstall -G dba oracle10g
# passwd oracle10g
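Verify that the UID and GID are identical on every node (required for OCFS2, as explained above); with the values used here the output should be:
# id oracle10g
uid=600(oracle10g) gid=1000(oinstall) groups=1000(oinstall),2000(dba)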
Create mount point for cluster files in all nodes:
# mkdir /home/cluster --> mount point for OCFS2
# chown -R oracle10g:oinstall /home/cluster --> change ownership to Oracle user
b. Create partitions in the shared disk [ FROM ONE NODE ONLY]
FileSystem partition size mountpoint
ocfs2 /dev/sdb1 1GB /home/cluster --> Cluster registry and Voting disk
ASM /dev/sdb2 2G +ORCL_DATA1 --> Oracle database files
ASM /dev/sdb3 2G +ORCL_DATA1 --> Oracle database files
ASM /dev/sdb4 4G +FLASH_RECOVERY_AREA --> Flash recovery area
# fdisk /dev/sdb --> give the name of the shared device as detected on your system
Command (m for help): p --> Type 'p' to print partitions
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
--> Currently there are no partitions
Command (m for help): n --> Type 'n' for creating new partition
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-16709, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-16709, default 16709): +1G
Create 3 primary partitions and allocate the remaining space as an extended partition
Command (m for help): n
Command action
e extended
p primary partition (1-4)
e
Selected partition 4
First cylinder (34-16709, default 34):
Using default value 34
Last cylinder or +size or +sizeM or +sizeK (34-16709, default 16709):
Using default value 16709
Command (m for help): p
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 13 104391 83 Linux
/dev/sdb2 14 26 104422+ 83 Linux
/dev/sdb3 27 33 56227+ 83 Linux
/dev/sdb4 34 16709 133949970 5 Extended
Command (m for help): n --> Type 'n' for creating extended partition
First cylinder (34-16709, default 34):
Using default value 34
Last cylinder or +size or +sizeM or +sizeK (34-16709, default 16709): +4G
Command (m for help): n
First cylinder (47-16709, default 47):
Using default value 47
Last cylinder or +size or +sizeM or +sizeK (47-16709, default 16709): +4G
....
....
Command (m for help): p --> Type 'p' to print all partitions
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 123 987966 83 Linux
/dev/sdb2 124 2556 19543072+ 83 Linux
/dev/sdb3 2557 4989 19543072+ 83 Linux
/dev/sdb4 4990 16709 94140900 5 Extended
/dev/sdb5 4990 9853 39070048+ 83 Linux
/dev/sdb6 9854 14717 39070048+ 83 Linux
Command (m for help): w --> Type 'w' to save and quit
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
After creating all required partitions, you should now inform the kernel of the partition changes by running the following command as the root user account:
[root@rac1 ~]# partprobe --> issue this command from all nodes
[root@rac1 root]# fdisk -l /dev/sdb --> To check the updated list of partitions
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 123 987966 83 Linux
/dev/sdb2 124 2556 19543072+ 83 Linux
/dev/sdb3 2557 4989 19543072+ 83 Linux
/dev/sdb4 4990 16709 94140900 5 Extended
/dev/sdb5 4990 9853 39070048+ 83 Linux
/dev/sdb6 9854 14717 39070048+ 83 Linux
c. Setting Shell limits for Oracle User
To improve the performance of the software on Linux systems, Oracle recommends you increase the following shell limits for the oracle user:
Maximum number of open file descriptors -> nofile -> 65536 (hard limit)
Maximum number of processes available to a single user -> nproc -> 16384 (hard limit)
Execute the following from all nodes:
[root@rac1 ~]# vi /etc/security/limits.conf
oracle10g soft nproc 2047
oracle10g hard nproc 16384
oracle10g soft nofile 1024
oracle10g hard nofile 65536
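For these limits to take effect at login, the pam_limits module is normally required in /etc/pam.d/login (an assumption based on standard Oracle-on-Linux setup; adjust to your PAM configuration), and the values can then be verified from an oracle10g session:
[root@rac1 ~]# vi /etc/pam.d/login
session required pam_limits.so
[oracle10g@rac1 ~]$ ulimit -Hn --> should report 65536
[oracle10g@rac1 ~]$ ulimit -Hu --> should report 16384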
d. Setting correct date and time
During the installation of Oracle Clusterware, the Database, and the Companion CD, the Oracle Universal Installer (OUI) first installs the software to the local node running the installer (i.e. linux1). The software is then copied remotely to all of the remaining nodes in the cluster (i.e. linux2). During the remote copy process, the OUI will execute the UNIX "tar" command on each of the remote nodes to extract the files that were archived and copied over. If the date and time on the node performing the install is greater than that of the node it is copying to, the OUI will throw an error from the "tar" command indicating it is attempting to extract files stamped with a time in the future.
Ensure that the date and time of the all nodes are same (unless you are using Network Time). To set the date and time now, you can execute the following commands:
rac1# date -s "9/13/2007 23:00:00"
rac2# date -s "9/13/2007 23:00:20" --> node 2 is greater than node 1 for safety
e. Configuring “hangcheck-timer”
Starting with Oracle9i Release 2 (9.2.0.2), the watchdog daemon has been deprecated in favor of a Linux kernel module named hangcheck-timer, which addresses availability and reliability problems much better. The hangcheck-timer module is loaded into the Linux kernel and checks whether the system hangs. It sets a timer and checks the timer after a certain amount of time. There is a configurable threshold; if it is exceeded, the module reboots the machine.
The hangcheck-timer module was normally shipped only by Oracle; however, it is now included with Red Hat Linux AS starting with kernel versions 2.4.9-e.12 and higher.
[root@rac1 ~]# find /lib/modules -name "hangcheck-timer.ko" --> check the module presence
/lib/modules/2.6.9-22.EL/kernel/drivers/char/hangcheck-timer.ko
hangcheck-tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
hangcheck-margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds; Oracle recommends setting it to 180 seconds.
Set the hangcheck-timer settings in /etc/modprobe.conf (all nodes)
[root@rac1 ~]# vi /etc/modprobe.conf
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Add hangcheck-timer module in /etc/rc.local to probe it at every startup:
[root@rac1 ~]# vi /etc/rc.local
/sbin/modprobe hangcheck-timer
To test the hangcheck-timer module manually (before reboot):
[root@rac1 ~]# modprobe hangcheck-timer
[root@rac1 ~]# grep Hangcheck /var/log/messages | tail -2
May 29 11:40:35 rac1 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds).
f. Configure racnodes for remote access:
Before you can install and use Oracle Real Application clusters, you must configure either secure shell (SSH) or remote shell (RSH) for the "oracle" UNIX user account on all cluster nodes. The goal here is to setup user equivalence for the "oracle" UNIX user account. User equivalence enables the "oracle" UNIX user account to access all other nodes in the cluster (running commands and copying files) without the need for a password.
Using RSH:
# vi /etc/hosts.equiv
+rac1
+rac2
+racp1
+racp2
# chkconfig rsh on
# chkconfig rlogin on
# which rsh
/usr/kerberos/bin/rsh
# mv /usr/kerberos/bin/rsh /usr/kerberos/bin/rsh.bak
# mv /usr/kerberos/bin/rlogin /usr/kerberos/bin/rlogin.bak
# mv /usr/kerberos/bin/rcp /usr/kerberos/bin/rcp.bak
# which rsh
/usr/bin/rsh
$ rcp a.txt rac2:
Using SSH:
Perform the following steps (1 to 6) on all the nodes.
1. su – oracle10g
2. mkdir .ssh
3. chmod -R 700 .ssh
4. cd .ssh
5. ssh-keygen -t rsa
6. ssh-keygen -t dsa
And perform the following steps from the first node only.
7. ssh rac1 cat /home/oracle10g/.ssh/id_rsa.pub >> authorized_keys
8. ssh rac1 cat /home/oracle10g/.ssh/id_dsa.pub >> authorized_keys
9. ssh rac2 cat /home/oracle10g/.ssh/id_rsa.pub >> authorized_keys
10. ssh rac2 cat /home/oracle10g/.ssh/id_dsa.pub >> authorized_keys
11. scp authorized_keys rac2:/home/oracle10g/.ssh/
12. scp known_hosts rac2:/home/oracle10g/.ssh/
Note: Step 12 should not prompt for a password
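Before starting the installer, confirm that user equivalence works in both directions without a password prompt (a quick check using the hostnames above):
[oracle10g@rac1 ~]$ ssh rac2 date
[oracle10g@rac1 ~]$ ssh racp2 date
[oracle10g@rac2 ~]$ ssh rac1 date
[oracle10g@rac2 ~]$ ssh racp1 date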
------------------------------------------------------------------------------------------------------------------
5. CONFIGURING OCFS2 (Oracle Clustered File System)
------------------------------------------------------------------------------------------------------------------
a. Check necessary packages for Oracle software
The following packages must be installed for Oracle 10g:
make-3.80-5
glibc-2.3.4-2.9
glibc-devel-2.3.4-2.9
glibc-headers-2.3.4-2.9
glibc-kernheaders-2.4-9.1.87
cpp-3.4.3-22.1
compat-db-4.1.25-9
compat-gcc-32-3.2.3-47.3
compat-gcc-32-c++-3.2.3-47.3
compat-libstdc++-33-3.2.3-47.3
compat-libstdc++-296-2.96-132.7.2
openmotif-2.2.3-9.RHEL4.1
setarch-1.6-1
Use “rpm” command to check the availability of package:
[root@rac1 ~]# rpm -qa | grep make-3.80-5
make-3.80-5
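Rather than checking each package individually, a small loop can query them all (a sketch; the list simply mirrors the packages above, and missing ones are reported as "not installed"):
# for p in make glibc glibc-devel glibc-headers glibc-kernheaders cpp compat-db \
  compat-gcc-32 compat-gcc-32-c++ compat-libstdc++-33 compat-libstdc++-296 \
  openmotif setarch; do rpm -q $p; done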
b. Install and Configure OCFS
OCFS Release 1 was released in 2002 to enable Oracle RAC users to run the clustered database without having to deal with RAW devices. The filesystem was designed to store database related files, such as data files, control files, redo logs, archive logs, etc.
Install the OCFS rpms (all nodes):
[root@rac1 OCFS_AS4U2]# ls
ocfs2-2.6.9-42.ELsmp-1.2.7-1.el4.i686.rpm ocfs2-tools-1.2.7-1.el4.i386.rpm
ocfs2console-1.2.7-1.el4.i386.rpm
[root@rac1 OCFS_AS4U4]# rpm -ivh *.rpm
Preparing... ########################################### [100%]
1:ocfs2-tools ########################################### [ 33%]
2:ocfs2-2.6.9-42.ELsmp ########################################### [ 67%]
3:ocfs2console ########################################### [100%]
Configure Cluster Nodes to OCFS (on all nodes):
[root@rac1 ~]# ocfs2console &
Cluster -> Configure Nodes -> “Node Configuration” window -> Add -> Enter the hostname and ip address of all nodes (keeping port number unchanged) -> “Apply” -> Close -> File -> Quit
[root@rac1 ~]# cat /etc/ocfs2/cluster.conf --> to verify the configuration
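The generated file should look roughly like this (layout as written by ocfs2console; the ip_address lines hold whichever addresses you entered in the Node Configuration window):
node:
        ip_port = 7777
        ip_address = 192.9.200.171
        number = 0
        name = rac1
        cluster = ocfs2
node:
        ip_port = 7777
        ip_address = 192.9.200.172
        number = 1
        name = rac2
        cluster = ocfs2
cluster:
        node_count = 2
        name = ocfs2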
Understanding O2CB Service:
Before we can do anything with OCFS2 like formatting or mounting the file system, we need to first have OCFS2's cluster stack, O2CB, running (which it will be as a result of the configuration process performed above). The stack includes the following services:
● NM: Node Manager that keeps track of all the nodes in cluster.conf
● HB: Heartbeat service that issues up/down notifications when nodes join or leave the cluster
● TCP: Handles communication between the nodes
● DLM: Distributed lock manager that keeps track of all locks, their owners and status
● CONFIGFS: User-space-driven configuration file system mounted at /config
● DLMFS: User-space interface to the kernel-space DLM
All of the above cluster services have been packaged in the o2cb system service (/etc/init.d/o2cb)
You can use the following commands to manage the o2cb services:
/etc/init.d/o2cb status --> check status
/etc/init.d/o2cb load --> Load all ocfs modules
/etc/init.d/o2cb online ocfs2 --> Online the cluster we created: ocfs
/etc/init.d/o2cb offline ocfs2 --> Offline the cluster we created: ocfs
/etc/init.d/o2cb unload --> Unload all ocfs modules
Configure O2CB to Start on Boot:
Before attempting to configure the on-boot properties:
REMOVE the following lines in /etc/init.d/o2cb (all nodes)
### BEGIN INIT INFO
# Provides: o2cb
# Required-Start:
# Should-Start:
# Required-Stop:
# Default-Start: 2 3 5
# Default-Stop:
# Description: Load O2CB cluster services at system boot.
### END INIT INFO
Reconfigure to implement the change (all nodes):
# chkconfig --del o2cb
# chkconfig --add o2cb
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Unload modules from all nodes (all nodes):
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
Start configuration (all nodes):
#/etc/init.d/o2cb configure
Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [7]: 600
Format the drive with OCFS2 File system (from Only ONE node):
Format from GUI utility ocfs2console, use the menu [Tasks] –> Select appropriate settings -> [Format].
OR Command line [mkfs]:
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oradatafiles /dev/sdb1
where b -> Block size
C -> Cluster size
N -> Number of node slots
L -> Volume label
Mount the formatted slice (on all nodes) to the directory '/home/cluster':
# mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /home/cluster
Execute “mount” command to verify the mounted file system (all nodes):
[root@rac1 ~]# mount | grep /home/cluster
/dev/sdb1 on /home/cluster type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
Add entry in /etc/fstab for auto mount at startup (all nodes):
/dev/sdb1 /home/cluster ocfs2 _netdev,datavolume,nointr 0 0
Check permissions on the new file system (all nodes):
# ls -ld /home/cluster/
drwxr-xr-x 3 root root 4096 May 25 21:29 /home/cluster/
Change ownership to the Oracle user (all nodes):
# chown oracle10g:oinstall /home/cluster; chmod 775 /home/cluster/
# ls -ld /home/cluster/
drwxrwxr-x 3 oracle10g oinstall 4096 May 25 21:29 /home/cluster/
REBOOT ALL THE NODES
After reboot, verify whether the ocfs2 partition is mounted (all nodes):
[root@rac1 ~]# mount | grep /home/cluster
/dev/sdb1 on /home/cluster type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
------------------------------------------------------------------------------------------------------------------
6. INSTALL AND CONFIGURE ORACLE CLUSTERWARE
------------------------------------------------------------------------------------------------------------------
Logout from root user (GUI) and Login to Oracle user
a. Download and extract the Oracle 10g Release 2 Clusterware (only ONE node)
[oracle10g@rac1 10gRAC]$ unzip 10201_clusterware_linux32.zip
[oracle10g@rac1 10gRAC]$ ls
10201_clusterware_linux32.zip
clusterware
[oracle10g@rac1 10gRAC]$ cd clusterware/
[oracle10g@rac1 clusterware]$ ls
cluvfy doc install response rpm runInstaller stage upgrade welcome.html
b. Use the cluvfy utility to verify the prerequisites before installation
# cd clusterware
# ls
cluvfy doc install response rpm runInstaller stage upgrade welcome.html
# cd rpm
# rpm -ivh cvuqdisk-1.0.1-1.rpm --> install this rpm in all the nodes to use the cluvfy utility
Verify the node connectivity (only ONE node):
[oracle10g@rac1 cluvfy]$
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh comp nodecon -n rac1,rac2 -verbose
Verify the prerequisites for CRS installation (only ONE node):
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh stage -pre crsinst -n rac1,rac2 -verbose
If all the above verifications complete successfully, then you can proceed with the CRS installation.
c. Invoke the Oracle Universal Installer (only ONE node)
[oracle10g@rac1 clusterware]$ ./runInstaller
1. Click Next
2. Choose path for OraInventory “/home/oracle10g/oraInventory”, Choose Oracle group as “oinstall”
3. Home Name: OraCrs10g_home Path: /home/oracle10g/crs
4. Verify requirements and click “next”
5. Specify cluster configuration details:
Cluster Name: crs
Cluster Nodes (public / private / virtual):
rac1 racp1 racvip1
rac2 racp2 racvip2
6. Verify the network interface usage:
eth0 192.9.200.0 Public
eth1 10.0.0.0 Private
7. Specify Oracle Cluster Registry Location:
Location: /home/cluster/OCR1
Mirror Location: /home/cluster/OCR2
8. Specify Voting Disk Location:
Location: /home/cluster/Vote1
Mirror1: /home/cluster/Vote2
Mirror2: /home/cluster/Vote3
9. Click Install to start installation
10. Execute Configuration Scripts:
Execute “orainstRoot.sh” - on all nodes as root user only
[root@rac1 ~]# /home/oracle10g/oraInventory/orainstRoot.sh
Changing permissions of /home/oracle10g/oraInventory to 770.
Changing groupname of /home/oracle10g/oraInventory to dba.
The execution of the script is complete
[root@rac1 ~]# /home/oracle10g/crs/root.sh
While executing root.sh on any of the remote nodes, if you get a message such as “eth0 is not public” or a
similar error, you need to execute the VIPCA (Virtual IP Configuration Assistant) manually.
Running vipca manually:
[root@rac2 ~]# sh /home/oracle10g/crs/bin/vipca
Enter the proper IP address of your VIP and its alias names, then click Finish to complete the configuration.
You can verify the pinging of Virtual IP address now:
[oracle10g@rac2 ~]$ ping racvip1
[oracle10g@rac1 ~]$ ping racvip2
Return to the “ Execute Configuration Scripts” Screen and Click “OK”
11. Once the configurations are run successfully, click “Exit” to exit the installation
d. Post install verification (all nodes)
List the cluster nodes:
[oracle10g@rac1 ~]$ /home/oracle10g/crs/bin/olsnodes -n
rac1 1
rac2 2
Check oracle cluster auto-startup scripts:
[oracle10g@rac1 ~]$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 May 29 21:30 /etc/init.d/init.crs
-r-xr-xr-x 1 root root 4716 May 29 21:30 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35396 May 29 21:30 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root 3192 May 29 21:30 /etc/init.d/init.evmd
Check cluster ready services:
[oracle10g@rac1 ~]$ ps -ef | grep crs
Check cluster synchronization services:
[oracle10g@rac1 ~]$ ps -ef | grep css
Check the pinging of Virtual IP:
[oracle10g@rac1 ~]$ ping racvip2
[oracle10g@rac2 ~]$ ping racvip1
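You can also list the state of all registered CRS resources; at this point only the node applications should exist, with output along these lines (resource names follow the ora.<node>.<app> pattern):
[oracle10g@rac1 ~]$ /home/oracle10g/crs/bin/crs_stat -t
Name           Type           Target    State     Host
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2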
------------------------------------------------------------------------------------------------------------------
7. INSTALL ORACLE DATABASE SOFTWARE
------------------------------------------------------------------------------------------------------------------
Verify the prerequisites for RDBMS installation (only ONE node):
[oracle10g@rac1 cluvfy]$ cd clusterware/cluvfy
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh stage -pre dbinst -n rac1,rac2 -verbose
If all the above verifications complete successfully, then you can proceed with the database software installation.
a. Download and extract the Oracle 10g Release 2 Database Software (one NODE only)
[oracle10g@rac1 10gRAC]$ unzip Ora10gSetup.zip
[oracle10g@rac1 10gRAC]$ cd database/
[oracle10g@rac1 database]$ ls
doc install response runInstaller stage welcome.html
b. Invoke the Oracle Universal Installer (one NODE only)
[oracle10g@rac1 database]$ ./runInstaller
1. You can verify the cluster installation by clicking “installed products”. Click “next”
2. Choose “Enterprise Edition”
3. Choose Home details:
Name: OraDb10g_home1
Path: /home/oracle10g/oracle/product/10.2.0/db_1
4. Click “Select all” for installing in all clustered nodes
5. Verify the requirements and click “next”
6. Choose “Install database Software only”
7. Click “install” to start installation
8. Execute Configuration Scripts
Execute “root.sh” in all nodes (one at a time) as root user only
[root@rac1 ~]# /home/oracle10g/oracle/product/10.2.0/db_1/root.sh
Once the scripts are run successfully, return to “Execute Configuration Scripts” window and click “ok”
9. Click “Exit” to exit the installation
c. Set the Oracle Environment
Edit the .bash_profile of oracle user (all nodes):
export ORACLE_BASE=/home/oracle10g
export ORACLE_HOME=$ORACLE_BASE/oracle/product/10.2.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/crs
export ORACLE_SID=orcl1 --> change sid in other nodes
export PATH=$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$ORA_CRS_HOME/lib
Execute the .bash_profile (all nodes):
[oracle10g@rac1 ~]$ . .bash_profile
Verify the environment:
[oracle10g@rac1 ~]$ echo $ORACLE_HOME
/home/oracle10g/oracle/product/10.2.0/db_1
------------------------------------------------------------------------------------------------------------------
8. CONFIGURING DISKS FOR ASM
------------------------------------------------------------------------------------------------------------------
a. Configure disks for ASM with Standard Linux I/O
Edit /etc/sysconfig/rawdevices to map the devices with Linux raw files (all nodes)
[root@rac1 ~]# vi /etc/sysconfig/rawdevices
/dev/raw/raw2 /dev/sdb2 --> sdb1 is not mapped as it is used for OCFS2
/dev/raw/raw3 /dev/sdb3
/dev/raw/raw5 /dev/sdb5 --> sdb4 is not mapped as it is an extended partition
Restart the rawdevices service (all nodes):
[root@rac1 ~]# service rawdevices restart
# raw -qa --> to check the raw devices mapped to the shared device
Change permissions to raw devices, so that the Oracle user has read and write access (all nodes):
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw2; chmod 600 /dev/raw/raw2
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw3; chmod 600 /dev/raw/raw3
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw4; chmod 600 /dev/raw/raw4
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw5; chmod 600 /dev/raw/raw5
Edit /etc/rc.local to add permission details to be assigned at every startup (all nodes):
[root@rac1 ~]# vi /etc/rc.local
chown oracle10g:dba /dev/raw/raw2; chmod 600 /dev/raw/raw2
chown oracle10g:dba /dev/raw/raw3; chmod 600 /dev/raw/raw3
chown oracle10g:dba /dev/raw/raw4; chmod 600 /dev/raw/raw4
chown oracle10g:dba /dev/raw/raw5; chmod 600 /dev/raw/raw5
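After a reboot (or after running the commands above), you can confirm the bindings and permissions; the raw devices should be owned by oracle10g:dba with mode 600:
[root@rac1 ~]# raw -qa --> shows the bindings to the shared-disk partitions
[root@rac1 ~]# ls -l /dev/raw/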
------------------------------------------------------------------------------------------------------------------
9. CREATE ORACLE DATABASE
------------------------------------------------------------------------------------------------------------------
a. Configure Listeners on all nodes (as it would be required for creating database by DBCA)
Use “netca” and configure listeners and start the listener (from only ONE node)
[oracle10g@rac1 ~]$ netca
1. Choose “RAC” option
2. Choose “Listener configuration”
3. Choose “Add”
4. Add the name, protocol, and port details
5. Wait for the listener to start on all nodes. If the listener does not start, you can start it manually using the LSNRCTL command-line utility
6. Click finish to exit
b. Configure ASM instance (using DBCA)
Invoke Database Configuration Assistant (DBCA) - from only ONE node
[oracle10g@rac1 ~]$ dbca
1. Choose “RAC” Options
2. Choose “Configure Automatic Storage Management”
3. Click “Select All” to select all the nodes to be configured
4. Specify the password for SYS user of ASM instance. Choose “Spfile” for creating parameter file. Specify the location of OCFS2 file system: /home/cluster/spfile+ASM.ora
5. Click “Ok” to create ASM instance
6. Initially, you will not have any diskgroups created. Click “Create New” to create diskgroups.
7. Give the diskgroup name, select disk paths required, specify the fail group name, and click “OK”
8. Please wait until the disk group is created.
9. Now, you can see the list of diskgroups created.
10. Similarly, you can create more diskgroups with the existing disks.
11. Finally, click “Finish” to exit the ASM configuration.
You can verify the asm instance (all nodes):
[oracle10g@rac1 admin]$ ps -ef | grep asm
You can login to asm instance:
[oracle10g@rac1 admin]$ export ORACLE_SID=+ASM1
[oracle10g@rac1 admin]$ sqlplus
SQL*Plus: Release 10.2.0.1.0 - Production on Wed May 30 09:47:57 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter user-name: /as sysdba
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> select name from v$asm_diskgroup;
SQL> select group_number, disk_number, mount_status, header_status, state, name, failgroup from v$asm_disk;
SQL> select group_number, name, state, free_mb, offline_disks from v$asm_diskgroup;
c. Create database using DBCA
Invoke Database Configuration Assistant (DBCA) - from only ONE node
[oracle10g@rac1 ~]$ dbca
1. Choose “RAC” Options
2. Choose “Create database”
3. Click “Select All” to select all the nodes
4. Choose the type of database “General Purpose”
5. Specify the Global Database Name as “orcl”
6. Choose “Configure database with enterprise manager”
7. Specify the passwords for user accounts
8. Choose “Automatic storage management”
9. Select the Disk groups where the database has to be created
10. Choose “Use Oracle Managed Files” and edit the diskgroup name if you want
11. Choose whether to use Flash Recovery Area and Archiving for your database
12. Select whether to create sample schemas
13. Review the services
14. Choose Automatic or Custom memory management. Ensure that you have enough space for the shared pool.
15. Review the files
16. Click “Finish”
17. Review the Initialization variables and click “OK” to start the database creation
18. Please wait until the database is created successfully
19. A summary is shown at the end for your information. Click “Exit” (you can note the SID, SPFile path and OEM address)
20. The database gets restarted after clicking the “Exit” button. You can visit the OEM Home page by using the address shown (like: http://rac1:1158/em)
------------------------------------------------------------------------------------------------------------------
10. COMMAND LINE UTILITIES
------------------------------------------------------------------------------------------------------------------
Using command line utilities:
Manual configuration of OEM dbconsole:
1. Ensure that you have listeners and password files created for your database and ASM instances on all nodes
2. Configure the repository for Enterprise Manager
$ emca -repos create
3. Configure the EM dbconsole
$ emca -config dbcontrol db
4. For any help in the syntax, you can use the following command
$ emca help=y
5. Start EM DBConsole during next startup:
$ emctl start dbconsole
Manual configuration of database services with srvctl:
Add ASM details:
$ srvctl add asm -n rac1 -i +ASM1 -o $ORACLE_HOME
$ srvctl add asm -n rac2 -i +ASM2 -o $ORACLE_HOME
$ srvctl enable asm -n rac1
$ srvctl enable asm -n rac2
Add database details:
$ srvctl add database -d orcl -o $ORACLE_HOME
$ srvctl add instance -d orcl -i orcl1 -n rac1
$ srvctl add instance -d orcl -i orcl2 -n rac2
Check the configuration:
$ srvctl config database -d orcl
Start or Stop database:
$ srvctl { start | stop } database -d orcl [ -o normal ]
Start or Stop instance:
$ srvctl { start | stop } instance -d orcl -i orcl1
Check configuration in the OCR:
$ crs_stat -t
Follow these steps to manually shut down your servers:
$ emctl stop dbconsole --> to stop the Database Console
$ srvctl stop database -d orcl --> to stop the database in all the instances
$ srvctl status database -d orcl --> to check the status
$ srvctl stop asm -n rac1 --> to stop the ASM in each node
$ srvctl stop asm -n rac2 --> to stop the ASM in each node
$ srvctl stop nodeapps -n rac1 --> to stop other utilities in each node
$ srvctl stop nodeapps -n rac2
# crsctl stop crs --> to stop cluster ready services
Checklist at next startup:
– Switch on the shared disks
– Switch on the Servers
– Check for CSS and CRS Processes (all nodes)
[oracle10g@rac1 ~]$ crsctl check crs
– Check the Pinging between nodes
[oracle10g@rac2 ~]$ ping rac1
[oracle10g@rac2 ~]$ ping racp1
[oracle10g@rac2 ~]$ ping racvip1
– Check whether OCFS is mounted
[oracle10g@rac2 ~]$ mount | grep /home/cluster
– Start the Node applications in all nodes
[oracle10g@rac1 ~]$ srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is running on node: rac1
[oracle10g@rac1 ~]$ srvctl start nodeapps -n rac1
[oracle10g@rac1 ~]$ srvctl start nodeapps -n rac2
– Start the ASM instance in all nodes
[oracle10g@rac1 ~]$ srvctl start asm -n rac1
[oracle10g@rac1 ~]$ srvctl start asm -n rac2
– Start the Database instance from one node
[oracle10g@rac1 ~]$ srvctl start database -d orcl
– Start the Enterprise Manager DB Console on all nodes
[oracle10g@rac1 ~]$ emctl start dbconsole
[oracle10g@rac2 ~]$ emctl start dbconsole
Configure Network interfaces:
$ oifcfg getif -node rac1
$ oifcfg getif -global
$ oifcfg setif -global <interface>/<subnet>:<public|cluster_interconnect>
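For example, with the interfaces used in this guide (eth0 on subnet 192.9.200.0 as public, eth1 on 10.0.0.0 as the private interconnect), the global interface definitions would be registered roughly as:
$ oifcfg setif -global eth0/192.9.200.0:public
$ oifcfg setif -global eth1/10.0.0.0:cluster_interconnect
$ oifcfg getif -global --> verify the entries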
Note on Parameter file:
Parameter Types:
Identical across instances
db_name
compatible
cluster_database
control_files
db_block_size
Unique across instances
instance_name
instance_number
rollback_segments
thread
undo_tablespace
Multi-Valued Parameters
fast_start_mttr_target
instance_groups (for parallel query operations)
Spfile Features:
Oracle recommends using an spfile
Easier to use and manage
Single, central location
Parameters are applicable to all instances
It permits dynamic changes
Persistent
Can specify common values and specific values
*.open_cursors=300
racins1.open_cursors=200
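These common and instance-specific entries can be maintained from any instance with ALTER SYSTEM and the SID clause (racins1 is just the example instance name used above):
SQL> ALTER SYSTEM SET open_cursors=300 SCOPE=SPFILE SID='*';
SQL> ALTER SYSTEM SET open_cursors=200 SCOPE=SPFILE SID='racins1';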
Tuning considerations:
Statistics / events to look for:
SQL> select * from gv$sysstat where name like '%gcs %';
This will give you a result set with specific attention to GCS messages sent across the nodes. If this value is inconsistent across nodes or if huge differences are apparent then it might be time to investigate.
===================================================================
The following query can also be run to monitor the average cr block receive time
since the last startup (should be less than 15):
set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time' and
b2.name = 'global cache cr blocks received' and b1.inst_id = b2.inst_id ;
===================================================================
When a consistent read buffer cannot be found in the local cache, an attempt is
made to find a usable version in another instance. There are 3 possible outcomes,
depending on whether any instance in the cluster has the requested data block
cached or not:
a) A cr block is received (i.e. another instance found or managed to produce the
wanted version). The "global cache cr blocks received" statistic is incremented.
b) No other instance has the block cached and therefore the requesting instance
needs to read from disk, but a shared lock will be granted to the requestor.
The "global cache gets" statistic is incremented.
c) RAC Only --> A current block is received (the current block is good enough for
the query). The "global cache current blocks received" statistic is
incremented.
===================================================================
Querying the gv$ges_traffic_controller or gv$dlm_traffic_controller views, you may
find that TCKT_AVAIL shows '0'. To track the available network buffer space, Oracle
uses the concept of tickets.
The maximum number of tickets available is a function of the network send buffer
size. In the case of lmd and lmon, they always buffer their messages in case of
ticket unavailability. A node relies on messages to come back from the remote
node to release tickets for reuse.
===================================================================
1. Tuning UDP
Linux (edit files)
/proc/sys/net/core/rmem_default
/proc/sys/net/core/rmem_max
/proc/sys/net/core/wmem_default
/proc/sys/net/core/wmem_max
2. Use the netstat command to check whether the interconnect is slow, busy, or faulty;
look for dropped packets, retransmits, or cyclic redundancy check (CRC) errors
3. Use of a backup interconnect by setting the parameter
CLUSTER_INTERCONNECTS (see the sketch after this list)
4. Choose Locally managed tablespaces with Automatic Segment Space Management
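A minimal sketch of point 3 (the addresses are the private interconnect addresses from the /etc/hosts file above; CLUSTER_INTERCONNECTS is static, so it must be set in the spfile, and it is normally only needed to override the interconnect Oracle has already chosen):
SQL> ALTER SYSTEM SET cluster_interconnects='10.0.0.3' SCOPE=SPFILE SID='orcl1';
SQL> ALTER SYSTEM SET cluster_interconnects='10.0.0.4' SCOPE=SPFILE SID='orcl2';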
Version : Oracle 10g Release 2
Platform : RHEL AS 4 Update 4
=========== * ============ * ============ * ============ * ============ * ==========
CONTENTS
Oracle 10g RAC
1. REQUIREMENTS
2. OPERATING SYSTEM INSTALLATION
3. CONFIGURING OPERATING SYSTEM
4. CONFIGURING USER and SHARED DISK
5. CONFIGURING OCFS2 (Oracle Clustered File System)
6. INSTALL AND CONFIGURE ORACLE CLUSTERWARE
7. INSTALL ORACLE DATABASE SOFTWARE
8. CONFIGURING DISKS FOR ASM
9. CREATE ORACLE DATABASE
10.COMMAND LINE UTILITIES
=========== * ============ * ============ * ============ * ============ * ==========
Oracle 10g - REAL APPLICATION CLUSTERS
------------------------------------------------------------------------------------------------------------------
1. REQUIREMENTS
------------------------------------------------------------------------------------------------------------------
HARDWARE:
Servers / Nodes : Min 2 nodes
Processor : PIV and above
RAM : Min 1 GB
Hard Disk : 15 GB for Operating System and 4 GB for Oracle Software
Network Cards : 2 cards in each node (1 for Public IP, 1 for Private IP)
: 1 Cross cable / Network switch for Private Interconnect
Shared Disk : Shared Disk (FireWire ohci 1394 Compatible)#
: FireWire cables to connect shared disk from each node
: One FireWire card in each node to connect to FireWire port
#The 1394 Open Host Controller Interface (Open HCI) Revision 1.1 is the latest implementation specification of the link layer protocol of the 1394 Serial Bus for use in PC systems. The link layer provides additional features to support the transaction and bus management layers. The 1394 Open HCI also includes DMA engines for high-performance data transfer and a host bus interface.
SOFTWARE:
Operating System : RedHat Linux AS 4 Update 4
Cluster Software : Oracle 10g Release 2 Clusterware Database Software : Oracle 10g Release 2 Enterprise Edition for Linux
Modules : OCFS Modules: ocfs2-2.6.9-42.ELsmp-1.2.7-1.el4.i686.rpm (292.2KB) ocfs2console-1.2.7-1.el4.i386.rpm (169KB) ocfs2-tools-1.2.7-1.el4.i386.rpm (1019KB)
These modules are required to configure OCFS (oracle clustered file system) on the shared storage
Network card drivers : To recognize network cards.
Note:
Details of installation can also be obtained from http://otn.oracle.com
OCFS modules can be downloaded from: http://oss.oracle.com/projects/ocfs/
------------------------------------------------------------------------------------------------------------------
2. OPERATING SYSTEM INSTALLATION
------------------------------------------------------------------------------------------------------------------
Partitions (select manual disk partitioning):
/ - 2 GB - Fixed Size - for root /boot - 100M - Fixed Size - for Boot Loader
/usr - 6 to 8 GB - Fixed Size - for selected packages
[ select all packages while installation]
swap - Twice of RAM - Fixed Size - swap / virtual memory
/tmp - 1GB - Fixed Size - temporary files area
/var - 500 M - Fixed Size - O/S log files
/opt - 500 M - Fixed Size - for optional packages
/home - Fill to maximum available size - for storing user files
Host Name : rac1 and rac2 respectively for both nodesPublic IP : Configure now or Configure using DHCP
Firewall : Select “nofirewall”
SE Linux : Select “Disable”
Packages : Select “Customize” and select “Everything”
------------------------------------------------------------------------------------------------------------------
3. CONFIGURING OPERATING SYSTEM
------------------------------------------------------------------------------------------------------------------
a. Hostname and IP address
Use 'neat' command to assign IP address and Hostname
# neat
RAC1 RAC2
Public IP : 192.9.200.226 192.9.200.245 Mask: 255.255.255.0
Private IP : 10.0.0.4 10.0.0.5 Mask: 255.0.0.0
[Activate both network cards]
Verify your configurations by (on both nodes):
# ifconfig --> for ip addresses
# hostname --> for hostname
b. Setup the Hosts file (all nodes)
# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
#PUBLIC IP
192.9.200.171 rac1 rac1.qsoft.com
192.9.200.172 rac2 rac2.qsoft.com
#PRIVATE IP
10.0.0.3 racp1 racp1.qsoft.com
10.0.0.4 racp2 racp2.qsoft.com
#VIRTUAL IP
192.9.200.70 racvip1 racvip1.qsoft.com
192.9.200.71 racvip2 racvip2.qsoft.com
Note: Ensure that the node names are not included for the loopback address in the /etc/hosts file. If the machine name is listed in the in the loopback address entry as below:
127.0.0.1 rac1 localhost.localdomain localhost
it will need to be removed as shown below:
127.0.0.1 localhost.localdomain localhost
If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
Ping each other node to check the connectivity:
[rac1]# ping rac2
[rac1]# ping racp2
[rac2]# ping rac1
[rac2]# ping racp1
Note: Virtual IP will not ping until the clusterware is installed
c. Setup the kernel parameters (all nodes)
Edit /etc/sysctl.conf
[root@rac1 ~]# vi /etc/sysctl.conf
kernel.shmmax=570000000
kernel.sem=250 32000 100 128
net.core.rmem_default=262144
net.core.rmem_max=262144
net.core.wmem_default=262144
net.core.wmem_max=262144
fs.file-max=65536
net.ipv4.ip_local_port_range=1024 65000
kernel.hostname = rac1 # change the hostname in other nodes
kernel.domainname = qsoft.com
Load the sysctl settings (without reboot) – on all nodes
[root@rac1 ~]# sysctl -p
Check the hostname and domain name (on all nodes):
[root@rac1 ~]# hostname
rac2
[root@rac1 ~]# domainname
qsoft.com
d. Check Firewall is disabled
[root@rac1 ~]# /etc/rc.d/init.d/iptables status
Firewall is stopped.
e. Disable SE Linux (if enabled) – on all nodes:
[root@rac1 root]# /usr/bin/system-config-securitylevel &
f. Enable/Disable services (both nodes)
# chkconfig sendmail off --> turn off the sendmail configuration
# chkconfig cups off --> turn off the printer service (optional)
# chkconfig xinetd on --> for telnet service
# chkconfig telnet on --> enable telnet
# chkconfig vsftpd on --> for ftp service
# service xinetd restart --> restart the services
# service vsftpd restart
g. Apply compat patches (all nodes)
[root@rac1 root]# ls
compat-gcc-7.3-2.96.128.i386.rpm
compat-gcc-c++-7.3-2.96.128.i386.rpm
compat-libstdc++-7.3-2.96.128.i386.rpm
compat-libstdc++-devel-7.3-2.96.128.i386.rpm
[root@rac1 root]# rpm -ivh *.rpm --force
h. Configure Shared Storage:
# ping 192.9.200.121
# vi /etc/iscsi.conf
discovery address = 192.9.200.121 ----->>uncomment this line and specify the IP
:wq
# chkconfig iscsi on -->for install & load iscsi Modules
# service iscsi restart (generate initiatorname file)
#cat /etc/initiatorname.iscsi --->>>Do not edit this file
provide macid of shared storage
# fdisk -l --> list the disks and partitions
------------------------------------------------------------------------------------------------------------------
4. CONFIGURING USER and SHARED DISK
------------------------------------------------------------------------------------------------------------------
a. Create Oracle user and Directories
You will be using OCFS2 to store the files required to be shared for the Oracle Clusterware software. When using OCFS2, the UID of the UNIX user oracle and GID of the UNIX group dba should be identical on all machines in the cluster. If either the UID or GID are different, the files on the OCFS file system may show up as "unowned" or may even be owned by a different user.
Execute following commands in all node:
# groupadd -g 2000 dba
# groupadd -g 1000 oinstall
# useradd -u 600 -g oinstall -G dba oracle10g
# passwd oracle10g
Create mount point for cluster files in all nodes:
# mkdir /home/cluster --> mount point for OCFS2
# chown -R oracle10g:oinstall /home/cluster --> change ownership to Oracle user
b. Create partitions in the shared disk [ FROM ONE NODE ONLY]
FileSystem partition size mountpoint
ocfs2 /dev/sdb1 1GB /home/cluster --> Cluster registry and Voting disk
ASM /dev/sdb2 2G +ORCL_DATA1 --> Oracle database files
ASM /dev/sdb3 2G +ORCL_DATA1 --> Oracle database files
ASM /dev/sdb4 4G +FLASH_RECOVERY_AREA --> Flash recovery area
# fdisk /dev/sdb --> give the name of the device detected [ /dev/sda]
Command (m for help): p --> Type 'p' to print partitions
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
--> Currently there are no partitions
Command (m for help): n --> Type 'n' for creating new partition
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-16709, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-16709, default 16709): +1G
Create 3 Primary partitions and remaining as Extended partitions
Command (m for help): n
Command action
e extended
p primary partition (1-4)
e
Selected partition 4
First cylinder (34-16709, default 34):
Using default value 34
Last cylinder or +size or +sizeM or +sizeK (34-16709, default 16709):
Using default value 16709
Command (m for help): p
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 13 104391 83 Linux
/dev/sdb2 14 26 104422+ 83 Linux
/dev/sdb3 27 33 56227+ 83 Linux
/dev/sdb4 34 16709 133949970 5 Extended
Command (m for help): n --> Type 'n' for creating extended partition
First cylinder (34-16709, default 34):
Using default value 34
Last cylinder or +size or +sizeM or +sizeK (34-16709, default 16709): +4G
Command (m for help): n
First cylinder (47-16709, default 47):
Using default value 47
Last cylinder or +size or +sizeM or +sizeK (47-16709, default 16709): +4G
....
....
Command (m for help): p --> Type 'p' to print all partitions
Disk /dev/sdb: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 123 987966 83 Linux
/dev/sdb2 124 2556 19543072+ 83 Linux
/dev/sdb3 2557 4989 19543072+ 83 Linux
/dev/sdb4 4990 16709 94140900 5 Extended
/dev/sdb5 4990 9853 39070048+ 83 Linux
/dev/sdb6 9854 14717 39070048+ 83 Linux
Command (m for help): w --> Type 'w' to save and quit
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
After creating all required partitions, you should now update the kernel of the partition changes using the following syntax as the root user account:
[root@rac1 ~]# partprobe --> issue this command from all nodes
[root@rac1 root]# fdisk -l /dev/sdb --> To check the updated list of partitions
Disk /dev/sda: 500 GB, 500000000 bytes
255 heads, 63 sectors/track, 16709 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 123 987966 83 Linux
/dev/sda2 124 2556 19543072+ 83 Linux
/dev/sda3 2557 4989 19543072+ 83 Linux
/dev/sda4 4990 16709 94140900 5 Extended
/dev/sda5 4990 9853 39070048+ 83 Linux
/dev/sda6 9854 14717 39070048+ 83 Linux
c. Setting Shell limits for Oracle User
To improve the performance of the software on Linux systems, Oracle recommends you increase the following shell limits for the oracle user:
Maximum number of open file descriptors -> nofile -> 65536 (hard limit)
Maximum number of processes available to a single user -> nproc -> 16384 (hard limit)
Execute the following from all nodes:
[root@rac1 ~]# vi /etc/security/limits.conf
oracle10g soft nproc 2047
oracle10g hard nproc 16384
oracle10g soft nofile 1024
oracle10g hard nofile 65536
d. Setting correct date and time
During the installation of Oracle Clusterware, the Database, and the Companion CD, the Oracle Universal Installer (OUI) first installs the software to the local node running the installer (i.e. linux1). The software is then copied remotely to all of the remaining nodes in the cluster (i.e. linux2). During the remote copy process, the OUI will execute the UNIX "tar" command on each of the remote nodes to extract the files that were archived and copied over. If the date and time on the node performing the install is greater than that of the node it is copying to, the OUI will throw an error from the "tar" command indicating it is attempting to extract files stamped with a time in the future.
Ensure that the date and time of the all nodes are same (unless you are using Network Time). To set the date and time now, you can execute the following commands:
rac1# date -s "9/13/2007 23:00:00"
rac2# date -s "9/13/2007 23:00:20" --> node 2 is set slightly ahead of node 1 for safety
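A quick way to compare the clocks of both nodes from a single terminal (ssh will prompt for a password unless user equivalence has already been configured as described later in this section):
[root@rac1 ~]# date ; ssh rac2 date --> the two timestamps should match, or node 2 should be slightly ahead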
e. Configuring “hangcheck-timer”
Starting with Oracle9i Release 2 (9.2.0.2), the watchdog daemon has been deprecated in favor of a Linux kernel module named hangcheck-timer, which addresses availability and reliability problems much better. The hangcheck-timer module is loaded into the Linux kernel and checks whether the system hangs: it sets a timer and checks it after a configurable period. If the configured threshold is exceeded, the module reboots the machine.
The hangcheck-timer module was previously shipped only by Oracle; however, it is now included with Red Hat Enterprise Linux AS starting with kernel versions 2.4.9-e.12 and higher.
[root@rac1 ~]# find /lib/modules -name "hangcheck-timer.ko" --> check the module presence
/lib/modules/2.6.9-22.EL/kernel/drivers/char/hangcheck-timer.ko
hangcheck-tick: This parameter defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
hangcheck-margin: This parameter defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node. It defines the margin of error in seconds. The default value is 180 seconds; Oracle recommends setting it to 180 seconds.
Set the hangcheck-timer settings in /etc/modprobe.conf (all nodes)
[root@rac1 ~]# vi /etc/modprobe.conf
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Add hangcheck-timer module in /etc/rc.local to probe it at every startup:
[root@rac1 ~]# vi /etc/rc.local
/sbin/modprobe hangcheck-timer
To test the hangcheck-timer module manually (before reboot):
[root@rac1 ~]# modprobe hangcheck-timer
[root@rac1 ~]# grep Hangcheck /var/log/messages | tail -2
May 29 11:40:35 rac1 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds).
f. Configure racnodes for remote access:
Before you can install and use Oracle Real Application clusters, you must configure either secure shell (SSH) or remote shell (RSH) for the "oracle" UNIX user account on all cluster nodes. The goal here is to setup user equivalence for the "oracle" UNIX user account. User equivalence enables the "oracle" UNIX user account to access all other nodes in the cluster (running commands and copying files) without the need for a password.
Using RSH:
# vi /etc/hosts.equiv
+rac1
+rac2
+racp1
+racp2
# chkconfig rsh on
# chkconfig rlogin on
# which rsh
/usr/kerberos/bin/rsh
# mv /usr/kerberos/bin/rsh /usr/kerberos/bin/rsh.bak
# mv /usr/kerberos/bin/rlogin /usr/kerberos/bin/rlogin.bak
# mv /usr/kerberos/bin/rcp /usr/kerberos/bin/rcp.bak
# which rsh
/usr/bin/rsh
$ rcp a.txt rac2:
Using SSH:
Perform steps 1 to 6 below on all the nodes.
1. su – oracle10g
2. mkdir .ssh
3. chmod -R 700 .ssh
4. cd .ssh
5. ssh-keygen -t rsa
6. ssh-keygen -t dsa
And perform the following steps from the first node only.
7. ssh rac1 cat /home/oracle10g/.ssh/id_rsa.pub >> authorized_keys
8. ssh rac1 cat /home/oracle10g/.ssh/id_dsa.pub >> authorized_keys
9. ssh rac2 cat /home/oracle10g/.ssh/id_rsa.pub >> authorized_keys
10. ssh rac2 cat /home/oracle10g/.ssh/id_dsa.pub >> authorized_keys
11. scp authorized_keys rac2:/home/oracle10g/.ssh/
12. scp known_hosts rac2:/home/oracle10g/.ssh/
Note: Step 12 should not prompt for a password.
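To verify user equivalence before starting the installer, run a simple remote command from every node to every other node; none of these should prompt for a password or passphrase (the first connection may only ask to accept the host key):
[oracle10g@rac1 ~]$ ssh rac1 date
[oracle10g@rac1 ~]$ ssh rac2 date
[oracle10g@rac2 ~]$ ssh rac1 date
[oracle10g@rac2 ~]$ ssh rac2 date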
------------------------------------------------------------------------------------------------------------------
5. CONFIGURING OCFS2 (Oracle Clustered File System)
------------------------------------------------------------------------------------------------------------------
a. Check necessary packages for Oracle software
The following packages must be installed for Oracle 10g:
make-3.80-5
glibc-2.3.4-2.9
glibc-devel-2.3.4-2.9
glibc-headers-2.3.4-2.9
glibc-kernheaders-2.4-9.1.87
cpp-3.4.3-22.1
compat-db-4.1.25-9
compat-gcc-32-3.2.3-47.3
compat-gcc-32-c++-3.2.3-47.3
compat-libstdc++-33-3.2.3-47.3
compat-libstdc++-296-2.96-132.7.2
openmotif-2.2.3-9.RHEL4.1
setarch-1.6-1
Use “rpm” command to check the availability of package:
[root@rac1 ~]# rpm -qa | grep make-3.80-5
make-3.80-5
b. Install and Configure OCFS
OCFS Release 1 was released in 2002 to enable Oracle RAC users to run the clustered database without having to deal with RAW devices. The filesystem was designed to store database related files, such as data files, control files, redo logs, archive logs, etc.
Install the OCFS rpms (all nodes):
[root@rac1 OCFS_AS4U2]# ls
ocfs2-2.6.9-42.ELsmp-1.2.7-1.el4.i686.rpm ocfs2-tools-1.2.7-1.el4.i386.rpm
ocfs2console-1.2.7-1.el4.i386.rpm
[root@rac1 OCFS_AS4U4]# rpm -ivh *.rpm
Preparing... ########################################### [100%]
1:ocfs2-tools ########################################### [ 33%]
2:ocfs2-2.6.9-42.ELsmp ########################################### [ 67%]
3:ocfs2console ########################################### [100%]
Configure Cluster Nodes to OCFS (on all nodes):
[root@rac1 ~]# ocfs2console &
Cluster -> Configure Nodes -> “Node Configuration” window -> Add -> Enter the hostname and ip address of all nodes (keeping port number unchanged) -> “Apply” -> Close -> File -> Quit
[root@rac1 ~]# cat /etc/ocfs2/cluster.conf --> to verify the configuration
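For reference, the generated /etc/ocfs2/cluster.conf should look roughly like the following (an illustrative sample only; the IP addresses are assumed from the public network used in this setup and will differ in your environment):
node:
        ip_port = 7777
        ip_address = 192.9.200.1
        number = 0
        name = rac1
        cluster = ocfs2
node:
        ip_port = 7777
        ip_address = 192.9.200.2
        number = 1
        name = rac2
        cluster = ocfs2
cluster:
        node_count = 2
        name = ocfs2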
Understanding O2CB Service:
Before we can do anything with OCFS2 like formatting or mounting the file system, we need to first have OCFS2's cluster stack, O2CB, running (which it will be as a result of the configuration process performed above). The stack includes the following services:
● NM: Node Manager that keeps track of all the nodes in cluster.conf
● HB: Heart beat service that issues up/down notifications when nodes join or leave the cluster
● TCP: Handles communication between the nodes
● DLM: Distributed lock manager that keeps track of all locks, their owners and their status
● CONFIGFS: User space driven configuration file system mounted at /config
● DLMFS: User space interface to the kernel space DLM
All of the above cluster services have been packaged in the o2cb system service (/etc/init.d/o2cb)
You can use the following commands to manage the o2cb services:
/etc/init.d/o2cb status --> check status
/etc/init.d/o2cb load --> Load all ocfs modules
/etc/init.d/o2cb online ocfs2 --> Online the cluster we created: ocfs2
/etc/init.d/o2cb offline ocfs2 --> Offline the cluster we created: ocfs2
/etc/init.d/o2cb unload --> Unload all ocfs modules
Configure O2CB to Start on Boot:
Before attempting to configure the on-boot properties:
REMOVE the following lines in /etc/init.d/o2cb (all nodes)
### BEGIN INIT INFO
# Provides: o2cb
# Required-Start:
# Should-Start:
# Required-Stop:
# Default-Start: 2 3 5
# Default-Stop:
# Description: Load O2CB cluster services at system boot.
### END INIT INFO
Reconfigure to implement the change (all nodes):
# chkconfig --del o2cb
# chkconfig --add o2cb
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Unload modules from all nodes (all nodes):
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
Start configuration (all nodes):
#/etc/init.d/o2cb configure
Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]:
Specify heartbeat dead threshold (>=7) [7]: 600
Format the drive with OCFS2 File system (from Only ONE node):
Format from GUI utility ocfs2console, use the menu [Tasks] –> Select appropriate settings -> [Format].
OR Command line [mkfs]:
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oradatafiles /dev/sdb1
where b -> Block size
C -> Cluster
N -> Nodes
L -> Label
Mount the formated slice (on all nodes) to the directory '/home/cluster' :
# mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /home/cluster
Execute “mount” command to verify the mounted file system (all nodes):
[root@rac1 ~]# mount | grep /home/cluster
/dev/sdb1 on /home/cluster type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
Add entry in /etc/fstab for auto mount at startup (all nodes):
/dev/sdb1 /home/cluster ocfs2 _netdev,datavolume,nointr 0 0
Check permissions on the new file system (all nodes):
# ls -ld /home/cluster/
drwxr-xr-x 3 root root 4096 May 25 21:29 /home/cluster/
Change it to Oracle user's access (from both nodes) - all nodes:
# chown oracle10g:oinstall /home/cluster; chmod 775 /home/cluster/
# ls -ld /home/cluster/
drwxrwxr-x 3 oracle10g oinstall 4096 May 25 21:29 /home/cluster/
REBOOT ALL THE NODES
After reboot verify whether ocfs partition is mounted(all nodes):
[root@rac1 ~]# mount | grep /home/cluster
/dev/sdb1 on /home/cluster type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
------------------------------------------------------------------------------------------------------------------
6. INSTALL AND CONFIGURE ORACLE CLUSTERWARE
------------------------------------------------------------------------------------------------------------------
Logout from root user (GUI) and Login to Oracle user
a. Download and extract the Oracle 10g Release 2 Clusterware (only ONE node)
[oracle10g@rac1 10gRAC]$ unzip 10201_clusterware_linux32.zip
[oracle10g@rac1 10gRAC]$ ls
10201_clusterware_linux32.zip
clusterware
[oracle10g@rac1 10gRAC]$ cd clusterware/
[oracle10g@rac1 clusterware]$ ls
cluvfy doc install response rpm runInstaller stage upgrade welcome.html
b. Use the cluvfy utility to verify the prerequisites before installation
# cd clusterware
# ls
cluvfy doc install response rpm runInstaller stage upgrade welcome.html
# cd rpm
# rpm -ivh cvuqdisk-1.0.1-1.rpm --> install this rpm on all the nodes; it is required by the cluvfy utility
Verify the node connectivity (only ONE node):
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh comp nodecon -n rac1,rac2 -verbose
Verify the prerequisites for CRS installation (only ONE node):
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh stage -pre crsinst -n rac1,rac2 -verbose
If all the above verifications complete successfully, then you can proceed with the CRS installation.
c. Invoke the Oracle Universal Installer (only ONE node)
[oracle10g@rac1 clusterware]$ ./runInstaller
1. Click Next
2. Choose path for OraInventory “/home/oracle10g/oraInventory”, Choose Oracle group as “oinstall”
3. Home Name: OraCrs10g_home Path: /home/oracle10g/crs
4. Verify requirements and click “next”
5. Specify cluster configuration details:
Cluster Name: crs Cluster nodes: rac1 racp1 racvip1
rac2 racp2 racvip2
6. Verify the network interface usage:
eth0 192.9.200.0 Public
eth1 10.0.0.0 Private
7. Specify Oracle Cluster Registry Location:
Location: /home/cluster/OCR1
Mirror Location: /home/cluster/OCR2
8. Specify Voting Disk Location:
Location: /home/cluster/Vote1
Mirror1: /home/cluster/Vote2
Mirror2: /home/cluster/Vote3
9. Click Install to start installation
10. Execute Configuration Scripts:
Execute “orainstRoot.sh” - on all nodes as root user only
[root@rac1 ~]# /home/oracle10g/oraInventory/orainstRoot.sh
Changing permissions of /home/oracle10g/oraInventory to 770.
Changing groupname of /home/oracle10g/oraInventory to dba.
The execution of the script is complete
[root@rac1 ~]# /home/oracle10g/crs/root.sh
While executing root.sh in any of the remote node, if you get a message “eth0 is not public” or any
similar error, you need to execute the VIPCA (Virtual IP Configuration Assistant) manually.
Running vipca manually:
[root@rac2 ~]# sh /home/oracle10g/crs/bin/vipca
Enter the proper IP address of your VIP and its alias names, then click Finish to complete the configuration.
You can verify the pinging of Virtual IP address now:
[oracle10g@rac2 ~]$ ping racvip1
[oracle10g@rac1 ~]$ ping racvip2
Return to the “ Execute Configuration Scripts” Screen and Click “OK”
11. Once the configurations are run successfully, click “Exit” to exit the installation
d. Post install verification (all nodes)
List the cluster nodes:
[oracle10g@rac1 ~]$ /home/oracle10g/crs/bin/olsnodes -n
rac1 1
rac2 2
Check oracle cluster auto-startup scripts:
[oracle10g@rac1 ~]$ ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 May 29 21:30 /etc/init.d/init.crs
-r-xr-xr-x 1 root root 4716 May 29 21:30 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35396 May 29 21:30 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root 3192 May 29 21:30 /etc/init.d/init.evmd
Check cluster ready services:
[oracle10g@rac1 ~]$ ps -ef grep crs
Check cluster synchronization services:
[oracle10g@rac1 ~]$ ps -ef grep css
Check the pinging of Virtual IP:
[oracle10g@rac1 ~]$ ping racvip2
[oracle10g@rac2 ~]$ ping racvip1
------------------------------------------------------------------------------------------------------------------
7. INSTALL ORACLE DATABASE SOFTWARE
------------------------------------------------------------------------------------------------------------------
Verify the prerequisites for RDBMS installation (only ONE node):
[oracle10g@rac1 cluvfy]$ cd clusterware/cluvfy
[oracle10g@rac1 cluvfy]$ ./runcluvfy.sh stage -pre dbinst -n rac1,rac2 -verbose
If all the above verifications complete successfully, then you can proceed with the database software installation.
a. Download and extract the Oracle 10g Release 2 Database Software (one NODE only)
[oracle10g@rac1 10gRAC]$ unzip Ora10gSetup.zip
[oracle10g@rac1 10gRAC]$ cd database/
[oracle10g@rac1 database]$ ls
doc install response runInstaller stage welcome.html
b. Invoke the Oracle Universal Installer (one NODE only)
[oracle10g@rac1 database]$ ./runInstaller
1. You can verify the cluster installation by clicking “installed products”. Click “next”
2. Choose “Enterprise Edition”
3. Choose Home details:
Name: OraDb10g_home1
Path: /home/oracle10g/oracle/product/10.2.0/db_1
4. Click “Select all” for installing in all clustered nodes
5. Verify the requirements and click “next”
6. Choose “Install database Software only”
7. Click “install” to start installation
8. Execute Configuration Scripts
Execute “root.sh” in all nodes (one at a time) as root user only
[root@rac1 ~]# /home/oracle10g/oracle/product/10.2.0/db_1/root.sh
Once the scripts are run successfully, return to “Execute Configuration Scripts” window and click “ok”
9. Click “Exit” to exit the installation
c. Set the Oracle Environment
Edit the .bash_profile of oracle user (all nodes):
export ORACLE_BASE=/home/oracle10g
export ORACLE_HOME=$ORACLE_BASE/oracle/product/10.2.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/crs
export ORACLE_SID=orcl1 --> change sid in other nodes
export PATH=$ORACLE_HOME/bin:$ORA_CRS_HOME/bin:$PATH
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$ORA_CRS_HOME/lib
Execute the .bash_profile (all nodes):
[oracle10g@rac1 ~]$ . .bash_profile
Verify the environment:
[oracle10g@rac1 ~]$ echo $ORACLE_HOME
/home/oracle10g/oracle/product/10.2.0/db_1
------------------------------------------------------------------------------------------------------------------
8. CONFIGURING DISKS FOR ASM
------------------------------------------------------------------------------------------------------------------
a. Configure disks for ASM with Standard Linux I/O
Edit /etc/sysconfig/rawdevices to map the devices with Linux raw files (all nodes)
[root@rac1 ~]# vi /etc/sysconfig/rawdevices
/dev/raw/raw2 /dev/sdb2 --> sdb1 is not mapped as it is used for OCFS2
/dev/raw/raw3 /dev/sdb3
/dev/raw/raw5 /dev/sdb5 --> sdb4 is not mapped as it is an extended partition
Restart the rawdevices service (all nodes):
[root@rac1 ~]# service rawdevices restart
#raw -qa --->>to check the raw devices mapped with shared device
Change permissions to raw devices, so that the Oracle user has read and write access (all nodes):
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw2; chmod 600 /dev/raw/raw2
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw3; chmod 600 /dev/raw/raw3
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw4; chmod 600 /dev/raw/raw4
[root@rac1 ~]# chown oracle10g:dba /dev/raw/raw5; chmod 600 /dev/raw/raw5
Edit /etc/rc.local to add permission details to be assigned at every startup (all nodes):
[root@rac1 ~]# vi /etc/rc.local
chown oracle10g:dba /dev/raw/raw2; chmod 600 /dev/raw/raw2
chown oracle10g:dba /dev/raw/raw3; chmod 600 /dev/raw/raw3
chown oracle10g:dba /dev/raw/raw4; chmod 600 /dev/raw/raw4
chown oracle10g:dba /dev/raw/raw5; chmod 600 /dev/raw/raw5
------------------------------------------------------------------------------------------------------------------
9. CREATE ORACLE DATABASE
------------------------------------------------------------------------------------------------------------------
a. Configure Listeners on all nodes (as it would be required for creating database by DBCA)
Use “netca” and configure listeners and start the listener (from only ONE node)
[oracle10g@rac1 ~]$ netca
1. Choose “RAC” option
2. Choose “Listener configuration”
3. Choose “Add”
4. Add the name, protocol, and port details
5. Wait for the listener to start on all nodes. If the listener does not start, you can start it manually using the LSNRCTL command-line utility
6. Click finish to exit
b. Configure ASM instance (using DBCA)
Invoke Database Configuration Assistant (DBCA) - from only ONE node
[oracle10g@rac1 ~]$ dbca
1. Choose “RAC” Options
2. Choose “Configure Automatic Storage Management”
3. Click “Select All” to select all the nodes to be configured
4. Specify the password for SYS user of ASM instance. Choose “Spfile” for creating parameter file. Specify the location of OCFS2 file system: /home/cluster/spfile+ASM.ora
5. Click “Ok” to create ASM instance
6. Initially, you will not have any diskgroups created. Click “Create New” to create diskgroups.
7. Give the diskgroup name, select disk paths required, specify the fail group name, and click “OK”
8. Please wait until the disk group is created.
9. Now you can see the list of diskgroups created.
10. Similarly, you can create more diskgroups with the existing disks.
11. Finally, click “Finish” to exit the ASM configuration.
You can verify the asm instance (all nodes):
[oracle10g@rac1 admin]$ ps -ef | grep asm
You can login to asm instance:
[oracle10g@rac1 admin]$ export ORACLE_SID=+ASM1
[oracle10g@rac1 admin]$ sqlplus
SQL*Plus: Release 10.2.0.1.0 - Production on Wed May 30 09:47:57 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Enter user-name: /as sysdba
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> select name from v$asm_diskgroup;
SQL> select group_number, disk_number, mount_status, header_status, state, name, failgroup from v$asm_disk;
SQL> select group_number, name, state, free_mb, offline_disks from v$asm_diskgroup;
c. Create database using DBCA
Invoke Database Configuration Assistant (DBCA) - from only ONE node
[oracle10g@rac1 ~]$ dbca
1. Choose “RAC” Options
2. Choose “Create database”
3. Click “Select All” to select all the nodes
4. Choose the type of database: “General Purpose”
5. Specify the Global Database Name as “orcl”
6. Choose “Configure database with Enterprise Manager”
7. Specify the passwords for the user accounts
8. Choose “Automatic Storage Management”
9. Select the disk groups where the database has to be created
10. Choose “Use Oracle Managed Files” and edit the diskgroup name if you want
11. Choose whether to use a Flash Recovery Area and archiving for your database
12. Select whether to create the sample schemas
13. Review the services
14. Choose Automatic or Custom memory management. Ensure that you have enough space for the shared pool.
15. Review the files
16. Click “Finish”
17. Review the initialization variables and click “OK” to start the database creation
18. Please wait until the database is created successfully
19. A summary is shown at the end for your information. Click “Exit” (you can note the SID, SPFILE path and OEM address)
20. The database gets restarted after clicking the “Exit” button. You can visit the OEM home page using the address shown (for example: http://rac1:1158/em)
------------------------------------------------------------------------------------------------------------------
10. COMMAND LINE UTILITIES
------------------------------------------------------------------------------------------------------------------
Using command line utilities:
Manual configuration of OEM dbconsole:
1. Ensure that you have listeners and password files created for your database and ASM instances on all nodes
2. Configure the repository for Enterprise Manager
$ emca -repos create
3. Configure the EM dbconsole
$ emca -config dbcontrol db
4. For any help in the syntax, you can use the following command
$ emca help=y
5. Start EM DBConsole during next startup:
$ emctl start dbconsole
Manual configuration of database services with srvctl:
Add ASM details:
$ srvctl add asm -n rac1 -i +ASM1 -o $ORACLE_HOME
$ srvctl add asm -n rac2 -i +ASM2 -o $ORACLE_HOME
$ srvctl enable asm -n rac1
$ srvctl enable asm -n rac2
Add database details:
$ srvctl add database -d orcl -o $ORACLE_HOME
$ srvctl add instance -d orcl -i orcl1 -n rac1
$ srvctl add instance -d orcl -i orcl2 -n rac2
Check the configuration:
$ srvctl config database -d orcl
Start or Stop database:
$ srvctl { start | stop } database -d orcl [ -o normal ]
Start or Stop instance:
$ srvctl { start | stop } instance -d orcl -i orcl1
Check configuration in the OCR:
$ crs_stat -t
Follow these steps to manually shut down your servers:
$ emctl stop dbconsole --> to stop the Database Console
$ srvctl stop database -d orcl --> to stop the database in all the instances
$ srvctl status database -d orcl --> to check the status
$ srvctl stop asm -n rac1 --> to stop the ASM in each node
$ srvctl stop asm -n rac2 --> to stop the ASM in each node
$ srvctl stop nodeapps -n rac1 --> to stop other utilities in each node
$ srvctl stop nodeapps -n rac2
# crsctl stop crs --> to stop cluster ready services (run as root on each node)
Checklist at next startup:
– Switch on the shared disks
– Switch on the Servers
– Check for CSS and CRS processes (all nodes)
[oracle10g@rac1 ~]$ crsctl check crs
– Check the pinging between nodes
[oracle10g@rac2 ~]$ ping rac1
[oracle10g@rac2 ~]$ ping racp1
[oracle10g@rac2 ~]$ ping racvip1
– Check whether OCFS is mounted
[oracle10g@rac2 ~]$ mount | grep /home/cluster
– Start the Node applications in all nodes
[oracle10g@rac1 ~]$ srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is running on node: rac1
[oracle10g@rac1 ~]$ srvctl start nodeapps -n rac1
[oracle10g@rac1 ~]$ srvctl start nodeapps -n rac2
– Start the ASM instance in all nodes
[oracle10g@rac1 ~]$ srvctl start asm -n rac1
[oracle10g@rac1 ~]$ srvctl start asm -n rac2
– Start the Database instance from one node
[oracle10g@rac1 ~]$ srvctl start database -d orcl
– Start the Enterprise Manager DB Console on all nodes
[oracle10g@rac1 ~]$ emctl start dbconsole
[oracle10g@rac2 ~]$ emctl start dbconsole
Configure Network interfaces:
$ oifcfg getif -node rac1
$ oifcfg getif -global
$ oifcfg setif
Note on Parameter file:
Parameter Types:
Identical across instances
db_name
compatible
cluster_database
control_files
db_block_size
Unique across instances
instance_name
instance_number
rollback_segments
thread
undo_tablespace
Multi-Valued Parameters
fast_start_mttr_target
instance_groups (for parallel query operations)
Spfile Features:
Oracle recommends using an spfile
Easier to use and manage
Single, central location
Parameters are applicable to all instances
It permits dynamic changes
Persistent
Can specify common values and specific values
*.open_cursors=300
racins1.open_cursors=200
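For example, an instance-specific value can also be set from SQL*Plus while the database is running (a sketch; the instance name racins1 and the values simply follow the example above):
SQL> alter system set open_cursors=200 sid='racins1' scope=spfile;
SQL> alter system set open_cursors=300 sid='*' scope=spfile;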
Tuning considerations:
Statistics / events to look for:
SQL> select * from gv$sysstat where name like '%gcs %';
This returns the GCS-related statistics (including messages sent) for each instance. If the values are inconsistent across nodes, or huge differences are apparent, it may be time to investigate.
===================================================================
The following query can also be run to monitor the average cr block receive time
since the last startup (should be less than 15):
set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select b1.inst_id, b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value / b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1, gv$sysstat b2
where b1.name = 'global cache cr block receive time' and
b2.name = 'global cache cr blocks received' and b1.inst_id = b2.inst_id ;
===================================================================
When a consistent read buffer cannot be found in the local cache, an attempt is
made to find a usable version in another instance. There are 3 possible outcomes,
depending on whether any instance in the cluster has the requested data block
cached or not:
a) A cr block is received (i.e. another instance found or managed to produce the
wanted version). The "global cache cr blocks received" statistic is incremented.
b) No other instance has the block cached and therefore the requesting instance
needs to read from disk, but a shared lock will be granted to the requestor.
The "global cache gets" statistic is incremented.
c) RAC Only --> A current block is received (the current block is good enough for
the query ). The " global cache current blocks received" statistic is
incremented.
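To compare these statistics across instances, the statistic names mentioned above can be queried directly (a sample query; note that in some releases these statistics appear under the shorter 'gc ...' names):
SQL> select inst_id, name, value from gv$sysstat
where name in ('global cache cr blocks received',
'global cache current blocks received',
'global cache gets')
order by inst_id, name;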
===================================================================
When querying the gv$ges_traffic_controller or gv$dlm_traffic_controller views, you
may find that the TCKT_AVAIL column shows '0'. To account for the available network
buffer space, the concept of tickets is introduced: the maximum number of tickets
available is a function of the network send buffer size. The lmd and lmon processes
always buffer their messages in case of ticket unavailability. A node relies on
messages coming back from the remote node to release tickets for reuse.
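A quick way to look at the ticket usage described above is to query the view directly and examine the TCKT_AVAIL column:
SQL> select * from gv$dlm_traffic_controller;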
===================================================================
1. Tuning UDP (see the example after this list)
Linux (edit the following files)
/proc/sys/net/core/rmem_default
/proc/sys/net/core/rmem_max
/proc/sys/net/core/wmem_default
/proc/sys/net/core/wmem_max
2. Use the netstat command to check whether the interconnect is slow, busy, or faulty;
look for dropped packets, retransmits, or cyclic redundancy check (CRC) errors (see the example after this list)
3. Use of Backup interconnect by setting parameter:
CLUSTER_INTERCONNECTS
4. Choose Locally managed tablespaces with Automatic Segment Space Management
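As an illustration of items 1 and 2 above, the kernel buffer sizes can be set with sysctl (they map directly to the /proc files listed in item 1) and UDP errors checked with netstat. The buffer values shown are only example figures, not Oracle-prescribed settings; add the same net.core.* entries to /etc/sysctl.conf to make them persistent.
# sysctl -w net.core.rmem_default=262144
# sysctl -w net.core.rmem_max=262144
# sysctl -w net.core.wmem_default=262144
# sysctl -w net.core.wmem_max=262144
# netstat -su --> look at the UDP "packet receive errors" counter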
Oracle 10g Step By Step Data Guard Configuration
INTRODUCTION:
Oracle Data Guard ensures high availability, data protection and disaster
recovery for enterprise data. A Data Guard configuration always has at least two
databases that are kept transactionally consistent. If the production database becomes
unavailable because of a planned or unplanned outage, Data Guard can switch any
standby database in the configuration to the production role, minimizing down
time.
Standby databases are created from a backup of the primary database.
Once configured, Data Guard maintains the standby database automatically by
transmitting redo from the primary database and applying it to the
standby database (Redo Apply for a physical standby, SQL Apply for a logical standby).
TYPES OF STANDBY DATABASES:
Standby databases can be of two types.
1. Physical standby database.
Provides a physically identical copy of the primary database, with on disk
database structures that are identical to the primary database on a block-for-block
basis. The database schema, including indexes, is the same.
2. Logical standby database.
Contains the same logical information as the production database, although the
physical organization and structure of the data can be different.
BENEFITS OF PHYSICAL STANDBY DATABASE:
1.Redo apply:
The standby database is kept consistent by applying redo data from archived log files.
Recovery applies the changes recorded in the redo to the data blocks.
2.Read only mode:
A physical standby database can be opened in read-only mode for querying, and it
is available for taking backups. Thus we can minimize the load on the
production database. Redo data will not be applied while the standby database is
opened in read-only mode, but redo data will still be received on the standby side.
3. Read write mode:
Physical standby database can be opened in read write mode for temporary
purpose (using flashback feature).
4.Disaster recovery and high availability
It gives best disaster recovery and high availability solution by easily managed
switch over and failover capabilities between the primary and standby databases.
5. Data protection
Using a physical standby database, Data Guard can ensure no data loss, even in the
face of unforeseen disasters. A physical standby database supports all datatypes,
and all DDL and DML operations that the primary database can support.
BENEFITS OF LOGICAL STANDBY DATABASE:
1.Efficient use of standby hardware resources
A logical standby database can be used for other business purposes in addition to
disaster recovery requirements. It can host additional database schemas beyond
the ones that are protected in a Data Guard configuration, and users can
perform normal DDL or DML operations on those schemas any time.
2.Reduction in primary database workload
A logical standby database can remain open at the same time its tables are
updated from the primary database, and those tables are simultaneously available
for read access. This makes a logical standby database an excellent choice to do
queries, summations, and reporting activities, thereby off-loading the primary
database from those tasks and saving valuable CPU and I/O cycles.
DATA GUARD SERVICES:
1.Redo Transport Services
Control the automated transfer of redo data from the production database to
one
or more archival destinations.
2.Log Apply Services
Apply redo data on the standby database to maintain transactional
synchronization with the primary database. Redo data can be applied either from
archived redo log files, or, if real-time apply is enabled, directly from the standby
redo log files as they are being filled, without requiring the redo data to be
archived first at the standby database.
3. Role Transitions
Change the role of a database from a standby database to a primary database, or
vice versa using either a switchover or a failover operation.
CREATING PHYSICAL STANDBY DATABASE:
1. Enable forced logging for primary database.
SQL>alter database force logging;
2. Configure standby redologfiles
Standby redolog files are required for Maximum availability and maximum
protection modes. Ensure logfile sizes are identical on both primary and standby
side. Create the standby redo log groups by adding logfiles to them, as shown below.
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo3.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo4.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo5.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo6.log' size 50m;
3. primary database parameters
*.background_dump_dest='/home/oracle10g/orcl/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/home/oracle10g/orcl/control/control01.ctl'
*.db_block_size=8192
log_archive_config='DG_CONFIG=(orcl,stand)'
log_archive_dest_1='LOCATION=/home/oracle10g/orcl/archloc/ VALID_FOR
=(ALL_LOGFILES,ALL_ROLES) db_unique_name=orcl'
log_archive_dest_2='SERVICE=con_stand
VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stand'
log_archive_dest_state_1='enable'
log_archive_dest_state_2='enable'
db_unique_name='orcl'
FAL_CLIENT='orcl' #Connect descriptor name which connects to primary
FAL_SERVER='stand' #Connect descriptor name which connects to standby
standby_file_management='auto'
remote_login_passwordfile='EXCLUSIVE'
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='orcl'
*.job_queue_processes=10
*.open_cursors=300
*.pga_aggregate_target=94371840
*.processes=150
*.sga_target=283115520
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/home/oracle10g/orcl/udump'
log_file_name_convert='/home/oracle10g/stand/','/home/oracle10g/orcl/'
db_file_name_convert='/home/oracle10g/stand/','/home/oracle10g/orcl/'
4. standby database parameters
*.background_dump_dest='/home/oracle10g/stand/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/home/oracle10g/stand/control/constd.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_file_name_convert='/home/oracle10g/orcl','/home/oracle10g/stand'
*.db_name='orcl'
*.db_unique_name='stand'
*.fal_client='stand' #Connect descriptor name which connects to standby
*.fal_server='orcl' #Connect descriptor name which connects to primary
*.job_queue_processes=10
*.log_archive_config='DG_CONFIG=(stand,orcl)'
*.log_archive_dest_1='LOCATION=/home/oracle10g/stand/archloc/ VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stand'
*.log_archive_dest_2='SERVICE=con_orcl LGWR ASYNC VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=orcl'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_max_processes=2
*.log_file_name_convert='/home/oracle10g/orcl','/home/oracle10g/stand'
*.open_cursors=300
*.pga_aggregate_target=16777216
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=167772160
*.standby_archive_dest='/home/oracle10g/stand/arch/'
*.standby_file_management='auto'
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/home/oracle10g/stand/udump'
SQL> select instance_name,status from v$instance;
INSTANCE_NAME STATUS
---------------- ------------
orcl OPEN
5. Backup the primary database completely:
NOTE: The backup can be taken either as a cold backup (with the database shut down) or with RMAN.
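A minimal sketch of the cold-backup option (the backup directory is an assumed path; the source file location follows the file locations used elsewhere in this document):
SQL> shutdown immediate
$ mkdir -p /home/oracle10g/backup
$ cp /home/oracle10g/orcl/*.dbf /home/oracle10g/orcl/*.log /home/oracle10g/backup/
SQL> startup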
6. Start the primary database in mount or open phase to take control file backup.
SQL> alter database create standby controlfile as '/home/oracle10g/stdctl.ctl';
7. Copy the standby controlfile, the datafiles and the redo logfiles to the standby side.
Create appropriate directories on the standby side where the datafiles and redo
logfiles are to be kept. Update the DB_FILE_NAME_CONVERT and
LOG_FILE_NAME_CONVERT parameters in the parameter files of both the primary and
standby databases as shown above.
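For example, the copy in step 7 can be done with scp (a sketch; 'standbyhost' is a placeholder for the standby server, and the target paths follow the standby parameter file above):
$ scp /home/oracle10g/stdctl.ctl oracle10g@standbyhost:/home/oracle10g/stand/control/constd.ctl
$ scp /home/oracle10g/orcl/*.dbf oracle10g@standbyhost:/home/oracle10g/stand/
$ scp /home/oracle10g/orcl/*.log oracle10g@standbyhost:/home/oracle10g/stand/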
8. Shutdown primary database
9. Configure a listener on both the primary and standby sides, and suitable connect
descriptors that can connect to each other.
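A sketch of the corresponding tnsnames.ora entries on each side (the host names are placeholders; the descriptor names con_orcl and con_stand and the service names match the parameters used above):
con_stand =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standbyhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = stand))
  )
con_orcl =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primaryhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )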
10. Create a password file for both the primary and standby databases.
Note: Both the standby and primary databases should have the same SYS password.
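A sketch of creating the password files with orapwd (the password shown is only an example; use the same SYS password on both sides, and note that the file name must be orapw followed by the instance SID, assumed here to be orcl on the primary and stand on the standby):
$ orapwd file=$ORACLE_HOME/dbs/orapworcl password=oracle entries=5 --> on the primary
$ orapwd file=$ORACLE_HOME/dbs/orapwstand password=oracle entries=5 --> on the standby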
11. Start the standby database in nomount phase and mount it.
SQL> alter database mount standby database;
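Expanded, step 11 typically looks like the following; starting managed recovery (the MRP, the same command used later in this document) is also needed before logs will show as applied in step 13:
SQL> startup nomount
SQL> alter database mount standby database;
SQL> alter database recover managed standby database disconnect;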
12. Start primary database.
SQL> select error from v$archive_dest;
NOTE: The above query should not return any error for any destination that
transmits redo data from the primary database. If an error occurs, check the
listener status on the standby side and the corresponding connect descriptor on the
primary side.
13.Verifying the standby database.
SQL> select SEQUENCE#,ARCHIVED,APPLIED,FIRST_TIME,NEXT_TIME from
v$archived_log;
SEQUENCE# ARC APP FIRST_TIM NEXT_TIME
-------------------- ------- ------ ---------------- -----------------
19 YES YES 17-FEB-08 18-FEB-08
20 YES YES 18-FEB-08 18-FEB-08
21 YES YES 18-FEB-08 18-FEB-08
22 YES YES 18-FEB-08 18-FEB-08
23 YES YES 18-FEB-08 18-FEB-08
SQL> alter system switch logfile; (Primary database)
SQL> select SEQUENCE#,ARCHIVED,APPLIED,FIRST_TIME,NEXT_TIME from
v$archived_log; (Standby database)
SEQUENCE# ARC APP FIRST_TIM NEXT_TIME
---------- --- --- --------- ---------
19 YES YES 17-FEB-08 18-FEB-08
20 YES YES 18-FEB-08 18-FEB-08
21 YES YES 18-FEB-08 18-FEB-08
22 YES YES 18-FEB-08 18-FEB-08
23 YES YES 18-FEB-08 18-FEB-08
24 YES YES 18-FEB-08 18-FEB-08 (New Archive logfile applied)
Note: The query has to be checked again after subsequent log switches on the primary
side. The primary database will be one log switch ahead of the standby
database. Also ensure that the MRP process is running on the standby side by issuing the
ps -ef | grep ora_ command.
DATAGUARD PROTECTION MODES:
The following table shows the archive destination attributes that must be specified to
keep the database in each of the protection modes.
Attribute                    Maximum Protection   Maximum Availability   Maximum Performance
---------------------------  -------------------  ---------------------  ----------------------------------
Redo archival process        LGWR                 LGWR                   LGWR or ARCH
Network transmission mode    SYNC                 SYNC                   SYNC or ASYNC when using LGWR;
                                                                         SYNC if using ARCH
Disk write option            AFFIRM               AFFIRM                 AFFIRM or NOAFFIRM
Standby redo log required?   Yes                  Yes                    No, but it is recommended
UPGRADING FROM PERFORMANCE TO AVAILABILITY:
1. Verify the parameters of the primary database.
SQL> show parameter log_archive_dest_2
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2 string SERVICE=con_stand VALID_FOR=(O
NLINE_LOGFILES,ALL_ROLES) DB_U
NIQUE_NAME=stand
2. If not set properly, set the archive parameters with the proper log transport
services as required for the protection modes.
SQL> alter system set log_archive_dest_2='SERVICE=con_stand LGWR SYNC
AFFIRM VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=stand';
3. Shutdown and startup the database in mount phase. And issue the following
command for keeping database in maximum availability mode.
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
4. Open the database
SQL> ALTER DATABASE OPEN;
5.Verify
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
--------------------------------- --------------------------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION
DOWNGRADING FROM AVAILABILITY TO PERFORMANCE:
1. Set the archive parameter suitably.
SQL> alter system set log_archive_dest_2='SERVICE=con_stand ARCH SYNC
AFFIRM VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=stand';
2. Issue the following command to put the database back in Performance Mode.
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE
PERFORMANCE;
3. Verify
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
--------------------------------- ---------------------------------
MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
NOTE: For downgrading from Availability to Performance, or from Protection to
Availability, there is no need to restart the instance.
VERIFY THE LOGSHIPPING:
NOTE: The following queries provide information regarding the status of the online
logfiles, the maximum log sequence number, and the archived log file with the latest
sequence number that has been transmitted.
SQL> select thread#,sequence#,archived,status from v$log;
THREAD# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 35 NO CURRENT
1 33 YES INACTIVE
1 34 YES INACTIVE
SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
------------------------------ ----------
34 1
SQL> select destination,status,archived_thread#,archived_seq# from
v$archive_dest_status where status <> 'DEFERRED' and status <> 'INACTIVE';
DESTINATION STATUS ARCHIVED_THREAD# ARCHIVED_SEQ#
--------------------------------------------- ------------ --------------------------------- --------------------------
/home/oracle10g/orcl/archloc VALID 1 34
con_stand VALID 1 34
DATABASE ROLE TRANSITIONS:
1. Check the database's current role in the configuration.
SQL> select DATABASE_ROLE from v$database;
DATABASE_ROLE
----------------
PRIMARY
2. To choose a suitable standby database to which we can fail over or switch over
with the least possible apply time, run the following query on the standby
database.
SQL> select NAME,VALUE,UNIT from v$dataguard_stats;
NAME                      VALUE            UNIT
------------------------- ---------------- ------------------------------
apply finish time                          day(2) to second(1) interval
apply lag                 +00 02:19:59     day(2) to second(0) interval
estimated startup time    10               second
standby has been open     N
transport lag             +00 02:17:32     day(2) to second(0) interval
3. Verify current switchover status on primary database.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO STANDBY
4. Issue switchover command on primary database first.
SQL> alter database commit to switchover to physical standby;
5. Shutdown and Start the standby database in mount phase.
6. Verify the proper switchover of Primary to Standby by following query.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO PRIMARY
7. Further carry out switchover operation on standby side.
SQL> alter database commit to switchover to primary;
8. Verify the proper switchover of Standby to Primary.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO STANDBY
9. Shutdown and startup the new primary database normally.
NOTE: Alternatively, you can simply issue ALTER DATABASE OPEN if the new primary has
not been opened in read-only mode at any time since it was last started.
10. Start Managed recovery process on new standby side.
SQL> alter database recover managed standby database disconnect;
11. Verify the MRP process.
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
--------- ------------
ARCH CLOSING
ARCH CLOSING
MRP0 WAIT_FOR_LOG
RFS IDLE
FAILOVER:
1. Issue the following query to check for any archive gaps. If a gap exists, it
has to be resolved first.
NOTE: Resolving archive gaps is demonstrated in further steps.
SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;
2. To find the archive log file with the highest sequence number, up to which the
archive gap has to be resolved, issue the following query.
SQL> select unique thread# as thread, max(sequence#) over (partition by
thread#) as last from v$archived_log;
THREAD LAST
---------------- ----------
1 45
NOTE: If an archive gap exists, copy the corresponding archive log files to the standby
database and register them on the standby side. Repeat this step until the gap is
resolved.
SQL> alter database register logfile 'path and name';
3. To Failover issue the following commands on standby.
SQL> alter database recover managed standby database finish force;
SQL> alter database commit to switchover to primary;
NOTE: Shutdown and startup the database if it was opened in read-only mode at any time
after the configuration.
HOW TO OPEN STANDBY DATABASE IN READ ONLY MODE?
1.When it is shutdown just STARTUP will open standby database in read only mode.
2.If the standby database is mounted then:
a. SQL> alter database recover managed standby database cancel;
SQL> alter database open;
SQL> select OPEN_MODE from v$database;
OPEN_MODE
--------------------
READ ONLY
NOTE: Further, if we start the MRP process while the standby database is open, it
will automatically be put back into the mount phase, as shown below.
SQL> alter database recover managed standby database disconnect;
SQL> select status from v$instance;
STATUS
------------
MOUNTED
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
-------------- ------------
ARCH CLOSING
ARCH CLOSING
MRP0 WAIT_FOR_LOG
RFS IDLE
SQL> alter database recover managed standby database cancel;
SQL> alter database open;
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
-------------- ------------
ARCH CLOSING
ARCH CLOSING
RFS IDLE
NOTE: The above query shows that the MRP process has stopped, as it has been cancelled.
MANAGEMENT:
Tablespace Management.
If STANDBY_FILE_MANAGEMENT is set to auto.
SQL> create tablespace tbs1
datafile '/home/oracle10g/orcl/tbs1.dbf' size 20m;
SQL> alter system archive log current;
On the standby side, the datafile will be created automatically.
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example01.dbf
/home/oracle10g/stand/tbs1.dbf
If the Tablespace is dropped
SQL> drop tablespace tbs1;
SQL> alter system archive log current;
On the standby side, the datafile will be dropped automatically.
SQL> select name from v$datafile;
NAME
----------------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example01.dbf
If STANDBY_FILE_MANAGEMENT is not set to auto:
Renaming a datafile:
If a datafile is renamed on the primary side, the same datafile has to be renamed
manually on the standby side.
A. SQL> alter tablespace example offline;
B. $ mv example01.dbf example.dbf
C. alter database rename file '/home/oracle10g/orcl/example01.dbf' to
'/home/oracle10g/orcl/example.dbf';
SQL> alter system archive log current;
On standby side:
D. SQL> alter system set standby_file_management=manual;
E. SQL> alter database rename file '/home/oracle10g/stand/example01.dbf'
to '/home/oracle10g/stand/example.dbf';
F. SQL> alter database recover managed standby database disconnect;
G. SQL> alter system set standby_file_management=auto;
H. SQL> select name from v$datafile;
NAME
----------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example.dbf
MONITORING PROCESS:
The following queries are helpful in getting information about the Data Guard
processes and the archived log files.
SQL> select process,client_process,sequence#,status from v$managed_standby;
PROCESS CLIENT_P SEQUENCE# STATUS
--------- ---------------- ------------------- ------------
ARCH ARCH 0 CONNECTED
ARCH ARCH 0 CONNECTED
MRP0 N/A 63 WAIT_FOR_LOG
RFS N/A 0 IDLE
SQL> select archived_thread#,archived_seq#,applied_thread#,applied_seq# from
v$archive_dest_status;
ARCHIVED_THREAD# ARCHIVED_SEQ# APPLIED_THREAD# APPLIED_SEQ#
---------------- ------------- --------------- ------------
1 64 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 63 1 62
SQL> select name,creator,sequence#,applied,completion_time from
v$archived_log;
NAME                                              CREATOR SEQUENCE# APP COMPLETION
------------------------------------------------- ------- --------- --- ----------
/home/oracle10g/stand/archloc/1_65_646949374.dbf  ARCH    65        YES 28-MAR-08
SQL> SELECT FIRST_TIME,FIRST_CHANGE#,NEXT_CHANGE#,SEQUENCE#
FROM V$LOG_HISTORY;
17-FEB-08 446075 474198 1
17-FEB-08 474198 476839 2
17-FEB-08 476839 476841 3
------------ --------- ---------- -
------------ --------- --------- -
------------ --------- ---------- -
27-MAR-08 554446 561139 63
28-MAR-08 561139 561141 64
28-MAR-08 561141 561853 65
SQL> select thread#,max(sequence#) as "LAST_APPLIED" from v$log_history
group by thread#;
THREAD# LAST_APPLIED
---------- ------------
1 65
OPENING STANDBY DATABASE IN READ AND WRITE MODE:
NOTE: Set DB_RECOVERY_FILE_DEST to a suitable directory and DB_RECOVERY_FILE_DEST_SIZE
to an appropriate size, since a guaranteed restore point requires a flash recovery area.
NOTE: Here we use Activate and Convert command.
1. Stop the MRP Process on standby.
SQL> alter database recover managed standby database cancel;
2. Create a guaranteed restore point on the standby database, to which you can later
flash back.
SQL> create restore point rest1 guarantee flashback database;
3. Switch the logfile on Primary side to archive the latest changes to standby side.
SQL> alter system switch logfile;
4.Defer the Standby destination of the primary database.
SQL> alter system set log_archive_dest_state_2=defer;
5. Activate the standby database.
SQL> alter database activate standby database;
6. Startup in mount phase and put the standby database in Max performance.
SQL> alter database set standby database to maximize performance;
7. Open the standby database in Read Write mode.
SQL> alter database open;
8. Verify the open mode.
SQL> select OPEN_MODE from v$database;
OPEN_MODE
--------------------
READ WRITE
NOTE: The standby database can be opened in this mode only for test purposes. After
completing the test operations it has to be flashed back to the restore point.
9. To flash the standby database back to the point before it was opened in
read-write mode, shutdown and start the standby database in mount phase.
SQL> flashback database to restore point rest1;
10. Convert the standby database back into a physical standby.
SQL> alter database convert to physical standby;
11. Startup the standby database in mount phase and start the MRP process.
SQL> alter database recover managed standby database disconnect;
12. Enable the status of the Archive destination in the primary database.
SQL> alter system set log_archive_dest_state_2=enable;
RESOLVING ARCHIVE GAP MANUALLY:
1. The following SQL statement, run on the primary, gives the last log sequence
received by the standby destination and the last sequence archived on the primary.
SQL> select max(r.sequence#) last_seq_recd, max(l.sequence#) last_seq_sent from
v$archived_log r, v$log l where r.dest_id=2 and l.archived='YES';
LAST_SEQ_RECD LAST_SEQ_SENT
------------- -------------
77 78
2. We can also find the names of the archive log files between those sequence
numbers on the primary side.
SQL> select name from v$archived_log where thread#=1 and dest_id=1 and
sequence# between 77 and 80;
NAME
--------------------------------------------------------------------------------
/home/oracle10g/orcl/archloc/1_77_646949374.dbf
/home/oracle10g/orcl/archloc/1_78_646949374.dbf
/home/oracle10g/orcl/archloc/1_79_646949374.dbf
SQL> select name from v$archived_log where thread#=1 and dest_id=1 and
sequence# > 76 and sequence# < 80;
3. Check the standby archive destination on the standby side.
SQL> show parameter standby_archive_dest
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
standby_archive_dest string /home/oracle10g/stand/arch
4. Copy the archive log files listed in the above query manually to the standby
archive destination and issue the following command to start automatic recovery.
SQL> alter database recover automatic standby database;
alter database recover automatic standby database
*
ERROR at line 1:
ORA-00279: change 580282 generated at 04/01/2008 14:17:02 needed for thread 1
ORA-00289: suggestion :
/home/oracle10g/stand/archloc/1_80_646949374.dbf
ORA-00280: change 580282 for thread 1 is in sequence #80
ORA-00278: log file '/home/oracle10g/stand/archloc/1_80_646949374.dbf' no
longer needed for this recovery
ORA-00308: cannot open archived log
'/home/oracle10g/stand/archloc/1_80_646949374.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
NOTE: Automatic recovery always searches for the archive log file with the latest
sequence number, which may not yet have been transported to the standby side. Recovery
therefore stops automatically with the above error.
5. Start the MRP process in the standby side.
SQL> alter database recover managed standby database disconnect;
FLASHBACK AFTER ROLE TRANSITION (FAILOVER):
1. Verify Flashback mode for the standby database, it should be ON.
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
2. Check for the archive gap. If it exists resolve it as in the above mentioned steps.
SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;
no rows selected
3. Failover to standby by issuing following command.
SQL> alter database recover managed standby database finish force;
SQL> alter database commit to switchover to primary;
SQL> SELECT DATABASE_ROLE FROM V$DATABASE;
DATABASE_ROLE
---------------------------
PRIMARY
4. Find out the SCN at which the standby database has become primary.
SQL> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM
V$DATABASE;
TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)
----------------------------------------
601721
5. Shutdown and Startup the OLD PRIMARY DATABASE in mount phase.
6. Flash back the OLD PRIMARY DATABASE to the above SCN.
SQL> flashback database to scn 601721;
7. Convert the OLD PRIMARY DATABASE to physical standby.
SQL> alter database convert to physical standby;
8. Shutdown and Start the NEW STANDBY DATABASE in mount phase.
9. ON PRIMARY DATABASE check for the archive destination errors and archive
dest status.
SQL> select dest_id,dest_name,status,protection_mode,destination,error, sql from
v$archive_dest_status;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;
SQL> select DEST_ID,ERROR,LOG_SEQUENCE from v$archive_dest;
NOTE: The above query shows any error that may have occurred while archiving. There
should be no errors for archiving to work properly.
10. ON NEW STANDBY database start MRP process.
SQL> alter database recover managed standby database disconnect;
NOTE: Properly verify the log archive parameters in the new standby database, i.e.
ensure the VALID_FOR attribute is set to (ALL_LOGFILES,ALL_ROLES) where appropriate and
that DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT are set correctly.
CREATING LOGICAL STANDBY DATABASE.
NOTE: In 10g a physical standby has to be created first; it can later be converted
into a logical standby.
1. Stop the MRP Process on standby database.
SQL> alter database recover managed standby database cancel;
2. On the primary database, build the LogMiner dictionary by executing the following
package.
SQL> execute dbms_logstdby.build;
3. Change the database name on Standby database by issuing the following
command.
SQL> alter database recover to logical standby LOGICAL;
alter database recover to logical standby logical
ERROR at line 1:
ORA-16254: change db_name to STAND in the client-side parameter file (pfile)
NOTE: You will encounter the above error if the standby database is using a pfile. It
prompts you to change the DB_NAME parameter in the parameter file to the new name
you wish to give. If the standby database is using an SPFILE, the DB_NAME parameter
will be updated to the new name automatically. In the above example the new database
name is LOGICAL.
4. Shutdown the standby database and edit the DB_NAME parameter to the new name if
the standby database is using a pfile.
5. create a password file and set the following parameters on logical standby.
log_archive_dest_1='LOCATION=/home/oracle10g/logical/archloc/ VALID_FOR=
(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=logical'
log_archive_dest_2='SERVICE=con_orcl LGWR ASYNC VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=orcl'
*.log_archive_dest_3='LOCATION=/home/oracle10g/logical/arch VALID_FOR=
(STANDBY_LOGFILES,STANDBY_ROLE)
DB_UNIQUE_NAME=logical'
6. Start the instance in mount phase.
7. Open the database with resetlogs.
SQL> alter database open resetlogs;
8. Start the LSP process.
SQL> alter database start logical standby apply immediate;
Oracle dataguard ensures high availability data protection and disaster
recovery for enterprise data. The dataguard configuration will always have 2
databases running transactionally consistent. If the production database becomes
unavailable because of planned or unplanned outage. Dataguard can be switched to
any standby database in the configuration to production role, minimizing down
time.
Standby databases are created by the backup of the primary database.
Once configured dataguard will maintain the standby database automatically by
transmitting redo of the primary database and applying the transmitted data to the
standby database(Redo apply for physical standby, SQL apply for logical standby).
TYPES OF STANDBY DATABASES:
Standby databases can be of two types.
1. Physical standby database.
Provides a physically identical copy of the primary database, with on disk
database structures that are identical to the primary database on a block-for-block
basis. The database schema, including indexes, are the same.
2. Logical standby database.
Contains the same logical information as the production database, although the
physical organization and structure of the data can be different.
BENEFITS OF PHYSICAL STANDBY DATABASE:
1.Redo apply:
Standby database is made consistent by applying redo data from archivelog files.
Recovery applies the changes from redo block to the data block.
2.Read only mode:
Physical standby database can be opened in read only mode for querying, and it
is available for taking the backups. Thus we can minimize the load on the
production database. Redo data will not be applied if the standby database is
opened in read only mode, but redo data will be still put on the standby side.
3. Read write mode:
Physical standby database can be opened in read write mode for temporary
purpose (using flashback feature).
4.Disaster recovery and high availability
It gives best disaster recovery and high availability solution by easily managed
switch over and failover capabilities between the primary and standby databases.
5. Data protection
Using a physical standby database, Data Guard can ensure no data loss, even in the
face of unforeseen disasters. A physical standby database supports all datatypes,
and all DDL and DML operations that the primary database can support.
BENEFITS OF LOGICAL STANDBY DATABASE:
1.Efficient use of standby hardware resources
A logical standby database can be used for other business purposes in addition to
disaster recovery requirements. It can host additional database schemas beyond
the ones that are protected in a Data Guard configuration, and users can
perform normal DDL or DML operations on those schemas any time.
2.Reduction in primary database workload
A logical standby database can remain open at the same time its tables are
updated from the primary database, and those tables are simultaneously available
for read access. This makes a logical standby database an excellent choice to do
queries, summations, and reporting activities, thereby off-loading the primary
database from those tasks and saving valuable CPU and I/O cycles.
DATA GUARD SERVICES:
1.Redo Transport Services
Control the automated transfer of redo data from the production database to
one
or more archival destinations.
2.Log Apply Services
Apply redo data on the standby database to maintain transactional
synchronization with the primary database. Redo data can be applied either from
archived redo log files, or, if real-time apply is enabled, directly from the standby
redo log files as they are being filled, without requiring the redo data to be
archived first at the standby database.
3. Role Transitions
Change the role of a database from a standby database to a primary database, or
vice versa using either a switchover or a failover operation.
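If standby redo log files are configured on the standby, real-time apply can be enabled for Redo Apply with the following command (a sketch; the plain managed recovery command without the USING CURRENT LOGFILE clause appears later in this document):
SQL> alter database recover managed standby database using current logfile disconnect;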
CREATING PHYSICAL STANDBY DATABASE:
1. Enable forced logging for primary database.
SQL>alter database force logging;
2. Configure standby redologfiles
Standby redolog files are required for Maximum availability and maximum
protection modes. Ensure logfile sizes are identical on both primary and standby
side. Create groups adding logfiles to the groups.
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo3.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo4.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo5.log' size 50m;
SQL> alter database add standby logfile '/home/oracle10g/orcl/redo6.log' size 50m;
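The newly added standby redo log groups can be checked with a quick query (a minimal verification sketch):
SQL> select group#, thread#, bytes, status from v$standby_log;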
3. primary database parameters
*.background_dump_dest='/home/oracle10g/orcl/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/home/oracle10g/orcl/control/control01.ctl'
*.db_block_size=8192
log_archive_config='DG_CONFIG=(orcl,stand)'
log_archive_dest_1='LOCATION=/home/oracle10g/orcl/archloc/ VALID_FOR
=(ALL_LOGFILES,ALL_ROLES) db_unique_name=orcl'
log_archive_dest_2='SERVICE=con_stand
VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stand'
log_archive_dest_state_1='enable'
log_archive_dest_state_2='enable'
db_unique_name='orcl'
FAL_CLIENT='orcl' #Connect descriptor name which connects to primary
FAL_SERVER='stand' #Connect descriptor name which connects to standby
standby_file_management='auto'
remote_login_passwordfile='EXCLUSIVE'
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_name='orcl'
*.job_queue_processes=10
*.open_cursors=300
*.pga_aggregate_target=94371840
*.processes=150
*.sga_target=283115520
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/home/oracle10g/orcl/udump'
log_file_name_convert='/home/oracle10g/stand/','/home/oracle10g/orcl/'
db_file_name_convert='/home/oracle10g/stand/','/home/oracle10g/orcl/'
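If the primary database uses an spfile rather than a pfile, the Data Guard related parameters above can also be set dynamically; a sketch using the same values:
SQL> alter system set log_archive_config='DG_CONFIG=(orcl,stand)' scope=both;
SQL> alter system set log_archive_dest_2='SERVICE=con_stand VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stand' scope=both;
SQL> alter system set standby_file_management='AUTO' scope=both;
SQL> alter system set fal_server='stand' scope=both;
SQL> alter system set fal_client='orcl' scope=both;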
4. standby database parameters
*.background_dump_dest='/home/oracle10g/stand/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/home/oracle10g/stand/control/constd.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_file_name_convert='/home/oracle10g/orcl','/home/oracle10g/stand'
*.db_name='orcl'
*.db_unique_name='stand'
*.fal_client='stand' #Connect descriptor name which connects to standby
*.fal_server='orcl' #Connect descriptor name which connects to primary
*.job_queue_processes=10
*.log_archive_config='DG_CONFIG=(stand,orcl)'
*.log_archive_dest_1='LOCATION=/home/oracle10g/stand/archloc/ VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=stand'
*.log_archive_dest_2='SERVICE=con_orcl LGWR ASYNC VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=orcl'
*.log_archive_dest_state_1='enable'
*.log_archive_dest_state_2='enable'
*.log_archive_max_processes=2
*.log_file_name_convert='/home/oracle10g/orcl','/home/oracle10g/stand'
*.open_cursors=300
*.pga_aggregate_target=16777216
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=167772160
*.standby_archive_dest='/home/oracle10g/stand/arch/'
*.standby_file_management='auto'
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/home/oracle10g/stand/udump'
SQL> select instance_name,status from v$instance;
INSTANCE_NAME STATUS
---------------- ------------
orcl OPEN
5. Backup the primary database completely:
NOTE: Backup can be taken either by shutting down the database or by RMAN.
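A minimal RMAN sketch of the backup (the backup destination directory is an illustrative assumption; a cold backup followed by copying the datafiles, as in step 7, works equally well):
$ mkdir -p /home/oracle10g/backup      # illustrative backup location
$ rman target /
RMAN> backup database format '/home/oracle10g/backup/%U';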
6. Start the primary database in mount or open phase to take control file backup.
SQL> alter database create standby controlfile as '/home/oracle10g/stdctl.ctl';
7. Copy the standby control file, datafiles, and redo log files to the standby side.
Create the appropriate directories on the standby side where the datafiles and redo log files will be kept. Update the DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT parameters in the parameter files of both the primary and standby databases as shown above.
8. Shutdown primary database
9. Configure a listener on both the primary and standby sides, along with connect descriptors (con_orcl and con_stand) that allow each database to connect to the other.
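A sketch of the corresponding tnsnames.ora entries (the host names and port are assumptions; the service names follow the DB_NAME/DB_UNIQUE_NAME values used above):
# primary_host, standby_host, and port 1521 below are assumptions
con_orcl =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primary_host)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )
con_stand =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby_host)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = stand))
  )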
10. Create a password file for both the primary and standby databases.
Note: Both the standby and primary databases must use the same SYS password.
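A sketch of creating the password files with orapwd (the password value and the orapw<SID> file names are assumptions; use the same SYS password on both sides):
$ orapwd file=$ORACLE_HOME/dbs/orapworcl password=oracle     # primary; password is an assumption
$ orapwd file=$ORACLE_HOME/dbs/orapwstand password=oracle    # standby; password is an assumption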
11. Start the standby database in nomount phase and mount it.
SQL> alter database mount standby database;
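Redo received from the primary is not applied until managed recovery is started on the standby; the same command used later in this document can be issued here:
SQL> alter database recover managed standby database disconnect;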
12. Start primary database.
SQL> select error from v$archive_dest;
NOTE: The above query should not return an error for any destination that transmits redo data from the primary database. If an error occurs, check the listener status on the standby side and the corresponding connect descriptor on the primary side.
13.Verifying the standby database.
SQL> select SEQUENCE#,ARCHIVED,APPLIED,FIRST_TIME,NEXT_TIME from
v$archived_log;
SEQUENCE# ARC APP FIRST_TIM NEXT_TIME
-------------------- ------- ------ ---------------- -----------------
19 YES YES 17-FEB-08 18-FEB-08
20 YES YES 18-FEB-08 18-FEB-08
21 YES YES 18-FEB-08 18-FEB-08
22 YES YES 18-FEB-08 18-FEB-08
23 YES YES 18-FEB-08 18-FEB-08
SQL> alter system switch logfile; (Primary database)
SQL> select SEQUENCE#,ARCHIVED,APPLIED,FIRST_TIME,NEXT_TIME from
v$archived_log; (Standby database)
SEQUENCE# ARC APP FIRST_TIM NEXT_TIME
---------- --- --- --------- ---------
19 YES YES 17-FEB-08 18-FEB-08
20 YES YES 18-FEB-08 18-FEB-08
21 YES YES 18-FEB-08 18-FEB-08
22 YES YES 18-FEB-08 18-FEB-08
23 YES YES 18-FEB-08 18-FEB-08
24 YES YES 18-FEB-08 18-FEB-08 (New Archive logfile applied)
Note: Re-run the query after subsequent log switches on the primary side; the primary database will be one log switch ahead of the standby database. Also ensure that the MRP process is running on the standby side, for example with a ps -ef | grep ora_mrp command.
DATAGUARD PROTECTION MODES:
The following table shows the archive destination attributes required to place the database in each protection mode.

                          Maximum Protection   Maximum Availability   Maximum Performance
Redo archival process     LGWR                 LGWR                   LGWR or ARCH
Network transmission mode SYNC                 SYNC                   SYNC or ASYNC when using LGWR; SYNC if using ARCH
Disk write option         AFFIRM               AFFIRM                 AFFIRM or NOAFFIRM
Standby redo log required Yes                  Yes                    No, but it is recommended
UPGRADING FROM PERFORMANCE TO AVAILABILITY:
1. Verify the parameters of the primary database.
SQL> show parameter log_archive_dest_2
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2 string SERVICE=con_stand VALID_FOR=(O
NLINE_LOGFILES,ALL_ROLES) DB_U
NIQUE_NAME=stand
2. If not set properly, set the archive parameters with the proper log transport
services as required for the protection modes.
SQL> alter system set log_archive_dest_2='SERVICE=con_stand LGWR SYNC
AFFIRM VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=stand';
3. Shut down the database and start it in the mount phase, then issue the following command to put the database in maximum availability mode.
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;
4. Open the database
SQL> ALTER DATABASE OPEN;
5.Verify
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
--------------------------------- --------------------------------------
MAXIMUM AVAILABILITY RESYNCHRONIZATION
NOTE: The protection level shows RESYNCHRONIZATION until the standby has caught up with the primary; once synchronized it changes to MAXIMUM AVAILABILITY.
DOWNGRADING FROM AVAILABILITY TO PERFORMANCE:
1. Set the archive parameter suitably.
SQL> alter system set log_archive_dest_2='SERVICE=con_stand ARCH SYNC
AFFIRM VALID_FOR=(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=stand';
2. Issue the following command to put the database back in Performance Mode.
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE
PERFORMANCE;
3. Verify
SQL> select protection_mode,protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
--------------------------------- ---------------------------------
MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
NOTE: When downgrading from availability to performance, or from protection to availability, there is no need to restart the instance.
VERIFY THE LOG SHIPPING:
NOTE: The following queries provide the status of the online log files, the maximum log sequence number, and the latest archived log file that has been transmitted.
SQL> select thread#,sequence#,archived,status from v$log;
THREAD# SEQUENCE# ARC STATUS
---------- ---------- --- ----------------
1 35 NO CURRENT
1 33 YES INACTIVE
1 34 YES INACTIVE
SQL> select max(sequence#),thread# from v$archived_log group by thread#;
MAX(SEQUENCE#) THREAD#
------------------------------ ----------
34 1
SQL> select destination,status,archived_thread#,archived_seq# from
v$archive_dest_status where status <> 'DEFERRED' and status <> 'INACTIVE';
DESTINATION STATUS ARCHIVED_THREAD# ARCHIVED_SEQ#
--------------------------------------------- ------------ --------------------------------- --------------------------
/home/oracle10g/orcl/archloc VALID 1 34
con_stand VALID 1 34
DATABASE ROLE TRANSITIONS:
1. Check the database's current role in the configuration.
SQL> select DATABASE_ROLE from v$database;
DATABASE_ROLE
----------------
PRIMARY
2. To choose the most suitable standby database to switch over or fail over to (the one with the smallest apply lag), run the following query on the standby database.
SQL> select NAME,VALUE,UNIT from v$dataguard_stats;
NAME                     VALUE            UNIT
------------------------ ---------------- ------------------------------
apply finish time                         day(2) to second(1) interval
apply lag                +00 02:19:59     day(2) to second(0) interval
estimated startup time   10               second
standby has been open    N
transport lag            +00 02:17:32     day(2) to second(0) interval
3. Verify current switchover status on primary database.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO STANDBY
4. Issue switchover command on primary database first.
SQL> alter database commit to switchover to physical standby;
5. Shut down the old primary database and start it in the mount phase (it now acts as the new standby).
6. Verify the proper switchover of Primary to Standby by following query.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO PRIMARY
7. Further carry out switchover operation on standby side.
SQL> alter database commit to switchover to primary;
8. Verify the proper switchover of Standby to Primary.
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
--------------------
TO STANDBY
9. Shut down and start the new primary database normally.
NOTE: Alternatively, ALTER DATABASE OPEN is sufficient if the standby database has never been opened read-only since it was last started.
10. Start Managed recovery process on new standby side.
SQL> alter database recover managed standby database disconnect;
11. Verify the MRP process.
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
--------- ------------
ARCH CLOSING
ARCH CLOSING
MRP0 WAIT_FOR_LOG
RFS IDLE
FAILOVER:
1. Issue the following query to check for archive gaps. If any exist, the archive gaps have to be resolved first.
NOTE: Resolving archive gaps is demonstrated in further steps.
SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;
2. To find the archived log file with the highest log sequence number (up to which any archive gap has to be resolved), issue the following query.
SQL> select unique thread# as thread, max(sequence#) over (partition by
thread#) as last from v$archived_log;
THREAD LAST
---------------- ----------
1 45
NOTE: If an archive gap exists, copy the corresponding archived log files to the standby database and register them on the standby side. Repeat until the gap is resolved.
SQL> alter database register logfile 'path and name';
3. To fail over, issue the following commands on the standby.
SQL> alter database recover managed standby database finish force;
SQL> alter database commit to switchover to primary;
NOTE: Shut down and restart the database if it was opened read-only at any time after configuration.
HOW TO OPEN STANDBY DATABASE IN READ ONLY MODE?
1. When the standby database is shut down, simply issuing STARTUP opens it in read-only mode (because a standby control file is in use).
2.If the standby database is mounted then:
a. SQL> alter database recover managed standby database cancel;
SQL> alter database open;
SQL> select OPEN_MODE from v$database;
OPEN_MODE
--------------------
READ ONLY
NOTE: Further, if the MRP process is started while the standby database is open read-only, the database is automatically returned to the mount phase, as shown below.
SQL> alter database recover managed standby database disconnect;
SQL> select status from v$instance;
STATUS
------------
MOUNTED
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
-------------- ------------
ARCH CLOSING
ARCH CLOSING
MRP0 WAIT_FOR_LOG
RFS IDLE
SQL> alter database recover managed standby database cancel;
SQL> alter database open;
SQL> SELECT PROCESS,STATUS FROM V$MANAGED_STANDBY;
PROCESS STATUS
-------------- ------------
ARCH CLOSING
ARCH CLOSING
RFS IDLE
NOTE: The above query shows that the MRP process has stopped because it was cancelled.
MANAGEMENT:
Tablespace Management.
If STANDBY_FILE_MANAGEMENT is set to auto.
SQL> create tablespace tbs1
datafile '/home/oracle10g/orcl/tbs1.dbf' size 20m;
SQL> alter system archive log current;
On the standby side the datafile is created automatically.
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example01.dbf
/home/oracle10g/stand/tbs1.dbf
If the Tablespace is dropped
SQL> drop tablespace tbs1;
SQL> alter system archive log current;
On the standby side, the datafile is dropped automatically.
SQL> select name from v$datafile;
NAME
----------------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example01.dbf
If STANDBY_FILE_MANAGEMENT is not set to AUTO:
Renaming a datafile:
If a datafile is renamed on the primary side, the same datafile has to be renamed manually on the standby side.
A. SQL> alter tablespace example offline;
B. $ mv example01.dbf example.dbf
C. alter database rename file '/home/oracle10g/orcl/example01.dbf' to
'/home/oracle10g/orcl/example.dbf';
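Presumably the EXAMPLE tablespace is then brought back online on the primary (this step is implied but not shown above):
SQL> alter tablespace example online;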
SQL> alter system archive log current;
On standby side:
D. SQL> alter system set standby_file_management=manual;
E. SQL> alter database rename file '/home/oracle10g/stand/example01.dbf'
to '/home/oracle10g/stand/example.dbf';
F. SQL> alter database recover managed standby database disconnect;
G. SQL> alter system set standby_file_management=auto;
H. SQL> select name from v$datafile;
NAME
----------------------------------------------------------
/home/oracle10g/stand/system01.dbf
/home/oracle10g/stand/undotbs01.dbf
/home/oracle10g/stand/sysaux01.dbf
/home/oracle10g/stand/users01.dbf
/home/oracle10g/stand/example.dbf
MONITORING PROCESS:
The following queries are helpful for getting information about the Data Guard processes and the archived log files.
SQL> select process,client_process,sequence#,status from v$managed_standby;
PROCESS CLIENT_P SEQUENCE# STATUS
--------- ---------------- ------------------- ------------
ARCH ARCH 0 CONNECTED
ARCH ARCH 0 CONNECTED
MRP0 N/A 63 WAIT_FOR_LOG
RFS N/A 0 IDLE
SQL> select archived_thread#,archived_seq#,applied_thread#,applied_seq# from
v$archive_dest_status;
ARCHIVED_THREAD# ARCHIVED_SEQ# APPLIED_THREAD# APPLIED_SEQ#
---------------- ------------- --------------- ------------
1 64 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
1 63 1 62
SQL> select name,creator,sequence#,applied,completion_time from v$archived_log;
NAME                                              CREATOR  SEQUENCE# APP COMPLETION
------------------------------------------------- -------- --------- --- ----------
/home/oracle10g/stand/archloc/1_65_646949374.dbf  ARCH            65 YES 28-MAR-08
SQL> SELECT FIRST_TIME,FIRST_CHANGE#,NEXT_CHANGE#,SEQUENCE# FROM V$LOG_HISTORY;
FIRST_TIM FIRST_CHANGE# NEXT_CHANGE#  SEQUENCE#
--------- ------------- ------------ ----------
17-FEB-08        446075       474198          1
17-FEB-08        474198       476839          2
17-FEB-08        476839       476841          3
...
27-MAR-08        554446       561139         63
28-MAR-08        561139       561141         64
28-MAR-08        561141       561853         65
SQL> select thread#,max(sequence#) as "LAST_APPLIED" from v$log_history
group by thread#;
THREAD# LAST_APPLIED
---------- ------------
1 65
OPENING STANDBY DATABASE IN READ AND WRITE MODE:
NOTE: Set DB_RECOVERY_FILE_DEST to a flash recovery area directory and DB_RECOVERY_FILE_DEST_SIZE to a suitable physical size; flashback database requires a flash recovery area.
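A sketch of setting these parameters on the standby (the size and directory below are illustrative assumptions):
SQL> REM the size and location below are only examples; adjust for your environment
SQL> alter system set db_recovery_file_dest_size=4g;
SQL> alter system set db_recovery_file_dest='/home/oracle10g/stand/flash';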
NOTE: Here we use Activate and Convert command.
1. Stop the MRP Process on standby.
SQL> alter database recover managed standby database cancel;
2. Create a guaranteed restore point on the standby database, to which the database can later be flashed back.
SQL> create restore point rest1 guarantee flashback database;
3. Switch the logfile on Primary side to archive the latest changes to standby side.
SQL> alter system switch logfile;
4.Defer the Standby destination of the primary database.
SQL> alter system set log_archive_dest_state_2=defer;
5. Activate the standby database.
SQL> alter database activate standby database;
6. Startup in mount phase and put the standby database in Max performance.
SQL> alter database set standby database to maximize performance;
7. Open the standby database in Read Write mode.
SQL> alter database open;
8. Verify the open mode.
SQL> select OPEN_MODE from v$database;
OPEN_MODE
--------------------
READ WRITE
NOTE: The standby database should be opened in this mode only for test purposes. After completing the test operations, it has to be flashed back to the restore point.
9. To flash the standby database back to the point before it was opened in read-write mode, shut down the standby database and start it in the mount phase.
SQL> flashback database to restore point rest1;
10.Convert the standby database back in to physical standby.
SQL> alter database convert to physical standby;
11. Startup the standby database in mount phase and start the MRP process.
SQL> alter database recover managed standby database disconnect;
12. Enable the status of the Archive destination in the primary database.
SQL> alter system set log_archive_dest_state_2=enable;
RESOLVING ARCHIVE GAP MANUALLY:
1. The following SQL statement, run on the primary, gives the last log sequence number received by the standby destination and the last sequence number archived on the primary.
SQL> select max(r.sequence#) last_seq_recd, max(l.sequence#) last_seq_sent from
v$archived_log r, v$log l where r.dest_id=2 and l.archived='YES';
LAST_SEQ_RECD LAST_SEQ_SENT
------------- -------------
77 78
2. We can also find the names of the archived log files between those sequence numbers on the primary side.
SQL> select name from v$archived_log where thread#=1 and dest_id=1 and
sequence# between 77 and 80;
NAME
--------------------------------------------------------------------------------
/home/oracle10g/orcl/archloc/1_77_646949374.dbf
/home/oracle10g/orcl/archloc/1_78_646949374.dbf
/home/oracle10g/orcl/archloc/1_79_646949374.dbf
3. On the standby side, check the standby archive destination.
SQL> show parameter standby_archive_dest
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_archive_dest                 string      /home/oracle10g/stand/arch
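Step 4 below copies the missing archived log files to the standby archive destination shown above; a minimal sketch (the standby host name is an assumption):
$ # standby_host is an assumption
$ scp /home/oracle10g/orcl/archloc/1_77_646949374.dbf oracle10g@standby_host:/home/oracle10g/stand/arch/
$ scp /home/oracle10g/orcl/archloc/1_78_646949374.dbf oracle10g@standby_host:/home/oracle10g/stand/arch/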
4. Copy the archive log files listed in the above query manually to the standby
archive destination and issue the following command to start automatic recovery.
SQL> alter database recover automatic standby database;
alter database recover automatic standby database
*
ERROR at line 1:
ORA-00279: change 580282 generated at 04/01/2008 14:17:02 needed for thread 1
ORA-00289: suggestion :
/home/oracle10g/stand/archloc/1_80_646949374.dbf
ORA-00280: change 580282 for thread 1 is in sequence #80
ORA-00278: log file '/home/oracle10g/stand/archloc/1_80_646949374.dbf' no
longer needed for this recovery
ORA-00308: cannot open archived log
'/home/oracle10g/stand/archloc/1_80_646949374.dbf'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
NOTE: Automatic recovery always searches for the archived log file with the next required log sequence number, which may not yet have been transported to the standby side; recovery then stops automatically with the above error.
5. Start the MRP process in the standby side.
SQL> alter database recover managed standby database disconnect;
FLASHBACK AFTER ROLE TRANSITION (FAILOVER):
1. Verify Flashback mode for the standby database, it should be ON.
SQL> select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
2. Check for the archive gap. If it exists resolve it as in the above mentioned steps.
SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;
no rows selected
3. Failover to standby by issuing following command.
SQL> alter database recover managed standby database finish force;
SQL> alter database commit to switchover to primary;
SQL> SELECT DATABASE_ROLE FROM V$DATABASE;
DATABASE_ROLE
---------------------------
PRIMARY
4. Find out the SCN at which the standby database has become primary.
SQL> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM
V$DATABASE;
TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)
----------------------------------------
601721
5. Shutdown and Startup the OLD PRIMARY DATABASE in mount phase.
6. Flash the OLD PRIMARY DATABASE back to the above SCN.
SQL> flashback database to scn 601721;
7. Convert the OLD PRIMARY DATABASE to physical standby.
SQL> alter database convert to physical standby;
8. Shutdown and Start the NEW STANDBY DATABASE in mount phase.
9. ON PRIMARY DATABASE check for the archive destination errors and archive
dest status.
SQL> select dest_id,dest_name,status,protection_mode,destination,error, sql from
v$archive_dest_status;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;
SQL> select DEST_ID,ERROR,LOG_SEQUENCE from v$archive_dest;
NOTE: The above query shows any error that may occur while archiving. There should be no errors for archiving to work properly.
10. ON NEW STANDBY database start MRP process.
SQL> alter database recover managed standby database disconnect;
NOTE: Properly verify the log archive parameters on the new standby database, i.e. that the VALID_FOR attribute is set to (ALL_LOGFILES,ALL_ROLES) where appropriate, and that DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT are set correctly.
CREATING LOGICAL STANDBY DATABASE:
NOTE: In 10g, a physical standby database has to be created first and then converted into a logical standby database.
1. Stop the MRP Process on standby database.
SQL> alter database recover managed standby database cancel;
2. On the primary database, build the LogMiner dictionary by executing the following procedure.
SQL> execute dbms_logstdby.build;
3. Convert the physical standby into a logical standby, giving it a new database name, by issuing the following command on the standby database.
SQL> alter database recover to logical standby LOGICAL;
alter database recover to logical standby logical
ERROR at line 1:
ORA-16254: change db_name to STAND in the client-side parameter file (pfile)
NOTE: You will encounter the above error if the standby database is using a pfile. It prompts you to change the DB_NAME parameter in the parameter file to the new name you wish to give. If the standby database is using an spfile, the DB_NAME parameter is updated automatically to the new name. In the above example the new database name is LOGICAL.
4. Shut down the standby database and edit the DB_NAME parameter to the new name if the standby database is using a pfile.
5. Create a password file and set the following parameters on the logical standby.
log_archive_dest_1='LOCATION=/home/oracle10g/logical/archloc/ VALID_FOR=
(ONLINE_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=logical'
log_archive_dest_2='SERVICE=con_orcl LGWR ASYNC VALID_FOR=
(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=orcl'
*.log_archive_dest_3='LOCATION=/home/oracle10g/logical/arch VALID_FOR=
(STANDBY_LOGFILES,STANDBY_ROLE)
DB_UNIQUE_NAME=logical'
6. Start the instance in mount phase.
7. Open the database with resetlogs.
SQL> alter database open resetlogs;
8. Start the LSP process.
SQL> alter database start logical standby apply immediate;
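To verify that SQL apply is making progress on the logical standby, queries such as the following can be used (a sketch; these views are available in 10gR2):
SQL> select state from v$logstdby_state;
SQL> select applied_scn, newest_scn from dba_logstdby_progress;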