Grid Management DB filling up ASM disk space

I recently discovered, on Oracle Grid Infrastructure 12cR2, that the ASM disk group hosting the Management DB (-MGMTDB) was filling up very quickly.

This is due to a bug in the oclumon data purge procedure.

To fix the problem, two possibilities are available:

  1. Recreate the Management DB
  2. Manually truncate the tables that are not purged and shrink the tablespace

 

The two options are described below.

 

Option 1 – Recreate the Management DB

As the root user, stop and disable the ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

As the grid user, from the node hosting the Management Database instance, run the following commands:

$ /u01/app/12.2.0.1/grid/bin/srvctl status mgmtdb
$ /u01/app/12.2.0.1/grid/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB
Connecting to database
4% complete
9% complete
14% complete
19% complete
23% complete
28% complete
47% complete
Updating network configuration files
48% complete
52% complete
Deleting instance and datafiles
76% complete
100% complete

 

Recreate the MGMTDB:

$ /u01/app/12.2.0.1/grid/bin/dbca -silent -createDatabase -createAsContainerDatabase true \
  -templateName MGMTSeed_Database.dbc \
  -sid -MGMTDB \
  -gdbName _mgmtdb \
  -storageType ASM \
  -diskGroupName GIMR \
  -datafileJarLocation <GI HOME>/assistants/dbca/templates \
  -characterset AL32UTF8 \
  -autoGeneratePassword \
  -skipUserTemplateCheck

 

Create the GIMR pluggable database:

$ /u01/app/12.2.0.1/grid/bin/mgmtca -local
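
Once the Management DB has been recreated, the ora.crf resource that was stopped at the beginning can be re-enabled and restarted. A minimal sketch, as root on each cluster node (same GI home as above):

# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init
# /u01/app/12.2.0.1/grid/bin/crsctl start res ora.crf -init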

 


 

Option 2 – Manually truncate the tables

 

As the root user, stop and disable the ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

Connect to MGMTDB and identify the segments to truncate:

export ORACLE_SID=-MGMTDB
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL> select pdb_name from dba_pdbs where pdb_name!='PDB$SEED';

SQL> alter session set container=GIMR_DSCREP_10;

Session altered.

SQL> col obj format a50
SQL> select owner||'.'||SEGMENT_NAME obj, BYTES from dba_segments where owner='CHM' order by 2 asc;

 

These two tables are likely to be much bigger than the rest:

  • CHM.CHMOS_PROCESS_INT_TBL
  • CHM.CHMOS_DEVICE_INT_TBL

Truncate the tables:

SQL> truncate table CHM.CHMOS_PROCESS_INT_TBL;
SQL> truncate table CHM.CHMOS_DEVICE_INT_TBL;

 

Then, if needed, shrink the tablespace as sketched below, and the job is done.
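
A minimal sketch of the shrink step, assuming the truncated CHM segments live in the SYSMGMTDATA tablespace (verify the tablespace and datafile names first; the path and target size below are only examples). Afterwards ora.crf can be re-enabled as shown in Option 1.

--Identify the datafile(s) of the tablespace to shrink
SQL> select tablespace_name, file_name, bytes/1024/1024 MB from dba_data_files order by 1;

--Resize the datafile to reclaim the space (example path and size)
SQL> alter database datafile '+GIMR/_MGMTDB/<datafile_path>/sysmgmtdata.dbf' resize 2048M;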

 


 


RHEL 7.4 fails to mount ACFS File System due to KMOD package

After a fresh OS installation or an upgrade to RHEL 7.4, any attempt to install ACFS drivers will fail with the following message: “ACFS-9459 ADVM/ACFS is not supported on this OS version”

The error persists even when the Oracle Grid Infrastructure software includes Patch 26247490: 12.2 ACFS MODULE ERRORS & CRASH DURING MODULE LOAD & UNLOAD WITH OL7U4 RHCK.

 

This problem has been identified by Oracle as BUG 26320387 – 7.4 kmod weak-modules not checking kABI compatibility correctly, and by Red Hat as Bugzilla bug 1477073 – 7.4 kmod weak-modules --dry-run changed output format missing 'is compatible' messages.

root@oel7node06:/u01/app/12.2.0.1/grid/crs/install# /u01/app/12.2.0.1/grid/bin/acfsroot install
ACFS-9459: ADVM/ACFS is not supported on this OS version: '3.10.0-514.6.1.el7.x86_64'

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleadvm 776830 7
oracleoks 654476 1 oracleadvm
oracleafd 205543 1

 

The current workaround consists of downgrading the kmod RPM to kmod-20-9.el7.x86_64.

root@oel7node06:~# yum downgrade kmod-20-9.el7
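
After the downgrade, the ACFS driver installation can be retried and its state checked; a sketch using the same GI home as above (output omitted):

root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsroot install
root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsdriverstate loaded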

 

After the package downgrade the ACFS drivers are correctly loaded:

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleacfs 4597925 2
oracleadvm 776830 8
oracleoks 654476 2 oracleacfs,oracleadvm
oracleafd 205543 1

 


 

 

 

Adding flexibility to Oracle GI: implementing Multiple SCANs

Nowadays, business requirements force IT to implement more and more sophisticated and consolidated environments, without compromising the availability, performance and flexibility of the applications running on them.

In this post I explain how to improve Grid Infrastructure network flexibility by implementing multiple SCANs, and how to associate one or more networks with the Oracle databases.

To better understand the reasons for this type of implementation, a few common use cases are listed below:

  • Applications are deployed on different/dedicated subnets.
  • Network isolation is required for security reasons.
  • Different database protocols are in use (TCP, TCPS, etc.).

 

 

Single Client Access Name (SCAN)

By default, on each Oracle Grid Infrastructure cluster, independently of the number of nodes, one SCAN with three SCAN VIPs is created.

The default Oracle Clusterware network/SCAN configuration is depicted below.

 

[Diagram: Single_Scan_Listener]

 

Multiple Single Client Access Name (SCAN) implementation

Before implementing additional SCANs, the OS provisioning of the new network interfaces or of the new VLAN tagging has to be completed.

The current example uses the second option (VLAN tagging): the bond0 interface is an active/active bond of two 10GbE cards, to which a VLAN tag has been added.

The customized Oracle Clusterware network/SCAN configuration, after adding a second SCAN, is represented below.

 

[Diagram: Multi_Scan_Listeners]

 

Step-by-step implementation

After completing the OS network setup, as the grid user add the new interface to the Grid Infrastructure:

grid@host01a:~# oifcfg setif -global bond0.764/10.15.69.0:public

grid@host01a:~# oifcfg getif
eno49 192.168.7.32 global cluster_interconnect,asm
eno50 192.168.9.48 global cluster_interconnect,asm
bond0 10.11.8.0 global public
bond0.764 10.15.69.0 global public
grid@host01a:~#

 

Then, as root, create network number 2 and display the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add network -netnum 2 -subnet 10.15.69.0/255.255.255.0/bond0.764 -nettype STATIC

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl config network -netnum 2
Network 2 exists
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

 

As root user add the node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host01a -netnum 2 -address host01b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host02a -netnum 2 -address host02b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host03a -netnum 2 -address host03b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host04a -netnum 2 -address host04b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host05a -netnum 2 -address host05b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host06a -netnum 2 -address host06b-vip.emilianofusaglia.net/255.255.255.0

 

As the grid user, create a new listener on network number 2:

grid@host01a:~# srvctl add listener -listener LISTENER2 -netnum 2 -endpoints "TCP:1532"
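
To double-check the new listener definition before moving on, its configuration can be displayed (a quick verification; output omitted):

grid@host01a:~# srvctl config listener -listener LISTENER2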

 

As root user add the new SCAN to the network number 2:

 root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add scan -scanname scan-02.emilianofusaglia.net -netnum 2

 

As root user start the new node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host01b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host02b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host03b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host04b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host05b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host06b-vip.emilianofusaglia.net

 

As grid user start the new node Listeners:

grid@host01a:~# srvctl start listener -listener LISTENER2
grid@host01a:~# srvctl status listener -listener LISTENER2
Listener LISTENER2 is enabled
Listener LISTENER2 is running on node(s): host01a,host02a,host03a,host04a,host05a,host06a

 

As root user start the new SCAN and as grid user check the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start scan -netnum 2

grid@host01a:~# srvctl config scan -netnum 2
SCAN name: scan-02.emilianofusaglia.net, Network: 2
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
SCAN 1 IPv4 VIP: 10.15.69.44
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 2 IPv4 VIP: 10.15.69.45
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 3 IPv4 VIP: 10.15.69.43
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:

grid@host01a:~# srvctl status scan -netnum 2
SCAN VIP scan1_net2 is enabled
SCAN VIP scan1_net2 is running on node host02a
SCAN VIP scan2_net2 is enabled
SCAN VIP scan2_net2 is running on node host01a
SCAN VIP scan3_net2 is enabled
SCAN VIP scan3_net2 is running on node host03a

 

As grid user add the SCAN Listener and check the configuration:

grid@host01a:~# srvctl add scan_listener -netnum 2 -listener LISTENER2 -endpoints TCP:1532

grid@host01a:~# srvctl config scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN2_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN3_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:

 

As grid user start the SCAN Listener2 and check the status:

grid@host01a:~# srvctl start scan_listener -netnum 2

grid@host01a:~# srvctl status scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 is enabled
SCAN listener LISTENER2_SCAN1_NET2 is running on node host02a
SCAN Listener LISTENER2_SCAN2_NET2 is enabled
SCAN listener LISTENER2_SCAN2_NET2 is running on node host01a
SCAN Listener LISTENER2_SCAN3_NET2 is enabled
SCAN listener LISTENER2_SCAN3_NET2 is running on node host03a

 

Defining the multi-SCAN configuration per database

Once the above configuration is completed, it remains to define which SCAN(s) each database should use.

When multiple SCANs exist, by default CRS populates the LISTENER_NETWORKS parameter so that the database registers against all SCANs and listeners.

To override this default behavior, allowing for example a specific database to register only against the SCAN scan-02.emilianofusaglia.net, the LISTENER_NETWORKS database parameter should be configured manually (a sketch follows below).
LISTENER_NETWORKS can be set dynamically, but the new value is only enforced at the next instance restart.
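
A minimal sketch of such a manual configuration, assuming a database with instance DB1 running on host01a (the network alias net2, service names and ports are examples; repeat the command for each instance with its own node VIP):

SQL> alter system set listener_networks=
  '((NAME=net2)(LOCAL_LISTENER=(ADDRESS=(PROTOCOL=TCP)(HOST=host01b-vip.emilianofusaglia.net)(PORT=1532)))(REMOTE_LISTENER=scan-02.emilianofusaglia.net:1532))'
  scope=both sid='DB1';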

 


 

ASM Filter Driver (ASMFD)

 

ASM Filter Driver is a Linux kernel module introduced in 12c R1. It resides in the I/O path of the Oracle ASM disks and provides the following features:

  • Rejecting all non-Oracle I/O write requests to ASM Disks.
  • Device name persistency.
  • Node level fencing without reboot.

 

In 12c R2 ASMFD can be enabled from the GUI of the Grid Infrastructure installation, as shown in the post GI 12c R2 Installation at step #8 “Create ASM Disk Group”.

Once ASM Filter Driver is in use, the disks are managed using the ASMFD label name, similarly to ASMLib.

 

Here are a few examples of working with ASM Filter Driver.

--How to create an ASMFD label in SQL*Plus
SQL> Alter system label set 'DATA1' to '/dev/mapper/mpathak';

System altered.


--How to create an ASM Disk Group with ASMFD
CREATE DISKGROUP DATA_DG EXTERNAL REDUNDANCY DISK 'AFD:DATA1' SIZE 30720M
ATTRIBUTE 'SECTOR_SIZE'='512','LOGICAL_SECTOR_SIZE'='512','compatible.asm'='12.2.0.1',
'compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M';

Diskgroup created.

 

ASM Filter Driver can also be managed from the ASM command-line utility ASMCMD:

--Check ASMFD status
ASMCMD> afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on host 'oel7node06.localdomain'


--List ASM Disks where ASMFD is enabled
ASMCMD> afd_lsdsk
--------------------------------------------------------------------------------
Label                    Filtering                Path
================================================================================
DATA1                      ENABLED                /dev/mapper/mpathak
DATA2                      ENABLED                /dev/mapper/mpathan
DATA3                      ENABLED                /dev/mapper/mpathw
DATA4                      ENABLED                /dev/mapper/mpathac
GIMR1                      ENABLED                /dev/mapper/mpatham
GIMR2                      ENABLED                /dev/mapper/mpathaj
GIMR3                      ENABLED                /dev/mapper/mpathal
GIMR4                      ENABLED                /dev/mapper/mpathaf
GIMR5                      ENABLED                /dev/mapper/mpathai
RECO3                      ENABLED                /dev/mapper/mpathy
RECO1                      ENABLED                /dev/mapper/mpathab
RECO2                      ENABLED                /dev/mapper/mpathx
ASMCMD>


--How to remove an ASMFD label in ASMCMD
ASMCMD> afd_unlabel DATA4
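
Once the disks are labeled, the ASM discovery string is typically pointed at the AFD labels. A sketch of the parameter change, run from the ASM instance (assuming ASMFD is already loaded and an spfile is in use):

--Point the ASM discovery string at the AFD labels
SQL> alter system set asm_diskstring='AFD:*' scope=both;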

 

 


 

Installing Oracle Grid Infrastructure 12c R2

It has been an exciting week: Oracle 12c R2 came out and suddenly it was time to refresh the RAC test environments. My friend Jacques opted for an upgrade from 12.1.0.2 to 12.2.0.1 (here is the link to his blog post), while I started with a fresh installation, because I also upgraded the operating system to OEL 7.3.

Compared to 12c R1 there are new options in the installation process, but generally speaking the wizard is quite similar.

The first breakthrough is the simplified, image-based installation: there is no longer a runInstaller.sh to invoke. Instead, unpack the .zip file directly inside the Grid Infrastructure home of the first cluster node, as described below:

[grid@oel7node06 ~]$ mkdir -p /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ chown grid:oinstall /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ cd /u01/app/12.2.0.1/grid 
[grid@oel7node06 grid]$ unzip -q download_location/grid_home_image.zip

# From an X session invoke the Grid Infrastructure wizard: 
[grid@oel7node06 grid]$ ./gridSetup.sh

 

[Screenshot 01]

 

 

The second screenshot lists the new cluster topologies available in 12c R2:

  • Oracle Standalone Cluster
  • Oracle Cluster Domain
    • Oracle Domain Services Cluster
    • Oracle Member Clusters
      • Oracle Member Cluster for Oracle Database
      • Oracle Member Cluster for Applications

 

In my case I’m installing an Oracle Standalone Cluster.

[Screenshot 02]

 

 

[Screenshots 03-22: remaining installation wizard steps]

And now time for testing.

 

 

Linux for DBA: How to disable the ssh banner for a given user

Ready to install a new Oracle RAC cluster, but the ssh banner (defined in /etc/issue.net and protected by root privileges) breaks the non-interactive ssh commands issued by the grid and oracle users?

Here the trick to disable it:

--Add this empty file to the grid and oracle UNIX home
touch ~/.hushlogin 

--or
mkdir -p .ssh
chmod 700 .ssh 
echo "LogLevel quiet" > ~/.ssh/config

How to restore OCR and Voting disk

################################################################
# How to restore OCR and Voting disk  on Oracle 11g R2.
################################################################

--Location and status of OCR before starting the test:
 root@host1:/u01/GRID/11.2/cdata # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2744
 Available space (kbytes) :     259376
 ID                       :  401168391
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Check the existence of BACKUPS:
 root@host1:/root # /u01/GRID/11.2/bin/ocrconfig -showbackup
host1     2010/01/21 14:17:54     /u01/GRID/11.2/cdata/cluster01/backup00.ocr
host1     2010/01/21 05:58:31     /u01/GRID/11.2/cdata/cluster01/backup01.ocr
host1     2010/01/21 01:58:30     /u01/GRID/11.2/cdata/cluster01/backup02.ocr
host1     2010/01/20 05:58:21     /u01/GRID/11.2/cdata/cluster01/day.ocr
host1     2010/01/14 23:12:07     /u01/GRID/11.2/cdata/cluster01/week.ocr
 PROT-25: Manual backups for the Oracle Cluster Registry are not available
--Identify all the disks belonging to the disk group +OCRVOTING:
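 --(The query itself is not shown above; a likely form, run from the ASM instance, is:)
 SQL> select name, path from v$asm_disk where name like 'OCRVOTING%' order by name;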
NAME                                       PATH
 ------------------------------ ------------------------------------------------------------
 OCRVOTING_0000                 /dev/oracle/asm.25.lun
 OCRVOTING_0001                 /dev/oracle/asm.26.lun
 OCRVOTING_0002                 /dev/oracle/asm.27.lun
 OCRVOTING_0003                 /dev/oracle/asm.28.lun
 OCRVOTING_0004                 /dev/oracle/asm.29.lun
5 rows selected.
--Corrupt the disks belonging to the disk group +OCRVOTING:
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.25.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.26.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.27.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.28.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.29.lun bs=1024 count=1000
--OCR Check after Corruption:
 root@host1:/tmp # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2712
 Available space (kbytes) :     259408
 ID                       :  701409037
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Stop and Start of database instance after corruption
 oracle@host1:/u01/oracle/data $ srvctl stop instance -d DB -i DB1
 oracle@host1:/u01/oracle/data $ srvctl start instance -d DB -i DB1
--Stop and Start entire Cluster:
-host1:
 root@host1:/tmp # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'host1'
 CRS-2673: Attempting to stop 'ora.db.db' on 'host1'
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.host1.vip' on 'host1'
 CRS-2677: Stop of 'ora.host1.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.scan2.vip' on 'host1'
 CRS-2673: Attempting to stop 'ora.scan3.vip' on 'host1'
 CRS-2673: Attempting to stop 'ora.host2.vip' on 'host1'
 CRS-2677: Stop of 'ora.scan2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.scan3.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.host2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.db.db' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'host1'
 CRS-2673: Attempting to stop 'ora.FRA1.dg' on 'host1'
 CRS-2677: Stop of 'ora.DATA1.dg' on 'host1' succeeded
 CRS-2677: Stop of 'ora.FRA1.dg' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.ons' on 'host1'
 CRS-2673: Attempting to stop 'ora.eons' on 'host1'
 CRS-2677: Stop of 'ora.ons' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.net1.network' on 'host1'
 CRS-2677: Stop of 'ora.net1.network' on 'host1' succeeded
 CRS-2677: Stop of 'ora.eons' on 'host1' succeeded
 CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'host1' has completed
 CRS-2677: Stop of 'ora.crsd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host1'
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host1'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host1'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host1'
 CRS-2673: Attempting to stop 'ora.evmd' on 'host1'
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.evmd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host1'
 CRS-2677: Stop of 'ora.cssd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host1'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host1'
 CRS-2677: Stop of 'ora.gipcd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host1' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host1' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--host2:
 root@host2:/root # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host2'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host2'
 CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'host2'
 CRS-2673: Attempting to stop 'ora.db.db' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'host2'
 CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan2.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan2.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.scan2.vip' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan3.vip' on 'host2'
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.host2.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan3.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.scan3.vip' on 'host1'
 CRS-2677: Stop of 'ora.host2.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.host2.vip' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan1.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan1.vip' on 'host2' succeeded
 CRS-2676: Start of 'ora.scan2.vip' on 'host1' succeeded
 CRS-2676: Start of 'ora.scan3.vip' on 'host1' succeeded
 CRS-2676: Start of 'ora.host2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'host2' succeeded
 CRS-2677: Stop of 'ora.db.db' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'host2'
 CRS-2673: Attempting to stop 'ora.FRA1.dg' on 'host2'
 CRS-2677: Stop of 'ora.DATA1.dg' on 'host2' succeeded
 CRS-2677: Stop of 'ora.FRA1.dg' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.asm' on 'host2'
 CRS-2677: Stop of 'ora.asm' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.ons' on 'host2'
 CRS-2673: Attempting to stop 'ora.eons' on 'host2'
 CRS-2677: Stop of 'ora.ons' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.net1.network' on 'host2'
 CRS-2677: Stop of 'ora.net1.network' on 'host2' succeeded
 CRS-2677: Stop of 'ora.eons' on 'host2' succeeded
 CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'host2' has completed
 CRS-2677: Stop of 'ora.crsd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host2'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host2'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host2'
 CRS-2673: Attempting to stop 'ora.evmd' on 'host2'
 CRS-2673: Attempting to stop 'ora.asm' on 'host2'
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host2'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host2' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.evmd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host2' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host2'
 CRS-2677: Stop of 'ora.cssd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host2'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host2'
 CRS-2677: Stop of 'ora.gipcd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host2' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host2' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--host2
 root@host2:/u01/GRID/11.2/cdata/cluster01 # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--CRS Alert log: (Start failed because the Diskgroup is not available)
 2010-01-21 16:29:07.785
 [cssd(10123)]CRS-1705:Found 0 configured voting files but 1 voting files are required, terminating to ensure data integrity; details at (:CSSNM00065:) in /u01/GRID/11.2/log/host1/cssd/ocssd.log
 2010-01-21 16:29:07.785
 [cssd(10123)]CRS-1603:CSSD on node host1 shutdown by user.
 2010-01-21 16:29:07.918
 [ohasd(9931)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'host1'.
 2010-01-21 16:30:05.489
 [/u01/GRID/11.2/bin/orarootagent.bin(10113)]CRS-5818:Aborted command 'start for resource: ora.diskmon 1 1' for resource 'ora.diskmon'. Details at (:CRSAGF00113:) in /u01/GRID/11.2/log/host1/agent/ohasd/orarootagent_root/orarootagent_root.log.
 2010-01-21 16:30:09.504
 [ohasd(9931)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.diskmon'. Details at (:CRSPE00111:) in /u01/GRID/11.2/log/host1/ohasd/ohasd.log.
 2010-01-21 16:30:20.687
 [cssd(10622)]CRS-1713:CSSD daemon is started in clustered mode
 2010-01-21 16:30:21.801
 [cssd(10622)]CRS-1705:Found 0 configured voting files but 1 voting files are required, terminating to ensure data integrity; details at (:CSSNM00065:) in /u01/GRID/11.2/log/host1/cssd/ocssd.log
 2010-01-21 16:30:21.801
 [cssd(10622)]CRS-1603:CSSD on node host1 shutdown by user.
--On host1 stop CRS because, due to the Voting Disk unavailability, it is not running properly:
 root@host1:/tmp # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-4548: Unable to connect to CRSD
 CRS-2675: Stop of 'ora.crsd' on 'host1' failed
 CRS-2679: Attempting to clean 'ora.crsd' on 'host1'
 CRS-4548: Unable to connect to CRSD
 CRS-2678: 'ora.crsd' on 'host1' has experienced an unrecoverable failure
 CRS-0267: Human intervention required to resume its availability.
 CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'host1' has failed
 CRS-4687: Shutdown command has completed with error(s).
 CRS-4000: Command Stop failed, or completed with errors.
--Because not all the processes are stopping, disable the cluster auto start and reboot
 --the server to clean up all the pending processes.
root@host1:/tmp # /u01/GRID/11.2/bin/crsctl disable crs
 CRS-4621: Oracle High Availability Services autostart is disabled.
root@host1:/tmp # reboot
--Start the Cluster in EXCLUSIVE mode in order to recreate the ASM disk group:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs -excl
 CRS-4123: Oracle High Availability Services has been started.
 CRS-2672: Attempting to start 'ora.gipcd' on 'host1'
 CRS-2672: Attempting to start 'ora.mdnsd' on 'host1'
 CRS-2676: Start of 'ora.gipcd' on 'host1' succeeded
 CRS-2676: Start of 'ora.mdnsd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.gpnpd' on 'host1'
 CRS-2676: Start of 'ora.gpnpd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host1'
 CRS-2676: Start of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.cssd' on 'host1'
 CRS-2679: Attempting to clean 'ora.diskmon' on 'host1'
 CRS-2681: Clean of 'ora.diskmon' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.diskmon' on 'host1'
 CRS-2676: Start of 'ora.diskmon' on 'host1' succeeded
 CRS-2676: Start of 'ora.cssd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.ctssd' on 'host1'
 CRS-2676: Start of 'ora.ctssd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.asm' on 'host1'
 CRS-2676: Start of 'ora.asm' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.crsd' on 'host1'
 CRS-2676: Start of 'ora.crsd' on 'host1' succeeded
--Stop ASM and restart it using a pfile; example pfile content:
 *.asm_diskgroups='DATA1','FRA1'
 *.asm_diskstring='/dev/oracle/asm*'
 *.diagnostic_dest='/u01/oracle'
 +ASM1.instance_number=1
 +ASM2.instance_number=2
 *.instance_type='asm'
 *.large_pool_size=12M
 *.processes=500
 *.sga_max_size=1G
 *.sga_target=1G
 *.shared_pool_size=300M
--Recreate ASM Diskgroup
 --This command FAILS because asmca is not able to update the OCR:
 asmca -silent -createDiskGroup -diskGroupName OCRVOTING  -disk '/dev/oracle/asm.25.lun' -disk '/dev/oracle/asm.26.lun'  -disk '/dev/oracle/asm.27.lun'  -disk '/dev/oracle/asm.28.lun'  -disk '/dev/oracle/asm.29.lun'  -redundancy HIGH -compatible.asm '11.2.0.0.0'  -compatible.rdbms '11.2.0.0.0' -compatible.advm '11.2.0.0.0'
--Create the disk group using the SQL*Plus CREATE DISKGROUP command and save the ASM spfile inside it:
 create Diskgroup OCRVOTING high redundancy disk '/dev/oracle/asm.25.lun',
 '/dev/oracle/asm.26.lun', '/dev/oracle/asm.27.lun',
 '/dev/oracle/asm.28.lun', '/dev/oracle/asm.29.lun'
 ATTRIBUTE  'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
create spfile='+OCRVOTING' from pfile='/tmp/asm_pfile.ora';
File created.
SQL> shut immediate
 ASM diskgroups dismounted
 ASM instance shutdown
 SQL> startup
 ASM instance started
Total System Global Area 1069252608 bytes
 Fixed Size                  2154936 bytes
 Variable Size            1041931848 bytes
 ASM Cache                  25165824 bytes
 ASM diskgroups mounted
-- Restore OCR from backup:
 root@host1:/root # /u01/GRID/11.2/bin/ocrconfig -restore /u01/GRID/11.2/cdata/cluster01/backup00.ocr
--Check the OCR status after restore:
 root@host1:/root # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2712
 Available space (kbytes) :     259408
 ID                       :  701409037
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Restore the Voting Disk:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl replace votedisk +OCRVOTING
 Successful addition of voting disk 7s16f9fbf4b64f74bfy0ee8826f15eb4.
 Successful addition of voting disk 9k6af49d3cd54fc5bf28a2fc3899c8c6.
 Successful addition of voting disk 876eb99563924ff6bfc1defe6865deeb.
 Successful addition of voting disk 12230b5ef41f4fc2bf2cae957f765fb0.
 Successful addition of voting disk 47812b7f6p034f33bf13490e6e136b8b.
 Successfully replaced voting disk group with +OCRVOTING.
 CRS-4266: Voting file(s) successfully replaced
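--(Optionally verify the restored voting disks; a standard check, output omitted:)
 root@host1:/root # /u01/GRID/11.2/bin/crsctl query css votedisk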
--Re-enable CRS auto startup
 root@host1:/root # /u01/GRID/11.2/bin/crsctl enable crs
 CRS-4622: Oracle High Availability Services autostart is enabled.
--Stop CRS on host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-2677: Stop of 'ora.crsd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host1'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host1'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host1'
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host1'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host1'
 CRS-2677: Stop of 'ora.cssd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host1'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host1'
 CRS-2677: Stop of 'ora.gipcd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host1' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host1' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--Start CRS on host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--Start CRS on host2
 root@host2:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--Check if all the Resources are running:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl stat res -t
 --------------------------------------------------------------------------------
 NAME           TARGET  STATE        SERVER                   STATE_DETAILS
 --------------------------------------------------------------------------------
 Local Resources
 --------------------------------------------------------------------------------
 ora.DATA1.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.FRA1.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.LISTENER.lsnr
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.OCRVOTING.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.asm
 ONLINE  ONLINE       host1                 Started
 ONLINE  ONLINE       host2                 Started
 ora.eons
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.gsd
 OFFLINE OFFLINE      host1
 OFFLINE OFFLINE      host2
 ora.net1.network
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.ons
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 --------------------------------------------------------------------------------
 Cluster Resources
 --------------------------------------------------------------------------------
 ora.LISTENER_SCAN1.lsnr
 1        ONLINE  ONLINE       host1
 ora.LISTENER_SCAN2.lsnr
 1        ONLINE  ONLINE       host2
 ora.LISTENER_SCAN3.lsnr
 1        ONLINE  ONLINE       host2
 ora.db.db
 1        ONLINE  ONLINE       host1                 Open
 2        ONLINE  ONLINE       host2                 Open
 ora.oc4j
 1        OFFLINE OFFLINE
 ora.scan1.vip
 1        ONLINE  ONLINE       host1
 ora.scan2.vip
 1        ONLINE  ONLINE       host2
 ora.scan3.vip
 1        ONLINE  ONLINE       host2
 ora.host1.vip
 1        ONLINE  ONLINE       host1
 ora.host2.vip
 1        ONLINE  ONLINE       host2