Exadata Deployment with Elastic Configuration

Recently, for one of my customers, I had the chance to install a couple of Exadata X7-2 machines using the new Elastic Configuration. The major benefit of the Elastic Configuration is the possibility to acquire the Exadata machine in almost any combination of database nodes and storage cells.

In the past we were limited to the standard Oracle pre-defined Exadata configurations: Eighth Rack, Quarter Rack, Half Rack and Full Rack. Ordering one of these is still possible, but not flexible enough.

The pictures below highlight the differences between the two configurations:

[Figure: Exadata Classic vs Elastic configuration]

source: Oracle Data Sheet Exadata Database Machine X7-2

 

Deployment of the Exadata Elastic Configuration

The elastic configuration process automates the initial IP address allocation to database nodes and storage cells, regardless of the ordered configuration. The Exadata machine is connected to the InfiniBand switches using a standard cabling methodology, which allows the node's location in the rack to be determined. This information is then used, when the nodes are powered up for the first time, to assign the initial default IP addresses.

[root@exatest-iba0 ~]# ibhosts
Ca : 0x579b0123796ba0 ports 2 "node10 elasticNode 192.168.10.17,192.168.10.18 ETH0"
Ca : 0x579b01237966e0 ports 2 "node8 elasticNode 192.168.10.15,192.168.10.16 ETH0"
Ca : 0x579b0123844ab0 ports 2 "node6 elasticNode 192.168.10.11,192.168.10.12 ETH0"
Ca : 0x579b0123845e50 ports 2 "node5 elasticNode 192.168.10.7,192.168.10.8 ETH0"
Ca : 0x579b0123845fe0 ports 2 "node4 elasticNode 192.168.10.40,172.16.2.40 ETH0"
Ca : 0x579b0123845ea0 ports 2 "node3 elasticNode 192.168.10.9,192.168.10.10 ETH0"
Ca : 0x579b0123812b90 ports 2 "node2 elasticNode 192.168.10.1,192.168.10.2 ETH0"
Ca : 0x579b0123812970 ports 2 "node1 elasticNode 192.168.10.3,192.168.10.4 ETH0"
[root@exatest-iba0 ~]#

 

 

Because the Virtualization option was required, it had to be activated at this stage:

[root@node8 ~]# /opt/oracle.SupportTools/switch_to_ovm.sh
2019-03-07 01:05:22 -0800 [INFO] Switch to DOM0 system partition /dev/VGExaDb/LVDbSys3 (/dev/mapper/VGExaDb-LVDbSys3)
2019-03-07 01:05:22 -0800 [INFO] Active system device: /dev/mapper/VGExaDb-LVDbSys1
2019-03-07 01:05:22 -0800 [INFO] Active system device in boot area: /dev/mapper/VGExaDb-LVDbSys1
2019-03-07 01:05:23 -0800 [INFO] Set active system device to /dev/VGExaDb/LVDbSys3 in /boot/I_am_hd_boot
2019-03-07 01:05:23 -0800 [INFO] Creating /.elasticConfig on DOM0 boot partition /boot
2019-03-07 01:05:34 -0800 [INFO] Reboot has been initiated to switch to the DOM0 system partition
Connection to 192.168.1.8 closed by remote host.
Connection to 192.168.1.8 closed.

 

After switching to OVM, it is time to reclaim the space initially used by the bare metal Linux logical volumes:

[root@node8 ~]# /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim
Model is ORACLE SERVER X7-2
Number of LSI controllers: 1
Physical disks found: 4 (252:0 252:1 252:2 252:3)
Logical drives found: 1
Linux logical drive: 0
RAID Level for the Linux logical drive: 5
Physical disks in the Linux logical drive: 4 (252:0 252:1 252:2 252:3)
Dedicated Hot Spares for the Linux logical drive: 0
Global Hot Spares: 0
[INFO ] Check for DOM0 with inactive Linux system disk
[INFO ] Valid DOM0 with inactive Linux system disk is detected
[INFO ] Number of partitions on the system device /dev/sda: 3
[INFO ] Higher partition number on the system device /dev/sda: 3
[INFO ] Last sector on the system device /dev/sda: 3509760000
[INFO ] End sector of the last partition on the system device /dev/sda: 3509759966
[INFO ] Remove inactive system logical volume /dev/VGExaDb/LVDbSys1
[INFO ] Remove logical volume /dev/VGExaDb/LVDbOra1
[INFO ] Extend logical volume /dev/VGExaDb/LVDbExaVMImages
[INFO ] Resize ocfs2 on logical volume /dev/VGExaDb/LVDbExaVMImages
[INFO ] XEN boot version and rpm versions are in sync
[INFO ] XEN EFI files will not be updated
[INFO ] Force setup grub
[root@node8 ~]#

 

Check that the disk reclaim procedure completed successfully:

[root@node8 ~]# /opt/oracle.SupportTools/reclaimdisks.sh -check
Model is ORACLE SERVER X7-2
Number of LSI controllers: 1
Physical disks found: 4 (252:0 252:1 252:2 252:3)
Logical drives found: 1
Linux logical drive: 0
RAID Level for the Linux logical drive: 5
Physical disks in the Linux logical drive: 4 (252:0 252:1 252:2 252:3)
Dedicated Hot Spares for the Linux logical drive: 0
Global Hot Spares: 0
Valid. Disks configuration: RAID5 from 4 disks with no global and dedicated hot spare disks.
Valid. Booted: DOM0. Layout: DOM0.
[root@node8 ~]#

 

Upload the Oracle Exadata Database Machine Deployment Assistant (OEDA) configuration files and all software images to the database server, then run the OneCommand procedure.

List of all Steps

[root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -l
Initializing

1. Validate Configuration File
2. Update Nodes for Eighth Rack
3. Create Virtual Machine
4. Create Users
5. Setup Cell Connectivity
6. Calibrate Cells
7. Create Cell Disks
8. Create Grid Disks
9. Install Cluster Software
10. Initialize Cluster Software
11. Install Database Software
12. Relink Database with RDS
13. Create ASM Diskgroups
14. Create Databases
15. Apply Security Fixes
16. Install Exachk
17. Create Installation Summary
18. Resecure Machine
[root@exatestdbadm01 linux-x64]#

 

Run Step One to validate the setup

This example includes the creation of three different Clusters.

[root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -s 1
Initializing
Executing Validate Configuration File
Validating cluster: Cluster-EFU
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating cluster: Cluster-PR1
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating cluster: Cluster-VAL
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating platinum...
Validating switches...
Checking disk reclaim status...
Checking Disk Tests Status....
Completed validation...

SUCCESS: Ip address: 10.x8.xx.40 is configured correctly
SUCCESS: Ip address: 10.x9.xx.55 is configured correctly
SUCCESS: Ip address: 10.x8.xx.41 is configured correctly
SUCCESS: Ip address: 10.x9.xx.56 is configured correctly
SUCCESS: Ip address: 10.x8.xx.45 is configured correctly
SUCCESS: Ip address: 10.x8.xx.46 is configured correctly
SUCCESS: Ip address: 10.x8.xx.44 is configured correctly
SUCCESS: Ip address: 10.x8.xx.43 is configured correctly
SUCCESS: Ip address: 10.x8.xx.42 is configured correctly
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Ip address: 10.x8.xx.47 is configured correctly
SUCCESS: Ip address: 10.x9.xx.57 is configured correctly
SUCCESS: Ip address: 10.x8.xx.48 is configured correctly
SUCCESS: Ip address: 10.x9.xx.58 is configured correctly
SUCCESS: Ip address: 10.x8.xx.52 is configured correctly
SUCCESS: Ip address: 10.x8.xx.51 is configured correctly
SUCCESS: Ip address: 10.x8.xx.53 is configured correctly
SUCCESS: Ip address: 10.x8.xx.50 is configured correctly
SUCCESS: Ip address: 10.x8.xx.49 is configured correctly
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Ip address: 10.x8.xx.54 is configured correctly
SUCCESS: Ip address: 10.x9.xx.59 is configured correctly
SUCCESS: Ip address: 10.x8.xx.55 is configured correctly
SUCCESS: Ip address: 10.x9.xx.60 is configured correctly
SUCCESS: Ip address: 10.x8.xx.58 is configured correctly
SUCCESS: Ip address: 10.x8.xx.60 is configured correctly
SUCCESS: Ip address: 10.x8.xx.59 is configured correctly
SUCCESS: Ip address: 10.x8.xx.57 is configured correctly
SUCCESS: Ip address: 10.x8.xx.56 is configured correctly
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Validated NTP server 10.x3.xx.xx0
SUCCESS: Validated NTP server 10.x3.xx.xx1
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28514222_122118_Linux-x86-64.zip exists...
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28762988_12201181016GIOCT2018RU_Linux-x86-64.zip exists...
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28762989_12201181016DBOCT2018RU_Linux-x86-64.zip exists...
SUCCESS: Required file config/exachk.zip exists...
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm03.my.domain.com
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm02.my.domain.com
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm03.my.domain.com
SUCCESS:
SUCCESS:
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm02.my.domain.com
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm01.my.domain.com
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm01.my.domain.com
SUCCESS:
SUCCESS: X7 compute node exatestdbadm01.my.domain.com has updated Broadcom firmware
SUCCESS: X7 compute node exatestdbadm02.my.domain.com has updated Broadcom firmware
SUCCESS: Disk Tests are not running/active on any of the Storage Servers.
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Disk size 10000GB on cell exatestceladm01.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm02.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm03.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm04.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm05.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm06.my.domain.com matches the value specified in the OEDA configuration file
Successfully completed execution of step Validate Configuration File [elapsed Time [Elapsed = 250301 mS [4.0 minutes] Thu Mar 07 12:35:31 CET 2019]]
[root@exatestdbadm01 linux-x64]#

 

 

Execution of all remaining steps

Then, because we felt confident, we decided to invoke all remaining steps together:

[root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -r 1-18
...
..

 

The final result is an Exadata machine with six Oracle VMs and three Grid Infrastructure clusters, each running a test RAC database.

 

 

Grid Management DB filling up ASM disk space

Recently I discovered that, on Oracle Grid Infrastructure 12cR2, the ASM disk group hosting the Management DB (-MGMTDB) was filling up very quickly.

This is due to a bug in the oclumon data purge procedure.

To fix the problem, two options are available:

  1. Recreate the Management DB
  2. Manually truncate the tables that are not purged and shrink the tablespace

 

The two options are described below.

 

Option 1 – Recreate the Management DB

As root user on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

As the grid user, from the node hosting the Management Database instance, run the following commands:

$ /u01/app/12.2.0.1/grid/bin/srvctl status mgmtdb
$ /u01/app/12.2.0.1/grid/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB
Connecting to database
4% complete
9% complete
14% complete
19% complete
23% complete
28% complete
47% complete
Updating network configuration files
48% complete
52% complete
Deleting instance and datafiles
76% complete
100% complete

 

How to recreate the MGMTDB:

$ /u01/app/12.2.0.1/grid/bin/dbca -silent -createDatabase -createAsContainerDatabase true \
 -templateName MGMTSeed_Database.dbc \
 -sid -MGMTDB \
 -gdbName _mgmtdb \
 -storageType ASM \
 -diskGroupName GIMR \
 -datafileJarLocation <GI HOME>/assistants/dbca/templates \
 -characterset AL32UTF8 \
 -autoGeneratePassword \
 -skipUserTemplateCheck

 

Create the GIMR pluggable database:

$ /u01/app/12.2.0.1/grid/bin/mgmtca -local
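
Once the Management DB has been recreated, the ora.crf resource that was stopped and disabled at the beginning can presumably be re-enabled and restarted. A minimal sketch, simply mirroring the stop/disable commands above, to be run as root on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init
# /u01/app/12.2.0.1/grid/bin/crsctl start res ora.crf -init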

 


 

Option 2 – Manually truncate the tables

 

As the root user, stop and disable the ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

Connect to MGMTDB and identify the segments to truncate:

export ORACLE_SID=-MGMTDB
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL> select pdb_name from dba_pdbs where pdb_name!='PDB$SEED';

SQL> alter session set container=GIMR_DSCREP_10;

Session altered.

SQL> col obj format a50
SQL> select owner||'.'||SEGMENT_NAME obj, BYTES from dba_segments where owner='CHM' order by 2 asc;

 

Most likely these two tables are much bigger than the rest:

  • CHM.CHMOS_PROCESS_INT_TBL
  • CHM.CHMOS_DEVICE_INT_TBL

Truncate the tables:

SQL> truncate table CHM.CHMOS_PROCESS_INT_TBL;
SQL> truncate table CHM.CHMOS_DEVICE_INT_TBL;

 

Then, if needed, shrink the tablespace (a sketch is shown below) and the job is done!
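
A minimal sketch of the shrink step: the tablespace name SYSMGMTDATA is an assumption to be verified with the dba_segments query above, and the datafile path and target size are purely illustrative.

SQL> select file_name, bytes/1024/1024 MB from dba_data_files where tablespace_name='SYSMGMTDATA';

SQL> alter database datafile '+GIMR/_MGMTDB/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/sysmgmtdata.xxx.xxxxxxx' resize 2048M;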

 


 

Exadata Storage Snapshots

This post describes how to implement Oracle Database Snapshot Technology on Exadata Machine.

Because the Exadata Storage Cell smart features, Storage Indexes, IORM and Network Resource Manager work only at the level of the ASM volume manager (and not on top of the ACFS cluster file system), the implementation of the snapshot technology is different compared to any non-Exadata environment.

For this purpose Oracle has developed a new type of ASM disk group called SPARSE Disk Group. It uses ASM SPARSE grid disks based on thin provisioning to store the database snapshot copies and the associated metadata, and it supports both non-CDB and PDB snapshot copies.

The implementation requires the following minimum software versions:

  • Exadata Storage Software version 12.1.2.1.0.
  • Oracle Database version 12.1.0.2 with bundle patch 5.
One major restriction applies to Exadata Storage Snapshots compared to ACFS:
the source database must be a shared copy, open in read only, called the Test Master. The Test Master Database cannot be modified or deleted as long as the latest child snapshot is in use.
This restriction exists because the Exadata Snapshot technology uses "allocate on first write", not "copy on write" (as ACFS does), and the snapshot is per-database-datafile.
When a child snapshot issues a write, the write goes to a private copy of that block inside the snapshot, preserving the original block value, which can still be accessed by the other child snapshots of the same Test Master.

How to Implement Exadata Storage Snapshots in a PDB Environment

Check the celldisks for available free space to allocate to a new SPARSE Disk Group

[root@strgceladm01 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm01 853.34375G
 CD_01_strgceladm01 853.34375G
 CD_02_strgceladm01 853.34375G
 CD_03_strgceladm01 853.34375G
 CD_04_strgceladm01 853.34375G
 CD_05_strgceladm01 853.34375G
 CD_06_strgceladm01 853.34375G
 CD_07_strgceladm01 853.34375G
 CD_08_strgceladm01 853.34375G
 CD_09_strgceladm01 853.34375G
 CD_10_strgceladm01 853.34375G
 CD_11_strgceladm01 853.34375G
 FD_00_strgceladm01 0
 FD_01_strgceladm01 0
 FD_02_strgceladm01 0
 FD_03_strgceladm01 0
[root@strgceladm01 ~]#


[root@strgceladm02 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm02 853.34375G
 CD_01_strgceladm02 853.34375G
 CD_02_strgceladm02 853.34375G
 CD_03_strgceladm02 853.34375G
 CD_04_strgceladm02 853.34375G
 CD_05_strgceladm02 853.34375G
 CD_06_strgceladm02 853.34375G
 CD_07_strgceladm02 853.34375G
 CD_08_strgceladm02 853.34375G
 CD_09_strgceladm02 853.34375G
 CD_10_strgceladm02 853.34375G
 CD_11_strgceladm02 853.34375G
 FD_00_strgceladm02 0
 FD_01_strgceladm02 0
 FD_02_strgceladm02 0
 FD_03_strgceladm02 0
[root@strgceladm02 ~]#


[root@strgceladm03 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm03 853.34375G
 CD_01_strgceladm03 853.34375G
 CD_02_strgceladm03 853.34375G
 CD_03_strgceladm03 853.34375G
 CD_04_strgceladm03 853.34375G
 CD_05_strgceladm03 853.34375G
 CD_06_strgceladm03 853.34375G
 CD_07_strgceladm03 853.34375G
 CD_08_strgceladm03 853.34375G
 CD_09_strgceladm03 853.34375G
 CD_10_strgceladm03 853.34375G
 CD_11_strgceladm03 853.34375G
 FD_00_strgceladm03 0
 FD_01_strgceladm03 0
 FD_02_strgceladm03 0
 FD_03_strgceladm03 0
[root@strgceladm03 ~]#

On each storage cell, create the SPARSE grid disks as described below

[root@strgceladm01 ~]# cellcli -e CREATE GRIDDISK ALL PREFIX=SPARSE, sparse=true, SIZE=853.34375G
Cell disks were skipped because they had no freespace for grid disks: FD_00_strgceladm01, FD_01_strgceladm01, FD_02_strgceladm01, FD_03_strgceladm01.
GridDisk SPARSE_CD_00_strgceladm01 successfully created
GridDisk SPARSE_CD_01_strgceladm01 successfully created
GridDisk SPARSE_CD_02_strgceladm01 successfully created
GridDisk SPARSE_CD_03_strgceladm01 successfully created
GridDisk SPARSE_CD_04_strgceladm01 successfully created
GridDisk SPARSE_CD_05_strgceladm01 successfully created
GridDisk SPARSE_CD_06_strgceladm01 successfully created
GridDisk SPARSE_CD_07_strgceladm01 successfully created
GridDisk SPARSE_CD_08_strgceladm01 successfully created
GridDisk SPARSE_CD_09_strgceladm01 successfully created
GridDisk SPARSE_CD_10_strgceladm01 successfully created
GridDisk SPARSE_CD_11_strgceladm01 successfully created
[root@strgceladm01 ~]#
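
Rather than connecting to each cell, the same command can be run on all storage cells at once with the dcli utility; a minimal sketch, assuming a cell_group file listing the cell hostnames exists under /root:

dcli -g /root/cell_group -l root "cellcli -e CREATE GRIDDISK ALL PREFIX=SPARSE, sparse=true, SIZE=853.34375G"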

On each storage cell, list all grid disks

[root@strgceladm01 ~]# cellcli -e list griddisk attributes name,size
 DATAC1_CD_00_strgceladm01 6.294586181640625T
 DATAC1_CD_01_strgceladm01 6.294586181640625T
 DATAC1_CD_02_strgceladm01 6.294586181640625T
 DATAC1_CD_03_strgceladm01 6.294586181640625T
 DATAC1_CD_04_strgceladm01 6.294586181640625T
 DATAC1_CD_05_strgceladm01 6.294586181640625T
 DATAC1_CD_06_strgceladm01 6.294586181640625T
 DATAC1_CD_07_strgceladm01 6.294586181640625T
 DATAC1_CD_08_strgceladm01 6.294586181640625T
 DATAC1_CD_09_strgceladm01 6.294586181640625T
 DATAC1_CD_10_strgceladm01 6.294586181640625T
 DATAC1_CD_11_strgceladm01 6.294586181640625T
 FGRID_FD_00_strgceladm01 2.0717315673828125T
 FGRID_FD_01_strgceladm01 2.0717315673828125T
 FGRID_FD_02_strgceladm01 2.0717315673828125T
 FGRID_FD_03_strgceladm01 2.0717315673828125T
 RECOC1_CD_00_strgceladm01 1.78143310546875T
 RECOC1_CD_01_strgceladm01 1.78143310546875T
 RECOC1_CD_02_strgceladm01 1.78143310546875T
 RECOC1_CD_03_strgceladm01 1.78143310546875T
 RECOC1_CD_04_strgceladm01 1.78143310546875T
 RECOC1_CD_05_strgceladm01 1.78143310546875T
 RECOC1_CD_06_strgceladm01 1.78143310546875T
 RECOC1_CD_07_strgceladm01 1.78143310546875T
 RECOC1_CD_08_strgceladm01 1.78143310546875T
 RECOC1_CD_09_strgceladm01 1.78143310546875T
 RECOC1_CD_10_strgceladm01 1.78143310546875T
 RECOC1_CD_11_strgceladm01 1.78143310546875T
 SPARSE_CD_00_strgceladm01 853.34375G
 SPARSE_CD_01_strgceladm01 853.34375G
 SPARSE_CD_02_strgceladm01 853.34375G
 SPARSE_CD_03_strgceladm01 853.34375G
 SPARSE_CD_04_strgceladm01 853.34375G
 SPARSE_CD_05_strgceladm01 853.34375G
 SPARSE_CD_06_strgceladm01 853.34375G
 SPARSE_CD_07_strgceladm01 853.34375G
 SPARSE_CD_08_strgceladm01 853.34375G
 SPARSE_CD_09_strgceladm01 853.34375G
 SPARSE_CD_10_strgceladm01 853.34375G
 SPARSE_CD_11_strgceladm01 853.34375G
[root@strgceladm01 ~]#

From the ASM instance, create a SPARSE disk group

SQL> CREATE DISKGROUP SPARSEC1 EXTERNAL REDUNDANCY DISK 'o/*/SPARSE_CD_*'
ATTRIBUTE
'compatible.asm' = '12.2.0.1',
'compatible.rdbms' = '12.2.0.1',
'cell.smart_scan_capable'='TRUE',
'cell.sparse_dg' = 'allsparse',
'AU_SIZE' = '4M';

Diskgroup created.

Set the following ASM attributes on the Disk Group hosting the Test Master Database

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'access_control.enabled' = 'true';

Grant access to the OS RDBMS user used to access the disk group

ALTER DISKGROUP DATAC1 ADD USER 'oracle';

From the ASM instance, set the ownership permissions for every file that belongs solely to the PDB being snapshot cloned, as per the example below

alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/system.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/sysaux.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/users.xxx.xxxxxxx';
...
..
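
Typing one ALTER DISKGROUP statement per datafile is error prone, so the statements can be generated instead. A minimal sketch, run while connected to the Test Master PDB, assuming the files live in DATAC1 and the OS owner is oracle (spool the output and execute it in the ASM instance):

SQL> select 'alter diskgroup DATAC1 set ownership owner=''oracle'' for file '''||name||''';' from v$datafile;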

Restart the Test Master PDB in read only

alter pluggable database PDBTESTMASTER close immediate instances=all;
alter pluggable database PDBTESTMASTER open read only;

Create the first PDB Snapshot Copy on Exadata SPARSE Disk Group

Create pluggable database PDBDEV01 from PDBTESTMASTER tempfile reuse create_file_dest='+SPARSEC1' snapshot copy;

Feedback on the Exadata Storage Snapshots

The ability to create storage-efficient database copies in a few seconds, regardless of the size of the Test Master, is very useful for today's IT departments; but such extreme velocity and flexibility is not entirely free. In fact, performance tests on an I/O-bound workload have highlighted significant performance degradation. This reminds us that, as defined by Oracle Corporation, the snapshot technology included in the Exadata machine remains a non-production option.

Feedback on the Modern Consolidated Database Environment

 

Since the launch of Oracle 12c R1 Beta Program (August 2012) at Trivadis, we have been intensively testing, engineering and implementing Multitenant architectures for our customers.

Today, we can provide our feedback and that of our customers!

The overall feedback related to Oracle Multitenant is very positive: customers have been able to increase flexibility and automation, improving the efficiency of the software development life cycle.

Even the Single-tenant configuration (free of charge) brings a few advantages compared to the non-CDB architecture. Therefore, from a technology point of view, I recommend adopting the Container Database (CDB) architecture for all Oracle databases.

 

Examples of Multitenant architectures implemented

Oracle Multitenant is a technological revolution in the space of relational databases; when combined with other 12c features it becomes a game changer for flexibility, automation and velocity.

Here are a few examples of successful architectures implemented with our customers, using the Oracle Container Database (CDB):

 

  • Database consolidation without performance and stability compromise here.

 

  • Multitenant and DevOps here.

 

  • Operating Database Disaster Recovery in Multitenant environment here.

 

 


 

RHEL 7.4 fails to mount ACFS File System due to KMOD package

After a fresh OS installation or an upgrade to RHEL 7.4, any attempt to install ACFS drivers will fail with the following message: “ACFS-9459 ADVM/ACFS is not supported on this OS version”

The error persists even if the Oracle Grid Infrastructure software includes Patch 26247490: 12.2 ACFS MODULE ERRORS & CRASH DURING MODULE LOAD & UNLOAD WITH OL7U4 RHCK.

 

This problem has been identified by Oracle with BUG 26320387 – 7.4 kmod weak-modules not checking kABI compatibility correctly,

and by Red Hat with Bugzilla bug 1477073 – 7.4 kmod weak-modules –dry-run changed output format missing ‘is compatible’ messages.

root@oel7node06:/u01/app/12.2.0.1/grid/crs/install# /u01/app/12.2.0.1/grid/bin/acfsroot install
ACFS-9459: ADVM/ACFS is not supported on this OS version: '3.10.0-514.6.1.el7.x86_64'

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleadvm 776830 7
oracleoks 654476 1 oracleadvm
oracleafd 205543 1

 

The current workaround consists in downgrading the kmod RPM to version kmod-20-9.el7.x86_64.

root@oel7node06:~# yum downgrade kmod-20-9.el7
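
To avoid the downgraded package being silently replaced by a later yum update, one possible precaution (a sketch, not part of the official workaround) is to exclude the kmod package from updates until the bug is fixed:

root@oel7node06:~# echo "exclude=kmod*" >> /etc/yum.conf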

 

After the package downgrade, the ACFS drivers are correctly loaded:

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleacfs 4597925 2
oracleadvm 776830 8
oracleoks 654476 2 oracleacfs,oracleadvm
oracleafd 205543 1

 


 

 

 

ASM Filter Driver (ASMFD)

 

ASM Filter Driver is a Linux kernel module introduced in 12c R1. It resides in the I/O path of the Oracle ASM disks providing the following features:

  • Rejecting all non-Oracle I/O write requests to ASM Disks.
  • Device name persistency.
  • Node level fencing without reboot.

 

In 12c R2 ASMFD can be enabled from the GUI interface of the Grid Infrastructure installation, as shown on this post GI 12c R2 Installation at the step #8 “Create ASM Disk Group”.

Once ASM Filter Driver is in use, similarly to ASMLib the disks are managed using the ASMFD Label Name.

 

Here are a few examples of how to work with the ASM Filter Driver.

--How to create an ASMFD label in SQL*Plus
SQL> Alter system label set 'DATA1' to '/dev/mapper/mpathak';

System altered.


--How to create an ASM Disk Group with ASMFD
CREATE DISKGROUP DATA_DG EXTERNAL REDUNDANCY DISK 'AFD:DATA1' SIZE 30720M
ATTRIBUTE 'SECTOR_SIZE'='512','LOGICAL_SECTOR_SIZE'='512','compatible.asm'='12.2.0.1',
'compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M';

Diskgroup created.

 

ASM Filter Driver can also be managed from the ASM command line utility ASMCMD

--Check ASMFD status
ASMCMD> afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on host 'oel7node06.localdomain'


--List ASM Disks where ASMFD is enabled
ASMCMD> afd_lsdsk
--------------------------------------------------------------------------------
Label                    Filtering                Path
================================================================================
DATA1                      ENABLED                /dev/mapper/mpathak
DATA2                      ENABLED                /dev/mapper/mpathan
DATA3                      ENABLED                /dev/mapper/mpathw
DATA4                      ENABLED                /dev/mapper/mpathac
GIMR1                      ENABLED                /dev/mapper/mpatham
GIMR2                      ENABLED                /dev/mapper/mpathaj
GIMR3                      ENABLED                /dev/mapper/mpathal
GIMR4                      ENABLED                /dev/mapper/mpathaf
GIMR5                      ENABLED                /dev/mapper/mpathai
RECO3                      ENABLED                /dev/mapper/mpathy
RECO1                      ENABLED                /dev/mapper/mpathab
RECO2                      ENABLED                /dev/mapper/mpathx
ASMCMD>


--How to remove an ASMFD label in ASMCMD
ASMCMD> afd_unlabel DATA4

 

 


 

Installing Oracle Grid Infrastructure 12c R2

It has been an exciting week: Oracle 12c R2 came out and it was suddenly time to refresh the RAC test environments. My friend Jacques opted for an upgrade from 12.1.0.2 to 12.2.0.1 (here the link to his blog post), while I started with a fresh installation, because I also upgraded the operating system to OEL 7.3.

Compared to 12c R1 there are new options in the installation process, but generally speaking the wizard is quite similar.

The first breakthrough is the simplified, image-based installation: no more runInstaller.sh to invoke, but …

Unpack the .Zip file directly inside the Grid Infrastructure Home of the first cluster node as described below:

[grid@oel7node06 ~]$ mkdir -p /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ chown grid:oinstall /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ cd /u01/app/12.2.0.1/grid 
[grid@oel7node06 grid]$ unzip -q download_location/grid_home_image.zip

# From an X session invoke the Grid Infrastructure wizard: 
[grid@oel7node06 grid]$ ./gridSetup.sh

 

[Screenshot 01]

 

 

The second screenshot lists the new cluster topologies available in 12c R2:

  • Oracle Standalone Cluster
  • Oracle Cluster Domain
    • Oracle Domain Services Cluster
    • Oracle Member Clusters
      • Oracle Member Cluster for Oracle Database
      • Oracle Member Cluster for Applications

 

In my case I’m installing an Oracle Standalone Cluster

[Screenshots 02-22: the remaining steps of the Grid Infrastructure installation wizard]

And now it's time for testing.

 

 

New Oracle version (12.2.0.1), old BUG!

 

In June 2016 I posted the following BUG: Bug on Oracle 12c Multitenant & PDB Clone as Snapshot Copy, promising to post an update once version 12cR2 became available, because in the service request, originally opened against version 12.1.0.2, Oracle stated that the bug would be fixed in 12cR2.

I was so impatient that, just a few hours after the general availability of Oracle Database 12c Release 2, I created a new cluster and tested the resolution.

 

For the record, the resolution of this bug is important for one of my clients, where we have implemented the PDB snapshot copy in the application development lifecycle.

 

So let’s see if the bug has been fixed!

SQL*Plus: Release 12.2.0.1.0 Production on Wed Mar 1 21:06:54 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production


SQL> CREATE PLUGGABLE DATABASE PDBACFS1_SNAP1 from PDBACFS1 SNAPSHOT COPY;

Pluggable database created.

SQL> ALTER PLUGGABLE DATABASE PDBACFS1_SNAP1 OPEN instances=all;

Pluggable database altered.

SQL> select CON_ID, NAME, OPEN_MODE, SNAPSHOT_PARENT_CON_ID from v$pdbs where NAME in ('PDBACFS1','PDBACFS1_SNAP1');

 CON_ID     NAME               OPEN_MODE  SNAPSHOT_PARENT_CON_ID
---------- ------------------- ---------- ----------------------
 5          PDBACFS1            READ WRITE
 6          PDBACFS1_SNAP1      READ WRITE               <-- This should be 5 but is NULL

2 rows selected.

 

From a certain point of view progress has been made: in version 12.1.0.2 the column SNAPSHOT_PARENT_CON_ID was always zero (0), now it is null!

I'm sorry for my customer; I'll keep testing and hoping …

 

 

Oracle DB stored on ASM vs ACFS

Nowadays a new Oracle database environment with Grid Infrastructure has three main storage options:

  1. Third party clustered file system
  2. ASM Disk Groups
  3. ACFS File System

While the first option was not in scope, this post compares the results of the tests between ASM and ACFS, highlighting when to use one or the other to store 12c non-CDB or CDB databases.

The tests, conducted on different environments using Oracle version 12.1.0.2 July PSU, have shown controversial results compared to what Oracle is promoting for the Oracle Database Appliance (ODA) in the following paper: “Frequently Asked Questions Storing Database Files in ACFS on Oracle Database Appliance”.

 

Outcome of the tests

ASM remains the preferred option to achieve the best I/O performance, while ACFS introduces interesting features like DB snapshots to quickly and space-efficiently provision new databases.

The performance gap between the two solutions is not negligible, as reported below by the AWR Top Timed Events sections of two PDBs sharing the same infrastructure and executing the same workload, but using ASM and ACFS storage respectively:

  • PDBASM: Pluggable Database stored on an ASM disk group
  • PDBACFS: Pluggable Database stored on an ACFS file system

 

 

PDBASM AWR – TOP Timed Events and Other Stats

[AWR screenshots: Top Timed Events and other statistics for PDBASM]

 

 

PDBACFS AWR – TOP Timed Events and Other Stats

[AWR screenshots: Top Timed Events and other statistics for PDBACFS]

 

Due to the different characteristics and results when ASM or ACFS is in use, it is not possible to give a generic recommendation. Rather, case by case, the choice should be driven by business needs such as maximum performance versus fast and space-efficient database cloning.

 

 

 

 

Severe Oracle instability due to new RedHat 7.2 feature which releases IPC objects

I have recently installed a two-node RAC version 12.1.0.2 on top of RedHat 7.2, and a few hours after the initial setup I started experiencing ASM and database crashes.

Checking in the alert log I found the following errors:

Tue Oct 04 05:25:17 2016
Dumping diagnostic data in directory=[cdmp_20161004052517], requested by (instance=1, osid=84872 (MMAN)), summary=[abnormal instance termination].
Tue Oct 04 05:25:18 2016
Instance terminated by USER, pid = 84872
Tue Oct 04 05:25:18 2016
Errors in file /oams/base/diag/rdbms/txdop/txdop1/trace/txdop1_mman_84872.trc:
ORA-27300: OS system dependent operation:semctl failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwrm1
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1

 

The errors pointed to the OS, and in particular to the possibility that semaphores in use by Oracle had been removed.

Because this was a fresh installation and I was the only person using the cluster, it was easy to exclude any third-party activity. I then double-checked the kernel parameters and all other system prerequisites without finding any wrong configuration.

Finally, on MOS I found the following note: “ALERT: Setting RemoveIPC=yes on Redhat 7.2 Crashes ASM and Database Instances as Well as Any Application That Uses a Shared Memory Segment (SHM) or Semaphores (SEM) (Doc ID 2081410.1)”

In RedHat 7.2 the systemd-logind service introduced a new feature that removes all IPC objects when a user fully logs out.
The feature is controlled by the option RemoveIPC in the /etc/systemd/logind.conf configuration file; see man logind.conf(5) for details.

The default value for RemoveIPC in RHEL7.2 is yes.

As a result, when the last oracle or grid user disconnects, the OS removes shared memory segments and semaphores for those users.
As Oracle ASM and Databases use shared memory segments for SGA, removing shared memory segments will crash the Oracle ASM and database instances.
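
The fix documented in the MOS note is to disable the feature on every cluster node; a minimal sketch of the change (the exact steps should be validated against the note itself):

# In /etc/systemd/logind.conf set:
RemoveIPC=no

# Then reload the logind service (or reboot the node):
systemctl restart systemd-logind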

 

New to Oracle Multitenant?

Multitenant is the biggest architectural change of Oracle 12c and the enabler of many new database options in the years to come. Therefore I have decided to write, over time, a few blog posts with basic examples of what should and should not be done in a multitenant database environment.

 

Rule #1   – What should not be done

If you are a CDB DBA, always pay attention to which container you are connected to and remember that application data should be stored on Application PDB only!

Unfortunately this golden rule is not enforced by the RDBMS, but is left to our responsibility, as shown in the example below:

oracle@lxoel7n01:~/ [CDB_TEST] sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Sep 21 18:28:23 2016

Copyright (c) 1982, 2014, Oracle. All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

CDB$ROOT SQL>
CDB$ROOT SQL> show user
USER is "SYS"
CDB$ROOT SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

 

Once connected to the ROOT container, let's see if I can mistakenly create an application table:

CDB$ROOT SQL> CREATE TABLE EMP_1
(emp_id NUMBER,
emp_name VARCHAR2(25),
start_date DATE,
emp_status VARCHAR2(10) DEFAULT 'ACTIVE',
resume CLOB); 2 3 4 5 6

Table created.

CDB$ROOT SQL> desc emp_1
 Name                                Null?    Type
 ----------------------------------- -------- ----------------------------
 EMP_ID                                        NUMBER
 EMP_NAME                                      VARCHAR2(25)
 START_DATE                                    DATE
 EMP_STATUS                                    VARCHAR2(10)
 RESUME                                        CLOB


CDB$ROOT SQL> insert into emp_1 values (1, 'Emiliano', sysdate, 'active', ' ');

1 row created.

CDB$ROOT SQL> commit;

Commit complete.


CDB$ROOT SQL> select * from emp_1;

EMP_ID     EMP_NAME                  START_DAT EMP_STATUS RESUME
---------- ------------------------- --------- ---------- ----------------
 1          Emiliano                  21-SEP-16 active

CDB$ROOT SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

The answer is “YES” and the consequences can be devastating…

 

Rule #2   – Overview of Local and Common Entities

Non-schema entities can be created as local or common. Local entities exist only in one PDB, similar to a non-CDB architecture, while common entities exist in every current and future container.

List of possible Local / Common entities in a Multitenant database:

  • Users
  • Roles
  • Profiles
  • Audit Policies

All Local entities are created from the local PDB and all Common entities are created from the CDB$ROOT container.

Common-user-defined Users, Roles and Profiles require a standard database prefix, defined by the spfile parameter COMMON_USER_PREFIX:

SQL> show parameter common_user_prefix

NAME                              TYPE        VALUE
--------------------------------- ----------- -----------------
common_user_prefix                string      C##

 

Example of Common User creation:

SQL> CREATE USER C##CDB_DBA1 IDENTIFIED BY PWD CONTAINER=ALL;

User created.


SQL> SELECT con_id, username, user_id, common

  2  FROM cdb_users where username='C##CDB_DBA1'  ORDER BY con_id;

    CON_ID USERNAME                USER_ID COMMON
---------- -------------------- ---------- ------
         1 C##CDB_DBA1               102    YES
         2 C##CDB_DBA1               101    YES
         3 C##CDB_DBA1               107    YES
         4 C##CDB_DBA1               105    YES
         5 C##CDB_DBA1               109    YES
         ...

 

Example of Local user creation:

SQL> show con_name

CON_NAME
------------------------------
MYPDB

SQL> CREATE USER application IDENTIFIED BY pwd CONTAINER=CURRENT;

User created.

If we try to create a Local User from the CDB$ROOT container the following error occurs: ORA-65049: creation of local user or role is not allowed in CDB$ROOT

SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

SQL> CREATE USER application IDENTIFIED BY pwd   CONTAINER=CURRENT;

CREATE USER application IDENTIFIED BY pwd   CONTAINER=CURRENT

                                      *

ERROR at line 1:
ORA-65049: creation of local user or role is not allowed in CDB$ROOT

 

 

Rule #3  – Application should connect through user-defined database services only

We have been avoiding the creation of user-defined database services for many years, sometimes even for RAC databases. But in the Multitenant or Single-tenant architecture the importance of user-defined database services is even greater. For each CDB and PDB Oracle still automatically creates a default service, but, as in the past, the default services should never be exposed to the applications.

 

To create a user-defined database service in a stand-alone environment, use the package DBMS_SERVICE while connected to the corresponding PDB:

BEGIN
 DBMS_SERVICE.CREATE_SERVICE(
     SERVICE_NAME     => 'mypdb_app.emilianofusaglia.net',
     NETWORK_NAME     => 'mypdb_app.emilianofusaglia.net',
     FAILOVER_METHOD  =>
     ...
      );
 DBMS_SERVICE.START_SERVICE('mypdb_app.emilianofusaglia.net');
END;
/

The database services will not start automatically after opening a PDB! Create a database trigger for this purpose; a sketch is shown below.
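
A minimal sketch of such a trigger, created while connected to the PDB; the service name reuses the example above, and the behaviour should be verified on the target release (in 12c a startup trigger defined inside a PDB fires when that PDB is opened):

CREATE OR REPLACE TRIGGER start_pdb_services
  AFTER STARTUP ON DATABASE
BEGIN
  DBMS_SERVICE.START_SERVICE('mypdb_app.emilianofusaglia.net');
END;
/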

 

To create a user-defined database service in a clustered environment, use the srvctl utility from the corresponding RDBMS ORACLE_HOME:

oracle@oel7n01:~/ [EFU1] srvctl add service -db EFU \
> -pdb MYPDB -service mypdb_app.emilianofusaglia.net \
> -failovertype SELECT -failovermethod BASIC \
> -failoverdelay 2 -failoverretry 90

 

List all CDB database services ordered by container ID:

SQL> SELECT con_id, name, pdb FROM v$services ORDER BY con_id;

    CON_ID NAME                                     PDB
---------- --------------------------------------- -----------------

         1 EFUXDB                                   CDB$ROOT   <-- CDB Default Service 
         1 SYS$BACKGROUND                           CDB$ROOT   <-- CDB Default Service 
         1 SYS$USERS                                CDB$ROOT   <-- CDB Default Service 
         1 EFU.emilianofusaglia.net                 CDB$ROOT   <-- CDB Default Service 
         1 EFU_ADMIN.emilianofusaglia.net           CDB$ROOT   <-- CDB User-defined Service  
         3 mypdb.emilianofusaglia.net               MYPDB      <-- PDB Default Service 
         3 mypdb_app.emilianofusaglia.net           MYPDB      <-- PDB User-defined Service  

7 rows selected.

 

EZCONNECT to a PDB using the user-defined service:

sqlplus <username>/<password>@<host_name>:<local-listener-port>/<service-name>
sqlplus application/pwd@oel7c-scan:1522/mypdb_app.emilianofusaglia.net

 

 

Rule #4  –  Backup/Recovery strategy in Multitenant

As a database administrator, one of the first responsibilities to fulfil is the backup/recovery strategy. The migration to a multitenant database, due to the high level of consolidation density, requires reviewing the existing procedures. A few infrastructure operations, like creating a Data Guard setup or executing a backup, have shifted from per-database to per-container, consolidating the number of tasks.

RMAN in 12c covers all CDB and PDB backup/restore combinations, even though best practice suggests running the daily backup at CDB level; when a restore is needed, the granularity can go down to a single block of one PDB. Below are a few basic backup/restore operations in a Multitenant environment.

 

Backup a full CDB:

RMAN> connect target /;
RMAN> backup database plus archivelog;

 

Backup a list of PDBs:

RMAN> connect target /;
RMAN> backup pluggable database mypdb, hrpdb plus archivelog;

 

Backup one PDB directly connecting to it:

RMAN> connect target sys/manager@mypdb.emilianofusaglia.net;
RMAN> backup incremental level 0 database;

 

Backup a PDB tablespace:

RMAN> connect target /;
RMAN> backup tablespace mypdb:system;

 

Generate RMAN report:

RMAN> report need backup pluggable database mypdb;

 

Complete PDB Restore

RMAN> connect target /;
RMAN> alter pluggable database mypdb close;
RMAN> restore pluggable database mypdb;
RMAN> recover pluggable database mypdb;
RMAN> alter pluggable database mypdb open;
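
As mentioned above, the restore granularity can go down to a single corrupted block of a PDB datafile; a minimal RMAN sketch, where the datafile and block numbers are purely illustrative:

RMAN> connect target /;
RMAN> recover datafile 25 block 1042;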

 

 

Rule #5  –  Before moving to Multitenant

Oracle Multitenant has introduced many architectural changes that force the DBA to evolve how databases are administered. My last golden rule is to thoroughly study the multitenant/single-tenant architecture before starting any implementation.

During my experiences implementing multitenant/singletenant architectures, I found great dependencies with the following database areas:

  • Provisioning/Decommissioning
  • Patching and Upgrade
  • Backup/recovery
  • Capacity Planning and Management
  • Tuning
  • Separation of duties between CDB and PDB

 

 

Bug on Oracle 12c Multitenant & PDB Clone as Snapshot Copy

While automating the refresh of the test databases in an Oracle 12c Multitenant environment with ACFS and PDB snapshot copy, I encountered the following BUG:

The column SNAPSHOT_PARENT_CON_ID of the view V$PDBS shows 0 (zero) in case of PDBs created as Snapshot Copy.

This bug prevents identifying the parent-child relationship between a PDB and its own snapshot copies.

The test case below explains the problem:

SQL> CREATE PLUGGABLE DATABASE LARTE3SEFU from LARTE3 SNAPSHOT COPY; 
 
 Pluggable database created. 
 
 SQL> select CON_ID, NAME, OPEN_MODE, SNAPSHOT_PARENT_CON_ID from v$pdbs where NAME in ('LARTE3SEFU','LARTE3'); 
 
 CON_ID      NAME          OPEN_MODE  SNAPSHOT_PARENT_CON_ID 
 ---------- -------------- ---------- ---------------------- 
 5          LARTE3         READ ONLY  0 
 16         LARTE3SEFU     MOUNTED    0  <-- This should be 5
 
 2 rows selected. 

A Service Request has been opened with Oracle; I'll update this post once I have the official answer.

Update from the Service Request: BUG Fixed on version 12.2

The “Great” ODA overwhelming the Exadata

Introduction

This article tries to explain the technical reasons for the Oracle Database Appliance's success, a well-known appliance with which Oracle targets small and medium businesses, or specific departments of big companies looking for privacy and isolation from the rest of the IT. Nowadays this small and relatively cheap appliance (around 65'000$ list price) has evolved a lot: the storage has reached a significant capacity of 128TB raw, expandable to 256TB, and the two X5-2 servers are the same ones used as database nodes of the Exadata machine. Many customers, while defining a new database architecture, evaluate the pros and cons of acquiring an ODA compared to the smallest Exadata configuration (one eighth of a rack). If the customer is not looking for a system with extreme performance and horizontal scalability beyond the two X5-2 servers, the Oracle Database Appliance is frequently the retained option.

Some of the ODA major features are:

  • High Availability: no single point of failure across all hardware and software components.
  • Performance: each server is equipped with 2×18-core Intel Xeon CPUs and 256GB of RAM extensible up to 768GB, with cluster communication over InfiniBand. The shared storage offers a multi-tier configuration with HDDs at 7.2K rpm and two types of SSDs, for frequently accessed data and for database redo logs.
  • Flexibility & Scalability: running RAC, RAC One Node and single instance databases.
  • Virtualized configuration: designed for offering a solution-in-a-box, with highly available virtual machines.
  • Optimized licensing model: a pay-as-you-grow model activating an increasing number of CPU cores on demand with the bare metal configuration, or capping the resources by combining Oracle VM with the hard partitioning setup.
  • Time-to-market: no matter whether the ODA has to be installed bare metal or virtualized, this is a standardized and automated process generally completed in one or two days of work.
  • Price: the ODA is very competitive when comparing its cost to an equivalent commodity architecture, which in addition must be engineered, integrated and maintained by the customer.

 

At the time of writing, the latest hardware model is the ODA X5-2 and the software version is 12.1.2.6.0. This HW and SW combination offers unique features, a few of them not even available on the Exadata machine, such as the possibility to host databases and applications in one single box, or to rapidly and space-efficiently clone 11gR2 and 12c databases using ACFS snapshots.

 

 

ODA HW & SW Architecture

Oracle Database Appliance is composed of two X5-2 servers and a shared storage shelf, which can optionally be doubled. Each server provides two 18-core Intel Xeon E5-2699 v3 CPUs, 256GB of RAM (optionally upgradable to 768GB) and two 600GB 10K rpm internal disks in RAID 1 for the OS and software binaries.

The appliance is equipped with redundant network connectivity up to 10Gb, redundant SAS HBAs and storage I/O modules, and a redundant InfiniBand interconnect enabling 40Gb/s server-to-server cluster communication.

The software components are all part of Oracle “Red Stack” with Oracle Linux 6 UEK or OVM 3, Grid Infrastructure 12c, Oracle RDBMS 12c & 11gR2 and Oracle Appliance Manager.

 

 

ODA Front view

Components 1 & 2 are the X5-2 servers. Components 3 & 4 are the storage shelf and the optional storage expansion.

ODA_Front

 

ODA Rear view

The rear view highlights the multiple redundant connections, including InfiniBand for Oracle Clusterware, ASM and RAC communication. There is no single point of HW or SW failure.

ODA_Back

 

 

Storage Organization

With 16x 8TB SAS HDDs, a total raw capacity of 128TB is available on each storage shelf (64TB usable with ASM double mirroring and 42.7TB with ASM triple mirroring). To offer better I/O performance without exploding the price, Oracle added the following SSD devices: 4x 400GB (ASM double-mirrored) for frequently accessed data, and 4x 200GB (ASM triple-mirrored) for database redo logs.

As shown in the picture below, each rotating disk has two slices: the external, more performant partition is assigned to the +DATA ASM disk group, and the internal one is allocated to the +RECO ASM disk group.

 

ODA_Disk

This storage optimization allows the ODA to achieve competitive I/O performance. In a production-like environment, using the three types of disks as per the ODA database template odb-24 (https://docs.oracle.com/cd/E22693_01/doc.12/e55580/sizing.htm), Trivadis has measured 12K I/O operations per second and a throughput of 2300 MB/s with an average latency of 10ms. As per the Oracle documentation, the maximum number of I/Os per second achievable with the rotating disks of a single storage shelf is 3300; this value increases significantly by relocating the hottest data files to the +FLASH disk group created on SSD devices.
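For reference, comparable figures can be produced from inside the database with DBMS_RESOURCE_MANAGER.CALIBRATE_IO; a minimal sketch, to be run during a quiet period (the number of physical disks and the latency target are illustrative):

sqlplus / as sysdba <<'EOF'
SET SERVEROUTPUT ON
DECLARE
  l_iops    PLS_INTEGER;
  l_mbps    PLS_INTEGER;
  l_latency PLS_INTEGER;
BEGIN
  -- 20 physical disks and a 10 ms latency target are only examples
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(
    num_physical_disks => 20,
    max_latency        => 10,
    max_iops           => l_iops,
    max_mbps           => l_mbps,
    actual_latency     => l_latency);
  DBMS_OUTPUT.PUT_LINE('IOPS='||l_iops||' MBPS='||l_mbps||' Latency='||l_latency);
END;
/
EOF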

 

ACFS becomes the default database storage of ODA

Starting with ODA software version 12.1.0.2, any fresh installation enforces the ASM Cluster File System (ACFS) as the only supported type of database storage, restricting the supported database versions to 11.2.0.4 and greater. When an ODA is upgraded from a previous release, pre-existing databases are not automatically migrated to ACFS, but Oracle provides a tool called acfs_mig.pl to execute this mandatory step on all non-CDB databases of version 11.2.0.4 or greater.
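A quick way to verify whether a given database already resides on ACFS, before or after running the migration tool, is to check where its datafiles live; a minimal sketch (the paths are the standard ODA mount points and may differ on a specific deployment):

sqlplus -s / as sysdba <<'EOF'
-- files on ACFS appear under /u02/app/oracle/oradata/..., files still on ASM under +DATA/...
SELECT file_name FROM dba_data_files;
EOF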

Oracle has decided to promote ACFS as default database storage on ODA environment for the following reasons:

  • ACFS provides almost equivalent performance to Oracle ASM disk groups.
  • Additional functionality on an industry-standard POSIX file system.
  • Database snapshot copies of PDBs and non-CDBs of version 11.2.0.4 or greater.
  • Advanced functionality for general-purpose files such as replication, tagging, encryption, security, and auditing.

Databases created on ACFS follow the same Oracle Managed Files (OMF) standard used by ASM.
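As a small illustration of the OMF behaviour on ACFS, pointing db_create_file_dest at the ACFS mount point is enough to have new datafiles named and placed automatically; a hedged sketch (on ODA the provisioning job normally sets this for you, and the directory shown is the standard shared datastore, to be adapted to the actual deployment):

sqlplus / as sysdba <<'EOF'
-- illustrative only: direct OMF creation onto the ACFS datastore
ALTER SYSTEM SET db_create_file_dest = '/u02/app/oracle/oradata/datastore' SCOPE=BOTH;
-- a tablespace created without an explicit datafile clause is now OMF-managed on ACFS
CREATE TABLESPACE test_omf;
EOF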

As in the past, database provisioning requires the command line interface oakcli and the selection of a database template, which defines several characteristics including the amount of space to allocate on each file system. Container and non-container databases can coexist on the same Oracle Database Appliance.
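A minimal provisioning sketch with oakcli, assuming a database called TESTDB; the tool prompts interactively for the template (for example odb-04) and the remaining characteristics, and the exact prompts vary with the appliance software version:

# run as root on the first ODA node (hypothetical database name)
oakcli create database -db TESTDB

# list the existing databases and the template used by each of them
oakcli show databases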

The ACFS file systems are created during the database provisioning process on top of the ASM disk groups +DATA, +RECO, +REDO, and optionally +FLASH. The file systems have two possible setups, depending on the database type, Container or Non-Container.

  • Container database: for each CDB the ODA database-provisioning job creates dedicated ACFS file systems with the following characteristics:
Disk Characteristics          ASM Disk group   ACFS Mount Point
SAS Disk external partition   +DATA            /u02/app/oracle/oradata/datc<db_unique_name>
SAS Disk internal partition   +RECO            /u01/app/oracle/fast_recovery_area/rcoc<db_unique_name>
SSD Triple-mirrored           +REDO            /u01/app/oracle/oradata/rdoc<db_unique_name>
SSD Double-mirrored           +FLASH (*)       /u02/app/oracle/oradata/flashdata

 

  • Non-Container database: in case of Non-CDB the ODA database-provisioning job creates or resizes the following shared ACFS file systems:
Disk Characteristics          ASM Disk group   ACFS Mount Point
SAS Disk external partition   +DATA            /u02/app/oracle/oradata/datastore
SAS Disk internal partition   +RECO            /u01/app/oracle/fast_recovery_area/datastore
SSD Triple-mirrored           +REDO            /u01/app/oracle/oradata/datastore
SSD Double-mirrored           +FLASH (*)       /u02/app/oracle/oradata/flashdata

(*) Optionally used by the databases as Smart Flash Cache (extension of the SGA buffer cache), or allocated to store the hottest data files leveraging the I/O performance of the SSD disks.
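When the +FLASH file system is used as Database Smart Flash Cache, only the two flash cache initialization parameters need to point to a file on that mount; a minimal sketch (the file name and size are illustrative, and an instance restart is required for the new settings to take effect):

sqlplus / as sysdba <<'EOF'
-- illustrative values: a 100G flash cache file on the shared flashdata file system
ALTER SYSTEM SET db_flash_cache_file = '/u02/app/oracle/oradata/flashdata/TESTDB/flashcache.dat' SCOPE=SPFILE;
ALTER SYSTEM SET db_flash_cache_size = 100G SCOPE=SPFILE;
EOF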

 

Oracle Database Appliance Bare Metal

The bare metal configuration has been available since version one of the appliance, and it remains the default option proposed by Oracle, which pre-installs Linux on any new system. It is very simple and intuitive to install thanks to the pre-built software bundle, which automates most of the steps. At the end of the installation, the architecture is very similar to any other two-node RAC setup based on commodity hardware; but even from an operational point of view there are great advantages, because the Oracle Appliance Manager framework simplifies and accelerates almost any system and database administration task.

The picture below depicts the ODA architecture when the bare metal configuration is in use:

ODA_Bare_Metal

 

Oracle Database Appliance Virtualized

When the ODA is deployed virtualized, both servers run Oracle VM Server, also called Dom0. Each Dom0 hosts, in a local dedicated repository, the ODA Base (or Dom Base), a privileged virtual machine where the Appliance Manager, Grid Infrastructure and RDBMS binaries are installed. The ODA Base takes advantage of Xen PCI pass-through technology to provide direct access to the ODA shared disks presented and managed by ASM. This configuration reduces VM flexibility; in fact, no VM migration is allowed for the two ODA Base machines, but it guarantees almost no I/O performance penalty. With the Dom Base setup, the basic installation is complete and it is possible to start provisioning databases using Oracle Appliance Manager.

At the same time, the administrator can create new shared repositories, hosted on ACFS and exported over NFS to the hypervisor, for hosting the application virtual machines. Those application virtual machines are also known as Domain U. The Domain U VMs and the templates can be stored on a local or shared Oracle VM Server repository, but to enable migration between the two Oracle VM Servers a shared repository on the ACFS file system should be used.
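A hedged sketch of the shared repository workflow with oakcli follows; the repository name, disk group and size are illustrative, and the exact options may differ between appliance software versions:

# create a shared ACFS repository of 200GB carved out of the +DATA disk group
oakcli create repo vmrepo1 -dg DATA -size 200

# verify that the repository is online on both nodes
oakcli show repo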

Even when virtualization is in use, Oracle Appliance Manager is the only framework needed for system and database administration tasks like repository creation, template import, virtual machine deployment, network configuration, database provisioning and so on, relieving the administrator from most of the complexity.

The implementation of the Solution-in-a-box maximizes the Return on Investment of the ODA; in fact, while restricting the virtual CPUs to license on the Dom Base, it allows relocating the spare resources to the application virtual machines, as shown in the picture below.
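The CPU and memory footprint of the ODA Base itself is adjusted through the Appliance Manager as well; a hedged sketch (the command is interactive, prompting for the number of cores and the memory to assign, it requires an ODA Base restart afterwards, and its exact behaviour depends on the appliance software version):

# run as root on Dom0: interactively re-size the ODA Base (CPU cores / memory)
oakcli configure oda_base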

ODA_Virtualized

 

 

ODA compared to Exadata Machine and Commodity Hardware

As described in the previous sections, Oracle Database Appliance offers unique features such as pay-as-you-grow and solution-in-a-box, which can heavily influence the decision for a new database architecture. The aim of the table below is to list the main architecture characteristics to evaluate while defining a new database infrastructure, comparing the results for the Oracle Database Appliance, the Exadata Machine and a commodity architecture based on Intel/Linux engineered to run RAC databases.

Table_Architectures

As shown by the different scores of the three architectures, each solution comes with strengths and weaknesses; concerning the Oracle Database Appliance, it is evident that, due to its characteristics, the smallest Oracle Engineered System remains a great option for small and medium database environments.

 

Conclusion

I hope this article keeps its initial promise to explain the technical reasons behind the success of the Oracle Database Appliance, and that it has highlighted the great work done by Oracle in engineering this solution at the edge of the technology while keeping the price under control.

One last summary of what in my opinion are the major benefits offered by the ODA:

  • Time-to-market: thanks to automated processes and pre-built software images, the deployment phase is extremely rapid.
  • Simplicity: the use of standard software components, combined with the appliance orchestrator Oracle Appliance Manager, makes the ODA very simple to operate.
  • Standardization & Automation: the Appliance Manager encapsulates and automates all repeatable and error-prone tasks like provisioning, decommissioning, patching and so on.
  • Vendor certified platform: Oracle validates and certifies the compatibility among all HW & SW components.
  • Evolution: over time, the ODA benefits from specific bug fixes and software evolution (introduced by Oracle through the quarterly patch sets), keeping the system up to date for a longer time compared to a commodity architecture.

EXADATA: How to enable Flash Cache WriteBack on a running system

In a recent tuning activity it was necessary to change the Exadata Smart Flash Cache from “WriteThrough” to “WriteBack“. Because the system was used in a 24/7 environment, we had to implement the change in a rolling fashion.

The different steps are described below.

 

From one DB node, use dcli to check the current status of the storage cells:

[root@efudbadm02 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
efuceladm01: WriteThrough
efuceladm02: WriteThrough
efuceladm03: WriteThrough
efuceladm04: WriteThrough
efuceladm05: WriteThrough
efuceladm06: WriteThrough
efuceladm07: WriteThrough
efuceladm08: WriteThrough
efuceladm09: WriteThrough
efuceladm10: WriteThrough
efuceladm11: WriteThrough

From one DB node, use dcli to check that the properties asmdeactivationoutcome and asmmodestatus of all griddisks are respectively “Yes” and “ONLINE” before continuing with the change.

[root@efudbadm02 ~]# dcli -g cell_group -l root cellcli -e list griddisk attributes asmdeactivationoutcome, asmmodestatus
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
...
..
.

From one DB node, use dcli to check that all flashcache modules are in “normal” state and that no flash disk is in degraded or critical state.

[root@efudbadm02 ~]# dcli -g cell_group -l root cellcli -e list flashcache detail
efuceladm01: name: efuceladm01_FLASHCACHE
efuceladm01: cellDisk: FD_00_efuceladm01,FD_07_efuceladm01,FD_06_efuceladm01,FD_03_efuceladm01,FD_05_efuceladm01,FD_01_efuceladm01,FD_02_efuceladm01,FD_04_efuceladm01
efuceladm01: creationTime: 2013-06-18T15:21:13+02:00
efuceladm01: degradedCelldisks:
efuceladm01: effectiveCacheSize: 744.125G
efuceladm01: id: 35b61001-438f-4d66-8ce9-40704f758d3f
efuceladm01: size: 744.125G
efuceladm01: status: normal
efuceladm02: name: efuceladm02_FLASHCACHE
efuceladm02: cellDisk: FD_06_efuceladm02,FD_05_efuceladm02,FD_00_efuceladm02,FD_02_efuceladm02,FD_01_efuceladm02,FD_07_efuceladm02,FD_03_efuceladm02,FD_04_efuceladm02
efuceladm02: creationTime: 2013-06-18T15:21:12+02:00
efuceladm02: degradedCelldisks:
efuceladm02: effectiveCacheSize: 744.125G
efuceladm02: id: 2f7eedd6-cda2-496e-98ec-417b94fb8ee7
efuceladm02: size: 744.125G
efuceladm02: status: normal
efuceladm03: name: efuceladm03_FLASHCACHE
efuceladm03: cellDisk: FD_00_efuceladm03,FD_04_efuceladm03,FD_01_efuceladm03,FD_02_efuceladm03,FD_03_efuceladm03,FD_06_efuceladm03,FD_05_efuceladm03,FD_07_efuceladm03
efuceladm03: creationTime: 2013-06-18T15:21:10+02:00
efuceladm03: degradedCelldisks:
efuceladm03: effectiveCacheSize: 744.125G
efuceladm03: id: c271cdb8-dc70-4009-ba97-dfc4c26b00ef
efuceladm03: size: 744.125G
efuceladm03: status: normal
...
..
.

Log on to the first Storage Cell and, using the CellCLI interface, perform the following procedure to enable the WriteBack Flash Cache in a rolling fashion.

 

Drop the existing flash cache

CellCLI> drop flashcache
Flash cache efuceladm01_FLASHCACHE successfully dropped

Inactivate the griddisks on the cell

CellCLI> alter griddisk all inactive
GridDisk DATA_CD_00_efuceladm01 successfully altered
GridDisk DATA_CD_01_efuceladm01 successfully altered
GridDisk DATA_CD_02_efuceladm01 successfully altered
GridDisk DATA_CD_03_efuceladm01 successfully altered
GridDisk DATA_CD_04_efuceladm01 successfully altered
GridDisk DATA_CD_05_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_02_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_03_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_04_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_05_efuceladm01 successfully altered
GridDisk RECO_CD_00_efuceladm01 successfully altered
GridDisk RECO_CD_01_efuceladm01 successfully altered
GridDisk RECO_CD_02_efuceladm01 successfully altered
GridDisk RECO_CD_03_efuceladm01 successfully altered
GridDisk RECO_CD_04_efuceladm01 successfully altered
GridDisk RECO_CD_05_efuceladm01 successfully altered

Shut down cellsrv service

CellCLI> alter cell shutdown services cellsrv

Stopping CELLSRV services...
The SHUTDOWN of CELLSRV services was successful.

Enable the Smart Flash Cache WriteBack

CellCLI> alter cell flashCacheMode=writeback
Cell efuceladm01 successfully altered

Restart the cellsrv service

CellCLI> alter cell startup services cellsrv

Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.

Reactivate the griddisks on the cell

CellCLI> alter griddisk all active
GridDisk DATA_CD_00_efuceladm01 successfully altered
GridDisk DATA_CD_01_efuceladm01 successfully altered
GridDisk DATA_CD_02_efuceladm01 successfully altered
GridDisk DATA_CD_03_efuceladm01 successfully altered
GridDisk DATA_CD_04_efuceladm01 successfully altered
GridDisk DATA_CD_05_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_02_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_03_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_04_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_05_efuceladm01 successfully altered
GridDisk RECO_CD_00_efuceladm01 successfully altered
GridDisk RECO_CD_01_efuceladm01 successfully altered
GridDisk RECO_CD_02_efuceladm01 successfully altered
GridDisk RECO_CD_03_efuceladm01 successfully altered
GridDisk RECO_CD_04_efuceladm01 successfully altered
GridDisk RECO_CD_05_efuceladm01 successfully altered

Recreate the flash cache

CellCLI> create flashcache all
Flash cache efuceladm01_FLASHCACHE successfully created

 


Verify that the Smart Flash Cache WriteBack option is enabled

[root@efuceladm01 ~]# cellcli -e list cell detail | grep flashCacheMode
 flashCacheMode: writeback

Before applying the change to the next Exadata Storage Server, wait until all griddisks are synchronized and online.

[root@efuceladm01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
 DATA_CD_00_efuceladm01 SYNCING Yes
 DATA_CD_01_efuceladm01 SYNCING Yes
 DATA_CD_02_efuceladm01 SYNCING Yes
 DATA_CD_03_efuceladm01 SYNCING Yes
 DATA_CD_04_efuceladm01 SYNCING Yes
 DATA_CD_05_efuceladm01 SYNCING Yes
 DBFS_DG_CD_02_efuceladm01 ONLINE Yes
 DBFS_DG_CD_03_efuceladm01 ONLINE Yes
 DBFS_DG_CD_04_efuceladm01 ONLINE Yes
 DBFS_DG_CD_05_efuceladm01 ONLINE Yes
 RECO_CD_00_efuceladm01 OFFLINE Yes
 RECO_CD_01_efuceladm01 OFFLINE Yes
 RECO_CD_02_efuceladm01 OFFLINE Yes
 RECO_CD_03_efuceladm01 OFFLINE Yes
 RECO_CD_04_efuceladm01 OFFLINE Yes
 RECO_CD_05_efuceladm01 OFFLINE Yes

Once the asmmodestatus is ONLINE on all griddisks it is safe to move to the next Storage Server.
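A small helper like the one below can be run on the Storage Cell just converted, to wait for the resynchronization before touching the next one (a minimal sketch, based on the same cellcli attributes checked above):

# poll every minute until every griddisk of this cell reports asmmodestatus ONLINE
while cellcli -e "list griddisk attributes name,asmmodestatus" | grep -vq ONLINE
do
    sleep 60
done
echo "All griddisks are ONLINE - safe to continue with the next Storage Server"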


 

At the end of the procedure all Storage Servers are configured with the Smart Flash Cache WriteBack option:

[root@efudbadm02 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
efuceladm01: writeback
efuceladm02: writeback
efuceladm03: writeback
efuceladm04: writeback
efuceladm05: writeback
efuceladm06: writeback
efuceladm07: writeback
efuceladm08: writeback
efuceladm09: writeback
efuceladm10: writeback
efuceladm11: writeback


	

Troubleshooting an ACFS File System that fails to mount

The ACFS file system /cloudfs was created and registered in CRS; following a node reboot, the file system was no longer mounting!

 

I logged on to ASMCMD and checked the status of the ASM volume:

[grid@rednodech07 ~]$ asmcmd
ASMCMD> volinfo -a
Diskgroup Name: FRA

 Volume Name: VOL_CLOUDFS
 Volume Device: ERROR
 State: DISABLED
 Size (MB): 20480
 Resize Unit (MB): 32
 Redundancy: MIRROR
 Stripe Columns: 4
 Stripe Width (K): 128
 Usage: ACFS
 Mountpath: /cloudfs

The output of the command above shows that the volume VOL_CLOUDFS is DISABLED. I tried to re-enable it manually, but I got the following error:

ASMCMD> volenable -a
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver (DBD ERROR: OCIStmtExecute)
ASMCMD>

 

Then I checked whether the ACFS kernel modules were loaded into the Linux kernel:

  • oracleacfs (oracleacfs.ko): manages all ACFS file system operations.
  • oracleadvm (oracleadvm.ko): the ADVM module, providing the volume driver interface to the file system.
  • oracleoks (oracleoks.ko): provides memory management, locking and cluster synchronization.
[root@rednodech07 ~]# /sbin/lsmod | grep oracle


Because the kernel modules were not loaded, I tried to load them manually with the command:

/bin/acfsload start

But it didn’t work, so I stopped the Grid Infrastructure on the local node and reinstalled the ACFS drivers:

[root@rednodech07 asm]# /u01/GRID/11.2.0.4/bin/crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rednodech07'
CRS-2673: Attempting to stop 'ora.crsd' on 'rednodech07'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.cvu' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.GRID.dg' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.tvdtst.db' on 'rednodech07'
CRS-2677: Stop of 'ora.cvu' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.rednodech07.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.tvdtst.db' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rednodech07'
CRS-2677: Stop of 'ora.DATA.dg' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.rednodech07.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan2.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.GRID.dg' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rednodech07'
CRS-2677: Stop of 'ora.asm' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rednodech07'
CRS-2677: Stop of 'ora.ons' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rednodech07'
CRS-2677: Stop of 'ora.net1.network' on 'rednodech07' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rednodech07' has completed
CRS-2677: Stop of 'ora.crsd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.evmd' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.asm' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rednodech07'
CRS-2677: Stop of 'ora.crf' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.asm' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rednodech07'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rednodech07'
CRS-2677: Stop of 'ora.cssd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rednodech07'
CRS-2677: Stop of 'ora.gipcd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rednodech07'
CRS-2677: Stop of 'ora.gpnpd' on 'rednodech07' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rednodech07' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rednodech07 asm]#



[root@rednodech07 ~]# /u01/GRID/11.2.0.4/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@rednodech07 ~]#


At this point I restarted the Grid Infrastructure, the ACFS kernel modules were loaded and the ASM volume state became ENABLED.

[root@rednodech07 ~]# /u01/GRID/11.2.0.4/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.


[root@rednodech07 ~]# /sbin/lsmod | grep oracle
oracleacfs 1994567 0
oracleadvm 243254 0
oracleoks 460313 2 oracleacfs,oracleadvm

[grid@rednodech07 ~]$ asmcmd
ASMCMD> volinfo -a
Diskgroup Name: FRA

 Volume Name: VOL_CLOUDFS
 Volume Device: /dev/asm/vol_cloudfs-390
 State: ENABLED
 Size (MB): 20480
 Resize Unit (MB): 32
 Redundancy: MIRROR
 Stripe Columns: 4
 Stripe Width (K): 128
 Usage: ACFS
 Mountpath: /cloudfs

ASMCMD>

The last step was to remount the ACFS file system with the following command:

[root@rednodech07 /]# /bin/mount -t acfs /dev/asm/vol_cloudfs-390 /cloudfs

[oracle@rednodech07 duplicate_tcswu]$ mount
/dev/mapper/vg_rednodech07-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sdbc1 on /boot type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_home on /home type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_tmp on /tmp type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_u01 on /u01 type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_var on /var type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/asm/vol_cloudfs-390 on /cloudfs type acfs (rw)
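
Since the file system is registered in Oracle Clusterware, the same result can also be obtained through the CRS resource instead of a manual mount; a minimal sketch, assuming the volume device shown above and the same Grid Infrastructure home:

# start (mount) the registered ACFS file system on the local node through Clusterware
/u01/GRID/11.2.0.4/bin/srvctl start filesystem -d /dev/asm/vol_cloudfs-390 -n rednodech07

# quick check of the ACFS resources known to the cluster
/u01/GRID/11.2.0.4/bin/crsctl stat res -t | grep -i acfs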