Exadata as Code – PDB Snapshot Cloning

As a follow-up to the Exadata as Code post, today I’m going to focus on one of the latest features added to our automation: PDB Snapshot Cloning.

PDB snapshot cloning is one of the best development options to offer in a CI/CD project. In an Exadata environment there are special requirements to fulfil before starting to use this technology: Sparse Grid Disks and a Sparse ASM Disk Group (a description and a step-by-step example are available here).

On Exadata, the PDB Snapshot benefits from all the Smart features, including the offload capabilities, with the addition of space- and time-efficient provisioning.

After this brief introduction, let’s see how PDB Snapshot Cloning has been implemented in our Exadata as Code automation.

Exadata Sparse Storage Automation

Select the Oracle Container Database (CDB) where PDB Snapshot Cloning should be activated, and with one-click provisioning the Sparse Grid Disks and the Sparse ASM disk group are created.

After the initial provisioning, the automation starts monitoring the space usage, automatically resizing it (increasing/decreasing) when necessary.
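A minimal sketch of the kind of resize operations the automation performs, assuming sparse grid disks with the SPARSE prefix and a sparse disk group named SPARSEC1, as in the step-by-step example later in this post; names and sizes are illustrative:

# On each storage cell, adjust the sparse grid disks (one command per grid disk; value is illustrative)
cellcli -e ALTER GRIDDISK SPARSE_CD_00_strgceladm01 size=900G

-- From an ASM instance, propagate the new size to the sparse disk group
SQL> ALTER DISKGROUP SPARSEC1 RESIZE ALL;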



Automated storage lifecycle management


PDB Snapshot Cloning Automation

The same principle applies to the different PDB actions: one-click provisioning/decommissioning of the PDB Test Master and of the PDB Snapshot.

Those features are exposed via UI or API to the application developers, making them autonomous in the management of such a space-efficient database environment.
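Behind the one-click actions, the automation essentially executes the same statements shown later in this post in the Exadata Storage Snapshots section; a minimal sketch, assuming a Test Master PDB called PDBTESTMASTER and a sparse disk group +SPARSEC1:

-- Prepare the Test Master (shared read-only copy)
SQL> alter pluggable database PDBTESTMASTER close immediate instances=all;
SQL> alter pluggable database PDBTESTMASTER open read only;

-- Provision a space-efficient PDB Snapshot on the sparse disk group
SQL> create pluggable database PDBDEV01 from PDBTESTMASTER tempfile reuse create_file_dest='+SPARSEC1' snapshot copy;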

PDB Snapshot lifecycle management


Grid Management DB filling up ASM disk space

Recently I discovered on Oracle Grid Infrastructure 12cR2 that the ASM disk group hosting the Management DB (-MGMTDB) was filling up the disk space very quickly.

This is due to a bug on the oclumon data purge procedure.
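Before applying either fix, it can be useful to check the configured size and location of the CHM repository; a quick oclumon check (not a fix for the purge bug itself):

$ oclumon manage -get repsize reppath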

To fix the problem, two possibilities are available:

  1. Recreate the Management DB
  2. Manually truncate the tables that are not purged and shrink the tablespace size

 

The two options are described below.

 

Option 1 – Recreate the Management DB

As root user on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

As the grid user, from the local node hosting the Management Database instance, run the following commands:

$ /u01/app/12.2.0.1/grid/bin/srvctl status mgmtdb
$ /u01/app/12.2.0.1/grid/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB
Connecting to database
4% complete
9% complete
14% complete
19% complete
23% complete
28% complete
47% complete
Updating network configuration files
48% complete
52% complete
Deleting instance and datafiles
76% complete
100% complete

 

How to recreate the MGMTDB:

$ /u01/app/12.2.0.1/grid/bin/dbca -silent -createDatabase -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc
-sid -MGMTDB
-gdbName _mgmtdb
-storageType ASM
-diskGroupName GIMR
-datafileJarLocation <GI HOME>/assistants/dbca/templates
-characterset AL32UTF8
-autoGeneratePassword           
-skipUserTemplateCheck

 

Create the pluggable GIMR database

$ /u01/app/12.2.0.1/grid/bin/mgmtca -local
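Once the Management DB has been recreated, the ora.crf resource disabled at the beginning can presumably be re-enabled and restarted on each cluster node (the inverse of the initial commands), as root:

# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init
# /u01/app/12.2.0.1/grid/bin/crsctl start res ora.crf -init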

 


 

Option 2 – Manually truncate the tables

 

As root user stop and disable ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

Connect to MGMTDB and identify the segments to truncate:

export ORACLE_SID=-MGMTDB
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL> select pdb_name from dba_pdbs where pdb_name!='PDB$SEED';

SQL> alter session set container=GIMR_DSCREP_10;

Session altered.

SQL> col obj format a50
SQL> select owner||'.'||SEGMENT_NAME obj, BYTES from dba_segments where owner='CHM' order by 2 asc;

 

Most likely these two tables are much bigger than the rest:

  • CHM.CHMOS_PROCESS_INT_TBL
  • CHM.CHMOS_DEVICE_INT_TBL

Truncate the tables:

SQL> truncate table CHM.CHMOS_PROCESS_INT_TBL;
SQL> truncate table CHM.CHMOS_DEVICE_INT_TBL;

 

Then, if needed, shrink the tablespace and the job is done!
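A minimal sketch of the shrink step, still connected to the GIMR PDB; the datafile path and the target size are placeholders to adapt to the output of the query:

SQL> select tablespace_name, file_name, bytes/1024/1024 MB from dba_data_files;

-- Placeholder path and size: resize the datafile of the tablespace hosting the CHM segments
SQL> alter database datafile '<datafile_of_CHM_tablespace>' resize 2048M;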

 


 

Exadata Storage Snapshots

This post describes how to implement Oracle Database Snapshot Technology on Exadata Machine.

Because the Exadata Storage Cell Smart Features, Storage Indexes, IORM and Network Resource Manager work at the level of the ASM Volume Manager only (they don’t work on top of the ACFS Cluster File System), the implementation of the snapshot technology is different compared to any other non-Exadata environment.

For this purpose Oracle has developed a new type of ASM Disk Group called SPARSE Disk Group. It uses ASM SPARSE Grid Disks based on Thin Provisioning to save the database snapshot copies and the associated metadata, and it supports non-CDB and PDB snapshot copies.

The implementation requires the following minimal software versions:

  • Exadata Storage Software version 12.1.2.1.0.
  • Oracle Database version 12.1.0.2 with bundle patch 5.
One major restriction applies to Exadata Storage Snapshots compared to ACFS:
the source database must be a shared copy, open in read-only mode, called the Test Master. The Test Master Database cannot be modified or deleted as long as the latest child snapshot is in use.
This restriction exists because the Exadata Snapshot technology uses “allocate on first write”, and not “copy on write” (as ACFS does), and the snapshot works per database datafile.
When a child snapshot issues a write, the write goes to a private copy of that block inside the snapshot, preserving the original block value, which can be accessed by the other child snapshots of the same Test Master.

How to Implement Exadata Storage Snapshots in a PDB Environment

Check the celldisks for available free space to allocate to a new SPARSE Disk Group

[root@strgceladm01 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm01 853.34375G
 CD_01_strgceladm01 853.34375G
 CD_02_strgceladm01 853.34375G
 CD_03_strgceladm01 853.34375G
 CD_04_strgceladm01 853.34375G
 CD_05_strgceladm01 853.34375G
 CD_06_strgceladm01 853.34375G
 CD_07_strgceladm01 853.34375G
 CD_08_strgceladm01 853.34375G
 CD_09_strgceladm01 853.34375G
 CD_10_strgceladm01 853.34375G
 CD_11_strgceladm01 853.34375G
 FD_00_strgceladm01 0
 FD_01_strgceladm01 0
 FD_02_strgceladm01 0
 FD_03_strgceladm01 0
[root@strgceladm01 ~]#


[root@strgceladm02 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm02 853.34375G
 CD_01_strgceladm02 853.34375G
 CD_02_strgceladm02 853.34375G
 CD_03_strgceladm02 853.34375G
 CD_04_strgceladm02 853.34375G
 CD_05_strgceladm02 853.34375G
 CD_06_strgceladm02 853.34375G
 CD_07_strgceladm02 853.34375G
 CD_08_strgceladm02 853.34375G
 CD_09_strgceladm02 853.34375G
 CD_10_strgceladm02 853.34375G
 CD_11_strgceladm02 853.34375G
 FD_00_strgceladm02 0
 FD_01_strgceladm02 0
 FD_02_strgceladm02 0
 FD_03_strgceladm02 0
[root@strgceladm02 ~]#


[root@strgceladm03 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm03 853.34375G
 CD_01_strgceladm03 853.34375G
 CD_02_strgceladm03 853.34375G
 CD_03_strgceladm03 853.34375G
 CD_04_strgceladm03 853.34375G
 CD_05_strgceladm03 853.34375G
 CD_06_strgceladm03 853.34375G
 CD_07_strgceladm03 853.34375G
 CD_08_strgceladm03 853.34375G
 CD_09_strgceladm03 853.34375G
 CD_10_strgceladm03 853.34375G
 CD_11_strgceladm03 853.34375G
 FD_00_strgceladm03 0
 FD_01_strgceladm03 0
 FD_02_strgceladm03 0
 FD_03_strgceladm03 0
[root@strgceladm03 ~]#

For each Storage Cell, create the SPARSE Grid Disks as described below

[root@strgceladm01 ~]# cellcli -e CREATE GRIDDISK ALL PREFIX=SPARSE, sparse=true, SIZE=853.34375G
Cell disks were skipped because they had no freespace for grid disks: FD_00_strgceladm01, FD_01_strgceladm01, FD_02_strgceladm01, FD_03_strgceladm01.
GridDisk SPARSE_CD_00_strgceladm01 successfully created
GridDisk SPARSE_CD_01_strgceladm01 successfully created
GridDisk SPARSE_CD_02_strgceladm01 successfully created
GridDisk SPARSE_CD_03_strgceladm01 successfully created
GridDisk SPARSE_CD_04_strgceladm01 successfully created
GridDisk SPARSE_CD_05_strgceladm01 successfully created
GridDisk SPARSE_CD_06_strgceladm01 successfully created
GridDisk SPARSE_CD_07_strgceladm01 successfully created
GridDisk SPARSE_CD_08_strgceladm01 successfully created
GridDisk SPARSE_CD_09_strgceladm01 successfully created
GridDisk SPARSE_CD_10_strgceladm01 successfully created
GridDisk SPARSE_CD_11_strgceladm01 successfully created
[root@strgceladm01 ~]#

For each Storage Cell, list all Grid Disks

[root@strgceladm01 ~]# cellcli -e list griddisk attributes name,size
 DATAC1_CD_00_strgceladm01 6.294586181640625T
 DATAC1_CD_01_strgceladm01 6.294586181640625T
 DATAC1_CD_02_strgceladm01 6.294586181640625T
 DATAC1_CD_03_strgceladm01 6.294586181640625T
 DATAC1_CD_04_strgceladm01 6.294586181640625T
 DATAC1_CD_05_strgceladm01 6.294586181640625T
 DATAC1_CD_06_strgceladm01 6.294586181640625T
 DATAC1_CD_07_strgceladm01 6.294586181640625T
 DATAC1_CD_08_strgceladm01 6.294586181640625T
 DATAC1_CD_09_strgceladm01 6.294586181640625T
 DATAC1_CD_10_strgceladm01 6.294586181640625T
 DATAC1_CD_11_strgceladm01 6.294586181640625T
 FGRID_FD_00_strgceladm01 2.0717315673828125T
 FGRID_FD_01_strgceladm01 2.0717315673828125T
 FGRID_FD_02_strgceladm01 2.0717315673828125T
 FGRID_FD_03_strgceladm01 2.0717315673828125T
 RECOC1_CD_00_strgceladm01 1.78143310546875T
 RECOC1_CD_01_strgceladm01 1.78143310546875T
 RECOC1_CD_02_strgceladm01 1.78143310546875T
 RECOC1_CD_03_strgceladm01 1.78143310546875T
 RECOC1_CD_04_strgceladm01 1.78143310546875T
 RECOC1_CD_05_strgceladm01 1.78143310546875T
 RECOC1_CD_06_strgceladm01 1.78143310546875T
 RECOC1_CD_07_strgceladm01 1.78143310546875T
 RECOC1_CD_08_strgceladm01 1.78143310546875T
 RECOC1_CD_09_strgceladm01 1.78143310546875T
 RECOC1_CD_10_strgceladm01 1.78143310546875T
 RECOC1_CD_11_strgceladm01 1.78143310546875T
 SPARSE_CD_00_strgceladm01 853.34375G
 SPARSE_CD_01_strgceladm01 853.34375G
 SPARSE_CD_02_strgceladm01 853.34375G
 SPARSE_CD_03_strgceladm01 853.34375G
 SPARSE_CD_04_strgceladm01 853.34375G
 SPARSE_CD_05_strgceladm01 853.34375G
 SPARSE_CD_06_strgceladm01 853.34375G
 SPARSE_CD_07_strgceladm01 853.34375G
 SPARSE_CD_08_strgceladm01 853.34375G
 SPARSE_CD_09_strgceladm01 853.34375G
 SPARSE_CD_10_strgceladm01 853.34375G
 SPARSE_CD_11_strgceladm01 853.34375G
[root@strgceladm01 ~]#

From an ASM instance, create the SPARSE Disk Group

SQL> CREATE DISKGROUP SPARSEC1 EXTERNAL REDUNDANCY DISK 'o/*/SPARSE_CD_*'
ATTRIBUTE
'compatible.asm' = '12.2.0.1',
'compatible.rdbms' = '12.2.0.1',
'cell.smart_scan_capable'='TRUE',
'cell.sparse_dg' = 'allsparse',
'AU_SIZE' = '4M';

Diskgroup created.

Set the following ASM attributes on the Disk Group hosting the Test Master Database

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'access_control.enabled' = 'true';

Grant access to the OS RDBMS user used to access the Disk Group

ALTER DISKGROUP DATAC1 ADD USER 'oracle';

From an ASM instance, set ownership permissions for every file that belongs solely to the PDB being snapshot cloned, as in the example below

alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/system.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/sysaux.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/users.xxx.xxxxxxx';
...
..

Restart the Test Master PDB in read-only mode

alter pluggable database PDBTESTMASTER close immediate instances=all;
alter pluggable database PDBTESTMASTER open read only;

Create the first PDB Snapshot Copy on Exadata SPARSE Disk Group

Create pluggable database PDBDEV01 from PDBTESTMASTER tempfile reuse create_file_dest='+SPARSEC1' snapshot copy;
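A quick way to verify the result, not part of the original procedure: open the snapshot PDB and check that its datafiles were created on the sparse disk group.

SQL> alter pluggable database PDBDEV01 open instances=all;
SQL> alter session set container=PDBDEV01;
SQL> select file_name from dba_data_files;   -- the files should reside on +SPARSEC1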

Feedback on the Exadata Storage Snapshots

The ability to create storage-efficient database copies in a few seconds, independently of the size of the Test Master, is very useful for today’s IT departments; but such extreme velocity and flexibility is not entirely free. In fact, performance tests on an I/O-bound workload have highlighted a significant performance degradation. This reminds us that, as defined by Oracle Corporation, the Snapshot Technology included in the Exadata Machine remains a non-production option.

RHEL 7.4 fails to mount ACFS File System due to KMOD package

After a fresh OS installation or an upgrade to RHEL 7.4, any attempt to install the ACFS drivers fails with the following message: “ACFS-9459 ADVM/ACFS is not supported on this OS version”.

The error persists even if the Oracle Grid Infrastructure software includes Patch 26247490: 12.2 ACFS MODULE ERRORS & CRASH DURING MODULE LOAD & UNLOAD WITH OL7U4 RHCK.

 

This problem has been identified by Oracle with BUG 26320387 – 7.4 kmod weak-modules not checking kABI compatibility correctly, and by Red Hat with Bugzilla bug 1477073 – 7.4 kmod weak-modules –dry-run changed output format missing ‘is compatible’ messages.

root@oel7node06:/u01/app/12.2.0.1/grid/crs/install# /u01/app/12.2.0.1/grid/bin/acfsroot install
ACFS-9459: ADVM/ACFS is not supported on this OS version: '3.10.0-514.6.1.el7.x86_64'

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleadvm 776830 7
oracleoks 654476 1 oracleadvm
oracleafd 205543 1

 

The current workaround consists in downgrading the kmod RPM to version kmod-20-9.el7.x86_64.

root@oel7node06:~# yum downgrade kmod-20-9.el7

 

After the package downgrade the ACFS drivers are correctly loaded:

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleacfs 4597925 2
oracleadvm 776830 8
oracleoks 654476 2 oracleacfs,oracleadvm
oracleafd 205543 1
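Two optional follow-up checks, shown here as a hedged sketch: the acfsdriverstate utility shipped in the Grid Infrastructure home confirms the driver status, and the yum versionlock plugin can prevent a later update from silently re-upgrading the kmod package (output omitted):

root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsdriverstate installed
root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsdriverstate loaded

root@oel7node06:~# yum install yum-plugin-versionlock
root@oel7node06:~# yum versionlock kmod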

 


 

 

 

Oracle DB stored on ASM vs ACFS

Nowadays a new Oracle database environment with Grid Infrastructure has three main storage options:

  1. Third party clustered file system
  2. ASM Disk Groups
  3. ACFS File System

While the first option was not in scope, this blog compares the results of the tests between ASM and ACFS, highlighting when to use one or the other to store 12c non-CDB or CDB databases.

The tests, conducted on different environments using Oracle version 12.1.0.2 July PSU, have shown controversial results compared to what Oracle is promoting for the Oracle Database Appliance (ODA) in the following paper: “Frequently Asked Questions Storing Database Files in ACFS on Oracle Database Appliance”.

 

Outcome of the tests

ASM remains the preferred option to achieve the best I/O performance, while ACFS introduces interesting features like DB snapshot to quickly and space efficiently provision new databases.

The performance gap between the two solutions is not negligible, as reported below by the AWR – Top Timed Events sections of two PDBs sharing the same infrastructure and executing the same workload, but respectively using ASM and ACFS storage:

  • PDBASM: Pluggable Database stored on ASM Disk Group
  • PDBACFS: Pluggable Database stored on ACFS File System

 

 

PDBASM AWR – TOP Timed Events and Other Stats

topevents_asm

fg_asm

 

 

PDBACFS AWR – TOP Timed Events and Other Stats

TopEvents_ACFS.png

fg_acfs

 

Due to the different characteristics and results when ASM or ACFS is in use, it is not possible to give a generic recommendation. Case by case, the choice should be driven by business needs like maximum performance versus fast and efficient database cloning.

 

 

 

 

The “Great” ODA overwhelming the Exadata

Introduction

This article tries to explain the technical reasons behind the success of the Oracle Database Appliance, a well-known appliance with which Oracle targets small and medium businesses, or specific departments of big companies looking for privacy and isolation from the rest of the IT. Nowadays this small and relatively cheap appliance (around 65’000$ list price) has evolved a lot: the storage has reached an important capacity of 128TB raw, expandable to 256TB, and the two X5-2 servers are the same used on the database nodes of the Exadata machine. Many customers, while defining a new database architecture, evaluate the pros and cons of acquiring an ODA compared to the smallest Exadata configuration (one eighth of a Rack). If the customer is not looking for a system with extreme performance and horizontal scalability beyond the two X5-2 servers, the Oracle Database Appliance is frequently the retained option.

Some of the ODA major features are:

  • High Availability: no single point of failure on all hardware and software components.
  • Performance: each server is equipped with 2×18-core Intel Xeon CPUs and 256GB of RAM, extensible up to 768GB, with cluster communication over InfiniBand. The shared storage offers a multi-tier configuration with HDDs at 7.2K rpm and two types of SSDs, for frequently accessed data and for database redo logs.
  • Flexibility & Scalability: running RAC, RAC One Node and Single Instance databases.
  • Virtualized configuration: designed for offering Solution-in-a-box, with highly available virtual machines.
  • Optimized licensing model: a pay-as-you-grow model activating an increasing number of CPU cores on demand with the Bare Metal configuration, or capping the resources by combining Oracle VM with the Hard Partitioning setup.
  • Time-to-market: no matter whether the ODA has to be installed bare metal or virtualized, this is a standardized and automated process generally completed in one or two days of work.
  • Price: the ODA is very competitive when comparing the cost to an equivalent commodity architecture, which in addition must be engineered, integrated and maintained by the customer.

 

At the time of writing this article, the latest hardware model is the ODA X5-2 and 12.1.2.6.0 is the software version. This HW and SW combination offers unique features, a few of them not even available on the Exadata machine, like the possibility to host databases and applications in one single box, or the possibility to rapidly and space-efficiently clone an 11gR2 or 12c database using ACFS snapshots.

 

 

ODA HW & SW Architecture

The Oracle Database Appliance is composed of two X5-2 servers and a shared storage shelf, which optionally can be doubled. Each server provides two 18-core Intel Xeon E5-2699 v3 CPUs, 256GB of RAM (optionally upgradable to 768GB) and two 600GB 10k rpm internal disks in RAID 1 for the OS and the software binaries.

This appliance is equipped with redundant networking connectivity up to 10Gb, redundant SAS HBAs and Storage I/O modules, and a redundant InfiniBand interconnect for cluster communication, enabling 40 Gb/second server-to-server communication.

The software components are all part of Oracle “Red Stack” with Oracle Linux 6 UEK or OVM 3, Grid Infrastructure 12c, Oracle RDBMS 12c & 11gR2 and Oracle Appliance Manager.

 

 

ODA Front view

Components 1 & 2 are the X5-2 servers. Components 3 & 4 are the Storage and the optional Storage extension.

ODA_Front

 

ODA Rear view

Highlight of the multiple redundant connections, including InfiniBand for Oracle Clusterware, ASM and RAC communications. No single point of HW or SW failure.

ODA_Back

 

 

Storage Organization

With 16x8TB SAS HDDs, a total raw space of 128TB is available on each storage shelf (64TB with ASM double mirroring and 42.7TB with ASM triple mirroring). To offer better I/O performance without exploding the price, Oracle has implemented the following SSD devices: 4x400GB ASM double-mirrored, for frequently accessed data, and 4x200GB ASM triple-mirrored, for database redo logs.

As shown in the picture aside, each rotating disk has two slices: the external, more performant partition assigned to the +DATA ASM disk group, and the internal one allocated to the +RECO ASM disk group.

 

ODA_Disk

This storage optimization allows the ODA to achieve competitive I/O performance. In a production-like environment, using the three types of disks as per the ODA database template odb-24 (https://docs.oracle.com/cd/E22693_01/doc.12/e55580/sizing.htm), Trivadis has measured 12k I/Os per second and a throughput of 2300 MB/s with an average latency of 10ms. As per the Oracle documentation, the maximum number of I/Os per second of the rotating disks with a single storage shelf is 3300; but this value increases significantly when relocating the hottest data files to the +FLASH disk group created on the SSD devices.

 

ACFS becomes the default database storage of ODA

Starting from the ODA software version 12.1.0.2, any fresh installation enforces ASM Cluster File System (ACFS) as the only supported type of database storage, restricting the supported database versions to 11.2.0.4 and greater. In case of an ODA upgrade from a previous release, the pre-existing databases are not automatically migrated to ACFS, but Oracle provides a tool called acfs_mig.pl for executing this mandatory step on all non-CDB databases of version >= 11.2.0.4.

Oracle has decided to promote ACFS as default database storage on ODA environment for the following reasons:

  • ACFS provides almost equivalent performance to Oracle ASM disk groups.
  • Additional functionalities on an industry-standard POSIX file system.
  • Database snapshot copies of PDBs and non-CDBs of version 11.2.0.4 or greater.
  • Advanced functionality for general-purpose files such as replication, tagging, encryption, security, and auditing.

Databases created on ACFS follow the same Oracle Managed Files (OMF) standard used by ASM.

As in the past, database provisioning requires the use of the command line interface oakcli and the selection of a database template, which defines several characteristics including the amount of space to allocate on each file system. Container and non-Container databases can coexist on the same Oracle Database Appliance.
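A minimal sketch of an oakcli-based provisioning, assuming a database name of TESTDB; the exact prompts and options (template, class, version) vary with the ODA software release:

# oakcli create database -db TESTDB
# oakcli show databases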

The ACFS file systems are created during the database provisioning process on top of the ASM disk groups +DATA, +RECO, +REDO, and optionally +FLASH. The file systems have two possible setups, depending on the database type, Container or Non-Container.

  • Container database: for each CDB the ODA database-provisioning job creates dedicated ACFS file systems with the following characteristics:
Disk Characteristics            ASM Disk group   ACFS Mount Point
SAS Disk external partition     +DATA            /u02/app/oracle/oradata/datc<db_unique_name>
SAS Disk internal partition     +RECO            /u01/app/oracle/fast_recovery_area/rcoc<db_unique_name>
SSD Triple-mirrored             +REDO            /u01/app/oracle/oradata/rdoc<db_unique_name>
SSD Double-mirrored             +FLASH (*)       /u02/app/oracle/oradata/flashdata

 

  • Non-Container database: in case of Non-CDB the ODA database-provisioning job creates or resizes the following shared ACFS file systems:
Disk Characteristics            ASM Disk group   ACFS Mount Point
SAS Disk external partition     +DATA            /u02/app/oracle/oradata/datastore
SAS Disk internal partition     +RECO            /u01/app/oracle/fast_recovery_area/datastore
SSD Triple-mirrored             +REDO            /u01/app/oracle/oradata/datastore
SSD Double-mirrored             +FLASH (*)       /u02/app/oracle/oradata/flashdata

(*) Optionally used by the databases as Smart Flash Cache (extension of the SGA buffer cache), or allocated to store the hottest data files leveraging the I/O performance of the SSD disks.

 

Oracle Database Appliance Bare Metal

The bare metal configuration has been available since version one of the appliance, and nowadays it remains the default option proposed by Oracle, which pre-installs Oracle Linux on any new system. It is very simple and intuitive to install thanks to the pre-built software bundle, which automates most of the steps. At the end of the installation, the architecture is very similar to any other two-node RAC setup based on commodity hardware; but even from an operational point of view there are great advantages, because the Oracle Appliance Manager framework simplifies and accelerates the execution of almost any system and database administration task.

The ODA architecture with the bare metal configuration in use is depicted below:

ODA_Bare_Metal

 

Oracle Database Appliance Virtualized

When the ODA is deployed with virtualization, both servers run Oracle VM Server, also called Dom0. Each Dom0 hosts, in a local dedicated repository, the ODA Base (or Dom Base), a privileged virtual machine where the Appliance Manager, Grid Infrastructure and RDBMS binaries are installed. The ODA Base takes advantage of the Xen PCI Pass-through technology to provide direct access to the ODA shared disks presented and managed by ASM. This configuration reduces the VM flexibility; in fact, no VM migration is allowed for the two ODA Base machines, but it guarantees almost no I/O penalty in terms of performance. With the Dom Base setup the basic installation is completed, and it is possible to start provisioning databases using Oracle Appliance Manager.

At the same time, the administrator can create new shared repositories hosted on ACFS and exported via NFS to the hypervisor for hosting the application virtual machines. Those application virtual machines are also identified by the name Domain U. The Domain U and the templates can be stored on a local or shared Oracle VM Server repository, but to enable the functionality to migrate between the two Oracle VM Servers a shared repository on the ACFS file system should be used.

Even when virtualization is in use, Oracle Appliance Manager is the only framework for system and database administration tasks like repository creation, import of templates, deployment of virtual machines, network configuration, database provisioning and so on, relieving the administrator from all complexity.

The implementation of the Solution-in-a-box guarantees the maximum Return on Investment of the ODA; in fact, while restricting the virtual CPUs to license on the Dom Base, it allows relocating spare resources to the application virtual machines, as shown in the picture below.

ODA_Virtualized

 

 

ODA compared to Exadata Machine and Commodity Hardware

As described in the previous sections, the Oracle Database Appliance offers unique features such as pay-as-you-grow, solution-in-a-box and so on, which can heavily influence the decision for a new database architecture. The aim of the table below is to list the main architecture characteristics to evaluate while defining a new database infrastructure, comparing the results between the Oracle Database Appliance, the Exadata Machine and a Commodity Architecture based on Intel Linux engineered to run RAC databases.

Table_Architectures

As shown by the different scores of the three architectures, each solution comes with points of strength and weakness; regarding the Oracle Database Appliance, it is evident that, due to its characteristics, the smallest Oracle Engineered System remains a great option for small and medium database environments.

 

Conclusion

I hope this article keeps the initial promise to explain the technical reasons behind the success of the Oracle Database Appliance, and that it has highlighted the great work done by Oracle, engineering this solution on the edge of the technology while keeping the price under control.

One last summary of what in my opinion are the major benefits offered by the ODA:

  • Time-to-market: thanks to automated processes and pre-built software images, the deployment phase is extremely rapid.
  • Simplicity: the use of standard software components, combined with the appliance orchestrator Oracle Appliance Manager, makes the ODA very simple to operate.
  • Standardization & Automation: the Appliance Manager encapsulates and automates all repeatable and error-prone tasks like provisioning, decommissioning, patching and so on.
  • Vendor certified platform: Oracle validates and certifies the compatibility among all HW & SW components.
  • Evolution: over time, the ODA benefits from specific bug fixes and software evolution (introduced by Oracle through the quarterly patch sets), keeping the system on the edge for a longer time when compared to a commodity architecture.

Troubleshooting an ACFS File System that does not mount

The ACFS file system /cloudfs was created and registered in the CRS; following a node reboot the file system was no longer mounting!

 

I logged on to ASMCMD and checked the status of the ASM volume:

[grid@rednodech07 ~]$ asmcmd
ASMCMD> volinfo -a
Diskgroup Name: FRA

 Volume Name: VOL_CLOUDFS
 Volume Device: ERROR
 State: DISABLED
 Size (MB): 20480
 Resize Unit (MB): 32
 Redundancy: MIRROR
 Stripe Columns: 4
 Stripe Width (K): 128
 Usage: ACFS
 Mountpath: /cloudfs

The output of the command above shows that the volume VOL_CLOUDFS is DISABLED. I tried to enable it manually, but I got the following error:

ASMCMD> volenable -a
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver (DBD ERROR: OCIStmtExecute)
ASMCMD>

 

Then I checked whether the ACFS kernel modules were loaded into the Linux kernel:

  • oracleacfs (oracleacfs.ko): manages all ACFS filesystem operations.
  • oracleavdm (oracleavdm.ko): AVDM module enabling direct interface with the filesystem.
  • oracleoks (oracleoks.ko): provides memory management, lock and cluster synchronization.
[root@rednodech07 ~]# /sbin/lsmod | grep oracle


Because the kernel modules were not loaded, I tried to load them manually with the command

/bin/acfsload start

But it didn’t work, so I stopped the Grid Infrastructure on the local node and reinstalled the ACFS drivers:

[root@rednodech07 asm]# /u01/GRID/11.2.0.4/bin/crsctl stop crs
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rednodech07'
CRS-2673: Attempting to stop 'ora.crsd' on 'rednodech07'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.cvu' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.GRID.dg' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.tvdtst.db' on 'rednodech07'
CRS-2677: Stop of 'ora.cvu' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.rednodech07.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'rednodech07'
CRS-2677: Stop of 'ora.tvdtst.db' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rednodech07'
CRS-2677: Stop of 'ora.DATA.dg' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.rednodech07.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan2.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.GRID.dg' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rednodech07'
CRS-2677: Stop of 'ora.asm' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rednodech07'
CRS-2677: Stop of 'ora.ons' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rednodech07'
CRS-2677: Stop of 'ora.net1.network' on 'rednodech07' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rednodech07' has completed
CRS-2677: Stop of 'ora.crsd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.evmd' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.asm' on 'rednodech07'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rednodech07'
CRS-2677: Stop of 'ora.crf' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rednodech07' succeeded
CRS-2677: Stop of 'ora.asm' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rednodech07'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rednodech07'
CRS-2677: Stop of 'ora.cssd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rednodech07'
CRS-2677: Stop of 'ora.gipcd' on 'rednodech07' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rednodech07'
CRS-2677: Stop of 'ora.gpnpd' on 'rednodech07' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rednodech07' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rednodech07 asm]#



[root@rednodech07 ~]# /u01/GRID/11.2.0.4/bin/acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9312: Existing ADVM/ACFS installation detected.
ACFS-9314: Removing previous ADVM/ACFS installation.
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.
[root@rednodech07 ~]#


At this point I restarted the Grid Infrastructure, the ACFS kernel modules got loaded and the ASM volume state became ENABLED.

[root@rednodech07 ~]# /u01/GRID/11.2.0.4/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.


[root@rednodech07 ~]# /sbin/lsmod | grep oracle
oracleacfs 1994567 0
oracleadvm 243254 0
oracleoks 460313 2 oracleacfs,oracleadvm

[grid@rednodech07 ~]$ asmcmd
ASMCMD> volinfo -a
Diskgroup Name: FRA

 Volume Name: VOL_CLOUDFS
 Volume Device: /dev/asm/vol_cloudfs-390
 State: ENABLED
 Size (MB): 20480
 Resize Unit (MB): 32
 Redundancy: MIRROR
 Stripe Columns: 4
 Stripe Width (K): 128
 Usage: ACFS
 Mountpath: /cloudfs

ASMCMD>

What remained was to mount the ACFS file system again with the following command:

[root@rednodech07 /]# /bin/mount -t acfs /dev/asm/vol_cloudfs-390 /cloudfs

[oracle@rednodech07 duplicate_tcswu]$ mount
/dev/mapper/vg_rednodech07-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sdbc1 on /boot type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_home on /home type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_tmp on /tmp type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_u01 on /u01 type ext4 (rw)
/dev/mapper/vg_rednodech07-lv_var on /var type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/asm/vol_cloudfs-390 on /cloudfs type acfs (rw)


ASM Storage Reclamation Utility (ASRU) for HP 3PAR Thin Provisioning

 

The ASM Storage Reclamation Utility (ASRU) reclaims storage from an ASM disk group that was previously allocated but is no longer in use, for example after decommissioning a database. This Perl script writes blocks of zeros where space is currently unallocated; the blocks of zeros are interpreted by the 3PAR Storage Server as physical space to reclaim.

The execution of the ASRU script consists of three sequential phases:

  1. Compaction: the disks are logically resized, keeping 25% of free space for future needs and without affecting the physical size of the disks. This operation triggers an ASM disk group rebalance which compacts the data at the beginning of the disks.
  2. Deallocation: this phase writes blocks of zeros above the current data High Water Mark; those blocks of zeros are interpreted by the storage as space available for reclaiming.
  3. Expansion: here the utility resizes the logical disks to their original size; because the data remains untouched, no ASM rebalance operation is required.

 

How to use ASRU

ASM Disk Groups

 

ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED NORMAL N 512 4096 4194304 3071904 1220008 511984 354012 0 N DATA/
MOUNTED NORMAL N 512 4096 4194304 7167776 3631252 511984 1559634 0 N FRA/
MOUNTED HIGH N 512 4096 1048576 41886 40621 20448 6405 0 Y OCRVOTING/
ASMCMD>

——————————————————————
Invoke the ASRU utility with the Grid Infrastructure owner
——————————————————————

[grid@xxxxxxxx space_reclaim]$ bash ASRU DATA
Checking the system ...done
Calculating the sizes of the disks ...done
Writing the data to a file ...done
Resizing the disks...done
Calculating the sizes of the disks ...done

/u01/GRID/11.2.0.4/perl/bin/perl -I /u01/GRID/11.2.0.4/perl/lib/5.10.0 /cloudfs/space_reclaim/zerofill 7 /dev/mapper/asm500GB_360002ac0000000000000000c0000964bp1 385789 511984 /dev/mapper/asm500GB_360002ac000000000000000150000964cp1 385841 511984 /dev/mapper/asm500GB_360002ac000000000000000160000964cp1 385813 511984 /dev/mapper/asm500GB_360002ac000000000000000110000964bp1 385869 511984 /dev/mapper/asm500GB_360002ac000000000000000120000964bp1 385789 511984 /dev/mapper/asm500GB_360002ac000000000000000140000964cp1 385789 511984
126171+0 records in
126171+0 records out
132299882496 bytes (132 GB) copied, 519.831 s, 255 MB/s
126195+0 records in
126195+0 records out
132325048320 bytes (132 GB) copied, 519.927 s, 255 MB/s
126195+0 records in
126195+0 records out
132325048320 bytes (132 GB) copied, 520.045 s, 254 MB/s
126143+0 records in
126143+0 records out
132270522368 bytes (132 GB) copied, 520.064 s, 254 MB/s
126115+0 records in
126115+0 records out
132241162240 bytes (132 GB) copied, 520.076 s, 254 MB/s
126195+0 records in
126195+0 records out
132325048320 bytes (132 GB) copied, 520.174 s, 254 MB/s

Calculating the sizes of the disks ...done
Resizing the disks...done
Calculating the sizes of the disks ...done
Dropping the file ...done

 

The second phase of the script, called Deallocation, uses dd to reset to zero the blocks beyond the HWM. One dd process per ASM disk is started:

[grid@xxxxxxxx space_reclaim]$ top
top - 10:13:02 up 44 days, 16:16, 4 users, load average: 16.63, 16.45, 13.75
Tasks: 732 total, 6 running, 726 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8%us, 13.8%sy, 0.0%ni, 37.1%id, 43.9%wa, 0.0%hi, 2.4%si, 0.0%st
Mem: 131998748k total, 131419200k used, 579548k free, 42266420k buffers
Swap: 16777212k total, 0k used, 16777212k free, 3394532k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 101 root 20 0 0 0 0 R 39.4 0.0 8:38.60 kswapd0
20332 grid 20 0 103m 1564 572 R 19.5 0.0 1:46.35 dd
20333 grid 20 0 103m 1568 572 D 18.2 0.0 1:44.93 dd
20325 grid 20 0 103m 1568 572 D 17.2 0.0 1:44.53 dd
20324 grid 20 0 103m 1568 572 R 15.6 0.0 1:20.63 dd
20328 grid 20 0 103m 1564 572 R 15.2 0.0 1:21.55 dd
20331 grid 20 0 103m 1568 572 D 14.6 0.0 1:21.42 dd
26113 oracle 20 0 60.2g 32m 26m S 14.6 0.0 0:00.75 oracle
20335 root 20 0 0 0 0 D 14.2 0.0 1:18.94 flush-252:24
20322 grid 20 0 103m 1568 572 D 13.9 0.0 1:21.51 dd
20342 root 20 0 0 0 0 D 13.2 0.0 1:16.61 flush-252:25
20338 root 20 0 0 0 0 R 12.9 0.0 1:17.42 flush-252:30
20336 root 20 0 0 0 0 D 10.9 0.0 1:00.66 flush-252:55
20339 root 20 0 0 0 0 D 10.9 0.0 0:57.79 flush-252:50
20340 root 20 0 0 0 0 D 10.3 0.0 0:58.42 flush-252:54
20337 root 20 0 0 0 0 D 9.6 0.0 0:58.24 flush-252:60
24409 root RT 0 889m 96m 57m S 5.3 0.1 2570:35 osysmond.bin
24861 root 0 -20 0 0 0 S 1.7 0.0 41:31.95 kworker/1:1H
21086 root 0 -20 0 0 0 S 1.3 0.0 36:24.40 kworker/7

[grid@xxxxxxxxxx~]$ ps -ef|grep 20332
grid 20332 20326 17 10:02 pts/0 00:01:16 /bin/dd if=/dev/zero of=/dev/mapper/asm500GB_360002ac000000000000000110000964cp1 seek=315461 bs=1024k count=196523

[grid@xxxxxxxxxx ~]$ ps -ef|grep 20325
grid 20325 20319 17 10:02 pts/0 00:01:35 /bin/dd if=/dev/zero of=/dev/mapper/asm500GB_360002ac0000000000000000d0000964cp1 seek=315309 bs=1024k count=196675


 

——————————————————————
ASM I/O Statistics during the disk group rebalance
——————————————————————

ASMCMD> lsop
Group_Name Dsk_Num State Power EST_WORK EST_RATE EST_TIME
DATA REBAL WAIT 7
ASMCMD>
ASMCMD> iostat -et 5
Group_Name Dsk_Name Reads Writes Read_Err Write_Err Read_Time Write_Time
DATA S1_DATA01_FG1 23030185984 2082245521408 0 0 629.202365 561627.214525
DATA S1_DATA02_FG1 9678848 2002875955200 0 0 141.271598 556226.65866
DATA S1_DATA03_FG1 101520732160 2016216610304 0 0 3024.887841 561404.578818
DATA S2_DATA01_FG1 819643435008 2062069520896 0 0 50319.400536 563116.826573
DATA S2_DATA02_FG1 1126678040576 2045156313600 0 0 56108.943316 555738.806255
DATA S2_DATA03_FG1 947842624000 1994103517696 0 0 51845.856561 545466.151177
FRA S1_FRA01_FG1 9695232 305258886144 0 0 251.129038 5234.922326
FRA S1_FRA02_FG1 9691136 324037302272 0 0 234.499119 5478.064898
FRA S1_FRA03_FG1 9674752 287679095808 0 0 237.140794 4322.92991
FRA S1_FRA04_FG1 9678848 279486220800 0 0 563.687636 3845.515979
FRA S1_FRA05_FG1 9687040 287006669312 0 0 236.97403 4162.291019
FRA S1_FRA06_FG1 9695232 305493610496 0 0 260.062194 4776.712435
FRA S1_FRA07_FG1 9691648 286196798976 0 0 236.804526 14257.967546
FRA S2_FRA01_FG1 28695552 282395977216 0 0 565.469092 3874.206606
FRA S2_FRA02_FG1 63110656 290152312832 0 0 622.124042 14264.906378
FRA S2_FRA03_FG1 10750508032 318696439808 0 0 214.440821 5200.272304
FRA S2_FRA04_FG1 102140928 311658688512 0 0 624.488925 5098.68159
FRA S2_FRA05_FG1 55187456 298768577536 0 0 587.286013 4398.231978
FRA S2_FRA06_FG1 33064960 289082719232 0 0 21.587277 4597.368455
FRA S2_FRA07_FG1 28070912 284403925504 0 0 568.334218 4320.709945
OCRVOTING S1_OCRVOTING01_FG1 9666560 4096 0 0 292.504971 .000388
OCRVOTING S1_OCRVOTING02_FG2 9674752 0 0 0 14.6555 0
OCRVOTING S2_OCRVOTING01_FG1 10866688 4096 0 0 99.140306 .000388
OCRVOTING S2_OCRVOTING02_FG2 9695232 4096 0 0 110.684821 .000388
OCRVOTING S3_OCRVOTING01_FG1 9666560 0 0 0 73.171492 0


Group_Name Dsk_Name Reads Writes Read_Err Write_Err Read_Time Write_Time
DATA S1_DATA01_FG1 1329561.60 51507.20 0.00 0.00 0.13 0.01
DATA S1_DATA02_FG1 773324.80 417792.00 0.00 0.00 0.14 0.03
DATA S1_DATA03_FG1 1255014.40 11468.80 0.00 0.00 0.18 0.00
DATA S2_DATA01_FG1 0.00 5734.40 0.00 0.00 0.00 0.00
DATA S2_DATA02_FG1 32768.00 30208.00 0.00 0.00 0.00 0.02
DATA S2_DATA03_FG1 0.00 416972.80 0.00 0.00 0.00 0.01
FRA S1_FRA01_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S1_FRA02_FG1 3276.80 10649.60 0.00 0.00 0.00 0.00
FRA S1_FRA03_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S1_FRA04_FG1 0.00 3276.80 0.00 0.00 0.00 0.00
FRA S1_FRA05_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S1_FRA06_FG1 0.00 3276.80 0.00 0.00 0.00 0.00
FRA S1_FRA07_FG1 0.00 4812.80 0.00 0.00 0.00 0.00
FRA S2_FRA01_FG1 0.00 819.20 0.00 0.00 0.00 0.00
FRA S2_FRA02_FG1 0.00 3276.80 0.00 0.00 0.00 0.00
FRA S2_FRA03_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S2_FRA04_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S2_FRA05_FG1 0.00 3276.80 0.00 0.00 0.00 0.00
FRA S2_FRA06_FG1 0.00 4812.80 0.00 0.00 0.00 0.00
FRA S2_FRA07_FG1 0.00 3276.80 0.00 0.00 0.00 0.00
OCRVOTING S1_OCRVOTING01_FG1 0.00 819.20 0.00 0.00 0.00 0.60
OCRVOTING S1_OCRVOTING02_FG2 0.00 819.20 0.00 0.00 0.00 0.60
OCRVOTING S2_OCRVOTING01_FG1 0.00 819.20 0.00 0.00 0.00 0.60
OCRVOTING S2_OCRVOTING02_FG2 0.00 819.20 0.00 0.00 0.00 0.60
OCRVOTING S3_OCRVOTING01_FG1 0.00 819.20 0.00 0.00 0.00 0.0


Group_Name Dsk_Name Reads Writes Read_Err Write_Err Read_Time Write_Time
DATA S1_DATA01_FG1 77004.80 248217.60 0.00 0.00 0.01 0.01
DATA S1_DATA02_FG1 6553.60 819.20 0.00 0.00 0.01 0.60
DATA S1_DATA03_FG1 83558.40 11468.80 0.00 0.00 0.01 0.00
DATA S2_DATA01_FG1 0.00 235110.40 0.00 0.00 0.00 0.01
DATA S2_DATA02_FG1 36044.80 17203.20 0.00 0.00 0.00 0.60
DATA S2_DATA03_FG1 0.00 8192.00 0.00 0.00 0.00 0.00
FRA S1_FRA01_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S1_FRA02_FG1 3276.80 11468.80 0.00 0.00 0.00 0.01
FRA S1_FRA03_FG1 0.00 233472.00 0.00 0.00 0.00 0.01
FRA S1_FRA04_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S1_FRA05_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S1_FRA06_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S1_FRA07_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S2_FRA01_FG1 0.00 1638.40 0.00 0.00 0.00 0.01
FRA S2_FRA02_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S2_FRA03_FG1 0.00 9830.40 0.00 0.00 0.00 0.00
FRA S2_FRA04_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S2_FRA05_FG1 0.00 6553.60 0.00 0.00 0.00 0.00
FRA S2_FRA06_FG1 0.00 0.00 0.00 0.00 0.00 0.00
FRA S2_FRA07_FG1 0.00 233472.00 0.00 0.00 0.00 0.01
OCRVOTING S1_OCRVOTING01_FG1 0.00 1638.40 0.00 0.00 0.00 1.20
OCRVOTING S1_OCRVOTING02_FG2 0.00 1638.40 0.00 0.00 0.00 1.20
OCRVOTING S2_OCRVOTING01_FG1 0.00 1638.40 0.00 0.00 0.00 1.20
OCRVOTING S2_OCRVOTING02_FG2 0.00 1638.40 0.00 0.00 0.00 1.20
OCRVOTING S3_OCRVOTING01_FG1 0.00 1638.40 0.00 0.00 0.00 0.01

——————————————————————
ASM Alert Log produced during the execution of the ASRU utility
——————————————————————

Mon Apr 04 09:11:39 2016
SQL> ALTER DISKGROUP DATA RESIZE DISK S2_DATA03_FG1 SIZE 385840M DISK S1_DATA01_FG1 SIZE 385788M DISK S2_DATA02_FG1 SIZE 385812M DISK S1_DATA02_FG1 SIZE 385868M DISK S2_DATA01_FG1 SIZE 385788M DISK S1_DATA03_FG1 SIZE 385788M REBALANCE WAIT/* ASRU */
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Apr 04 09:12:11 2016
NOTE: membership refresh pending for group 1/0x48695261 (DATA)
Mon Apr 04 09:12:12 2016
GMON querying group 1 at 10 for pid 18, osid 25195
SUCCESS: refreshed membership for 1/0x48695261 (DATA)
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
NOTE: starting rebalance of group 1/0x48695261 (DATA) at power 7
Starting background process ARB0
Mon Apr 04 09:12:15 2016
ARB0 started with pid=41, OS id=46711
NOTE: assigning ARB0 to group 1/0x48695261 (DATA) with 7 parallel I/Os
cellip.ora not found.
Mon Apr 04 09:13:38 2016
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x48695261 (DATA)
Mon Apr 04 09:13:39 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Mon Apr 04 09:13:42 2016
GMON updating for reconfiguration, group 1 at 11 for pid 41, osid 47334
NOTE: group 1 PST updated.
SUCCESS: disk S1_DATA01_FG1 resized to 96447 AUs
SUCCESS: disk S1_DATA02_FG1 resized to 96467 AUs
SUCCESS: disk S2_DATA01_FG1 resized to 96447 AUs
SUCCESS: disk S2_DATA02_FG1 resized to 96453 AUs
SUCCESS: disk S2_DATA03_FG1 resized to 96460 AUs
SUCCESS: disk S1_DATA03_FG1 resized to 96447 AUs
NOTE: resizing header on grp 1 disk S1_DATA01_FG1
NOTE: resizing header on grp 1 disk S1_DATA02_FG1
NOTE: resizing header on grp 1 disk S2_DATA01_FG1
NOTE: resizing header on grp 1 disk S2_DATA02_FG1
NOTE: resizing header on grp 1 disk S2_DATA03_FG1
NOTE: resizing header on grp 1 disk S1_DATA03_FG1
NOTE: membership refresh pending for group 1/0x48695261 (DATA)
GMON querying group 1 at 12 for pid 18, osid 25195
SUCCESS: refreshed membership for 1/0x48695261 (DATA)
Mon Apr 04 09:13:48 2016
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Mon Apr 04 09:13:49 2016
SUCCESS: ALTER DISKGROUP DATA RESIZE DISK S2_DATA03_FG1 SIZE 385840M DISK S1_DATA01_FG1 SIZE 385788M DISK S2_DATA02_FG1 SIZE 385812M DISK S1_DATA02_FG1 SIZE 385868M DISK S2_DATA01_FG1 SIZE 385788M DISK S1_DATA03_FG1 SIZE 385788M REBALANCE WAIT/* ASRU */
Mon Apr 04 09:22:42 2016
SQL> ALTER DISKGROUP DATA RESIZE DISK S2_DATA03_FG1 SIZE 511984M DISK S1_DATA01_FG1 SIZE 511984M DISK S2_DATA02_FG1 SIZE 511984M DISK S1_DATA02_FG1 SIZE 511984M DISK S2_DATA01_FG1 SIZE 511984M DISK S1_DATA03_FG1 SIZE 511984M REBALANCE WAIT/* ASRU */
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
NOTE: requesting all-instance disk validation for group=1
Mon Apr 04 09:22:46 2016
NOTE: disk validation pending for group 1/0x48695261 (DATA)
SUCCESS: validated disks for 1/0x48695261 (DATA)
Mon Apr 04 09:23:24 2016
NOTE: increased size in header on grp 1 disk S1_DATA01_FG1
NOTE: increased size in header on grp 1 disk S1_DATA02_FG1
NOTE: increased size in header on grp 1 disk S2_DATA01_FG1
NOTE: increased size in header on grp 1 disk S2_DATA02_FG1
NOTE: increased size in header on grp 1 disk S2_DATA03_FG1
NOTE: increased size in header on grp 1 disk S1_DATA03_FG1
Mon Apr 04 09:23:24 2016
NOTE: membership refresh pending for group 1/0x48695261 (DATA)
Mon Apr 04 09:23:26 2016
GMON querying group 1 at 13 for pid 18, osid 25195
SUCCESS: refreshed membership for 1/0x48695261 (DATA)
NOTE: starting rebalance of group 1/0x48695261 (DATA) at power 7
Starting background process ARB0
Mon Apr 04 09:23:26 2016
ARB0 started with pid=38, OS id=53105
NOTE: assigning ARB0 to group 1/0x48695261 (DATA) with 7 parallel I/Os
cellip.ora not found.
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Mon Apr 04 09:23:37 2016
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x48695261 (DATA)
Mon Apr 04 09:23:38 2016
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
NOTE: membership refresh pending for group 1/0x48695261 (DATA)
Mon Apr 04 09:23:44 2016
GMON querying group 1 at 14 for pid 18, osid 25195
SUCCESS: refreshed membership for 1/0x48695261 (DATA)
Mon Apr 04 09:23:47 2016
NOTE: Attempting voting file refresh on diskgroup DATA
NOTE: Refresh completed on diskgroup DATA. No voting file found.
Mon Apr 04 09:23:48 2016
SUCCESS: ALTER DISKGROUP DATA RESIZE DISK S2_DATA03_FG1 SIZE 511984M DISK S1_DATA01_FG1 SIZE 511984M DISK S2_DATA02_FG1 SIZE 511984M DISK S1_DATA02_FG1 SIZE 511984M DISK S2_DATA01_FG1 SIZE 511984M DISK S1_DATA03_FG1 SIZE 511984M REBALANCE WAIT/* ASRU */
Mon Apr 04 09:23:50 2016
SQL> /* ASRU */alter diskgroup DATA drop file '+DATA/tpfile'
SUCCESS: /* ASRU */alter diskgroup DATA drop file '+DATA/tpfile'



Once the ASRU utility has completed, the Storage Administrator should invoke the Space Compact from the 3PAR console.


ASM 12c

A powerful framework for storage management

 

1 INTRODUCTION

Oracle Automatic Storage Management (ASM) is a well-known, widely used multi-platform volume manager and file system, designed for single-instance and clustered environments. Developed for managing Oracle database files with optimal performance and native data protection while simplifying the storage management, nowadays ASM includes several functionalities for general-purpose files too.
This article focuses on the architecture and characteristics of the version 12c, where great changes and enhancements of pre-existing capabilities have been introduced by Oracle.
Dedicated sections explaining how Oracle has leveraged ASM within the Oracle Engineered Systems complete the paper.

 

1.1 ASM 12c Instance Architecture Diagram

Highlighted below are the functionalities and the main background components associated with an ASM instance. It is important to notice how, starting from Oracle 12c, a database can run within ASM Disk Groups or on top of ASM Cluster File Systems (ACFS).

 

ASM_db

 

Overview of the ASM options available in Oracle 12c.

ACFS

 

1.2       ASM 12c Multi-Nodes Architecture Diagram

In a Multi-node cluster environment, ASM 12c is now available in two configurations:

  • 11gR2 like: with one ASM instance on each Grid Infrastructure node.
  • Flex ASM: a new concept which leverages the availability and performance of the cluster architecture, removing the 1:1 hard dependency between a cluster node and its local ASM instance. With Flex ASM only a few nodes of the cluster run an ASM instance (the default cardinality is 3), and the database instances communicate with ASM in two possible ways: locally or over the ASM network. In case of failure of one ASM instance, the databases automatically and transparently reconnect to another surviving instance in the cluster. This major architectural change required the introduction of two new cluster resources: the ASM-Listener, supporting remote client connections, and the ADVM-Proxy, which permits access to the ACFS layer. In case of a wide cluster installation, Flex ASM enhances the performance and the scalability of the Grid Infrastructure, reducing the amount of network traffic generated between ASM instances. A quick way to inspect the Flex ASM configuration is sketched below.
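A minimal sketch, assuming a 12c cluster where Flex ASM is enabled and the grid user environment is set; the commands below only display or adjust the ASM cardinality and are not part of the original article:

$ asmcmd showclustermode                 # reports whether Flex ASM is enabled
$ srvctl status asm -detail              # shows where the ASM instances are running
$ srvctl config asm                      # shows the ASM listener and the instance count
$ srvctl modify asm -count 3             # adjusts the Flex ASM cardinality (example value)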

 

Below are two graphical representations of the same Oracle cluster; in the first drawing ASM is configured with the pre-12c setup, in the second one Flex ASM is in use.

ASM architecture 11gR2 like

01_NO_FlexASM_Drawing

 

 

Flex ASM architecture

01_FlexASM_Drawing

 

 

2  ASM 12c NEW FEATURES

The list below summarizes the new functionalities introduced in ASM 12c R1.

  • Filter Driver: Oracle ASM Filter Driver (Oracle ASMFD) is a kernel module that resides in the I/O path of the Oracle ASM disks, used to validate write I/O requests to Oracle ASM disks and to eliminate accidental overwrites that would cause corruption. For example, the Oracle ASM Filter Driver filters out all non-Oracle I/Os which could cause accidental overwrites.
  • General ASM Enhancements: Oracle ASM now replicates physically addressed metadata, such as the disk header and allocation tables, within each disk, offering better protection against bad block disk sectors and external corruptions. Increased storage limits: ASM can manage up to 511 disk groups and a maximum disk size of 32 PB. New REPLACE clause on the ALTER DISKGROUP statement.
  • Disk Scrubbing: checks for logical data corruptions and repairs them automatically in normal and high redundancy disk groups. This process starts automatically during rebalance operations, or the administrator can trigger it (a command sketch follows this list).
  • Disk Resync Enhancements: enable fast recovery from instance failure and faster resync performance. Multiple disks can be brought online simultaneously. Checkpoint functionality makes it possible to resume from the point where the process was interrupted.
  • Even Read for Disk Groups: if ASM mirroring is in use, each I/O request submitted to the system can be satisfied by more than one disk. With this feature, each read request is sent to the least loaded of the possible source disks.
  • ASM Rebalance Enhancements: the rebalance operation has been improved in terms of scalability, performance and reliability, supporting concurrent operations on multiple disk groups in a single instance. In this version the support for thin provisioning, user-data validation and error handling has also been enhanced.
  • ASM Password File in a Disk Group: the ASM password file is now stored within an ASM disk group.
  • Access Control Enhancements on Windows: it is now possible to use access control to separate roles in Windows environments. With Oracle Database services running as users rather than Local System, the Oracle ASM access control feature is enabled to support role separation on Windows.
  • Rolling Migration Framework for ASM One-off Patches: this feature enhances the rolling migration framework to apply one-off patches released for ASM in a rolling manner, without affecting the overall availability of the cluster or the database.
  • Updated Key Management Framework: this feature updates the Oracle key management commands to unify the key management application programming interface (API) layer. The updated key management framework makes interacting with keys in the wallet easier and adds new key metadata that describes how the keys are being used.
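As referenced in the Disk Scrubbing entry above, a minimal sketch of the scrubbing commands, assuming a normal or high redundancy disk group called DATA (the file name is a placeholder):

-- Scrub the whole disk group at low priority
SQL> ALTER DISKGROUP DATA SCRUB POWER LOW;

-- Scrub and repair a single ASM file (placeholder file name)
SQL> ALTER DISKGROUP DATA SCRUB FILE '+DATA/<db_name>/DATAFILE/<file_name>' REPAIR POWER HIGH;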

 

 

2.1 ASM 12c Client Cluster

One more ASM functionality, explored but still in a development phase and therefore not really documented by Oracle, is the ASM Client Cluster.

It is designed to host applications requiring cluster functionalities (monitoring, restart and failover capabilities), without the need to provision local shared storage.

The ASM Client Cluster installation is available as a configuration option of the Grid Infrastructure binaries, starting from version 12.1.0.2.1 with the Oct. 2014 GI PSU.

The use of ASM Client Cluster imposes the following prerequisites and limitations:

  • An ASM Server Cluster version 12.1.0.2.1 with the Oct. 2014 GI PSU must exist, configured with the GNS server, with or without zone delegation.
  • The ASM Server Cluster becomes aware of the ASM Client Cluster by importing an ad hoc XML configuration containing all its details.
  • The ASM Client Cluster uses the OCR, Voting Files and Password File of the ASM Server Cluster.
  • The ASM Client Cluster communicates with the ASM Server Cluster over the ASM network.
  • The ASM Server Cluster provides remote shared storage to the ASM Client Cluster.

 

As already mentioned, at the time of writing this feature is still under development and no official documentation is available; the only possible observation is that the ASM Client Cluster looks similar to another option introduced in Oracle 12c, called Flex Cluster. Flex Cluster has the concept of HUB and LEAF nodes: the former run database workloads with direct access to the ASM disks, while the latter host applications in HA configuration but without direct access to the ASM disks.


3  ACFS NEW FEATURES

In Oracle 12c the Automatic Storage Management Cluster File System supports more and more types of files, offering advanced functionalities like snapshots, replication, encryption, ACLs and tagging. It is also important to highlight that this cluster file system complies with the POSIX standards of Linux/UNIX and with the Windows standards.

Access to ACFS from outside the Grid Infrastructure cluster is granted via the NFS protocol; the NFS export can be registered as a clusterware resource, becoming available from any of the cluster nodes (HANFS), as sketched below.
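
A minimal sketch of how such an HANFS export could be registered, assuming an existing ACFS file system mounted under /acfs/reports; the VIP name, export name, address and options are illustrative only:

-- Register a highly available VIP dedicated to the NFS exports
$ srvctl add havip -id hrvip -address havip01.example.com

-- Register the ACFS directory as an export attached to that VIP
$ srvctl add exportfs -name reports_exp -id hrvip -path /acfs/reports -options "rw,no_root_squash"

-- Start the VIP; the export is served by whichever node currently hosts it
$ srvctl start havip -id hrvip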

Here is an exhaustive list of files supported by ACFS: executables, trace files, logs, application reports, BFILEs, configuration files, video, audio, text, images, engineering drawings, general-purpose and Oracle database files.

The major change introduced in this version of ACFS is definitely the capability to host Oracle database files, granting access to a set of functionalities that in the past were restricted to general-purpose files only. Among them, the most important is the snapshot, which has been fully integrated with the database Multitenant architecture, allowing entire Pluggable Databases to be cloned in a few seconds, independently of their size and in a space-efficient way using copy-on-write technology.

The snapshots are created and immediately available in the "<FS_mount_point>/.ACFS/snaps" directory; they can be generated as read-only or read/write and later converted from one mode to the other. In addition, ACFS supports nested snapshots (see the sketch after the example below).

 

Example of ACFS snapshot copy:

-- Create a read/write Snapshot copy
[grid@oel6srv02 bin]$ acfsutil snap create -w cloudfs_snap /cloudfs

-- Display Snapshot Info
[grid@oel6srv02 ~]$ acfsutil snap info cloudfs_snap /cloudfs
snapshot name:               cloudfs_snap
RO snapshot or RW snapshot:  RW
parent name:                 /cloudfs
snapshot creation time:      Wed May 27 16:54:53 2015

-- Display specific file info 
[grid@oel6srv02 ~]$ acfsutil info file /cloudfs/scripts/utl_env/NEW_SESSION.SQL
/cloudfs/scripts/utl_env/NEW_SESSION.SQL
flags:        File
inode:        42
owner:        oracle
group:        oinstall
size:         684
allocated:    4096
hardlinks:    1
device index: 1
major, minor: 251,91137
access time:  Wed May 27 10:34:18 2013
modify time:  Wed May 27 10:34:18 2013
change time:  Wed May 27 10:34:18 2013
extents:
-offset ----length | -dev --------offset
0       4096 |    1     1496457216
extent count: 1

--Convert the snapshot from Read/Write to Read-only
acfsutil snap convert -r cloudfs_snap /cloudfs

 --Drop the snapshot 
[grid@oel6srv02 ~]$ acfsutil snap delete cloudfs_snap /cloudfs
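
ACFS also supports nested snapshots (snapshots of an existing snapshot); a minimal sketch, assuming the parent snapshot cloudfs_snap created above and that the -p option is available in this ACFS release:

-- Create a read/write snapshot of an existing snapshot
[grid@oel6srv02 ~]$ acfsutil snap create -w -p cloudfs_snap cloudfs_snap_child /cloudfs

-- Summary information about all snapshots of the file system
[grid@oel6srv02 ~]$ acfsutil snap info /cloudfs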

Example of a Pluggable Database cloned using an ACFS snapshot copy. The following requirements must be met in order to use the SNAPSHOT COPY clause:

      • All pluggable database files of the source PDB must be stored on ACFS.
      • The source PDB cannot be in a remote CDB.
      • The source PDB must be opened in read-only mode (see the preparation sketch below).
      • Dropping the parent PDB with the INCLUDING DATAFILES clause does not automatically remove the snapshot dependencies; manual intervention is required.
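
Before running the clone, the source PDB has to be switched to read-only mode, as listed in the requirements above; a minimal preparation sketch using the PDB name of the example below:

SQL> ALTER PLUGGABLE DATABASE ppq01 CLOSE IMMEDIATE;
SQL> ALTER PLUGGABLE DATABASE ppq01 OPEN READ ONLY;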

SQL> CREATE PLUGGABLE DATABASE pt02 FROM ppq01
2  FILE_NAME_CONVERT = ('/u02/oradata/CDB4/PPQ01/',
3                       '/u02/oradata/CDB4/PT02/')
4  SNAPSHOT COPY;
Pluggable database created.
Elapsed: 00:00:13.70

The PDB snapshot copy imposes a few restrictions, among which the source PDB must be open read-only. This requirement prevents its adoption in most production environments, where the database must remain available in read/write mode 24x7. For this reason, ACFS for database files is particularly recommended for test and development environments, where flexibility, speed and space efficiency of the clones are key factors for a highly productive environment.

Graphical representation of how to efficiently create and maintain a Test & Development database environment:

DB_Snapshot


4 ASM 12c and ORACLE ENGINEERED SYSTEMS

Oracle has developed a few ASM features to leverage the characteristics of the Engineered Systems. Analyzing the architecture of the Exadata storage, we see how the unique capabilities of ASM make it possible to stripe and mirror data across independent sets of disks grouped in different Storage Cells.

The sections below describe the implementation of ASM on the Oracle Database Appliance (ODA) and Exadata systems.


4.1 ASM 12c on Oracle Database Appliance

Oracle Database Appliance is a simple, reliable and affordable system engineered for running database workloads. One of its key characteristics, present since the first version, is the pay-as-you-grow model, which permits activating a growing number of CPU cores when needed, optimizing the licensing cost. With the new version of the ODA software bundle, Oracle has introduced the Solution-in-a-box configuration, which includes a virtualization layer for hosting Oracle databases and application components on the same appliance, but on separate virtual machines. The next sections highlight how the two configurations are architected and the role played by ASM:

  • ODA Bare metal: available since version one of the appliance, this is still the default configuration proposed by Oracle. Beyond the automated installation process, it is like any other two-node cluster, with all ASM and ACFS features available.

 

ODA_Bare_Metal

 

  • ODA Virtualized: on both ODA servers runs the Oracle VM Server software, also called Dom0. Each Dom0 hosts the ODA Base (or Dom Base), a privileged virtual machine where the Appliance Manager, Grid Infrastructure and RDBMS binaries are installed. The ODA Base takes advantage of the Xen PCI pass-through technology to gain direct access to the ODA shared disks presented and managed by ASM. This configuration reduces the VM flexibility (no VM migration is allowed), but it guarantees almost no I/O penalty in terms of performance. After the Dom Base creation, it is possible to add virtual machines where application components run. Those optional application virtual machines are also identified with the name Domain U.

By default, all VMs and templates are stored on a local Oracle VM Server repository, but in order to be able to migrate application virtual machines between the two Oracle VM Servers, a shared repository on an ACFS file system should be created, as sketched below.
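
As a hedged sketch (the exact oakcli syntax may vary with the ODA software release), such a shared repository could be created from the ODA Base with something like:

-- Create a shared repository on the DATA disk group (name and size are examples)
# oakcli create repo sharedrepo1 -dg DATA -size 100

-- Verify the repository and its mount point
# oakcli show repo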

The implementation of the Solution-in-a-box guarantees the maximum return on investment of the ODA, because only the virtual CPUs allocated to the Dom Base have to be licensed for the database, while the remaining resources are assigned to the application components, as shown in the picture below.

ODA_Virtualized


4.2 ACFS Becomes the default database storage of ODA

Starting from version 12.1.0.2, a fresh installation of the Oracle Database Appliance adopts ACFS as the primary cluster file system to store database files and general-purpose data. Three file systems are created in the ASM disk groups (DATA, RECO, and REDO) and new databases are stored in these three ACFS file systems instead of directly in the ASM disk groups.

In case of an ODA upgrade from a previous release to 12.1.0.2, pre-existing databases are not automatically migrated to ACFS, but they can coexist with the new databases created on ACFS.

At any time, the databases can be migrated from ASM to ACFS as a post-upgrade step; one possible approach is sketched below.
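
As an illustration only (not an ODA-specific procedure), one possible way to move an existing database from ASM to an ACFS location uses RMAN image copies; the target path is an example, and redo logs, temp files, control files and the spfile must be relocated separately:

RMAN> BACKUP AS COPY DATABASE FORMAT '/u02/app/oracle/oradata/datastore/MYDB/%U';
RMAN> SHUTDOWN IMMEDIATE;
RMAN> STARTUP MOUNT;
RMAN> SWITCH DATABASE TO COPY;
RMAN> RECOVER DATABASE;
RMAN> ALTER DATABASE OPEN;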

Oracle has decided to promote ACFS as the default database storage for ODA environments for the following reasons:

  • ACFS provides performance almost equivalent to Oracle ASM disk groups.
  • Additional functionality on an industry-standard POSIX file system.
  • Database snapshot copies of PDBs, and of non-CDB databases version 11.2.0.4 or greater.
  • Advanced functionality for general-purpose files, such as replication, tagging, encryption, security, and auditing.

Databases created on ACFS follow the same Oracle Managed Files (OMF) standard used by ASM, as shown in the sketch below.
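
For example, once the ACFS datastore is in place, it is typically enough to point the OMF destination parameter at it; the path below is illustrative only:

SQL> ALTER SYSTEM SET db_create_file_dest='/u02/app/oracle/oradata/datastore/MYDB' SCOPE=BOTH;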


4.3 ASM 12c on Exadata Machine

Oracle Exadata Database Machine is now at its fifth hardware generation; the latest software update has embraced the possibility to run virtual environments but, differently from the ODA or other Engineered Systems like the Oracle Virtual Appliance, the VMs are not intended to host application components. ASM plays a key role in the success of Exadata, because it orchestrates all Storage Cells so that they appear as a single entity, while in reality they do not know of and do not talk to each other.

The Exadata, available in a wide range of hardware configurations from 1/8 rack to multi-rack, offers great flexibility in the storage setup too. The sections below illustrate what is possible to achieve in terms of storage configuration when the Exadata is deployed bare metal and virtualized:

  • Exadata Bare Metal: the default storage configuration foresees three disk groups striped across all Storage Cells, guaranteeing the best I/O performance; however, as a post-installation step, it is possible to deploy a different configuration. Before changing the storage setup, it is vital to understand and evaluate all associated consequences: even though in specific cases it can be a meaningful decision, any storage configuration different from the default one trades optimal performance for flexibility and workload isolation.

Shown below is a graphical representation of the default Exadata storage setup, compared to a custom configuration where the Storage Cells have been divided into multiple groups, segmenting the I/O workloads and avoiding disruption between environments.

Exa_BareMetal_Disks_Default

Exa_BareMetal_Disks_Segmented.png

  • Exadata Virtualized: the installation of the Exadata with the virtualization option foresees a first step of meticulous capacity planning, defining the resources to allocate to the virtual machines (CPU and memory) and the size of each ASM disk group (DBFS, DATA, RECO) of the clusters. This last step is particularly important because, unlike the VM resources, the characteristics of the ASM disk groups cannot be changed.

The new version of the Exadata Deployment Assistant, which generates the configuration file to submit to the Exadata installation process, now permits, in conjunction with the use of Oracle Virtual Machines, entering the information related to multiple Grid Infrastructure clusters.

The hardware-based I/O virtualization (the so-called Xen SR-IOV virtualization), implemented on the Oracle VMs running on the Exadata database servers, guarantees almost native I/O and networking performance over InfiniBand, with lower CPU consumption when compared to Xen software I/O virtualization. Unfortunately, this performance advantage comes at the expense of other virtualization features like load balancing, live migration and VM save/restore operations.

While the Exadata combined with virtualization opens new horizons in terms of database consolidation and licensing optimization, it does not leave any option for the storage configuration. In fact, the only possible user definition is the amount of space to allocate to each disk group; with this information, the installation procedure defines the size of the Grid Disks on all available Storage Cells.

Following is a graphical representation of the Exadata Storage Cells, partitioned to hold three virtualized clusters. For each cluster, ASM access is automatically restricted to the associated Grid Disks (see the command-line sketch after the figure).

Exa_BareMetal_Disk_Virtual
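
A minimal sketch of how dedicated Grid Disks, and an ASM disk group restricted to them, could be provisioned; the prefix DATAC1, the size and the attribute values are illustrative only:

-- On each Storage Cell belonging to the target group (CellCLI)
CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=DATAC1, size=1024G

-- From an ASM instance of the owning cluster, create a disk group limited to those Grid Disks
SQL> CREATE DISKGROUP DATAC1 NORMAL REDUNDANCY
     DISK 'o/*/DATAC1_*'
     ATTRIBUTE 'compatible.asm'='12.1.0.2',
               'compatible.rdbms'='12.1.0.2',
               'cell.smart_scan_capable'='TRUE',
               'au_size'='4M';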


4.4 ACFS on Linux Exadata Database Machine

Starting from version 12.1.0.2, the Exadata Database Machine running Oracle Linux supports ACFS for database files and general-purpose files, with no functional restriction.

This makes ACFS an attractive storage alternative for holding external tables, data loads, scripts and general-purpose files.

In addition, Oracle ACFS on Exadata Database Machines supports database files for the following database versions:

  • Oracle Database 10g Rel. 2 (10.2.0.4 and 10.2.0.5)
  • Oracle Database 11g (11.2.0.4 and higher)
  • Oracle Database 12c (12.1.0.1 and higher)

Since Exadata Storage Cell does not support database version 10g, ACFS becomes an important storage option for customers wishing to host older databases on their Exadata system.

However, these new configuration options and this flexibility come with one major performance restriction: when ACFS is used for database files, Exadata does not yet support Smart Scan operations and cannot offload database operations directly to the storage. Hence, for the best performance, it is recommended to store database files on the Exadata storage using ASM disk groups.

As on any other system, when implementing ACFS on the Exadata Database Machine, snapshots and tagging are supported for both database and general-purpose files, while replication, security, encryption, audit and high-availability NFS functionalities are supported only for general-purpose files.


5 Conclusion

Oracle Automatic Storage Management 12c is a single integrated solution, designed to manage database files and general-purpose data under different hardware and software configurations. The adoption of ASM and ACFS not only eliminates the need for third-party volume managers and file systems, but also simplifies storage management while offering the best I/O performance and enforcing Oracle best practices. In addition, ASM 12c with the Flex ASM setup removes important limitations of the previous architecture:

  • Availability: the hard dependency between the local ASM instance and the database instances was a single point of failure; without Flex ASM, the failure of the ASM instance causes the crash of all local database instances.
  • Performance: Flex ASM reduces the network traffic generated among the ASM instances, improving the scalability of the architecture, and it makes it easier and faster to keep the ASM metadata synchronized across large clusters. Last but not least, only a few nodes of the cluster have to support the burden of an ASM instance, leaving additional resources to application processing (a verification sketch follows this list).
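
A minimal sketch of how a Flex ASM configuration could be verified and its cardinality adjusted; the count value is just an example:

-- Check whether the cluster runs ASM in Flex mode
$ asmcmd showclustermode

-- Show on which nodes the ASM instances run and which database clients they serve
$ srvctl status asm -detail

-- Change the number of ASM instances (cardinality) in the cluster
$ srvctl modify asm -count 3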

 

Oracle ASM offers a large set of configurations and options; it is now our duty to understand, case by case, when it is relevant to use one setup or another, with the aim of maximizing the performance, availability and flexibility of the infrastructure.


ASM 11gR2 Create ACFS Cluster FS

#####################################################
##           Step by step how to create Oracle ACFS Cluster Filesystem       ##
#####################################################

[grid@lnxcld02 trace]$ asmcmd


  Type "help [command]" to get help on a specific ASMCMD command.

        commands:
        --------

        md_backup, md_restore

        lsattr, setattr

        cd, cp, du, find, help, ls, lsct, lsdg, lsof, mkalias
        mkdir, pwd, rm, rmalias

        chdg, chkdg, dropdg, iostat, lsdsk, lsod, mkdg, mount
        offline, online, rebal, remap, umount

        dsget, dsset, lsop, shutdown, spbackup, spcopy, spget
        spmove, spset, startup

        chtmpl, lstmpl, mktmpl, rmtmpl

        chgrp, chmod, chown, groups, grpmod, lsgrp, lspwusr, lsusr
        mkgrp, mkusr, orapwusr, passwd, rmgrp, rmusr

        volcreate, voldelete, voldisable, volenable, volinfo
        volresize, volset, volstat


ASMCMD>     
ASMCMD> volcreate -G FRA1 -s 5G Vol_ACFS01
ASMCMD> volinfo -a
Diskgroup Name: FRA1

         Volume Name: VOL_ACFS01
         Volume Device: /dev/asm/vol_acfs01-199
         State: ENABLED
         Size (MB): 5120
         Resize Unit (MB): 32
         Redundancy: UNPROT
         Stripe Columns: 4
         Stripe Width (K): 128
         Usage:
         Mountpath:

ASMCMD> volenable -a
ASMCMD>
ASMCMD> exit


[grid@lnxcld02 trace]$ acfsdriverstate version
ACFS-9325:     Driver OS kernel version = 2.6.18-8.el5(i386).
ACFS-9326:     Driver Oracle version = 110803.1.
[grid@lnxcld02 trace]$ acfsdriverstate loaded
ACFS-9203: true



SQL> SELECT volume_name, volume_device FROM V$ASM_VOLUME;

VOLUME_NAME                    VOLUME_DEVICE
------------------------------ ----------------------------------------
VOL_ACFS01                     /dev/asm/vol_acfs01-199

1 row selected.

---------------------------------------------------------------------------------

[root@lnxcld02 adump]# ls -la /dev/asm/vol_acfs01-199
brwxrwx--- 1 root asmadmin 252, 101889 Nov  1 20:03 /dev/asm/vol_acfs01-199

[root@lnxcld02 adump]# mkdir /cloud_FS
[root@lnxcld01 adump]# mkdir /cloud_FS


[root@lnxcld02 adump]# mkfs -t acfs /dev/asm/vol_acfs01-199
mkfs.acfs: version                   = 11.2.0.3.0
mkfs.acfs: on-disk version           = 39.0
mkfs.acfs: volume                    = /dev/asm/vol_acfs01-199
mkfs.acfs: volume size               = 5368709120
mkfs.acfs: Format complete.


[root@lnxcld02 adump]# acfsutil registry -a -f /dev/asm/vol_acfs01-199 /cloud_FS
acfsutil registry: mount point /cloud_FS successfully added to Oracle Registry


[root@lnxcld02 adump]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              11G  3.6G  6.0G  38% /
/dev/hdb1              12G  7.2G  3.9G  66% /home
tmpfs                 1.5G  634M  867M  43% /dev/shm
Oracle_Software       293G  180G  114G  62% /media/sf_Oracle_Software
/dev/hdc               40G   18G   22G  45% /u01
/dev/asm/vol_acfs01-199
                      5.0G   75M  5.0G   2% /cloud_FS

                      
                      
SQL> select * from v$asm_volume;

GROUP_NUMBER VOLUME_NAME                    COMPOUND_INDEX    SIZE_MB VOLUME_NUMBER REDUND STRIPE_COLUMNS STRIPE_WIDTH_K STATE            FILE_NUMBER
------------ ------------------------------ -------------- ---------- ------------- ------ -------------- -------------- ---------------- -----------
INCARNATION DRL_FILE_NUMBER RESIZE_UNIT_MB USAGE                          VOLUME_DEVICE                            MOUNTPATH
----------- --------------- -------------- ------------------------------ ---------------------------------------- --------------------
           2 VOL_ACFS01                           33554433       5120             1 UNPROT              4            128 ENABLED                  270   
766094623               0             32    ACFS                           /dev/asm/vol_acfs01-199                  /cloud_FS


1 row selected.

	

Setup ASM Lib

#####################################
##       How to install and setup ASMLib packages         ##
#####################################

###### List of Platform dependent but Kernel independent packages ######
 oracleasm-support-2.1.3-1.<distro>.x86_64.rpm
 oracleasmlib-2.0.4-1.<distro>.x86_64.rpm
###### List of Platform and Kernel dependent packages ######
 oracleasm-2.6.16.46-0.12-smp-2.0.3-1.x86_64.rpm
 oracleasm-2.6.16.46-0.12-default-2.0.3-1.x86_64.rpm
-- Install the packages using the command rpm -ivh on all the nodes
###### ASMLib Configuration (to repeat on all nodes of the cluster)  ######
 [root@lrh-node1 /]# /etc/init.d/oracleasm configure
 Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
 driver. The following questions will determine whether the driver is
 loaded on boot and what permissions it will have. The current values
 will be shown in brackets ('[]'). Hitting <ENTER> without typing an
 answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: grid
 Default group to own the driver interface []: asmdba
 Start Oracle ASM library driver on boot (y/n) [n]: y
 Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
 Writing Oracle ASM library driver configuration [ OK ]
 Creating /dev/oracleasm mount point [ OK ]
 Loading module "oracleasm" [ OK ]
 Mounting ASMlib driver filesystem [ OK ]
 Scanning system for ASM disks [ OK ]
--------------------------------------------------------
 -- Create disk partitions and  ASM disks
 -- (from one of the nodes of the cluster)
 --------------------------------------------------------
-- Having a list of devices dedicated to ASM, create one primary partition per disk or LUN using the fdisk
 -- command and then use the ASMLib utility to implement one ASM Disk per device.
lrh-node1:/u01 # fdisk -l /dev/sdh
Disk /dev/sdh: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
 /dev/sdh1               1        4700    37747712   83  Linux
 lrh-node1:/u01 # fdisk -l /dev/sdi
Disk /dev/sdi: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
 /dev/sdi1               1        4700    37747712   83  Linux
 lrh-node1:/u01 # fdisk -l /dev/sdj
Disk /dev/sdj: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
 /dev/sdj1               1        4699    37744686   83  Linux
 lrh-node1:/u01 #
 lrh-node1:/u01 # fdisk /dev/sdh
The number of cylinders for this disk is set to 4699.
 There is nothing wrong with that, but this is larger than 1024,
 and could in certain setups cause problems with:
 1) software that runs at boot time (e.g., old versions of LILO)
 2) booting and partitioning software from other OSs
 (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): d
 Selected partition 1
Command (m for help): p
Disk /dev/sdh: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
Command (m for help): n
 Command action
 e   extended
 p   primary partition (1-4)
 p
 Partition number (1-4): 1
 First cylinder (1-4699, default 1): 1
 Last cylinder or +size or +sizeM or +sizeK (1-4699, default 4699): +1024M
Command (m for help): p
Disk /dev/sdh: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
 /dev/sdh1               1         125     1004031   83  Linux
Command (m for help): n
 Command action
 e   extended
 p   primary partition (1-4)
 p
 Partition number (1-4): 2
 First cylinder (126-4699, default 126):
 Using default value 126
 Last cylinder or +size or +sizeM or +sizeK (126-4699, default 4699):
 Using default value 4699
Command (m for help): p
Disk /dev/sdh: 38.6 GB, 38654705664 bytes
 255 heads, 63 sectors/track, 4699 cylinders
 Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
 /dev/sdh1               1         125     1004031   83  Linux
 /dev/sdh2             126        4699    36740655   83  Linux
Command (m for help): w
 The partition table has been altered!
Calling ioctl() to re-read partition table.
 Syncing disks.
 lrh-node1:/u01 # partprobe
######  Once the disks have been sliced the ASM Disks can be created ######
 lrh-node1:/u01 # /etc/init.d/oracleasm
 Usage: /etc/init.d/oracleasm {start|stop|restart|enable|disable|configure|createdisk|deletedisk|querydisk|listdisks|scandisks|status}
 lrh-node1:/u01 # /etc/init.d/oracleasm createdisk OCR1 /dev/sdh1
 Marking disk "/dev/sdh1" as an ASM disk:                              done
 lrh-node1:/u01 # /etc/init.d/oracleasm createdisk DATA1 /dev/sdh2
 Marking disk "/dev/sdh2" as an ASM disk:                              done
 lrh-node1:/u01 # /etc/init.d/oracleasm scandisks
 Scanning system for ASM disks:                                           done
 lrh-node1:/u01 #
--  After having created all the ASM Disks runs the utility scandisks  on all nodes of the cluster,
 --  this allows ASM to discover all the new ASM Disks created.
lrh-node2:/dev/oracleasm/disks # /etc/init.d/oracleasm scandisks
 Scanning system for ASM disks:                                          done
 lrh-node2:/dev/oracleasm/disks # /etc/init.d/oracleasm listdisks
 DATA1
 DATA2
 DATA3
 OCR1
 OCR2
 OCR3
######  ASM diskstring and diskgroup parameters ######
 *.asm_diskstring='ORCL:*'
 *.asm_diskgroups='OCR_VOT','DATA1','FRA1'
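
With the diskstring set to 'ORCL:*', the ASM instance discovers the ASMLib disks through their labels; a minimal sketch of creating a disk group on top of them (disk group name and redundancy level are illustrative):

SQL> CREATE DISKGROUP DATA1 EXTERNAL REDUNDANCY
     DISK 'ORCL:DATA1', 'ORCL:DATA2', 'ORCL:DATA3';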