My OOW18 Summary

 

For those who are interested, here are my major takeaways from OOW18.

 

As we all know, for the past few years the HOTTEST topic advertised at OOW has been “Cloud Computing”, but this time Oracle Cloud was no longer alone!

In fact, the focus was divided between the new Oracle Cloud Infrastructure (OCI), which Larry called the Second Generation Cloud, and the Autonomous Database.

 

OCI Second Gen of Cloud

Here is a summary of the major advantages compared to the previous version:

– Security, guaranteed by robots which scan the network for any malicious attack.  

– The cutting edge virtual network, which brings up to 50GB speed and extreme flexibility.

– Bare Metal Infrastructure based on Exadata Machines.

– Aggressive pricing, compared to the competitors.

 

Autonomous Database

The Autonomous Database option is now available for OLTP and DWH databases and includes new capabilities like automatic index creation and conversion of tables to column-store format. In version 19 it will also handle online memory increases and additional tuning options.

As announced during Larry’s keynote, the Autonomous Database will also be available with the Cloud @ Customer option (on Exadata only), and it will no longer require human labor (DBA and Sys Admin intervention), because it is Self-Provisioning, Self-Driving, Self-Tuning and Self-Repairing.

To non-technical people it looks like magic, but it is only a few steps away from what we already use in a standard Oracle 12c Database. In fact, Autonomous Database leverages a number of database advisors and tuning options, now orchestrated by Artificial Intelligence and Machine Learning software, in order to provide data-driven predictions and decisions.

Over the next few years, Autonomous Database will be enriched with several new options, improving the quality of life of many DBAs, who will be relieved of most of the tedious and recurring tasks while keeping the highest-value tasks under their own responsibility.

Last but not least, the Autonomous Database runs on a very high-end configuration (Oracle guarantees 99.995% availability), which is quite expensive to acquire due to the list of mandatory requirements: Exadata, RAC, Active DG, Multitenant, Tuning Pack, Diagnostic Pack, etc.

 

Exadata Machine

Several interesting features are coming next year with the introduction of Intel Optane DC Persistent Memory for even faster OLTP.

This new type of memory will be installed on the Storage Cells and used as an accelerator in front of Flash memory.

The database nodes will access the Persistent Memory via RDMA, with access latency up to 20x faster.

Oracle is developing more and more Remote Direct Memory Access (RDMA) instructions for Cache Fusion and Storage Cell operations in order to offload the database nodes and increase the overall performance.

Stay tuned on Exadata Machine, because the next generation will also include BIG architectural changes…

 

Oracle Virtual Machine (OVM)

One curiosity collected directly at the Linux Virtualization booth: even though the next generation of the hypervisor will be based on KVM, Oracle will keep calling it OVM, and of course the current OVM product based on Xen (OVS, OVM) will still be in use by many companies.

How could customers possibly get confused?!?

 

With this I am done, although there would be much more to write.

 


 


Exadata: How to Safely Erase All Data

When the time comes to decommission an environment with sensitive data, we are frequently confronted with the problem of how to certify to our customer or management that all data and logs have been erased.

On Exadata Machine, starting from software release 12.2.1.1.0, this problem has been elegantly solved by Oracle with the introduction of a new utility called Secure Eraser, which securely erases data on hard drives, flash devices and internal USBs, and resets the ILOM to factory default.

 

In earlier software versions, the Exadata Storage Software included CellCLI commands to securely erase the user data:

CellCLI> DROP GRIDDISK ALL FLASHDISK PREFIX=DATA, ERASE=7pass
CellCLI> DROP GRIDDISK ALL PREFIX=DATA, ERASE=3pass

and

CellCLI> DROP CELLDISK ALL FLASHDISK ERASE=7pass 
CellCLI> DROP CELL ERASE=3pass

Unfortunately, those commands only cover the user data stored on the Storage Cells, and none of them produces an official certificate summarizing the actions taken to guarantee that the data has been wiped. Secure Eraser does all of this on all Compute and Storage nodes, sanitizing every type of device and content: user data, OS logs and network configurations.

 

Depending on the Exadata model, only a subset of the following options for running Secure Eraser is available:

  • Automatic Secure Eraser through PXE Boot
  • Interactive Secure Eraser through PXE Boot
  • Interactive Secure Eraser through Network Boot
  • Interactive Secure Eraser through External USB

 


 

Recently I used Secure Eraser through External USB on an Exadata X7-2 Machine; the different steps are reported below.

 

Copy the Secure Eraser Diagnostic image from MOS 2180963.1 to a USB stick.

 # dd if=image_diagnostics_18.1.4.0.0_LINUX.X64_180125.3-1.x86_64.usb of=/dev/sdb

 

Boot the server using the USB device with the Secure Eraser Diagnostic image

Exa_BootList.jpg

 

After login, start the Secure Erase process

/usr/sbin/secureeraser --erase --all --flash_erasure_method=7pass --hdd_erasure_method=3pass --technician=Emiliano_Fusaglia --witness=Mario_Bros --output=/mnt/iso

 

 

At the end of the erase process, a Data Erasure Certificate similar to the one in the example below will be available in TXT, HTML and PDF format.

Exa_SecureErase_Report


 

 

 

Exadata Storage Snapshots

This post describes how to implement Oracle Database Snapshot Technology on Exadata Machine.

Because the Exadata Storage Cell Smart Features, Storage Indexes, IORM and Network Resource Manager work at the level of the ASM Volume Manager only (they do not work on top of the ACFS Cluster File System), the implementation of the snapshot technology differs from that of any non-Exadata environment.

For this purpose Oracle has developed a new type of ASM Disk Group called SPARSE Disk Group. It uses ASM SPARSE Grid Disks based on Thin Provisioning to store the database snapshot copies and the associated metadata, and it supports non-CDB and PDB snapshot copies.

The implementation requires the following minimum software versions:

  • Exadata Storage Software version 12.1.2.1.0.
  • Oracle Database version 12.1.0.2 with bundle patch 5.

One major restriction applies to Exadata Storage Snapshots compared to ACFS: the source database must be a shared copy opened read-only, called the Test Master. The Test Master Database cannot be modified or deleted as long as any of its child snapshots is in use.

This restriction exists because Exadata Snapshot technology uses “allocate on first write” rather than “copy on write” (as ACFS does), and the snapshot is per database datafile.

When a child snapshot issues a write, the write goes to a private copy of that block inside the snapshot, preserving the original block value, which can still be accessed by the other child snapshots of the same Test Master.
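
To make the restriction more tangible, here is a toy model of "allocate on first write". It is only a sketch: it uses one file per block and plain directories in place of sparse grid disks, and the function names write_block and read_block are made up for this illustration; it is not how the storage cells implement the feature.

```shell
# Toy model: one file per block. The Test Master directory is never written to.
# A child snapshot starts as an empty directory (nothing allocated yet).

# write_block CHILD_DIR BLOCK_NO DATA
# The first write to a block allocates a private copy inside the child.
write_block() {
  printf '%s\n' "$3" > "$1/$2"
}

# read_block MASTER_DIR CHILD_DIR BLOCK_NO
# Return the private copy if the child has one, else the shared master block.
read_block() {
  if [ -f "$2/$3" ]; then
    cat "$2/$3"
  else
    cat "$1/$3"
  fi
}

# Demo with throwaway directories
tmp=$(mktemp -d)
mkdir "$tmp/master" "$tmp/dev01" "$tmp/dev02"
printf '%s\n' "original" > "$tmp/master/7"

write_block "$tmp/dev01" 7 "changed by dev01"
read_block "$tmp/master" "$tmp/dev01" 7   # dev01 reads its private copy
read_block "$tmp/master" "$tmp/dev02" 7   # dev02 still reads the master block
rm -rf "$tmp"
```

In this model, every block a child has not written is read straight from the master, which is why the Test Master has to stay unchanged while children exist.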

How to Implement Exadata Storage Snapshots in a PDB Environment

Check the celldisks for available free space to allocate to a new SPARSE Disk Group

[root@strgceladm01 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm01 853.34375G
 CD_01_strgceladm01 853.34375G
 CD_02_strgceladm01 853.34375G
 CD_03_strgceladm01 853.34375G
 CD_04_strgceladm01 853.34375G
 CD_05_strgceladm01 853.34375G
 CD_06_strgceladm01 853.34375G
 CD_07_strgceladm01 853.34375G
 CD_08_strgceladm01 853.34375G
 CD_09_strgceladm01 853.34375G
 CD_10_strgceladm01 853.34375G
 CD_11_strgceladm01 853.34375G
 FD_00_strgceladm01 0
 FD_01_strgceladm01 0
 FD_02_strgceladm01 0
 FD_03_strgceladm01 0
[root@strgceladm01 ~]#


[root@strgceladm02 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm02 853.34375G
 CD_01_strgceladm02 853.34375G
 CD_02_strgceladm02 853.34375G
 CD_03_strgceladm02 853.34375G
 CD_04_strgceladm02 853.34375G
 CD_05_strgceladm02 853.34375G
 CD_06_strgceladm02 853.34375G
 CD_07_strgceladm02 853.34375G
 CD_08_strgceladm02 853.34375G
 CD_09_strgceladm02 853.34375G
 CD_10_strgceladm02 853.34375G
 CD_11_strgceladm02 853.34375G
 FD_00_strgceladm02 0
 FD_01_strgceladm02 0
 FD_02_strgceladm02 0
 FD_03_strgceladm02 0
[root@strgceladm02 ~]#


[root@strgceladm03 ~]# cellcli -e list celldisk attributes name,freespace
 CD_00_strgceladm03 853.34375G
 CD_01_strgceladm03 853.34375G
 CD_02_strgceladm03 853.34375G
 CD_03_strgceladm03 853.34375G
 CD_04_strgceladm03 853.34375G
 CD_05_strgceladm03 853.34375G
 CD_06_strgceladm03 853.34375G
 CD_07_strgceladm03 853.34375G
 CD_08_strgceladm03 853.34375G
 CD_09_strgceladm03 853.34375G
 CD_10_strgceladm03 853.34375G
 CD_11_strgceladm03 853.34375G
 FD_00_strgceladm03 0
 FD_01_strgceladm03 0
 FD_02_strgceladm03 0
 FD_03_strgceladm03 0
[root@strgceladm03 ~]#
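
The per-cell listings above can be condensed into one number per cell. This is a minimal sketch, assuming the two-column "name freespace" layout shown above, with sizes suffixed by G, T or M, or a bare 0; the function name sum_freespace_gb is hypothetical.

```shell
# Sum the freespace column of "cellcli -e list celldisk attributes name,freespace".
# Only lines whose first field looks like a cell disk name (CD_* or FD_*) are counted.
sum_freespace_gb() {
  awk '
    $1 ~ /^[CF]D_/ && NF >= 2 {
      v = $2
      unit = substr(v, length(v), 1)
      num = substr(v, 1, length(v) - 1)
      if (unit == "T")      total += num * 1024
      else if (unit == "G") total += num
      else if (unit == "M") total += num / 1024
      else                  total += v      # bare number such as "0"
    }
    END { printf "%.2f\n", total }
  '
}

# Example with three lines copied from the output above
printf '%s\n' \
  ' CD_00_strgceladm01 853.34375G' \
  ' CD_01_strgceladm01 853.34375G' \
  ' FD_00_strgceladm01 0' | sum_freespace_gb    # prints 1706.69
```

On a storage cell the same filter could be fed directly: cellcli -e list celldisk attributes name,freespace | sum_freespace_gb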

On each Storage Cell, create the SPARSE Grid Disks as described below

[root@strgceladm01 ~]# cellcli -e CREATE GRIDDISK ALL PREFIX=SPARSE, sparse=true, SIZE=853.34375G
Cell disks were skipped because they had no freespace for grid disks: FD_00_strgceladm01, FD_01_strgceladm01, FD_02_strgceladm01, FD_03_strgceladm01.
GridDisk SPARSE_CD_00_strgceladm01 successfully created
GridDisk SPARSE_CD_01_strgceladm01 successfully created
GridDisk SPARSE_CD_02_strgceladm01 successfully created
GridDisk SPARSE_CD_03_strgceladm01 successfully created
GridDisk SPARSE_CD_04_strgceladm01 successfully created
GridDisk SPARSE_CD_05_strgceladm01 successfully created
GridDisk SPARSE_CD_06_strgceladm01 successfully created
GridDisk SPARSE_CD_07_strgceladm01 successfully created
GridDisk SPARSE_CD_08_strgceladm01 successfully created
GridDisk SPARSE_CD_09_strgceladm01 successfully created
GridDisk SPARSE_CD_10_strgceladm01 successfully created
GridDisk SPARSE_CD_11_strgceladm01 successfully created
[root@strgceladm01 ~]#

On each Storage Cell, list all Grid Disks

[root@strgceladm01 ~]# cellcli -e list griddisk attributes name,size
 DATAC1_CD_00_strgceladm01 6.294586181640625T
 DATAC1_CD_01_strgceladm01 6.294586181640625T
 DATAC1_CD_02_strgceladm01 6.294586181640625T
 DATAC1_CD_03_strgceladm01 6.294586181640625T
 DATAC1_CD_04_strgceladm01 6.294586181640625T
 DATAC1_CD_05_strgceladm01 6.294586181640625T
 DATAC1_CD_06_strgceladm01 6.294586181640625T
 DATAC1_CD_07_strgceladm01 6.294586181640625T
 DATAC1_CD_08_strgceladm01 6.294586181640625T
 DATAC1_CD_09_strgceladm01 6.294586181640625T
 DATAC1_CD_10_strgceladm01 6.294586181640625T
 DATAC1_CD_11_strgceladm01 6.294586181640625T
 FGRID_FD_00_strgceladm01 2.0717315673828125T
 FGRID_FD_01_strgceladm01 2.0717315673828125T
 FGRID_FD_02_strgceladm01 2.0717315673828125T
 FGRID_FD_03_strgceladm01 2.0717315673828125T
 RECOC1_CD_00_strgceladm01 1.78143310546875T
 RECOC1_CD_01_strgceladm01 1.78143310546875T
 RECOC1_CD_02_strgceladm01 1.78143310546875T
 RECOC1_CD_03_strgceladm01 1.78143310546875T
 RECOC1_CD_04_strgceladm01 1.78143310546875T
 RECOC1_CD_05_strgceladm01 1.78143310546875T
 RECOC1_CD_06_strgceladm01 1.78143310546875T
 RECOC1_CD_07_strgceladm01 1.78143310546875T
 RECOC1_CD_08_strgceladm01 1.78143310546875T
 RECOC1_CD_09_strgceladm01 1.78143310546875T
 RECOC1_CD_10_strgceladm01 1.78143310546875T
 RECOC1_CD_11_strgceladm01 1.78143310546875T
 SPARSE_CD_00_strgceladm01 853.34375G
 SPARSE_CD_01_strgceladm01 853.34375G
 SPARSE_CD_02_strgceladm01 853.34375G
 SPARSE_CD_03_strgceladm01 853.34375G
 SPARSE_CD_04_strgceladm01 853.34375G
 SPARSE_CD_05_strgceladm01 853.34375G
 SPARSE_CD_06_strgceladm01 853.34375G
 SPARSE_CD_07_strgceladm01 853.34375G
 SPARSE_CD_08_strgceladm01 853.34375G
 SPARSE_CD_09_strgceladm01 853.34375G
 SPARSE_CD_10_strgceladm01 853.34375G
 SPARSE_CD_11_strgceladm01 853.34375G
[root@strgceladm01 ~]#

From the ASM instance, create a SPARSE Disk Group

SQL> CREATE DISKGROUP SPARSEC1 EXTERNAL REDUNDANCY DISK 'o/*/SPARSE_CD_*'
ATTRIBUTE
'compatible.asm' = '12.2.0.1',
'compatible.rdbms' = '12.2.0.1',
'cell.smart_scan_capable'='TRUE',
'cell.sparse_dg' = 'allsparse',
'AU_SIZE' = '4M';

Diskgroup created.

Set the following ASM attribute on the Disk Group hosting the Test Master Database

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'access_control.enabled' = 'true';

Grant access to the OS user of the RDBMS, which is used to access the Disk Group

ALTER DISKGROUP DATAC1 ADD USER 'oracle';

From an ASM instance, set the ownership for every file that belongs solely to the PDB being snapshot-cloned, as in the example below

alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/system.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/sysaux.xxx.xxxxxxx';
alter diskgroup DATAC1 set ownership owner='oracle' for file '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/users.xxx.xxxxxxx';
...
..
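
With many datafiles, typing each statement by hand is error-prone. Here is a small sketch that reads fully qualified ASM file names from standard input and prints one statement per file, in the same form as above. The function name gen_ownership_sql is hypothetical, and how the file list is produced (for example, spooled from a query on the datafiles of the PDB) is left as an assumption.

```shell
# gen_ownership_sql DISKGROUP OS_USER
# Reads ASM file names on stdin, one per line, and emits the matching
# "alter diskgroup ... set ownership" statements on stdout.
gen_ownership_sql() {
  dg=$1
  owner=$2
  while IFS= read -r f; do
    [ -n "$f" ] || continue          # skip blank lines
    printf "alter diskgroup %s set ownership owner='%s' for file '%s';\n" \
      "$dg" "$owner" "$f"
  done
}

# Example using the placeholder file name from the listing above
printf '%s\n' '+DATAC1/CDBT/<xxxxxxxxxxxxxxxxxxx>/DATAFILE/system.xxx.xxxxxxx' |
  gen_ownership_sql DATAC1 oracle
```

The generated statements should still be reviewed before running them, so that only files belonging solely to the PDB are touched.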

Restart the Test Master PDB in read-only mode

alter pluggable database PDBTESTMASTER close immediate instances=all;
alter pluggable database PDBTESTMASTER open read only;

Create the first PDB Snapshot Copy on Exadata SPARSE Disk Group

Create pluggable database PDBDEV01 from PDBTESTMASTER tempfile reuse create_file_dest='+SPARSEC1' snapshot copy;

Feedback on Exadata Storage Snapshots

The ability to create storage-efficient database copies in a few seconds, independently of the size of the Test Master, is very useful for today’s IT departments; but such extreme velocity and flexibility is not entirely free. In fact, performance tests on an I/O-bound workload have highlighted significant performance degradation. This reminds us that, as defined by Oracle Corporation, the Snapshot Technology included on Exadata Machine remains a non-production option.

Oracle DB stored on ASM vs ACFS

Nowadays a new Oracle database environment with Grid Infrastructure has three main storage options:

  1. Third party clustered file system
  2. ASM Disk Groups
  3. ACFS File System

While the first option was not in scope, this post compares the results of the tests between ASM and ACFS, highlighting when to use one or the other to store 12c NON-CDB or CDB Databases.

The tests, conducted on different environments using Oracle version 12.1.0.2 with the July PSU, have shown results at odds with what Oracle is promoting for the Oracle Database Appliance (ODA) in the following paper: “Frequently Asked Questions Storing Database Files in ACFS on Oracle Database Appliance”.

 

Outcome of the tests

ASM remains the preferred option to achieve the best I/O performance, while ACFS introduces interesting features like DB snapshots to provision new databases quickly and space-efficiently.

The performance gap between the two solutions is not negligible, as reported below by the AWR – TOP Timed Events sections of two PDBs sharing the same infrastructure and executing the same workload, but using ASM and ACFS storage respectively:

  • PDBASM: Pluggable Database stored on ASM Disk Group
  • PDBACFS: Pluggable Database stored on ACFS File System

 

 

PDBASM AWR – TOP Timed Events and Other Stats

topevents_asm

fg_asm

 

 

PDBACFS AWR – TOP Timed Events and Other Stats

TopEvents_ACFS.png

fg_acfs

 

Due to the different characteristics and results when ASM or ACFS is in use, it is not possible to give a generic recommendation. Case by case, the choice should be driven by business needs such as maximum performance versus fast and efficient database cloning.

 

 

 

 

ODA X5-2 how to cap the number of active CPU Cores

I recently had to cap the number of active CPUs on a bare metal ODA X5-2, and I noticed that the procedure is slightly different from what I used in the past (link to initial post).

 

Perform the following steps to generate the Core Key:

  • Login to My Oracle Support (MOS) and click the submenu Systems.
  • Select the serial number of the appliance and click on “Core Configuration” in the Asset Details Screen
  • Select Manage Key
  • From the Combo list select the number of cores to activate and click Generate Key to generate the key.
  • Click Copy Key to Clipboard to copy the key to the clipboard.
  • Paste the key into an empty text file and save the file to a location on the Oracle Database Appliance.

 

ODA X5-2 initial number of CPU Cores

[root@odax5-2n0 ~]# cat /proc/cpuinfo | grep -i processor
processor : 0
processor : 1
processor : 2
processor : 3
...
...
..
.
processor : 70
processor : 71

[root@odax5-2n0 ~]# cat /proc/cpuinfo | grep -i processor |wc -l
72
[root@odax5-2n0 ~]#

 

Checks before enforcing the CPU restriction:

[root@odax5-2n0 ~]# oakcli show server

Power State : On
 Open Problems : 0
 Model : ODA X5-2
 Type : Rack Mount
 Part Number : xxxxxxxxxxx
 Serial Number : nnnnXXXXnnX <<<<<<<<<<<< This serial MUST match on BOTH of the ODA servers
 Primary OS : Not Available
 ILOM Address : 192.168.21.35
 ILOM MAC Address : xx:xx:xx:xx:xx:xx
 Description : Oracle Database Appliance X5-2 nnnnXXXXnnX
 Locator Light : Off
 Actual Power Consumption : 345 watts
 Ambient Temperature : 21.250 degree C
 Open Problems Report : System is healthy

[root@odax5-2n0 ~]#


[root@odax5-2n1 /]# oakcli show server

Power State : On
 Open Problems : 0
 Model : ODA X5-2
 Type : Rack Mount
 Part Number : xxxxxxxxxxx
 Serial Number : nnnnXXXXnnX <<<<<<<<<<<< This serial MUST match on BOTH of the ODA servers 
 Primary OS : Not Available
 ILOM Address : 192.168.21.36
 ILOM MAC Address : xx:xx:xx:xx:xx:xx
 Description : Oracle Database Appliance X5-2 nnnnXXXXnnX
 Locator Light : Off
 Actual Power Consumption : 342 watts
 Ambient Temperature : 21.750 degree C
 Open Problems Report : System is healthy

[root@odax5-2n1 /]#

[root@odax5-2n0 ~]# oakcli show env_hw
BM ODA X5-2
Public interface : COPPER
[root@odax5-2n0 ~]#


[root@odax5-2n1 /]# oakcli show env_hw
BM ODA X5-2
Public interface : COPPER
[root@odax5-2n1 /]#


[root@odax5-2n0 ~]# ipmitool -I open sunoem getval /X/system_identifier
Target Value: Oracle Database Appliance X5-2 nnnnXXXXnnX
[root@odax5-2n0 ~]# fwupdate list sp_bios
==================================================
SP + BIOS
==================================================
ID Product Name ILOM Version BIOS/OBP Version XML Support
---------------------------------------------------------------------------------------------------------------
sp_bios ORACLE SERVER X5-2 v3.2.4.52 r101649 30050100 N/A
[root@odax5-2n0 ~]#

[root@odax5-2n1 /]# ipmitool -I open sunoem getval /X/system_identifier
Target Value: Oracle Database Appliance X5-2 nnnnXXXXnnX
[root@odax5-2n1 /]# fwupdate list sp_bios
==================================================
SP + BIOS
==================================================
ID Product Name ILOM Version BIOS/OBP Version XML Support
---------------------------------------------------------------------------------------------------------------
sp_bios ORACLE SERVER X5-2 v3.2.4.52 r101649 30050100 N/A
[root@odax5-2n1 /]#

 

Apply the CPU Key from the first ODA node

[root@odax5-2n0 ~]# /opt/oracle/oak/bin/oakcli apply core_config_key /root/ODA_PROD_CPU_KEY_SerialNumber_NumberofCores_Configkey.txt
INFO: Both nodes will be rebooted automatically after applying the license
Do you want to continue: [Y/N]?:
Y
INFO: User has confirmed for reboot


Please enter the root password:

............Completed

INFO: Applying core_config_key on '192.168.16.25'
... 
INFO : Running as root: /usr/bin/ssh -l root 192.168.16.25 /tmp/tmp_lic_exec.pl
INFO : Running as root: /usr/bin/ssh -l root 192.168.16.25 /opt/oracle/oak/bin/oakcli enforce core_config_key /tmp/.lic_file
Waiting for the Node '192.168.16.25' to reboot..................................
Node '192.168.16.25' is rebooted
Waiting for the Node '192.168.16.25' to be up before applying the license on the node '192.168.16.24'.
INFO: Applying core_config_key on '192.168.16.24'
...
INFO : Running as root: /usr/bin/ssh -l root 192.168.16.24 /tmp/tmp_lic_exec.pl
INFO : Running as root: /usr/bin/ssh -l root 192.168.16.24 /opt/oracle/oak/bin/oakcli enforce core_config_key /tmp/.lic_file

Broadcast message from root@odax5-2n0
 (unknown) at 11:03 ...

The system is going down for reboot NOW!
[root@odax5-2n0 ~]#

 

New CPU cores configuration

[root@odax5-2n0 ~]# /opt/oracle/oak/bin/oakcli show core_config_key

Host's serialnumber = nnnnXXXXnnX
Enabled Cores (per server) = 6
Total Enabled Cores (on two servers) = 12
Server type = X5-2 -> Oracle Server X5-2
Hyperthreading is enabled. Each core has 2 threads. Operating system displays 12 processors per server
[root@odax5-2n0 ~]#

[root@odax5-2n1 ~]# /opt/oracle/oak/bin/oakcli show core_config_key

Host's serialnumber = nnnnXXXXnnX
Enabled Cores (per server) = 6
Total Enabled Cores (on two servers) = 12
Server type = X5-2 -> Oracle Server X5-2
Hyperthreading is enabled. Each core has 2 threads. Operating system displays 12 processors per server
[root@odax5-2n1 ~]#
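
The relation between enabled cores and the processors listed by the operating system is plain multiplication. Here is a minimal sketch with two hypothetical helper functions (logical_cpus and cap_applied are not part of oakcli), assuming hyperthreading with 2 threads per core as reported above.

```shell
# logical_cpus CORES_PER_SERVER THREADS_PER_CORE
# Number of "processor" entries the OS is expected to list in /proc/cpuinfo.
logical_cpus() {
  echo $(( $1 * $2 ))
}

# cap_applied ENABLED_CORES THREADS_PER_CORE ACTUAL_PROCESSORS
# Succeeds only if the observed processor count matches the enabled cores.
cap_applied() {
  [ "$(logical_cpus "$1" "$2")" -eq "$3" ]
}

logical_cpus 36 2   # prints 72: the uncapped X5-2 (2 x 18 cores, 2 threads each)
logical_cpus 6 2    # prints 12: after enabling 6 cores per server
```

After the reboot, the check could be run on each node as: cap_applied 6 2 "$(grep -c '^processor' /proc/cpuinfo)" && echo "cap OK"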

The “Great” ODA overwhelming the Exadata

Introduction

This article tries to explain the technical reasons for the success of the Oracle Database Appliance, a well-known appliance with which Oracle targets small and medium businesses, or specific departments of big companies looking for privacy and isolation from the rest of the IT. Nowadays this small and relatively cheap appliance (list price around $65,000) has evolved a lot: the storage has reached a significant capacity of 128TB raw, expandable to 256TB, and the two X5-2 servers are the same as those used as database nodes in the Exadata machine. Many customers, while defining a new database architecture, evaluate the pros and cons of acquiring an ODA compared to the smallest Exadata configuration (one eighth of a rack). If the customer is not looking for a system with extreme performance and horizontal scalability beyond the two X5-2 servers, the Oracle Database Appliance is frequently the option chosen.

Some of the ODA major features are:

  • High Availability: no single point of failure on all hardware and software components.
  • Performance: each server is equipped with 2×18-core Intel Xeon processors and 256GB of RAM, extensible up to 768GB, with cluster communication over InfiniBand. The shared storage offers a multi-tier configuration with 7.2K rpm HDDs and two types of SSDs, for frequently accessed data and for database redo logs.
  • Flexibility & Scalability: running RAC, RAC One Node and Single Instance databases.
  • Virtualized configuration: designed to offer a Solution-in-a-box, with highly available virtual machines.
  • Optimized licensing model: a pay-as-you-grow model, activating an increasing number of CPU cores on demand with the Bare Metal configuration, or capping the resources by combining Oracle VM with the Hard Partitioning setup.
  • Time-to-market: no matter whether the ODA has to be installed bare metal or virtualized, this is a standardized and automated process, generally completed in one or two days of work.
  • Price: the ODA is very competitive when comparing its cost to that of an equivalent commodity architecture, which in addition must be engineered, integrated and maintained by the customer.

 

At the time of writing, the latest hardware model is the ODA X5-2 and the software version is 12.1.2.6.0. This HW and SW combination offers unique features, a few of them not even available on the Exadata machine, like the possibility to host databases and applications in one single box, or the possibility to rapidly and space-efficiently clone 11gR2 and 12c databases using ACFS Snapshots.

 

 

ODA HW & SW Architecture

Oracle Database Appliance is composed of two X5-2 servers and a shared storage shelf, which can optionally be doubled. Each server has two 18-core Intel Xeon E5-2699 v3 processors, 256GB of RAM (optionally upgradable to 768GB) and two 600GB 10k rpm internal disks in RAID 1 for the OS and software binaries.

This appliance is equipped with redundant network connectivity up to 10Gb, redundant SAS HBAs and Storage I/O modules, and a redundant InfiniBand interconnect for cluster communication, enabling 40 Gb/second server-to-server communication.

The software components are all part of Oracle “Red Stack” with Oracle Linux 6 UEK or OVM 3, Grid Infrastructure 12c, Oracle RDBMS 12c & 11gR2 and Oracle Appliance Manager.

 

 

ODA Front view

Components number 1 & 2 are the X5-2 Servers. Components 3 & 4 are the Storage and the optional Storage extension.

ODA_Front

 

ODA Rear view

Highlight of the multiple redundant connections, including InfiniBand for Oracle Clusterware, ASM and RAC communications. No single point of HW or SW failure.

ODA_Back

 

 

Storage Organization

With 16x8TB SAS HDDs, a total raw space of 128TB is available on each storage shelf (64TB with ASM double-mirroring and 42.7TB with ASM triple-mirroring). To offer better I/O performance without exploding the price, Oracle has added the following SSD devices: 4x400GB, ASM double-mirrored, for frequently accessed data, and 4x200GB, ASM triple-mirrored, for database redo logs.
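
The capacity figures above follow from dividing the raw space by the number of ASM mirror copies. Here is a small sketch of that arithmetic; the function name usable_capacity is hypothetical, and the calculation deliberately ignores ASM metadata and the free space normally reserved for rebuilding after a disk failure.

```shell
# usable_capacity DISKS SIZE_PER_DISK MIRROR_COPIES
# Raw capacity divided by the number of mirror copies, one decimal place.
# The result is in the same unit as SIZE_PER_DISK.
usable_capacity() {
  awk -v n="$1" -v s="$2" -v m="$3" 'BEGIN { printf "%.1f\n", n * s / m }'
}

usable_capacity 16 8 1   # raw HDD space in TB: prints 128.0
usable_capacity 16 8 2   # double-mirrored:     prints 64.0
usable_capacity 16 8 3   # triple-mirrored:     prints 42.7
```

The same function applies to the SSDs when the size is given in GB, for instance usable_capacity 4 400 2 for the double-mirrored 4x400GB devices.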

As shown in the picture, each rotating disk has two slices: the external and faster partition is assigned to the +DATA ASM disk group, and the internal one is allocated to the +RECO ASM disk group.

 

ODA_Disk

This storage optimization allows the ODA to achieve competitive I/O performance. In a production-like environment, using the three types of disks, as per ODA Database template odb-24 (https://docs.oracle.com/cd/E22693_01/doc.12/e55580/sizing.htm), Trivadis has measured 12k I/Os per second and a throughput of 2300 MB/s with an average latency of 10ms. As per Oracle documentation, the maximum number of I/Os per second of the rotating disks with a single storage shelf is 3300, but this value increases significantly when the hottest data files are relocated to the +FLASH disk group created on the SSD devices.

 

ACFS becomes the default database storage of ODA

Starting from ODA software version 12.1.0.2, any fresh installation enforces ASM Cluster File System (ACFS) as the only supported type of database storage, restricting the supported database versions to 11.2.0.4 and greater. When an ODA is upgraded from a previous release, pre-existing databases are not automatically migrated to ACFS, but Oracle provides a tool called acfs_mig.pl to execute this mandatory step on all Non-CDB databases of version >= 11.2.0.4.

Oracle has decided to promote ACFS as default database storage on ODA environment for the following reasons:

  • ACFS provides performance almost equivalent to that of Oracle ASM disk groups.
  • Additional functionalities on industry standard POSIX file system.
  • Database snapshot copy of PDBs, and NON-CDB of version 11.2.0.4 or greater.
  • Advanced functionality for general-purpose files such as replication, tagging, encryption, security, and auditing.

Databases created on ACFS follow the same Oracle Managed Files (OMF) standard used by ASM.

As in the past, the database provisioning requires the utilization of the command line interface oakcli and the selection of a database template, which defines several characteristics including the amount of space to allocate on each file system. Container and Non-Container databases can coexist on the same Oracle Database Appliance.

The ACFS file systems are created during the database provisioning process on top of the ASM disk groups +DATA, +RECO, +REDO, and optionally +FLASH. The file systems have two possible setups, depending on the database type Container or Non-Container.

  • Container database: for each CDB the ODA database-provisioning job creates dedicated ACFS file systems with the following characteristics:

Disk Characteristics | ASM Disk group | ACFS Mount Point
SAS Disk external partition | +DATA | /u02/app/oracle/oradata/datc<db_unique_name>
SAS Disk internal partition | +RECO | /u01/app/oracle/fast_recovery_area/rcoc<db_unique_name>
SSD Triple-mirrored | +REDO | /u01/app/oracle/oradata/rdoc<db_unique_name>
SSD Double-mirrored | +FLASH (*) | /u02/app/oracle/oradata/flashdata

 

  • Non-Container database: in case of Non-CDB the ODA database-provisioning job creates or resizes the following shared ACFS file systems:

Disk Characteristics | ASM Disk group | ACFS Mount Point
SAS Disk external partition | +DATA | /u02/app/oracle/oradata/datastore
SAS Disk internal partition | +RECO | /u01/app/oracle/fast_recovery_area/datastore
SSD Triple-mirrored | +REDO | /u01/app/oracle/oradata/datastore
SSD Double-mirrored | +FLASH (*) | /u02/app/oracle/oradata/flashdata

(*) Optionally used by the databases as Smart Flash Cache (extension of the SGA buffer cache), or allocated to store the hottest data files leveraging the I/O performance of the SSD disks.

 

Oracle Database Appliance Bare Metal

The bare metal configuration has been available since version one of the appliance, and nowadays it remains the default option proposed by Oracle, which pre-installs the Linux OS on any new system. It is very simple and intuitive to install thanks to the pre-built software bundle, which automates most of the steps. At the end of the installation, the architecture is very similar to any other two-node RAC setup based on commodity hardware; but from an operational point of view there are great advantages, because the Oracle Appliance Manager framework simplifies and accelerates the execution of almost any system and database administration task.

The picture below depicts the ODA architecture when the bare metal configuration is in use:

ODA_Bare_Metal

 

Oracle Database Appliance Virtualized

When the ODA is deployed with virtualization, both servers run Oracle VM Server, also called Dom0. Each Dom0 hosts, in a local dedicated repository, the ODA Base (or Dom Base), a privileged virtual machine in which the Appliance Manager, Grid Infrastructure and RDBMS binaries are installed. The ODA Base takes advantage of the Xen PCI Pass-through technology to provide direct access to the ODA shared disks presented and managed by ASM. This configuration reduces the VM flexibility (no VM migration is allowed for the two ODA Base VMs), but it guarantees almost no I/O penalty in terms of performance. With the Dom Base setup, the basic installation is completed and it is possible to start provisioning databases using Oracle Appliance Manager.

At the same time, the administrator can create new shared repositories hosted on ACFS and NFS-exported to the hypervisor for hosting the application virtual machines. Those application virtual machines are also known as Domain U. The Domain U and the templates can be stored on a local or shared Oracle VM Server repository, but to enable migration between the two Oracle VM Servers a shared repository on the ACFS file system should be used.

Even when the virtualization is in use, Oracle Appliance Manager is the only framework for system and database administration tasks like repository creation, import of template, deployment of virtual machine, network configuration, database provisioning and so on, relieving the administrator from all complexity.

The implementation of the Solution-in-a-box guarantees the maximum Return on Investment of the ODA; in fact, while restricting the virtual CPUs to license on the Dom Base, it allows relocating spare resources to the application virtual machines, as shown in the picture below.

ODA_Virtualized

 

 

ODA compared to Exadata Machine and Commodity Hardware

As described in the previous sections, Oracle Database Appliance offers unique features such as pay-as-you-grow, solution-in-a-box and so on, which can heavily influence the decision for a new database architecture. The table below lists the main architecture characteristics to evaluate while defining a new database infrastructure, comparing Oracle Database Appliance, Exadata Machine and a Commodity Architecture based on Intel Linux engineered to run RAC databases.

Table_Architectures

As shown by the different scores of the three architectures, each solution comes with strengths and weaknesses. As for the Oracle Database Appliance, it is evident that, due to its characteristics, the smallest Oracle Engineered System remains a great option for small and medium database environments.

 

Conclusion

I hope this article has kept its initial promise to explain the technical reasons for the success of the Oracle Database Appliance, and that it has highlighted the great work done by Oracle in engineering this solution at the leading edge of technology while keeping the price under control.

One last summary of what in my opinion are the major benefits offered by the ODA:

  • Time-to-market: Thanks to automated processes and pre-built software images, the deployment phase is extremely rapid.
  • Simplicity: The use of standard software components, combined with the appliance orchestrator Oracle Appliance Manager, makes the ODA very simple to operate.
  • Standardization & Automation: The Appliance Manager encapsulates and automates all repeatable and error-prone tasks like provisioning, decommissioning, patching and so on.
  • Vendor certified platform: Oracle validates and certifies the compatibility among all HW & SW components.
  • Evolution: Over time, the ODA benefits from specific bug fixes and software evolution (introduced by Oracle through the quarterly patch sets), keeping the system at the leading edge for longer than a commodity architecture.

EXADATA: How to enable Flash Cache WriteBack on a running system

In a recent tuning activity it was necessary to change the Exadata Smart Flash Cache from “WriteThrough” to “WriteBack”. Because the system was used in a 24/7 environment, we had to implement the change in a rolling fashion, one storage cell at a time.

The different steps are described below.

 

From one DB node, use dcli to check the current status of the storage cells:

[root@efudbadm02 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
efuceladm01: WriteThrough
efuceladm02: WriteThrough
efuceladm03: WriteThrough
efuceladm04: WriteThrough
efuceladm05: WriteThrough
efuceladm06: WriteThrough
efuceladm07: WriteThrough
efuceladm08: WriteThrough
efuceladm09: WriteThrough
efuceladm10: WriteThrough
efuceladm11: WriteThrough
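
With eleven cells the list is easy to read by eye, but the same check can be scripted. The sketch below is only an illustration: the function names are mine, and it assumes the dcli output has been captured as text in the "cell: mode" form shown above. It does not connect to any cell.

```python
def parse_flashcache_modes(dcli_output):
    """Map each cell name to its flashCacheMode from 'cell: mode' lines."""
    modes = {}
    for line in dcli_output.splitlines():
        cell, sep, mode = line.partition(":")
        if sep and mode.strip():
            modes[cell.strip()] = mode.strip()
    return modes


def cells_not_in_mode(modes, expected):
    """Return the sorted cells whose mode differs from `expected` (case-insensitive)."""
    return sorted(c for c, m in modes.items() if m.lower() != expected.lower())


# Sample text in the same format as the dcli output above
# (the mixed modes are made up to show a half-converted system).
sample = """efuceladm01: writeback
efuceladm02: WriteThrough
efuceladm03: WriteThrough"""

modes = parse_flashcache_modes(sample)
print("Cells still to convert:", cells_not_in_mode(modes, "writeback"))
```

The comparison ignores case because the cells report "WriteThrough" before the change and "writeback" after it.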

From one DB node, use dcli to check that the attributes asmdeactivationoutcome and asmmodestatus of all griddisks are “Yes” and “ONLINE” respectively before continuing with the change.

[root@efudbadm02 ~]# dcli -g cell_group -l root cellcli -e list griddisk attributes asmdeactivationoutcome, asmmodestatus
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
efuceladm02: Yes ONLINE
...
..
.
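
Since the full output has one line per griddisk, a small filter that prints only the offending lines can help. This is a minimal sketch under the assumption that the output was saved as text in the format above; the function name is hypothetical and the "SYNCING" line in the sample is invented to exercise the check.

```python
def griddisks_not_ready(dcli_output):
    """Return (cell, values) pairs for griddisk lines that are not exactly 'Yes ONLINE'."""
    problems = []
    for line in dcli_output.splitlines():
        cell, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or not values:
            continue  # blank lines and the "..." elision markers carry no data
        if values != ["Yes", "ONLINE"]:
            problems.append((cell.strip(), " ".join(values)))
    return problems


# Sample in the format shown above; the SYNCING line is made up.
sample = """efuceladm01: Yes ONLINE
efuceladm01: Yes ONLINE
efuceladm02: Yes SYNCING
..."""

for cell, values in griddisks_not_ready(sample):
    print("NOT READY", cell, values)
```

An empty result means every listed griddisk can be deactivated safely. Anything else, including a multi-word asmdeactivationoutcome message, is reported as is.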

From one DB node, use dcli to check that all flash caches are in “normal” state and that no flash disk is in a degraded or critical state.

[root@efudbadm02 ~]# dcli -g cell_group -l root cellcli -e list flashcache detail
efuceladm01: name: efuceladm01_FLASHCACHE
efuceladm01: cellDisk: FD_00_efuceladm01,FD_07_efuceladm01,FD_06_efuceladm01,FD_03_efuceladm01,FD_05_efuceladm01,FD_01_efuceladm01,FD_02_efuceladm01,FD_04_efuceladm01
efuceladm01: creationTime: 2013-06-18T15:21:13+02:00
efuceladm01: degradedCelldisks:
efuceladm01: effectiveCacheSize: 744.125G
efuceladm01: id: 35b61001-438f-4d66-8ce9-40704f758d3f
efuceladm01: size: 744.125G
efuceladm01: status: normal
efuceladm02: name: efuceladm02_FLASHCACHE
efuceladm02: cellDisk: FD_06_efuceladm02,FD_05_efuceladm02,FD_00_efuceladm02,FD_02_efuceladm02,FD_01_efuceladm02,FD_07_efuceladm02,FD_03_efuceladm02,FD_04_efuceladm02
efuceladm02: creationTime: 2013-06-18T15:21:12+02:00
efuceladm02: degradedCelldisks:
efuceladm02: effectiveCacheSize: 744.125G
efuceladm02: id: 2f7eedd6-cda2-496e-98ec-417b94fb8ee7
efuceladm02: size: 744.125G
efuceladm02: status: normal
efuceladm03: name: efuceladm03_FLASHCACHE
efuceladm03: cellDisk: FD_00_efuceladm03,FD_04_efuceladm03,FD_01_efuceladm03,FD_02_efuceladm03,FD_03_efuceladm03,FD_06_efuceladm03,FD_05_efuceladm03,FD_07_efuceladm03
efuceladm03: creationTime: 2013-06-18T15:21:10+02:00
efuceladm03: degradedCelldisks:
efuceladm03: effectiveCacheSize: 744.125G
efuceladm03: id: c271cdb8-dc70-4009-ba97-dfc4c26b00ef
efuceladm03: size: 744.125G
efuceladm03: status: normal
...
..
.
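
The detail listing is long, so here is a sketch of how it could be summarized per cell. It assumes the "cell: attribute: value" layout shown above; the function name and the "critical" value in the sample are mine, used only for illustration. Any status other than "normal", or a non-empty degradedCelldisks attribute, is reported.

```python
def unhealthy_flashcaches(dcli_output):
    """Return {cell: reason} for cells whose flash cache does not look healthy."""
    attrs = {}
    for line in dcli_output.splitlines():
        # Split on the first two colons only: timestamps contain colons too.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        cell, key, value = (p.strip() for p in parts)
        attrs.setdefault(cell, {})[key] = value

    problems = {}
    for cell, a in attrs.items():
        if a.get("status") != "normal":
            problems[cell] = "status=" + a.get("status", "missing")
        elif a.get("degradedCelldisks"):
            problems[cell] = "degraded: " + a["degradedCelldisks"]
    return problems


# Shortened sample in the format above; the second cell's status is invented.
sample = """efuceladm01: name: efuceladm01_FLASHCACHE
efuceladm01: creationTime: 2013-06-18T15:21:13+02:00
efuceladm01: degradedCelldisks:
efuceladm01: status: normal
efuceladm02: name: efuceladm02_FLASHCACHE
efuceladm02: degradedCelldisks:
efuceladm02: status: critical"""

print(unhealthy_flashcaches(sample))
```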

Log on to the first storage cell and, using the CellCLI interface, perform the following procedure to enable the WriteBack Flash Cache in a rolling fashion.

 

Drop the existing flash cache

CellCLI> drop flashcache
Flash cache efuceladm01_FLASHCACHE successfully dropped

Inactivate the griddisks on the cell

CellCLI> alter griddisk all inactive
GridDisk DATA_CD_00_efuceladm01 successfully altered
GridDisk DATA_CD_01_efuceladm01 successfully altered
GridDisk DATA_CD_02_efuceladm01 successfully altered
GridDisk DATA_CD_03_efuceladm01 successfully altered
GridDisk DATA_CD_04_efuceladm01 successfully altered
GridDisk DATA_CD_05_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_02_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_03_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_04_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_05_efuceladm01 successfully altered
GridDisk RECO_CD_00_efuceladm01 successfully altered
GridDisk RECO_CD_01_efuceladm01 successfully altered
GridDisk RECO_CD_02_efuceladm01 successfully altered
GridDisk RECO_CD_03_efuceladm01 successfully altered
GridDisk RECO_CD_04_efuceladm01 successfully altered
GridDisk RECO_CD_05_efuceladm01 successfully altered

Shut down the cellsrv service

CellCLI> alter cell shutdown services cellsrv

Stopping CELLSRV services...
The SHUTDOWN of CELLSRV services was successful.

Enable the Smart Flash Cache WriteBack

CellCLI> alter cell flashCacheMode=writeback
Cell efuceladm01 successfully altered

Restart the cellsrv service

CellCLI> alter cell startup services cellsrv

Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.

Reactivate the griddisks on the cell

CellCLI> alter griddisk all active
GridDisk DATA_CD_00_efuceladm01 successfully altered
GridDisk DATA_CD_01_efuceladm01 successfully altered
GridDisk DATA_CD_02_efuceladm01 successfully altered
GridDisk DATA_CD_03_efuceladm01 successfully altered
GridDisk DATA_CD_04_efuceladm01 successfully altered
GridDisk DATA_CD_05_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_02_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_03_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_04_efuceladm01 successfully altered
GridDisk DBFS_DG_CD_05_efuceladm01 successfully altered
GridDisk RECO_CD_00_efuceladm01 successfully altered
GridDisk RECO_CD_01_efuceladm01 successfully altered
GridDisk RECO_CD_02_efuceladm01 successfully altered
GridDisk RECO_CD_03_efuceladm01 successfully altered
GridDisk RECO_CD_04_efuceladm01 successfully altered
GridDisk RECO_CD_05_efuceladm01 successfully altered

Recreate the flash cache

CellCLI> create flashcache all
Flash cache efuceladm01_FLASHCACHE successfully created
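
Because the order of the commands matters and the same sequence is repeated on every cell, it can be handy to keep it as a checklist. The sketch below only prints the seven CellCLI commands used above in order; it executes nothing, and the names CELL_STEPS and render_plan are hypothetical.

```python
# The per-cell sequence from this procedure, in execution order.
CELL_STEPS = [
    ("Drop the existing flash cache", "drop flashcache"),
    ("Inactivate the griddisks", "alter griddisk all inactive"),
    ("Shut down the cellsrv service", "alter cell shutdown services cellsrv"),
    ("Enable the Smart Flash Cache WriteBack", "alter cell flashCacheMode=writeback"),
    ("Restart the cellsrv service", "alter cell startup services cellsrv"),
    ("Reactivate the griddisks", "alter griddisk all active"),
    ("Recreate the flash cache", "create flashcache all"),
]


def render_plan(cell):
    """Return numbered checklist lines for one storage cell (dry run only)."""
    return [
        '{}. [{}] {}: cellcli -e "{}"'.format(n, cell, description, command)
        for n, (description, command) in enumerate(CELL_STEPS, start=1)
    ]


for line in render_plan("efuceladm01"):
    print(line)
```

Keeping the steps in a single list makes it harder to skip one or to swap two of them when the procedure is repeated cell after cell.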

 


Verify that the Smart Flash Cache WriteBack option is enabled

[root@efuceladm01 ~]# cellcli -e list cell detail | grep flashCacheMode
 flashCacheMode: writeback

Before applying the change to the next Exadata Storage Server, wait until all griddisks are synchronized and online.

[root@efuceladm01 ~]# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
 DATA_CD_00_efuceladm01 SYNCING Yes
 DATA_CD_01_efuceladm01 SYNCING Yes
 DATA_CD_02_efuceladm01 SYNCING Yes
 DATA_CD_03_efuceladm01 SYNCING Yes
 DATA_CD_04_efuceladm01 SYNCING Yes
 DATA_CD_05_efuceladm01 SYNCING Yes
 DBFS_DG_CD_02_efuceladm01 ONLINE Yes
 DBFS_DG_CD_03_efuceladm01 ONLINE Yes
 DBFS_DG_CD_04_efuceladm01 ONLINE Yes
 DBFS_DG_CD_05_efuceladm01 ONLINE Yes
 RECO_CD_00_efuceladm01 OFFLINE Yes
 RECO_CD_01_efuceladm01 OFFLINE Yes
 RECO_CD_02_efuceladm01 OFFLINE Yes
 RECO_CD_03_efuceladm01 OFFLINE Yes
 RECO_CD_04_efuceladm01 OFFLINE Yes
 RECO_CD_05_efuceladm01 OFFLINE Yes

Once the asmmodestatus is ONLINE on all griddisks it is safe to move to the next Storage Server.
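
The waiting can be automated with a simple polling loop. The following is a sketch, not a tested production script: `fetch` is a caller-supplied function (an assumption of mine) that returns the text of the "list griddisk attributes name,asmmodestatus,asmdeactivationoutcome" command above, and the interval and retry limit are arbitrary defaults.

```python
import time


def all_online(cellcli_output):
    """True when at least one griddisk is listed and every asmmodestatus is ONLINE."""
    statuses = []
    for line in cellcli_output.splitlines():
        fields = line.split()
        if len(fields) >= 3:  # name, asmmodestatus, asmdeactivationoutcome
            statuses.append(fields[1])
    return bool(statuses) and all(s == "ONLINE" for s in statuses)


def wait_until_online(fetch, interval=60, max_polls=120, sleep=time.sleep):
    """Poll fetch() until all griddisks are ONLINE; return False if the limit is reached."""
    for _ in range(max_polls):
        if all_online(fetch()):
            return True
        sleep(interval)
    return False


# Dry run with two canned snapshots instead of a real cellcli call.
snapshots = iter([
    " DATA_CD_00_efuceladm01 SYNCING Yes\n RECO_CD_00_efuceladm01 OFFLINE Yes",
    " DATA_CD_00_efuceladm01 ONLINE Yes\n RECO_CD_00_efuceladm01 ONLINE Yes",
])
print(wait_until_online(lambda: next(snapshots), sleep=lambda seconds: None))
```

On a real system `fetch` would wrap the cellcli call, for example through a subprocess. The loop returns False rather than waiting forever, so a stuck resync is noticed before the next cell is touched.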


 

At the end of the procedure all Storage Servers are configured with the Smart Flash Cache WriteBack option:

[root@efudbadm02 ~]# dcli -g ~/cell_group -l root cellcli -e "list cell attributes flashcachemode"
efuceladm01: writeback
efuceladm02: writeback
efuceladm03: writeback
efuceladm04: writeback
efuceladm05: writeback
efuceladm06: writeback
efuceladm07: writeback
efuceladm08: writeback
efuceladm09: writeback
efuceladm10: writeback
efuceladm11: writeback