Adding Flexibility to Oracle GI by Implementing Multiple SCANs

Business requirements increasingly push IT to build more sophisticated and consolidated environments without compromising the availability, performance and flexibility of each application running on them.

In this post, I explain how to improve Grid Infrastructure network flexibility by implementing multiple SCANs, and how to associate one or more networks with the Oracle databases.

To better understand the reasons for this type of implementation, here are a few common use cases:

  • Applications are deployed on different/dedicated subnets.
  • Network isolation due to security requirement.
  • Different database protocols are in use (TCP, TCPS, etc.).

 

 

Single Client Access Name (SCAN)

By default, every Oracle Grid Infrastructure cluster, independently of the number of nodes, is created with one SCAN and 3 SCAN VIPs.

The default Oracle Clusterware network/SCAN configuration is depicted below.

 

Single_Scan_Listener

 

Multiple Single Client Access Name (SCAN) implementation

Before implementing additional SCANs, the OS provisioning of the new network interfaces or of the new VLAN tagging must be completed.

The current example uses the second option (VLAN tagging): the bond0 interface is an Active/Active setup of two 10GbE cards, to which a VLAN tag has been added.

The customized Oracle Clusterware network/SCAN configuration, with a second SCAN added, is represented below.

 

Multi_Scan_Listeners

 

Step-by-step implementation

After completing the OS network setup, as grid owner add the new interface to the Grid Infrastructure:

grid@host01a:~# oifcfg setif -global bond0.764/10.15.69.0:public

grid@host01a:~# oifcfg getif
eno49 192.168.7.32 global cluster_interconnect,asm
eno50 192.168.9.48 global cluster_interconnect,asm
bond0 10.11.8.0 global public
bond0.764 10.15.69.0 global public
grid@host01a:~#

 

Then, as root, create network number 2 and display the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add network -netnum 2 -subnet 10.15.69.0/255.255.255.0/bond0.764 -nettype STATIC

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl config network -netnum 2
Network 2 exists
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

 

As root user add the node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host01a -netnum 2 -address host01b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host02a -netnum 2 -address host02b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host03a -netnum 2 -address host03b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host04a -netnum 2 -address host04b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host05a -netnum 2 -address host05b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host06a -netnum 2 -address host06b-vip.emilianofusaglia.net/255.255.255.0
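
With six nodes, the per-node commands are repetitive and easy to mistype. The following is a minimal dry-run sketch, not part of the original procedure: the helper name print_vip_commands is hypothetical, and the Grid home, domain, netmask and host naming pattern are simply copied from the example above. It only prints the srvctl commands so they can be reviewed before being run as root.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the srvctl VIP commands for network 2.
# GRID_HOME, DOMAIN and NETMASK mirror the example above; adjust for your cluster.
GRID_HOME=/u01/app/12.2.0.1/grid
DOMAIN=emilianofusaglia.net
NETMASK=255.255.255.0

# print_vip_commands ACTION NODE_COUNT   (hypothetical helper)
# ACTION is "add" or "start"; NODE_COUNT is the number of cluster nodes.
print_vip_commands() {
  local action=$1 count=$2 i node vip
  for i in $(seq 1 "$count"); do
    node=$(printf 'host%02da' "$i")                  # e.g. host01a
    vip=$(printf 'host%02db-vip.%s' "$i" "$DOMAIN")  # e.g. host01b-vip.<domain>
    case $action in
      add)   echo "$GRID_HOME/bin/srvctl add vip -node $node -netnum 2 -address $vip/$NETMASK" ;;
      start) echo "$GRID_HOME/bin/srvctl start vip -vip $vip" ;;
      *)     echo "unknown action: $action" >&2; return 1 ;;
    esac
  done
}

# Print the six "add vip" commands for review.
print_vip_commands add 6
```

Calling print_vip_commands start 6 prints the matching start commands used in a later step.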

 

As grid user create a new listener based on network number 2:

grid@host01a:~# srvctl add listener -listener LISTENER2 -netnum 2 -endpoints "TCP:1532"

 

As root user add the new SCAN to the network number 2:

 root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add scan -scanname scan-02.emilianofusaglia.net -netnum 2

 

As root user start the new node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host01b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host02b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host03b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host04b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host05b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host06b-vip.emilianofusaglia.net

 

As grid user start the new node Listeners:

grid@host01a:~# srvctl start listener -listener LISTENER2
grid@host01a:~# srvctl status listener -listener LISTENER2
Listener LISTENER2 is enabled
Listener LISTENER2 is running on node(s): host01a,host02a,host03a,host04a,host05a,host06a

 

As root user start the new SCAN and as grid user check the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start scan -netnum 2

grid@host01a:~# srvctl config scan -netnum 2
SCAN name: scan-02.emilianofusaglia.net, Network: 2
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
SCAN 1 IPv4 VIP: 10.15.69.44
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 2 IPv4 VIP: 10.15.69.45
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 3 IPv4 VIP: 10.15.69.43
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:

grid@host01a:~# srvctl status scan -netnum 2
SCAN VIP scan1_net2 is enabled
SCAN VIP scan1_net2 is running on node host02a
SCAN VIP scan2_net2 is enabled
SCAN VIP scan2_net2 is running on node host01a
SCAN VIP scan3_net2 is enabled
SCAN VIP scan3_net2 is running on node host03a
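
For scripted checks, the status output above can be parsed rather than read by eye. This is a small sketch under the assumption that the output format matches the listing shown here; the function names count_running_scan_vips and check_scan_vips are hypothetical.

```shell
# count_running_scan_vips: read `srvctl status scan` output on stdin and
# print how many SCAN VIPs report "is running on node".
count_running_scan_vips() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps the count usable.
  grep -c 'SCAN VIP .* is running on node' || true
}

# check_scan_vips EXPECTED: succeed only if exactly EXPECTED SCAN VIPs are running.
check_scan_vips() {
  local expected=$1 running
  running=$(count_running_scan_vips)
  if [ "$running" -eq "$expected" ]; then
    echo "OK: $running SCAN VIPs running"
  else
    echo "WARNING: $running of $expected SCAN VIPs running" >&2
    return 1
  fi
}

# Example use as grid user (commented out so the sketch can be sourced safely):
# srvctl status scan -netnum 2 | check_scan_vips 3
```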

 

As grid user add the SCAN Listener and check the configuration:

grid@host01a:~# srvctl add scan_listener -netnum 2 -listener LISTENER2 -endpoints TCP:1532

grid@host01a:~# srvctl config scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN2_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN3_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:

 

As grid user start the SCAN Listener2 and check the status:

grid@host01a:~# srvctl start scan_listener -netnum 2

grid@host01a:~# srvctl status scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 is enabled
SCAN listener LISTENER2_SCAN1_NET2 is running on node host02a
SCAN Listener LISTENER2_SCAN2_NET2 is enabled
SCAN listener LISTENER2_SCAN2_NET2 is running on node host01a
SCAN Listener LISTENER2_SCAN3_NET2 is enabled
SCAN listener LISTENER2_SCAN3_NET2 is running on node host03a

 

Defining the multi SCANs configuration per database

Once the above configuration is complete, the remaining step is to define which SCAN(s) each database should use.

When multiple SCANs exist, by default CRS populates the LISTENER_NETWORKS parameter to register the database against all SCANs and listeners.

To override this default behavior, for example to register a specific database only against the SCAN scan-02.emilianofusaglia.net, the database parameter LISTENER_NETWORKS should be configured manually.
The parameter LISTENER_NETWORKS can be set dynamically, but the new value is enforced at the next instance restart.
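
As an illustration of what such a manual setting might look like, here is a hedged sketch that only builds and prints the ALTER SYSTEM statement; it does not connect to any database. The helper name build_listener_networks, the network name net2 and the instance name orcl1 are assumptions made for the example, while the VIP, SCAN name and port come from the configuration above. The local listener is written as a full address descriptor, which is one of the accepted forms; a TNS alias would work as well.

```shell
# build_listener_networks NET_NAME NODE_VIP SCAN_NAME PORT SID   (hypothetical helper)
# Prints one ALTER SYSTEM statement for a single instance (SID), because the
# local listener address (the node VIP on network 2) differs on every node.
build_listener_networks() {
  local name=$1 vip=$2 scan=$3 port=$4 sid=$5
  printf "ALTER SYSTEM SET LISTENER_NETWORKS='((NAME=%s)(LOCAL_LISTENER=(ADDRESS=(PROTOCOL=TCP)(HOST=%s)(PORT=%s)))(REMOTE_LISTENER=%s:%s))' SCOPE=BOTH SID='%s';\n" \
    "$name" "$vip" "$port" "$scan" "$port" "$sid"
}

# Statement for the (assumed) instance orcl1 running on host01a.
build_listener_networks net2 host01b-vip.emilianofusaglia.net \
  scan-02.emilianofusaglia.net 1532 orcl1
```

The printed statement would then be reviewed and run in SQL*Plus as a privileged user, once per instance with the matching node VIP.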

 


 

ASM Filter Driver (ASMFD)

 

ASM Filter Driver is a Linux kernel module introduced in 12c R1. It resides in the I/O path of the Oracle ASM disks providing the following features:

  • Rejecting all non-Oracle I/O write requests to ASM Disks.
  • Device name persistency.
  • Node level fencing without reboot.

 

In 12c R2, ASMFD can be enabled from the graphical Grid Infrastructure installer, as shown in the post GI 12c R2 Installation at step #8 “Create ASM Disk Group”.

Once ASM Filter Driver is in use, the disks are managed using the ASMFD label name, similarly to ASMLib.

 

Here are a few examples of the ASM Filter Driver implementation.

--How to create an ASMFD label in SQL*Plus
SQL> Alter system label set 'DATA1' to '/dev/mapper/mpathak';

System altered.


--How to create an ASM Disk Group with ASMFD
CREATE DISKGROUP DATA_DG EXTERNAL REDUNDANCY DISK 'AFD:DATA1' SIZE 30720M
ATTRIBUTE 'SECTOR_SIZE'='512','LOGICAL_SECTOR_SIZE'='512','compatible.asm'='12.2.0.1',
'compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M';

Diskgroup created.

 

ASM Filter Driver can also be managed from the ASM command line utility ASMCMD:

--Check ASMFD status
ASMCMD> afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on host 'oel7node06.localdomain'


--List ASM Disks where ASMFD is enabled
ASMCMD> afd_lsdsk
--------------------------------------------------------------------------------
Label                    Filtering                Path
================================================================================
DATA1                      ENABLED                /dev/mapper/mpathak
DATA2                      ENABLED                /dev/mapper/mpathan
DATA3                      ENABLED                /dev/mapper/mpathw
DATA4                      ENABLED                /dev/mapper/mpathac
GIMR1                      ENABLED                /dev/mapper/mpatham
GIMR2                      ENABLED                /dev/mapper/mpathaj
GIMR3                      ENABLED                /dev/mapper/mpathal
GIMR4                      ENABLED                /dev/mapper/mpathaf
GIMR5                      ENABLED                /dev/mapper/mpathai
RECO3                      ENABLED                /dev/mapper/mpathy
RECO1                      ENABLED                /dev/mapper/mpathab
RECO2                      ENABLED                /dev/mapper/mpathx
ASMCMD>


--How to remove an ASMFD label in ASMCMD
ASMCMD> afd_unlabel DATA4
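
With many disks, it can help to flag any disk whose filtering is not enabled instead of scanning the afd_lsdsk listing manually. The sketch below assumes the three-column layout shown above (Label, Filtering, Path); the function name afd_disks_not_filtered is hypothetical.

```shell
# afd_disks_not_filtered: read `asmcmd afd_lsdsk` output on stdin and print
# the label and path of every disk whose Filtering column is not ENABLED.
afd_disks_not_filtered() {
  # Data rows have exactly three fields and a path starting with "/";
  # the header, separator and prompt lines do not match and are skipped.
  awk 'NF == 3 && $3 ~ /^\// && $2 != "ENABLED" { print $1, $3 }'
}

# Example use as grid user (commented out so the sketch can be sourced safely):
# asmcmd afd_lsdsk | afd_disks_not_filtered
```

An empty result means every listed disk reports filtering as ENABLED.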

 

 


 

Installing Oracle Grid Infrastructure 12c R2

It has been an exciting week: Oracle 12c R2 came out and suddenly it was time to refresh the RAC test environments. My friend Jacques opted for an upgrade from 12.1.0.2 to 12.2.0.1 (here is the link to his blog post), while I started with a fresh installation because I also upgraded the operating system to OEL 7.3.

Compared to 12c R1 there are new options in the installation process, but generally speaking the wizard is quite similar.

The first breakthrough is the simplified, image-based installation: there is no more runInstaller to invoke, but …

Unpack the .zip file directly inside the Grid Infrastructure home of the first cluster node, as described below:

[grid@oel7node06 ~]$ mkdir -p /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ chown grid:oinstall /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ cd /u01/app/12.2.0.1/grid 
[grid@oel7node06 grid]$ unzip -q download_location/grid_home_image.zip

# From an X session invoke the Grid Infrastructure wizard: 
[grid@oel7node06 grid]$ ./gridSetup.sh

 

01

 

 

The second screenshot lists the new cluster topologies available in 12c R2:

  • Oracle Standalone Cluster
  • Oracle Cluster Domain
    • Oracle Domain Services Cluster
    • Oracle Member Clusters
      • Oracle Member Cluster for Oracle Database
      • Oracle Member Cluster for Applications

 

In my case I’m installing an Oracle Standalone Cluster.

02

 

 

[Screenshots 03 to 22: remaining steps of the installation wizard]

 

And now time for testing.

 

 

New Oracle version (12.2.0.1) old BUG!

 

In June 2016 I posted about the following bug: Bug on Oracle 12c Multitenant & PDB Clone as Snapshot Copy, promising to post an update once version 12cR2 became available, because in the service request, originally opened against version 12.1.0.2, Oracle stated that the bug would be fixed in 12cR2.

I was so impatient that, just a few hours after the general availability of Oracle Database 12c Release 2, I created a new cluster and tested the fix.

For the record, the resolution of this bug is important for one of my clients, where we have implemented snapshot PDBs in the application development lifecycle.

 

So let’s see if the bug has been fixed!

SQL*Plus: Release 12.2.0.1.0 Production on Wed Mar 1 21:06:54 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production


SQL> CREATE PLUGGABLE DATABASE PDBACFS1_SNAP1 from PDBACFS1 SNAPSHOT COPY;

Pluggable database created.

SQL> ALTER PLUGGABLE DATABASE PDBACFS1_SNAP1 OPEN instances=all;

Pluggable database altered.

SQL> select CON_ID, NAME, OPEN_MODE, SNAPSHOT_PARENT_CON_ID from v$pdbs where NAME in ('PDBACFS1','PDBACFS1_SNAP1');

 CON_ID     NAME               OPEN_MODE  SNAPSHOT_PARENT_CON_ID
---------- ------------------- ---------- ----------------------
 5          PDBACFS1            READ WRITE
 6          PDBACFS1_SNAP1      READ WRITE               <-- This should be 5 but is NULL

2 rows selected.

 

From a certain point of view progress has been made: in version 12.1.0.2 the column SNAPSHOT_PARENT_CON_ID was always zero (0), now it is NULL!

I’m sorry for my customer; I’ll keep testing and hoping …

 

 

Linux for DBA: How to disable the ssh banner for a given user

Ready to install a new Oracle RAC cluster, but the ssh banner (in /etc/issue.net, protected by root privileges) is breaking the non-interactive ssh commands issued by grid & oracle?

Here is the trick to disable it:

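# Add this empty file to the home directory of the grid and oracle users
touch ~/.hushlogin

# or create a per-user ssh client configuration
# (note: ">" overwrites any existing ~/.ssh/config)
mkdir -p ~/.ssh
chmod 700 ~/.ssh
echo "LogLevel quiet" > ~/.ssh/config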

Oracle DB stored on ASM vs ACFS

Nowadays a new Oracle database environment with Grid Infrastructure has three main storage options:

  1. Third party clustered file system
  2. ASM Disk Groups
  3. ACFS File System

While the first option was not in scope, this post compares the results of the tests between ASM and ACFS, highlighting when to use one or the other to store 12c non-CDB or CDB databases.

The tests, conducted on different environments using Oracle version 12.1.0.2 with the July PSU, have shown results that conflict with what Oracle is promoting for the Oracle Database Appliance (ODA) in the following paper: “Frequently Asked Questions Storing Database Files in ACFS on Oracle Database Appliance”.

 

Outcome of the tests

ASM remains the preferred option to achieve the best I/O performance, while ACFS introduces interesting features like DB snapshots to provision new databases quickly and space-efficiently.

The performance gap between the two solutions is not negligible, as shown below by the AWR – TOP Timed Events sections of two PDBs sharing the same infrastructure and executing the same workload, but using ASM and ACFS storage respectively:

  • PDBASM: Pluggable Database stored on ASM Disk Group
  • PDBACFS: Pluggable Database stored on ACFS File System

 

 

PDBASM AWR – TOP Timed Events and Other Stats

topevents_asm

fg_asm

 

 

PDBACFS AWR – TOP Timed Events and Other Stats

TopEvents_ACFS.png

fg_acfs

 

Due to the different characteristics and results when ASM or ACFS is in use, it is not possible to give a generic recommendation. The choice should be made case by case, driven by business needs such as maximum performance versus fast and efficient database cloning.

 

 

 

 

Severe Oracle instability due to new RedHat 7.2 feature which releases IPC objects

I recently installed a two-node RAC version 12.1.0.2 on top of RedHat 7.2, and a few hours after the initial setup I started experiencing ASM and database crashes.

Checking the alert log, I found the following errors:

Tue Oct 04 05:25:17 2016
Dumping diagnostic data in directory=[cdmp_20161004052517], requested by (instance=1, osid=84872 (MMAN)), summary=[abnormal instance termination].
Tue Oct 04 05:25:18 2016
Instance terminated by USER, pid = 84872
Tue Oct 04 05:25:18 2016
Errors in file /oams/base/diag/rdbms/txdop/txdop1/trace/txdop1_mman_84872.trc:
ORA-27300: OS system dependent operation:semctl failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwrm1
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1

 

The errors pointed to the OS and in particular to the possibility that semaphores in use by Oracle have been removed.

Because this was a fresh installation and I was the only person using the cluster, it was easy to exclude any third-party activity. Then I double-checked the kernel parameters and all other system prerequisites without finding any wrong configuration.

Finally, on MOS I found the following note: “ALERT: Setting RemoveIPC=yes on Redhat 7.2 Crashes ASM and Database Instances as Well as Any Application That Uses a Shared Memory Segment (SHM) or Semaphores (SEM) (Doc ID 2081410.1)”

In Redhat 7.2, the systemd-logind service introduced a new feature that removes all IPC objects when a user fully logs out.
The feature is controlled by the RemoveIPC option in the /etc/systemd/logind.conf configuration file; see man logind.conf(5) for details.

The default value for RemoveIPC in RHEL7.2 is yes.

As a result, when the last oracle or grid user disconnects, the OS removes shared memory segments and semaphores for those users.
As Oracle ASM and Databases use shared memory segments for SGA, removing shared memory segments will crash the Oracle ASM and database instances.
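
The remedy is not spelled out above; a commonly applied one is to set RemoveIPC=no in /etc/systemd/logind.conf on every node and restart systemd-logind. The sketch below assumes that approach, so check it against the MOS note before using it. The function name set_removeipc_no is hypothetical, and the file is passed as an argument so the edit can be tried on a copy first.

```shell
# set_removeipc_no CONF_FILE   (hypothetical helper)
# Ensure CONF_FILE contains an active "RemoveIPC=no" line.
set_removeipc_no() {
  local conf=$1
  if grep -Eq '^[#[:space:]]*RemoveIPC=' "$conf"; then
    # Replace the existing (possibly commented-out) setting in place.
    sed -i -E 's/^[#[:space:]]*RemoveIPC=.*/RemoveIPC=no/' "$conf"
  else
    # No setting present: append it (logind.conf has a single [Login] section).
    echo 'RemoveIPC=no' >> "$conf"
  fi
}

# Typical use as root on each node (commented out so the sketch can be sourced safely):
# set_removeipc_no /etc/systemd/logind.conf
# systemctl restart systemd-logind
```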