Severe Oracle instability due to new RedHat 7.2 feature which releases IPC objects

I have recently installed a two node RAC version 12.1.0.2 on top of RedHat 7.2 and few hours after the initial setup I started experiencing ASM and database crashes.

Checking in the alert log I found the following errors:

Tue Oct 04 05:25:17 2016
Dumping diagnostic data in directory=[cdmp_20161004052517], requested by (instance=1, osid=84872 (MMAN)), summary=[abnormal instance termination].
Tue Oct 04 05:25:18 2016
Instance terminated by USER, pid = 84872
Tue Oct 04 05:25:18 2016
Errors in file /oams/base/diag/rdbms/txdop/txdop1/trace/txdop1_mman_84872.trc:
ORA-27300: OS system dependent operation:semctl failed with status: 22
ORA-27301: OS failure message: Invalid argument
ORA-27302: failure occurred at: sskgpwrm1
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1

 

The errors pointed to the OS and in particular to the possibility that semaphores in use by Oracle have been removed.

Because this is a fresh installation and I was the only person using the cluster, it was easy to exclude any third party activity. Then I double-checked the kernel parameters and all other system pre-requisites without finding any wrong configuration.

Finally, on MOS and I found the followinfg note ALERT: Setting RemoveIPC=yes on Redhat 7.2 Crashes ASM and Database Instances as Well as Any Application That Uses a Shared Memory Segment (SHM) or Semaphores (SEM) (Doc ID 2081410.1)”

Redhat 7.2, systemd-logind service introduced a new feature to remove all IPC objects when a user fully logs out.
The feature is controled by the option RemoveIPC in the /etc/systemd/logind.conf configuration file, see man logind.conf(5) for details.

The default value for RemoveIPC in RHEL7.2 is yes.

As a result, when the last oracle or grid user disconnects, the OS removes shared memory segments and semaphores for those users.
As Oracle ASM and Databases use shared memory segments for SGA, removing shared memory segments will crash the Oracle ASM and database instances.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s