Grid Management DB filling up ASM disk space

I recently discovered, on Oracle Grid Infrastructure 12cR2, that the ASM disk group hosting the Management DB (-MGMTDB) was filling up very quickly.

This is due to a bug in the oclumon data purge procedure.

To fix the problem, two possibilities are available:

  1. Recreate the Management DB
  2. Manually truncate the tables that are not purged and shrink the tablespace

 

The two options are described below.

 

Option 1 – Recreate the Management DB

As the root user, stop and disable the ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

As the grid user, from the node hosting the Management Database instance, run the following commands:

$ /u01/app/12.2.0.1/grid/bin/srvctl status mgmtdb
$ /u01/app/12.2.0.1/grid/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB
Connecting to database
4% complete
9% complete
14% complete
19% complete
23% complete
28% complete
47% complete
Updating network configuration files
48% complete
52% complete
Deleting instance and datafiles
76% complete
100% complete

 

Recreate the MGMTDB:

$ /u01/app/12.2.0.1/grid/bin/dbca -silent -createDatabase -createAsContainerDatabase true \
  -templateName MGMTSeed_Database.dbc \
  -sid -MGMTDB \
  -gdbName _mgmtdb \
  -storageType ASM \
  -diskGroupName GIMR \
  -datafileJarLocation <GI HOME>/assistants/dbca/templates \
  -characterset AL32UTF8 \
  -autoGeneratePassword \
  -skipUserTemplateCheck

 

Create the GIMR pluggable database:

$ /u01/app/12.2.0.1/grid/bin/mgmtca -local
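
Once the Management DB has been recreated, the ora.crf resource that was stopped at the beginning can be re-enabled and restarted. A minimal sketch, as root on each cluster node (same GI home as above):

# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=1 -init
# /u01/app/12.2.0.1/grid/bin/crsctl start res ora.crf -init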

 


 

Option 2 – Manually truncate the tables

 

As the root user, stop and disable the ora.crf resource on each cluster node:

# /u01/app/12.2.0.1/grid/bin/crsctl stop res ora.crf -init
# /u01/app/12.2.0.1/grid/bin/crsctl modify res ora.crf -attr ENABLED=0 -init

 

Connect to MGMTDB and identify the segments to truncate:

export ORACLE_SID=-MGMTDB
$ORACLE_HOME/bin/sqlplus / as sysdba
SQL> select pdb_name from dba_pdbs where pdb_name!='PDB$SEED';

SQL> alter session set container=GIMR_DSCREP_10;

Session altered.

SQL> col obj format a50
SQL> select owner||'.'||SEGMENT_NAME obj, BYTES from dba_segments where owner='CHM' order by 2 asc;

 

These two tables are likely to be much bigger than the rest:

  • CHM.CHMOS_PROCESS_INT_TBL
  • CHM.CHMOS_DEVICE_INT_TBL

Truncate the tables:

SQL> truncate table CHM.CHMOS_PROCESS_INT_TBL;
SQL> truncate table CHM.CHMOS_DEVICE_INT_TBL;

 

Then, if needed, shrink the tablespace as sketched below, and the job is done.
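
A minimal sketch of the shrink step, assuming the truncated CHM segments live in the SYSMGMTDATA tablespace (verify the tablespace and datafile names first; the path and target size below are only examples). Afterwards ora.crf can be re-enabled as shown in Option 1.

--Identify the datafile(s) of the tablespace to shrink
SQL> select tablespace_name, file_name, bytes/1024/1024 MB from dba_data_files order by 1;

--Resize the datafile to reclaim the space (example path and size)
SQL> alter database datafile '+GIMR/_MGMTDB/<datafile_path>/sysmgmtdata.dbf' resize 2048M;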

 


 


RHEL 7.4 fails to mount ACFS File System due to KMOD package

After a fresh OS installation or an upgrade to RHEL 7.4, any attempt to install ACFS drivers will fail with the following message: “ACFS-9459 ADVM/ACFS is not supported on this OS version”

The error persists even when the Oracle Grid Infrastructure software includes Patch 26247490: 12.2 ACFS MODULE ERRORS & CRASH DURING MODULE LOAD & UNLOAD WITH OL7U4 RHCK.

 

This problem has been identified by Oracle as BUG 26320387 – 7.4 kmod weak-modules not checking kABI compatibility correctly, and by Red Hat as Bugzilla bug 1477073 – 7.4 kmod weak-modules --dry-run changed output format missing 'is compatible' messages.

root@oel7node06:/u01/app/12.2.0.1/grid/crs/install# /u01/app/12.2.0.1/grid/bin/acfsroot install
ACFS-9459: ADVM/ACFS is not supported on this OS version: '3.10.0-514.6.1.el7.x86_64'

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleadvm 776830 7
oracleoks 654476 1 oracleadvm
oracleafd 205543 1

 

The current workaround consists of downgrading the kmod RPM to kmod-20-9.el7.x86_64.

root@oel7node06:~# yum downgrade kmod-20-9.el7
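
After the downgrade, the ACFS driver installation can be retried and its state checked; a sketch using the same GI home as above (output omitted):

root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsroot install
root@oel7node06:~# /u01/app/12.2.0.1/grid/bin/acfsdriverstate loaded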

 

After the package downgrade the ACFS drivers are correctly loaded:

root@oel7node06:~# /sbin/lsmod | grep oracle
oracleacfs 4597925 2
oracleadvm 776830 8
oracleoks 654476 2 oracleacfs,oracleadvm
oracleafd 205543 1

 


 

 

 

Adding flexibility to Oracle GI: implementing Multiple SCANs

Nowadays, business requirements force IT to implement more and more sophisticated and consolidated environments, without compromising the availability, performance and flexibility of the applications running on them.

In this post I explain how to improve Grid Infrastructure network flexibility by implementing multiple SCANs, and how to associate one or more networks with the Oracle databases.

To better understand the reasons for this type of implementation, a few common use cases are listed below:

  • Applications are deployed on different/dedicated subnets.
  • Network isolation is required for security reasons.
  • Different database protocols are in use (TCP, TCPS, etc.).

 

 

Single Client Access Name (SCAN)

By default, on each Oracle Grid Infrastructure cluster, independently of the number of nodes, one SCAN with three SCAN VIPs is created.

The default Oracle Clusterware network/SCAN configuration is depicted below.

 

[Diagram: Single_Scan_Listener]

 

Multiple Single Client Access Name (SCAN) implementation

Before implementing additional SCANs, the OS provisioning of the new network interfaces or of the new VLAN tagging has to be completed.

The current example uses the second option (VLAN tagging): the bond0 interface is an active/active bond of two 10GbE cards, to which a VLAN tag has been added.

The customized Oracle Clusterware network/SCAN configuration, after adding a second SCAN, is represented below.

 

[Diagram: Multi_Scan_Listeners]

 

Step-by-step implementation

After completing the OS network setup, as the grid user add the new interface to the Grid Infrastructure:

grid@host01a:~# oifcfg setif -global bond0.764/10.15.69.0:public

grid@host01a:~# oifcfg getif
eno49 192.168.7.32 global cluster_interconnect,asm
eno50 192.168.9.48 global cluster_interconnect,asm
bond0 10.11.8.0 global public
bond0.764 10.15.69.0 global public
grid@host01a:~#

 

Then, as root, create network number 2 and display the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add network -netnum 2 -subnet 10.15.69.0/255.255.255.0/bond0.764 -nettype STATIC

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl config network -netnum 2
Network 2 exists
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
Ping Targets:
Network is enabled
Network is individually enabled on nodes:
Network is individually disabled on nodes:

 

As root user add the node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host01a -netnum 2 -address host01b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host02a -netnum 2 -address host02b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host03a -netnum 2 -address host03b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host04a -netnum 2 -address host04b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host05a -netnum 2 -address host05b-vip.emilianofusaglia.net/255.255.255.0
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add vip -node host06a -netnum 2 -address host06b-vip.emilianofusaglia.net/255.255.255.0

 

As the grid user, create a new listener on network number 2:

grid@host01a:~# srvctl add listener -listener LISTENER2 -netnum 2 -endpoints "TCP:1532"
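
To double-check the new listener definition before moving on, its configuration can be displayed (a quick verification; output omitted):

grid@host01a:~# srvctl config listener -listener LISTENER2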

 

As root user add the new SCAN to the network number 2:

 root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl add scan -scanname scan-02.emilianofusaglia.net -netnum 2

 

As root user start the new node VIPs:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host01b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host02b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host03b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host04b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host05b-vip.emilianofusaglia.net
root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start vip -vip host06b-vip.emilianofusaglia.net

 

As grid user start the new node Listeners:

grid@host01a:~# srvctl start listener -listener LISTENER2
grid@host01a:~# srvctl status listener -listener LISTENER2
Listener LISTENER2 is enabled
Listener LISTENER2 is running on node(s): host01a,host02a,host03a,host04a,host05a,host06a

 

As root user start the new SCAN and as grid user check the configuration:

root@host01a:~# /u01/app/12.2.0.1/grid/bin/srvctl start scan -netnum 2

grid@host01a:~# srvctl config scan -netnum 2
SCAN name: scan-02.emilianofusaglia.net, Network: 2
Subnet IPv4: 10.15.69.0/255.255.255.0/, static
Subnet IPv6:
SCAN 1 IPv4 VIP: 10.15.69.44
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 2 IPv4 VIP: 10.15.69.45
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:
SCAN 3 IPv4 VIP: 10.15.69.43
SCAN VIP is enabled.
SCAN VIP is individually enabled on nodes:
SCAN VIP is individually disabled on nodes:

grid@host01a:~# srvctl status scan -netnum 2
SCAN VIP scan1_net2 is enabled
SCAN VIP scan1_net2 is running on node host02a
SCAN VIP scan2_net2 is enabled
SCAN VIP scan2_net2 is running on node host01a
SCAN VIP scan3_net2 is enabled
SCAN VIP scan3_net2 is running on node host03a

 

As grid user add the SCAN Listener and check the configuration:

grid@host01a:~# srvctl add scan_listener -netnum 2 -listener LISTENER2 -endpoints TCP:1532

grid@host01a:~# srvctl config scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN2_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:
SCAN Listener LISTENER2_SCAN3_NET2 exists. Port: TCP:1532
Registration invited nodes:
Registration invited subnets:
SCAN Listener is enabled.
SCAN Listener is individually enabled on nodes:
SCAN Listener is individually disabled on nodes:

 

As grid user start the SCAN Listener2 and check the status:

grid@host01a:~# srvctl start scan_listener -netnum 2

grid@host01a:~# srvctl status scan_listener -netnum 2
SCAN Listener LISTENER2_SCAN1_NET2 is enabled
SCAN listener LISTENER2_SCAN1_NET2 is running on node host02a
SCAN Listener LISTENER2_SCAN2_NET2 is enabled
SCAN listener LISTENER2_SCAN2_NET2 is running on node host01a
SCAN Listener LISTENER2_SCAN3_NET2 is enabled
SCAN listener LISTENER2_SCAN3_NET2 is running on node host03a

 

Defining the multi-SCAN configuration per database

Once the above configuration is completed, it remains to define which SCAN(s) each database should use.

When multiple SCANs exist, by default CRS populates the LISTENER_NETWORKS parameter so that the database registers against all SCANs and listeners.

To override this default behavior, allowing for example a specific database to register only against the SCAN scan-02.emilianofusaglia.net, the LISTENER_NETWORKS database parameter should be configured manually (a sketch follows below).
LISTENER_NETWORKS can be set dynamically, but the new value is only enforced at the next instance restart.
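
A minimal sketch of such a manual configuration, assuming a database with instance DB1 running on host01a (the network alias net2, service names and ports are examples; repeat the command for each instance with its own node VIP):

SQL> alter system set listener_networks=
  '((NAME=net2)(LOCAL_LISTENER=(ADDRESS=(PROTOCOL=TCP)(HOST=host01b-vip.emilianofusaglia.net)(PORT=1532)))(REMOTE_LISTENER=scan-02.emilianofusaglia.net:1532))'
  scope=both sid='DB1';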

 


 

ASM Filter Driver (ASMFD)

 

ASM Filter Driver is a Linux kernel module introduced in 12c R1. It resides in the I/O path of the Oracle ASM disks and provides the following features:

  • Rejecting all non-Oracle I/O write requests to ASM Disks.
  • Device name persistency.
  • Node level fencing without reboot.

 

In 12c R2 ASMFD can be enabled from the GUI of the Grid Infrastructure installation, as shown in the post GI 12c R2 Installation at step #8 “Create ASM Disk Group”.

Once ASM Filter Driver is in use, the disks are managed using the ASMFD label name, similarly to ASMLib.

 

Here are a few examples of working with ASM Filter Driver.

--How to create an ASMFD label in SQL*Plus
SQL> Alter system label set 'DATA1' to '/dev/mapper/mpathak';

System altered.


--How to create an ASM Disk Group with ASMFD
CREATE DISKGROUP DATA_DG EXTERNAL REDUNDANCY DISK 'AFD:DATA1' SIZE 30720M
ATTRIBUTE 'SECTOR_SIZE'='512','LOGICAL_SECTOR_SIZE'='512','compatible.asm'='12.2.0.1',
'compatible.rdbms'='12.2.0.1','compatible.advm'='12.2.0.1','au_size'='4M';

Diskgroup created.

 

ASM Filter Driver can also be managed from the ASM command-line utility ASMCMD:

--Check ASMFD status
ASMCMD> afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on host 'oel7node06.localdomain'


--List ASM Disks where ASMFD is enabled
ASMCMD> afd_lsdsk
--------------------------------------------------------------------------------
Label                    Filtering                Path
================================================================================
DATA1                      ENABLED                /dev/mapper/mpathak
DATA2                      ENABLED                /dev/mapper/mpathan
DATA3                      ENABLED                /dev/mapper/mpathw
DATA4                      ENABLED                /dev/mapper/mpathac
GIMR1                      ENABLED                /dev/mapper/mpatham
GIMR2                      ENABLED                /dev/mapper/mpathaj
GIMR3                      ENABLED                /dev/mapper/mpathal
GIMR4                      ENABLED                /dev/mapper/mpathaf
GIMR5                      ENABLED                /dev/mapper/mpathai
RECO3                      ENABLED                /dev/mapper/mpathy
RECO1                      ENABLED                /dev/mapper/mpathab
RECO2                      ENABLED                /dev/mapper/mpathx
ASMCMD>


--How to remove an ASMFD label in ASMCMD
ASMCMD> afd_unlabel DATA4
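
Once the disks are labeled, the ASM discovery string is typically pointed at the AFD labels. A sketch of the parameter change, run from the ASM instance (assuming ASMFD is already loaded and an spfile is in use):

--Point the ASM discovery string at the AFD labels
SQL> alter system set asm_diskstring='AFD:*' scope=both;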

 

 


 

Installing Oracle Grid Infrastructure 12c R2

It has been an exciting week: Oracle 12c R2 came out and suddenly it was time to refresh the RAC test environments. My friend Jacques opted for an upgrade from 12.1.0.2 to 12.2.0.1 (here is the link to his blog post), while I started with a fresh installation, because I also upgraded the operating system to OEL 7.3.

Compared to 12c R1 there are new options in the installation process, but generally speaking the wizard is quite similar.

The first breakthrough is the simplified, image-based installation: there is no longer a runInstaller.sh to invoke. Instead, unpack the .zip file directly inside the Grid Infrastructure home of the first cluster node, as described below:

[grid@oel7node06 ~]$ mkdir -p /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ chown grid:oinstall /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ cd /u01/app/12.2.0.1/grid 
[grid@oel7node06 grid]$ unzip -q download_location/grid_home_image.zip

# From an X session invoke the Grid Infrastructure wizard: 
[grid@oel7node06 grid]$ ./gridSetup.sh

 

[Screenshot 01]

 

 

The second screenshot lists the new cluster topologies available in 12c R2:

  • Oracle Standalone Cluster
  • Oracle Cluster Domain
    • Oracle Domain Services Cluster
    • Oracle Member Clusters
      • Oracle Member Cluster for Oracle Database
      • Oracle Member Cluster for Applications

 

In my case I’m installing an Oracle Standalone Cluster.

[Screenshot 02]

 

 

[Screenshots 03-22: remaining installation wizard steps]

And now time for testing.

 

 

Linux for DBA: How to disable the ssh banner for a given user

Ready to install a new Oracle RAC cluster, but the ssh banner (defined in /etc/issue.net and protected by root privileges) breaks the non-interactive ssh commands issued by the grid and oracle users?

Here the trick to disable it:

--Add this empty file to the grid and oracle UNIX home
touch ~/.hushlogin 

--or
mkdir -p .ssh
chmod 700 .ssh 
echo "LogLevel quiet" > ~/.ssh/config

How to restore OCR and Voting disk

################################################################
# How to restore OCR and Voting disk  on Oracle 11g R2.
################################################################

--Location and status of OCR before starting the test:
 root@host1:/u01/GRID/11.2/cdata # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2744
 Available space (kbytes) :     259376
 ID                       :  401168391
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Check the existence of BACKUPS:
 root@host1:/root # /u01/GRID/11.2/bin/ocrconfig -showbackup
host1     2010/01/21 14:17:54     /u01/GRID/11.2/cdata/cluster01/backup00.ocr
host1     2010/01/21 05:58:31     /u01/GRID/11.2/cdata/cluster01/backup01.ocr
host1     2010/01/21 01:58:30     /u01/GRID/11.2/cdata/cluster01/backup02.ocr
host1     2010/01/20 05:58:21     /u01/GRID/11.2/cdata/cluster01/day.ocr
host1     2010/01/14 23:12:07     /u01/GRID/11.2/cdata/cluster01/week.ocr
 PROT-25: Manual backups for the Oracle Cluster Registry are not available
--Identify all the disks belonging to the disk group +OCRVOTING:
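 --(The query itself is not shown above; a likely form, run from the ASM instance, is:)
 SQL> select name, path from v$asm_disk where name like 'OCRVOTING%' order by name;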
NAME                                       PATH
 ------------------------------ ------------------------------------------------------------
 OCRVOTING_0000                 /dev/oracle/asm.25.lun
 OCRVOTING_0001                 /dev/oracle/asm.26.lun
 OCRVOTING_0002                 /dev/oracle/asm.27.lun
 OCRVOTING_0003                 /dev/oracle/asm.28.lun
 OCRVOTING_0004                 /dev/oracle/asm.29.lun
5 rows selected.
--Corrupt the disks belonging to the disk group +OCRVOTING:
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.25.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.26.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.27.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.28.lun bs=1024 count=1000
 dd if=/tmp/corrupt_disk of=/dev/oracle/asm.29.lun bs=1024 count=1000
--OCR Check after Corruption:
 root@host1:/tmp # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2712
 Available space (kbytes) :     259408
 ID                       :  701409037
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Stop and Start of database instance after corruption
 oracle@host1:/u01/oracle/data $ srvctl stop instance -d DB -i DB1
 oracle@host1:/u01/oracle/data $ srvctl start instance -d DB -i DB1
--Stop and Start entire Cluster:
-host1:
 root@host1:/tmp # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'host1'
 CRS-2673: Attempting to stop 'ora.db.db' on 'host1'
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.host1.vip' on 'host1'
 CRS-2677: Stop of 'ora.host1.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.scan2.vip' on 'host1'
 CRS-2673: Attempting to stop 'ora.scan3.vip' on 'host1'
 CRS-2673: Attempting to stop 'ora.host2.vip' on 'host1'
 CRS-2677: Stop of 'ora.scan2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.scan3.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.host2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.db.db' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'host1'
 CRS-2673: Attempting to stop 'ora.FRA1.dg' on 'host1'
 CRS-2677: Stop of 'ora.DATA1.dg' on 'host1' succeeded
 CRS-2677: Stop of 'ora.FRA1.dg' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.ons' on 'host1'
 CRS-2673: Attempting to stop 'ora.eons' on 'host1'
 CRS-2677: Stop of 'ora.ons' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.net1.network' on 'host1'
 CRS-2677: Stop of 'ora.net1.network' on 'host1' succeeded
 CRS-2677: Stop of 'ora.eons' on 'host1' succeeded
 CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'host1' has completed
 CRS-2677: Stop of 'ora.crsd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host1'
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host1'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host1'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host1'
 CRS-2673: Attempting to stop 'ora.evmd' on 'host1'
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.evmd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host1'
 CRS-2677: Stop of 'ora.cssd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host1'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host1'
 CRS-2677: Stop of 'ora.gipcd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host1' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host1' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--host2:
 root@host2:/root # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host2'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host2'
 CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'host2'
 CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'host2'
 CRS-2673: Attempting to stop 'ora.db.db' on 'host2'
 CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'host2'
 CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan2.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan2.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.scan2.vip' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan3.vip' on 'host2'
 CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.host2.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan3.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.scan3.vip' on 'host1'
 CRS-2677: Stop of 'ora.host2.vip' on 'host2' succeeded
 CRS-2672: Attempting to start 'ora.host2.vip' on 'host1'
 CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.scan1.vip' on 'host2'
 CRS-2677: Stop of 'ora.scan1.vip' on 'host2' succeeded
 CRS-2676: Start of 'ora.scan2.vip' on 'host1' succeeded
 CRS-2676: Start of 'ora.scan3.vip' on 'host1' succeeded
 CRS-2676: Start of 'ora.host2.vip' on 'host1' succeeded
 CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'host2' succeeded
 CRS-2677: Stop of 'ora.db.db' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.DATA1.dg' on 'host2'
 CRS-2673: Attempting to stop 'ora.FRA1.dg' on 'host2'
 CRS-2677: Stop of 'ora.DATA1.dg' on 'host2' succeeded
 CRS-2677: Stop of 'ora.FRA1.dg' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.asm' on 'host2'
 CRS-2677: Stop of 'ora.asm' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.ons' on 'host2'
 CRS-2673: Attempting to stop 'ora.eons' on 'host2'
 CRS-2677: Stop of 'ora.ons' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.net1.network' on 'host2'
 CRS-2677: Stop of 'ora.net1.network' on 'host2' succeeded
 CRS-2677: Stop of 'ora.eons' on 'host2' succeeded
 CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'host2' has completed
 CRS-2677: Stop of 'ora.crsd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host2'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host2'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host2'
 CRS-2673: Attempting to stop 'ora.evmd' on 'host2'
 CRS-2673: Attempting to stop 'ora.asm' on 'host2'
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host2'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host2' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.evmd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host2' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host2'
 CRS-2677: Stop of 'ora.cssd' on 'host2' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host2'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host2'
 CRS-2677: Stop of 'ora.gipcd' on 'host2' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host2' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host2' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--host2
 root@host2:/u01/GRID/11.2/cdata/cluster01 # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--CRS Alert log: (Start failed because the Diskgroup is not available)
 2010-01-21 16:29:07.785
 [cssd(10123)]CRS-1705:Found 0 configured voting files but 1 voting files are required, terminating to ensure data integrity; details at (:CSSNM00065:) in /u01/GRID/11.2/log/host1/cssd/ocssd.log
 2010-01-21 16:29:07.785
 [cssd(10123)]CRS-1603:CSSD on node host1 shutdown by user.
 2010-01-21 16:29:07.918
 [ohasd(9931)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'host1'.
 2010-01-21 16:30:05.489
 [/u01/GRID/11.2/bin/orarootagent.bin(10113)]CRS-5818:Aborted command 'start for resource: ora.diskmon 1 1' for resource 'ora.diskmon'. Details at (:CRSAGF00113:) in /u01/GRID/11.2/log/host1/agent/ohasd/orarootagent_root/orarootagent_root.log.
 2010-01-21 16:30:09.504
 [ohasd(9931)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.diskmon'. Details at (:CRSPE00111:) in /u01/GRID/11.2/log/host1/ohasd/ohasd.log.
 2010-01-21 16:30:20.687
 [cssd(10622)]CRS-1713:CSSD daemon is started in clustered mode
 2010-01-21 16:30:21.801
 [cssd(10622)]CRS-1705:Found 0 configured voting files but 1 voting files are required, terminating to ensure data integrity; details at (:CSSNM00065:) in /u01/GRID/11.2/log/host1/cssd/ocssd.log
 2010-01-21 16:30:21.801
 [cssd(10622)]CRS-1603:CSSD on node host1 shutdown by user.
--On host1 stop CRS because, due to the Voting Disk unavailability, it is not running properly:
 root@host1:/tmp # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-4548: Unable to connect to CRSD
 CRS-2675: Stop of 'ora.crsd' on 'host1' failed
 CRS-2679: Attempting to clean 'ora.crsd' on 'host1'
 CRS-4548: Unable to connect to CRSD
 CRS-2678: 'ora.crsd' on 'host1' has experienced an unrecoverable failure
 CRS-0267: Human intervention required to resume its availability.
 CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'host1' has failed
 CRS-4687: Shutdown command has completed with error(s).
 CRS-4000: Command Stop failed, or completed with errors.
--Because not all the processes are stopping, disable the cluster auto start and reboot
 --the server to clean up all the pending processes.
root@host1:/tmp # /u01/GRID/11.2/bin/crsctl disable crs
 CRS-4621: Oracle High Availability Services autostart is disabled.
root@host1:/tmp # reboot
--Start the Cluster in EXCLUSIVE mode in order to recreate the ASM disk group:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs -excl
 CRS-4123: Oracle High Availability Services has been started.
 CRS-2672: Attempting to start 'ora.gipcd' on 'host1'
 CRS-2672: Attempting to start 'ora.mdnsd' on 'host1'
 CRS-2676: Start of 'ora.gipcd' on 'host1' succeeded
 CRS-2676: Start of 'ora.mdnsd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.gpnpd' on 'host1'
 CRS-2676: Start of 'ora.gpnpd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.cssdmonitor' on 'host1'
 CRS-2676: Start of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.cssd' on 'host1'
 CRS-2679: Attempting to clean 'ora.diskmon' on 'host1'
 CRS-2681: Clean of 'ora.diskmon' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.diskmon' on 'host1'
 CRS-2676: Start of 'ora.diskmon' on 'host1' succeeded
 CRS-2676: Start of 'ora.cssd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.ctssd' on 'host1'
 CRS-2676: Start of 'ora.ctssd' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.asm' on 'host1'
 CRS-2676: Start of 'ora.asm' on 'host1' succeeded
 CRS-2672: Attempting to start 'ora.crsd' on 'host1'
 CRS-2676: Start of 'ora.crsd' on 'host1' succeeded
--Stop ASM and restart it using a pfile; example pfile content:
 *.asm_diskgroups='DATA1','FRA1'
 *.asm_diskstring='/dev/oracle/asm*'
 *.diagnostic_dest='/u01/oracle'
 +ASM1.instance_number=1
 +ASM2.instance_number=2
 *.instance_type='asm'
 *.large_pool_size=12M
 *.processes=500
 *.sga_max_size=1G
 *.sga_target=1G
 *.shared_pool_size=300M
--Recreate ASM Diskgroup
 --This command FAILS because asmca is not able to update the OCR:
 asmca -silent -createDiskGroup -diskGroupName OCRVOTING  -disk '/dev/oracle/asm.25.lun' -disk '/dev/oracle/asm.26.lun'  -disk '/dev/oracle/asm.27.lun'  -disk '/dev/oracle/asm.28.lun'  -disk '/dev/oracle/asm.29.lun'  -redundancy HIGH -compatible.asm '11.2.0.0.0'  -compatible.rdbms '11.2.0.0.0' -compatible.advm '11.2.0.0.0'
--Create the disk group using the SQL*Plus CREATE DISKGROUP command and save the ASM spfile inside it:
 create Diskgroup OCRVOTING high redundancy disk '/dev/oracle/asm.25.lun',
 '/dev/oracle/asm.26.lun', '/dev/oracle/asm.27.lun',
 '/dev/oracle/asm.28.lun', '/dev/oracle/asm.29.lun'
 ATTRIBUTE  'compatible.asm'='11.2.0.0.0', 'compatible.rdbms'='11.2.0.0.0';
create spfile='+OCRVOTING' from pfile='/tmp/asm_pfile.ora';
File created.
SQL> shut immediate
 ASM diskgroups dismounted
 ASM instance shutdown
 SQL> startup
 ASM instance started
Total System Global Area 1069252608 bytes
 Fixed Size                  2154936 bytes
 Variable Size            1041931848 bytes
 ASM Cache                  25165824 bytes
 ASM diskgroups mounted
-- Restore OCR from backup:
 root@host1:/root # /u01/GRID/11.2/bin/ocrconfig -restore /u01/GRID/11.2/cdata/cluster01/backup00.ocr
--Check the OCR status after restore:
 root@host1:/root # /u01/GRID/11.2/bin/ocrcheck
 Status of Oracle Cluster Registry is as follows :
 Version                  :          3
 Total space (kbytes)     :     262120
 Used space (kbytes)      :       2712
 Available space (kbytes) :     259408
 ID                       :  701409037
 Device/File Name         : +OCRVOTING
 Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
--Restore the Voting Disk:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl replace votedisk +OCRVOTING
 Successful addition of voting disk 7s16f9fbf4b64f74bfy0ee8826f15eb4.
 Successful addition of voting disk 9k6af49d3cd54fc5bf28a2fc3899c8c6.
 Successful addition of voting disk 876eb99563924ff6bfc1defe6865deeb.
 Successful addition of voting disk 12230b5ef41f4fc2bf2cae957f765fb0.
 Successful addition of voting disk 47812b7f6p034f33bf13490e6e136b8b.
 Successfully replaced voting disk group with +OCRVOTING.
 CRS-4266: Voting file(s) successfully replaced
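--(Optionally verify the restored voting disks; a standard check, output omitted:)
 root@host1:/root # /u01/GRID/11.2/bin/crsctl query css votedisk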
--Re-enable CRS auto startup
 root@host1:/root # /u01/GRID/11.2/bin/crsctl enable crs
 CRS-4622: Oracle High Availability Services autostart is enabled.
--Stop CRS on host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl stop crs
 CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'host1'
 CRS-2673: Attempting to stop 'ora.crsd' on 'host1'
 CRS-2677: Stop of 'ora.crsd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.gpnpd' on 'host1'
 CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'host1'
 CRS-2673: Attempting to stop 'ora.ctssd' on 'host1'
 CRS-2673: Attempting to stop 'ora.asm' on 'host1'
 CRS-2673: Attempting to stop 'ora.mdnsd' on 'host1'
 CRS-2677: Stop of 'ora.cssdmonitor' on 'host1' succeeded
 CRS-2677: Stop of 'ora.gpnpd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.mdnsd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.ctssd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.asm' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.cssd' on 'host1'
 CRS-2677: Stop of 'ora.cssd' on 'host1' succeeded
 CRS-2673: Attempting to stop 'ora.diskmon' on 'host1'
 CRS-2673: Attempting to stop 'ora.gipcd' on 'host1'
 CRS-2677: Stop of 'ora.gipcd' on 'host1' succeeded
 CRS-2677: Stop of 'ora.diskmon' on 'host1' succeeded
 CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'host1' has completed
 CRS-4133: Oracle High Availability Services has been stopped.
--Start CRS on host1
 root@host1:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--Start CRS on host2
 root@host2:/root # /u01/GRID/11.2/bin/crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.
--Check if all the Resources are running:
 root@host1:/root # /u01/GRID/11.2/bin/crsctl stat res -t
 --------------------------------------------------------------------------------
 NAME           TARGET  STATE        SERVER                   STATE_DETAILS
 --------------------------------------------------------------------------------
 Local Resources
 --------------------------------------------------------------------------------
 ora.DATA1.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.FRA1.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.LISTENER.lsnr
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.OCRVOTING.dg
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.asm
 ONLINE  ONLINE       host1                 Started
 ONLINE  ONLINE       host2                 Started
 ora.eons
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.gsd
 OFFLINE OFFLINE      host1
 OFFLINE OFFLINE      host2
 ora.net1.network
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 ora.ons
 ONLINE  ONLINE       host1
 ONLINE  ONLINE       host2
 --------------------------------------------------------------------------------
 Cluster Resources
 --------------------------------------------------------------------------------
 ora.LISTENER_SCAN1.lsnr
 1        ONLINE  ONLINE       host1
 ora.LISTENER_SCAN2.lsnr
 1        ONLINE  ONLINE       host2
 ora.LISTENER_SCAN3.lsnr
 1        ONLINE  ONLINE       host2
 ora.db.db
 1        ONLINE  ONLINE       host1                 Open
 2        ONLINE  ONLINE       host2                 Open
 ora.oc4j
 1        OFFLINE OFFLINE
 ora.scan1.vip
 1        ONLINE  ONLINE       host1
 ora.scan2.vip
 1        ONLINE  ONLINE       host2
 ora.scan3.vip
 1        ONLINE  ONLINE       host2
 ora.host1.vip
 1        ONLINE  ONLINE       host1
 ora.host2.vip
 1        ONLINE  ONLINE       host2