Exadata as Code

It is very exciting for me to share this post, because what I’m going to describe here is not the final, but the current and intermediate result achieved by the Trivadis team on the development and implementation of what I call “Exadata as Code” project.

In the Cloud and automation era, this is the Trivadis answer to increase efficiency, time-to-market and quality on the most challenging Exadata projects. Trivadis has been working hard to achieve this level of automation, covering most of the recurrent activities on the platform, and this is not all, as any CI/CD development, it is getting every day better, enriched by new features and fixes, which simplify the lifecycle management of such platforms.

Few months ago, I posted a blog entitled Bulk Exadata Patching, where it is shown how to improve the Exadata patching automation, but despite the improvements, it was still having a number of manual interactions. Now we have reached a much better level of automation, with one-click action to perform cumbersome tasks like: Infrastructure and Database Provisioning/Decommissioning, Patching, and many other operational tasks.

How Exadata as Code works

The concept is quite simple, all developments are made available in the form of Ansible playbooks and incapsulated inside Jenkins; this brings the following advantages:

  • User friendly interface
  • Orchestration Pipelines
  • Enhanced Security, recording auditing information and logging job executions
  • Job Scheduling

Exadata Administration Workflow

Scalable solution with which efficiently manage Exadata platforms

Oracle Database Administration Workflow

Secure high quality database operation with an effective DBA Team

Wrap-up

One takeaway from this experience, automation is the only option to stay competitive and deliver high quality service.


Special thanks go to all Trivadis’ colleagues working everyday so passionately on the project #BetterTogether.


Bulk Exadata Patching

After more than 11 year from the launch Oracle Exadata Machine has become popular on many companies across industries, making administrators, developers and final users almost unanimously satisfied about performance and availability.

But also, on Exadata there are cumbersome maintenance activities like patching.

Most of my Exadata customers have acquired 2 non-full RACKs, which makes the patching effort quite reasonable; but recently I started working on a project with multiple full RACKs, with tens of Storage Servers, Compute Nodes and hundreds of Virtual Machines…

A very challenging environment, especially when it came to patching…

Patching all the systems using the standard patchmgr utiliy was not acceptable, therefore I had to replace my standard patching procedure with a new one offering automation and scalability.

At this subject Oracle provides few handy options:

Patching Exadata Infrastructure

  • Storage Server Patching via http/https server: starting with Oracle Exadata System Software release 18.1.0.0.0, it is possible to patch the Storage Servers using an external http server hosting the new software image. The activitiy can be scheduled up to one week before the installation, allowing on each Cell the Management Server (MS) to start downloading and run pre-checks in advance. MS interrupts the software upgrade and generates an alert if the Cell does not comply with all pre-requisites.
  • Unbreakable Linux Network: ULN offers software patches, updates, and fixes for Oracle Linux and Oracle VM. The implementation of a local YUM repository leverages the patch automation of the bare metal OS or dom0/domU.
  • InfiniBand Switch: standard rolling upgrade patching procedure using patchmgr.

Patching Grid Infrastructure & RDBMS

  • GI & RDBMS: those components are patched using the standard Oracle tools common to all platforms, but the entire process has been parallalized using OS tools like dcli commands.

Overview Bulk Exadata Patching


Main Patching Commands

Storage Server – Scheduling Automated Storage Server Update via HTTP/HTTPS

On the Storage Cell set the local Apache location hosting the cell software

[root@efucndb01-a ~]# dcli -l root -g ~/cells cellcli -e 'alter softwareUpdate store=\"http://uln-yum.emilianofusaglia.net/cellsw\"'
efucncel01-a: Software Update successfully altered.
efucncel02-a: Software Update successfully altered.
efucncel03-a: Software Update successfully altered.
efucncel04-a: Software Update successfully altered.
efucncel05-a: Software Update successfully altered.
efucncel06-a: Software Update successfully altered.
efucncel07-a: Software Update successfully altered.
efucncel08-a: Software Update successfully altered.
efucncel09-a: Software Update successfully altered.
efucncel10-a: Software Update successfully altered.
efucncel11-a: Software Update successfully altered.
efucncel12-a: Software Update successfully altered.
efucncel13-a: Software Update successfully altered.
efucncel14-a: Software Update successfully altered.
[root@efucndb01-a ~]#

Schedule the update

[root@efucndb01-a ~]# dcli -l root -g ~/cells cellcli -e 'alter softwareUpdate time=\"03:20 AM WEDNESDAY\"'
efucncel01-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel02-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel03-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel04-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel05-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel06-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel07-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel08-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel09-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel10-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel11-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel12-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel13-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
efucncel14-a: Software update is scheduled to begin at: 2020-02-05T03:20:00+01:00.
[root@efucndb01-a ~]#

Verify the scheduled upgrade

[root@efucndb01-a ~]# dcli -l root -g ~/cells cellcli -e 'list softwareupdate detail'
efucncel01-a: name: 19.3.4.0.0.200130
efucncel01-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel01-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel01-a: time: 2020-02-05T03:20:00+01:00
efucncel02-a: name: 19.3.4.0.0.200130
efucncel02-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel02-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel02-a: time: 2020-02-05T03:20:00+01:00
efucncel03-a: name: 19.3.4.0.0.200130
efucncel03-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel03-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel03-a: time: 2020-02-05T03:20:00+01:00
efucncel04-a: name: 19.3.4.0.0.200130
efucncel04-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel04-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel04-a: time: 2020-02-05T03:20:00+01:00
efucncel05-a: name: 19.3.4.0.0.200130
efucncel05-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel05-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel05-a: time: 2020-02-05T03:20:00+01:00
efucncel06-a: name: 19.3.4.0.0.200130
efucncel06-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel06-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel06-a: time: 2020-02-05T03:20:00+01:00
efucncel07-a: name: 19.3.4.0.0.200130
efucncel07-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel07-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel07-a: time: 2020-02-05T03:20:00+01:00
efucncel08-a: name: 19.3.4.0.0.200130
efucncel08-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel08-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel08-a: time: 2020-02-05T03:20:00+01:00
efucncel09-a: name: 19.3.4.0.0.200130
efucncel09-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel09-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel09-a: time: 2020-02-05T03:20:00+01:00
efucncel10-a: name: 19.3.4.0.0.200130
efucncel10-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel10-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel10-a: time: 2020-02-05T03:20:00+01:00
efucncel11-a: name: 19.3.4.0.0.200130
efucncel11-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel11-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel11-a: time: 2020-02-05T03:20:00+01:00
efucncel12-a: name: 19.3.4.0.0.200130
efucncel12-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel12-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel12-a: time: 2020-02-05T03:20:00+01:00
efucncel13-a: name: 19.3.4.0.0.200130
efucncel13-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel13-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel13-a: time: 2020-02-05T03:20:00+01:00
efucncel14-a: name: 19.3.4.0.0.200130
efucncel14-a: status: PreReq OK. Ready to update at: 2020-02-05T03:20:00+01:00
efucncel14-a: store: http://uln-yum.emilianofusaglia.net/cellsw
efucncel14-a: time: 2020-02-05T03:20:00+01:00
[root@efucndb01-a ~]#

Unbreakable Linux Network

dom0 checks

[root@efuconsole dbserver_patch_19.200120]# ./patchmgr -dbnodes ~/dom0 -precheck -yum_repo http://uln-yum.emilianofusaglia.net/yum/EngineeredSystems/exadata/dbserver/dom0/19.3.4.0.0/base/x86_64 -target_version 19.3.4.0.0.200130

NOTE patchmgr release: 19.200120 (always check MOS 1553103.1 for the latest release of dbserver.patch.zip)
NOTE
WARNING Do not interrupt the patchmgr session.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot database nodes during update or rollback.
WARNING Do not open logfiles in write mode and do not try to alter them.

2020-02-06 14:06:17 +0100 :Working: Verify SSH equivalence for the root user to node(s)
2020-02-06 14:06:19 +0100 :SUCCESS: Verify SSH equivalence for the root user to node(s)
2020-02-06 14:06:22 +0100 :Working: Initiate precheck on 8 node(s)
2020-02-06 14:07:36 +0100 :Working: Check free space on node(s)
2020-02-06 14:07:42 +0100 :SUCCESS: Check free space on node(s)
2020-02-06 14:08:07 +0100 :Working: dbnodeupdate.sh running a precheck on node(s).
2020-02-06 14:09:43 +0100 :SUCCESS: Initiate precheck on node(s).
2020-02-06 14:09:45 +0100 :SUCCESS: Completed run of command: ./patchmgr -dbnodes /root/dom0 -precheck -yum_repo http://uln-yum.emilianofusaglia.net/yum/EngineeredSystems/exadata/dbserver/dom0/19.3.4.0.0/base/x86_64 -target_version 19.3.4.0.0.200130
2020-02-06 14:09:45 +0100 :INFO : Precheck attempted on nodes in file /root/dom0: [efucndb01-a efucndb02-a efucndb03-a efucndb04-a efucndb05-a efucndb06-a efucndb07-a efucndb08-a]
2020-02-06 14:09:45 +0100 :INFO : Current image version on dbnode(s) is:
2020-02-06 14:09:45 +0100 :INFO : efucndb01-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb02-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb03-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb04-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb05-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb06-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb07-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : efucndb08-a: 19.2.6.0.0.190911.1
2020-02-06 14:09:45 +0100 :INFO : For details, check the following files in /EXAVMIMAGES/Patch/patchmgr_DBSERVER/dbserver_patch_19.200120:
2020-02-06 14:09:45 +0100 :INFO : - _dbnodeupdate.log
2020-02-06 14:09:45 +0100 :INFO : - patchmgr.log
2020-02-06 14:09:45 +0100 :INFO : - patchmgr.trc
2020-02-06 14:09:45 +0100 :INFO : Exit status:0
2020-02-06 14:09:45 +0100 :INFO : Exiting.
[root@efucndb01-a dbserver_patch_19.200120]#

dom0 upgrade

[root@efuconsole dbserver_patch_19.200120]# ./patchmgr -dbnodes ~/dom0 -upgrade -yum_repo http://uln-yum.emilianofusaglia.net/yum/EngineeredSystems/exadata/dbserver/dom0/19.3.4.0.0/base/x86_64 -target_version 19.3.4.0.0.200130

NOTE patchmgr release: 19.200120 (always check MOS 1553103.1 for the latest release of dbserver.patch.zip)
NOTE
NOTE Database nodes will reboot during the update process.
NOTE
WARNING Do not interrupt the patchmgr session.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot database nodes during update or rollback.
WARNING Do not open logfiles in write mode and do not try to alter them.

2020-02-06 14:29:11 +0100 :Working: Verify SSH equivalence for the root user to node(s)
2020-02-06 14:29:13 +0100 :SUCCESS: Verify SSH equivalence for the root user to node(s)
2020-02-06 14:29:15 +0100 :Working: Initiate prepare steps on node(s).
2020-02-06 14:29:17 +0100 :Working: Check free space on node(s)
2020-02-06 14:29:23 +0100 :SUCCESS: Check free space on node(s)
2020-02-06 14:29:59 +0100 :SUCCESS: Initiate prepare steps on node(s).
2020-02-06 14:29:59 +0100 :Working: Initiate update on 8 node(s).
2020-02-06 14:29:59 +0100 :Working: dbnodeupdate.sh running a backup on 8 node(s).
2020-02-06 14:36:16 +0100 :SUCCESS: dbnodeupdate.sh running a backup on 8 node(s).
2020-02-06 14:36:16 +0100 :Working: Initiate update on node(s)
2020-02-06 14:36:16 +0100 :Working: Get information about any required OS upgrades from node(s).
2020-02-06 14:36:28 +0100 :SUCCESS: Get information about any required OS upgrades from node(s).
2020-02-06 14:36:28 +0100 :Working: dbnodeupdate.sh running an update step on all nodes.
2020-02-06 14:56:38 +0100 :INFO : efucndb01-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb02-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb03-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb04-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb05-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb06-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb07-a is ready to reboot.
2020-02-06 14:56:38 +0100 :INFO : efucndb08-a is ready to reboot.
2020-02-06 14:56:39 +0100 :SUCCESS: dbnodeupdate.sh running an update step on all nodes.
2020-02-06 14:56:51 +0100 :Working: Initiate reboot on node(s)
2020-02-06 14:56:55 +0100 :SUCCESS: Initiate reboot on node(s)
2020-02-06 14:56:55 +0100 :Working: Waiting to ensure node(s) is down before reboot.
2020-02-06 14:58:20 +0100 :SUCCESS: Waiting to ensure node(s) is down before reboot.
2020-02-06 14:58:20 +0100 :Working: Waiting to ensure node(s) is up after reboot.
2020-02-06 15:04:23 +0100 :SUCCESS: Waiting to ensure node(s) is up after reboot.
2020-02-06 15:04:23 +0100 :Working: Waiting to connect to node(s) with SSH. During Linux upgrades this can take some time.
2020-02-06 15:27:46 +0100 :SUCCESS: Waiting to connect to node(s) with SSH. During Linux upgrades this can take some time.
2020-02-06 15:27:46 +0100 :Working: Wait for node(s) is ready for the completion step of update.
2020-02-06 15:31:29 +0100 :SUCCESS: Wait for node(s) is ready for the completion step of update.
2020-02-06 15:31:30 +0100 :Working: Initiate completion step from dbnodeupdate.sh on node(s)
2020-02-06 15:48:10 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb01-a
2020-02-06 15:48:14 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb02-a
2020-02-06 15:48:19 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb03-a
2020-02-06 15:48:30 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb04-a
2020-02-06 15:48:35 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb05-a
2020-02-06 15:48:46 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb06-a
2020-02-06 15:48:50 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb07-a
2020-02-06 15:49:02 +0100 :SUCCESS: Initiate completion step from dbnodeupdate.sh on efucndb08-a
2020-02-06 15:49:18 +0100 :SUCCESS: Initiate update on node(s).
2020-02-06 15:49:18 +0100 :SUCCESS: Initiate update on 0 node(s).
[INFO ] Collected dbnodeupdate diag in file: Diag_patchmgr_dbnode_upgrade_060220142909.tbz
-rw-r--r-- 1 root root 6381043 Feb 6 15:49 Diag_patchmgr_dbnode_upgrade_060220142909.tbz
2020-02-06 15:49:22 +0100 :SUCCESS: Completed run of command: ./patchmgr -dbnodes /root/dom0 -upgrade -yum_repo http://uln-yum.emilianofusaglia.net/yum/EngineeredSystems/exadata/dbserver/dom0/19.3.4.0.0/base/x86_64 -target_version 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : Upgrade attempted on nodes in file /root/dom0: [efucndb01-a efucndb02-a efucndb03-a efucndb04-a efucndb05-a efucndb06-a efucndb07-a efucndb08-a]
2020-02-06 15:49:22 +0100 :INFO : Current image version on dbnode(s) is:
2020-02-06 15:49:22 +0100 :INFO : efucndb01-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb02-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb03-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb04-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb05-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb06-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb07-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : efucndb08-a: 19.3.4.0.0.200130
2020-02-06 15:49:22 +0100 :INFO : For details, check the following files in /EXAVMIMAGES/Patch/patchmgr_DBSERVER/dbserver_patch_19.200120:
2020-02-06 15:49:22 +0100 :INFO : - _dbnodeupdate.log
2020-02-06 15:49:22 +0100 :INFO : - patchmgr.log
2020-02-06 15:49:22 +0100 :INFO : - patchmgr.trc
2020-02-06 15:49:22 +0100 :INFO : Exit status:0
2020-02-06 15:49:22 +0100 :INFO : Exiting.

InfiniBand Switch

IB switch checks

[root@efucndb01-a patch_switch_19.3.4.0.0.200130]# ./patchmgr -ibswitches ~/ibs -upgrade -ibswitch_precheck
2020-02-10 07:57:44 +0100 :Working: Verify SSH equivalence for the root user to node(s)
2020-02-10 07:57:46 +0100 :SUCCESS: Verify SSH equivalence for the root user to node(s)
2020-02-10 07:57:47 +0100 1 of 1 :Working: Initiate pre-upgrade validation check on InfiniBand switch(es).
----- InfiniBand switch update process started 2020-02-10 07:57:48 +0100 -----
[NOTE ] Log file at /EXAVMIMAGES/Patch/IBs_19.3.4.0.0/patch_switch_19.3.4.0.0.200130/upgradeIBSwitch.log
[INFO ] List of InfiniBand switches for upgrade: ( efucnsw-iba01-a efucnsw-ibb01-a )
[SUCCESS ] Verifying Network connectivity to efucnsw-iba01-a
[SUCCESS ] Verifying Network connectivity to efucnsw-ibb01-a
[SUCCESS ] Validating verify-topology output
[INFO ] Master Subnet Manager is set to "efucnsw-iba01-a" in all Switches
[INFO ] ---------- Starting with InfiniBand Switch efucnsw-iba01-a
[WARNING ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.13-2 with the current package.
To downgrade to other versions:
Manually download the InfiniBand switch firmware package to the patch directory
Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
Run patchmgr command to initiate downgrade.
[SUCCESS ] Verify SSH access to the patchmgr host efucndb01-a.emilianofusaglia.net from the InfiniBand Switch efucnsw-iba01-a.
[INFO ] Starting pre-update validation on efucnsw-iba01-a
[SUCCESS ] Verifying that /tmp has 150M in efucnsw-iba01-a, found 492M
[SUCCESS ] Verifying that / has 20M in efucnsw-iba01-a, found 28M
[SUCCESS ] NTP daemon is running on efucnsw-iba01-a.
[INFO ] Manually validate the following entries Date:(YYYY-aM-DD) 2020-02-10 Time:(HH:MM:SS) 07:58:05
[INFO ] Validating the current firmware on the InfiniBand Switch
[SUCCESS ] Firmware verification on InfiniBand switch efucnsw-iba01-a
[SUCCESS ] Verifying that the patchmgr host efucndb01-a.emilianofusaglia.net is recognized on the InfiniBand Switch efucnsw-iba01-a through getHostByName
[SUCCESS ] Execute plugin check for Patch Check Prereq on efucnsw-iba01-a
[INFO ] Finished pre-update validation on efucnsw-iba01-a
[SUCCESS ] Pre-update validation on efucnsw-iba01-a
[SUCCESS ] Prereq check on efucnsw-iba01-a
[INFO ] ---------- Starting with InfiniBand Switch efucnsw-ibb01-a
[WARNING ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.13-2 with the current package.
To downgrade to other versions:
Manually download the InfiniBand switch firmware package to the patch directory
Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
Run patchmgr command to initiate downgrade.
[SUCCESS ] Verify SSH access to the patchmgr host efucndb01-a.emilianofusaglia.net from the InfiniBand Switch efucnsw-ibb01-a.
[INFO ] Starting pre-update validation on efucnsw-ibb01-a
[SUCCESS ] Verifying that /tmp has 150M in efucnsw-ibb01-a, found 492M
[SUCCESS ] Verifying that / has 20M in efucnsw-ibb01-a, found 28M
[SUCCESS ] NTP daemon is running on efucnsw-ibb01-a.
[INFO ] Manually validate the following entries Date:(YYYY-aM-DD) 2020-02-10 Time:(HH:MM:SS) 07:58:25
[INFO ] Validating the current firmware on the InfiniBand Switch
[SUCCESS ] Firmware verification on InfiniBand switch efucnsw-ibb01-a
[SUCCESS ] Verifying that the patchmgr host efucndb01-a.emilianofusaglia.net is recognized on the InfiniBand Switch efucnsw-ibb01-a through getHostByName
[SUCCESS ] Execute plugin check for Patch Check Prereq on efucnsw-ibb01-a
[INFO ] Finished pre-update validation on efucnsw-ibb01-a
[SUCCESS ] Pre-update validation on efucnsw-ibb01-a
[SUCCESS ] Prereq check on efucnsw-ibb01-a
[SUCCESS ] Overall status
----- InfiniBand switch update process ended 2020-02-10 07:58:42 +0100 -----
2020-02-10 07:58:42 +0100 1 of 1 :SUCCESS: Initiate pre-upgrade validation check on InfiniBand switch(es).
2020-02-10 07:58:42 +0100 :SUCCESS: Completed run of command: ./patchmgr -ibswitches /root/ibs -upgrade -ibswitch_precheck
2020-02-10 07:58:42 +0100 :INFO : upgrade attempted on nodes in file /root/ibs: [efucnsw-iba01-a efucnsw-ibb01-a]
2020-02-10 07:58:42 +0100 :INFO : For details, check the following files in /EXAVMIMAGES/Patch/IBs_19.3.4.0.0/patch_switch_19.3.4.0.0.200130:
2020-02-10 07:58:42 +0100 :INFO : - upgradeIBSwitch.log
2020-02-10 07:58:42 +0100 :INFO : - upgradeIBSwitch.trc
2020-02-10 07:58:42 +0100 :INFO : - patchmgr.stdout
2020-02-10 07:58:42 +0100 :INFO : - patchmgr.stderr
2020-02-10 07:58:42 +0100 :INFO : - patchmgr.log
2020-02-10 07:58:42 +0100 :INFO : - patchmgr.trc
2020-02-10 07:58:42 +0100 :INFO : Exit status:0
2020-02-10 07:58:42 +0100 :INFO : Exiting.
[root@efucndb01-a patch_switch_19.3.4.0.0.200130]#

IB switch upgrade

[root@efucndb01-a patch_switch_19.3.4.0.0.200130]# ./patchmgr -ibswitches ~/ibs -upgrade
2020-02-10 07:59:22 +0100 :Working: Verify SSH equivalence for the root user to node(s)
2020-02-10 07:59:24 +0100 :SUCCESS: Verify SSH equivalence for the root user to node(s)
2020-02-10 07:59:25 +0100 1 of 1 :Working: Initiate upgrade of InfiniBand switches to 2.2.14-1. Expect up to 40 minutes for each switch
----- InfiniBand switch update process started 2020-02-10 07:59:25 +0100 -----
[NOTE ] Log file at /EXAVMIMAGES/Patch/IBs_19.3.4.0.0/patch_switch_19.3.4.0.0.200130/upgradeIBSwitch.log
[INFO ] List of InfiniBand switches for upgrade: ( efucnsw-iba01-a efucnsw-ibb01-a )
[SUCCESS ] Verifying Network connectivity to efucnsw-iba01-a
[SUCCESS ] Verifying Network connectivity to efucnsw-ibb01-a
[SUCCESS ] Validating verify-topology output
[INFO ] Proceeding with upgrade of InfiniBand switches to version 2.2.14_1
[INFO ] Master Subnet Manager is set to "efucnsw-iba01-a" in all Switches
[INFO ] ---------- Starting with InfiniBand Switch efucnsw-iba01-a
[WARNING ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.13-2 with the current package.
To downgrade to other versions:
Manually download the InfiniBand switch firmware package to the patch directory
Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
Run patchmgr command to initiate downgrade.
[SUCCESS ] Verify SSH access to the patchmgr host efucndb01-a.emilianofusaglia.net from the InfiniBand Switch efucnsw-iba01-a.
[INFO ] Starting pre-update validation on efucnsw-iba01-a
[SUCCESS ] Verifying that /tmp has 150M in efucnsw-iba01-a, found 492M
[SUCCESS ] Verifying that / has 20M in efucnsw-iba01-a, found 26M
[SUCCESS ] Service opensmd is running on InfiniBand Switch efucnsw-iba01-a
[SUCCESS ] NTP daemon is running on efucnsw-iba01-a.
[INFO ] Manually validate the following entries Date:(YYYY-aM-DD) 2020-02-10 Time:(HH:MM:SS) 07:59:41
[INFO ] Validating the current firmware on the InfiniBand Switch
[SUCCESS ] Firmware verification on InfiniBand switch efucnsw-iba01-a
[SUCCESS ] Verifying that the patchmgr host efucndb01-a.emilianofusaglia.net is recognized on the InfiniBand Switch efucnsw-iba01-a through getHostByName
[SUCCESS ] Execute plugin check for Patch Check Prereq on efucnsw-iba01-a
[INFO ] Finished pre-update validation on efucnsw-iba01-a
[SUCCESS ] Pre-update validation on efucnsw-iba01-a
[INFO ] Package will be downloaded at firmware update time via scp
[SUCCESS ] Execute plugin check for Patching on efucnsw-iba01-a
[INFO ] Starting upgrade on efucnsw-iba01-a to 2.2.14_1. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO ] Rebooting efucnsw-iba01-a to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
Connection to efucndb01-a closed by remote host.
Connection to efucndb01-a closed.
2020-02-10 08:27:49 +0100 :Working: Verify SSH equivalence for the root user to node(s)
2020-02-10 08:27:51 +0100 :SUCCESS: Verify SSH equivalence for the root user to node(s)
2020-02-10 08:27:52 +0100 1 of 1 :Working: Initiate upgrade of InfiniBand switches to 2.2.14-1. Expect up to 40 minutes for each switch
----- InfiniBand switch update process started 2020-02-10 08:27:52 +0100 -----
[NOTE ] Log file at /EXAVMIMAGES/Patch/IBs_19.3.4.0.0/patch_switch_19.3.4.0.0.200130/upgradeIBSwitch.log
[INFO ] List of InfiniBand switches for upgrade: ( efucnsw-iba01-a efucnsw-ibb01-a )
[SUCCESS ] Verifying Network connectivity to efucnsw-iba01-a
[SUCCESS ] Verifying Network connectivity to efucnsw-ibb01-a
[INFO ] InfiniBand switch efucnsw-iba01-a is already at target version.
[SUCCESS ] Validating verify-topology output
[INFO ] Proceeding with upgrade of InfiniBand switches to version 2.2.14_1
[INFO ] Master Subnet Manager is set to "efucnsw-ibb01-a" in all Switches
[INFO ] ---------- Starting with InfiniBand Switch efucnsw-ibb01-a
[WARNING ] Infiniband switch meets minimal version requirements, but downgrade is only available to 2.2.13-2 with the current package.
To downgrade to other versions:
Manually download the InfiniBand switch firmware package to the patch directory
Set export variable "EXADATA_IMAGE_IBSWITCH_DOWNGRADE_VERSION" to the appropriate version
Run patchmgr command to initiate downgrade.
[SUCCESS ] Verify SSH access to the patchmgr host efucndb01-a.emilianofusaglia.net from the InfiniBand Switch efucnsw-ibb01-a.
[INFO ] Starting pre-update validation on efucnsw-ibb01-a
[SUCCESS ] Verifying that /tmp has 150M in efucnsw-ibb01-a, found 492M
[SUCCESS ] Verifying that / has 20M in efucnsw-ibb01-a, found 26M
[SUCCESS ] Service opensmd is running on InfiniBand Switch efucnsw-ibb01-a
[SUCCESS ] NTP daemon is running on efucnsw-ibb01-a.
[INFO ] Manually validate the following entries Date:(YYYY-aM-DD) 2020-02-10 Time:(HH:MM:SS) 08:28:07
[INFO ] Validating the current firmware on the InfiniBand Switch
[SUCCESS ] Firmware verification on InfiniBand switch efucnsw-ibb01-a
[SUCCESS ] Verifying that the patchmgr host efucndb01-a.emilianofusaglia.net is recognized on the InfiniBand Switch efucnsw-ibb01-a through getHostByName
[SUCCESS ] Execute plugin check for Patch Check Prereq on efucnsw-ibb01-a
[INFO ] Finished pre-update validation on efucnsw-ibb01-a
[SUCCESS ] Pre-update validation on efucnsw-ibb01-a
[INFO ] Package will be downloaded at firmware update time via scp
[SUCCESS ] Execute plugin check for Patching on efucnsw-ibb01-a
[INFO ] Starting upgrade on efucnsw-ibb01-a to 2.2.14_1. Please give upto 15 mins for the process to complete. DO NOT INTERRUPT or HIT CTRL+C during the upgrade
[INFO ] Rebooting efucnsw-ibb01-a to complete the firmware update. Wait for 15 minutes before continuing. DO NOT MANUALLY REBOOT THE INFINIBAND SWITCH
Connection to efucndb01-a closed by remote host.
Connection to efucndb01-a closed.

Exadata X8M introduces BIG Architectural Changes

Launched at the 2019 Oracle Open Word Conference the Exadata X8M introduces few big architectural changes which make the leading Oracle Database Machine even more attractive.

Among the most relevant changes there are:

  • InfiniBand Network replacement with an Ethernet network fabric
  • Intel Optane DC Persistent Memory inside the Storage Cell
  • New Remote Direct Memory Access (RDMA) functionalities
  • Replacement of the XEN Hypervisor with KVM

InfiniBand Network replacement with an Ethernet network fabric

The characteristic 40Gbit/second InfiniBand network used for all private network communications among database nodes and storage cells has been replaced by a new 100Gbit/second RDMA over Converged Ethernet Fabric (RoCE) based on the Cisco Switch 9336c RoCE .

The new network not only increase 2.5x the throughput but also reduces the communication latancy.

The schema below highlight the network architecture change for all private communications

Intel Optane DC Persistent Memory inside the Storage Cell

Oracle has introduced 1.5TB of Intel Optane DC Persistent Memory as additional storage device inside all Exadata X8M Storage Cell, (no matter if equipped with HC and EF devices), and it is used as accelerator in front of the Flash Memory Cards. In term of speed this new type of ultra fast storage device is located between the DRAM and the Flash Memory, bringing to three the number of multi-tiered storage devices present inside the Storage Cell.

The Exadata unique software is than capable to extract the maximum performance from this HW configuration, automatically detecting and placing the hottest data on the Persistent Memory, reducing the I/O latency of the most critical tasks.

Below is described the list of Storage Cell’s devices ordered by speed.

New Remote Direct Memory Access (RDMA) functionalities

Until now the RDMA was used among database nodes for exchanging Exafusion messages or for Smart Fusion Block Transfer. Starting with Exadata X8M, the RDMA technology is also used to perform direct I/O access to the Persistent Memory of the Storage Cells, bypassing the network and I/O stack and eliminating expensive CPU interrupts and context switches. This optimization reduces the latency by 10x, from 200μs to less than 19μs.

The picture below highlight the “Database Node to Database Node” and the new “Database Node to Storage Cell” communication using RDMA functionalities.

Replacement of the XEN Hypervisor with KVM

Oracle virtualization technology is called Oracle Virtual Machine (OVM) and in a productive invironment it can be implemented with one of this two different products:

  • Xen
  • KVM

Starting with Exadata X8M-2 the virtualization technology in use is KVM instead of Xen. Oracle started replacing Xen with KVM few years ago, for example on the smaller engineered system ODA X7-2M & X7-2S, but for the Exadata took longer, and I think the root cause was the InfiniBand network. Infact KVM is not fully integrated with InfiniBand, and it does not support bridging.

Exadata Deployment with Elastic Configuration

Recently, for one of my customers, I had the chance to install a couples of Exadata X7-2 using the new Elastic Configuration. The major benefits of using Elastic Configuration consists in the possibility to acquire the Exadata Machine with almost any possible combination of Database Nodes and Storage Cells.

In the past we were used to standard Oracle pre-defined Exadata Machine configurations: Eighth Rack, Quarter Rack, Half Rack and Full Rack, which is still possible, but not flexible enough.

The pictures below highlight the differences between the two configurations:

Edadats_Classiv_vs_Elastic

source: Oracle Data Sheet Exadata Database Machine X7-2

 

Deployment Exadata Elactic Configuration

The elastic configuration process automates the initial IP address allocations to databasenodes and storage cells, regardless the ordered configuration.  The Exadata Machine is connected to the InfiniBand switches using a standard cabling methodology which allows to determinate the node’s location in the rack. This information is therefore used when the nodes are powered up for the first time in order to assign the initial default IPs.

[root@exatest-iba0 ~]# ibhosts
Ca : 0x579b0123796ba0 ports 2 "node10 elasticNode 192.168.10.17,192.168.10.18 ETH0"
Ca : 0x579b01237966e0 ports 2 "node8 elasticNode 192.168.10.15,192.168.10.16 ETH0"
Ca : 0x579b0123844ab0 ports 2 "node6 elasticNode 192.168.10.11,192.168.10.12 ETH0"
Ca : 0x579b0123845e50 ports 2 "node5 elasticNode 192.168.10.7,192.168.10.8 ETH0"
Ca : 0x579b0123845fe0 ports 2 "node4 elasticNode 192.168.10.40,172.16.2.40 ETH0"
Ca : 0x579b0123845ea0 ports 2 "node3 elasticNode 192.168.10.9,192.168.10.10 ETH0"
Ca : 0x579b0123812b90 ports 2 "node2 elasticNode 192.168.10.1,192.168.10.2 ETH0"
Ca : 0x579b0123812970 ports 2 "node1 elasticNode 192.168.10.3,192.168.10.4 ETH0"
[root@exatest-iba0 ~]#

 

 

Because the Virtualization option was required,  it has to be activated at this stage:

[root@node8 ~]# /opt/oracle.SupportTools/switch_to_ovm.sh
2019-03-07 01:05:22 -0800 [INFO] Switch to DOM0 system partition /dev/VGExaDb/LVDbSys3 (/dev/mapper/VGExaDb-LVDbSys3)
2019-03-07 01:05:22 -0800 [INFO] Active system device: /dev/mapper/VGExaDb-LVDbSys1
2019-03-07 01:05:22 -0800 [INFO] Active system device in boot area: /dev/mapper/VGExaDb-LVDbSys1
2019-03-07 01:05:23 -0800 [INFO] Set active system device to /dev/VGExaDb/LVDbSys3 in /boot/I_am_hd_boot
2019-03-07 01:05:23 -0800 [INFO] Creating /.elasticConfig on DOM0 boot partition /boot
2019-03-07 01:05:34 -0800 [INFO] Reboot has been initiated to switch to the DOM0 system partition
Connection to 192.168.1.8 closed by remote host.
Connection to 192.168.1.8 closed.
✘

 

After the switch to OVM command it is time to reclaim the space initially used by the Linux bare metal Logical Volumes:

[root@node8 ~]# /opt/oracle.SupportTools/reclaimdisks.sh -free -reclaim
Model is ORACLE SERVER X7-2
Number of LSI controllers: 1
Physical disks found: 4 (252:0 252:1 252:2 252:3)
Logical drives found: 1
Linux logical drive: 0
RAID Level for the Linux logical drive: 5
Physical disks in the Linux logical drive: 4 (252:0 252:1 252:2 252:3)
Dedicated Hot Spares for the Linux logical drive: 0
Global Hot Spares: 0
[INFO ] Check for DOM0 with inactive Linux system disk
[INFO ] Valid DOM0 with inactive Linux system disk is detected
[INFO ] Number of partitions on the system device /dev/sda: 3
[INFO ] Higher partition number on the system device /dev/sda: 3
[INFO ] Last sector on the system device /dev/sda: 3509760000
[INFO ] End sector of the last partition on the system device /dev/sda: 3509759966
[INFO ] Remove inactive system logical volume /dev/VGExaDb/LVDbSys1
[INFO ] Remove logical volume /dev/VGExaDb/LVDbOra1
[INFO ] Extend logical volume /dev/VGExaDb/LVDbExaVMImages
[INFO ] Resize ocfs2 on logical volume /dev/VGExaDb/LVDbExaVMImages
[INFO ] XEN boot version and rpm versions are in sync
[INFO ] XEN EFI files will not be updated
[INFO ] Force setup grub
[root@node8 ~]#

 

Check the success of the reclaim disks procedure:

[root@node8 ~]# /opt/oracle.SupportTools/reclaimdisks.sh -check
Model is ORACLE SERVER X7-2
Number of LSI controllers: 1
Physical disks found: 4 (252:0 252:1 252:2 252:3)
Logical drives found: 1
Linux logical drive: 0
RAID Level for the Linux logical drive: 5
Physical disks in the Linux logical drive: 4 (252:0 252:1 252:2 252:3)
Dedicated Hot Spares for the Linux logical drive: 0
Global Hot Spares: 0
Valid. Disks configuration: RAID5 from 4 disks with no global and dedicated hot spare disks.
Valid. Booted: DOM0. Layout: DOM0.
[root@node8 ~]#

 

Upload the Oracle Exadata Database Machine Deployment Assistant configuration files to the database server, together with all software images, and run the One command procedure.

List of all Steps

[root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -l
Initializing

1. Validate Configuration File
2. Update Nodes for Eighth Rack
3. Create Virtual Machine
4. Create Users
5. Setup Cell Connectivity
6. Calibrate Cells
7. Create Cell Disks
8. Create Grid Disks
9. Install Cluster Software
10. Initialize Cluster Software
11. Install Database Software
12. Relink Database with RDS
13. Create ASM Diskgroups
14. Create Databases
15. Apply Security Fixes
16. Install Exachk
17. Create Installation Summary
18. Resecure Machine
[root@exatestdbadm01 linux-x64]#

 

Run Step One to validate the setup

This example includes the creation of three different Clusters.

[root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -s 1
Initializing
Executing Validate Configuration File
Validating cluster: Cluster-EFU
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating cluster: Cluster-PR1
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating cluster: Cluster-VAL
Locating machines...
Verifying operating systems...
Validating cluster networks...
Validating network connectivity...
Validating private ips on virtual cluster
Validating NTP setup...
Validating physical disks on storage cells...
Validating users...
Validating platinum...
Validating switches...
Checking disk reclaim status...
Checking Disk Tests Status....
Completed validation...

SUCCESS: Ip address: 10.x8.xx.40 is configured correctly
SUCCESS: Ip address: 10.x9.xx.55 is configured correctly
SUCCESS: Ip address: 10.x8.xx.41 is configured correctly
SUCCESS: Ip address: 10.x9.xx.56 is configured correctly
SUCCESS: Ip address: 10.x8.xx.45 is configured correctly
SUCCESS: Ip address: 10.x8.xx.46 is configured correctly
SUCCESS: Ip address: 10.x8.xx.44 is configured correctly
SUCCESS: Ip address: 10.x8.xx.43 is configured correctly
SUCCESS: Ip address: 10.x8.xx.42 is configured correctly
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.40 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.55 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.41 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.56 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.45 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.46 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.44 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.43 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.42 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Ip address: 10.x8.xx.47 is configured correctly
SUCCESS: Ip address: 10.x9.xx.57 is configured correctly
SUCCESS: Ip address: 10.x8.xx.48 is configured correctly
SUCCESS: Ip address: 10.x9.xx.58 is configured correctly
SUCCESS: Ip address: 10.x8.xx.52 is configured correctly
SUCCESS: Ip address: 10.x8.xx.51 is configured correctly
SUCCESS: Ip address: 10.x8.xx.53 is configured correctly
SUCCESS: Ip address: 10.x8.xx.50 is configured correctly
SUCCESS: Ip address: 10.x8.xx.49 is configured correctly
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.47 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.57 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.48 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.58 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.52 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.51 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.53 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.50 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.49 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Ip address: 10.x8.xx.54 is configured correctly
SUCCESS: Ip address: 10.x9.xx.59 is configured correctly
SUCCESS: Ip address: 10.x8.xx.55 is configured correctly
SUCCESS: Ip address: 10.x9.xx.60 is configured correctly
SUCCESS: Ip address: 10.x8.xx.58 is configured correctly
SUCCESS: Ip address: 10.x8.xx.60 is configured correctly
SUCCESS: Ip address: 10.x8.xx.59 is configured correctly
SUCCESS: Ip address: 10.x8.xx.57 is configured correctly
SUCCESS: Ip address: 10.x8.xx.56 is configured correctly
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm01.my.domain.com
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm02.my.domain.com
SUCCESS: 10.x8.xx.54 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.59 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.55 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x9.xx.60 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.58 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.60 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.59 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.57 configured correctly on exatestceladm03.my.domain.com
SUCCESS: 10.x8.xx.56 configured correctly on exatestceladm03.my.domain.com
SUCCESS: Validated NTP server 10.x3.xx.xx0
SUCCESS: Validated NTP server 10.x3.xx.xx1
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28514222_122118_Linux-x86-64.zip exists...
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28762988_12201181016GIOCT2018RU_Linux-x86-64.zip exists...
SUCCESS: Required file /EXAVMIMAGES/onecommand/linux-x64/WorkDir/p28762989_12201181016DBOCT2018RU_Linux-x86-64.zip exists...
SUCCESS: Required file config/exachk.zip exists...
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm03.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm02.my.domain.com, machine type: storage
SUCCESS: Found Operating system LinuxPhysical and configuration file expects LinuxPhysical on machine exatestceladm01.my.domain.com, machine type: storage
SUCCESS: Expected machine exatestdbadm02.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: Expected machine exatestdbadm01.my.domain.com to have OS Type of Linux Dom0, and found OsType LinuxDom0
SUCCESS: NTP servers on machine exatestceladm03.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm02.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestceladm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm01.my.domain.com verified successfully
SUCCESS: NTP servers on machine exatestdbadm02.my.domain.com verified successfully
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm02.my.domain.com
SUCCESS: Sufficient memory for all the guests on database node exatestdbadm01.my.domain.com
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm03.my.domain.com
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm02.my.domain.com
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm03.my.domain.com
SUCCESS:
SUCCESS:
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm02.my.domain.com
SUCCESS:
SUCCESS: Switch IP 10.x9.xx.51 resolves successfully to host exatest-iba0.my.domain.com on node exatestceladm01.my.domain.com
SUCCESS: Switch IP 10.x9.xx.52 resolves successfully to host exatest-ibb0.my.domain.com on node exatestceladm01.my.domain.com
SUCCESS:
SUCCESS: X7 compute node exatestdbadm01.my.domain.com has updated Broadcom firmware
SUCCESS: X7 compute node exatestdbadm02.my.domain.com has updated Broadcom firmware
SUCCESS: Disk Tests are not running/active on any of the Storage Servers.
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm01
SUCCESS: Cluster Version 12.2.0.1.181016 is compatible with OL7 on exatestdbadm02
SUCCESS: Disk size 10000GB on cell exatestceladm01.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm02.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm03.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm04.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm05.my.domain.com matches the value specified in the OEDA configuration file
SUCCESS: Disk size 10000GB on cell exatestceladm06.my.domain.com matches the value specified in the OEDA configuration file
Successfully completed execution of step Validate Configuration File [elapsed Time [Elapsed = 250301 mS [4.0 minutes] Thu Mar 07 12:35:31 CET 2019]]
[root@exatestdbadm01 linux-x64]#

 

 

Execution of all remaining steps

Than, because we felt confident, we decide to invoke all remaining steps together:

root@exatestdbadm01 linux-x64]# ./install.sh -cf TVD-exatest.xml -r 1-18
...
..

 

The final result is the Exadata Machine installed with six Oracle VMs, and three Grid Infrastructure clusters each one running a test RAC database.

 

 

Oracle VM Server 3.4.5 – Kernel Memory Leak

 

Oracle VM Server instability on version 3.4.5 due to a Oracle Unbreakable Enterprise Kernel (UEK)  bug.

The kernel version 4.1.12-124.14.5.el6uek.x86_64  has introduced a memory leak of the network module i40e

 

Here is reported the backtraces collected after the problem from the /var/log/messages:

Dec 12 07:06:30 efuovs02 kernel: [1508192.885203] ntpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885208] ntpd cpuset=/ mems_allowed=0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885217] CPU: 3 PID: 4751 Comm: ntpd Not tainted 4.1.12-124.14.5.el6uek.x86_64 #2
Dec 12 07:06:30 efuovs02 kernel: [1508192.885221] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 02/14/2018
Dec 12 07:06:30 efuovs02 kernel: [1508192.885224]  0000000000000000 ffff8804484cf678 ffffffff816e4bdb ffff88044d44aa00
Dec 12 07:06:30 efuovs02 kernel: [1508192.885230]  0000000000000000 ffff8804484cf708 ffffffff816e32d1 01ff8804484cf688
Dec 12 07:06:30 efuovs02 kernel: [1508192.885235]  ffff8804484cf718 ffff8804484cf6c8 ffffffff811fc561 ffff8804484cf800
Dec 12 07:06:30 efuovs02 kernel: [1508192.885241] Call Trace:
Dec 12 07:06:30 efuovs02 kernel: [1508192.885251]  [<ffffffff816e4bdb>] dump_stack+0x63/0x81
Dec 12 07:06:30 efuovs02 kernel: [1508192.885256]  [<ffffffff816e32d1>] dump_header+0x7f/0x1f3
Dec 12 07:06:30 efuovs02 kernel: [1508192.885264]  [<ffffffff811fc561>] ? vmpressure+0x21/0x90
Dec 12 07:06:30 efuovs02 kernel: [1508192.885272]  [<ffffffff8118e53c>] oom_kill_process+0x1cc/0x3c0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885283]  [<ffffffff8108de0e>] ? has_capability_noaudit+0x1e/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885288]  [<ffffffff8118eaab>] __out_of_memory+0x31b/0x530
Dec 12 07:06:31 efuovs02 kernel: [1508192.885294]  [<ffffffff8118ee5b>] out_of_memory+0x5b/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885300]  [<ffffffff81194d42>] __alloc_pages_nodemask+0x952/0xab0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885307]  [<ffffffff811de28d>] alloc_pages_vma+0xbd/0x260
Dec 12 07:06:31 efuovs02 kernel: [1508192.885311]  [<ffffffff8118a59e>] ? find_get_entry+0x1e/0xc0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885317]  [<ffffffff811ce6bd>] read_swap_cache_async+0xed/0x170
Dec 12 07:06:31 efuovs02 kernel: [1508192.885322]  [<ffffffff811ce82d>] swapin_readahead+0xed/0x190
Dec 12 07:06:31 efuovs02 kernel: [1508192.885328]  [<ffffffff811bbfe0>] handle_mm_fault+0x12d0/0x1770
Dec 12 07:06:31 efuovs02 kernel: [1508192.885335]  [<ffffffff8121d910>] ? poll_select_copy_remaining+0x130/0x130
Dec 12 07:06:31 efuovs02 kernel: [1508192.885340]  [<ffffffff8106d57f>] __do_page_fault+0x1af/0x480
Dec 12 07:06:31 efuovs02 kernel: [1508192.885346]  [<ffffffff816f2c1c>] ? page_fault+0xcc/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885350]  [<ffffffff8106d87f>] do_page_fault+0x2f/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885354]  [<ffffffff816f2be4>] ? page_fault+0x94/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885359]  [<ffffffff816f2bdd>] ? page_fault+0x8d/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885363]  [<ffffffff816f2bd6>] ? page_fault+0x86/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885367]  [<ffffffff816f2c5f>] page_fault+0x10f/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885375]  [<ffffffff813316c5>] ? copy_user_enhanced_fast_string+0x5/0x10
Dec 12 07:06:31 efuovs02 kernel: [1508192.885379]  [<ffffffff8121d7d1>] ? set_fd_set+0x21/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885384]  [<ffffffff8121e5aa>] core_sys_select+0x1fa/0x2f0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885392]  [<ffffffff810f8fc3>] ? ntp_notify_cmos_timer+0x23/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885396]  [<ffffffff810f8a1d>] ? do_adjtimex+0xed/0x100
Dec 12 07:06:31 efuovs02 kernel: [1508192.885402]  [<ffffffff810ed3ac>] ? SYSC_adjtimex+0x4c/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885410]  [<ffffffff810209e9>] ? read_tsc+0x9/0x10
Dec 12 07:06:31 efuovs02 kernel: [1508192.885414]  [<ffffffff810f68cb>] ? ktime_get_ts64+0x4b/0x110
Dec 12 07:06:31 efuovs02 kernel: [1508192.885419]  [<ffffffff8121e74b>] SyS_select+0xab/0x100
Dec 12 07:06:31 efuovs02 kernel: [1508192.885424]  [<ffffffff816ed451>] ? system_call_after_swapgs+0xdb/0x18c
Dec 12 07:06:31 efuovs02 kernel: [1508192.885428]  [<ffffffff816ed51a>] system_call_fastpath+0x18/0xd4
Dec 12 07:06:31 efuovs02 kernel: [1508192.885457] Mem-Info:
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469] active_anon:1452 inactive_anon:1426 isolated_anon:65
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  active_file:4559 inactive_file:873 isolated_file:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  unevictable:1547 dirty:20 writeback:31 unstable:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  slab_reclaimable:6776 slab_unreclaimable:8649
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  mapped:3007 shmem:0 pagetables:1705 bounce:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  free:33536 free_pcp:918 free_cma:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885483] Node 0 DMA free:15740kB min:60kB low:72kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885499] lowmem_reserve[]: 0 2661 15921 15921
Dec 12 07:06:31 efuovs02 kernel: [1508192.885508] Node 0 DMA32 free:64064kB min:11076kB low:13844kB high:16612kB active_anon:5912kB inactive_anon:5876kB active_file:112kB inactive_file:52kB unevictable:836kB isolated(anon):256kB isolated(file):0kB present:2781336kB managed:2751088kB mlocked:836kB dirty:0kB writeback:0kB mapped:668kB shmem:0kB slab_reclaimable:4792kB slab_unreclaimable:6452kB kernel_stack:912kB pagetables:1692kB unstable:0kB bounce:0kB free_pcp:1156kB local_pcp:248kB free_cma:0kB writeback_tmp:0kB pages_scanned:619296 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885524] lowmem_reserve[]: 0 0 13260 13260
Dec 12 07:06:31 efuovs02 kernel: [1508192.885532] Node 0 Normal free:54340kB min:54392kB low:67988kB high:81584kB active_anon:0kB inactive_anon:0kB active_file:18124kB inactive_file:3440kB unevictable:5352kB isolated(anon):4kB isolated(file):0kB present:13979888kB managed:13534768kB mlocked:5352kB dirty:80kB writeback:124kB mapped:11360kB shmem:0kB slab_reclaimable:22312kB slab_unreclaimable:28144kB kernel_stack:2880kB pagetables:5128kB unstable:0kB bounce:0kB free_pcp:2516kB local_pcp:572kB free_cma:0kB writeback_tmp:0kB pages_scanned:129384 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885546] lowmem_reserve[]: 0 0 0 0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885551] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15740kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885573] Node 0 DMA32: 143*4kB (UE) 111*8kB (UEM) 209*16kB (UE) 140*32kB (UE) 94*64kB (UEM) 57*128kB (UEM) 28*256kB (UEM) 7*512kB (UEM) 4*1024kB (EM) 9*2048kB (MR) 2*4096kB (MR) = 64068kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885596] Node 0 Normal: 8736*4kB (UEM) 1360*8kB (UEM) 208*16kB (UEM) 32*32kB (UE) 2*64kB (UE) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 54400kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885615] 8002 total pagecache pages
Dec 12 07:06:31 efuovs02 kernel: [1508192.885618] 1494 pages in swap cache
Dec 12 07:06:31 efuovs02 kernel: [1508192.885621] Swap cache stats: add 3717250313, delete 3717248819, find 2895172777/5168362256
Dec 12 07:06:31 efuovs02 kernel: [1508192.885624] Free swap  = 4129656kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885626] Total swap = 4194300kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885628] 4194303 pages RAM
Dec 12 07:06:31 efuovs02 kernel: [1508192.885630] 0 pages HighMem/MovableOnly
Dec 12 07:06:31 efuovs02 kernel: [1508192.885632] 118864 pages reserved
Dec 12 07:06:31 efuovs02 kernel: [1508192.885634] 0 pages cma reserved
Dec 12 07:06:31 efuovs02 kernel: [1508192.885636] 0 pages hwpoisoned
Dec 12 07:06:31 efuovs02 kernel: [1508192.885638] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Dec 12 07:06:31 efuovs02 kernel: [1508192.885650] [  983]     0   983     2677      266      11       3      111         -1000 udevd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885657] [ 3783]     0  3783   125771     1414      48       5        0         -1000 multipathd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885662] [ 4334]     0  4334     6944      399      15       3      108         -1000 auditd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885667] [ 4368]     0  4368    61281      438      23       3      377             0 rsyslogd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885671] [ 4383]     0  4383     2832      352      11       3      164             0 irqbalance
Dec 12 07:06:31 efuovs02 kernel: [1508192.885675] [ 4412]    32  4412     4760      397      16       3       74             0 rpcbind
Dec 12 07:06:31 efuovs02 kernel: [1508192.885680] [ 4436]    29  4436     5853      354      17       3      112             0 rpc.statd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885684] [ 4481]     0  4481     5790        0      15       3       50             0 rpc.idmapd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885689] [ 4522]     0  4522     2106      268      12       5       28             0 fcoemon
Dec 12 07:06:31 efuovs02 kernel: [1508192.885694] [ 4537]    81  4537     5373        0      15       3       62             0 dbus-daemon
Dec 12 07:06:31 efuovs02 kernel: [1508192.885698] [ 4612]     0  4612     1030      310       9       5       41             0 o2hbmonitor
Dec 12 07:06:31 efuovs02 kernel: [1508192.885702] [ 4632]     0  4632    47286      410      51       3      221             0 cupsd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885706] [ 4692]     0  4692     1039      323       9       5       30             0 acpid
Dec 12 07:06:31 efuovs02 kernel: [1508192.885711] [ 4718]     0  4718     1580      211       8       5       27             0 mcelog
Dec 12 07:06:31 efuovs02 kernel: [1508192.885715] [ 4738]     0  4738    16579      304      34       3      185         -1000 sshd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885720] [ 4751]    38  4751     6644      567      18       3      162             0 ntpd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885724] [ 4796]     0  4796     3235      331      15       6      143             0 xenstored
Dec 12 07:06:32 efuovs02 kernel: [1508192.885729] [ 4803]     0  4803    21126      307      21       6       69             0 xenconsoled
Dec 12 07:06:32 efuovs02 kernel: [1508192.885733] [ 4807]     0  4807    53362      393      65       3      525             0 qemu-system-i38
Dec 12 07:06:32 efuovs02 kernel: [1508192.885737] [ 4910]     0  4910    20252      478      45       3      239             0 master
Dec 12 07:06:32 efuovs02 kernel: [1508192.885742] [ 4922]    89  4922    20315      486      46       3      238             0 qmgr
Dec 12 07:06:32 efuovs02 kernel: [1508192.885746] [ 4930]     0  4930    29223      395      16       3      171             0 crond
Dec 12 07:06:32 efuovs02 kernel: [1508192.885750] [ 5036]     0  5036     5291      283      15       3       67             0 atd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885755] [ 5345]     0  5345    38468      249      14       5       30             0 osmdaemon
Dec 12 07:06:32 efuovs02 kernel: [1508192.885759] [ 5366]     0  5366    85597      650      61       7     1514             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885764] [ 5378]     0  5378    24079      467      27       6      113             0 ovmport
Dec 12 07:06:32 efuovs02 kernel: [1508192.885768] [ 5390]     0  5390    60521      410      65       6      920             0 ovmwatch
Dec 12 07:06:32 efuovs02 kernel: [1508192.885772] [ 5405]     0  5405   208969      656      87       7     1558             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885777] [ 5772]     0  5772   177327     1015      89       6     1775             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885782] [ 5789]     0  5789    49154      741      71       7     1366             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885786] [ 5831]     0  5831    82559      555      70       6     1491             0 devmon
Dec 12 07:06:32 efuovs02 kernel: [1508192.885790] [ 5901]     0  5901     1031      292       9       5       18             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885794] [ 5903]     0  5903     1031      292       8       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885798] [ 5905]     0  5905     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885802] [ 5907]     0  5907     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885806] [ 5909]     0  5909     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885812] [26455]     0 26455    11091      458      44       5      108             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885816] [27087]     0 27087    11091      458      45       5      108             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885820] [27845]     0 27845    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885825] [27996]     0 27996    11091      458      44       5      107             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885829] [14189]     0 14189    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885833] [16371]     0 16371    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885838] [14238]     0 14238     2676      256      11       3      129         -1000 udevd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885842] [14374]     0 14374     2676      240      11       3      119         -1000 udevd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885846] [15869]     0 15869    22957      931      62       6     1730             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885851] [16935]     0 16935    28695     2029      16       5       64             0 OSWatcher
Dec 12 07:06:32 efuovs02 kernel: [1508192.885855] [ 5867]    89  5867    20272     1250      45       3      229             0 pickup
Dec 12 07:06:32 efuovs02 kernel: [1508192.885860] [ 8948]     0  8948    27070      682      17       5       71             0 vmsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885864] [ 8951]     0  8951    27070      675      17       5       77             0 mpsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885868] [ 8953]     0  8953     1581      328      11       5       42             0 vmstat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885872] [ 8958]     0  8958    27070      659      17       6       41             0 iosub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885876] [ 8959]     0  8959    25258      441      13       5       46             0 mpstat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885880] [ 8966]     0  8966    25261      420      11       5       19             0 iostat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885884] [ 8971]     0  8971    27070      679      17       5        0             0 xtop
Dec 12 07:06:32 efuovs02 kernel: [1508192.885888] [ 8976]     0  8976    27070      695      17       5        0             0 psmemsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885892] [ 8977]     0  8977     3771      483      20       5        3             0 top
Dec 12 07:06:32 efuovs02 kernel: [1508192.885896] [ 8980]     0  8980    27070      680      17       5        0             0 oswsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885901] [ 8985]     0  8985    28695     1794      15       5      131             0 OSWatcher
Dec 12 07:06:32 efuovs02 kernel: [1508192.885905] [ 8986]     0  8986    27564      523      19       5        8             0 ps
Dec 12 07:06:32 efuovs02 kernel: [1508192.885909] [ 8987]     0  8987    27070       54      12       5        0             0 psmemsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885913] [ 8988]     0  8988    27070       52      11       5        0             0 oswsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885917] Out of memory: Kill process 5772 (python) score 0 or sacrifice child
Dec 12 07:06:32 efuovs02 kernel: [1508192.886216] Killed process 5772 (python) total-vm:709308kB, anon-rss:0kB, file-rss:4060kB

 

 

How to fix the OVS Kernel Memory Leak

Download the following kernel version which includes the memoy leak fix for the i40e module:  link to Oracle RPM repository

kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm
kernel-uek-firmware-4.1.12-124.21.1.el6uek.noarch.rpm


[root@efuovs02 new_Kernel]# rpm -qp --changelog kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm | grep -B 3 28228724
warning: kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
* Tue Oct 30 2018 Brian Maly <brian.maly@oracle.com> [4.1.12-124.20.8.el6uek] 
- scsi: lpfc: devloss timeout race condition caused null pointer reference (James Smart) [Orabug: 27994179] 
- scsi: qla2xxx: Fix race condition between iocb timeout and initialisation (Ben Hutchings) [Orabug: 28013813] 
- i40e: Add programming descriptors to cleaned_count (Alexander Duyck) [Orabug: 28228724] 
- i40e: Fix memory leak related filter programming status (Alexander Duyck) [Orabug: 28228724]

 

 

Install the new OVS Kernel

Using the steps reported below, the new kernel has been installed on all OVS servers of the farm.

[root@efuovs02 new_Kernel]# rpm -ivh kernel*
warning: kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Preparing... ########################################### [100%]
1:kernel-uek-firmware ########################################### [ 50%]
2:kernel-uek ########################################### [100%]
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.1.12-124.21.1.el6uek.x86_64
Found linux image: /boot/vmlinuz-4.1.12-124.14.5.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.14.5.el6uek.x86_64.img
done
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.1.12-124.21.1.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.21.1.el6uek.x86_64.img
Found linux image: /boot/vmlinuz-4.1.12-124.14.5.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.14.5.el6uek.x86_64.img
done
[root@efuovs02 new_Kernel]#

....

[root@efuovs02 new_Kernel]# reboot

[root@efuovs02 ~]# uname -a
Linux efuovs02 4.1.12-124.21.1.el6uek.x86_64 #2 SMP Tue Nov 6 13:31:13 PST 2018 x86_64 x86_64 x86_64 GNU/Linux

 

 

 

 

Installing Oracle Grid Infrastructure 12c R2

It has been an exciting week, Oracle 12c R2 came out and suddenly was time to refresh the RAC test environments. My friend Jacques opted for an upgrade from 12.1.0.2 to 12.2.0.1 (here the link to his blog post),  I started with a fresh installation, because I also upgraded the Operating System to OEL  7.3.

Compared to 12c R1 there are new options on the installation process, but general speaking the wizard is quite similar.

The first breakthrough is about the installation simplified with an image based, no more runIstaller.sh to invoke but …

Unpack the .Zip file directly inside the Grid Infrastructure Home of the first cluster node as described below:

[grid@oel7node06 ~]$ mkdir -p /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ chown grid:oinstall /u01/app/12.2.0.1/grid 
[grid@oel7node06 ~]$ cd /u01/app/12.2.0.1/grid 
[grid@oel7node06 grid]$ unzip -q download_location/grid_home_image.zip

# From an X session invoke the Grid Infrastructure wizard: 
[grid@oel7node06 grid]$ ./gridSetup.sh

 

01

 

 

The second screenshot list the new Cluster typoligies available on 12c R2:

  • Oracle Standalone Cluster
  • Oracle Cluster Domain
    • Oracle Domain Services Cluster
    • Oracle Member Clusters
      • Oracle Member Cluster for Oracle Database
      • Oracle Member Cluster for Applications

 

In my case I’m installing an Oracle Standalone Cluster

02

 

 

03

04

 

05

 

06

 

07

 

08

 

09

 

10

 

11

 

12

 

13

 

14

 

15

 

16

 

17

 

18

19

 

20

 

21

 

22

 

And now time for testing.