Oracle VM Server 3.4.5 – Kernel Memory Leak

 

Oracle VM Server instability on version 3.4.5 due to a Oracle Unbreakable Enterprise Kernel (UEK)  bug.

The kernel version 4.1.12-124.14.5.el6uek.x86_64  has introduced a memory leak of the network module i40e

 

Here is reported the backtraces collected after the problem from the /var/log/messages:

Dec 12 07:06:30 efuovs02 kernel: [1508192.885203] ntpd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885208] ntpd cpuset=/ mems_allowed=0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885217] CPU: 3 PID: 4751 Comm: ntpd Not tainted 4.1.12-124.14.5.el6uek.x86_64 #2
Dec 12 07:06:30 efuovs02 kernel: [1508192.885221] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 02/14/2018
Dec 12 07:06:30 efuovs02 kernel: [1508192.885224]  0000000000000000 ffff8804484cf678 ffffffff816e4bdb ffff88044d44aa00
Dec 12 07:06:30 efuovs02 kernel: [1508192.885230]  0000000000000000 ffff8804484cf708 ffffffff816e32d1 01ff8804484cf688
Dec 12 07:06:30 efuovs02 kernel: [1508192.885235]  ffff8804484cf718 ffff8804484cf6c8 ffffffff811fc561 ffff8804484cf800
Dec 12 07:06:30 efuovs02 kernel: [1508192.885241] Call Trace:
Dec 12 07:06:30 efuovs02 kernel: [1508192.885251]  [<ffffffff816e4bdb>] dump_stack+0x63/0x81
Dec 12 07:06:30 efuovs02 kernel: [1508192.885256]  [<ffffffff816e32d1>] dump_header+0x7f/0x1f3
Dec 12 07:06:30 efuovs02 kernel: [1508192.885264]  [<ffffffff811fc561>] ? vmpressure+0x21/0x90
Dec 12 07:06:30 efuovs02 kernel: [1508192.885272]  [<ffffffff8118e53c>] oom_kill_process+0x1cc/0x3c0
Dec 12 07:06:30 efuovs02 kernel: [1508192.885283]  [<ffffffff8108de0e>] ? has_capability_noaudit+0x1e/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885288]  [<ffffffff8118eaab>] __out_of_memory+0x31b/0x530
Dec 12 07:06:31 efuovs02 kernel: [1508192.885294]  [<ffffffff8118ee5b>] out_of_memory+0x5b/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885300]  [<ffffffff81194d42>] __alloc_pages_nodemask+0x952/0xab0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885307]  [<ffffffff811de28d>] alloc_pages_vma+0xbd/0x260
Dec 12 07:06:31 efuovs02 kernel: [1508192.885311]  [<ffffffff8118a59e>] ? find_get_entry+0x1e/0xc0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885317]  [<ffffffff811ce6bd>] read_swap_cache_async+0xed/0x170
Dec 12 07:06:31 efuovs02 kernel: [1508192.885322]  [<ffffffff811ce82d>] swapin_readahead+0xed/0x190
Dec 12 07:06:31 efuovs02 kernel: [1508192.885328]  [<ffffffff811bbfe0>] handle_mm_fault+0x12d0/0x1770
Dec 12 07:06:31 efuovs02 kernel: [1508192.885335]  [<ffffffff8121d910>] ? poll_select_copy_remaining+0x130/0x130
Dec 12 07:06:31 efuovs02 kernel: [1508192.885340]  [<ffffffff8106d57f>] __do_page_fault+0x1af/0x480
Dec 12 07:06:31 efuovs02 kernel: [1508192.885346]  [<ffffffff816f2c1c>] ? page_fault+0xcc/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885350]  [<ffffffff8106d87f>] do_page_fault+0x2f/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885354]  [<ffffffff816f2be4>] ? page_fault+0x94/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885359]  [<ffffffff816f2bdd>] ? page_fault+0x8d/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885363]  [<ffffffff816f2bd6>] ? page_fault+0x86/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885367]  [<ffffffff816f2c5f>] page_fault+0x10f/0x120
Dec 12 07:06:31 efuovs02 kernel: [1508192.885375]  [<ffffffff813316c5>] ? copy_user_enhanced_fast_string+0x5/0x10
Dec 12 07:06:31 efuovs02 kernel: [1508192.885379]  [<ffffffff8121d7d1>] ? set_fd_set+0x21/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885384]  [<ffffffff8121e5aa>] core_sys_select+0x1fa/0x2f0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885392]  [<ffffffff810f8fc3>] ? ntp_notify_cmos_timer+0x23/0x30
Dec 12 07:06:31 efuovs02 kernel: [1508192.885396]  [<ffffffff810f8a1d>] ? do_adjtimex+0xed/0x100
Dec 12 07:06:31 efuovs02 kernel: [1508192.885402]  [<ffffffff810ed3ac>] ? SYSC_adjtimex+0x4c/0x80
Dec 12 07:06:31 efuovs02 kernel: [1508192.885410]  [<ffffffff810209e9>] ? read_tsc+0x9/0x10
Dec 12 07:06:31 efuovs02 kernel: [1508192.885414]  [<ffffffff810f68cb>] ? ktime_get_ts64+0x4b/0x110
Dec 12 07:06:31 efuovs02 kernel: [1508192.885419]  [<ffffffff8121e74b>] SyS_select+0xab/0x100
Dec 12 07:06:31 efuovs02 kernel: [1508192.885424]  [<ffffffff816ed451>] ? system_call_after_swapgs+0xdb/0x18c
Dec 12 07:06:31 efuovs02 kernel: [1508192.885428]  [<ffffffff816ed51a>] system_call_fastpath+0x18/0xd4
Dec 12 07:06:31 efuovs02 kernel: [1508192.885457] Mem-Info:
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469] active_anon:1452 inactive_anon:1426 isolated_anon:65
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  active_file:4559 inactive_file:873 isolated_file:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  unevictable:1547 dirty:20 writeback:31 unstable:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  slab_reclaimable:6776 slab_unreclaimable:8649
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  mapped:3007 shmem:0 pagetables:1705 bounce:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885469]  free:33536 free_pcp:918 free_cma:0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885483] Node 0 DMA free:15740kB min:60kB low:72kB high:84kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15988kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885499] lowmem_reserve[]: 0 2661 15921 15921
Dec 12 07:06:31 efuovs02 kernel: [1508192.885508] Node 0 DMA32 free:64064kB min:11076kB low:13844kB high:16612kB active_anon:5912kB inactive_anon:5876kB active_file:112kB inactive_file:52kB unevictable:836kB isolated(anon):256kB isolated(file):0kB present:2781336kB managed:2751088kB mlocked:836kB dirty:0kB writeback:0kB mapped:668kB shmem:0kB slab_reclaimable:4792kB slab_unreclaimable:6452kB kernel_stack:912kB pagetables:1692kB unstable:0kB bounce:0kB free_pcp:1156kB local_pcp:248kB free_cma:0kB writeback_tmp:0kB pages_scanned:619296 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885524] lowmem_reserve[]: 0 0 13260 13260
Dec 12 07:06:31 efuovs02 kernel: [1508192.885532] Node 0 Normal free:54340kB min:54392kB low:67988kB high:81584kB active_anon:0kB inactive_anon:0kB active_file:18124kB inactive_file:3440kB unevictable:5352kB isolated(anon):4kB isolated(file):0kB present:13979888kB managed:13534768kB mlocked:5352kB dirty:80kB writeback:124kB mapped:11360kB shmem:0kB slab_reclaimable:22312kB slab_unreclaimable:28144kB kernel_stack:2880kB pagetables:5128kB unstable:0kB bounce:0kB free_pcp:2516kB local_pcp:572kB free_cma:0kB writeback_tmp:0kB pages_scanned:129384 all_unreclaimable? yes
Dec 12 07:06:31 efuovs02 kernel: [1508192.885546] lowmem_reserve[]: 0 0 0 0
Dec 12 07:06:31 efuovs02 kernel: [1508192.885551] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 0*128kB 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15740kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885573] Node 0 DMA32: 143*4kB (UE) 111*8kB (UEM) 209*16kB (UE) 140*32kB (UE) 94*64kB (UEM) 57*128kB (UEM) 28*256kB (UEM) 7*512kB (UEM) 4*1024kB (EM) 9*2048kB (MR) 2*4096kB (MR) = 64068kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885596] Node 0 Normal: 8736*4kB (UEM) 1360*8kB (UEM) 208*16kB (UEM) 32*32kB (UE) 2*64kB (UE) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 54400kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885615] 8002 total pagecache pages
Dec 12 07:06:31 efuovs02 kernel: [1508192.885618] 1494 pages in swap cache
Dec 12 07:06:31 efuovs02 kernel: [1508192.885621] Swap cache stats: add 3717250313, delete 3717248819, find 2895172777/5168362256
Dec 12 07:06:31 efuovs02 kernel: [1508192.885624] Free swap  = 4129656kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885626] Total swap = 4194300kB
Dec 12 07:06:31 efuovs02 kernel: [1508192.885628] 4194303 pages RAM
Dec 12 07:06:31 efuovs02 kernel: [1508192.885630] 0 pages HighMem/MovableOnly
Dec 12 07:06:31 efuovs02 kernel: [1508192.885632] 118864 pages reserved
Dec 12 07:06:31 efuovs02 kernel: [1508192.885634] 0 pages cma reserved
Dec 12 07:06:31 efuovs02 kernel: [1508192.885636] 0 pages hwpoisoned
Dec 12 07:06:31 efuovs02 kernel: [1508192.885638] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Dec 12 07:06:31 efuovs02 kernel: [1508192.885650] [  983]     0   983     2677      266      11       3      111         -1000 udevd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885657] [ 3783]     0  3783   125771     1414      48       5        0         -1000 multipathd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885662] [ 4334]     0  4334     6944      399      15       3      108         -1000 auditd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885667] [ 4368]     0  4368    61281      438      23       3      377             0 rsyslogd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885671] [ 4383]     0  4383     2832      352      11       3      164             0 irqbalance
Dec 12 07:06:31 efuovs02 kernel: [1508192.885675] [ 4412]    32  4412     4760      397      16       3       74             0 rpcbind
Dec 12 07:06:31 efuovs02 kernel: [1508192.885680] [ 4436]    29  4436     5853      354      17       3      112             0 rpc.statd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885684] [ 4481]     0  4481     5790        0      15       3       50             0 rpc.idmapd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885689] [ 4522]     0  4522     2106      268      12       5       28             0 fcoemon
Dec 12 07:06:31 efuovs02 kernel: [1508192.885694] [ 4537]    81  4537     5373        0      15       3       62             0 dbus-daemon
Dec 12 07:06:31 efuovs02 kernel: [1508192.885698] [ 4612]     0  4612     1030      310       9       5       41             0 o2hbmonitor
Dec 12 07:06:31 efuovs02 kernel: [1508192.885702] [ 4632]     0  4632    47286      410      51       3      221             0 cupsd
Dec 12 07:06:31 efuovs02 kernel: [1508192.885706] [ 4692]     0  4692     1039      323       9       5       30             0 acpid
Dec 12 07:06:31 efuovs02 kernel: [1508192.885711] [ 4718]     0  4718     1580      211       8       5       27             0 mcelog
Dec 12 07:06:31 efuovs02 kernel: [1508192.885715] [ 4738]     0  4738    16579      304      34       3      185         -1000 sshd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885720] [ 4751]    38  4751     6644      567      18       3      162             0 ntpd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885724] [ 4796]     0  4796     3235      331      15       6      143             0 xenstored
Dec 12 07:06:32 efuovs02 kernel: [1508192.885729] [ 4803]     0  4803    21126      307      21       6       69             0 xenconsoled
Dec 12 07:06:32 efuovs02 kernel: [1508192.885733] [ 4807]     0  4807    53362      393      65       3      525             0 qemu-system-i38
Dec 12 07:06:32 efuovs02 kernel: [1508192.885737] [ 4910]     0  4910    20252      478      45       3      239             0 master
Dec 12 07:06:32 efuovs02 kernel: [1508192.885742] [ 4922]    89  4922    20315      486      46       3      238             0 qmgr
Dec 12 07:06:32 efuovs02 kernel: [1508192.885746] [ 4930]     0  4930    29223      395      16       3      171             0 crond
Dec 12 07:06:32 efuovs02 kernel: [1508192.885750] [ 5036]     0  5036     5291      283      15       3       67             0 atd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885755] [ 5345]     0  5345    38468      249      14       5       30             0 osmdaemon
Dec 12 07:06:32 efuovs02 kernel: [1508192.885759] [ 5366]     0  5366    85597      650      61       7     1514             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885764] [ 5378]     0  5378    24079      467      27       6      113             0 ovmport
Dec 12 07:06:32 efuovs02 kernel: [1508192.885768] [ 5390]     0  5390    60521      410      65       6      920             0 ovmwatch
Dec 12 07:06:32 efuovs02 kernel: [1508192.885772] [ 5405]     0  5405   208969      656      87       7     1558             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885777] [ 5772]     0  5772   177327     1015      89       6     1775             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885782] [ 5789]     0  5789    49154      741      71       7     1366             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885786] [ 5831]     0  5831    82559      555      70       6     1491             0 devmon
Dec 12 07:06:32 efuovs02 kernel: [1508192.885790] [ 5901]     0  5901     1031      292       9       5       18             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885794] [ 5903]     0  5903     1031      292       8       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885798] [ 5905]     0  5905     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885802] [ 5907]     0  5907     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885806] [ 5909]     0  5909     1031      292       9       5       19             0 mingetty
Dec 12 07:06:32 efuovs02 kernel: [1508192.885812] [26455]     0 26455    11091      458      44       5      108             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885816] [27087]     0 27087    11091      458      45       5      108             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885820] [27845]     0 27845    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885825] [27996]     0 27996    11091      458      44       5      107             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885829] [14189]     0 14189    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885833] [16371]     0 16371    11091      458      44       5      109             0 socat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885838] [14238]     0 14238     2676      256      11       3      129         -1000 udevd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885842] [14374]     0 14374     2676      240      11       3      119         -1000 udevd
Dec 12 07:06:32 efuovs02 kernel: [1508192.885846] [15869]     0 15869    22957      931      62       6     1730             0 python
Dec 12 07:06:32 efuovs02 kernel: [1508192.885851] [16935]     0 16935    28695     2029      16       5       64             0 OSWatcher
Dec 12 07:06:32 efuovs02 kernel: [1508192.885855] [ 5867]    89  5867    20272     1250      45       3      229             0 pickup
Dec 12 07:06:32 efuovs02 kernel: [1508192.885860] [ 8948]     0  8948    27070      682      17       5       71             0 vmsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885864] [ 8951]     0  8951    27070      675      17       5       77             0 mpsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885868] [ 8953]     0  8953     1581      328      11       5       42             0 vmstat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885872] [ 8958]     0  8958    27070      659      17       6       41             0 iosub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885876] [ 8959]     0  8959    25258      441      13       5       46             0 mpstat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885880] [ 8966]     0  8966    25261      420      11       5       19             0 iostat
Dec 12 07:06:32 efuovs02 kernel: [1508192.885884] [ 8971]     0  8971    27070      679      17       5        0             0 xtop
Dec 12 07:06:32 efuovs02 kernel: [1508192.885888] [ 8976]     0  8976    27070      695      17       5        0             0 psmemsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885892] [ 8977]     0  8977     3771      483      20       5        3             0 top
Dec 12 07:06:32 efuovs02 kernel: [1508192.885896] [ 8980]     0  8980    27070      680      17       5        0             0 oswsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885901] [ 8985]     0  8985    28695     1794      15       5      131             0 OSWatcher
Dec 12 07:06:32 efuovs02 kernel: [1508192.885905] [ 8986]     0  8986    27564      523      19       5        8             0 ps
Dec 12 07:06:32 efuovs02 kernel: [1508192.885909] [ 8987]     0  8987    27070       54      12       5        0             0 psmemsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885913] [ 8988]     0  8988    27070       52      11       5        0             0 oswsub
Dec 12 07:06:32 efuovs02 kernel: [1508192.885917] Out of memory: Kill process 5772 (python) score 0 or sacrifice child
Dec 12 07:06:32 efuovs02 kernel: [1508192.886216] Killed process 5772 (python) total-vm:709308kB, anon-rss:0kB, file-rss:4060kB

 

 

How to fix the OVS Kernel Memory Leak

Download the following kernel version which includes the memoy leak fix for the i40e module:  link to Oracle RPM repository

kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm
kernel-uek-firmware-4.1.12-124.21.1.el6uek.noarch.rpm


[root@efuovs02 new_Kernel]# rpm -qp --changelog kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm | grep -B 3 28228724
warning: kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
* Tue Oct 30 2018 Brian Maly <brian.maly@oracle.com> [4.1.12-124.20.8.el6uek] 
- scsi: lpfc: devloss timeout race condition caused null pointer reference (James Smart) [Orabug: 27994179] 
- scsi: qla2xxx: Fix race condition between iocb timeout and initialisation (Ben Hutchings) [Orabug: 28013813] 
- i40e: Add programming descriptors to cleaned_count (Alexander Duyck) [Orabug: 28228724] 
- i40e: Fix memory leak related filter programming status (Alexander Duyck) [Orabug: 28228724]

 

 

Install the new OVS Kernel

Using the steps reported below, the new kernel has been installed on all OVS servers of the farm.

[root@efuovs02 new_Kernel]# rpm -ivh kernel*
warning: kernel-uek-4.1.12-124.21.1.el6uek.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
Preparing... ########################################### [100%]
1:kernel-uek-firmware ########################################### [ 50%]
2:kernel-uek ########################################### [100%]
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.1.12-124.21.1.el6uek.x86_64
Found linux image: /boot/vmlinuz-4.1.12-124.14.5.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.14.5.el6uek.x86_64.img
done
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.1.12-124.21.1.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.21.1.el6uek.x86_64.img
Found linux image: /boot/vmlinuz-4.1.12-124.14.5.el6uek.x86_64
Found initrd image: /boot/initramfs-4.1.12-124.14.5.el6uek.x86_64.img
done
[root@efuovs02 new_Kernel]#

....

[root@efuovs02 new_Kernel]# reboot

[root@efuovs02 ~]# uname -a
Linux efuovs02 4.1.12-124.21.1.el6uek.x86_64 #2 SMP Tue Nov 6 13:31:13 PST 2018 x86_64 x86_64 x86_64 GNU/Linux

 

 

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s