[Kernel-packages] [Bug 1734744] Re: OOM killer with kernel 4.4.0-92

2017-11-30 Thread Vladimir Nicolici
** Attachment added: "Second OOM log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1734744/+attachment/5016590/+files/oom-2.log

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1734744

Title:
  OOM killer with kernel 4.4.0-92

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Incomplete

Bug description:
  When putting pressure on a Linux Node in Azure we see OOM killers (similar 
to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842 but this 
is a different kernel)
  Reverting the kernel to 4.4.0-57 fixes the problem.

  dmesg and kernel.log attached

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1734744/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1734744] Re: OOM killer with kernel 4.4.0-92

2017-11-30 Thread Vladimir Nicolici
I have a similar problem on 4.4.0-91. It has happened twice so far, on
Oct 31st and Nov 30th, during the nightly dumps of a postgres database.
Strangely, both incidents fell on the last day of the month, but we
don't do anything special at that time compared to the rest of the
month.

I'll attach the OOM errors from both incidents. They look similar to the
error reported in this bug: gfp_mask=0x26000c0, order=2,
oom_score_adj=0

The server is a dual-CPU machine, so it has 2 NUMA nodes.

We have disabled numa_balancing, zone_reclaim_mode, and transparent huge
pages for performance reasons: they were causing the database to become
unresponsive for a few minutes from time to time.

We are now also considering running the database with numactl
--interleave=all.
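
For anyone comparing configurations, here is a minimal Python sketch for reading back the three settings mentioned above. The procfs/sysfs paths are the usual locations on mainline kernels, but they are an assumption here (they do not appear in the report), and the helper names `active_choice` and `read_settings` are hypothetical.

```python
from pathlib import Path

# Usual locations on mainline kernels; assumed, not taken from the report.
SETTINGS = {
    "numa_balancing": Path("/proc/sys/kernel/numa_balancing"),
    "zone_reclaim_mode": Path("/proc/sys/vm/zone_reclaim_mode"),
    "transparent_hugepage": Path("/sys/kernel/mm/transparent_hugepage/enabled"),
}


def active_choice(raw: str) -> str:
    """Return the bracketed value from a sysfs choice string such as
    'always madvise [never]'; plain values like '0' are returned stripped."""
    for token in raw.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return raw.strip()


def read_settings(paths=SETTINGS) -> dict:
    """Read each setting; an unreadable or missing file is reported as None."""
    result = {}
    for name, path in paths.items():
        try:
            result[name] = active_choice(path.read_text())
        except OSError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, value in read_settings().items():
        print(f"{name}: {value}")
```

With all three features disabled as described, one would expect `0`, `0` and `never`, though the exact values can differ between kernel builds.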

If needed I can try to run apport.

** Attachment added: "First OOM log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1734744/+attachment/5016589/+files/oom.log



[Kernel-packages] [Bug 1734744] Re: OOM killer with kernel 4.4.0-92

2017-12-21 Thread Vladimir Nicolici
As I said in a previous comment, we were experiencing this very
infrequently, about once per month, during nightly dumps of a database.
This week the issue became more frequent and persistent.

So, after hitting the issue 3 nights in a row during the nightly
database dumps, we upgraded from 4.4.0-91 to 4.4.0-104. We'll see what
happens on the next nightly run.

Bug description:
  When putting pressure on a Linux Node in Azure we see OOM killers (similar 
to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842 but this 
is a different kernel)
  Reverting the kernel to 4.4.0-57 fixes the problem.

  dmesg and kernel.log attached
  --- 
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 Dec 11 15:21 seq
   crw-rw 1 root audio 116, 33 Dec 11 15:21 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.20.1-0ubuntu2.13
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  DistroRelease: Ubuntu 16.04
  IwConfig: Error: [Errno 2] No such file or directory
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: Microsoft Corporation Virtual Machine
  Package: linux (not installed)
  PciMultimedia:
   
  ProcFB: 0 hyperv_fb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-92-generic 
root=UUID=512611f4-b05a-4d8a-b743-438b71c5385d ro console=tty1 console=ttyS0 
earlyprintk=ttyS0 rootdelay=300
  ProcVersionSignature: Ubuntu 4.4.0-92.115-generic 4.4.76
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-92-generic N/A
   linux-backports-modules-4.4.0-92-generic  N/A
   linux-firmware                            1.157.13
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  xenial uec-images xenial uec-images
  Uname: Linux 4.4.0-92-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  dmi.bios.date: 06/02/2017
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 090007
  dmi.board.name: Virtual Machine
  dmi.board.vendor: Microsoft Corporation
  dmi.board.version: 7.0
  dmi.chassis.asset.tag: 7783-7084-3265-9085-8269-3286-77
  dmi.chassis.type: 3
  dmi.chassis.vendor: Microsoft Corporation
  dmi.chassis.version: 7.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvr090007:bd06/02/2017:svnMicrosoftCorporation:pnVirtualMachine:pvr7.0:rvnMicrosoftCorporation:rnVirtualMachine:rvr7.0:cvnMicrosoftCorporation:ct3:cvr7.0:
  dmi.product.name: Virtual Machine
  dmi.product.uuid: D88904A9-DF88-454F-9F50-7A8DE8964445
  dmi.product.version: 7.0
  dmi.sys.vendor: Microsoft Corporation



[Kernel-packages] [Bug 1734744] Re: OOM killer with kernel 4.4.0-92

2018-01-03 Thread Vladimir Nicolici
Unfortunately the out-of-memory errors persisted with 4.4.0-104, so 6
days ago we switched to 4.8.0-58.

No more OOM errors so far, but performance seems to have suffered a
bit: there appear to be some freezes during the nightly jobs.

Seeing that the similar bug,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842 was
introduced while fixing "LP#1647400, a bug that caused freezes under
some workloads", it seems plausible that, since the two issues are
related, fixing one breaks the other and the other way around.

In any case, I can live with freezes on this particular machine more
easily than with OOM errors, so for the time being I'll stay on
4.8.0-58. I may experiment again with newer 4.4.0 kernels at the end of
the month.



[Kernel-packages] [Bug 1655842] Re: "Out of memory" errors after upgrade to 4.4.0-59

2017-11-01 Thread Vladimir Nicolici
Not sure if it's the same issue, but we had an unexpected OOM with
Ubuntu 16.04.3 LTS, 4.4.0-91.

Oct 31 23:52:25 db3 kernel: [6569272.882023] psql invoked oom-killer:
gfp_mask=0x26000c0, order=2, oom_score_adj=0
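
For context, order=2 means a request for 2^2 = 4 physically contiguous pages, i.e. 16 kB with 4 kB pages. The per-zone free-list lines further down in this log show how many free blocks of each size exist. The following is a minimal Python sketch for parsing such a line. The sample string is the "Node 0 Normal" line from this log, joined onto one line. Reading the letters in parentheses as migrate types, with H meaning the high-atomic reserve, is an assumption about 4.4 kernels, and the function names are hypothetical.

```python
import re

PAGE_KB = 4  # assumes 4 kB base pages (x86_64)

# One "count*sizekB (types)" entry; "(types)" is absent when the count is 0.
ENTRY = re.compile(r"(\d+)\*(\d+)kB(?:\s+\(([A-Z]+)\))?")

SAMPLE = (
    "Node 0 Normal: 8587*4kB (UM) 8*8kB (MH) 15*16kB (H) 8*32kB (H) "
    "3*64kB (H) 1*128kB (H) 2*256kB (H) 1*512kB (H) "
    "0*1024kB 0*2048kB 0*4096kB = 36252kB"
)


def parse_free_list(line: str):
    """Return (order, count, size_kb, types) tuples from a free-list line."""
    entries = []
    for count, size_kb, types in ENTRY.findall(line):
        # size = PAGE_KB * 2**order, so order is log2(size / PAGE_KB)
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        entries.append((order, int(count), int(size_kb), types))
    return entries


def free_kb_at_or_above(entries, min_order, exclude_types=frozenset()):
    """Sum free kB in blocks of at least min_order, skipping blocks whose
    types all fall inside exclude_types."""
    total = 0
    for order, count, size_kb, types in entries:
        if order < min_order:
            continue
        if types and set(types) <= set(exclude_types):
            continue
        total += count * size_kb
    return total


if __name__ == "__main__":
    entries = parse_free_list(SAMPLE)
    print("total free kB:", sum(c * s for _, c, s, _ in entries))
    print("free kB at order >= 2:", free_kb_at_or_above(entries, 2))
    print("same, ignoring H-only blocks:",
          free_kb_at_or_above(entries, 2, frozenset("H")))
```

For the sample line this prints 36252, 1840 and 0. If the assumption about H holds, every free block of 16 kB or larger in that zone sits in the reserve, which may be why an order=2 request can fail while plenty of 4 kB pages remain free.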

...

Oct 31 23:52:25 db3 kernel: [6569272.882154] Mem-Info:
Oct 31 23:52:25 db3 kernel: [6569272.882165] active_anon:38011018 
inactive_anon:1422084 isolated_anon:0
Oct 31 23:52:25 db3 kernel: [6569272.882165]  active_file:11699125 
inactive_file:11727535 isolated_file:0
Oct 31 23:52:25 db3 kernel: [6569272.882165]  unevictable:0 dirty:88019 
writeback:2902991 unstable:23308
Oct 31 23:52:25 db3 kernel: [6569272.882165]  slab_reclaimable:1455159 
slab_unreclaimable:533985
Oct 31 23:52:25 db3 kernel: [6569272.882165]  mapped:38499394 shmem:38495946 
pagetables:33687177 bounce:0
Oct 31 23:52:25 db3 kernel: [6569272.882165]  free:212612 free_pcp:0 free_cma:0
Oct 31 23:52:25 db3 kernel: [6569272.882172] Node 0 DMA free:13256kB min:0kB 
low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB 
shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Oct 31 23:52:25 db3 kernel: [6569272.882182] lowmem_reserve[]: 0 1882 193368 
193368 193368
Oct 31 23:52:25 db3 kernel: [6569272.882188] Node 0 DMA32 free:768204kB 
min:316kB low:392kB high:472kB active_anon:8kB inactive_anon:32kB 
active_file:20kB inactive_file:48kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:2045556kB managed:1964868kB mlocked:0kB dirty:0kB 
writeback:44kB mapped:16kB shmem:12kB slab_reclaimable:729192kB 
slab_unreclaimable:35928kB kernel_stack:1920kB pagetables:415552kB unstable:0kB 
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882196] lowmem_reserve[]: 0 0 191486 
191486 191486
Oct 31 23:52:25 db3 kernel: [6569272.882201] Node 0 Normal free:34260kB 
min:32432kB low:40540kB high:48648kB active_anon:58162056kB 
inactive_anon:2546400kB active_file:18254204kB inactive_file:18282192kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:199229440kB 
managed:196081724kB mlocked:0kB dirty:152124kB writeback:4685924kB 
mapped:58223800kB shmem:58229824kB slab_reclaimable:2362116kB 
slab_unreclaimable:1123984kB kernel_stack:11056kB pagetables:94580096kB 
unstable:22108kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882210] lowmem_reserve[]: 0 0 0 0 0
Oct 31 23:52:25 db3 kernel: [6569272.882215] Node 1 Normal free:34728kB 
min:32780kB low:40972kB high:49168kB active_anon:93882008kB 
inactive_anon:3141904kB active_file:28542276kB inactive_file:28627900kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:201326592kB 
managed:198178644kB mlocked:0kB dirty:199952kB writeback:6925996kB 
mapped:95773760kB shmem:95753948kB slab_reclaimable:2729328kB 
slab_unreclaimable:976028kB kernel_stack:6608kB pagetables:39753060kB 
unstable:71124kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct 31 23:52:25 db3 kernel: [6569272.882226] lowmem_reserve[]: 0 0 0 0 0
Oct 31 23:52:25 db3 kernel: [6569272.882230] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 
0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 2*2048kB (UM) 
2*4096kB (M) = 13256kB
Oct 31 23:52:25 db3 kernel: [6569272.882248] Node 0 DMA32: 121*4kB (UME) 95*8kB 
(UME) 5337*16kB (UME) 4229*32kB (UME) 2523*64kB (UME) 624*128kB (UME) 237*256kB 
(UME) 83*512kB (UM) 51*1024kB (UM) 73*2048kB (UM) 0*4096kB = 768204kB
Oct 31 23:52:25 db3 kernel: [6569272.882268] Node 0 Normal: 8587*4kB (UM) 8*8kB 
(MH) 15*16kB (H) 8*32kB (H) 3*64kB (H) 1*128kB (H) 2*256kB (H) 1*512kB (H) 
0*1024kB 0*2048kB 0*4096kB = 36252kB
Oct 31 23:52:25 db3 kernel: [6569272.882284] Node 1 Normal: 9063*4kB (UM) 0*8kB 
7*16kB (H) 7*32kB (H) 5*64kB (H) 2*128kB (H) 0*256kB 1*512kB (H) 0*1024kB 
0*2048kB 0*4096kB = 37676kB
Oct 31 23:52:25 db3 kernel: [6569272.882303] Node 0 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Oct 31 23:52:25 db3 kernel: [6569272.882306] Node 0 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 31 23:52:25 db3 kernel: [6569272.882308] Node 1 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Oct 31 23:52:25 db3 kernel: [6569272.882311] Node 1 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Oct 31 23:52:25 db3 kernel: [6569272.882313] 61926313 total pagecache pages
Oct 31 23:52:25 db3 kernel: [6569272.882315] 3557 pages in swap cache
Oct 31 23:52:25 db3 kernel: [6569272.882318] Swap cache stats: add 1695