You can potentially use numactl to launch the process with a policy that interleaves allocations across NUMA nodes, which avoids these one-sided allocations. This tends to happen with servers that make big allocations from a single thread during startup, as is commonly seen with mysqld and its innodb_buffer_pool, for example.

numactl --interleave all /path/to/server/process --argument-1 #etc

Reference:
https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1655842

Title:
  "Out of memory" errors after upgrade to 4.4.0-59

Status in linux package in Ubuntu:
  Fix Released
Status in linux-aws package in Ubuntu:
  Confirmed
Status in linux-raspi2 package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released
Status in linux-aws source package in Xenial:
  Confirmed
Status in linux-raspi2 source package in Xenial:
  Fix Committed

Bug description:
  After a fix for LP#1647400, a bug that caused freezes under some
  workloads, some users noticed regular OOMs. Those regular OOMs were
  reported under this bug, and fixed after some releases.

  Some of the affected kernels are documented below. In order to check
  your particular kernel, read its changelog and look for 1655842 and
  1647400. If it has the fix for 1647400, but not the fix for 1655842,
  then it's affected.

  It's still possible that you notice regressions compared to kernels
  that didn't have the fixes for any of the bugs. However, reverting all
  fixes would cause the freeze bug to come back. So, it's not a possible
  solution moving forward.

  If you see any regressions, mainly in the form of OOMs, please report
  a new bug. Different workloads may require different solutions, and
  it's possible that further fixes are needed, be they upstream or not.
  The best way to get such fixes applied is to report them under a new
  bug, one that can be verified: being able to reproduce the bug makes
  it possible to verify that the fixes really fix the identified bug.

  Kernels affected:

  linux 4.4.0-58, 4.4.0-59, 4.4.0-60, 4.4.0-61, 4.4.0-62.
  linux-raspi2 4.4.0-1039 to 4.4.0-1042 and 4.4.0-1044 to 4.4.0-1071

  Particular kernels NOT affected by THIS bug:

  linux-aws

  To reiterate, if you find an OOM with an affected kernel, please
  upgrade. If you find an OOM with a non-affected kernel, please report
  a new bug. We want to investigate it and fix it.

  ===================

  I recently replaced some Xenial servers, and started experiencing "Out
  of memory" problems with the default kernel.

  We bake Amazon AMIs based on an official Ubuntu-provided image
  (ami-e6b58e85, in ap-southeast-2, from
  https://cloud-images.ubuntu.com/locator/ec2/). Previous versions of
  our AMI included "4.4.0-57-generic", but the latest version picked up
  "4.4.0-59-generic" as part of a "dist-upgrade".

  Instances booted using the new AMI have been using more memory, and
  experiencing OOM issues - sometimes during boot, and sometimes a while
  afterwards. An example from the system log is:

  [ 130.113411] cloud-init[1560]: Cloud-init v. 0.7.8 running 'modules:final' at Wed, 11 Jan 2017 22:07:53 +0000. Up 29.28 seconds.
  [ 130.124219] cloud-init[1560]: Cloud-init v. 0.7.8 finished at Wed, 11 Jan 2017 22:09:35 +0000. Datasource DataSourceEc2. Up 130.09 seconds
  [29871.137128] Out of memory: Kill process 2920 (ruby) score 107 or sacrifice child
  [29871.140816] Killed process 2920 (ruby) total-vm:675048kB, anon-rss:51184kB, file-rss:2164kB
  [29871.449209] Out of memory: Kill process 3257 (splunkd) score 97 or sacrifice child
  [29871.453282] Killed process 3258 (splunkd) total-vm:66272kB, anon-rss:6676kB, file-rss:0kB
  [29871.677910] Out of memory: Kill process 2647 (fluentd) score 51 or sacrifice child
  [29871.681872] Killed process 2647 (fluentd) total-vm:117944kB, anon-rss:23956kB, file-rss:1356kB

  I have a hunch that this may be related to the fix for
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1647400,
  introduced in linux (4.4.0-58.79).
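
  The changelog check described at the top of this bug description
  (affected means the changelog mentions the fix for 1647400 but not the
  fix for 1655842) could be scripted roughly as below. This is only a
  sketch: check_changelog is a hypothetical helper name, not an Ubuntu
  tool, and the changelog path shown in the usage comment is an
  assumption about where the kernel package installs its changelog.

```shell
#!/bin/sh
# Classify a kernel changelog read from standard input.
# Rule taken from the bug description: a kernel is affected if its
# changelog mentions 1647400 (freeze fix) but not 1655842 (OOM fix).
# check_changelog is a hypothetical helper, not part of any package.
check_changelog() {
    log=$(cat)
    has_freeze_fix=no
    has_oom_fix=no
    if printf '%s\n' "$log" | grep -q '1647400'; then has_freeze_fix=yes; fi
    if printf '%s\n' "$log" | grep -q '1655842'; then has_oom_fix=yes; fi

    if [ "$has_freeze_fix" = yes ] && [ "$has_oom_fix" = no ]; then
        echo "affected"
    else
        echo "not affected"
    fi
}

# Possible usage (the path is an assumption, adjust for your system):
#   zcat "/usr/share/doc/linux-image-$(uname -r)/changelog.Debian.gz" | check_changelog
```

  A result of "not affected" only covers this particular bug; as noted
  above, an OOM on such a kernel should be reported as a new bug.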

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.4.0-59-generic 4.4.0-59.80
  ProcVersionSignature: User Name 4.4.0-59.80-generic 4.4.35
  Uname: Linux 4.4.0-59-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116, 1 Jan 12 06:29 seq
   crw-rw---- 1 root audio 116, 33 Jan 12 06:29 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.1-0ubuntu2.4
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Thu Jan 12 06:38:45 2017
  Ec2AMI: ami-0f93966c
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: ap-southeast-2a
  Ec2InstanceType: t2.nano
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  MachineType: Xen HVM domU
  PciMultimedia:

  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 cirrusdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-59-generic root=UUID=fb0fef08-f3c5-40bf-9776-f7ba00fe72be ro console=tty1 console=ttyS0
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-59-generic N/A
   linux-backports-modules-4.4.0-59-generic N/A
   linux-firmware 1.157.6
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 12/09/2016
  dmi.bios.vendor: Xen
  dmi.bios.version: 4.2.amazon
  dmi.chassis.type: 1
  dmi.chassis.vendor: Xen
  dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd12/09/2016:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
  dmi.product.name: HVM domU
  dmi.product.version: 4.2.amazon
  dmi.sys.vendor: Xen

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1655842/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp