Hello, I use Debian Testing on AMD64, on a workstation with a Ryzen 5800X (8 cores / 16 threads) and 64 GB of ECC DDR4 RAM.
Today, a Windows application that I run under Wine for work decided to eat all available memory, CPU and HDD I/O. I don't have a swap file, so the Linux kernel has to kill something to stay up when a rogue application takes all the RAM. That's where the problem I noticed comes in: the Debian OOM killer killed EVERYTHING else first and the actual offending, memory-hungry application last. Why?! It destroyed my working KDE session and I had to hard reset the PC.

Have a look at the journalctl output from the last boot (I cut the timestamps for easier reading):

kernel: RSP: 002b:00007ffcff9ead98 EFLAGS: 00010246
systemd-journald[411]: Missed 10 kernel messages
kernel: lowmem_reserve[]: 0 3128 64155 64155 64155
kernel: Node 0 DMA32 free:246472kB boost:0kB min:3292kB low:6492kB high:9692kB reserved_highatomic:0KB active_anon:44kB inactive_anon:3057032kB active_file:0kB inactive_file:220kB un>
kernel: lowmem_reserve[]: 0 0 61027 61027 61027
kernel: Node 0 Normal free:245324kB boost:283884kB min:348156kB low:410648kB high:473140kB reserved_highatomic:2048KB active_anon:270712kB inactive_anon:60254116kB active_file:29564k>
kernel: lowmem_reserve[]: 0 0 0 0 0
(...)
Mar 29 08:58:28 ryzen kernel: 539654 total pagecache pages
Mar 29 08:58:28 ryzen kernel: 0 pages in swap cache
Mar 29 08:58:28 ryzen kernel: Swap cache stats: add 0, delete 0, find 0/0
Mar 29 08:58:28 ryzen kernel: Free swap  = 0kB
Mar 29 08:58:28 ryzen kernel: Total swap = 0kB
Mar 29 08:58:28 ryzen kernel: 16753821 pages RAM
Mar 29 08:58:28 ryzen kernel: 0 pages HighMem/MovableOnly
Mar 29 08:58:28 ryzen kernel: 296896 pages reserved
Mar 29 08:58:28 ryzen kernel: 0 pages hwpoisoned

And here are all the running processes; let me highlight only a few:

kernel: Tasks state (memory values in pages):
kernel: [  pid  ]   uid    tgid  total_vm      rss  pgtables_bytes  swapents  oom_score_adj  name
kernel: [   4611]  1000    4611    137776     1767          253952         0            200  kactivitymanage
kernel: [ 676751]  1000  676751   2356154   115060         2740224         0              0  terminal64.exe
kernel: [ 702184]  1000  702184   1226983   824654         7540736         0              0  metatester64.ex
kernel: [ 731468]  1000  731468   1194211   814761         7442432         0              0  metatester64.ex
kernel: [ 731471]  1000  731471   1245415   835020         7593984         0              0  metatester64.ex
(and it goes on: at least 16 Wine exe processes like that, eating all the RAM)

As you can see, my Wine application has spawned a lot of exes. Each of them has an rss of over 800,000 pages and a pgtables_bytes value of around 7.5 million (I am not sure what the page size in bytes is on my Debian system), and there are several such processes. But instead of killing one or a few of these processes, the OOM killer decided to kill everything *but* the offending exes.

The killing of all processes begins:

kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/background.slice/plasma-kactiv>
kernel: Out of memory: Killed process 4611 (kactivitymanage) total-vm:551104kB, anon-rss:7068kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:248kB oom_score_adj:200

Instead of killing ONE of the processes with roughly 7.5 million in the pgtables_bytes column, Linux kills everything else! The KDE activity manager is killed.
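A note on the units in the table above, as a minimal sketch rather than a definitive reading: the table header says memory values are in pages, while pgtables_bytes is, going by its name, a byte count. The kill line for PID 4611 is consistent with that, since 1767 pages at 4 KiB each gives the reported anon-rss:7068kB, and 253952 bytes gives the reported pgtables:248kB. The 4 KiB page size is an assumption inferred from those numbers (`getconf PAGESIZE` would confirm it on the machine itself), and the function names below are purely illustrative.

```python
# Assumption: 4 KiB pages, inferred from the kill line for PID 4611
# (rss 1767 pages <-> anon-rss:7068kB). Not read from the poster's system.
PAGE_SIZE_BYTES = 4096


def pages_to_kib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count (e.g. the rss column) to KiB."""
    return pages * page_size // 1024


def pages_to_gib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to GiB."""
    return pages * page_size / 2**30


def bytes_to_kib(n_bytes):
    """Convert a byte count (e.g. pgtables_bytes) to KiB."""
    return n_bytes // 1024


def bytes_to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / 2**20


# (name, pid, rss in pages, pgtables_bytes), copied from the table in the post.
TASKS = [
    ("kactivitymanage", 4611, 1767, 253952),
    ("terminal64.exe", 676751, 115060, 2740224),
    ("metatester64.ex", 702184, 824654, 7540736),
    ("metatester64.ex", 731468, 814761, 7442432),
    ("metatester64.ex", 731471, 835020, 7593984),
]

if __name__ == "__main__":
    for name, pid, rss_pages, pgtables in TASKS:
        print(
            f"{name:<16} pid={pid:<7} "
            f"rss={pages_to_gib(rss_pages):6.2f} GiB  "
            f"pgtables={bytes_to_mib(pgtables):5.2f} MiB"
        )
```

Under that assumption, each metatester64.ex process in the excerpt holds a little over 3 GiB of resident memory, and the value of about 7.5 million in pgtables_bytes corresponds to roughly 7 MiB of page tables per process.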
Then it goes on to kill EVERYTHING in the system:

kernel: Out of memory: Killed process 4555 (kglobalaccel5)
kernel: Out of memory: Killed process 444878 (kiod5)
kernel: oom_reaper: reaped process 444878 (kiod5)
kernel: Out of memory: Killed process 4405 (pipewire)
kernel: oom_reaper: reaped process 4405 (pipewire)
kernel: Out of memory: Killed process 505026 (gvfs-udisks2-vo)
(it goes on...)
Out of memory: Killed process 4414 (dbus-daemon)
(...)
Out of memory: Killed process 4390 (systemd)

And behold, at the very end it kills a Wine process:

Out of memory: Killed process 731550 (metatester64.ex) total-vm:4891544kB, anon-rss:3483116kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:7704kB oom_score_adj:0

It even reports total-vm:4891544kB, yet just before that it killed systemd with total-vm:18760kB. At this stage the system has completely crashed and I have to hard reset.

I'd appreciate any explanation of this situation and advice on how to prevent it in the future. Please find the journalctl output as a compressed attachment (16 KB).

I haven't modified Debian in any way that could affect RAM usage or out-of-memory handling, apart from increasing the I/O buffers for better performance (the comments on the changes are my own):

$ cat /etc/sysctl.conf
(...)
vm.dirty_background_ratio=20 # Writing starts after 20% of RAM is filled with data to write.
vm.dirty_ratio=40 # up to 40% of memory can be used as write buffers (more write requests will cause I/O lock until enough data is flushed).
vm.dirty_expire_centisecs=30000 # data is allowed to sit in the buffers for max 5 minutes (max lost work time)
vm.dirty_writeback_centisecs=6000 # how often to check for write data in buffers: 1 minute

I'm not sure whether that causes the OOM killer to kill the entire system instead of the one offending process; I doubt it.

Thanks in advance for your comments, friends!

-- 
With kindest regards, Piotr.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
⠈⠳⣄⠀⠀⠀⠀
journalctl.tar.zst
Description: application/zstd