The Problem
-----------

The problem is obviously system instability due to memory pressure, but what can an admin do about it?
Some options exist to configure the priority of processes killed due to memory pressure. Non-snap processes can be configured via systemd using OOMScoreAdjust[1] (kernel OOM[2]) and ManagedOOMMemoryPressureLimit[3][4] (userspace OOM[5]). Snap has no support for configuring userspace OOM priority. Snap does have a setting[6] for kernel OOM priority. While that setting may be logical within the snap ecosystem, it has deficiencies when integrating with the rest of Ubuntu:

1) The setting provides no way to set specific kernel OOM scores, which is required to set priority with respect to non-snap processes.
2) The current user interface does not allow a user to __increase__ the likelihood of a cgroup being killed (for example, "please kill my browser before a component of my desktop").
3) The default snap setting makes snap-based processes (and snapd itself) less likely to be killed than the core system services on which snap processes depend - a priority inversion.

Proposed solution
-----------------

Ubuntu ships with both DEBs and snaps, so a solution that allows snaps to configure OOM priority with respect to the rest of the OS would be best.

In the near term, giving snap users the ability to set the values of ManagedOOMMemoryPressureLimit and OOMScoreAdjust for snaps would let them tune their systems for better stability under memory pressure.

In the long term, to provide the best system behavior under memory pressure, it would benefit Ubuntu to ship with more appropriate defaults (a desktop / server personality, and snaps using a more sensible default).

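To judge whether a given OOMScoreAdjust value would actually reorder things, it helps to see the adjustments currently in effect. Below is a minimal Python sketch, assuming a Linux procfs mounted at /proc. The function names and the proc_root parameter are made up for illustration and are not part of snapd or systemd. It only reports oom_score_adj; the kernel's victim selection also weighs each process's memory use.

```python
from pathlib import Path


def read_oom_score_adj(pid, proc_root="/proc"):
    """Return oom_score_adj for pid, or None if it is missing or unreadable."""
    try:
        text = Path(proc_root, str(pid), "oom_score_adj").read_text()
        return int(text.strip())
    except (OSError, ValueError):
        # The process may have exited, or we may lack permission.
        return None


def read_comm(pid, proc_root="/proc"):
    """Return the short command name of pid, or '?' if unavailable."""
    try:
        return Path(proc_root, str(pid), "comm").read_text().strip()
    except OSError:
        return "?"


def rank_by_oom_score_adj(proc_root="/proc"):
    """List (oom_score_adj, pid, comm) tuples, highest adjustment first."""
    rows = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid = int(entry.name)
        adj = read_oom_score_adj(pid, proc_root)
        if adj is not None:
            rows.append((adj, pid, read_comm(pid, proc_root)))
    # Higher adjustment means more likely to be killed; break ties by pid.
    return sorted(rows, key=lambda row: (-row[0], row[1]))


if __name__ == "__main__":
    for adj, pid, comm in rank_by_oom_score_adj():
        print(f"{adj:6d} {pid:8d} {comm}")
```

Run as a script, it prints one line per process with the highest adjustment first, so a snap process sitting below a core service it depends on shows up as the inversion described in point 3.
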
Definitions
-----------

[1] OOMScoreAdjust - a systemd directive which systemd uses to set the kernel OOM score adjustment[7], defaults to 0
[2] Kernel OOM - action taken when the kernel runs out of memory to allocate and memory reclaim hasn't returned enough memory - the kernel kills a process based on a metric derived from the OOM score and the amount of memory used by each process
[3][4] ManagedOOMMemoryPressureLimit - a systemd threshold used by systemd-oomd, it represents the fraction of time in a 10 second window in which all tasks in the control group were delayed - defaults to 60%
[5] systemd-oomd - a system service that uses cgroups-v2 and pressure stall information (PSI[8]) to monitor and take corrective action before an OOM occurs in the kernel space
[6] `snap set system resilience.vitality-hint=snapA,snapB,snapC`
[7] /proc/<pid>/oom_score_adj - a procfs file containing the OOM score adjustment of a process
[8] PSI (Pressure Stall Information) - counters exported to user-space which indicate memory pressure - available since Linux 4.20

References
----------

[1] https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#OOMScoreAdjust=
[2] https://www.kernel.org/doc/gorman/html/understand/understand016.html
[3] https://www.freedesktop.org/software/systemd/man/latest/systemd.resource-control.html#ManagedOOMMemoryPressureLimit=
[4] https://www.freedesktop.org/software/systemd/man/latest/oomd.conf.html#
[5] https://www.freedesktop.org/software/systemd/man/latest/systemd-oomd.service.html
[6] https://snapcraft.io/docs/system-options#heading--resilience
[7] https://man7.org/linux/man-pages/man5/proc_pid_oom_score_adj.5.html

https://bugs.launchpad.net/bugs/2089800
Title: Ubuntu desktop is unstable under memory pressure due to undesireable OOMScoreAdjust values
