Hello Qiliang,

On 2026-05-20 at 08:07, Qiliang Yuan wrote:
> Introduce the "high" soft limit for the dmem cgroup v2 controller.
> When a cgroup's device memory usage exceeds its high limit, tasks
> belonging to that cgroup are throttled by being forced into a sleep
> before returning to user space, instead of being failed outright
> as with the "max" limit.
>
> Key changes:
> - Add high counter configuration to dmem_cgroup_pool.
> - Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
> - Inject the dmem throttling handler into resume_user_mode_work.
> - Implement the handler to perform a 100ms interruptible sleep for
>   over-limit tasks.
>
> This mechanism provides smoother over-subscription support for device
> memory resources.
>
> Signed-off-by: Qiliang Yuan <[email protected]>
> ---
> This series introduces the "high" soft limit and associated task
> throttling mechanism to the dmem cgroup v2 controller.
>
> The device memory (VRAM) management currently only supports hard limits
> (max), which leads to immediate allocation failures when reached. This
> can be disruptive for GPU-bound AI workloads. By introducing a soft
> limit, we allow cgroups to exceed their quota temporarily while
> applying backpressure via task throttling before the process returns
> to user space.
>
> The mechanism is inspired by the memory cgroup's high limit:
> - When usage > high, the task is marked with TIF_NOTIFY_RESUME.
> - Upon returning to user space, it triggers a 100ms sleep.
> - This provides a smoother over-subscription model for GPU resources.
>
> Qiliang Yuan (1):
>
> cgroup/dmem: implement dmem.high soft limit and throttling
> ---
> To: Maarten Lankhorst <[email protected]>
> To: Maxime Ripard <[email protected]>
> To: Natalie Vock <[email protected]>
> To: Tejun Heo <[email protected]>
> To: Johannes Weiner <[email protected]>
> To: Michal Koutný <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---

I think the concept of allowing userspace to throttle on high is
interesting. It's the approach I'm more worried about.

I believe it would be better to punish clients exceeding their high
limit by preferentially evicting those. It would make eviction run in
3 passes on the affected cgroup tree:

- Round 1: Clients above their 'high' limit
- Round 2: Clients above their 'low/min' limits
- Round 3: Clients at or below their 'low' limit

And the same client's cgroup, below 'min' limit as well.

I'm open to other ideas as well. Perhaps a flag that would allow
allocation or binding to an address space to fail if it would need to
evict, or a notification sent to the affected client that they went
over high.

Have you tried any other approaches before this one?

Kind regards,
~Maarten Lankhorst
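
For illustration only, the three-round eviction ordering proposed above could be modeled in plain userspace C roughly as follows. This is a sketch under stated assumptions, not kernel code and not part of the patch: `struct client`, `eviction_round()` and `pick_victim()` are hypothetical names, the 'low/min' protections are collapsed into a single `low` threshold, `low <= high` is assumed, and the special case of the allocating client's own cgroup is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-client view of device memory usage and limits (bytes). */
struct client {
    uint64_t usage;
    uint64_t low;   /* protection threshold; stands in for 'low/min' */
    uint64_t high;  /* soft limit */
};

/* Round in which a client first becomes eligible for eviction: 1, 2 or 3. */
int eviction_round(const struct client *c)
{
    if (c->usage > c->high)
        return 1;   /* above 'high': evicted preferentially */
    if (c->usage > c->low)
        return 2;   /* above its protection, but not over 'high' */
    return 3;       /* at or below 'low': last resort */
}

/*
 * Walk the rounds in order and return the index of the first client
 * eligible in the earliest round, or -1 if there are no clients.
 */
int pick_victim(const struct client *clients, size_t n)
{
    for (int round = 1; round <= 3; round++)
        for (size_t i = 0; i < n; i++)
            if (eviction_round(&clients[i]) == round)
                return (int)i;
    return -1;
}
```

With this model, a client over its `high` limit is always chosen before one that is merely above `low`, which in turn is chosen before one still within its protection.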
