Hello Qiliang,

On 2026-05-20 at 08:07, Qiliang Yuan wrote:
> Introduce the "high" soft limit for the dmem cgroup v2 controller.
> When a cgroup's device memory usage exceeds its high limit, tasks
> belonging to that cgroup are throttled by being forced into a sleep
> before returning to user space, instead of being failed outright
> as with the "max" limit.
> 
> Key changes:
> - Add high counter configuration to dmem_cgroup_pool.
> - Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
> - Inject the dmem throttling handler into resume_user_mode_work.
> - Implement the handler to perform a 100ms interruptible sleep for
>   over-limit tasks.
> 
> This mechanism provides smoother over-subscription support for device
> memory resources.
> 
> Signed-off-by: Qiliang Yuan <[email protected]>
> ---
> This series introduces the "high" soft limit and associated task
> throttling mechanism to the dmem cgroup v2 controller.
> 
> The device memory (VRAM) management currently only supports hard limits
> (max), which leads to immediate allocation failures when reached. This
> can be disruptive for GPU-bound AI workloads. By introducing a soft
> limit, we allow cgroups to exceed their quota temporarily while
> applying backpressure via task throttling before the process returns
> to user space.
> 
> The mechanism is inspired by the memory cgroup's high limit:
> - When usage > high, the task is marked with TIF_NOTIFY_RESUME.
> - Upon returning to user space, it triggers a 100ms sleep.
> - This provides a smoother over-subscription model for GPU resources.
> 
> Qiliang Yuan (1):
> 
> cgroup/dmem: implement dmem.high soft limit and throttling
> ---
> To: Maarten Lankhorst <[email protected]>
> To: Maxime Ripard <[email protected]>
> To: Natalie Vock <[email protected]>
> To: Tejun Heo <[email protected]>
> To: Johannes Weiner <[email protected]>
> To: Michal Koutný <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---

I think the concept of allowing userspace to throttle on high
is interesting.

It's the approach I'm more worried about. I believe it would be
better to penalize clients that exceed their high limit by
preferentially evicting from them.

It would make eviction run in 3 passes on the affected cgroup tree:
- Round 1: Clients above their 'high' limit
- Round 2: Clients above their 'low/min' limits
- Round 3: Clients at or below their 'low' limit

And the same client's cgroup, when below its 'min' limit, as well.
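
As a rough illustration of the ordering above, here is a minimal
userspace sketch in C. It is not the dmem controller's code: struct
pool_state, evict_pass_for() and pick_victim() are hypothetical names,
and 'low' and 'min' are collapsed into a single protection threshold
to keep it short.

```c
#include <stdint.h>

/* Hypothetical per-cgroup view of one pool; not the kernel's struct. */
struct pool_state {
	uint64_t usage;	/* bytes currently charged */
	uint64_t low;	/* protection threshold (stands in for low/min) */
	uint64_t high;	/* proposed soft limit */
};

enum evict_pass {
	EVICT_PASS_OVER_HIGH = 1,	/* round 1: usage above 'high' */
	EVICT_PASS_OVER_LOW = 2,	/* round 2: usage above protection */
	EVICT_PASS_PROTECTED = 3,	/* round 3: at or below protection */
};

/* Classify a cgroup into the eviction round in which it becomes a candidate. */
static enum evict_pass evict_pass_for(const struct pool_state *p)
{
	if (p->usage > p->high)
		return EVICT_PASS_OVER_HIGH;
	if (p->usage > p->low)
		return EVICT_PASS_OVER_LOW;
	return EVICT_PASS_PROTECTED;
}

/*
 * Walk the rounds in order and return the index of the first cgroup
 * that has something to evict, or -1 if nothing is charged anywhere.
 */
static int pick_victim(const struct pool_state *pools, int n)
{
	for (int pass = EVICT_PASS_OVER_HIGH; pass <= EVICT_PASS_PROTECTED; pass++)
		for (int i = 0; i < n; i++)
			if (pools[i].usage > 0 && evict_pass_for(&pools[i]) == pass)
				return i;
	return -1;
}
```

A real implementation would walk the cgroup tree and the eviction LRU
rather than a flat array; the sketch only shows that over-high clients
are considered before anyone else.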

I'm open to other ideas as well. Perhaps a flag that would allow
allocation or binding to an address space to fail if it would need
to evict, or a notification sent to the affected client that they
went over high.

Have you tried any other approaches before this one?

Kind regards,
~Maarten Lankhorst
