On Mon, May 18, 2026 at 7:07 AM Christian König
<[email protected]> wrote:
>
> On 5/18/26 14:50, Albert Esteve wrote:
> > On Mon, May 18, 2026 at 9:20 AM Christian König
> > <[email protected]> wrote:
> >>
> >> On 5/15/26 19:06, T.J. Mercier wrote:
> >>> On Fri, May 15, 2026 at 6:53 AM Christian Brauner <[email protected]> wrote:
> >>>>
> >>>> On Tue, May 12, 2026 at 11:10:44AM +0200, Albert Esteve wrote:
> >>>>> On embedded platforms a central process often allocates dma-buf
> >>>>> memory on behalf of client applications. Without a way to
> >>>>> attribute the charge to the requesting client's cgroup, the
> >>>>> cost lands on the allocator, making per-cgroup memory limits
> >>>>> ineffective for the actual consumers.
> >>>>>
> >>>>> Add charge_pid_fd to struct dma_heap_allocation_data. When set to
> >>>>
> >>>> Please be aware that pidfds come in two flavors:
> >>>>
> >>>> thread-group pidfds and thread-specific pidfds. Make sure that your API
> >>>> doesn't implicitly depend on this distinction not existing.
> >>>
> >>> Hi Christian,
> >>>
> >>> Memcg is not a controller that supports "thread mode" so all threads
> >>> in a group should belong to the same memcg.
> >>
> >> BTW: Exactly that is the requirement automotive has with their native 
> >> context use case.
> >>
> >> The use case is that you have a daemon which has multiple threads where 
> >> each one is acting on behalf of some other process.
> >>
> >> At the moment we basically say they are simply not using cgroups for that 
> >> use case, but it would be really nice if we could handle that as well.
> >>
> >> Summarizing the requirement of that use case: You need a different cgroup 
> >> for each thread of a process.
> >
> > Hi Christian,
> >
> > Thanks for sharing this automotive use case. If I understand correctly,
> > the actual requirement is attributing dma-buf charges to the right
> > client, not putting each daemon thread in a different cgroup?
>
> Nope, exactly that's the difference.
>
> The thread acts as a filtering agent for both memory allocation and command 
> submission for somebody else. The process on whose behalf the daemon does 
> things can even be in a client VM, completely remote over some network, or 
> even something like a microcontroller.
>
> Everything the thread does regarding CPU time, GPU driver memory allocation 
> as well as resources like GPU processing and I/O time etc. needs to be 
> accounted to one client which can be different for each thread of the process.
>
> The only thing which is shared with the main process thread is CPU memory 
> resources, e.g. malloc() because that is basically just needed for 
> housekeeping and pretty much irrelevant for this kind of use case.
>
> The problem is that you can't do that with cgroups at the moment, but 
> unfortunately only the kernel has the information needed to do this.
>
> So what you end up doing is defining tons of interfaces just to get the 
> necessary information from the kernel into userspace and then essentially 
> duplicating in userspace the same infrastructure cgroup provides in the 
> kernel.
>
> > If so,
> > the `charge_pid_fd` approach achieves this directly by passing the
> > client's `pid_fd`, without needing to add per-thread cgroup
> > infrastructure.
>
> Well, it's already a massive improvement; we could basically stop doing the 
> whole duplication part for the GPU driver stack and just use cgroups for this 
> part.
>
> Doing that automatically for CPU and I/O time would just be nice to have 
> additionally.
>
> Regards,
> Christian.

Hopefully I'm following correctly here... So you are duplicating the
GPU driver stack to achieve remote accounting on a per-thread basis?
Does this mean for GPU allocations you currently have some GFP_ACCOUNT
magic in your driver to attribute GPU memory to the correct remote
client? So this series would close the gap for dma-buf allocations,
but what about private GPU driver memory allocated on behalf of a
client?
