On Mon, Jun 23, 2025 at 6:54 PM Christian König
<[email protected]> wrote:
>
> On 6/19/25 09:20, Dave Airlie wrote:
> > From: Dave Airlie <[email protected]>
> >
> > While discussing memcg integration with gpu memory allocations,
> > it was pointed out that there were no numa/system counters for
> > GPU memory allocations.
> >
> > With more integrated memory GPU server systems turning up, and
> > more requirements for memory tracking it seems we should start
> > closing the gap.
> >
> > Add two counters to track GPU per-node system memory allocations.
> >
> > The first is memory currently allocated to GPU objects, and the second
> > is for memory that is stored in GPU page pools that can be reclaimed
> > by the shrinker.
> >
> > Cc: Christian Koenig <[email protected]>
> > Cc: Matthew Brost <[email protected]>
> > Cc: Johannes Weiner <[email protected]>
> > Cc: [email protected]
> > Cc: Andrew Morton <[email protected]>
> > Signed-off-by: Dave Airlie <[email protected]>
> >
> > ---
> >
> > v2: add more info to the documentation on this memory.
> >
> > I'd like to get acks to merge this via the drm tree, if possible,
> >
> > Dave.
> > ---
> >  Documentation/filesystems/proc.rst | 8 ++++++++
> >  drivers/base/node.c                | 5 +++++
> >  fs/proc/meminfo.c                  | 6 ++++++
> >  include/linux/mmzone.h             | 2 ++
> >  mm/show_mem.c                      | 9 +++++++--
> >  mm/vmstat.c                        | 2 ++
> >  6 files changed, 30 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 5236cb52e357..7cc5a9185190 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -1095,6 +1095,8 @@ Example output. You may not have all of these fields.
> >      CmaFree:               0 kB
> >      Unaccepted:            0 kB
> >      Balloon:               0 kB
> > +    GPUActive:             0 kB
> > +    GPUReclaim:            0 kB
>
> Active certainly makes sense, but I think we should rather disable the pool
> on newer CPUs than add reclaimable memory here.

I'm not just concerned about newer platforms though. Even on Fedora 42
on my test ryzen1+7900xt machine, with a desktop session running, I see:

nr_gpu_active 7473
nr_gpu_reclaim 6656

It's not an insignificant amount of memory. I also think if we get to
some sort of discardable GTT objects with a shrinker they should
probably be accounted in reclaim.
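
For scale, here is a rough sketch of converting those two counters into
sizes. It assumes the nr_gpu_* values are in pages, like other nr_*
vmstat counters, and that the page size is 4 KiB, which is the usual
x86-64 default. Both are assumptions, not something stated above, and
the helper names are made up for illustration.

```python
# Sketch only: convert vmstat-style page counters to kB and MiB.
# Assumption: counters are page counts and a page is 4 KiB.
PAGE_SIZE_KB = 4


def parse_counters(text, prefix="nr_gpu_"):
    """Return {name: pages} for 'name value' lines starting with prefix."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith(prefix):
            counters[parts[0]] = int(parts[1])
    return counters


def pages_to_kb(pages, page_size_kb=PAGE_SIZE_KB):
    """Convert a page count to kB under the assumed page size."""
    return pages * page_size_kb


# The two values quoted above, in /proc/vmstat "name value" form.
sample = """nr_gpu_active 7473
nr_gpu_reclaim 6656
"""

for name, pages in parse_counters(sample).items():
    kb = pages_to_kb(pages)
    print(f"{name}: {pages} pages = {kb} kB ({kb / 1024:.1f} MiB)")
```

Under those assumptions that is about 29 MiB active and 26 MiB sitting
in the pool.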

Dave.
