[Public]

Hi Christian,

Good Morning!

Thank you for taking the time to review, and for your valuable feedback.

A few clarifications below:

> -----Original Message-----
> From: Koenig, Christian <[email protected]>
> Sent: Monday, March 2, 2026 6:31 PM
> To: SHANMUGAM, SRINIVASAN <[email protected]>;
> Deucher, Alexander <[email protected]>
> Cc: [email protected]; Kasiviswanathan, Harish
> <[email protected]>; Kuehling, Felix <[email protected]>
> Subject: Re: [PATCH v3 1/4] drm/amdgpu: Add render-node EVENTFD manager
> core
>
> On 3/2/26 04:02, Srinivasan Shanmugam wrote:
> > Introduce a per-drm_file eventfd manager to support render-node event
> > subscriptions.
> >
> > The manager is implemented in amdgpu_eventfd.[ch] and is owned by the
> > drm_file (amdgpu_fpriv). It maps event_id -> eventfd_id object, where
> > each eventfd_id can have multiple eventfds bound (fan-out).
> >
> > The design is IRQ-safe for signaling: IRQ path takes the xarray lock
> > (irqsave) and signals eventfds while still holding the lock.
> >
> > This patch only adds the core manager and wires its lifetime into
> > open/postclose.
> >
> > Cc: Harish Kasiviswanathan <[email protected]>
> > Cc: Felix Kuehling <[email protected]>
> > Cc: Alex Deucher <[email protected]>
> > Suggested-by: Christian König <[email protected]>
> > Signed-off-by: Srinivasan Shanmugam <[email protected]>
> > Change-Id: Iac025dcb7c1b4f9ed74ac4190085e1925e2c8e68
> > ---
> >  drivers/gpu/drm/amd/amdgpu/Makefile         |   3 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h         |   5 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.c | 321
> > ++++++++++++++++++++  drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.h |
> 76 +++++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c     |  28 +-
> >  5 files changed, 430 insertions(+), 3 deletions(-)  create mode
> > 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.c
> >  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.h
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
> > b/drivers/gpu/drm/amd/amdgpu/Makefile
> > index 006d49d6b4af..30b1cf3c6cdf 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> > @@ -67,7 +67,8 @@ amdgpu-y += amdgpu_device.o amdgpu_reg_access.o
> amdgpu_doorbell_mgr.o amdgpu_kms
> >     amdgpu_fw_attestation.o amdgpu_securedisplay.o \
> >     amdgpu_eeprom.o amdgpu_mca.o amdgpu_psp_ta.o amdgpu_lsdma.o \
> >     amdgpu_ring_mux.o amdgpu_xcp.o amdgpu_seq64.o amdgpu_aca.o
> amdgpu_dev_coredump.o \
> > -   amdgpu_cper.o amdgpu_userq_fence.o amdgpu_eviction_fence.o
> amdgpu_ip.o
> > +   amdgpu_cper.o amdgpu_userq_fence.o amdgpu_eviction_fence.o
> amdgpu_ip.o \
> > +   amdgpu_eventfd.o
> >
> >  amdgpu-$(CONFIG_PROC_FS) += amdgpu_fdinfo.o
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > index 1e71a03c8bba..9e650b3707e3 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> > @@ -44,6 +44,7 @@
> >  #include <linux/hashtable.h>
> >  #include <linux/dma-fence.h>
> >  #include <linux/pci.h>
> > +#include <linux/xarray.h>
> >
> >  #include <drm/ttm/ttm_bo.h>
> >  #include <drm/ttm/ttm_placement.h>
> > @@ -434,6 +435,8 @@ struct amdgpu_flip_work {
> >     bool                            async;
> >  };
> >
> > +struct amdgpu_eventfd_mgr;
> > +
> >  /*
> >   * file private structure
> >   */
> > @@ -453,6 +456,8 @@ struct amdgpu_fpriv {
> >
> >     /** GPU partition selection */
> >     uint32_t                xcp_id;
> > +
> > +   struct amdgpu_eventfd_mgr       *eventfd_mgr;
>
> If possible we should embed that structure here instead of having a pointer.

I understand you prefer embedding struct amdgpu_eventfd_mgr in amdgpu_fpriv.

In my current series the USERQ fence driver holds a reference to the manager, so
the manager can outlive the fpriv.

Question: Do you want to keep mgr separately allocated + kref (mgr may outlive 
fpriv), or should I restrict mgr lifetime to fpriv to allow embedding?

>
> >  };
> >
> >  int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv
> > **fpriv); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.c
> > new file mode 100644
> > index 000000000000..15ffb74c1de3
> > --- /dev/null
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_eventfd.c
> > @@ -0,0 +1,321 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright 2026 Advanced Micro Devices, Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person
> > +obtaining a
> > + * copy of this software and associated documentation files (the
> > +"Software"),
> > + * to deal in the Software without restriction, including without
> > +limitation
> > + * the rights to use, copy, modify, merge, publish, distribute,
> > +sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom
> > +the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be
> > +included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
> KIND,
> > +EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> > +MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN
> NO EVENT
> > +SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY
> CLAIM,
> > +DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> > +OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
> OR THE USE
> > +OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + *
> > + */
> > +
> > +/*
> > + * Render-node eventfd subscription infrastructure.
> > + */
> > +
> > +#include <linux/slab.h>
> > +#include <linux/err.h>
> > +
> > +#include "amdgpu_eventfd.h"
> > +
> > +static void amdgpu_eventfd_mgr_release(struct kref *ref) {
> > +   struct amdgpu_eventfd_mgr *mgr =
> > +           container_of(ref, struct amdgpu_eventfd_mgr, refcount);
> > +   unsigned long index;
> > +   struct amdgpu_eventfd_id *id;
> > +
>
> > +   /* Final teardown: remove all ids and drop all eventfd references. */
> > +   spin_lock(&mgr->lock);
> > +   mgr->fd_closing = true;
> > +   spin_unlock(&mgr->lock);
>
> Of hand I can't see a necessity for that. Please explain further.

Ack, that makes sense. bind/unbind are only reachable through the render
node file descriptor, so they cannot happen after postclose.

I will drop fd_closing and the related checks in v4.

>
> > +
> > +   xa_lock(&mgr->ids);
> > +   xa_for_each(&mgr->ids, index, id) {
> > +           struct amdgpu_eventfd_entry *e;
> > +           struct hlist_node *tmp;
> > +
> > +           __xa_erase(&mgr->ids, index);
> > +
> > +           spin_lock(&id->lock);
> > +           hlist_for_each_entry_safe(e, tmp, &id->entries, hnode) {
> > +                   hlist_del(&e->hnode);
> > +                   id->n_entries--;
> > +                   if (e->ctx)
> > +                           eventfd_ctx_put(e->ctx);
> > +                   kfree(e);
> > +           }
> > +           spin_unlock(&id->lock);
> > +           kfree(id);
> > +   }
> > +   xa_unlock(&mgr->ids);
> > +
> > +   xa_destroy(&mgr->ids);
> > +   kfree(mgr);
> > +}
> > +
> > +struct amdgpu_eventfd_mgr *amdgpu_eventfd_mgr_alloc(void) {
> > +   struct amdgpu_eventfd_mgr *mgr;
> > +
> > +   mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);
> > +   if (!mgr)
> > +           return NULL;
> > +
> > +   kref_init(&mgr->refcount);
> > +   xa_init(&mgr->ids);
> > +   spin_lock_init(&mgr->lock);
> > +   mgr->bind_count = 0;
> > +   mgr->fd_closing = false;
> > +
> > +   return mgr;
> > +}
> > +
> > +void amdgpu_eventfd_mgr_get(struct amdgpu_eventfd_mgr *mgr) {
> > +   if (mgr)
> > +           kref_get(&mgr->refcount);
> > +}
> > +
> > +void amdgpu_eventfd_mgr_put(struct amdgpu_eventfd_mgr *mgr) {
> > +   if (mgr)
> > +           kref_put(&mgr->refcount, amdgpu_eventfd_mgr_release); }
> > +
> > +void amdgpu_eventfd_mgr_mark_closing(struct amdgpu_eventfd_mgr *mgr)
> > +{
> > +   if (!mgr)
> > +           return;
> > +
> > +   spin_lock(&mgr->lock);
> > +   mgr->fd_closing = true;
> > +   spin_unlock(&mgr->lock);
> > +}
> > +
> > +static struct amdgpu_eventfd_id *amdgpu_eventfd_id_alloc(u32
> > +event_id) {
> > +   struct amdgpu_eventfd_id *id;
> > +
> > +   id = kzalloc(sizeof(*id), GFP_KERNEL);
> > +   if (!id)
> > +           return NULL;
> > +
> > +   id->event_id = event_id;
> > +   spin_lock_init(&id->lock);
> > +   INIT_HLIST_HEAD(&id->entries);
> > +   id->n_entries = 0;
> > +   return id;
> > +}
> > +
> > +int amdgpu_eventfd_bind(struct amdgpu_eventfd_mgr *mgr, u32 event_id,
> > +int eventfd) {
> > +   struct amdgpu_eventfd_id *id, *new_id = NULL;
> > +   struct amdgpu_eventfd_entry *e;
> > +   struct eventfd_ctx *ctx;
> > +   unsigned long flags;
> > +   bool found = false;
> > +
> > +   if (!mgr || !event_id || eventfd < 0)
> > +           return -EINVAL;
> > +
>
> > +   /* If file is closing, refuse new binds. */
> > +   spin_lock(&mgr->lock);
> > +   if (mgr->fd_closing) {
> > +           spin_unlock(&mgr->lock);
> > +           return -EBADF;
> > +   }
> > +   spin_unlock(&mgr->lock);
>
> Such checks are unecessary. bind/unbind can only come through the DRM render
> node file descriptor and when we set fd_closing to true that one is already 
> gone.

Agreed. bind/unbind are only reachable via the render-node ioctl on the same 
drm_file, so once postclose runs the FD is already gone and userspace can’t 
issue new ioctls anyway.
I’ll drop the fd_closing flag and remove the closing checks from 
amdgpu_eventfd_bind() / amdgpu_eventfd_unbind() (and the related mark_closing() 
plumbing) in v4.

Signaling is still safe because lifetime is guarded by the manager kref and the 
xarray locking in the IRQ path; this is just to explain why removing fd_closing 
is safe.

>
> > +
> > +   ctx = eventfd_ctx_fdget(eventfd);
> > +   if (IS_ERR(ctx))
> > +           return PTR_ERR(ctx);
> > +
>
> > +   e = kzalloc(sizeof(*e), GFP_KERNEL);
> > +   if (!e) {
> > +           eventfd_ctx_put(ctx);
> > +           return -ENOMEM;
> > +   }
> > +   e->ctx = ctx;
>
> Do that after allocating new_id. It is not a problem if we allocate new_id 
> and then
> never use it.

Ack. I'll move the eventfd_ctx_fdget() and entry allocation after
the new_id allocation so we only take the ctx reference once the
event_id object has been prepared.
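
To make the ordering concrete, the v4 flow I have in mind is roughly the
following (kernel-API sketch, not meant to compile; error unwinding abbreviated,
and assuming amdgpu_eventfd_id_alloc() becomes the get-or-create helper):

```c
	/* 1. Get or create the event_id object first (cheap to discard on error). */
	id = amdgpu_eventfd_id_alloc(mgr, event_id);
	if (IS_ERR(id))
		return PTR_ERR(id);

	/* 2. Only now take the eventfd reference and allocate the entry. */
	ctx = eventfd_ctx_fdget(eventfd);
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);

	e = kzalloc(sizeof(*e), GFP_KERNEL);
	if (!e) {
		eventfd_ctx_put(ctx);
		return -ENOMEM;
	}
	e->ctx = ctx;
	/* 3. Insert e into id->entries under the xa lock. */
```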

>
>
> > +   e->eventfd = eventfd;
>
> Stuff like that is a big no-go. The file descriptor number is only valid to 
> call
> eventfd_ctx_fdget() once!
>
> Even a second call to eventfd_ctx_fdget() can return a different value.
>

Ack. I’ll stop storing the fd integer in amdgpu_eventfd_entry.
I’ll treat eventfd_ctx * as the stable identity: compare ctx for duplicate 
binds and for unbind matching.
The ioctl will do eventfd_ctx_fdget(fd) only to obtain ctx for that call.

> > +
> > +   /*
> > +    * Prepare id object outside xa_lock_irqsave(): kzalloc(GFP_KERNEL)
> > +    * must not happen while holding spinlock/irqs-disabled.
> > +    */
> > +   new_id = amdgpu_eventfd_id_alloc(event_id);
>
>
>
> > +
> > +   /*
> > +    * IRQ-safe design:
> > +    *  - protect id lookup + insertion under xarray lock (irqsave)
> > +    *  - protect entry list under id->lock
>
> That is a bit overkill, just protecting everything under the xa lock should 
> be sufficient
> as far as I can see.

Ack. I’ll simplify the locking by dropping id->lock and protecting the entry 
list with the xarray lock only. That should be sufficient for both lookup and 
entry modifications.
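
On the signaling side that would look roughly like this (kernel-API sketch, not
compilable on its own):

```c
	/* IRQ path, v4: the xa lock alone protects both the id lookup
	 * and the walk of id->entries. */
	xa_lock_irqsave(&mgr->ids, flags);
	id = xa_load(&mgr->ids, event_id);
	if (id) {
		struct amdgpu_eventfd_entry *e;

		hlist_for_each_entry(e, &id->entries, hnode)
			eventfd_signal(e->ctx);
	}
	xa_unlock_irqrestore(&mgr->ids, flags);
```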

>
> > +    */
> > +   xa_lock_irqsave(&mgr->ids, flags);
> > +
> > +   id = xa_load(&mgr->ids, event_id);
> > +   if (!id) {
> > +           if (!new_id) {
> > +                   xa_unlock_irqrestore(&mgr->ids, flags);
> > +                   eventfd_ctx_put(ctx);
> > +                   kfree(e);
> > +                   return -ENOMEM;
> > +           }
> > +           if (xa_err(xa_store(&mgr->ids, event_id, new_id, GFP_NOWAIT))) {
> > +                   xa_unlock_irqrestore(&mgr->ids, flags);
> > +                   kfree(new_id);
> > +                   eventfd_ctx_put(ctx);
> > +                   kfree(e);
> > +                   return -ENOMEM;
> > +           }
> > +           id = new_id;
> > +           new_id = NULL;
> > +   }
>
> I strongly suggest to move that whole dance into amdgpu_eventfd_id_alloc().


Ack. I’ll refactor the xa_load/xa_store/new_id handling into
amdgpu_eventfd_id_alloc() so the get-or-create logic for event_id is
encapsulated there.

>
> > +
> > +   /* Enforce total bind limit. */
> > +   spin_lock(&mgr->lock);
> > +   if (mgr->bind_count >= AMDGPU_EVENTFD_MAX_BINDS) {
> > +           spin_unlock(&mgr->lock);
> > +           xa_unlock_irqrestore(&mgr->ids, flags);
> > +           kfree(new_id);
> > +           eventfd_ctx_put(ctx);
> > +           kfree(e);
> > +           return -ENOSPC;
> > +   }
> > +   spin_unlock(&mgr->lock);
>
> That could be an atomic operation instead. Would save the mgr lock as far as 
> I can
> see.

Ack. I’ll convert bind_count to atomic and remove the lock.
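
The counting pattern I'd switch to, sketched here with C11 atomics so it
compiles standalone (in the driver it would be an atomic_t in the manager, with
atomic_inc_return()/atomic_dec() and -ENOSPC on the overflow path; MAX_BINDS
below is an illustrative small limit standing in for AMDGPU_EVENTFD_MAX_BINDS):

```c
#include <stdatomic.h>

/* Illustrative small limit; the driver would use AMDGPU_EVENTFD_MAX_BINDS. */
#define MAX_BINDS 4

static atomic_int bind_count;

/* Lock-free limit enforcement: optimistically increment, then back out
 * if the new value exceeds the limit. No manager lock needed. */
static int try_bind(void)
{
	if (atomic_fetch_add(&bind_count, 1) + 1 > MAX_BINDS) {
		atomic_fetch_sub(&bind_count, 1);
		return -1;	/* -ENOSPC in the driver */
	}
	return 0;
}
```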

>
> > +
> > +   spin_lock(&id->lock);
> > +   {
> > +           struct amdgpu_eventfd_entry *it;
> > +
> > +           hlist_for_each_entry(it, &id->entries, hnode) {
> > +                   if (it->eventfd == eventfd) {
>
> Absolutely clear NAK to stuff like that! You must compare the ctx instead.
>
> Apart from that is it problematic to bind the same fd to and event multiple 
> times?

> I mean it doesn't make to much sense but it also doesn't hurt us in any way.

If the same (event_id, ctx) is bound again, should bind be a no-op returning 0, 
or should it return -EEXIST?

Ack. I’ll stop storing and comparing the eventfd integer and instead use 
eventfd_ctx *ctx as the binding identity.
For duplicate binds of the same (event_id, ctx), I plan to make bind idempotent 
(ignore duplicates and keep a single entry).
For unbind, I’ll resolve the fd to ctx and remove the matching (event_id, ctx) 
entry.
Could you please confirm whether this is the right approach?
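
What I mean by idempotent, as a standalone sketch (a plain C singly linked list
standing in for the kernel hlist, void pointers standing in for eventfd_ctx *,
and bind_ctx() a hypothetical name):

```c
#include <stdbool.h>
#include <stdlib.h>

struct entry {
	const void *ctx;	/* stands in for struct eventfd_ctx * */
	struct entry *next;
};

/* Idempotent bind keyed on ctx pointer identity: a duplicate (event_id, ctx)
 * pair is silently ignored and the list keeps a single entry per ctx. */
static bool bind_ctx(struct entry **head, const void *ctx)
{
	struct entry *e;

	for (e = *head; e; e = e->next)
		if (e->ctx == ctx)
			return false;	/* already bound: no-op */

	e = malloc(sizeof(*e));
	if (!e)
		return false;
	e->ctx = ctx;
	e->next = *head;
	*head = e;
	return true;		/* new binding inserted */
}
```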


A few other questions:

1. Signal count API:

My tree seems to have eventfd_signal(ctx) only (no count arg).

Question: For older kernels, is it acceptable to loop signal count times, or 
should we treat count as 1?
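
For what it's worth, from the eventfd side the two are equivalent, because the
eventfd counter accumulates: N signals of 1 and one signal of N both leave the
reader with count N. A small userspace demonstration of that accumulation
(Linux-only, uses eventfd(2) directly):

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Simulate "signal count times with 1 each": eventfd sums every write,
 * so N writes of 1 are indistinguishable from one write of N to the reader. */
static uint64_t signal_n_times_and_read(int n)
{
	uint64_t one = 1, total = 0;
	int fd = eventfd(0, 0);

	if (fd < 0)
		return 0;

	for (int i = 0; i < n; i++)
		if (write(fd, &one, sizeof(one)) != sizeof(one))
			break;

	if (read(fd, &total, sizeof(total)) != sizeof(total))
		total = 0;
	close(fd);
	return total;
}
```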

2. IRQ lifetime:

In IRQ paths where I copy mgr pointer, I’ll take a kref get/put around the 
signal to avoid raw pointer lifetime issues.

Question: OK with doing amdgpu_eventfd_mgr_get/put() in IRQ before/after 
signaling?
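
Concretely, the IRQ usage I have in mind (kernel-API sketch; the producer field
and amdgpu_eventfd_signal() are placeholders for whatever the later patches use,
and this presumes the producer already holds a reference, so the get here only
extends it):

```c
	/* IRQ context: pin the manager for the duration of the signal. */
	mgr = READ_ONCE(producer->eventfd_mgr);
	if (mgr) {
		amdgpu_eventfd_mgr_get(mgr);
		amdgpu_eventfd_signal(mgr, event_id);	/* walks ids under xa lock */
		amdgpu_eventfd_mgr_put(mgr);
	}
```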

3. One clarification regarding lifetime:
In the current design the eventfd manager and subscriptions are tied to the 
drm_file (fpriv), and longer-lived producers like the USERQ fence driver hold a 
kref to the manager if needed.
Does that match the expected model, or would you prefer moving the 
subscriptions under VM lifetime instead?

Thanks in advance!

Best regards,
Srini

>
> I'm running out of time to further review the patches, but I think that 
> already gives
> you quite a bit of todo.
>
> Regards,
> Christian.
>
