Looks like this hasn't landed yet.  Can someone push this?

Alex

On Fri, Oct 17, 2025 at 2:18 AM Philipp Stanner <[email protected]> wrote:
>
> On Wed, 2025-10-15 at 16:01 +0200, Christian König wrote:
> > From: David Rosca <[email protected]>
> >
> > The DRM scheduler tracks who last used an entity and, when that process
> > is killed, blocks all further submissions to that entity.
> >
> > The problem is that we didn't track who initially created an entity, so
> > when a process accidentally leaked its file descriptor to a child and
> > that child got killed, we killed the parent's entities.
> >
> > Avoid that and instead initialize the entity's last user on entity
> > creation. This also allows dropping the extra NULL check.
> >
> > v2: still use cmpxchg
> > v3: improve the commit message
> >
> > Signed-off-by: David Rosca <[email protected]>
> > Signed-off-by: Christian König <[email protected]>
> > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
> > Reviewed-by: Alex Deucher <[email protected]>
> > CC: [email protected]
>
> Acked-by: Philipp Stanner <[email protected]>
>
>
> Fire at will, Christian. Maybe optionally with the commit message nits
> we discussed before tweaked in.
>
>
> P.
>
> > ---
> >  drivers/gpu/drm/scheduler/sched_entity.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 5a4697f636f2..3e2f83dc3f24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -70,6 +70,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> >       entity->guilty = guilty;
> >       entity->num_sched_list = num_sched_list;
> >       entity->priority = priority;
> > +     entity->last_user = current->group_leader;
> >       /*
> >        * It's perfectly valid to initialize an entity without having a valid
> >        * scheduler attached. It's just not valid to use the scheduler before it
> > @@ -302,7 +303,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> >
> >       /* For a killed process disallow further enqueueing of jobs. */
> >       last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
> > -     if ((!last_user || last_user == current->group_leader) &&
> > +     if (last_user == current->group_leader &&
> >           (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
> >               drm_sched_entity_kill(entity);
> >
>
