Looks like this hasn't landed yet. Can someone push this?

Alex

On Fri, Oct 17, 2025 at 2:18 AM Philipp Stanner <[email protected]> wrote:
>
> On Wed, 2025-10-15 at 16:01 +0200, Christian König wrote:
> > From: David Rosca <[email protected]>
> >
> > The DRM scheduler tracks who last uses an entity and when that process
> > is killed blocks all further submissions to that entity.
> >
> > The problem is that we didn't track who initially created an entity, so
> > when a process accidently leaked its file descriptor to a child and
> > that child got killed, we killed the parent's entities.
> >
> > Avoid that and instead initialize the entities last user on entity
> > creation. This also allows to drop the extra NULL check.
> >
> > v2: still use cmpxchg
> > v3: improve the commit message
> >
> > Signed-off-by: David Rosca <[email protected]>
> > Signed-off-by: Christian König <[email protected]>
> > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
> > Reviewed-by: Alex Deucher <[email protected]>
> > CC: [email protected]
>
> Acked-by: Philipp Stanner <[email protected]>
>
>
> Fire at will, Christian. Maybe optionally with the commit message nits
> twirked in we discussed before.
>
>
> P.
>
> > ---
> >  drivers/gpu/drm/scheduler/sched_entity.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> > b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 5a4697f636f2..3e2f83dc3f24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -70,6 +70,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> >  	entity->guilty = guilty;
> >  	entity->num_sched_list = num_sched_list;
> >  	entity->priority = priority;
> > +	entity->last_user = current->group_leader;
> >  	/*
> >  	 * It's perfectly valid to initialize an entity without having a valid
> >  	 * scheduler attached. It's just not valid to use the scheduler
> > before it
> > @@ -302,7 +303,7 @@ long drm_sched_entity_flush(struct drm_sched_entity
> > *entity, long timeout)
> >
> >  	/* For a killed process disallow further enqueueing of jobs. */
> >  	last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
> > -	if ((!last_user || last_user == current->group_leader) &&
> > +	if (last_user == current->group_leader &&
> >  	    (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
> >  		drm_sched_entity_kill(entity);
> >
>
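
For anyone following along, here is a rough user-space sketch of the check the patch changes. It is only a model under stated assumptions: struct task, struct entity, model_cmpxchg() and the should_kill_*() helpers are hypothetical stand-ins built on C11 atomics, not the kernel's definitions, and the PF_EXITING/SIGKILL test is collapsed into a single "sigkilled" flag. It assumes the parent created the entity but has not submitted anything yet, so under the old init path last_user is still NULL when the child that inherited the fd is killed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct task { const char *name; };
struct entity { _Atomic(struct task *) last_user; };

/* Models cmpxchg(): returns the value seen before the exchange attempt. */
static struct task *model_cmpxchg(_Atomic(struct task *) *ptr,
				  struct task *oldval, struct task *newval)
{
	struct task *expected = oldval;

	/* On failure, 'expected' is overwritten with the current value. */
	atomic_compare_exchange_strong(ptr, &expected, newval);
	return expected;
}

/* Flush check before the patch: a NULL last_user also counts as a match. */
static bool should_kill_old(struct entity *e, struct task *cur, bool sigkilled)
{
	struct task *last = model_cmpxchg(&e->last_user, cur, NULL);

	return (!last || last == cur) && sigkilled;
}

/* Flush check after the patch: only the recorded user triggers the kill. */
static bool should_kill_new(struct entity *e, struct task *cur, bool sigkilled)
{
	struct task *last = model_cmpxchg(&e->last_user, cur, NULL);

	return last == cur && sigkilled;
}

/* Leaked-fd scenario: the child is SIGKILLed while holding the parent's fd. */
static void check_scenario(void)
{
	struct task parent = { "parent" }, child = { "child" };
	struct entity before, after;

	atomic_init(&before.last_user, NULL);    /* old init: nobody recorded */
	atomic_init(&after.last_user, &parent);  /* new init: creator recorded */

	assert(should_kill_old(&before, &child, true));  /* parent's entity killed */
	assert(!should_kill_new(&after, &child, true));  /* parent's entity survives */
	assert(should_kill_new(&after, &parent, true));  /* the owner being killed still counts */
}
```

Calling check_scenario() from a small main() should pass all three assertions: in this model the old check kills the parent's entity on the child's exit, while the new one only fires when the recorded user itself is killed.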
