On 31/10/2025 15:48, Boris Brezillon wrote:
> A group can become runnable even after reset.in_progress has
> been set to true and panthor_sched_suspend() has been called,
> because the drm_sched queues are still running at that point,
> and ::run_job() might call group_schedule_locked() which moves
> the group to the runnable list. And that's fine, because we're
> moving those groups to the stopped list anyway when we call
> panthor_group_stop(), so just drop the misleading WARN_ON().
If we've got another thread mutating the runnable list between
panthor_sched_suspend() and list_for_each_entry_safe(), doesn't that make
the list iteration unsafe? The _safe variant only protects against the
loop body deleting the current item, not against concurrent modification.

It feels to me like we should be holding the sched mutex, at least while
iterating (rough sketch below the quoted patch). I agree the WARN_ON is
unnecessary, and it will still need removing if we simply guard the
iteration. The alternative is to recolour panthor_sched_suspend() to
assume the lock is held (and take the lock in panthor_sched_pre_reset()),
but I suspect that's an uglier change.

Thanks,
Steve

>
> Signed-off-by: Boris Brezillon <[email protected]>
> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index fc0826db8f48..51a8d842a7a3 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -2835,8 +2835,6 @@ void panthor_sched_pre_reset(struct panthor_device *ptdev)
>  	 * new jobs while we're resetting.
>  	 */
>  	for (i = 0; i < ARRAY_SIZE(sched->groups.runnable); i++) {
> -		/* All groups should be in the idle lists. */
> -		drm_WARN_ON(&ptdev->base, !list_empty(&sched->groups.runnable[i]));
>  		list_for_each_entry_safe(group, group_tmp, &sched->groups.runnable[i], run_node)
>  			panthor_group_stop(group);
>  	}
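
For illustration, here is a rough, untested sketch of the "guard the
iteration" option I mean. It assumes the group lists are protected by
sched->lock (as elsewhere in panthor_sched.c) and that taking it here
doesn't conflict with what panthor_group_stop() expects:

	/* Sketch only: hold the scheduler lock so ::run_job() can't move
	 * groups onto the runnable lists while we walk them. Assumes
	 * sched->lock is the mutex guarding groups.runnable/idle.
	 */
	mutex_lock(&sched->lock);
	for (i = 0; i < ARRAY_SIZE(sched->groups.runnable); i++) {
		list_for_each_entry_safe(group, group_tmp,
					 &sched->groups.runnable[i], run_node)
			panthor_group_stop(group);
	}
	mutex_unlock(&sched->lock);

If panthor_group_stop() itself needs sched->lock (or there's a lock
ordering issue with the reset lock), then obviously the locking would
have to be pushed down or reworked instead.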
