On 6/17/26 10:38, Philipp Stanner wrote: > On Thu, 2026-06-11 at 13:34 +0100, Tvrtko Ursulin wrote: >> Due the scheduler locking design, and the inability to always lock both >> the entity and the run-queue in the consistent order, a completion exists >> which effectively marks the entity as in use from a call path which is not >> able to lock it. >> >> When entity is selected from the run job worker, its completion is marked >> as non-idle all until the code is sure it will not be dereferencing it any >> more, at which point it signals it as idle, releasing the potential >> parallel cleanup path. >> >> We can remove the need for this completion by implementing the identical >> guarantee by simply flushing the run job work from the cleanup path, after >> having removed the entity from the run queue.
Oh, yes please. >> We then know that the entity is no longer reachable by the run queue >> selection logic, so as soon as any pending work is done the cleanup can >> safely proceed. And because we have marked the entity as stopped, we also >> know that the entity cannot re-enter the run queue. >> >> Signed-off-by: Tvrtko Ursulin <[email protected]> >> Cc: Christian König <[email protected]> >> Cc: Danilo Krummrich <[email protected]> >> Cc: Matthew Brost <[email protected]> >> Cc: Philipp Stanner <[email protected]> >> Cc: [email protected] >> Cc: [email protected] >> --- >> "Perfection is achieved, not when there is nothing more to add, but when >> there is nothing left to take away." - Antoine de Saint-Exupéry > > > Hmm, alright, so the basic trick just seems to be that the workqueue > implementation already can ensure the synchronization which we manually > implemented so far through the completion. > > > It's a bit more LOC, and doesn't solve a bug. However, I kind of like > the idea because it removes a redundant mechanism. Would be cool to > hear some other opinions, though. At least of hand it looks cleaner to me. And just for the record: This was also the original design the scheduler had before people started to "optimize" it. Cheers, Christian. > > A comment below > >> >> Lets see what Intel's CI says about this, not to mention our new AI >> overlords... >> --- >> drivers/gpu/drm/scheduler/sched_entity.c | 25 +++++++++++++++------- >> drivers/gpu/drm/scheduler/sched_internal.h | 14 ++++++++++-- >> drivers/gpu/drm/scheduler/sched_main.c | 2 -- >> drivers/gpu/drm/scheduler/sched_rq.c | 14 +++++++----- >> include/drm/gpu_scheduler.h | 9 -------- >> 5 files changed, 38 insertions(+), 26 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c >> b/drivers/gpu/drm/scheduler/sched_entity.c >> index c51101ec70c1..e6f7c2fbefce 100644 >> --- a/drivers/gpu/drm/scheduler/sched_entity.c >> +++ b/drivers/gpu/drm/scheduler/sched_entity.c >> @@ -137,10 +137,6 @@ int drm_sched_entity_init(struct drm_sched_entity >> *entity, >> entity->rq = &sched_list[0]->rq; >> RCU_INIT_POINTER(entity->last_scheduled, NULL); >> RB_CLEAR_NODE(&entity->rb_tree_node); >> - init_completion(&entity->entity_idle); >> - >> - /* We start in an idle state. */ >> - complete_all(&entity->entity_idle); >> >> spin_lock_init(&entity->lock); >> spsc_queue_init(&entity->job_queue); >> @@ -276,18 +272,24 @@ static void drm_sched_entity_kill_jobs_cb(struct >> dma_fence *f, >> */ >> void drm_sched_entity_kill(struct drm_sched_entity *entity) >> { >> + struct drm_gpu_scheduler *sched; >> struct drm_sched_job *job; >> struct dma_fence *prev; >> >> spin_lock(&entity->lock); >> entity->stopped = true; >> - drm_sched_rq_remove_entity(entity->rq, entity); >> + sched = drm_sched_rq_remove_entity(entity->rq, entity); >> spin_unlock(&entity->lock); >> >> - /* Make sure this entity is not used by the scheduler at the moment */ >> - wait_for_completion(&entity->entity_idle); >> + /* >> + * Make sure this entity is not used by the scheduler at the moment. >> + * >> + * Scheduler is guaranteed to be stable after the entity was stopped and >> + * removed from the run-queue. >> + */ >> + if (sched) >> + drm_sched_flush_run_work(sched); >> >> - /* The entity is guaranteed to not be used by the scheduler */ >> prev = rcu_dereference_check(entity->last_scheduled, true); >> dma_fence_get(prev); >> while ((job = drm_sched_entity_queue_pop(entity))) { >> @@ -576,6 +578,13 @@ void drm_sched_entity_select_rq(struct drm_sched_entity >> *entity) >> return; >> >> spin_lock(&entity->lock); >> + >> + if (entity->stopped) { >> + spin_unlock(&entity->lock); >> + return; >> + >> + } > > Seems unrelated? Why wasn't this needed semantically before? > >> + >> sched = drm_sched_pick_best(entity->sched_list, entity->num_sched_list); >> rq = sched ? &sched->rq : NULL; >> if (rq != entity->rq) { >> diff --git a/drivers/gpu/drm/scheduler/sched_internal.h >> b/drivers/gpu/drm/scheduler/sched_internal.h >> index 13ecb771d7a2..80dece3be415 100644 >> --- a/drivers/gpu/drm/scheduler/sched_internal.h >> +++ b/drivers/gpu/drm/scheduler/sched_internal.h >> @@ -35,12 +35,22 @@ bool drm_sched_can_queue(struct drm_gpu_scheduler *sched, >> struct drm_sched_entity *entity); >> void drm_sched_wakeup(struct drm_gpu_scheduler *sched); >> >> +/** >> + * drm_sched_flush_run_work - flush the run-job work > > In v1, you'd probably want to document what this function typically > will be used for :) > > > Greetings, > P.
