On Tue, 2026-06-23 at 15:05 +0100, Tvrtko Ursulin wrote: > > > There we have this sequence: > > > > > > spin_lock(&entity->lock); > > > entity->stopped = true; > > > sched = drm_sched_rq_remove_entity(entity->rq, entity); > > > spin_unlock(&entity->lock); > > > > > > if (sched) > > > drm_sched_flush_run_work(sched); > > > > > > That is, without that check, in theory, an evil driver could race > > > drm_sched_entity_select_rq() (via drm_sched_job_arm()) and > > > drm_sched_entity_kill(). I am not sure if any driver can actually do > > > that at the moment but it felt sensible to express it in code. > > > > The devil is in the "at the moment". > > > > AFAICS and as your explanation sounds, this is an existing problem that > > is unrelated to the Flush RFC. So I suppose that this should be a > > separate patch (independent from this one) which precisely focusses on > > this robustness work. > > Catch is, and why I thought it is justified to do this it in this patch, > is because before completion was in the entity so whats happening with > entity->rq->sched after entity is stopped was irrelevant. With this > patch is it relevant on paper so I considered it prudent to express that > in the code as well as the comment.
I'm not sure if I can fully follow, but it seems that the stopped boolean + spinlock now protects the scheduler pointer? IDK, I don't feel comfortable right now.. New docu says drm_sched_rq_add_entity(struct drm_sched_entity *entity) * @entity: scheduler entity * * Removes a scheduler entity from the run queue. + * + * Return: DRM scheduler selected to handle this entity or NULL if entity has + * already been removed. */ Isn't a removed entity by definition stopped since drm_sched can't access it anymore? Or do we have race there, too? If so, then stopped and entity->rq->sched == NULL represent the same state. IOW, the locked bool must be redundant with something. Right? > > Acceptable or unacceptable? In any case such changes should, wherever possible, be a separate patch. For the fact alone that the distinct commit message and especially the position / order of the patch within the series helps the reviewer greatly in understanding what that particular change enables and thus why it's necessary. P.
