I've got a kernel.org addr by now by the way On Tue, 2025-10-21 at 14:39 -0700, Matthew Brost wrote: > According to the DMA scheduler documentation, once a job is armed, it > must be pushed. Drivers should avoid calling the failing code path that > attempts to add dependencies after a job has been armed. >
Why is that a "failing code path"? The issue with adding callbacks is that adding them to an already signaled fence is a bad idea. I'm not sure if it's illegal, though. dma_fence_add_cb() merely returns an error then, but the driver could in priniciple then execute its cb code itself. And even if we agree that this is a hard rule that must be followed, then drm_sched_job_arm() *might* not be the right place, because just because a job is armed doesn't mean that its fence is about to get signaled. drm_sched_entity_push_job() would be the critical place. > This change > enforces that rule. > > Cc: Christian König <[email protected]> > Cc: Danilo Krummrich <[email protected]> > Cc: Matthew Brost <[email protected]> > Cc: Philipp Stanner <[email protected]> > Cc: [email protected] > Signed-off-by: Matthew Brost <[email protected]> > --- > drivers/gpu/drm/scheduler/sched_main.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 676484dd3ea3..436cb2844161 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -873,7 +873,8 @@ EXPORT_SYMBOL(drm_sched_job_arm); > * @job: scheduler job to add the dependencies to > * @fence: the dma_fence to add to the list of dependencies. > * > - * Note that @fence is consumed in both the success and error cases. > + * Note that @fence is consumed in both the success and error cases. This > + * function cannot be called if the job is armed. > * > * Returns: > * 0 on success, or an error on failing to expand the array. > @@ -886,6 +887,10 @@ int drm_sched_job_add_dependency(struct drm_sched_job > *job, > u32 id = 0; > int ret; > > + /* Do not allow additional dependencies when job is armed */ > + if (WARN_ON_ONCE(job->sched)) One would probably want an 'armed' boolean for that. At the very least one wants to document in the struct's docstring that job->sched has this semantic meaning. Otherwise it's only obvious for people who have been hacking on the scheduler for years. By the way I think that we use WARN_ON*() too much in DRM. It generates difficult to read, non-descriptive error messages compared to dev_warn() and similar helpers, and it's often a bit overkill. I would only use it when there is no other choice, such as in an interrupt- handler or widely used void func() where you cannot simply add a return code. P. > + return -EINVAL; > + > if (!fence) > return 0; >
