I've got a kernel.org addr by now by the way

On Tue, 2025-10-21 at 14:39 -0700, Matthew Brost wrote:
> According to the DMA scheduler documentation, once a job is armed, it
> must be pushed. Drivers should avoid calling the failing code path that
> attempts to add dependencies after a job has been armed.
> 

Why is that a "failing code path"?

The issue with adding callbacks is that adding them to an already
signaled fence is a bad idea. I'm not sure if it's illegal, though.
dma_fence_add_cb() merely returns an error then, but the driver could
in priniciple then execute its cb code itself.

And even if we agree that this is a hard rule that must be followed,
then drm_sched_job_arm() *might* not be the right place, because just
because a job is armed doesn't mean that its fence is about to get
signaled. drm_sched_entity_push_job() would be the critical place.


>  This change
> enforces that rule.
> 
> Cc: Christian König <[email protected]>
> Cc: Danilo Krummrich <[email protected]>
> Cc: Matthew Brost <[email protected]>
> Cc: Philipp Stanner <[email protected]>
> Cc: [email protected]
> Signed-off-by: Matthew Brost <[email protected]>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 676484dd3ea3..436cb2844161 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -873,7 +873,8 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>   * @job: scheduler job to add the dependencies to
>   * @fence: the dma_fence to add to the list of dependencies.
>   *
> - * Note that @fence is consumed in both the success and error cases.
> + * Note that @fence is consumed in both the success and error cases. This
> + * function cannot be called if the job is armed.
>   *
>   * Returns:
>   * 0 on success, or an error on failing to expand the array.
> @@ -886,6 +887,10 @@ int drm_sched_job_add_dependency(struct drm_sched_job 
> *job,
>       u32 id = 0;
>       int ret;
>  
> +     /* Do not allow additional dependencies when job is armed */
> +     if (WARN_ON_ONCE(job->sched))

One would probably want an 'armed' boolean for that. At the very least
one wants to document in the struct's docstring that job->sched has
this semantic meaning. Otherwise it's only obvious for people who have
been hacking on the scheduler for years.


By the way I think that we use WARN_ON*() too much in DRM. It generates
difficult to read, non-descriptive error messages compared to
dev_warn() and similar helpers, and it's often a bit overkill. I would
only use it when there is no other choice, such as in an interrupt-
handler or widely used void func() where you cannot simply add a return
code.


P.

> +             return -EINVAL;


> +
>       if (!fence)
>               return 0;
>  

Reply via email to