Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues: - [High] The `run_job` callbacks return `NULL` instead of propagating the error pointer when `v3d_fence_create()` fails, masking submission failures and causing the DRM scheduler to treat the job as successfully completed. -- commit f3ab1695e1c90559d6118895b31f3b02289aee81 Author: Maíra Canal <[email protected]> drm/v3d: Clear queue->active_job when v3d_fence_create() fails This commit clears queue->active_job in the run_job callbacks for BIN, RENDER, TFU, and CSD when v3d_fence_create() fails. It creates a failure path to prevent dangling pointers when returning from a failed job submission. > diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c > index 0e266b29317f3..46f4fc09c59ec 100644 > --- a/drivers/gpu/drm/v3d/v3d_sched.c > +++ b/drivers/gpu/drm/v3d/v3d_sched.c [ ... ] > @@ -214,7 +210,7 @@ static struct dma_fence *v3d_bin_job_run(struct > drm_sched_job *sched_job) > > fence = v3d_fence_create(v3d, V3D_BIN); > if (IS_ERR(fence)) > - return NULL; > + goto out_clean_job; [ ... ] > @@ -242,6 +238,12 @@ static struct dma_fence *v3d_bin_job_run(struct > drm_sched_job *sched_job) > V3D_CORE_WRITE(0, V3D_CLE_CT0QEA, job->end); > > return fence; > + > +out_clean_job: > + spin_lock_irqsave(&queue->queue_lock, irqflags); > + queue->active_job = NULL; > + spin_unlock_irqrestore(&queue->queue_lock, irqflags); > + return NULL; [Severity: High] This is a pre-existing issue, but does returning NULL here mask the submission failure from the DRM scheduler? When v3d_fence_create() fails, it returns an error pointer. If the v3d_bin_job_run() callback (and the other run_job callbacks modified in this patch) returns NULL instead of propagating this error pointer, the DRM scheduler treats the job as having completed synchronously and successfully. Could this allow dependent jobs to execute as if the hardware successfully processed the command lists, potentially leading to GPU page faults or userspace crashes? Should the callbacks propagate the error pointer back to the scheduler instead? -- Sashiko AI review · https://sashiko.dev/#/patchset/[email protected]?part=2
