Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues: - [Critical] Dangling pointer in `ctx->fw_obj` leads to use-after-free on `xa_alloc()` failure in `pvr_context_create()`. - [High] Concurrent job submission and context destruction pushes jobs to a destroyed DRM scheduler entity, leaking jobs and corrupting memory. -- commit 251c2d0e7e2826c41929b77248365d4ff3fa7a98 Author: Brajesh Gupta <[email protected]> drm/imagination: Fix double call to drm_sched_entity_fini() This patch removes an extra call to drm_sched_entity_fini() during context teardown. It fixes a use-after-free underflow issue in the DRM scheduler entity stats refcounting. Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling") References: fd177135f0e6 ("drm/sched: Account entity GPU time") > diff --git a/drivers/gpu/drm/imagination/pvr_context.c > b/drivers/gpu/drm/imagination/pvr_context.c > index eba4694400b5a..e2e30908dfce4 100644 > --- a/drivers/gpu/drm/imagination/pvr_context.c > +++ b/drivers/gpu/drm/imagination/pvr_context.c [ ... ] > @@ -349,7 +374,7 @@ int pvr_context_create(struct pvr_file *pvr_file, struct > drm_pvr_ioctl_create_co > pvr_fw_object_destroy(ctx->fw_obj); > > err_destroy_queues: > - pvr_context_destroy_queues(ctx); > + pvr_context_destroy_queues(ctx, true); [Severity: Critical] This is a pre-existing issue, but can this cause a use-after-free if xa_alloc() fails? If xa_alloc() fails for the context handle, the code jumps to err_destroy_fw_obj: err_destroy_fw_obj: pvr_fw_object_destroy(ctx->fw_obj); err_destroy_queues: pvr_context_destroy_queues(ctx, true); This frees the pvr_fw_object but does not set ctx->fw_obj to NULL. Execution then falls through to err_destroy_queues, where pvr_context_destroy_queues() will iterate through the queues and call pvr_queue_destroy(). Inside pvr_queue_destroy(), it eventually calls pvr_queue_cleanup_fw_context(): pvr_queue_cleanup_fw_context() { if (!queue->ctx->fw_obj) return 0; return pvr_fw_structure_cleanup(..., queue->ctx->fw_obj, ...); } Since the freed pointer is not NULL, does it bypass this check and dereference the freed memory in pvr_fw_structure_cleanup()? > > err_free_ctx_id: > /* [ ... ] > diff --git a/drivers/gpu/drm/imagination/pvr_queue.c > b/drivers/gpu/drm/imagination/pvr_queue.c > index 7ed60e1c1a866..15cd8adc7528f 100644 > --- a/drivers/gpu/drm/imagination/pvr_queue.c > +++ b/drivers/gpu/drm/imagination/pvr_queue.c > @@ -1445,15 +1445,11 @@ void pvr_queue_kill(struct pvr_queue *queue) > */ > void pvr_queue_destroy(struct pvr_queue *queue) > { [Severity: High] This is a pre-existing issue, but does a race condition exist between pvr_submit_jobs() and pvr_context_destroy()? If pvr_submit_jobs() acquires a reference to the context and sets up a job, and another thread concurrently calls the context destruction ioctl, pvr_context_destroy() calls pvr_context_kill_queues() and then pvr_queue_kill(). pvr_queue_kill() executes: drm_sched_entity_destroy(&queue->entity); dma_fence_put(queue->last_queued_job_scheduled_fence); queue->last_queued_job_scheduled_fence = NULL; If the submit thread concurrently reaches pvr_queue_job_push(): dma_fence_put(queue->last_queued_job_scheduled_fence); queue->last_queued_job_scheduled_fence = dma_fence_get(...); ... drm_sched_entity_push_job(&job->base); Can this lead to a data race on queue->last_queued_job_scheduled_fence, potentially causing a double free? Also, wouldn't this push a job to a scheduler entity that was already destroyed by pvr_queue_kill(), leaking the stranded jobs? > - if (!queue) > - return; > - > mutex_lock(&queue->ctx->pvr_dev->queues.lock); > list_del_init(&queue->node); > mutex_unlock(&queue->ctx->pvr_dev->queues.lock); > > drm_sched_fini(&queue->scheduler); > - drm_sched_entity_fini(&queue->entity); > > if (WARN_ON(queue->last_queued_job_scheduled_fence)) > dma_fence_put(queue->last_queued_job_scheduled_fence); -- Sashiko AI review ยท https://sashiko.dev/#/patchset/[email protected]?part=1
