Thank you for your contribution! Sashiko AI review found 2 potential issue(s) 
to consider:

Pre-existing issues:
- [Critical] Dangling pointer in `ctx->fw_obj` leads to use-after-free on 
`xa_alloc()` failure in `pvr_context_create()`.
- [High] Concurrent job submission and context destruction pushes jobs to a 
destroyed DRM scheduler entity, leaking jobs and corrupting memory.
--

commit 251c2d0e7e2826c41929b77248365d4ff3fa7a98
Author: Brajesh Gupta <[email protected]>

drm/imagination: Fix double call to drm_sched_entity_fini()

This patch removes an extra call to drm_sched_entity_fini() during context
teardown. It fixes a use-after-free underflow issue in the DRM scheduler
entity stats refcounting.

Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling")
References: fd177135f0e6 ("drm/sched: Account entity GPU time")

> diff --git a/drivers/gpu/drm/imagination/pvr_context.c 
> b/drivers/gpu/drm/imagination/pvr_context.c
> index eba4694400b5a..e2e30908dfce4 100644
> --- a/drivers/gpu/drm/imagination/pvr_context.c
> +++ b/drivers/gpu/drm/imagination/pvr_context.c

[ ... ]

> @@ -349,7 +374,7 @@ int pvr_context_create(struct pvr_file *pvr_file, struct 
> drm_pvr_ioctl_create_co
>       pvr_fw_object_destroy(ctx->fw_obj);
>  
>  err_destroy_queues:
> -     pvr_context_destroy_queues(ctx);
> +     pvr_context_destroy_queues(ctx, true);

[Severity: Critical]
This is a pre-existing issue, but can this cause a use-after-free if
xa_alloc() fails?

If xa_alloc() fails for the context handle, the code jumps to
err_destroy_fw_obj:

err_destroy_fw_obj:
        pvr_fw_object_destroy(ctx->fw_obj);

err_destroy_queues:
        pvr_context_destroy_queues(ctx, true);

This frees the pvr_fw_object but does not set ctx->fw_obj to NULL. Execution
then falls through to err_destroy_queues, where pvr_context_destroy_queues()
will iterate through the queues and call pvr_queue_destroy().

Inside pvr_queue_destroy(), it eventually calls
pvr_queue_cleanup_fw_context():

pvr_queue_cleanup_fw_context() {
        if (!queue->ctx->fw_obj)
                return 0;
        return pvr_fw_structure_cleanup(..., queue->ctx->fw_obj, ...);
}

Since the freed pointer is not NULL, does it bypass this check and
dereference the freed memory in pvr_fw_structure_cleanup()?

>  
>  err_free_ctx_id:
>       /*

[ ... ]

> diff --git a/drivers/gpu/drm/imagination/pvr_queue.c 
> b/drivers/gpu/drm/imagination/pvr_queue.c
> index 7ed60e1c1a866..15cd8adc7528f 100644
> --- a/drivers/gpu/drm/imagination/pvr_queue.c
> +++ b/drivers/gpu/drm/imagination/pvr_queue.c
> @@ -1445,15 +1445,11 @@ void pvr_queue_kill(struct pvr_queue *queue)
>   */
>  void pvr_queue_destroy(struct pvr_queue *queue)
>  {

[Severity: High]
This is a pre-existing issue, but does a race condition exist
between pvr_submit_jobs() and pvr_context_destroy()?

If pvr_submit_jobs() acquires a reference to the context and sets up a job,
and another thread concurrently calls the context destruction ioctl,
pvr_context_destroy() calls pvr_context_kill_queues() and then
pvr_queue_kill().

pvr_queue_kill() executes:

        drm_sched_entity_destroy(&queue->entity);
        dma_fence_put(queue->last_queued_job_scheduled_fence);
        queue->last_queued_job_scheduled_fence = NULL;

If the submit thread concurrently reaches pvr_queue_job_push():

        dma_fence_put(queue->last_queued_job_scheduled_fence);
        queue->last_queued_job_scheduled_fence = dma_fence_get(...);
        ...
        drm_sched_entity_push_job(&job->base);

Can this lead to a data race on queue->last_queued_job_scheduled_fence,
potentially causing a double free? 

Also, wouldn't this push a job to a scheduler entity that was already
destroyed by pvr_queue_kill(), leaking the stranded jobs?

> -     if (!queue)
> -             return;
> -
>       mutex_lock(&queue->ctx->pvr_dev->queues.lock);
>       list_del_init(&queue->node);
>       mutex_unlock(&queue->ctx->pvr_dev->queues.lock);
>  
>       drm_sched_fini(&queue->scheduler);
> -     drm_sched_entity_fini(&queue->entity);
>  
>       if (WARN_ON(queue->last_queued_job_scheduled_fence))
>               dma_fence_put(queue->last_queued_job_scheduled_fence);

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=1

Reply via email to