Thank you for your contribution! Sashiko AI review found 4 potential issues
to consider:

Pre-existing issues:
- [Critical] Missing validation of `drm_gem_object_lookup()` return value leads 
to a deterministic kernel crash.
- [Critical] Calling `vmap()` on imported PRIME BOs panics the kernel due to a 
NULL pages array.
- [High] Calling `drm_sched_job_cleanup()` on jobs that have already been 
pushed to the DRM scheduler causes a use-after-free and corrupts the 
scheduler's internal state.
- [High] Implicit BO fences are only added to the last job in a submission, 
causing earlier GPU jobs to execute prematurely without waiting for 
synchronization.
--

commit 572410b6778693137514a8800e2af096467fd0aa
Author: Maíra Canal <[email protected]>

drm/v3d: Refactor CPU ioctl into unified submission chain

Restructure the CPU ioctl so that all job types, including indirect CSD,
use a single struct v3d_submit chain and a single DRM exec context.

> diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c
> index 62c23feb8fbba..1df7ab528422a 100644
> --- a/drivers/gpu/drm/v3d/v3d_submit.c
> +++ b/drivers/gpu/drm/v3d/v3d_submit.c
> @@ -1317,90 +1312,51 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data,
[ ... ]
> +     /* Look up the CPU jobs' BOs before v3d_setup_csd_jobs_and_bos() appends
> +      * the CSD and clean jobs in the case of indirect CSD job.
> +      */
>       if (args->bo_handle_count) {
>               ret = v3d_lookup_bos(&submit, args->bo_handles, args->bo_handle_count);

[Severity: High]
This isn't a bug introduced by this patch, but it looks like BOs might be
attached exclusively to the last job in the submission.

v3d_lookup_bos() assigns the user-provided BO array and bo_count to
submit->jobs[submit->job_count - 1]. Later, v3d_submit_lock_reservations()
iterates over the jobs and calls drm_sched_job_add_implicit_dependencies()
only for jobs with bo_count > 0.

Consequently, earlier jobs in the submission, such as the BIN job in a CL
submission or the CSD job in an indirect CSD submission, receive no
implicit dependencies for the BOs they access. Could this allow them to be
scheduled and executed immediately, racing with other processes or queues
currently writing to those BOs on the GPU?
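
The difference can be illustrated with a small userspace model. This is only
a sketch under stated assumptions: the dep_job and dep_submit types and both
functions are hypothetical stand-ins, not the v3d structures, and a counter
stands in for the fences that drm_sched_job_add_implicit_dependencies() would
add.

```c
#include <stddef.h>

#define DEP_MAX_JOBS 4

/* Hypothetical model types; not the real struct v3d_job / struct v3d_submit. */
struct dep_job {
	size_t bo_count;      /* BOs attached to this job */
	size_t implicit_deps; /* number of BOs whose fences the job waits on */
};

struct dep_submit {
	struct dep_job jobs[DEP_MAX_JOBS];
	size_t job_count;
};

/* Behaviour as described above: only jobs with bo_count > 0 get dependencies. */
void dep_add_for_jobs_with_bos(struct dep_submit *s)
{
	for (size_t i = 0; i < s->job_count; i++)
		if (s->jobs[i].bo_count)
			s->jobs[i].implicit_deps += s->jobs[i].bo_count;
}

/*
 * One possible alternative (an assumption, not the author's fix): every job
 * in the chain waits on the BOs that were attached to the last job.
 */
void dep_add_for_all_jobs(struct dep_submit *s)
{
	size_t bo_count = s->job_count ? s->jobs[s->job_count - 1].bo_count : 0;

	for (size_t i = 0; i < s->job_count; i++)
		s->jobs[i].implicit_deps += bo_count;
}
```

In this model the first variant leaves the earlier job with no dependencies
at all, while the second makes it wait on the same BOs as the last job.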

[Severity: Critical]
This is a pre-existing issue, but I am wondering whether mapping these BOs
handles imported PRIME BOs correctly.

v3d_get_bo_vaddr() maps GEM objects into kernel space by calling
vmap(obj->pages, ...). However, for PRIME BOs imported via
v3d_prime_import_sg_table(), the drm_gem_shmem backend does not populate the
obj->pages array.

By bypassing the drm_gem_shmem_vmap() helper and directly accessing obj->pages,
v3d_get_bo_vaddr() passes NULL as the pages array to vmap(). Can a local
user trigger a kernel panic by importing a DMA-buf, passing it as a buffer to
a CPU job, and causing v3d_get_bo_vaddr() to crash?
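
A minimal userspace sketch of a defensive guard follows. The map_bo type and
both functions are hypothetical models, not v3d or DRM code. In the driver
itself, going through the shmem vmap helper mentioned above may be the more
appropriate fix; that is an assumption this sketch does not verify.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a BO; pages is NULL for imported buffers here. */
struct map_bo {
	void **pages;
	size_t npages;
	void *vaddr;
};

/* Stand-in for vmap(): dereferences the pages array, as the real call would. */
void *map_model_vmap(void **pages, size_t npages)
{
	return npages ? pages[0] : NULL;
}

/* Refuse to map when there is no pages array instead of dereferencing NULL. */
int map_get_bo_vaddr(struct map_bo *bo)
{
	if (bo->vaddr)
		return 0; /* already mapped */

	if (!bo->pages || !bo->npages)
		return -EINVAL; /* e.g. an imported buffer in this model */

	bo->vaddr = map_model_vmap(bo->pages, bo->npages);
	return bo->vaddr ? 0 : -ENOMEM;
}
```

With such a guard the caller would get an error back for an imported buffer
rather than crashing inside the mapping call.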

>               if (ret)
>                       goto fail;
[ ... ]
> +     if (cpu_job->job_type == V3D_CPU_JOB_TYPE_INDIRECT_CSD) {
> +             ret = v3d_setup_csd_jobs_and_bos(&submit, &cpu_job->indirect_csd.args,
> +                                              NULL);

[Severity: Critical]
This is a pre-existing issue, but shouldn't the return value of
drm_gem_object_lookup() be validated when processing the indirect CSD params?

In v3d_get_cpu_indirect_csd_params(), drm_gem_object_lookup() is called
with a user-provided handle and the return value is assigned directly to
info->indirect without checking for NULL.

If userspace passes an invalid handle, NULL is stored. Later, when the CPU job
executes, v3d_rewrite_csd_job_wg_counts_from_indirect() calls
v3d_get_bo_vaddr(to_v3d_bo(indirect_csd->indirect)). 

Would this pass a NULL pointer to v3d_get_bo_vaddr(), which unconditionally
accesses obj->pages, resulting in a deterministic kernel panic?
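
The missing check can be sketched in a small userspace model. The lk_ types
and functions are hypothetical stand-ins, and lk_lookup() only mimics the
documented behaviour of drm_gem_object_lookup(), which returns NULL for an
unknown handle. The choice of -ENOENT follows common DRM driver practice and
is an assumption about what the driver would return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GEM object and the indirect CSD info. */
struct lk_object {
	int handle;
};

struct lk_indirect_info {
	struct lk_object *indirect;
};

/* Models the lookup: returns NULL when the handle is not in the table. */
struct lk_object *lk_lookup(struct lk_object *table, size_t n, int handle)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].handle == handle)
			return &table[i];
	return NULL;
}

/* Validate the lookup result at parse time, before any job can run. */
int lk_get_indirect_params(struct lk_indirect_info *info,
			   struct lk_object *table, size_t n, int handle)
{
	info->indirect = lk_lookup(table, n, handle);
	if (!info->indirect)
		return -ENOENT;

	return 0;
}
```

Failing at this point would let the ioctl return an error to userspace, so
the NULL pointer never reaches the code that runs the CPU job.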

[ ... ]
> +     ret = v3d_submit_jobs(&submit, 0, &se);
> +     if (ret)
> +             goto fail_unreserve;
>  
>       return 0;
>  
>  fail_unreserve:
>       v3d_submit_unlock_reservations(&submit);
>  fail:
>       v3d_submit_cleanup_jobs(&submit);

[Severity: High]
This isn't a bug introduced by this patch, but does calling
v3d_submit_cleanup_jobs() here clean up jobs that have already been pushed?

In v3d_submit_jobs(), jobs are pushed to the DRM scheduler via v3d_push_job().
If a subsequent call to drm_sched_job_add_dependency() fails for a later job
in the submission chain, the function returns an error without unwinding the
already-pushed jobs.

The caller then branches to this error path that invokes
v3d_submit_cleanup_jobs(), which iterates over all jobs and calls
v3d_job_cleanup() -> drm_sched_job_cleanup().

Calling drm_sched_job_cleanup() on a job that the scheduler already owns
violates the scheduler API contract. It sets job->s_fence = NULL and destroys
the dependencies XArray while the scheduler thread might be executing the job.
Can this lead to a NULL pointer dereference and memory corruption?
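
One way to picture a safer error path is the userspace model below. It is a
sketch only: the chain_ types and functions are hypothetical, a boolean flag
stands in for scheduler ownership, and -ENOMEM stands in for whatever error
drm_sched_job_add_dependency() might return. It assumes pushed jobs are later
released by the scheduler itself, which this model does not simulate.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define CHAIN_MAX_JOBS 4

/* Hypothetical model of a job in the submission chain. */
struct chain_job {
	bool pushed;          /* ownership handed to the scheduler */
	bool cleaned;         /* cleanup was called by the submitter */
	bool fail_dependency; /* simulate a failure while adding a dependency */
};

struct chain_submit {
	struct chain_job jobs[CHAIN_MAX_JOBS];
	size_t job_count;
};

/* Push jobs in order; stop at the first job whose dependency setup fails. */
int chain_submit_jobs(struct chain_submit *s)
{
	for (size_t i = 0; i < s->job_count; i++) {
		if (s->jobs[i].fail_dependency)
			return -ENOMEM;
		s->jobs[i].pushed = true;
	}
	return 0;
}

/* Error path: clean up only the jobs the scheduler does not own yet. */
void chain_cleanup_unpushed(struct chain_submit *s)
{
	for (size_t i = 0; i < s->job_count; i++)
		if (!s->jobs[i].pushed)
			s->jobs[i].cleaned = true;
}
```

In this model a failure on the second job leaves the first job untouched by
the error path, while the second and later jobs are cleaned up.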

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=11
