Re: [RFC PATCH v4 6/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568

sashiko-bot Sat, 13 Jun 2026 00:11:22 -0700

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) 
to consider:


Pre-existing issues:
- [Critical] Double free of `rjob->tasks` leading to memory corruption 
triggerable from userspace.
- [Critical] Unconditional dereference of a NULL `job->domain` triggering a 
kernel panic on early job submission failure.
- [High] Permanent PM runtime reference count leak when `iommu_attach_group()` 
fails.
- [High] Zero-initialized clock IDs cause the framework to fetch the same clock 
repeatedly, violating hardware constraints.
- [Medium] Premature DRM device registration exposes an incomplete state to 
userspace.
--

commit d0d76b931ac773e3e176353cdb3ac9c590c43dc3
Author: Midgy BALON <[email protected]>

dt-bindings: npu: rockchip, rk3588-rknn-core: Add RK3568
    
The RK3568 carries a single core of the same NVDLA-derived NPU IP as the
RK3588.  Add its compatible.

> diff --git 
> a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml 
> b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
> index caca2a4903cd1..e0b948ac47d45 100644
> --- a/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
> +++ b/Documentation/devicetree/bindings/npu/rockchip,rk3588-rknn-core.yaml
[ ... ]

drivers/accel/rocket/rocket_device.c:rocket_device_init() {
    ...
    err = drm_dev_register(ddev, 0);
    ...
}

drivers/accel/rocket/rocket_drv.c:rocket_probe() {
    ...
    rdev->cores[core].dev = &pdev->dev;
    ...
}

[Severity: Medium]
This is a pre-existing issue, but does this premature registration expose an
incomplete state to userspace?

drm_dev_register() is called in rocket_device_init() before rocket_probe()
assigns rdev->cores[0].dev.

As a result, the device node is briefly visible to userspace while its core
structures remain partially uninitialized.

If userspace attempts to open the DRM device during this window, rocket_open()
passes the NULL rdev->cores[0].dev to rocket_iommu_domain_create().

While the IOMMU framework safely catches the NULL device and returns -ENODEV,
this still causes a spurious open() failure for userspace applications relying
on udev events.


drivers/accel/rocket/rocket_job.c:rocket_job_run() {
    ...
        ret = pm_runtime_get_sync(core->dev);
        if (ret < 0)
                return fence;
    ...
                ret = iommu_attach_group(job->domain->domain, 
core->iommu_group);
                if (ret < 0)
                        return fence;
    ...
}

[Severity: High]
This is a pre-existing issue, but does this cause a permanent PM runtime
reference count leak when iommu_attach_group() fails?

In rocket_job_run(), pm_runtime_get_sync() increments the PM usage count to wake
up the NPU.

If the subsequent iommu_attach_group() fails, the driver returns the allocated
fence and exits early without calling pm_runtime_put().

Because the driver exits before assigning core->in_flight_job, the subsequent
timeout handler rocket_reset() will also skip the manual
pm_runtime_put_noidle().

This causes a permanent PM runtime reference count leak, preventing the NPU
hardware from ever entering runtime suspend.


drivers/accel/rocket/rocket_core.c:rocket_core_init() {
    ...
    err = devm_clk_bulk_get(dev, ARRAY_SIZE(core->clks), core->clks);
    ...
}

[Severity: High]
This is a pre-existing issue, but do these zero-initialized clock IDs cause the
framework to fetch the same clock repeatedly?

In rocket_core_init(), the driver declares an array of clk_bulk_data but never
initializes the .id fields, leaving them implicitly NULL.

When devm_clk_bulk_get() is called, the clock framework handles the NULL IDs by
returning the first clock defined in the device tree for the device on all four
iterations.

Consequently, rocket_device_runtime_resume() enables the exact same clock four
times, while the remaining required clocks are left disabled.

This violates physical hardware state requirements and causes unclocked NPU
register accesses.


drivers/accel/rocket/rocket_job.c:rocket_copy_tasks() {
    ...
fail:
        kvfree(rjob->tasks);
        return ret;
}

drivers/accel/rocket/rocket_job.c:rocket_job_cleanup() {
    ...
        kvfree(job->tasks);
    ...
}

[Severity: Critical]
This is a pre-existing issue, but does this lead to a double free of rjob->tasks
that can be triggered from userspace?

In rocket_copy_tasks(), if copy_from_user() fails, kvfree(rjob->tasks) is
called, but the pointer rjob->tasks is not cleared.

The function returns an error to rocket_ioctl_submit_job(), which jumps to
out_cleanup_job and invokes rocket_job_put().

When the reference count drops, rocket_job_cleanup() is executed,
unconditionally calling kvfree(job->tasks) again on the already freed memory
block.

Since the IOCTL input is fully controlled by userspace, an unprivileged attacker
can trigger a double free by submitting an invalid task pointer.


drivers/accel/rocket/rocket_job.c:rocket_ioctl_submit_job() {
    ...
        rjob = kzalloc_obj(*rjob);
    ...
        ret = rocket_copy_tasks(dev, file, job, rjob);
        if (ret)
                goto out_cleanup_job;
    ...
}

drivers/accel/rocket/rocket_drv.c:rocket_iommu_domain_put() {
    ...
    kref_put(&domain->kref, rocket_iommu_domain_destroy);
    ...
}

[Severity: Critical]
This is a pre-existing issue, but does this unconditionally dereference a NULL
job->domain, triggering a kernel panic on early job submission failure?

In rocket_ioctl_submit_job(), rjob is allocated via kzalloc_obj(), leaving
rjob->domain initialized to NULL.

If a failure occurs before the domain is assigned, such as an invalid task
pointer in rocket_copy_tasks() or a bad BO handle in drm_gem_objects_lookup(),
the driver jumps to the cleanup path.

The cleanup handler, rocket_job_cleanup(), unconditionally passes the NULL
domain to rocket_iommu_domain_put(), which invokes kref_put(&domain->kref, ...).

Computing &domain->kref yields an offset address, and refcount_dec_and_test()
immediately dereferences this invalid address, resulting in a kernel panic that
can be triggered by unprivileged userspace.

-- 
Sashiko AI review · 
https://sashiko.dev/#/patchset/[email protected]?part=6

Re: [RFC PATCH v4 6/9] dt-bindings: npu: rockchip,rk3588-rknn-core: Add RK3568

Reply via email to