Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues: - [High] Unhandled interrupt storm during device removal due to `pm_runtime_get_if_active()`. - [High] NULL pointer dereference in the IRQ thread during teardown (Use-After-Free). Pre-existing issues: - [Critical] Double free of `rjob->tasks` array during job submission error handling. - [High] NULL pointer dereference when freeing an uninitialized `job->domain`. -- commit 7c59a4d5e2e74264bcfe7bc8aa7c903516b1c1c0 Author: ZhaoJinming <[email protected]> accel/rocket: Fix iommu_group leak and unsafe IRQ register access This commit replaces iommu_group_get() with the cached core->iommu_group to prevent a reference leak. It also adds pm_runtime_get_if_active() checks in the IRQ handler and thread to prevent unsafe hardware register accesses when the device is suspended. > diff --git a/drivers/accel/rocket/rocket_job.c > b/drivers/accel/rocket/rocket_job.c > index cf9f61682dd8b..21b433826b479 100644 > --- a/drivers/accel/rocket/rocket_job.c > +++ b/drivers/accel/rocket/rocket_job.c > @@ -349,7 +349,7 @@ static void rocket_job_handle_irq(struct rocket_core > *core) > return; > } > > - iommu_detach_group(NULL, iommu_group_get(core->dev)); > + iommu_detach_group(NULL, core->iommu_group); [Severity: High] Could this lead to a NULL pointer dereference during device teardown? In rocket_core_fini(), core->iommu_group is set to NULL without acquiring core->job_lock or waiting for the IRQ thread to finish: rocket_core_fini() ... core->iommu_group = NULL; If the threaded IRQ handler runs concurrently after passing the runtime PM check, it will pass this newly nulled pointer to iommu_detach_group(), causing the IOMMU core to dereference a NULL pointer when acquiring the group's mutex. > dma_fence_signal(core->in_flight_job->done_fence); > pm_runtime_put_autosuspend(core->dev); > core->in_flight_job = NULL; [ ... ] > @@ -428,16 +431,24 @@ static irqreturn_t rocket_job_irq_handler_thread(int > irq, void *data) > static irqreturn_t rocket_job_irq_handler(int irq, void *data) > { > struct rocket_core *core = data; > - u32 raw_status = rocket_pc_readl(core, INTERRUPT_RAW_STATUS); > + u32 raw_status; > + > + if (pm_runtime_get_if_active(core->dev) != 1) > + return IRQ_NONE; [Severity: High] Will this cause an unhandled interrupt storm during module removal? During module removal, rocket_core_fini() disables runtime PM: rocket_core_fini() pm_runtime_disable(core->dev); This happens before the hardware is stopped. If a hardware job completes subsequently, the device raises a level-triggered interrupt. Since runtime PM is disabled, pm_runtime_get_if_active() returns -EINVAL. The handler then unconditionally returns IRQ_NONE without clearing the interrupt. Because the IRQ is level-triggered, the unacknowledged interrupt will immediately refire upon handler exit, potentially causing an infinite interrupt storm that deadlocks the CPU. [Severity: Critical] This isn't a bug introduced by this patch, but does the error handling in rocket_copy_tasks() cause a double free of the tasks array? If validation of the incoming tasks fails, the array is freed but the pointer is not cleared: rocket_copy_tasks() { ... fail: kvfree(rjob->tasks); return ret; } The caller then jumps to the error label and invokes rocket_job_put(), which triggers rocket_job_cleanup(). This cleanup function frees the dangling pointer again: rocket_job_cleanup() { ... kvfree(job->tasks); } Can a local unprivileged attacker exploit this via the submit ioctl to corrupt kernel memory? [Severity: High] This isn't a bug introduced by this patch, but is there a NULL pointer dereference when freeing an uninitialized domain? When the job structure is allocated, all fields are initialized to zero. If setup fails early (for example, in drm_sched_job_init() or rocket_copy_tasks()), job->domain remains NULL. The error handling path invokes rocket_job_cleanup(), which unconditionally releases the domain: rocket_job_cleanup() { ... rocket_iommu_domain_put(job->domain); } This function immediately dereferences the pointer: rocket_iommu_domain_put() { ... kref_put(&domain->kref, rocket_iommu_domain_destroy); } Could this cause a kernel panic if an unprivileged user passes invalid arguments? -- Sashiko AI review · https://sashiko.dev/#/patchset/[email protected]?part=2
