Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:

Pre-existing issues:
- [Critical] The patch addresses a Use-After-Free in `amdxdna_show_fdinfo()`, 
but identical UAF vulnerabilities exist across most ioctls and GEM destruction 
paths due to broken client lifecycle management.
- [Critical] Premature destruction of `notifier_wq` in `amdxdna_remove()` 
causes a Use-After-Free when GEM objects are subsequently freed.
- [High] `amdxdna_sva_init()` leaves `client->sva` as a dangling pointer if 
`iommu_sva_get_pasid()` fails, leading to a double `iommu_sva_unbind_device()` 
on client cleanup.
--

commit 225c80eb0fc36a5511f44ce636794ce0b887dbc7
Author: Lizhi Hou <[email protected]>

accel/amdxdna: Fix amdxdna_client race in amdxdna_show_fdinfo()

This patch guards the show_fdinfo path with drm_dev_enter() and
drm_dev_exit() to prevent use-after-free of the amdxdna_client structure
after the device has been unplugged.

> diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c 
> b/drivers/accel/amdxdna/amdxdna_pci_drv.c
> index c18ddad9f6459..e6b541db38981 100644
> --- a/drivers/accel/amdxdna/amdxdna_pci_drv.c
> +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c
> @@ -296,6 +296,10 @@ static void amdxdna_show_fdinfo(struct drm_printer *p, 
> struct drm_file *filp)
>       struct amdxdna_client *client = filp->driver_priv;
>       size_t heap_usage, external_usage, internal_usage;
>       char *drv_name = filp->minor->dev->driver->name;
> +     int idx;
> +
> +     if (!drm_dev_enter(filp->minor->dev, &idx))
> +             return;

[Severity: Critical]
This is a pre-existing issue, but while this patch addresses the UAF in
amdxdna_show_fdinfo(), do the same vulnerabilities exist across other ioctls
and GEM destruction paths due to broken client lifecycle management?

For example, amdxdna_remove() forcefully frees each amdxdna_client struct
by calling amdxdna_client_cleanup() while file descriptors might still be
open:

drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_remove() {
    ...
    client = list_first_entry_or_null(&xdna->client_list, ...);
    while (client) {
        amdxdna_client_cleanup(client);
        ...
    }
}

Meanwhile, IOCTL handlers like amdxdna_drm_submit_cmd_ioctl(),
amdxdna_drm_config_hwctx_ioctl(), and amdxdna_drm_wait_cmd_ioctl() omit
drm_dev_enter(), extracting the client pointer directly from
filp->driver_priv:

drivers/accel/amdxdna/amdxdna_ctx.c:amdxdna_drm_submit_cmd_ioctl() {
    struct amdxdna_client *client = filp->driver_priv;
    ...
    return amdxdna_drm_submit_execbuf(client, args);
}

If amdxdna_remove() runs concurrently, will these IOCTLs access the freed
client structure, causing a Use-After-Free? Similarly, when the file
descriptor is eventually closed post-unplug, will GEM destruction paths
like amdxdna_gem_obj_close() and amdxdna_gem_dev_obj_free() dereference the
already-freed abo->client pointer?


>  
>       mutex_lock(&client->mm_lock);
>  

[ ... ]

[Severity: High]
This is a pre-existing issue, but does amdxdna_sva_init() leave client->sva
as a dangling pointer if iommu_sva_get_pasid() fails?

In amdxdna_sva_init(), if iommu_sva_bind_device() succeeds but getting the
pasid fails, the device is unbound but client->sva is not set to NULL:

drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_sva_init() {
    ...
    client->pasid = iommu_sva_get_pasid(client->sva);
    if (client->pasid == IOMMU_PASID_INVALID) {
        iommu_sva_unbind_device(client->sva);
        XDNA_ERR(xdna, "SVA get pasid failed");
        return -ENODEV;
    }
}

Then in amdxdna_drm_open(), if amdxdna_sva_init() fails but
amdxdna_use_carveout(xdna) is true, the driver issues a warning and allows
the client initialization to succeed.

When the file descriptor is closed and amdxdna_client_cleanup() calls
amdxdna_sva_fini(), will this dangling pointer bypass the IS_ERR_OR_NULL()
check and cause a double iommu_sva_unbind_device()?

drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_sva_fini() {
    if (IS_ERR_OR_NULL(client->sva))
        return;

    iommu_sva_unbind_device(client->sva);
}

[Severity: Critical]
This is a pre-existing issue, but does the premature destruction of
notifier_wq in amdxdna_remove() cause a Use-After-Free when GEM objects
are subsequently freed?

amdxdna_remove() destroys the workqueue before GEM objects are completely
released:

drivers/accel/amdxdna/amdxdna_pci_drv.c:amdxdna_remove() {
    ...
    destroy_workqueue(xdna->notifier_wq);
    ...
    drm_dev_unplug(&xdna->ddev);
}

When a GEM object's refcount drops to zero later, amdxdna_gem_obj_free()
is invoked and unconditionally calls flush_workqueue() on the destroyed
workqueue:

drivers/accel/amdxdna/amdxdna_gem.c:amdxdna_gem_obj_free() {
    ...
    amdxdna_hmm_unregister(abo, NULL);
    flush_workqueue(xdna->notifier_wq);
}

Could the destruction of the workqueue be deferred to the DRM device release
callback instead of the PCI remove callback?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=3

Reply via email to