On 7/27/23 04:55, Jesse Zhang wrote:
From: Jesse Zhang <[email protected]>

The iGPU driver fails to read/write registers through the IOMMU when X starts.
kernel: [  433.296634] audit: type=1400 audit(1690403823.130:64): apparmor="DENIED" operation="capable" class="cap" profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=12344 comm="snap-confine" capability=38  capname="perfmon"
kernel: [  433.515795] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
kernel: [  440.195492] amdgpu 0000:03:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
kernel: [  453.679611] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
kernel: [  460.383490] amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2659

Disable the address translation service (ATS) before detaching the device.
do_detach() clears the page table pointer or PASID table entries,
so all DMA requests from the device should be blocked before that.

Signed-off-by: Jesse Zhang <[email protected]>
---
  drivers/iommu/amd/iommu.c | 21 ++++++++++++---------
  1 file changed, 12 insertions(+), 9 deletions(-)

The reporter came back and indicated this worked, so here are some tags for it.

Fixes: 8dc1db3172ae ("drm/amdkfd: Introduce kfd_node struct (v5)")
Tested-by: Mike Lothian <[email protected]>

The commit that introduced the problem is in 6.5-rc1, so hopefully this can be queued up for a future 6.5-rc.
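
To make the ordering argument concrete, here is a toy user-space model of the window the patch closes. It is only a sketch, not kernel code: struct toy_dev and every toy_* function are hypothetical stand-ins, and it assumes a device DMA request arrives between the two teardown steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state; not a kernel structure. */
struct toy_dev {
	bool ats_enabled;      /* device may still issue translated DMA requests */
	bool table_valid;      /* page table pointer / PASID entries still set */
	int  faulted_requests; /* requests issued against a cleared table */
};

/* Assumption of this model: a request fails only if ATS is still on
 * while the translation tables have already been cleared.
 */
static void toy_dma(struct toy_dev *d)
{
	if (d->ats_enabled && !d->table_valid)
		d->faulted_requests++;
}

static void toy_disable_ats(struct toy_dev *d) { d->ats_enabled = false; }
static void toy_do_detach(struct toy_dev *d)   { d->table_valid = false; }

/* Order before the patch: clear the tables first, then disable ATS. */
int detach_old_order(void)
{
	struct toy_dev d = { true, true, 0 };

	toy_do_detach(&d);
	toy_dma(&d);           /* request lands in the unprotected window */
	toy_disable_ats(&d);
	toy_dma(&d);
	return d.faulted_requests;
}

/* Order after the patch: disable ATS first, then clear the tables. */
int detach_new_order(void)
{
	struct toy_dev d = { true, true, 0 };

	toy_disable_ats(&d);
	toy_dma(&d);
	toy_do_detach(&d);
	toy_dma(&d);
	return d.faulted_requests;
}
```

In this model detach_old_order() returns 1 and detach_new_order() returns 0, which mirrors the reasoning in the commit message: once ATS is off, clearing the tables no longer leaves a window for the device to use them.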


diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index dc1ec6849775..6a2237bfdcb9 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1863,17 +1863,20 @@ static void detach_device(struct device *dev)
        if (WARN_ON(!dev_data->domain))
                goto out;
-       do_detach(dev_data);
-
-       if (!dev_is_pci(dev))
-               goto out;
+       /* Disable ATS before detaching the device.
+        * do_detach() clears the page table pointer or PASID table entries,
+        * so all DMA requests from the device should be blocked before that.
+        */
+       if (dev_is_pci(dev)) {
+               if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
+                       pdev_iommuv2_disable(to_pci_dev(dev));
+               else if (dev_data->ats.enabled)
+                       pci_disable_ats(to_pci_dev(dev));
-       if (domain->flags & PD_IOMMUV2_MASK && dev_data->iommu_v2)
-               pdev_iommuv2_disable(to_pci_dev(dev));
-       else if (dev_data->ats.enabled)
-               pci_disable_ats(to_pci_dev(dev));
+               dev_data->ats.enabled = false;
+       }
-       dev_data->ats.enabled = false;
+       do_detach(dev_data);
 out:
        spin_unlock(&dev_data->lock);
