> From: Nicolin Chen <[email protected]>
> Sent: Monday, October 13, 2025 8:05 AM
> @@ -714,7 +714,12 @@ struct iommu_ops {
> 
>  /**
>   * struct iommu_domain_ops - domain specific operations
> - * @attach_dev: attach an iommu domain to a device
> + * @test_dev: Test compatibility prior to an @attach_dev or @set_dev_pasid call.
> + *            A driver-level callback of this op should do a thorough sanity, to
> + *            make sure a device is compatible with the domain. So the following
> + *            @attach_dev and @set_dev_pasid functions would likely succeed with
> + *            only one exception due to a temporary failure like out of memory.
> + *            It's suggested to avoid the kernel prints in this op.
>   *  Return:
>   * * 0               - success
>   * * EINVAL  - can indicate that device and domain are incompatible due to
> @@ -722,11 +727,15 @@ struct iommu_ops {
>   *             driver shouldn't log an error, since it is legitimate for a
>   *             caller to test reuse of existing domains. Otherwise, it may
>   *             still represent some other fundamental problem
> - * * ENOMEM  - out of memory
> - * * ENOSPC  - non-ENOMEM type of resource allocation failures
>   * * EBUSY   - device is attached to a domain and cannot be changed
>   * * ENODEV  - device specific errors, not able to be attached
>   * * <others>        - treated as ENODEV by the caller. Use is discouraged
> + * @attach_dev: attach an iommu domain to a device
> + *  Return:
> + * * 0               - success
> + * * ENOMEM  - out of memory
> + * * ENOSPC  - non-ENOMEM type of resource allocation failures
> + * * <others>        - Use is discouraged

Meeting this requirement might need more work. E.g. after patch 4
I could still easily spot other error codes in the attach path:

intel_iommu_attach_device()
  iopf_for_domain_set()
    intel_iommu_enable_iopf():

        if (!info->pri_enabled)
                return -ENODEV;

intel_iommu_attach_device()
  dmar_domain_attach_device()
    domain_attach_iommu():
      
        curr = xa_cmpxchg(&domain->iommu_array, iommu->seq_id,
                          NULL, info, GFP_KERNEL);
        if (curr) {
                ret = xa_err(curr) ? : -EBUSY;
                goto err_clear;
        }

intel_iommu_attach_device()
  dmar_domain_attach_device()
    domain_setup_first_level()
      __domain_setup_first_level()
        intel_pasid_setup_first_level():

        pte = intel_pasid_get_entry(dev, pasid);
        if (!pte) {
                spin_unlock(&iommu->lock);
                return -ENODEV;
        }

        if (pasid_pte_is_present(pte)) {
                spin_unlock(&iommu->lock);
                return -EBUSY;
        }
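
To make the gap concrete, here is a small userspace sketch (not kernel
code) that checks a return value against the @attach_dev list proposed in
the kdoc above. The helper name attach_errno_is_documented() is made up
for illustration; it only encodes the three values the new kdoc lists.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel API: returns true if @ret is one of
 * the return values listed in the proposed @attach_dev kdoc (0, -ENOMEM,
 * -ENOSPC). Anything else falls under "<others> - Use is discouraged".
 */
static bool attach_errno_is_documented(int ret)
{
	switch (ret) {
	case 0:
	case -ENOMEM:
	case -ENOSPC:
		return true;
	default:
		return false;
	}
}
```

With this check, the -ENODEV and -EBUSY returns quoted above both land in
the "discouraged" bucket, while -ENOMEM and -ENOSPC pass.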

On the other hand, how do we communicate errors returned by attach_dev
in the reset_done path back to userspace? As noted above, resource
allocation failures can still occur in attach_dev, but userspace may think
that an attach requested in the middle of a reset has succeeded as long as
it passes the test_dev check.

Would it work better to block the attaching process while a reset is
ongoing and wake it up at reset_done to resume the attach?
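
A rough userspace model of that idea, using pthreads in place of kernel
primitives. Everything here (struct dev_model, the model_* functions, the
simulate_oom knob) is hypothetical and only meant to show the control
flow: the attacher sleeps while a reset is in flight, then runs the real
attach itself and so sees its return value directly.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-device state; none of these names exist in the kernel. */
struct dev_model {
	pthread_mutex_t lock;
	pthread_cond_t reset_done_cv;
	bool resetting;
	int attached_domain;	/* 0 means no domain attached */
};

static void model_init(struct dev_model *d)
{
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->reset_done_cv, NULL);
	d->resetting = false;
	d->attached_domain = 0;
}

static void model_reset_prepare(struct dev_model *d)
{
	pthread_mutex_lock(&d->lock);
	d->resetting = true;
	pthread_mutex_unlock(&d->lock);
}

static void model_reset_done(struct dev_model *d)
{
	pthread_mutex_lock(&d->lock);
	d->resetting = false;
	/* Wake every attacher that blocked during the reset. */
	pthread_cond_broadcast(&d->reset_done_cv);
	pthread_mutex_unlock(&d->lock);
}

/* Block until no reset is in flight, then attach and return the result. */
static int model_attach(struct dev_model *d, int domain_id, bool simulate_oom)
{
	int ret = 0;

	pthread_mutex_lock(&d->lock);
	while (d->resetting)
		pthread_cond_wait(&d->reset_done_cv, &d->lock);
	if (simulate_oom)
		ret = -ENOMEM;	/* stand-in for an allocation failure */
	else
		d->attached_domain = domain_id;
	pthread_mutex_unlock(&d->lock);
	return ret;
}

struct attach_req {
	struct dev_model *d;
	int domain_id;
	bool simulate_oom;
	int ret;
};

static void *attach_thread(void *arg)
{
	struct attach_req *r = arg;

	r->ret = model_attach(r->d, r->domain_id, r->simulate_oom);
	return NULL;
}

/* Issue an attach while a reset is in flight; return what the attacher saw. */
static int model_attach_across_reset(int domain_id, bool simulate_oom)
{
	struct dev_model d;
	struct attach_req req = { &d, domain_id, simulate_oom, 0 };
	pthread_t t;

	model_init(&d);
	model_reset_prepare(&d);
	if (pthread_create(&t, NULL, attach_thread, &req))
		return -EAGAIN;
	model_reset_done(&d);
	pthread_join(t, NULL);
	return req.ret;
}
```

In this model an -ENOMEM from the attach step reaches the caller that
requested the attach, rather than being hit later in the reset_done path
where there is no caller to report it to.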
