> From: Nicolin Chen <[email protected]>
> Sent: Monday, October 13, 2025 8:05 AM
> @@ -714,7 +714,12 @@ struct iommu_ops {
>
> /**
> * struct iommu_domain_ops - domain specific operations
> - * @attach_dev: attach an iommu domain to a device
> + * @test_dev: Test compatibility prior to an @attach_dev or @set_dev_pasid call.
> + * A driver-level callback of this op should do a thorough sanity, to
> + * make sure a device is compatible with the domain. So the following
> + * @attach_dev and @set_dev_pasid functions would likely succeed with
> + * only one exception due to a temporary failure like out of memory.
> + * It's suggested to avoid the kernel prints in this op.
> * Return:
> * * 0 - success
> * * EINVAL - can indicate that device and domain are incompatible due to
> @@ -722,11 +727,15 @@ struct iommu_ops {
> * driver shouldn't log an error, since it is legitimate for a
> * caller to test reuse of existing domains. Otherwise, it may
> * still represent some other fundamental problem
> - * * ENOMEM - out of memory
> - * * ENOSPC - non-ENOMEM type of resource allocation failures
> * * EBUSY - device is attached to a domain and cannot be changed
> * * ENODEV - device specific errors, not able to be attached
> * * <others> - treated as ENODEV by the caller. Use is discouraged
> + * @attach_dev: attach an iommu domain to a device
> + * Return:
> + * * 0 - success
> + * * ENOMEM - out of memory
> + * * ENOSPC - non-ENOMEM type of resource allocation failures
> + * * <others> - Use is discouraged
It might take more work to meet this requirement. For example, even after
patch 4 I can still easily spot other error codes in the attach path:
intel_iommu_attach_device()
  iopf_for_domain_set()
    intel_iommu_enable_iopf():
        if (!info->pri_enabled)
                return -ENODEV;

intel_iommu_attach_device()
  dmar_domain_attach_device()
    domain_attach_iommu():
        curr = xa_cmpxchg(&domain->iommu_array, iommu->seq_id,
                          NULL, info, GFP_KERNEL);
        if (curr) {
                ret = xa_err(curr) ? : -EBUSY;
                goto err_clear;
        }

intel_iommu_attach_device()
  dmar_domain_attach_device()
    domain_setup_first_level()
      __domain_setup_first_level()
        intel_pasid_setup_first_level():
            pte = intel_pasid_get_entry(dev, pasid);
            if (!pte) {
                    spin_unlock(&iommu->lock);
                    return -ENODEV;
            }
            if (pasid_pte_is_present(pte)) {
                    spin_unlock(&iommu->lock);
                    return -EBUSY;
            }
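
To make the mismatch concrete, here is a minimal user-space sketch that
checks a return value against the codes the proposed @attach_dev kernel-doc
lists explicitly (0, -ENOMEM, -ENOSPC). The helper name
attach_ret_is_documented() is hypothetical and not part of the patch; it only
models the documented contract and is not driver code:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: returns true when @ret is one
 * of the values the proposed @attach_dev kernel-doc names explicitly.
 * Everything else lands in the "<others> - Use is discouraged" bucket.
 */
static bool attach_ret_is_documented(int ret)
{
	switch (ret) {
	case 0:
	case -ENOMEM:
	case -ENOSPC:
		return true;
	default:
		return false;
	}
}
```

Both the -ENODEV and the -EBUSY returns in the paths above would fall into
the default case of this check.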
On the other hand, how do we communicate errors returned by attach_dev in
the reset_done path back to userspace? As noted above, resource allocation
failures can still occur in attach_dev, but userspace may assume that an
attach requested in the middle of a reset has succeeded as long as it
passed the test_dev check.
Would it work better to block the attaching process while a reset is ongoing,
and wake it up on reset_done to resume the attach?
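
A rough sketch of that idea, modeled in user space with pthreads standing in
for kernel primitives (in-kernel this would more likely be a waitqueue or a
completion, which is an assumption on my side). All names here
(struct dev_reset_gate, gate_reset_prepare(), gate_reset_done(),
gate_attach(), fake_attach_enomem()) are hypothetical and only illustrate the
control flow. The point is that the attach caller sleeps until the reset
finishes and then gets the real attach return value, including -ENOMEM:

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-device gate; names are illustrative only. */
struct dev_reset_gate {
	pthread_mutex_t lock;
	pthread_cond_t reset_done;
	bool resetting;
};

#define DEV_RESET_GATE_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Called when a reset starts: later attach requests will block. */
static void gate_reset_prepare(struct dev_reset_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->resetting = true;
	pthread_mutex_unlock(&g->lock);
}

/* Called from the reset_done path: wake up every blocked attach request. */
static void gate_reset_done(struct dev_reset_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->resetting = false;
	pthread_cond_broadcast(&g->reset_done);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Sleep while a reset is in progress, then run the real attach and hand its
 * return value straight back to the caller, so nothing is reported as
 * "succeeded" before the attach has actually run.
 */
static int gate_attach(struct dev_reset_gate *g,
		       int (*do_attach)(void *), void *arg)
{
	int ret;

	pthread_mutex_lock(&g->lock);
	while (g->resetting)
		pthread_cond_wait(&g->reset_done, &g->lock);
	ret = do_attach(arg);
	pthread_mutex_unlock(&g->lock);
	return ret;
}

/* Hypothetical stand-in for an attach that hits an allocation failure. */
static int fake_attach_enomem(void *arg)
{
	(void)arg;
	return -ENOMEM;
}
```

With this shape, a caller of gate_attach(&gate, fake_attach_enomem, NULL)
sees -ENOMEM directly once the reset is over, instead of a success derived
only from the test_dev check.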