Hi Jason,
Thanks for reviewing this patch.
On 1/28/26 12:46 AM, Jason Gunthorpe wrote:
On Tue, Jan 27, 2026 at 06:35:56PM +0000, Shivaprasad G Bhat wrote:
The RFC attempts to implement the IOMMUFD support on PPC64 by
adding new iommu_ops for paging domain. The existing platform
domain continues to be the default domain for in-kernel use.
It would be nice to see the platform domain go away and ppc use the
normal dma-iommu.c stuff, but I don't think it is critical to making
it work with iommufd.
I agree. I have started on this. I will send incremental changes
as follow-up after this.
On PPC64, IOVA ranges are based on the type of the DMA window
and their properties. Currently, there is no way to expose the
attributes of the non-default 64-bit DMA window, which the platform
supports. The platform allows the operating system to select the
starting offset (4GiB, or the default 512PiB), page size and
window size for the non-default 64-bit DMA window. For example,
with VFIO, this is handled via VFIO_IOMMU_SPAPR_TCE_GET_INFO
and VFIO_IOMMU_SPAPR_TCE_CREATE|REMOVE ioctls. While I am exploring
the ways to expose and configure these DMA window attributes as
per user input, any suggestions in this regard will be very helpful.
You can pass in driver specific information during HWPT creation, so
any properties you need can be specified there.
Sure. I think IOMMU_GET_HW_INFO would be useful for getting the
platform supported configuration in this case.
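
To make this concrete, here is a rough userspace sketch of what I have in mind: the platform reports what it supports, userspace passes the window attributes at HWPT creation, and the driver validates them. All struct, field and function names below are hypothetical and do not exist in the uAPI today; only the 4GiB and 512PiB start offsets come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: what the platform could report through IOMMU_GET_HW_INFO. */
struct spapr_win_caps {
	uint64_t pgsize_bitmap;		/* supported IOMMU page sizes, one bit per size */
	uint64_t max_window_size;	/* largest 64-bit window allowed, in bytes */
};

/* Hypothetical: driver specific data userspace could pass at HWPT creation. */
struct spapr_win_req {
	uint64_t start;			/* window start offset */
	uint32_t page_shift;		/* log2 of the IOMMU page size */
	uint64_t size;			/* window size in bytes */
};

#define SPAPR_WIN_START_4G	(1ULL << 32)	/* 4GiB */
#define SPAPR_WIN_START_512P	(1ULL << 59)	/* 512PiB, the default offset */

/* Check a requested window against the reported capabilities. */
static bool spapr_win_req_valid(const struct spapr_win_caps *caps,
				const struct spapr_win_req *req)
{
	uint64_t pgsize;

	/* Only the two start offsets mentioned above are accepted. */
	if (req->start != SPAPR_WIN_START_4G &&
	    req->start != SPAPR_WIN_START_512P)
		return false;

	if (req->page_shift >= 64)
		return false;
	pgsize = 1ULL << req->page_shift;
	if (!(caps->pgsize_bitmap & pgsize))
		return false;

	/* Size must be non-zero, a multiple of the page size, and in range. */
	if (!req->size || (req->size & (pgsize - 1)))
		return false;
	if (req->size > caps->max_window_size)
		return false;

	/* The window must not wrap around the end of the IOVA space. */
	return req->start + req->size > req->start;
}
```

Whether the capabilities are better reported per device or per group is left open in this sketch.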
Then you'd want to introduce a new domain op to get the apertures
instead of the single range hard coded into the domain struct. The new
op would be able to return a list. We can use this op to return
apertures for sign extension page tables too.
Update iommufd to calculate the reserved regions by evaluating the
whole list.
I think you'll find this pretty straightforward, I'd do it as a
followup patch to this one.
Thanks. I will wait for that patch.
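
For my own understanding, this is how I read the reserved region calculation over an aperture list. It is only a userspace sketch: struct iova_range and apertures_to_reserved() are stand-in names I made up, and I am assuming the op returns the apertures sorted by start and non-overlapping.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one usable IOVA range, with inclusive bounds. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Walk a sorted, non-overlapping aperture list and write every hole
 * between (and around) the apertures into @resv.
 *
 * Returns the number of holes written, or -1 if the list is malformed
 * or @resv is too small.
 */
static int apertures_to_reserved(const struct iova_range *apt, size_t n,
				 struct iova_range *resv, size_t max_resv)
{
	uint64_t next = 0;	/* lowest IOVA not yet accounted for */
	int covered_top = 0;	/* set once an aperture ends at UINT64_MAX */
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		/* Reject unsorted, overlapping or inverted entries. */
		if (covered_top || apt[i].start < next ||
		    apt[i].last < apt[i].start)
			return -1;

		/* Everything between the previous aperture and this one is reserved. */
		if (apt[i].start > next) {
			if (out == max_resv)
				return -1;
			resv[out].start = next;
			resv[out].last = apt[i].start - 1;
			out++;
		}

		if (apt[i].last == UINT64_MAX)
			covered_top = 1;
		else
			next = apt[i].last + 1;
	}

	/* The tail above the last aperture is reserved as well. */
	if (!covered_top) {
		if (out == max_resv)
			return -1;
		resv[out].start = next;
		resv[out].last = UINT64_MAX;
		out++;
	}
	return (int)out;
}
```

With one aperture per DMA window, the holes below, between and above the windows all come out as reserved regions, and an empty list reserves the whole space.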
The existing type1-specific vfio-compat driver will not work for
PPC64 even with this patch. I believe we need a separate
"vfio-spapr-compat" driver to make it work.
Yes, vfio-compat doesn't support the special spapr ioctls.
I don't think you need a new driver, just implement whatever they do
with the existing interfaces, probably in its own .c file though.
There are ioctl number conflicts like
# grep -n "VFIO_BASE + 1[89]" include/uapi/linux/vfio.h | grep define
940:#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 18)
976:#define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 19)
1833:#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
1856:#define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
# grep -n "VFIO_BASE + 20" include/uapi/linux/vfio.h | grep define
999:#define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20)
1870:#define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
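
A standalone userspace check shows that each pair encodes to the same command number. The six defines are copied from the grep output above, and I am assuming the VFIO_TYPE and VFIO_BASE values from include/uapi/linux/vfio.h; vfio_spapr_cdev_collisions() is just a helper name made up for this example.

```c
#include <stddef.h>
#include <sys/ioctl.h>	/* _IO() */

/* Assumed to match include/uapi/linux/vfio.h. */
#define VFIO_TYPE	(';')
#define VFIO_BASE	100

#define VFIO_DEVICE_BIND_IOMMUFD		_IO(VFIO_TYPE, VFIO_BASE + 18)
#define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 19)
#define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)

#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY	_IO(VFIO_TYPE, VFIO_BASE + 18)
#define VFIO_IOMMU_SPAPR_TCE_CREATE		_IO(VFIO_TYPE, VFIO_BASE + 19)
#define VFIO_IOMMU_SPAPR_TCE_REMOVE		_IO(VFIO_TYPE, VFIO_BASE + 20)

/* The build fails here if any pair ever stops colliding. */
_Static_assert(VFIO_DEVICE_BIND_IOMMUFD == VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY,
	       "VFIO_BASE + 18 is used twice");
_Static_assert(VFIO_DEVICE_ATTACH_IOMMUFD_PT == VFIO_IOMMU_SPAPR_TCE_CREATE,
	       "VFIO_BASE + 19 is used twice");
_Static_assert(VFIO_DEVICE_DETACH_IOMMUFD_PT == VFIO_IOMMU_SPAPR_TCE_REMOVE,
	       "VFIO_BASE + 20 is used twice");

/* Count how many cdev/spapr pairs share one command number. */
static int vfio_spapr_cdev_collisions(void)
{
	static const struct {
		unsigned long cdev;
		unsigned long spapr;
	} pairs[] = {
		{ VFIO_DEVICE_BIND_IOMMUFD, VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY },
		{ VFIO_DEVICE_ATTACH_IOMMUFD_PT, VFIO_IOMMU_SPAPR_TCE_CREATE },
		{ VFIO_DEVICE_DETACH_IOMMUFD_PT, VFIO_IOMMU_SPAPR_TCE_REMOVE },
	};
	int n = 0;

	for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++)
		if (pairs[i].cdev == pairs[i].spapr)
			n++;
	return n;
}
```

Since all six commands are plain _IO() with no size or direction encoded, the command number alone cannot tell the two sets apart.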
However, I have no idea what is required to implement those ops, or if
it is even possible.. It may be easier to just leave the old vfio
stuff around instead of trying to compat it. The purpose of compat was
to be able to build kernels without type1 at all. It isn't necessary
to start using iommufd in new apps with the new interfaces.
Given you are mainly looking at a VMM that already will have iommufd
support it may not be worthwhile.
You are right. We do have some use cases beyond the VMM; I will
consider a compat driver only if it is helpful there.
@@ -1201,7 +1201,15 @@ spapr_tce_blocked_iommu_attach_dev(struct iommu_domain *platform_domain,
* also sets the dma_api ops
*/
table_group = iommu_group_get_iommudata(grp);
+
+ if (old && old->type == IOMMU_DOMAIN_DMA) {
I'm trying to delete IOMMU_DOMAIN_DMA please don't use it in
drivers.
Sure.
static const struct iommu_ops spapr_tce_iommu_ops = {
.default_domain = &spapr_tce_platform_domain,
.blocked_domain = &spapr_tce_blocked_domain,
@@ -1267,6 +1436,14 @@ static const struct iommu_ops spapr_tce_iommu_ops = {
.probe_device = spapr_tce_iommu_probe_device,
.release_device = spapr_tce_iommu_release_device,
.device_group = spapr_tce_iommu_device_group,
+ .domain_alloc_paging = spapr_tce_domain_alloc_paging,
+ .default_domain_ops = &(const struct iommu_domain_ops) {
+ .attach_dev = spapr_tce_iommu_attach_device,
+ .map_pages = spapr_tce_iommu_map_pages,
+ .unmap_pages = spapr_tce_iommu_unmap_pages,
+ .iova_to_phys = spapr_tce_iommu_iova_to_phys,
+ .free = spapr_tce_domain_free,
+ }
Please don't use default_domain_ops in a driver that is supporting
multiple domain types and platform, it becomes confusing to guess
which domain type those ops are linked to.
Sure.
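
If I understand the suggestion correctly, the next version would carry a named ops table for the paging domain and set domain->ops when the domain is allocated. Below is a userspace mock of that shape. The struct definitions are minimal stand-ins for the kernel types so that the sketch compiles on its own, the callbacks are empty stubs with simplified signatures, and spapr_tce_paging_domain_ops is a name I am assuming for the new table.

```c
#include <stdlib.h>

/* Minimal stand-ins for the kernel types; not the real definitions. */
struct device;
struct iommu_domain;

struct iommu_domain_ops {
	int (*attach_dev)(struct iommu_domain *domain, struct device *dev);
	void (*free)(struct iommu_domain *domain);
};

struct iommu_domain {
	const struct iommu_domain_ops *ops;
};

/* Stubs standing in for the driver callbacks from the patch. */
static int spapr_tce_iommu_attach_device(struct iommu_domain *domain,
					 struct device *dev)
{
	(void)domain;
	(void)dev;
	return 0;
}

static void spapr_tce_domain_free(struct iommu_domain *domain)
{
	free(domain);
}

/* Named table: these ops belong to paging domains and nothing else. */
static const struct iommu_domain_ops spapr_tce_paging_domain_ops = {
	.attach_dev = spapr_tce_iommu_attach_device,
	.free = spapr_tce_domain_free,
};

/* The allocation path ties the domain to its ops explicitly. */
static struct iommu_domain *spapr_tce_domain_alloc_paging(struct device *dev)
{
	struct iommu_domain *domain = calloc(1, sizeof(*domain));

	(void)dev;
	if (!domain)
		return NULL;
	domain->ops = &spapr_tce_paging_domain_ops;
	return domain;
}
```

The platform and blocked domains would keep their own ops tables, so it is always clear which table belongs to which domain type.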
You should also implement the BLOCKING domain type to make VFIO work
better
I am not sure how this would make VFIO work better. Maybe I am not able
to see the advantages with the current platform domain approach
in place. Could you please elaborate on this?
I wouldn't try to guess if this is right or not, but it looks pretty
reasonable as a first start.
Thanks, I will keep iterating on this as an RFC until it gets into
reasonable shape.
Regards,
Shivaprasad
Jason