On 5/20/2025 5:16 AM, Duan, Zhenzhong wrote:
> -----Original Message-----
> From: Steven Sistare <[email protected]>
> Subject: Re: [PATCH V3 39/42] vfio/iommufd: reconstruct hwpt
>
>> On 5/18/2025 11:25 PM, Duan, Zhenzhong wrote:
>>> -----Original Message-----
>>> From: Steve Sistare <[email protected]>
>>> Subject: [PATCH V3 39/42] vfio/iommufd: reconstruct hwpt
>>>
>>>> Save the hwpt_id in vmstate.  In realize, skip its allocation from
>>>> iommufd_cdev_attach -> iommufd_cdev_attach_container ->
>>>> iommufd_cdev_autodomains_get.  Rebuild userland structures to hold
>>>> hwpt_id by calling iommufd_cdev_rebuild_hwpt at post load time.  This
>>>> depends on hw_caps, which was restored by the post_load call to
>>>> vfio_device_hiod_create_and_realize.
>>>>
>>>> Signed-off-by: Steve Sistare <[email protected]>
>>>> ---
>>>>  hw/vfio/cpr-iommufd.c      |  7 +++++++
>>>>  hw/vfio/iommufd.c          | 24 ++++++++++++++++++++++--
>>>>  hw/vfio/trace-events       |  1 +
>>>>  hw/vfio/vfio-iommufd.h     |  3 +++
>>>>  include/hw/vfio/vfio-cpr.h |  1 +
>>>>  5 files changed, 34 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c
>>>> index 24cdf10..6d3f4e0 100644
>>>> --- a/hw/vfio/cpr-iommufd.c
>>>> +++ b/hw/vfio/cpr-iommufd.c
>>>> @@ -110,6 +110,12 @@ static int vfio_device_post_load(void *opaque, int version_id)
>>>>          error_report_err(err);
>>>>          return false;
>>>>      }
>>>> +    if (!vbasedev->mdev) {
>>>> +        VFIOIOMMUFDContainer *container = container_of(vbasedev->bcontainer,
>>>> +                                                       VFIOIOMMUFDContainer,
>>>> +                                                       bcontainer);
>>>> +        iommufd_cdev_rebuild_hwpt(vbasedev, container);
>>>> +    }
>>>>      return true;
>>>>  }
>>>> @@ -121,6 +127,7 @@ static const VMStateDescription vfio_device_vmstate = {
>>>>      .needed = cpr_needed_for_reuse,
>>>>      .fields = (VMStateField[]) {
>>>>          VMSTATE_INT32(devid, VFIODevice),
>>>> +        VMSTATE_UINT32(cpr.hwpt_id, VFIODevice),
>>>>          VMSTATE_END_OF_LIST()
>>>>      }
>>>>  };
>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>> index d980684..ec79c83 100644
>>>> --- a/hw/vfio/iommufd.c
>>>> +++ b/hw/vfio/iommufd.c
>>>> @@ -318,6 +318,7 @@ static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp)
>>>>  static void iommufd_cdev_use_hwpt(VFIODevice *vbasedev, VFIOIOASHwpt *hwpt)
>>>>  {
>>>>      vbasedev->hwpt = hwpt;
>>>> +    vbasedev->cpr.hwpt_id = hwpt->hwpt_id;
>>>>      vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt);
>>>>      QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
>>>>  }
>>>> @@ -373,6 +374,23 @@ static bool iommufd_cdev_make_hwpt(VFIODevice *vbasedev,
>>>>      return true;
>>>>  }
>>>> +void iommufd_cdev_rebuild_hwpt(VFIODevice *vbasedev,
>>>> +                               VFIOIOMMUFDContainer *container)
>>>> +{
>>>> +    VFIOIOASHwpt *hwpt;
>>>> +    int hwpt_id = vbasedev->cpr.hwpt_id;
>>>> +
>>>> +    trace_iommufd_cdev_rebuild_hwpt(container->be->fd, hwpt_id);
>>>> +
>>>> +    QLIST_FOREACH(hwpt, &container->hwpt_list, next) {
>>>> +        if (hwpt->hwpt_id == hwpt_id) {
>>>> +            iommufd_cdev_use_hwpt(vbasedev, hwpt);
>>>> +            return;
>>>> +        }
>>>> +    }
>>>> +    iommufd_cdev_make_hwpt(vbasedev, container, hwpt_id, false, NULL);
>>>> +}
>>>> +
>>>>  static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>>>>                                           VFIOIOMMUFDContainer *container,
>>>>                                           Error **errp)
>>>> @@ -567,7 +585,8 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
>>>>              vbasedev->iommufd != container->be) {
>>>>              continue;
>>>>          }
>>>> -        if (!iommufd_cdev_attach_container(vbasedev, container, &err)) {
>>>> +        if (!vbasedev->cpr.reused &&
>>>> +            !iommufd_cdev_attach_container(vbasedev, container, &err)) {
>>>>              const char *msg = error_get_pretty(err);
>>>>              trace_iommufd_cdev_fail_attach_existing_container(msg);
>>>> @@ -605,7 +624,8 @@ skip_ioas_alloc:
>>>>      bcontainer = &container->bcontainer;
>>>>      vfio_address_space_insert(space, bcontainer);
>>>> -    if (!iommufd_cdev_attach_container(vbasedev, container, errp)) {
>>>> +    if (!vbasedev->cpr.reused &&
>>>> +        !iommufd_cdev_attach_container(vbasedev, container, errp)) {
>>>
>>> All container attaching is bypassed in the new qemu.
>>> I have a concern that the new qemu doesn't generate the same containers as
>>> the old qemu if there is more than one container in the old qemu.  Then
>>> devices can be attached to the wrong container, or attaching can fail in
>>> post load.
>>
>> Yes, this relates to our discussion in patch 35.  Please explain, how can a
>> single iommufd backend have multiple containers?
>
> Similar to the legacy container, there can be multiple containers in one
> address space.  If an existing mapping in one container conflicts with a new
> device's reserved region, attaching to that container will fail and a new
> container needs to be created to accept the new device's reserved region.
>
> Maybe you need to do the same thing you do for the legacy container, e.g.,
> save the ioas_id just like you save container->fd, then check for an existing
> ioas_id and restore the iommufd container based on that.
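[To illustrate the point above, here is a minimal self-contained sketch (plain C, not the actual QEMU code) of how one address space can end up with several containers: attach to each existing container is tried in turn, and a conflict between a device's reserved region and a container's existing mappings forces creation of a new container.  Container, Device, try_attach() and attach_device() are simplified stand-ins, not real QEMU symbols.]

#include <stdbool.h>
#include <stdio.h>

#define MAX_CONTAINERS 8

/* One mapped IOVA range stands in for a container's existing mappings. */
typedef struct {
    unsigned long map_start, map_end;
    bool in_use;
} Container;

/* A device with a single reserved IOVA region. */
typedef struct {
    const char *name;
    unsigned long resv_start, resv_end;
} Device;

static Container containers[MAX_CONTAINERS];

/* Attach succeeds only if the reserved region misses the existing mapping. */
static bool try_attach(const Container *c, const Device *d)
{
    return d->resv_end < c->map_start || d->resv_start > c->map_end;
}

/* Reuse a compatible container; on conflict, create a new one. */
static int attach_device(const Device *d, unsigned long start, unsigned long end)
{
    for (int i = 0; i < MAX_CONTAINERS; i++) {
        if (containers[i].in_use && try_attach(&containers[i], d)) {
            return i;                      /* existing container is compatible */
        }
    }
    for (int i = 0; i < MAX_CONTAINERS; i++) {
        if (!containers[i].in_use) {       /* conflict: allocate a new container */
            containers[i] = (Container){ start, end, true };
            return i;
        }
    }
    return -1;
}

int main(void)
{
    /* First device: container 0 maps guest RAM at [0x0, 0xbfffffff]. */
    Device d0 = { "dev0", 0xfee00000UL, 0xfeefffffUL };
    int c0 = attach_device(&d0, 0x0UL, 0xbfffffffUL);

    /*
     * Second device reserves a range inside container 0's mapping, so it
     * cannot share container 0 and gets a new container.
     */
    Device d1 = { "dev1", 0x80000000UL, 0x8fffffffUL };
    int c1 = attach_device(&d1, 0x0UL, 0x7fffffffUL);

    printf("%s -> container %d, %s -> container %d\n", d0.name, c0, d1.name, c1);
    return 0;
}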
Thanks, now I understand.  iommufd_cdev_attach calls iommufd_cdev_attach_container
-> iommufd_cdev_attach_ioas_hwpt(container->ioas_id) until it finds a container
that works, or creates a new container with a new ioas_id.

To fix this, I need to record each device's ioas_id in cpr-state, so it is
available when vfio_realize -> iommufd_cdev_attach is called.  Saving it in
vmstate as I do now and recovering it in a post_load handler is too late.

- Steve
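[A rough sketch of that fix follows (plain C, not the actual QEMU CPR code).  The idea is to persist each device's ioas_id keyed by device name while the old QEMU still runs, so the new QEMU's realize-time attach path can look it up before choosing or creating a container.  save_ioas_id(), find_ioas_id() and the fixed-size table are hypothetical stand-ins for QEMU's real cpr-state storage.]

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVICES 16

typedef struct {
    char name[64];      /* plays the role of vbasedev->name */
    uint32_t ioas_id;   /* IOAS the device was attached to before CPR */
} IoasRecord;

static IoasRecord cpr_ioas_state[MAX_DEVICES];
static int cpr_ioas_count;

/* Save side: record the ioas_id once per device while the old QEMU runs. */
static void save_ioas_id(const char *name, uint32_t ioas_id)
{
    if (cpr_ioas_count < MAX_DEVICES) {
        snprintf(cpr_ioas_state[cpr_ioas_count].name,
                 sizeof(cpr_ioas_state[cpr_ioas_count].name), "%s", name);
        cpr_ioas_state[cpr_ioas_count].ioas_id = ioas_id;
        cpr_ioas_count++;
    }
}

/* Load side: called from the attach path; returns 0 if no record exists. */
static uint32_t find_ioas_id(const char *name)
{
    for (int i = 0; i < cpr_ioas_count; i++) {
        if (strcmp(cpr_ioas_state[i].name, name) == 0) {
            return cpr_ioas_state[i].ioas_id;
        }
    }
    return 0;
}

int main(void)
{
    /* Old QEMU: two devices were attached to two different IOASes. */
    save_ioas_id("0000:3b:00.0", 5);
    save_ioas_id("0000:af:00.0", 7);

    /* New QEMU: realize looks up the saved IOAS instead of probing containers. */
    printf("0000:3b:00.0 -> ioas %u\n", find_ioas_id("0000:3b:00.0"));
    printf("0000:af:00.0 -> ioas %u\n", find_ioas_id("0000:af:00.0"));
    return 0;
}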
