On 2025/11/26 13:45, Duan, Zhenzhong wrote:
-----Original Message-----
From: Liu, Yi L <[email protected]>
Subject: Re: [PATCH v5 7/9] vfio/listener: Construct iotlb entry when unmap
memory address space
On 2025/11/6 12:20, Zhenzhong Duan wrote:
If a VFIO device in guest switches from passthrough(PT) domain to block
domain, the whole memory address space is unmapped, but we passed a
NULL
iotlb entry to unmap_bitmap, then bitmap query didn't happen and we lost
dirty pages.
this is a good catch. :) Have you observed problem in testing or just
identified it with patch iteration?
Patch iteration.
By constructing an iotlb entry with iova = gpa for unmap_bitmap, it can
set dirty bits correctly.
For IOMMU address space, we still send NULL iotlb because VFIO don't
know the actual mappings in guest. It's vIOMMU's responsibility to send
actual unmapping notifications, e.g.,
vtd_address_space_unmap_in_migration()
Signed-off-by: Zhenzhong Duan <[email protected]>
Tested-by: Giovannio Cabiddu <[email protected]>
---
hw/vfio/listener.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index 2109101158..3b48f6796c 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -713,14 +713,27 @@ static void
vfio_listener_region_del(MemoryListener *listener,
if (try_unmap) {
bool unmap_all = false;
+ IOMMUTLBEntry entry = {}, *iotlb = NULL;
if (int128_eq(llsize, int128_2_64())) {
assert(!iova);
unmap_all = true;
llsize = int128_zero();
}
+
+ /*
+ * Fake an IOTLB entry for identity mapping which is needed by
dirty
+ * tracking. In fact, in unmap_bitmap, only translated_addr field
is
+ * used to set dirty bitmap.
Just say sync dirty is needed per unmap. So you may add a check
in_migration as well. If not in migration, it is no needed to do it.
Dirty tracking is not only for migration, but also ditry rate/limit. So a
non-null iotlb
is always passed for ram MR. That iotlb pointer is used only when
vfio_container_dirty_tracking_is_started() return true.
I can add a check on global_dirty_tracking if you prefer to add a check.
yeah, this would be helpful.
+ */
+ if (!memory_region_is_iommu(section->mr)) {
+ entry.iova = iova;
+ entry.translated_addr = iova;
+ iotlb = &entry;
+ }
+
While, I'm still wondering how to deal with iommu MR case. Let's see a
scenario first. When switching from DMA domain to PT, QEMU will switch
to PT. This shall trigger the vfio_listener_region_del() and unregister
the iommu notifier. This means vIOMMU side needs to do unmap prior to
switching AS. If not, the iommu notifier is gone when vIOMMU wants to
unmap with an IOTLBEvent. For virtual intel_iommu, it is calling
vtd_address_space_unmap_in_migration() prior to calling
vtd_switch_address_space(). So I think you need to tweak the intel_iommu
a bit to suit the order requirement. :)
VTD doesn't support switching from DMA domain to PT atomically, so switches
to block domain in between, see intel_iommu_attach_device() in kernel.
So with this sequence is DMA->block->PT domain, we have guaranteed the order
you shared? See vtd_pasid_cache_sync_locked().
I see. So guest helps it. This might be a bit weak since we rely on
guest behavior. I think you may add a TODO or add comment to note it.
BTW. I think the subject can be refined since the real purpose is to
make tracking dirty pages in the unmap happen in region_del.
vfio/listener: Add missing dirty tracking in region_del
with this and the prior check, this patch looks good to me.
Reviewed-by: Yi Liu <[email protected]>
BTW. should the iommu MRs even go to this try_unmap branch? I think for
such MRs, it relies on the vIOMMU to unmap explicitly (hence trigger the
vfio_iommu_map_notify()).
Yes, it's unnecessary, but it's hard for VFIO to distinguish if try_unmap is
due to
domain switch or a real unmap. I think it's harmless because the second
try_unmap
unmaps nothing.
hmmm. can a unmap path go to region_del()? Not quite get the second
try_unmap, do you mean when vIOMMU unmaps everything via
vfio_iommu_map_notify() and then switch AS which triggers the region_del
and this try_unmap branch?
Regards,
Yi Liu