>-----Original Message-----
>From: Liu, Yi L <[email protected]>
>Subject: Re: [PATCH v5 7/9] vfio/listener: Construct iotlb entry when unmap
>memory address space
>
>On 2025/11/26 13:45, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Liu, Yi L <[email protected]>
>>> Subject: Re: [PATCH v5 7/9] vfio/listener: Construct iotlb entry when
>>> unmap memory address space
>>>
>>> On 2025/11/6 12:20, Zhenzhong Duan wrote:
>>>> If a VFIO device in guest switches from passthrough(PT) domain to block
>>>> domain, the whole memory address space is unmapped, but we passed a NULL
>>>> iotlb entry to unmap_bitmap, then the bitmap query didn't happen and we
>>>> lost dirty pages.
>>>
>>> this is a good catch. :) Have you observed the problem in testing or just
>>> identified it with patch iteration?
>>
>> Patch iteration.
>>
>>>
>>>> By constructing an iotlb entry with iova = gpa for unmap_bitmap, it can
>>>> set dirty bits correctly.
>>>>
>>>> For IOMMU address space, we still send a NULL iotlb because VFIO doesn't
>>>> know the actual mappings in guest.
>>>> It's the vIOMMU's responsibility to send
>>>> actual unmapping notifications, e.g.,
>>>> vtd_address_space_unmap_in_migration().
>>>>
>>>> Signed-off-by: Zhenzhong Duan <[email protected]>
>>>> Tested-by: Giovannio Cabiddu <[email protected]>
>>>> ---
>>>>  hw/vfio/listener.c | 15 ++++++++++++++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
>>>> index 2109101158..3b48f6796c 100644
>>>> --- a/hw/vfio/listener.c
>>>> +++ b/hw/vfio/listener.c
>>>> @@ -713,14 +713,27 @@ static void vfio_listener_region_del(MemoryListener *listener,
>>>>
>>>>      if (try_unmap) {
>>>>          bool unmap_all = false;
>>>> +        IOMMUTLBEntry entry = {}, *iotlb = NULL;
>>>>
>>>>          if (int128_eq(llsize, int128_2_64())) {
>>>>              assert(!iova);
>>>>              unmap_all = true;
>>>>              llsize = int128_zero();
>>>>          }
>>>> +
>>>> +        /*
>>>> +         * Fake an IOTLB entry for identity mapping which is needed by
>>>> +         * dirty tracking. In fact, in unmap_bitmap, only the
>>>> +         * translated_addr field is used to set the dirty bitmap.
>>>
>>> Just say sync dirty is needed per unmap. So you may add a check on
>>> in_migration as well. If not in migration, it is not needed to do it.
>>
>> Dirty tracking is not only for migration, but also for dirty rate/limit. So
>> a non-null iotlb is always passed for a ram MR. That iotlb pointer is used
>> only when vfio_container_dirty_tracking_is_started() returns true.
>>
>> I can add a check on global_dirty_tracking if you prefer to add a check.
>
>yeah, this would be helpful.
>
>>
>>>
>>>> +         */
>>>> +        if (!memory_region_is_iommu(section->mr)) {
>>>> +            entry.iova = iova;
>>>> +            entry.translated_addr = iova;
>>>> +            iotlb = &entry;
>>>> +        }
>>>> +
>>>
>>> While, I'm still wondering how to deal with the iommu MR case. Let's see a
>>> scenario first. When switching from DMA domain to PT, QEMU will switch
>>> to PT. This shall trigger vfio_listener_region_del() and unregister
>>> the iommu notifier.
>>> This means the vIOMMU side needs to do the unmap prior to
>>> switching AS. If not, the iommu notifier is gone when the vIOMMU wants to
>>> unmap with an IOTLBEvent. For virtual intel_iommu, it is calling
>>> vtd_address_space_unmap_in_migration() prior to calling
>>> vtd_switch_address_space(). So I think you need to tweak intel_iommu
>>> a bit to suit the order requirement. :)
>>
>> VTD doesn't support switching from DMA domain to PT atomically, so it
>> switches to block domain in between, see intel_iommu_attach_device() in
>> the kernel.
>>
>> So with this sequence of DMA->block->PT domain, have we guaranteed the
>> order you shared? See vtd_pasid_cache_sync_locked().
>
>I see. So the guest helps it. This might be a bit weak since we rely on
>guest behavior. I think you may add a TODO or a comment to note it.
Make sense, will add.

>BTW. I think the subject can be refined since the real purpose is to
>make tracking dirty pages in the unmap happen in region_del.
>
>vfio/listener: Add missing dirty tracking in region_del

Will do.

>with this and the prior check, this patch looks good to me.
>
>Reviewed-by: Yi Liu <[email protected]>
>
>>>
>>> BTW. should the iommu MRs even go to this try_unmap branch? I think for
>>> such MRs, it relies on the vIOMMU to unmap explicitly (hence triggering
>>> vfio_iommu_map_notify()).
>>
>> Yes, it's unnecessary, but it's hard for VFIO to distinguish whether
>> try_unmap is due to a domain switch or a real unmap. I think it's harmless
>> because the second try_unmap unmaps nothing.
>
>hmmm. can an unmap path go to region_del()? Not quite getting the second
>try_unmap, do you mean when the vIOMMU unmaps everything via
>vfio_iommu_map_notify() and then switches AS, which triggers region_del
>and this try_unmap branch?

Yes, vfio_iommu_map_notify() is the first try_unmap; the unmap in
region_del is the second try_unmap.

Thanks
Zhenzhong
