> -----Original Message-----
> From: Liu, Yi L
> Sent: Monday, March 27, 2017 5:22 PM
> To: Peter Xu <[email protected]>
> Cc: [email protected]; Lan, Tianyu <[email protected]>;
> Tian, Kevin <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; David Gibson
> <[email protected]>; [email protected]
> Subject: RE: [Qemu-devel] [PATCH v7 14/17] memory: add
> MemoryRegionIOMMUOps.replay() callback
>
> > -----Original Message-----
> > From: Peter Xu [mailto:[email protected]]
> > Sent: Monday, March 27, 2017 5:12 PM
> > To: Liu, Yi L <[email protected]>
> > Cc: [email protected]; Lan, Tianyu <[email protected]>;
> > Tian, Kevin <[email protected]>; [email protected];
> > [email protected]; [email protected]; [email protected]; David
> > Gibson <[email protected]>; [email protected]
> > Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
> > MemoryRegionIOMMUOps.replay() callback
> >
> > On Mon, Mar 27, 2017 at 08:35:05AM +0000, Liu, Yi L wrote:
> > > > -----Original Message-----
> > > > From: Qemu-devel
> > > > [mailto:[email protected]] On
> > > > Behalf Of Peter Xu
> > > > Sent: Tuesday, February 7, 2017 4:28 PM
> > > > To: [email protected]
> > > > Cc: Lan, Tianyu <[email protected]>; Tian, Kevin
> > > > <[email protected]>; [email protected]; [email protected];
> > > > [email protected]; [email protected];
> > > > [email protected]; [email protected]; David Gibson
> > > > <[email protected]>
> > > > Subject: [Qemu-devel] [PATCH v7 14/17] memory: add
> > > > MemoryRegionIOMMUOps.replay() callback
> > > >
> > > > Originally we have one memory_region_iommu_replay() function,
> > > > which is the default behavior to replay the translations of the
> > > > whole IOMMU region. However, on some platform like x86, we may
> > > > want our own replay logic for IOMMU regions.
> > > > This patch add one more hook for IOMMUOps for the callback, and
> > > > it'll override the default if set.
> > > >
> > > > Signed-off-by: Peter Xu <[email protected]>
> > > > ---
> > > >  include/exec/memory.h | 2 ++
> > > >  memory.c              | 6 ++++++
> > > >  2 files changed, 8 insertions(+)
> > > >
> > > > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > > > index 0767888..30b2a74 100644
> > > > --- a/include/exec/memory.h
> > > > +++ b/include/exec/memory.h
> > > > @@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
> > > >      void (*notify_flag_changed)(MemoryRegion *iommu,
> > > >                                  IOMMUNotifierFlag old_flags,
> > > >                                  IOMMUNotifierFlag new_flags);
> > > > +    /* Set this up to provide customized IOMMU replay function */
> > > > +    void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
> > > >  };
> > > >
> > > >  typedef struct CoalescedMemoryRange CoalescedMemoryRange;
> > > > diff --git a/memory.c b/memory.c
> > > > index 7a4f2f9..9c253cc 100644
> > > > --- a/memory.c
> > > > +++ b/memory.c
> > > > @@ -1630,6 +1630,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> > > >      hwaddr addr, granularity;
> > > >      IOMMUTLBEntry iotlb;
> > > > +    /* If the IOMMU has its own replay callback, override */
> > > > +    if (mr->iommu_ops->replay) {
> > > > +        mr->iommu_ops->replay(mr, n);
> > > > +        return;
> > > > +    }
> > >
> > > Hi Alex, Peter,
> > >
> > > Will all the other vendors(e.g. PPC, s390, ARM) add their own replay
> > > callback as well? I guess it depends on whether the original replay
> > > algorithm work well for them? Do you have such knowledge?
> >
> > I guess so. At least for VT-d we had this callback since the default
> > replay mechanism did not work well on x86 due to its extremely large
> > memory region size. Thanks,
>
> thx. that would make sense.
Peter,

It just came to mind that there may be a corner case here. Intel VT-d
has a "pt" (pass-through) mode, which allows a device to use physical
addresses even when VT-d is enabled. In the kernel there is an
iommu_identity_mapping. If a device is in this map, it uses "pt" mode,
so the IOMMU driver does not build a second-level page table for it.

Back to the virtual IOVA implementation: if an assigned device is in
the iommu_identity_mapping (e.g. a VGA controller), it uses GPAs
directly for DMA, so it needs a GPA->HPA mapping in the host. However,
the iommu_ops->replay callback cannot build that mapping when the guest
SL page table is empty. So I think it would be better to build the
entire guest PA->HPA mapping before the guest kernel boots.

Any thoughts?

Regards,
Yi L
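
P.S. To make the corner case concrete, here is a minimal, self-contained
sketch in plain C. It is a toy model, not QEMU code: ToyIommu, Notifier,
toy_page_walk_replay() and toy_replay() are hypothetical names made up
for illustration, and the pass-through check is only one assumed way of
handling the case. It shows that a replay which only walks the SL page
table emits nothing when the table is empty, while a pass-through-aware
replay can emit one identity mapping covering guest RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a notifier; NOT the QEMU IOMMUNotifier. */
typedef struct Notifier {
    size_t map_events;      /* number of MAP notifications received */
    uint64_t mapped_bytes;  /* total size covered by those notifications */
} Notifier;

/* Hypothetical stand-in for an IOMMU region; NOT a QEMU type. */
typedef struct ToyIommu ToyIommu;
struct ToyIommu {
    bool pass_through;        /* device uses "pt" (identity) mode */
    size_t sl_entries;        /* valid second-level page table entries */
    uint64_t page_size;       /* size covered by one SL entry */
    uint64_t guest_ram_size;  /* size of guest RAM to identity-map */
    void (*replay)(ToyIommu *iommu, Notifier *n);  /* optional override */
};

static void notify_map(Notifier *n, uint64_t size)
{
    n->map_events++;
    n->mapped_bytes += size;
}

/* Vendor-style replay: reports only what the SL page table contains. */
static void toy_page_walk_replay(ToyIommu *iommu, Notifier *n)
{
    for (size_t i = 0; i < iommu->sl_entries; i++) {
        notify_map(n, iommu->page_size);
    }
}

/* Replay entry point with an assumed pass-through special case. */
static void toy_replay(ToyIommu *iommu, Notifier *n)
{
    assert(iommu != NULL && n != NULL);

    if (iommu->pass_through) {
        /* Assumption: in "pt" mode GPA == IOVA, so map all guest RAM. */
        notify_map(n, iommu->guest_ram_size);
        return;
    }
    /* Same shape as the patch: a custom callback overrides the default. */
    if (iommu->replay) {
        iommu->replay(iommu, n);
        return;
    }
    /* The default whole-region replay is left out of this toy model. */
}
```

Calling toy_page_walk_replay() directly on a pass-through device with
sl_entries == 0 leaves the notifier untouched, which is the problem
described above. Calling toy_replay() on the same device produces a
single mapping of guest_ram_size bytes.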
