On Wed, Mar 18, 2020 at 12:42:25PM +0100, Auger Eric wrote:
> Hi Jean,
>
> On 3/18/20 12:20 PM, Bharat Bhushan wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jean-Philippe Brucker <[email protected]>
> >> Sent: Wednesday, March 18, 2020 4:48 PM
> >> To: Bharat Bhushan <[email protected]>
> >> Cc: Auger Eric <[email protected]>; Peter Maydell
> >> <[email protected]>; [email protected]; Tomasz Nowicki [C]
> >> <[email protected]>; [email protected]; [email protected];
> >> [email protected]; [email protected]; [email protected];
> >> [email protected]; Bharat Bhushan <[email protected]>;
> >> [email protected]; [email protected]
> >> Subject: [EXT] Re: [PATCH v7 3/5] virtio-iommu: Call iommu notifier for
> >> attach/detach
> >>
> >> On Wed, Mar 18, 2020 at 03:47:44PM +0530, Bharat Bhushan wrote:
> >>> Hi Jean,
> >>>
> >>> On Tue, Mar 17, 2020 at 9:29 PM Jean-Philippe Brucker
> >>> <[email protected]> wrote:
> >>>>
> >>>> On Tue, Mar 17, 2020 at 02:46:55PM +0530, Bharat Bhushan wrote:
> >>>>> Hi Jean,
> >>>>>
> >>>>> On Tue, Mar 17, 2020 at 2:23 PM Jean-Philippe Brucker
> >>>>> <[email protected]> wrote:
> >>>>>>
> >>>>>> On Tue, Mar 17, 2020 at 12:40:39PM +0530, Bharat Bhushan wrote:
> >>>>>>> Hi Jean,
> >>>>>>>
> >>>>>>> On Mon, Mar 16, 2020 at 3:41 PM Jean-Philippe Brucker
> >>>>>>> <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Hi Bharat,
> >>>>>>>>
> >>>>>>>> Could you Cc me on your next posting? Unfortunately I don't
> >>>>>>>> have much hardware for testing this at the moment, but I
> >>>>>>>> might be able to help a little with the review.
> >>>>>>>>
> >>>>>>>> On Mon, Mar 16, 2020 at 02:40:00PM +0530, Bharat Bhushan wrote:
> >>>>>>>>>>>>> The first issue is: your guest can use 4K pages while your
> >>>>>>>>>>>>> host uses 64K pages. In that case VFIO_DMA_MAP
> >>>>>>>>>>>>> will fail with -EINVAL.
> >>>>>>>>>>>>> We must devise a way to pass the host
> >>>>>>>>>>>>> settings to the virtio-iommu device.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Even with 64KB pages it did not work for me. I obviously
> >>>>>>>>>>>>> don't get the storm of VFIO_DMA_MAP failures, but I do get
> >>>>>>>>>>>>> some, most probably due to wrong notifications somewhere.
> >>>>>>>>>>>>> I will try to investigate on my side.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Did you test with VFIO on your side?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I did not try different page sizes, I only tested with a
> >>>>>>>>>>>> 4K page size.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes, it works. I tested with two network devices assigned
> >>>>>>>>>>>> to the VM, and both interfaces work.
> >>>>>>>>>>>>
> >>>>>>>>>>>> First I will try with a 64k page size.
> >>>>>>>>>>>
> >>>>>>>>>>> A 64K page size does not work for me either.
> >>>>>>>>>>>
> >>>>>>>>>>> I think we are not passing the correct page_size_mask here
> >>>>>>>>>>> (config.page_size_mask is set to TARGET_PAGE_MASK, which is
> >>>>>>>>>>> 0xfffffffffffff000).
> >>>>>>>>>> I guess you mean with the guest using 4K and the host using 64K.
> >>>>>>>>>>>
> >>>>>>>>>>> We need to set this correctly as per the host page size, correct?
> >>>>>>>>>> Yes, that's correct. We need to put in place a control path
> >>>>>>>>>> to retrieve the page settings on the host through VFIO and
> >>>>>>>>>> inform the virtio-iommu device.
> >>>>>>>>>>
> >>>>>>>>>> Besides this issue, did you try with 64kB on both host and guest?
> >>>>>>>>>
> >>>>>>>>> I tried the following:
> >>>>>>>>> - 4k host and 4k guest: works with the v7 version
> >>>>>>>>> - 64k host and 64k guest: does not work with v7; I hard-coded
> >>>>>>>>>   config.page_size_mask to 0xffffffffffff0000 and it works
> >>>>>>>>
> >>>>>>>> You might get this from the iova_pgsize bitmap returned by
> >>>>>>>> VFIO_IOMMU_GET_INFO.
> >>>>>>>> The virtio config.page_size_mask is global, so there is the
> >>>>>>>> usual problem of aggregating consistent properties, but I'm
> >>>>>>>> guessing that using the host page size as the granule here is
> >>>>>>>> safe enough.
> >>>>>>>>
> >>>>>>>> If it is a problem, we can add a PROBE property for the page
> >>>>>>>> size mask, allowing per-endpoint page masks to be defined. I
> >>>>>>>> have kernel patches somewhere that do just that.
> >>>>>>>
> >>>>>>> I do not see why we would need a page size mask per endpoint.
> >>>>>>>
> >>>>>>> While trying to understand which "page-size-mask" the guest
> >>>>>>> will work with:
> >>>>>>>
> >>>>>>> - 4K page size host and 4K page size guest:
> >>>>>>>   config.page_size_mask = 0xfffffffffffff000 will work
> >>>>>>>
> >>>>>>> - 64K page size host and 64K page size guest:
> >>>>>>>   config.page_size_mask = 0xffffffffffff0000 will work
> >>>>>>>
> >>>>>>> - 64K page size host and 4K page size guest:
> >>>>>>>   1) config.page_size_mask = 0xfffffffffffff000 will not work,
> >>>>>>>      as VFIO in the host expects iova and size to be aligned
> >>>>>>>      to 64k (PAGE_SIZE in the host)
> >>>>>>>   2) config.page_size_mask = 0xffffffffffff0000 will not work
> >>>>>>>      either; iova initialization (in the guest) expects the
> >>>>>>>      minimum page size supported by the h/w to equal 4k
> >>>>>>>      (PAGE_SIZE in the guest).
> >>>>>>>      Should we look at relaxing this in the iova allocation code?
> >>>>>>
> >>>>>> Oh right, that's not great. Maybe the BUG_ON() can be removed,
> >>>>>> I'll ask on the list.
> >>>>>
> >>>>> Yes, the BUG_ON in iova_init.
> So you mean in init_iova_domain()?
>
> I see the BUG_ON was introduced in
> 0fb5fe874c42942e16c450ae05da453e13a1c09e ("iommu: Make IOVA domain page
> size explicit"), but was it meant to remain?
No, I don't think iova.c is the problem; we could just as well remove
the BUG_ON(). My concern is more with dma-iommu.c and VFIO, which use
PAGE_SIZE in some places and could have ingrained assumptions that
iommu_pgsize <= PAGE_SIZE. We need a thorough audit of these drivers
before relaxing this. I started with dma-iommu yesterday but gave up
after seeing the VFIO WARN_ON.

I just sent the patch for virtio-iommu to the IOMMU list; we can
continue the discussion there.

Thanks,
Jean

>
> Logically, when we later allocate buffer IOVAs for DMA accesses,
> shouldn't it be possible to use the actual granule set at init() and
> make sure the allocated size is properly aligned?
>
> Reading the commit msg, it explicitly says that "the systems may contain
> heterogeneous IOMMUs supporting differing minimum page sizes, which may
> also not be common with the CPU page size".
>
> Thanks
>
> Eric
>
> >>>>> I tried removing it and it worked, but I have not analyzed the
> >>>>> side effects.
> >>>>
> >>>> It might break the assumption by device drivers that mapping a
> >>>> page is safe. For example, they call alloc_page() followed by
> >>>> dma_map_page(). In our situation dma-iommu.c will oblige and create
> >>>> one 64k mapping to one 4k physical page. As a result the endpoint
> >>>> can access the neighbouring 60k of memory.
> >>>>
> >>>> This isn't too terrible. After all, even when the page sizes match,
> >>>> device drivers can call dma_map_single() on sub-page buffers, which
> >>>> will also let the endpoint access a whole page. The solution, if you
> >>>> don't trust the endpoint, is to use bounce buffers.
> >>>>
> >>>> But I suspect it's not as simple as removing the BUG_ON(); we'll
> >>>> need to go over dma-iommu.c first. And it seems like assigning
> >>>> endpoints to guest userspace won't work in this config either.
> >>>> In vfio_dma_do_map():
> >>>>
> >>>>     mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
> >>>>
> >>>>     WARN_ON(mask & PAGE_MASK);
> >>>
> >>> Yes, agreed.
> >>>
> >>>>
> >>>> If I read this correctly, the WARN will trigger in a 4k guest under
> >>>> a 64k host, right? So maybe we can just say that this config isn't
> >>>> supported, unless it's an important use case for virtio-iommu?
> >>>
> >>> I sent the v8 version of the patch, and with that a guest and host
> >>> with the same page size should work.
> >>> I have not yet analyzed how to mark 4k guest with 64k host as an
> >>> unsupported configuration; I will analyze it and send a patch.
> >>
> >> I don't think there is anything to do for QEMU; it's Linux that
> >> doesn't support the configuration. We could add something like the
> >> attached patch, in the virtio-iommu driver, to abort more gracefully
> >> than with a BUG_ON().
> >
> > Yes, agreed, we need the change on the Linux side.
> >
> > Thanks
> > -Bharat
> >
> >>
> >> Thanks,
> >> Jean
>
>
