On Wed, 7 Feb 2018 10:56:01 +0800 Yi Min Zhao <zyi...@linux.vnet.ibm.com> wrote:
> 在 2018/2/6 下午6:23, Cornelia Huck 写道: > > On Mon, 5 Feb 2018 15:22:55 +0800 > > Yi Min Zhao <zyi...@linux.vnet.ibm.com> wrote: > > > >> This series contains three patches, > >> 1) optimizes the code including walking DMA tables and rpcit handler > >> 2) fixes the issue caused by IOTLB global refresh > >> 3) uses the right pal and pba when registering ioat > >> > >> The issue mentioned above was found when we tested SMC-r tools. This > >> behavior has been introduced when linux guest started using a global > >> refresh to purge the whole IOTLB of invalid entries in a lazy fashion > >> instead of flushing each entry when invalidating table entries. > >> > >> The previous QEMU implementation didn't keep track of the mapping, > >> didn't handle correctly the global flush demand from the guest and a > >> major part of the IOTLB entries were not flushed. > >> > >> Consequently linux kernel on the host keeping the previous mapping > >> reports, as it should, -EEXIST DMA mapping error on the next mapping > >> with the same IOVA. The second patch fixes this issue. > > Introducing a local tracking mechanism still feels a bit awkward to me > > (even though it works, of course). If nobody else needs such a thing, > > our best choice is to do it like that, though. > Caching iotlb entries is also helpful for us to support 2G mapping in > future. OK, that makes sense. > > > >> During the investigation, we noticed that the current code walking > >> PCI IOMMU page tables didn't check important flags of table entries, > >> including: > >> 1) protection bit > >> 2) table length > >> 3) table offset > >> 4) intermediate tables' invalid bit > >> 5) format control bit > >> > >> We implement the checking in the first patch before handling the > >> IOTLB global refresh issue. To keep track of the mapped IOTLB entries > >> and be able to check if the host IOTLB entries need to be refreshed > >> we implement a IOTLB cache in QEMU, and introduce some helper > >> functions to check these bits. All S390IOTLBEntry instances are stored > >> in a new hashtable which are indexed by IOVA. Each PCI device has its > >> own IOMMU. Therefore each IOMMU also has its own hashtable caching > >> corresponding PCI device's DMA entries. Finally, we split 1M > >> contiguous DMA range into 4K pages to do DMA map, and the code about > >> error notification is also optimized. > >> > >> Change log: > >> v1->v2: > >> 1) update commit messages > >> 2) move some changes from the 2nd patch to the 1st patch > >> 3) define macros for 'ett' in the 1st patch > >> > >> Yi Min Zhao (3): > >> s390x/pci: fixup the code walking IOMMU tables > >> s390x/pci: fixup global refresh > >> s390x/pci: use the right pal and pba in reg_ioat() > >> > >> hw/s390x/s390-pci-bus.c | 233 > >> ++++++++++++++++++++++++++++++++++++++--------- > >> hw/s390x/s390-pci-bus.h | 17 ++++ > >> hw/s390x/s390-pci-inst.c | 103 ++++++++++++++------- > >> 3 files changed, 275 insertions(+), 78 deletions(-) > >> > > I have played with these patches and some virtio-pci devices (since I > > don't have access to real zpci cards), and it worked both under kvm and > > under tcg. So I'm inclined to apply this (I can't review further due to > > missing documentation), unless the pci folks have further comments. > Thanks! Applied to s390-next, thanks.