On Thu, 19 Aug 2021 15:32:16 +0000 John Johnson <john.g.john...@oracle.com> wrote:
> > On Aug 17, 2021, at 7:04 PM, Alex Williamson <alex.william...@redhat.com> wrote:
> >
> > The address/size paradigm falls into the same issues as the vfio kernel
> > interface where we can't map or unmap the entire 64-bit address space,
> > ie. size is limited to 2^64 - 1.  The kernel interface also requires
> > PAGE_SIZE granularity for the DMA, which means the practical limit is
> > 2^64 - PAGE_SIZE.  If we had a redo on the kernel interface we'd use
> > start/end so we can express a size of (end - start + 1).
> >
> > Is following the vfio kernel interface sufficiently worthwhile for
> > compatibility to incur this same limitation?  I don't recall if we've
> > already discussed this, but perhaps worth a note in this design doc if
> > similarity to the kernel interface is being favored here.  See for
> > example QEMU commit 1b296c3def4b ("vfio: Don't issue full 2^64 unmap").
> > Thanks,
>
> I’d prefer to stay as close to the kernel i/f as we can.
> An earlier version of the spec used a vhost-user derived structure
> for MAP & UNMAP.  This made it more difficult to add the bitmap
> field when vfio added migration capability, so we switched to the
> ioctl() structure.
>
> It looks like vfio_dma_unmap() takes a 64b ‘size’ arg
> (ram_addr_t).  How did you unmap an entire 64b address space?

It's called from the MemoryListener, which operates on
MemoryRegionSections; those use Int128 sizes that get chunked down to
ram_addr_t for vfio_dma_unmap().

We do now have VFIO_DMA_UNMAP_FLAG_ALL in the kernel API, which gives us
an option to clear the whole 64-bit address space in one ioctl, but it's
not a high priority to make use of it in QEMU since QEMU still needs to
handle older kernels.

> The comment there mentions a bug where iova+size wraps the end of the
> 64b space.

Right, that's a separate issue that's just a bug in the kernel.  That's
been fixed, but the QEMU code exists for now as a workaround for any
broken kernels in the wild.  Thanks,

Alex
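
For reference, a minimal sketch of the two unmap modes discussed above,
written against the type1 kernel interface.  This is not from the thread:
the helper names are invented for illustration, error handling and
feature probing are omitted, and VFIO_DMA_UNMAP_FLAG_ALL is only
available on kernels and headers new enough to define it.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Conventional range unmap: the iova/size pair can express at most
 * 2^64 - 1 bytes (in practice 2^64 - PAGE_SIZE, given the kernel's
 * page-granularity requirement), so a single call can never cover
 * the entire 64-bit space.
 */
static int unmap_range(int container_fd, uint64_t iova, uint64_t size)
{
    struct vfio_iommu_type1_dma_unmap unmap = {
        .argsz = sizeof(unmap),
        .flags = 0,
        .iova  = iova,
        .size  = size,
    };

    return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}

/*
 * Whole-space unmap via VFIO_DMA_UNMAP_FLAG_ALL: iova and size are
 * left at zero and the kernel clears every mapping in the container.
 * Real code would probe for kernel support before relying on this.
 */
static int unmap_all(int container_fd)
{
    struct vfio_iommu_type1_dma_unmap unmap = {
        .argsz = sizeof(unmap),
        .flags = VFIO_DMA_UNMAP_FLAG_ALL,
        .iova  = 0,
        .size  = 0,
    };

    return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}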