On 9/12/25 17:51, Peter Maydell wrote:
> (3) address_space_cache_init(), which initializes a MemoryRegionCache
> which you can then use for hopefully faster read and write
> operations via address_space_read_cached() and
> address_space_write_cached().
(And the cached ld/st variants as well)
> Again, subject to limitations: must operate on RAM;
You can operate on non-RAM but it will not be any faster.
> you might not be able to access the whole
> range you wanted. This currently seems to be used solely by
> virtio.
Indeed. The APIs compare like this:

             fast    gives void*    limits
    direct   -       -              -
    map      y       y              1 MR, 1 bounce buffer
    cached   y       -              1 MR
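Concretely, the cached path looks something like this (a minimal sketch; "as", "table_gpa" and the fallback path are assumptions, and error handling is elided):

```c
/* Sketch: cached access to a guest-memory table.  Assumes "as" is the
 * device's AddressSpace and "table_gpa" the guest-physical address of
 * the structure; hypothetical names, error paths elided. */
MemoryRegionCache cache = MEMORY_REGION_CACHE_INVALID;
int64_t len = address_space_cache_init(&cache, as, table_gpa,
                                       table_size, /* is_write */ false);
if (len < table_size) {
    /* the cache could not cover the whole range: fall back to the
     * uncached address_space_read()/write() path */
}
uint32_t word = ldl_le_phys_cached(&cache, 0);   /* cached ld variant */
address_space_read_cached(&cache, 4, buf, sizeof(buf));
address_space_cache_destroy(&cache);
```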
MemoryRegionCache has the additional complication of needing a
MemoryListener to invalidate the cache if the cache is long-lived.
This could be done globally; it just wasn't necessary while virtio was
the only user.
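The invalidation pattern is roughly the following (a sketch modelled on what virtio does; MyDevState and the field names are hypothetical, and virtio actually defers the re-init rather than doing it inline in the listener as shown here):

```c
/* Sketch: invalidating a long-lived MemoryRegionCache when the guest
 * memory map changes, via a MemoryListener .commit callback. */
static void mydev_memory_commit(MemoryListener *listener)
{
    MyDevState *s = container_of(listener, MyDevState, listener);

    /* The memory map may have changed: drop and re-create the cache. */
    address_space_cache_destroy(&s->table_cache);
    address_space_cache_init(&s->table_cache, s->as, s->table_gpa,
                             s->table_size, /* is_write */ false);
}

/* at realize time: */
s->listener.commit = mydev_memory_commit;
memory_listener_register(&s->listener, s->as);
```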
You could create a MemoryRegionCache for the duration of (say) a single
function call, and then you don't need to deal with invalidation; but
then the single bounce buffer limitation is not a problem and map/unmap
is probably easier to use.
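For comparison, the transient map/unmap pattern is (a sketch; "as", "gpa" and the slow-path comment are assumptions):

```c
/* Sketch: transient direct access via address_space_map()/unmap().
 * The mapping may come back shorter than requested, or NULL if the
 * single bounce buffer is already in use. */
hwaddr len = size;
void *p = address_space_map(as, gpa, &len, /* is_write */ true,
                            MEMTX_ATTRS_UNSPECIFIED);
if (!p || len < size) {
    /* partial or failed mapping: use address_space_write() instead */
}
memcpy(p, data, len);
address_space_unmap(as, p, len, /* is_write */ true, /* access_len */ len);
```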
> In particular, I'm working on a GICv5 model. This device puts a
> lot of its working data structures into guest memory, so we're going
> to be accessing guest memory a lot. The device spec says if you point
> it at not-RAM you get to keep both pieces, and requires the guest
> not to try to change the contents of that memory underfoot without
> notifying it, so this seems like it ought to be a good candidate
> for some kind of "act like you have this memory cached so you don't
> need to keep looking it up every time" API...
Yes, that is indeed a good use.
> Does the MemoryRegionCache API cover all the use cases we use
> address_space_map() and dma_memory_map() for? (i.e. could we
> deprecate the latter and transition code over to the new API?)
No, there's no way to get a void* from MemoryRegionCache (you could get
one when the underlying block is RAM by peeking at the struct members;
but there's no bounce buffering by design).
> Incidentally, on the subject of the dma.h wrappers -- I've never
> really been very clear why we have these. Some devices use them,
> but a lot do not.
All PCI devices use them.
> The fact that the dma wrappers put in smp_mb()
> barriers leaves me wondering if all those other devices that
> don't use them have subtle bugs, but OTOH I've never noticed
> any problems...
The idea was that PCI specifies the ordering of DMA operations and the
memory barrier provides that ordering when the operations are performed
by the host CPU.
In practice the cases in which ordering is required are limited, and
personally I prefer to write these barriers in the device model so that
the synchronization algorithm is documented. That means you can use
map/unmap or MemoryRegionCache instead, both of which are also faster.
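An explicit barrier in the device model might look like this (a hypothetical completion-ring sketch; the field and constant names are assumptions, not an existing device):

```c
/* Sketch: documenting the ordering requirement in the device model
 * instead of relying on the dma_* wrappers' implicit smp_mb().
 * The guest must not observe DESC_DONE before the payload. */
address_space_write_cached(&s->ring_cache, desc_off, payload, plen);
smp_wmb();                       /* order payload before status update */
stl_le_phys_cached(&s->ring_cache, status_off, DESC_DONE);
```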
The original, and more distinctive, DMA wrapper is dma_blk_io(): a
wrapper around the block layer APIs that copes with operations spanning
multiple memory regions and with the fact that the address_space_map()
bounce buffer is only one page long. It works together with QEMUSGList
and is used in several block device models (IDE, SCSI, NVMe).
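The usage pattern is roughly (a sketch of the QEMUSGList plumbing those models share; "s", "seg", "nsegs" and the callback are assumptions):

```c
/* Sketch: build a scatter-gather list from hypothetical descriptor
 * entries and hand it to the block layer via dma_blk_read(). */
QEMUSGList qsg;
qemu_sglist_init(&qsg, DEVICE(s), /* alloc hint */ nsegs, s->dma_as);
for (int i = 0; i < nsegs; i++) {
    qemu_sglist_add(&qsg, seg[i].addr, seg[i].len);
}
dma_blk_read(s->blk, &qsg, /* offset */ lba << BDRV_SECTOR_BITS,
             BDRV_SECTOR_SIZE, dma_complete_cb, s);
/* dma_blk_read() maps each element (bouncing where it must) and issues
 * the I/O; call qemu_sglist_destroy() in the completion callback. */
```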
Paolo