> > I'm not clear if sysram could be used for virtio, or even needed. I'm
> > still figuring out how virtio of simple memory devices is a gain.
> >
>
> Jonathan mentioned that he thinks it would be possible to just bring it
> online as a private-node and inform the consumer of this. I think
> that's probably reasonable.

Firstly VM == Application. If we have, say, a DB that wants to do
everything itself, it would use the same interface as a VM to get the
whole memory on offer. (I'm still trying to get that Application
Specific Memory term adopted ;)

This would be better if we didn't assume anything to do with virtio -
that's just one option (and right now for CXL mem probably not the
sensible one, as it's missing too many things we get for free by just
emulating CXL devices - e.g. all the stuff you are describing here for
the host is just as valid in the guest). We have a path to get that
emulation and should have the big missing piece posted shortly (DCD
backed by 'things - this discussion' that turn up after VM boot).

The real topic is memory for a VM, and we need a way to tie a memory
backend in qemu to it, so that whatever the fabric manager provided for
that VM is given to the VM and not used for anything else.

If it's for a specific VM, then it's tagged, as otherwise how else do we
know the intent? (Let's ignore random other out of band paths.)

Layering wise we can surface as many backing sources as we like at
runtime via 1+ emulated DCD devices (to give perf information etc).
They each show up in the guest as a contiguous (maybe tagged) single
extent and then we apply whatever comes out of the rest of this
discussion on top of that.

So all we care about is how the host presents it.

Bunch of things might work for this.

1. Just put it in a NUMA node that requires specific selection to
   allocate from. This is nice because it just looks like normal memory
   and we can apply any type of front end on top of that. Not good if we
   have a lot of these coming and going.

2. Provide it as something with an fd we can mmap. I was fine with DAX
   for this, but if it's normal RAM just for a VM, anything that gives
   me a handle that I can mmap is fine. Just need a way to know which
   one (so tag).

It's pretty similar for shared cases. Just need a handle to mmap.
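
To make option 2 concrete from the consumer's side, here is a rough
sketch and not a proposal for the actual interface. Everything named
here is an assumption for illustration: the tag-to-path table, the
map_by_tag() helper, and the temp file standing in for whatever the
host ends up exposing (DAX device or otherwise). The only point is that
"tag -> handle -> shared mapping" is all the consumer needs.

```python
import mmap
import os
import tempfile


def make_standin(length):
    """Create a zero-filled temp file standing in for the real backing."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(length)
        return f.name


def map_by_tag(tag_to_path, tag, length):
    """Look up the backing for `tag` and return a shared read/write mapping.

    `tag_to_path` is a hypothetical table; how the host really publishes
    the tag is exactly what is still under discussion.
    """
    path = tag_to_path[tag]  # raises KeyError for an unknown tag
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)  # mmap.mmap duplicates the fd, so the mapping survives


if __name__ == "__main__":
    size = 4096
    backing = make_standin(size)
    region = map_by_tag({"vm0": backing}, "vm0", size)
    region[0:5] = b"hello"  # the consumer treats it as plain memory
    region.close()
    os.unlink(backing)
```

A VM and a DB that manages its own memory would go through the same
lookup; only what sits on top of the mapping differs.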

In that case, the tag goes straight up to the guest OS (we've just
unwound the extent ordering in the host and presented it as a
contiguous single extent).

Assumption here is we always provide all the capacity that was tagged
for the VM to the VM. Things may get more entertaining if we have a
bunch of capacity that was tagged to provide extra space for a set of
VMs (e.g. we overcommit on top of the DCD extents) - to me that's a job
for another day.

So I'm not really envisioning anything special for the VM case, it's
just a dedicated allocation of memory for a user who knows how to get
it. We will want a way to get perf info though, so we can provide that
in the VM. Maybe we can figure that out from the CXL HW backing it
without needing anything special in what is being discussed here.

Jonathan

>
> ~Gregory

