On Tue, 10 Dec 2024 21:51:40 +0000 "Luo, Zhigang" <zhigang....@amd.com> wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> > -----Original Message-----
> > From: David Hildenbrand <da...@redhat.com>
> > Sent: Tuesday, December 10, 2024 2:55 PM
> > To: Luo, Zhigang <zhigang....@amd.com>; qemu-devel@nongnu.org
> > Cc: kra...@redhat.com; Igor Mammedov <imamm...@redhat.com>
> > Subject: Re: [PATCH] hostmem-file: add the 'hmem' option
> >
> > On 10.12.24 20:32, Luo, Zhigang wrote:
> > > [AMD Official Use Only - AMD Internal Distribution Only]
> > >
> > > Hi David,
> >
> > Hi,
> >
> > >>> Thanks for your comments.
> > >>> Let me give you some background for this patch.
> > >>> I am currently engaged in a project that requires to pass the
> > >>> EFI_MEMORY_SP (Special Purpose Memory) type memory from host to a
> > >>> virtual machine within QEMU. This memory needs to be EFI_MEMORY_SP
> > >>> type in the virtual machine as well.
> > >>> This particular memory type is essential for the functionality of my
> > >>> project.
> > >>
> > >> Which exact guest memory will be backed by this memory? All
> > >> guest-memory?
> > > [Luo, Zhigang] not all guest-memory. Only the memory reserved for
> > > specific device.
> >
> > Can you show me an example QEMU cmdline, and how you would pass that
> > hostmem-file object to the device?
> >
> [Luo, Zhigang] the following is an example. m1 is the reserved memory for pci
> device "0000:03:00.0". both the memory and pci device are set to same numa
> node.
>
> -object memory-backend-ram,size=8G,id=m0 \
> -object memory-backend-file,size=16G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on \
> -numa node,nodeid=0,memdev=m0 -numa node,nodeid=1,memdev=m1 \
> -device pxb-pcie,id=pcie.1,numa_node=1,bus_nr=2,bus=pcie.0 \
> -device ioh3420,id=pcie_port1,bus=pcie.1,chassis=1 \
> -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pcie_port1

Is /dev/dax0.0 a part of host device 0000:03:00.0 that you pass-through
to guest using vfio?

> >
> > >> And, what is the guest OS going to do with this memory?
> > > [Luo, Zhigang] the device driver in guest will use this reserved memory.
> >
> > Okay, so just like CXL memory.
> >
> > >> Usually, this SP memory (dax, cxl, ...) is not used as boot memory.
> > >> Like on a bare metal system, one would expect that only CXL memory
> > >> will be marked as special and put aside to the cxl driver, such that
> > >> the OS can boot on ordinary DIMMs, such that cxl can online it etc.
> > >>
> > >> So maybe you would want to expose this memory using CXL-mem device to
> > >> the VM? Or a DIMM?
> > >>
> > >> I assume the alternative is to tell the VM on the Linux kernel
> > >> cmdline to set EFI_MEMORY_SP on this memory. I recall that there is a
> > >> way to achieve that.
> > >>
> > > [Luo, Zhigang] I know this option. but it requires the end user to know
> > > where is the memory location in guest side (start address, size).
> >
> > Right.
> >
> > >>> In Linux, the SPM memory will be claimed by hmem-dax driver by
> > >>> default. With this patch I can use the following config to pass the
> > >>> SPM memory to guest VM.
> > >>>
> > >>> -object memory-backend-file,size=30G,id=m1,mem-path=/dev/dax0.0,prealloc=on,align=1G,hmem=on
> > >>>
> > >>> I was thinking to change the option name from "hmem" to "spm" to
> > >>> avoid confusion.
> > >>
> > >> Likely it should be specified elsewhere, that you want specific guest
> > >> RAM ranges to be EFI_MEMORY_SP. For a DIMM, it could be a property,
> > >> similarly maybe for CXL-mem devices (no expert on that).
> > >>
> > >> For boot memory / machine memory it could be a machine property. But
> > >> I'll first have to learn which ranges you actually want to expose
> > >> that way, and what the VM will do with that information.
> > > [Luo, Zhigang] we want to expose the SPM memory reserved for specific
> > > device. And we will pass the SPM memory and the device to guest. Then
> > > the device driver can use the SPM memory in guest side.
> >
> > Then the device driver should likely have a way to configure that, not
> > the memory backend.
> >
> > After all, the device driver will map it somehow into guest physical
> > address space (how?).
> >
> [Luo, Zhigang] from guest view, it's still system memory, but marked as SPM.
> So, qemu will map the memory to guest physical address space.
> The device driver just claims to use the SPM memory in guest side.
>
> > >>> Do you have any suggestions to achieve this more reasonable?
> > >>
> > >> The problem with qemu_ram_foreach_block() is that you would indicate
> > >> also DIMMs, virtio-mem, ... and even RAMBlocks that are not even used
> > >> for backing anything to the VM as EFI_MEMORY_SP, which is wrong.
> > > [Luo, Zhigang] qemu_ram_foreach_block() will list all memory block, but
> > > in pc_update_hmem_memory(), only the memory block with "hmem" flag will
> > > be updated to SPM memory.
> >
> > Yes, but imagine a user passing such a memory backend to a
> > DIMM/virtio-mem/boot memory etc. It will have very undesired side
> > effects.
> >
> [Luo, Zhigang] the user should know what he/she is doing when he/she set the
> flag for the memory region.
>
> > --
> > Cheers,
> >
> > David / dhildenb
>
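
For readers following the kernel-cmdline alternative David mentions above: on
guest kernels built with CONFIG_EFI_FAKE_MEMMAP, the efi_fake_mem= parameter
can add an attribute to a range in the EFI memory map. A minimal sketch,
assuming a 16G range starting at 0x180000000 (both numbers are illustrative,
not taken from the thread):

    efi_fake_mem=16G@0x180000000:0x40000

Here 0x40000 is the EFI_MEMORY_SP attribute bit; as Zhigang points out, this
only works if the end user already knows the guest-side start address and
size of the reserved range.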
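On the guest side, Linux surfaces EFI_MEMORY_SP ranges as "Soft Reserved"
resources and, with the default config Zhigang describes, leaves them to the
hmem/dax driver instead of handing them to the page allocator. A quick way to
check from inside the VM (a sketch; device names will differ):

    grep -i "soft reserved" /proc/iomem
    daxctl list          # shows the dax device(s) backing the soft-reserved range

From there the guest device driver can claim the dax device, or
daxctl reconfigure-device with mode system-ram can hotplug the range as
ordinary RAM.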