On Tue, 30 Jan 2024 13:50:18 +0530 Sajjan Rao <sajj...@gmail.com> wrote:
> Hi Jonathan,
>
> The QEMU command line in the original email has been corrected back in
> August 2023 based on the subsequent responses.
>
> My current QEMU command line reads like below. As you can see I am not
> assigning numa to the CXL memory object.
>
> qemu-system-x86_64 \
> -hda /var/lib/libvirt/images/CXL-Test_1.qcow2 \
> -machine type=q35,nvdimm=on,cxl=on \
> -accel tcg,thread=single \
> -m 4G \
> -smp cpus=4 \
> -object memory-backend-ram,size=4G,id=m0 \
> -object memory-backend-ram,size=256M,id=cxl-mem1 \
> -object memory-backend-ram,size=256M,id=cxl-mem2 \
> -numa node,memdev=m0,cpus=0-3,nodeid=0 \
> -netdev user,id=net0,net=192.168.0.0/24,dhcpstart=192.168.0.9,hostfwd=tcp::2222-:22 \
> -device virtio-net-pci,netdev=net0 \
> -device pxb-cxl,bus_nr=2,bus=pcie.0,id=cxl.1,hdm_for_passthrough=true \
> -device cxl-rp,port=0,bus=cxl.1,id=cxl_rp_port0,chassis=0,slot=2 \
> -device cxl-upstream,bus=cxl_rp_port0,id=us0,addr=0.0,multifunction=on, \
> -device cxl-switch-mailbox-cci,bus=cxl_rp_port0,addr=0.2,target=us0 \
> -device cxl-downstream,port=0,bus=us0,id=swport0,chassis=0,slot=4 \
> -device cxl-downstream,port=1,bus=us0,id=swport1,chassis=0,slot=8 \
> -device cxl-type3,bus=swport0,volatile-memdev=cxl-mem1,id=cxl-vmem1 \
> -device cxl-type3,bus=swport1,volatile-memdev=cxl-mem2,id=cxl-vmem2 \
> -M cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=512M,cxl-fmw.0.interleave-granularity=2k \
> -D /tmp/qemu.log \
> -nographic
>
> Until I moved to Qemu version 8.2 recently, I was able to create
> regions and run linux native commands on CXL memory using
> #numactl --membind <cxl NUMA#> top
>
> You had advised me to turn off KVM and use tcg since the membind
> command will run code out of CXL memory which is not supported. By
> disabling KVM the membind command worked fine.
> However with Qemu version 8.2 the same membind command results in a
> kernel hard crash.

Just to check, kernel crashes, or qemu crashes?
I've probably replicated it, and it seems to be qemu that is going down
with a TCG issue. Bisection underway.

This may take a while. Our use of TCG with what QEMU thinks of as io memory
is unusual, so we tend to run into corners no one else cares about.

Richard, +CC on the off chance you can guess what has happened and save me
a bisection run.

x86 machine pretty much as described above

root@localhost:~/devmem2# numactl --membind=1 touch a
qemu: fatal: cpu_io_recompile: could not find TB for pc=(nil)
RAX=00d6b969c0000000 RBX=ff294696c0044440 RCX=0000000000000028 RDX=0000000000000000
RSI=0000000000000275 RDI=0000000000000000 RBP=0000000490000000 RSP=ff4f8767805d3d20
R8 =0000000000000000 R9 =ff4f8767805d3cdc R10=0000000000000000 R11=0000000000000040
R12=ff294696c0044980 R13=0000000000000000 R14=ff294696c51d0000 R15=0000000000000000
RIP=ffffffff9d270fed RFL=00000007 [-----PC] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 00000000 00000000
CS =0010 0000000000000000 ffffffff 00af9b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00cf9300 DPL=0 DS [-WA]
DS =0000 0000000000000000 00000000 00000000
FS =0000 0000000000000000 00000000 00000000
GS =0000 ff2946973bc00000 00000000 00000000
LDT=0000 0000000000000000 00000000 00008200 DPL=0 LDT
TR =0040 fffffe37d29e7000 00004087 00008900 DPL=0 TSS64-avl
GDT= fffffe37d29e5000 0000007f
IDT= fffffe0000000000 00000fff
CR0=80050033 CR2=00007f2972bdc450 CR3=0000000490000000 CR4=00751ef0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00d6b969c0000000 CCD=0000000490000000 CCO=ADDQ
EFER=0000000000000d01
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
YMM00=0000000000000000 0000000000000000 3a3a3a3a3a3a3a3a 3a3a3a3a3a3a3a3a
YMM01=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM03=0000000000000000 0000000000000000 00ff0000000000ff 0000000000000000
YMM04=0000000000000000 0000000000000000 5f796c7261655f63 62696c5f5f004554
YMM05=0000000000000000 0000000000000000 0000000000000000 0000000000000060
YMM06=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM07=0000000000000000 0000000000000000 0909090909090909 0909090909090909
YMM08=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM09=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM10=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM11=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM12=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM13=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000
YMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000

Jonathan

> I wanted to check if this is a known issue with 8.2 and is there a way
> around it.
>
> Thanks,
> Sajjan
>
> On Fri, Jan 26, 2024 at 10:42 PM Jonathan Cameron
> <jonathan.came...@huawei.com> wrote:
> >
> > On Fri, 26 Jan 2024 10:43:43 -0500
> > Gregory Price <gregory.pr...@memverge.com> wrote:
> >
> > > On Fri, Jan 26, 2024 at 12:39:26PM +0000, Jonathan Cameron wrote:
> > > > On Thu, 25 Jan 2024 13:45:09 +0530
> > > > Sajjan Rao <sajj...@gmail.com> wrote:
> > > >
> > > > > Looks like something changed in QEMU 8.2 that broke running code out
> > > > > of CXL memory with KVM disabled.
> > > > > I used "numactl --membind 2 ls" as suggested by Dimitrios earlier,
> > > > > this worked for me until I updated to the latest QEMU.
> > > > >
> > > > > Is this a known issue? Or am I missing something?
> > > >
> > > > I'm confused on how the description below ever worked.
> > > > Assigning the underlying memdev=cxl-mem1 to a numa node isn't going
> > > > to correctly build the connections the CFMWS PA range.
> > > >
> > >
> > > I've now seen 3-4 occasions where people have done this and run into
> > > trouble (for obvious reasons). Is there anything we can do to disallow
> > > the double-registering of a single memdev to both a numa node and a cxl
> > > device?
> > >
> > It would be novel for us to prevent people shooting themselves
> > in the foot ;) but I guess this should be fairly easy as the
> > numa node logic prevents the same one being used multiple times so can
> > copy how that is done.
> >
> > This should do the trick (very lightly tested).
> > It's end of day Friday here so a formal patch can wait for next week.
> >
> >
> > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > index f29346fae7..d4194bb757 100644
> > --- a/hw/mem/cxl_type3.c
> > +++ b/hw/mem/cxl_type3.c
> > @@ -827,6 +827,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "volatile memdev must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->hostvmem)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +               object_get_canonical_path_component(OBJECT(ct3d->hostvmem)));
> > +            return false;
> > +        }
> >          memory_region_set_nonvolatile(vmr, false);
> >          memory_region_set_enabled(vmr, true);
> >          host_memory_backend_set_mapped(ct3d->hostvmem, true);
> > @@ -850,6 +855,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "persistent memdev must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->hostpmem)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +               object_get_canonical_path_component(OBJECT(ct3d->hostpmem)));
> > +            return false;
> > +        }
> >          memory_region_set_nonvolatile(pmr, true);
> >          memory_region_set_enabled(pmr, true);
> >          host_memory_backend_set_mapped(ct3d->hostpmem, true);
> > @@ -880,6 +890,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> >              error_setg(errp, "dynamic capacity must have backing device");
> >              return false;
> >          }
> > +        if (host_memory_backend_is_mapped(ct3d->dc.host_dc)) {
> > +            error_setg(errp, "memory backend %s can't be used multiple times.",
> > +               object_get_canonical_path_component(OBJECT(ct3d->dc.host_dc)));
> > +            return false;
> > +        }
> >          /* FIXME: set dc as nonvolatile for now */
> >          memory_region_set_nonvolatile(dc_mr, true);
> >          memory_region_set_enabled(dc_mr, true);
> >
> >
> > >
> > > ~Gregory
> >
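
For readers following the quoted patch, here is a minimal standalone sketch of the "claim once" guard it adds. It is not QEMU code: `Backend` and `backend_claim()` are hypothetical stand-ins for the real backend object and for the `host_memory_backend_is_mapped()` / `host_memory_backend_set_mapped()` pair used in the diff, and the error is returned through a plain buffer rather than QEMU's `Error **errp`. Only the error text is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a memory backend; not QEMU's real type. */
typedef struct Backend {
    const char *id;  /* backend id, e.g. "cxl-mem1" */
    bool mapped;     /* true once some consumer has claimed the backend */
} Backend;

/*
 * Claim a backend for exactly one consumer (a numa node or a CXL device).
 * Returns true on success. If the backend is already claimed, writes a
 * message into err (when err is non-NULL and err_len > 0) and returns
 * false without changing the backend.
 */
static bool backend_claim(Backend *be, char *err, size_t err_len)
{
    assert(be != NULL);

    if (be->mapped) {
        if (err != NULL && err_len > 0) {
            snprintf(err, err_len,
                     "memory backend %s can't be used multiple times.",
                     be->id);
        }
        return false;
    }

    be->mapped = true;
    return true;
}
```

In this sketch the first consumer to call `backend_claim()` on a given backend succeeds and every later caller is rejected, which is the behaviour the patch aims for when the same memdev is handed to both `-numa node,memdev=...` and `-device cxl-type3,volatile-memdev=...`.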