Public bug reported:

This upstream (v6.16) fix resolves an issue hit when pinning a memfd folio before it has been faulted in: with CONFIG_DEBUG_VM enabled it can crash the kernel, and without that kconfig it causes an accounting error in resv_huge_pages. Contiguous memory is required for the vCMDQ feature on Grace, and one way of achieving that is by using huge pages to back the VM memory.
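For reference, a minimal sketch of the kind of host-side 1G hugepage setup this relies on (the page count and the /hugepages mount point are assumptions chosen to line up with the QEMU command further down, not something taken from the fix itself):

  # Reserve 1G huge pages on the host (32 here, i.e. more than the 16G VM needs)
  echo 32 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

  # Mount hugetlbfs so a QEMU memory-backend-file can point mem-path at it
  mkdir -p /hugepages
  mount -t hugetlbfs -o pagesize=1G none /hugepages

  # Sanity-check the pool before starting the guest
  grep -i hugep /proc/meminfo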
While testing PR 179 with the 4k host kernel and a QEMU branch with the pluggable SMMUv3 interface, I found that the VM would exhibit symptoms of the vCMDQ not being backed by contiguous memory:

[    0.377799] acpi NVDA200C:00: tegra241_cmdqv: unexpected error reported. vintf_map: 0000000000000001, vcmdq_map 00000000:00000000:00000000:00000002
[    0.379174] arm-smmu-v3 arm-smmu-v3.0.auto: CMDQ error (cons 0x04000000): Unknown
[    0.379954] arm-smmu-v3 arm-smmu-v3.0.auto: skipping command in error state:
[    0.380632] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0001000000000011
[    0.381147] arm-smmu-v3 arm-smmu-v3.0.auto:  0x0000000000000000

When this occurred, I noticed that the huge page metadata did not match expectations. Notably, it showed that an extra 16G of hugepages was being used and also reflected a negative “in reserve” count, indicating an underflow condition:

# grep -i hugep /proc/meminfo
AnonHugePages:     69632 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:      64
HugePages_Free:       32
HugePages_Rsvd:    18446744073709551600
HugePages_Surp:        0
Hugepagesize:    1048576 kB

After instrumenting the kernel, I was able to prove the underflow and then noticed this upstream fix. The data also showed that the newer QEMU branch makes more calls to memfd_pin_folios() during GPU VFIO setup, which triggered the bug in the kernel - I never saw this bug with the older QEMU branch we’ve been using for quite some time for Grace virtualization. After applying the fix, I no longer see the bad huge page metadata and the vCMDQ feature works properly with the 4k host kernel.

Lore discussion: https://lkml.kernel.org/r/[email protected]

Upstream SHA: eb920662230f mm/hugetlb: don't crash when allocating a folio if there are no resv

This commit picked cleanly to 24.04_linux-nvidia-6.14-next.

Testing: GPU PT on a 4k host with more huge pages than the VM requires (e.g. 32 1G hugepages for a 16G VM)
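One quick way to see the accounting problem (and to confirm the fix) during this test is to snapshot the 1G hugepage counters before starting the guest and again once it is up; this is just a sketch assuming the standard per-size sysfs counters, not a step from the original testing:

  # 1G hugepage counters before launching the guest
  for f in nr_hugepages free_hugepages resv_hugepages surplus_hugepages; do
      printf '%s: ' "$f"
      cat /sys/kernel/mm/hugepages/hugepages-1048576kB/$f
  done

  # Re-check after the guest boots: on an affected kernel HugePages_Rsvd in
  # /proc/meminfo wraps to a huge value (e.g. 18446744073709551600) and more
  # pages are consumed than the guest actually needs; with the fix applied
  # the counters stay consistent.
  grep -i hugep /proc/meminfo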
QEMU: https://github.com/nvmochs/QEMU/tree/smmuv3-accel-07212025_egm

qemu-system-aarch64 \
  -object iommufd,id=iommufd0 \
  -machine hmat=on -machine virt,accel=kvm,gic-version=3,ras=on,highmem-mmio-size=512G \
  -cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
  -object memory-backend-file,size=8G,id=m0,mem-path=/hugepages/,prealloc=on,share=off \
  -object memory-backend-file,size=8G,id=m1,mem-path=/hugepages/,prealloc=on,share=off \
  -numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
  -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
  -numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
  -device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
  -device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.1,accel=on,cmdqv=on \
  -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,io-reserve=0 \
  -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.port1,rombar=0,id=dev0,iommufd=iommufd0 \
  -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
  -object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
  -object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
  -object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
  -object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
  -object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
  -object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
  -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
  -bios /usr/share/AAVMF/AAVMF_CODE.fd \
  -device nvme,drive=nvme0,serial=deadbeaf1,bus=pcie.0 \
  -drive file=guest.qcow2,index=0,media=disk,format=qcow2,if=none,id=nvme0 \
  -device e1000,romfile=/usr/local/share/qemu/efi-e1000.rom,netdev=net0,bus=pcie.0 \
  -netdev user,id=net0,hostfwd=tcp::5558-:22,hostfwd=tcp::5586-:5586

** Affects: linux-nvidia-6.14 (Ubuntu)
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.14 in Ubuntu.
https://bugs.launchpad.net/bugs/2119577

Title:
  Backport: mm/hugetlb: don't crash when allocating a folio if there
  are no resv

Status in linux-nvidia-6.14 package in Ubuntu:
  New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.14/+bug/2119577/+subscriptions

