** Also affects: linux-nvidia-6.14 (Ubuntu Noble)
Importance: Undecided
Status: New
** Changed in: linux-nvidia-6.14 (Ubuntu)
Status: New => Invalid
** Changed in: linux-nvidia-6.14 (Ubuntu Noble)
Status: New => Fix Committed
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.14 in Ubuntu.
https://bugs.launchpad.net/bugs/2119577
Title:
Backport: mm/hugetlb: don't crash when allocating a folio if there are
no resv
Status in linux-nvidia-6.14 package in Ubuntu:
Invalid
Status in linux-nvidia-6.14 source package in Noble:
Fix Committed
Bug description:
This upstream (v6.16) fix resolves an issue seen when pinning a memfd
folio before it has been faulted in, which leads to a crash when
CONFIG_DEBUG_VM is enabled, or to a resv_huge_pages accounting error when
that kconfig is not set. Contiguous memory is required for the vCMDQ
feature on Grace, and one way of achieving that is to back the VM memory
with huge pages.
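
For illustration only, below is a minimal userspace sketch of the
problematic pattern. It assumes the udmabuf driver (CONFIG_UDMABUF) is
available as a convenient way to reach memfd_pin_folios() on a
hugetlb-backed memfd that has never been faulted in; the failure in this
report was actually hit through QEMU's VFIO/iommufd setup, so this is a
sketch of the kernel path involved, not the reproducer used here.

/*
 * Illustrative sketch only (not the reproducer from this report):
 * create a hugetlb-backed memfd and have the udmabuf driver pin it
 * before any page has been faulted in. On recent kernels the
 * UDMABUF_CREATE path pins the backing folios via memfd_pin_folios(),
 * the function touched by the upstream fix.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

/* Must be a multiple of the default hugepage size; assumes 2M here. */
#define MEMFD_SIZE (2UL * 1024 * 1024)

int main(void)
{
	/* Hugetlb-backed memfd; the memory is intentionally never touched. */
	int memfd = memfd_create("pin-before-fault",
				 MFD_CLOEXEC | MFD_HUGETLB | MFD_ALLOW_SEALING);
	if (memfd < 0) {
		perror("memfd_create");
		return 1;
	}
	if (ftruncate(memfd, MEMFD_SIZE) < 0) {
		perror("ftruncate");
		return 1;
	}
	/* udmabuf requires the memfd to be sealed against shrinking. */
	if (fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK) < 0) {
		perror("F_ADD_SEALS");
		return 1;
	}

	int dev = open("/dev/udmabuf", O_RDWR | O_CLOEXEC);
	if (dev < 0) {
		perror("open /dev/udmabuf");
		return 1;
	}

	/* Pins the still-unfaulted hugetlb folios of the memfd range. */
	struct udmabuf_create create = {
		.memfd  = memfd,
		.flags  = UDMABUF_FLAGS_CLOEXEC,
		.offset = 0,
		.size   = MEMFD_SIZE,
	};
	int dmabuf = ioctl(dev, UDMABUF_CREATE, &create);
	if (dmabuf < 0) {
		perror("UDMABUF_CREATE");
		return 1;
	}
	printf("pinned %lu bytes of unfaulted hugetlb memory\n", MEMFD_SIZE);

	close(dmabuf);
	close(dev);
	close(memfd);
	return 0;
}
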
While testing PR 179 with the 4k host kernel and a QEMU branch with the
pluggable SMMUv3 interface, I found that the VM would exhibit symptoms of
the vCMDQ not being backed by contiguous memory:
[ 0.377799] acpi NVDA200C:00: tegra241_cmdqv: unexpected error reported. vintf_map: 0000000000000001, vcmdq_map 00000000:00000000:00000000:00000002
[ 0.379174] arm-smmu-v3 arm-smmu-v3.0.auto: CMDQ error (cons 0x04000000): Unknown
[ 0.379954] arm-smmu-v3 arm-smmu-v3.0.auto: skipping command in error state:
[ 0.380632] arm-smmu-v3 arm-smmu-v3.0.auto: 0x0001000000000011
[ 0.381147] arm-smmu-v3 arm-smmu-v3.0.auto: 0x0000000000000000
When this occurred, I noticed that the huge page metadata did not match
expectations. Notably, it showed an extra 16G of hugepages in use and a
negative “in reserve” count (18446744073709551600 is -16 read as an
unsigned 64-bit counter, i.e. 16 missing 1G hugepages), indicating an
underflow condition:
# grep -i hugep /proc/meminfo
AnonHugePages: 69632 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 64
HugePages_Free: 32
HugePages_Rsvd: 18446744073709551600
HugePages_Surp: 0
Hugepagesize: 1048576 kB
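
As a quick sanity check while debugging, something like the following
throwaway sketch (not part of the fix, and not something shipped with this
report) can be used to watch for the wraparound: once HugePages_Rsvd
exceeds HugePages_Total, the reserve counter has effectively gone negative
in an unsigned field.

/* Throwaway sketch: read HugePages_Total and HugePages_Rsvd from
 * /proc/meminfo and report when the reserve counter has wrapped
 * around (i.e. gone "negative" in an unsigned field), as seen above. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}

	unsigned long long total = 0, rsvd = 0;
	char line[256];
	while (fgets(line, sizeof(line), f)) {
		sscanf(line, "HugePages_Total: %llu", &total);
		sscanf(line, "HugePages_Rsvd: %llu", &rsvd);
	}
	fclose(f);

	printf("HugePages_Total=%llu HugePages_Rsvd=%llu\n", total, rsvd);
	if (rsvd > total)
		printf("reserve count underflow: Rsvd is -%llu as a signed value\n",
		       0ULL - rsvd);
	return 0;
}
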
After instrumenting the kernel, I was able to confirm the underflow and
then noticed this upstream fix. The data also showed that the newer
QEMU branch makes more calls to memfd_pin_folios() during GPU VFIO
setup, which is what triggered the bug in the kernel; I never saw this
bug with the older QEMU branch we’ve been using for quite some time for
Grace virtualization. After applying the fix, I no longer see the bad
huge page metadata and the vCMDQ feature works properly with the 4k
host kernel.
Lore discussion:
https://lkml.kernel.org/r/[email protected]
Upstream SHA: eb920662230f ("mm/hugetlb: don't crash when allocating a
folio if there are no resv")
This commit cherry-picks cleanly onto 24.04_linux-nvidia-6.14-next.
Testing:
GPU passthrough (PT) on a 4k host kernel with more huge pages than the VM
requires (e.g. 32 1G hugepages for a 16G VM)
QEMU: https://github.com/nvmochs/QEMU/tree/smmuv3-accel-07212025_egm
qemu-system-aarch64 \
  -object iommufd,id=iommufd0 \
  -machine hmat=on \
  -machine virt,accel=kvm,gic-version=3,ras=on,highmem-mmio-size=512G \
  -cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
  -object memory-backend-file,size=8G,id=m0,mem-path=/hugepages/,prealloc=on,share=off \
  -object memory-backend-file,size=8G,id=m1,mem-path=/hugepages/,prealloc=on,share=off \
  -numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
  -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
  -numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
  -device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
  -device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.1,accel=on,cmdqv=on \
  -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,io-reserve=0 \
  -device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.port1,rombar=0,id=dev0,iommufd=iommufd0 \
  -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
  -object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
  -object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
  -object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
  -object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
  -object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
  -object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
  -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
  -bios /usr/share/AAVMF/AAVMF_CODE.fd \
  -device nvme,drive=nvme0,serial=deadbeaf1,bus=pcie.0 \
  -drive file=guest.qcow2,index=0,media=disk,format=qcow2,if=none,id=nvme0 \
  -device e1000,romfile=/usr/local/share/qemu/efi-e1000.rom,netdev=net0,bus=pcie.0 \
  -netdev user,id=net0,hostfwd=tcp::5558-:22,hostfwd=tcp::5586-:5586
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.14/+bug/2119577/+subscriptions