** Changed in: linux-oem-6.11 (Ubuntu Noble)
       Status: In Progress => Fix Committed
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oem-6.11 in Ubuntu.
https://bugs.launchpad.net/bugs/2086668

Title:
  NVIDIA WANR_ON call trace right after power on or resumed on 6.11 kernel

Status in HWE Next:
  New
Status in linux-oem-6.11 package in Ubuntu:
  Invalid
Status in linux-oem-6.11 source package in Noble:
  Fix Committed

Bug description:
  [Impact]
  A follow_pte() warning appears in dmesg when the NVIDIA driver is used
  with the 6.11 kernel.

  Aug 09 09:20:42 ubuntu-202407-34200 kernel: WARNING: CPU: 0 PID: 2918 at include/linux/rwsem.h:80 follow_pte+0x220/0x230

  [Fix]
  This occurs during suspend, when the NVIDIA driver function
  'nv_revoke_gpu_mappings_locked()' calls the kernel function
  'unmap_mapping_range()', which eventually ends up calling 'follow_pte()'.
  'follow_pte()' uses the assertion 'mmap_assert_locked' to check that
  'mmap_lock' is held. The assertion fails, so a warning call trace is
  printed (no functional issue, just extra output in dmesg). This happens
  in kernel versions v6.10 through v6.11.

  This is a kernel bug, not an NVIDIA driver bug, and it has been
  discussed on the kernel mailing list:
  https://lore.kernel.org/linux-mm/20240712080414.ga47...@google.com/T/#u

  A patch series addresses the issue and replaces follow_pte():
  https://lore.kernel.org/linux-mm/20240809160909.1023470-1-pet...@redhat.com/

  We cherry-pick the new functions and, at the same time, keep
  follow_pte() for compatibility with older drivers.

  b1b46751671b mm: fix follow_pfnmap API lockdep assert
  75182022a043 mm/x86: support large pfn mappings
  cbea8536d933 mm/x86/pat: use the new follow_pfnmap API
  6da8e9634bb7 mm: new follow_pfnmap API
  6857be5fecae mm: introduce ARCH_SUPPORTS_HUGE_PFNMAP and special bits to pmd/pud

  [Test]
  1. Boot the machine with the 6.11 kernel and the NVIDIA driver.
  2. Suspend and resume, then check dmesg.
  3. There should be no NVIDIA call trace.

  [Where problems could occur]
  Only one patch changes existing code, replacing follow_pfn() with
  follow_pfnmap():
  cbea8536d933 ("mm/x86/pat: use the new follow_pfnmap API")
  The changes map one-to-one onto the old calls and should behave
  identically.

To manage notifications about this bug go to:
https://bugs.launchpad.net/hwe-next/+bug/2086668/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp