Public bug reported: The 5.4.0 series of the Ubuntu kernel has missed a patch which resolves a null dereference:
[104602.951260] BUG: kernel NULL pointer dereference, address: 0000000000000034
[104602.951263] #PF: supervisor write access in kernel mode
[104602.951264] #PF: error_code(0x0002) - not-present page
[104602.951266] PGD 0 P4D 0
[104602.951269] Oops: 0002 [#1] SMP PTI
[104602.951272] CPU: 6 PID: 176572 Comm: ThreadPoolForeg Kdump: loaded Tainted: P OE 5.4.0-117-generic #132-Ubuntu
[104602.951273] Hardware name: System manufacturer System Product Name/P8P67 LE, BIOS 3801 09/12/2013
[104602.951278] RIP: 0010:unlink_anon_vmas+0x3e/0x1b0
[104602.951280] Code: 54 53 48 83 ec 08 48 8b 47 78 48 89 7d d0 48 8b 30 49 39 c5 0f 84 5e 01 00 00 4c 8d 78 f0 4c 8d 66 f0 31 db eb 21 49 8b 46 38 <83> 68 34 01 49 8b 44 24 10 49 8d 54 24 10 4d 89 e7 48 83 e8 10 49
[104602.951281] RSP: 0018:ffffc00908703bd8 EFLAGS: 00010246
[104602.951283] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[104602.951284] RDX: 0000000000000000 RSI: ffff99e8e5815c48 RDI: 0000000000000000
[104602.951286] RBP: ffffc00908703c08 R08: 0000000000000001 R09: ffffffffae665f00
[104602.951287] R10: ffff99eb808bd6c0 R11: 0000000000000001 R12: ffff99eb0fda27b8
[104602.951288] R13: ffff99eb0fda27c8 R14: ffff99e8e5815c08 R15: ffff99e7ee7af6c0
[104602.951290] FS: 0000000000000000(0000) GS:ffff99eb8eb80000(0000) knlGS:0000000000000000
[104602.951291] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[104602.951293] CR2: 0000000000000034 CR3: 000000037ae0a005 CR4: 00000000000606e0
[104602.951294] Call Trace:
[104602.951299] free_pgtables+0x93/0xf0
[104602.951301] exit_mmap+0xc7/0x1b0
[104602.951304] mmput+0x5d/0x130
[104602.951306] do_exit+0x31a/0xaf0
[104602.951309] do_group_exit+0x47/0xb0
[104602.951312] get_signal+0x169/0x890
[104602.951315] do_signal+0x34/0x6c0
[104602.951318] ? _copy_from_user+0x3e/0x60
[104602.951321] ? __x64_sys_futex+0x13f/0x170
[104602.951324] exit_to_usermode_loop+0xbf/0x160
[104602.951327] do_syscall_64+0x163/0x190
[104602.951330] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[104602.951332] RIP: 0033:0x7f58d1db27d1
[104602.951335] Code: Bad RIP value.
[104602.951336] RSP: 002b:00007f58c6987370 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[104602.951338] RAX: fffffffffffffdfc RBX: 00007f58c69875e8 RCX: 00007f58d1db27d1
[104602.951339] RDX: 0000000000000000 RSI: 0000000000000089 RDI: 00007f58c6987600
[104602.951340] RBP: 00007f58c69875d8 R08: 0000000000000000 R09: 00000000ffffffff
[104602.951341] R10: 00007f58c6987460 R11: 0000000000000246 R12: 00007f58c69875fc
[104602.951343] R13: 00007f58c69875b0 R14: 00007f58c6987600 R15: 00007f58c69873c0

The patch was posted to the Linux kernel mailing list back in 2021:

https://lore.kernel.org/linux-mm/20210224200449.hkU5GTEiH%25akpm@linux-foundation.org/

The defect is:

Date: Wed, 24 Feb 2021 12:04:49 -0800 [thread overview]
Message-ID: <20210224200449.hku5gteih%a...@linux-foundation.org> (raw)
In-Reply-To: <20210224115824.1e289a6895087f10c41dd...@linux-foundation.org>

From: Li Xinhai <lixinhai....@gmail.com>
Subject: mm: rmap: explicitly reset vma->anon_vma in unlink_anon_vmas()

In case the vma will continue to be used after unlink its relevant
anon_vma, we need to reset the vma->anon_vma pointer to NULL. So, later
when fault happen within this vma again, a new anon_vma will be prepared.
By this way, the vma will only be checked for reverse mapping of pages
which been fault in after the unlink_anon_vmas call.

Currently, the mremap with MREMAP_DONTUNMAP scenario will continue use the
vma after moved its page table entries to a new vma. For other scenarios,
the vma itself will be freed after call unlink_anon_vmas.

Link: https://lkml.kernel.org/r/20210119075126.3513154-1-lixinhai....@gmail.com
Signed-off-by: Li Xinhai <lixinhai....@gmail.com>
Cc: Andrea Arcangeli <aarca...@redhat.com>
Cc: Brian Geffon <bgef...@google.com>
Cc: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
Cc: Lokesh Gidra <lokeshgi...@google.com>
Cc: Minchan Kim <minc...@kernel.org>
Cc: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
---

 mm/rmap.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--- a/mm/rmap.c~mm-rmap-explicitly-reset-vma-anon_vma-in-unlink_anon_vmas
+++ a/mm/rmap.c
@@ -413,8 +413,15 @@ void unlink_anon_vmas(struct vm_area_str
 		list_del(&avc->same_vma);
 		anon_vma_chain_free(avc);
 	}
-	if (vma->anon_vma)
+	if (vma->anon_vma) {
 		vma->anon_vma->degree--;
+
+		/*
+		 * vma would still be needed after unlink, and anon_vma will be prepared
+		 * when handle fault.
+		 */
+		vma->anon_vma = NULL;
+	}
 	unlock_anon_vma_root(root);
 
 	/*

The latest Linux 5.4 kernel package that Ubuntu currently ships still has the unpatched code:

	if (vma->anon_vma)
		vma->anon_vma->degree--;
	unlock_anon_vma_root(root);

This is the 3rd time I've encountered the crash.

root@lazarus:/var/crash/202206141315# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1978719

Title:
  Ubuntu 5.4.0-117.132-generic 5.4.189 has BUG: kernel NULL pointer dereference, address: 0000000000000034

Status in linux package in Ubuntu:
  New
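
As a cross-check of the oops in the report (this is not part of the original report): the byte marked <83> in the "Code:" line is the first byte of the faulting instruction, so the four bytes 83 68 34 01 can be decoded by hand. The sketch below does that in C, assuming the standard x86-64 encoding in which opcode 0x83 with ModRM.reg = 5 is SUB r/m32, imm8, and ModRM mod = 01 means "base register plus an 8-bit displacement". The names decode_grp1_disp8 and effective_address are hypothetical helpers made up for this illustration, and the decoder handles only this one narrow encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Result of decoding one "opcode 0x83, ModRM, disp8, imm8" instruction. */
struct decoded {
	unsigned op_ext;   /* ModRM.reg: selects the operation (5 = SUB) */
	unsigned base_reg; /* ModRM.rm: base register (0 = RAX, no REX.B) */
	int8_t disp;       /* signed 8-bit displacement */
	int8_t imm;        /* signed 8-bit immediate */
};

/*
 * Hypothetical helper: decode only the narrow case seen in the oops,
 * i.e. opcode 0x83 with mod=01 (base + disp8) and no SIB byte.
 * Returns 0 on success, -1 for any other encoding.
 */
static int decode_grp1_disp8(const uint8_t *insn, struct decoded *out)
{
	assert(insn != NULL && out != NULL);

	if (insn[0] != 0x83)
		return -1;

	uint8_t modrm = insn[1];
	unsigned mod = modrm >> 6;
	unsigned rm = modrm & 7;

	/* rm == 4 would mean a SIB byte follows, which is not handled here. */
	if (mod != 1 || rm == 4)
		return -1;

	out->op_ext = (modrm >> 3) & 7;
	out->base_reg = rm;
	out->disp = (int8_t)insn[2];
	out->imm = (int8_t)insn[3];
	return 0;
}

/* Effective address = base register value + sign-extended displacement. */
static uint64_t effective_address(uint64_t base_value, const struct decoded *d)
{
	return base_value + (uint64_t)(int64_t)d->disp;
}
```

Feeding in the bytes { 0x83, 0x68, 0x34, 0x01 } gives op_ext 5 (SUB), base register 0 (RAX), displacement 0x34 and immediate 1, i.e. subl $0x1,0x34(%rax). With RAX = 0 from the register dump, effective_address(0, &d) is 0x34, which matches both CR2 and the reported fault address, and a write that subtracts 1 is consistent with one of the degree-- statements in unlink_anon_vmas(). The decode alone does not show which pointer was NULL.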