Hi David,

Thanks for the link. I think that is the most plausible explanation I have seen so far.
The only problem is, if we look at the patch:

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7a3ab3427369..24001112c323 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
 		if (tun)
 			xdp_rxq_info_unreg(&tfile->xdp_rxq);
 		ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
-		sock_put(&tfile->sk);
 	}
 }
 
@@ -702,6 +701,9 @@ static void tun_detach(struct tun_file *tfile, bool clean)
 	if (dev)
 		netdev_state_change(dev);
 	rtnl_unlock();
+
+	if (clean)
+		sock_put(&tfile->sk);
 }
 
 static void tun_detach_all(struct net_device *dev)

It moves the final sock_put(&tfile->sk) from the end of __tun_detach() to tun_detach(), after the call to netdev_state_change(dev).

685 static void __tun_detach(struct tun_file *tfile, bool clean)
686 {
...
725 	if (clean) {
726 		if (tun && tun->numqueues == 0 && tun->numdisabled == 0) {
727 			netif_carrier_off(tun->dev);
728
729 			if (!(tun->flags & IFF_PERSIST) &&
730 			    tun->dev->reg_state == NETREG_REGISTERED)
731 				unregister_netdevice(tun->dev);
732 		}
733 		if (tun)
734 			xdp_rxq_info_unreg(&tfile->xdp_rxq);
735 		ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
736 		sock_put(&tfile->sk);
737 	}
738 }
739
740 static void tun_detach(struct tun_file *tfile, bool clean)
741 {
742 	struct tun_struct *tun;
743 	struct net_device *dev;
744
745 	rtnl_lock();
746 	tun = rtnl_dereference(tfile->tun);
747 	dev = tun ? tun->dev : NULL;
748 	__tun_detach(tfile, clean);
749 	if (dev)
750 		netdev_state_change(dev);
751 	rtnl_unlock();
752 }

This more or less makes sense, but if you look at the call trace in the bug:

...
[455151.894444] notifier_call_chain+0x55/0x80
...
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
...

$ eu-addr2line -ifae ./vmlinux-5.4.0-88-generic __tun_detach+0x421
0xffffffff8178b991
unregister_netdevice inlined at /build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5 in __tun_detach
/build/linux-q2DMsi/linux-5.4.0/include/linux/netdevice.h:2677:1
__tun_detach
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5

We get to notifier_call_chain() not from netdev_state_change(), as mentioned in the bug report, but from unregister_netdevice() at line 731. This means we haven't yet run sock_put(&tfile->sk) at line 736. Puzzling, isn't it?

There are calls to sock_put(&tfile->sk) earlier in __tun_detach(). Maybe one of them already freed the socket, which would explain the behaviour. But then, when we run sock_put(&tfile->sk) again at line 736, wouldn't we run into use-after-free territory when we try to free the socket a second time?

/* Ungrab socket and destroy it, if it was the last reference. */
static inline void sock_put(struct sock *sk)
{
	if (refcount_dec_and_test(&sk->sk_refcnt))
		sk_free(sk);
}

I have a second call trace that I have been debugging along with the one in the description. I'll add it in the next comment.

I'll keep looking into the patch anyway. I have been running the syzkaller reproducer in a VM for the last few hours, but I haven't reproduced it yet.

https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
https://groups.google.com/g/syzkaller-bugs/c/C0r0nwrvBME/m/MxQ5Z7_VBAAJ
https://syzkaller.appspot.com/x/repro.c?x=11bd3a10f00000

Thanks,
Matthew

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1962485

Title: Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

Status in linux package in Ubuntu: Confirmed

Bug description:
Hi, I am running the OpenStack Xena release on Ubuntu Focal. Today my compute node running Ubuntu Focal crashed due to a kernel fault, and a dump has been generated in /var/crash/.

Below is the kernel trace from the crash dump.

[455151.890114] general protection fault: 0000 [#1] SMP NOPTI
[455151.890285] CPU: 43 PID: 83232 Comm: qemu-system-x86 Kdump: loaded Tainted: G OE 5.4.0-88-generic #99-Ubuntu
[455151.890612] Hardware name: Dell Inc. PowerEdge R6525/XXXXX, BIOS 2.5.6 10/06/2021
[455151.890842] RIP: 0010:count_subheaders.part.0+0x26/0x60
[455151.890998] Code: 00 00 00 90 0f 1f 44 00 00 48 83 3f 00 74 4d 55 48 89 e5 41 55 45 31 ed 41 54 45 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 74 23 <48> 83 3f 00 74 25 e8 cf ff ff ff 41 01 c5 48 83 c3 40 48 83 3b 00
[455151.891552] RSP: 0018:ffffa6b477487b88 EFLAGS: 00010202
[455151.891707] RAX: 0000000000000000 RBX: ffff9387c594f280 RCX: 0000000000000000
[455151.891918] RDX: 0000000000000060 RSI: ffff9390702a72c0 RDI: 0314a8c0f1b16f3e
[455151.892130] RBP: ffffa6b477487ba0 R08: 0000000000000000 R09: ffffffffbc6ed7f0
[455151.892341] R10: ffffa6b477487cd0 R11: 0000000000000001 R12: 0000000000000000
[455151.892552] R13: 0000000000000000 R14: ffff9391e5684000 R15: ffffffffbd5f9880
[455151.892767] FS: 00007f69950c75c0(0000) GS:ffff9391feac0000(0000) knlGS:0000000000000000
[455151.893016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[455151.893207] CR2: 00007f61e9e45000 CR3: 0000017c54afa000 CR4: 0000000000340ee0
[455151.893434] Call Trace:
[455151.893514] count_subheaders.part.0+0x31/0x60
[455151.893646] unregister_sysctl_table+0x30/0x90
[455151.893781] unregister_net_sysctl_table+0xe/0x10
[455151.893922] __devinet_sysctl_unregister.isra.0+0x2c/0x60
[455151.894082] devinet_sysctl_unregister+0x29/0x40
[455151.894220] inetdev_event+0x1e8/0x560
[455151.894334] ? skb_dequeue+0x5f/0x70
[455151.894444] notifier_call_chain+0x55/0x80
[455151.894565] ? notifier_call_chain+0x55/0x80
[455151.894693] raw_notifier_call_chain+0x16/0x20
[455151.894829] call_netdevice_notifiers_info+0x2e/0x60
[455151.894983] ? tun_show_owner+0x60/0x60
[455151.895098] rollback_registered_many+0x36e/0x520
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
[455151.895495] tun_chr_close+0x3a/0x70
[455151.895605] __fput+0xcc/0x260
[455151.895698] ____fput+0xe/0x10
[455151.895792] task_work_run+0x8f/0xb0
[455151.895903] exit_to_usermode_loop+0x131/0x160
[455151.896036] do_syscall_64+0x163/0x190
[455151.896150] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[455151.896302] RIP: 0033:0x7f69965ba3fb
[455151.896410] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 f3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 31 fc ff ff 8b 44
[455151.896975] RSP: 002b:00007ffdff14b350 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[455151.897201] RAX: 0000000000000000 RBX: 0000557fe0875e50 RCX: 00007f69965ba3fb
[455151.897412] RDX: 0000557fe0748f40 RSI: 0000000000000001 RDI: 000000000000002b
[455151.897637] RBP: 0000557fe0887460 R08: 0000000000000000 R09: 0000000000000000
[455151.904390] R10: 0000000000000032 R11: 0000000000000293 R12: 0000557fe0875e50
[455151.911165] R13: 0000000000000001 R14: 0000557fe09efc10 R15: 0000557fe0747900

I didn't find any documented details on kernel 5.4 for this bug. I have uploaded the logs via the ubuntu-bug linux command.

# uname -a
Linux kvm03-a1-r01-khi04.rapid.pk 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# cat /proc/version_signature
Ubuntu 5.4.0-88.99-generic 5.4.140

I am using a Dell R6525 with EPYC 7532 CPUs. Let me know if more information is needed.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-88-generic 5.4.0-88.99
ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
Uname: Linux 5.4.0-88-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Feb 28 17:38 seq
 crw-rw---- 1 root audio 116, 33 Feb 28 17:38 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CasperMD5CheckResult: pass
Date: Mon Feb 28 21:20:20 2022
InstallationDate: Installed on 2021-07-29 (214 days ago)
InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64 (20210201.2)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Dell Inc. PowerEdge R6525
PciMultimedia:
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-88-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro iommu=pt intel_iommu=on swapaccount=1 vga=normal nofb nomodeset video=vesafb:off i915.modeset=0 crashkernel=512M
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-88-generic N/A
 linux-backports-modules-5.4.0-88-generic N/A
 linux-firmware 1.187.19
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/06/2021
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.5.6
dmi.board.name: 0GK70M
dmi.board.vendor: Dell Inc.
dmi.board.version: A10
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.5.6:bd10/06/2021:svnDellInc.:pnPowerEdgeR6525:pvr:rvnDellInc.:rn0GK70M:rvrA10:cvnDellInc.:ct23:cvr:
dmi.product.family: PowerEdge
dmi.product.name: PowerEdge R6525
dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6525
dmi.sys.vendor: Dell Inc.
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Feb 28 17:38 seq
 crw-rw---- 1 root audio 116, 33 Feb 28 17:38 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CasperMD5CheckResult: pass
DistroRelease: Ubuntu 20.04
InstallationDate: Installed on 2021-07-29 (214 days ago)
InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64 (20210201.2)
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Dell Inc. PowerEdge R6525
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-88-generic root=/dev/mapper/ubuntu--vg-ubuntu--lv ro iommu=pt intel_iommu=on swapaccount=1 vga=normal nofb nomodeset video=vesafb:off i915.modeset=0 crashkernel=512M
ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-88-generic N/A
 linux-backports-modules-5.4.0-88-generic N/A
 linux-firmware 1.187.19
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.4.0-88-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: True
dmi.bios.date: 10/06/2021
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.5.6
dmi.board.name: 0GK70M
dmi.board.vendor: Dell Inc.
dmi.board.version: A10
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.5.6:bd10/06/2021:svnDellInc.:pnPowerEdgeR6525:pvr:rvnDellInc.:rn0GK70M:rvrA10:cvnDellInc.:ct23:cvr:
dmi.product.family: PowerEdge
dmi.product.name: PowerEdge R6525
dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6525
dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1962485/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp