Hi David,

Thanks for the link. I think that is the most plausible explanation I have
seen so far.

The only problem is, if we look at the patch:

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7a3ab3427369..24001112c323 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
                if (tun)
                        xdp_rxq_info_unreg(&tfile->xdp_rxq);
                ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
-               sock_put(&tfile->sk);
        }
 }
 
@@ -702,6 +701,9 @@ static void tun_detach(struct tun_file *tfile, bool clean)
        if (dev)
                netdev_state_change(dev);
        rtnl_unlock();
+
+       if (clean)
+               sock_put(&tfile->sk);
 }
 
 static void tun_detach_all(struct net_device *dev)

It moves the final sock_put(&tfile->sk) from the end of __tun_detach()
to tun_detach(), after the call to netdev_state_change(dev).

 685 static void __tun_detach(struct tun_file *tfile, bool clean)
 686 {
...
 725     if (clean) {
 726         if (tun && tun->numqueues == 0 && tun->numdisabled == 0) {
 727             netif_carrier_off(tun->dev);
 728 
 729             if (!(tun->flags & IFF_PERSIST) &&
 730                 tun->dev->reg_state == NETREG_REGISTERED)
 731                 unregister_netdevice(tun->dev);
 732         }
 733         if (tun)
 734             xdp_rxq_info_unreg(&tfile->xdp_rxq);
 735         ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
 736         sock_put(&tfile->sk);
 737     }
 738 }
 739 
 740 static void tun_detach(struct tun_file *tfile, bool clean)
 741 {
 742     struct tun_struct *tun;
 743     struct net_device *dev;
 744 
 745     rtnl_lock();
 746     tun = rtnl_dereference(tfile->tun);
 747     dev = tun ? tun->dev : NULL;
 748     __tun_detach(tfile, clean);
 749     if (dev)
 750         netdev_state_change(dev);
 751     rtnl_unlock();
 752 }
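
To make the ordering argument concrete, here is a minimal userspace sketch
in C. It is not kernel code: struct toy_sock, toy_sock_put(), toy_notify()
and the two toy_detach_*() functions are hypothetical stand-ins, and the
sketch assumes, purely for illustration, that the notifier still depends on
something the final put releases. It only shows how moving the put after
the notification changes what the notifier sees.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock: a reference count plus a flag
 * that records when the last reference was dropped. */
struct toy_sock {
    int refcnt;
    bool freed;
};

/* Models sock_put(): drop one reference, "free" on the last one. */
static void toy_sock_put(struct toy_sock *sk)
{
    if (--sk->refcnt == 0)
        sk->freed = true;   /* the real code would call sk_free() here */
}

/* Models a notifier that relies on state owned by the socket (an
 * assumption made for this sketch). Returns true if that state was
 * still alive when the notifier ran. */
static bool toy_notify(const struct toy_sock *sk)
{
    return !sk->freed;
}

/* Ordering before the patch: final put first, notification second. */
static bool toy_detach_old(struct toy_sock *sk)
{
    toy_sock_put(sk);
    return toy_notify(sk);
}

/* Ordering after the patch: notification first, final put second. */
static bool toy_detach_new(struct toy_sock *sk)
{
    bool ok = toy_notify(sk);

    toy_sock_put(sk);
    return ok;
}
```

Starting from a toy_sock holding a single reference, toy_detach_old()
reports that the notifier ran after the free, while toy_detach_new()
reports that it ran before it. In both cases the object ends up freed
exactly once.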
 
This more or less makes sense, but if you look at the call trace in the bug:

...
[455151.894444] notifier_call_chain+0x55/0x80
...
[455151.895239] unregister_netdevice_queue+0x94/0x120
[455151.895383] __tun_detach+0x421/0x430
...

$ eu-addr2line -ifae ./vmlinux-5.4.0-88-generic  __tun_detach+0x421
0xffffffff8178b991
unregister_netdevice inlined at 
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5 in __tun_detach
/build/linux-q2DMsi/linux-5.4.0/include/linux/netdevice.h:2677:1
__tun_detach
/build/linux-q2DMsi/linux-5.4.0/drivers/net/tun.c:731:5

We get to notifier_call_chain() not from netdev_state_change(), as
mentioned in the bug report, but from unregister_netdevice() at line 731.
This means we haven't yet run sock_put(&tfile->sk) at line 736.

Puzzling, isn't it? There are calls to sock_put(&tfile->sk) earlier in
__tun_detach(); maybe one of them already freed the socket, which would
explain the behaviour.

But then, when we run sock_put(&tfile->sk) again at line 736, wouldn't we
run into use-after-free territory when we try to free the socket again?

/* Ungrab socket and destroy it, if it was the last reference. */
static inline void sock_put(struct sock *sk)
{
    if (refcount_dec_and_test(&sk->sk_refcnt))
        sk_free(sk);
}
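
The double-put concern can be sketched in userspace C as well. Again this
is a toy model, not the kernel's refcount_t: struct toy_sock,
toy_sock_put() and toy_run() are hypothetical names, and instead of really
freeing memory the sketch sets a flag and counts any put that arrives
afterwards, so it is safe to run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock. */
struct toy_sock {
    int refcnt;           /* outstanding references */
    bool freed;           /* set when the last reference is dropped */
    int puts_after_free;  /* puts that arrived after the free */
};

/* Models sock_put(), but records a put on a freed object instead of
 * touching freed memory. */
static void toy_sock_put(struct toy_sock *sk)
{
    if (sk->freed) {
        sk->puts_after_free++;
        return;
    }
    if (--sk->refcnt == 0)
        sk->freed = true;   /* the real code would call sk_free() here */
}

/* Apply nputs puts to a socket that starts with nrefs references and
 * return how many of them hit the object after it was freed. */
static int toy_run(int nrefs, int nputs)
{
    struct toy_sock sk = { .refcnt = nrefs, .freed = false,
                           .puts_after_free = 0 };

    for (int i = 0; i < nputs; i++)
        toy_sock_put(&sk);
    return sk.puts_after_free;
}
```

toy_run(2, 2) returns 0 because every put is paired with a reference.
toy_run(1, 2) returns 1: the first put frees the object and the second
lands on freed memory, which in the kernel would be a use-after-free.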

I have a second call trace that I have been debugging along with the one
in the description; I'll add it in the next comment.

I'll keep looking into the patch anyway. I have been running the
syzkaller reproducer in a VM for the last few hours, but I haven't
reproduced the crash yet.

https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
https://groups.google.com/g/syzkaller-bugs/c/C0r0nwrvBME/m/MxQ5Z7_VBAAJ
https://syzkaller.appspot.com/x/repro.c?x=11bd3a10f00000

Thanks,
Matthew

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1962485

Title:
  Kernel Crash [general protection fault: 0000 [#1] SMP NOPTI]

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi,

  I am running the OpenStack Xena release on Ubuntu Focal. Today my compute
  node crashed due to a kernel fault, and a dump was generated in
  /var/crash/. Below is the kernel trace from the crash dump.

  [455151.890114] general protection fault: 0000 [#1] SMP NOPTI
  [455151.890285] CPU: 43 PID: 83232 Comm: qemu-system-x86 Kdump: loaded 
Tainted: G           OE     5.4.0-88-generic #99-Ubuntu
  [455151.890612] Hardware name: Dell Inc. PowerEdge R6525/XXXXX, BIOS 2.5.6 
10/06/2021
  [455151.890842] RIP: 0010:count_subheaders.part.0+0x26/0x60
  [455151.890998] Code: 00 00 00 90 0f 1f 44 00 00 48 83 3f 00 74 4d 55 48 89 
e5 41 55 45 31 ed 41 54 45 31 e4 53 48 89 fb 48 8b 7b 18 48 85 ff 74 23 <48> 83 
3f 00 74 25 e8 cf ff ff ff 41 
  01 c5 48 83 c3 40 48 83 3b 00
  [455151.891552] RSP: 0018:ffffa6b477487b88 EFLAGS: 00010202
  [455151.891707] RAX: 0000000000000000 RBX: ffff9387c594f280 RCX: 
0000000000000000
  [455151.891918] RDX: 0000000000000060 RSI: ffff9390702a72c0 RDI: 
0314a8c0f1b16f3e
  [455151.892130] RBP: ffffa6b477487ba0 R08: 0000000000000000 R09: 
ffffffffbc6ed7f0
  [455151.892341] R10: ffffa6b477487cd0 R11: 0000000000000001 R12: 
0000000000000000
  [455151.892552] R13: 0000000000000000 R14: ffff9391e5684000 R15: 
ffffffffbd5f9880
  [455151.892767] FS:  00007f69950c75c0(0000) GS:ffff9391feac0000(0000) 
knlGS:0000000000000000
  [455151.893016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [455151.893207] CR2: 00007f61e9e45000 CR3: 0000017c54afa000 CR4: 
0000000000340ee0
  [455151.893434] Call Trace:
  [455151.893514]  count_subheaders.part.0+0x31/0x60
  [455151.893646]  unregister_sysctl_table+0x30/0x90
  [455151.893781]  unregister_net_sysctl_table+0xe/0x10
  [455151.893922]  __devinet_sysctl_unregister.isra.0+0x2c/0x60
  [455151.894082]  devinet_sysctl_unregister+0x29/0x40
  [455151.894220]  inetdev_event+0x1e8/0x560
  [455151.894334]  ? skb_dequeue+0x5f/0x70
  [455151.894444]  notifier_call_chain+0x55/0x80
  [455151.894565]  ? notifier_call_chain+0x55/0x80
  [455151.894693]  raw_notifier_call_chain+0x16/0x20
  [455151.894829]  call_netdevice_notifiers_info+0x2e/0x60
  [455151.894983]  ? tun_show_owner+0x60/0x60
  [455151.895098]  rollback_registered_many+0x36e/0x520
  [455151.895239]  unregister_netdevice_queue+0x94/0x120
  [455151.895383]  __tun_detach+0x421/0x430
  [455151.895495]  tun_chr_close+0x3a/0x70
  [455151.895605]  __fput+0xcc/0x260
  [455151.895698]  ____fput+0xe/0x10
  [455151.895792]  task_work_run+0x8f/0xb0
  [455151.895903]  exit_to_usermode_loop+0x131/0x160
  [455151.896036]  do_syscall_64+0x163/0x190
  [455151.896150]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  [455151.896302] RIP: 0033:0x7f69965ba3fb
  [455151.896410] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 
18 89 7c 24 0c e8 f3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 
00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 31 fc ff ff 8b 44
  [455151.896975] RSP: 002b:00007ffdff14b350 EFLAGS: 00000293 ORIG_RAX: 
0000000000000003
  [455151.897201] RAX: 0000000000000000 RBX: 0000557fe0875e50 RCX: 
00007f69965ba3fb
  [455151.897412] RDX: 0000557fe0748f40 RSI: 0000000000000001 RDI: 
000000000000002b
  [455151.897637] RBP: 0000557fe0887460 R08: 0000000000000000 R09: 
0000000000000000
  [455151.904390] R10: 0000000000000032 R11: 0000000000000293 R12: 
0000557fe0875e50
  [455151.911165] R13: 0000000000000001 R14: 0000557fe09efc10 R15: 
0000557fe0747900

  I didn't find any documented details on kernel 5.4 for this bug. I
  have uploaded the logs via ubuntu-bug linux command.

  # uname -a
  Linux kvm03-a1-r01-khi04.rapid.pk 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 
17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

  # cat /proc/version_signature
  Ubuntu 5.4.0-88.99-generic 5.4.140

  I am using Dell R6525 with EPYC 7532 CPUs.

  Let me know if any more information is needed.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-image-5.4.0-88-generic 5.4.0-88.99
  ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
  Uname: Linux 5.4.0-88-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Feb 28 17:38 seq
   crw-rw---- 1 root audio 116, 33 Feb 28 17:38 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.20
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CasperMD5CheckResult: pass
  Date: Mon Feb 28 21:20:20 2022
  InstallationDate: Installed on 2021-07-29 (214 days ago)
  InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64 
(20210201.2)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R6525
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 EFI VGA
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-88-generic 
root=/dev/mapper/ubuntu--vg-ubuntu--lv ro iommu=pt intel_iommu=on swapaccount=1 
vga=normal nofb nomodeset video=vesafb:off i915.modeset=0 crashkernel=512M
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-88-generic N/A
   linux-backports-modules-5.4.0-88-generic  N/A
   linux-firmware                            1.187.19
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 10/06/2021
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.5.6
  dmi.board.name: 0GK70M
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A10
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr2.5.6:bd10/06/2021:svnDellInc.:pnPowerEdgeR6525:pvr:rvnDellInc.:rn0GK70M:rvrA10:cvnDellInc.:ct23:cvr:
  dmi.product.family: PowerEdge
  dmi.product.name: PowerEdge R6525
  dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6525
  dmi.sys.vendor: Dell Inc.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Feb 28 17:38 seq
   crw-rw---- 1 root audio 116, 33 Feb 28 17:38 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu27.20
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CasperMD5CheckResult: pass
  DistroRelease: Ubuntu 20.04
  InstallationDate: Installed on 2021-07-29 (214 days ago)
  InstallationMedia: Ubuntu-Server 20.04.2 LTS "Focal Fossa" - Release amd64 
(20210201.2)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  MachineType: Dell Inc. PowerEdge R6525
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 EFI VGA
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-88-generic 
root=/dev/mapper/ubuntu--vg-ubuntu--lv ro iommu=pt intel_iommu=on swapaccount=1 
vga=normal nofb nomodeset video=vesafb:off i915.modeset=0 crashkernel=512M
  ProcVersionSignature: Ubuntu 5.4.0-88.99-generic 5.4.140
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-88-generic N/A
   linux-backports-modules-5.4.0-88-generic  N/A
   linux-firmware                            1.187.19
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  focal uec-images
  Uname: Linux 5.4.0-88-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 10/06/2021
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.5.6
  dmi.board.name: 0GK70M
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A10
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr2.5.6:bd10/06/2021:svnDellInc.:pnPowerEdgeR6525:pvr:rvnDellInc.:rn0GK70M:rvrA10:cvnDellInc.:ct23:cvr:
  dmi.product.family: PowerEdge
  dmi.product.name: PowerEdge R6525
  dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R6525
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1962485/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
