Hi @juergh - Kinga and I have aligned on this. We want to get this fix
backported to the Noble 6.8 kernel for the SRU (Oct 28 release). Can you
support this?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2079038

Title:
  [VROC] [Ub 24.04.0/1] Kernel bug and reboot hang during recovery -
  missing patch

Status in linux package in Ubuntu:
  New

Bug description:
  Steps to reproduce:
  1. Create RAID array (example for IMSM):
  mdadm -CR imsm -e imsm -n2 /dev/nvme[01]n1
  mdadm -CR vol1 -l1 -n2 /dev/nvme[01]n1

  2. Hot-remove one of the drives.
  3. Insert drive back - recovery should start.
  4. Reboot platform

  Expected: reboot completes successfully and recovery continues afterwards.
  Actual: reboot hangs and a call trace appears.
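
  The hang only shows up when the reboot in step 4 happens while the
  recovery from step 3 is still running, so it can help to confirm that
  state first. The following is a minimal sketch, not part of the original
  report: md_recovering is a hypothetical helper name, and it assumes the
  usual /proc/mdstat layout in which a progress line containing
  "recovery =" or "resync =" follows the array line.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): print the name of every
# md array that currently shows a recovery or resync progress line.
# Reads /proc/mdstat by default; another file can be passed for testing.
md_recovering() {
    mdstat="${1:-/proc/mdstat}"
    awk '
        # Array header line, e.g. "md126 : active raid1 nvme0n1[1] nvme1n1[0]"
        /^md[^ ]* :/ { dev = $1 }
        # Progress line, e.g. "[>....]  recovery =  0.5% (...) finish=..."
        /(recovery|resync) *=/ { print dev }
    ' "$mdstat"
}
```

  Calling md_recovering with no argument just before step 4 should print
  the volume name (md126 in the log below) if recovery is still in
  progress, and nothing if the array is idle.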

  
  [  776.416504] (sd-umoun[3359]: Failed to unmount
  /run/shutdown/mounts/bd36b757b23bbc6f: Device or resource busy

  [  776.445568] shutdown[1]: Could not stop MD /dev/md126: Device or
  resource busy

  mdadm: Cannot get exclusive access to /dev/md126:Perhaps a running
  process, mounted filesystem or active volume group?

  mdadm: Cannot stop container /dev/md127: member md126 still active

  mdadm: Cannot get exclusive access to /dev/md126:Perhaps a running
  process, mounted filesystem or active volume group?

  mdadm: Cannot stop container /dev/md127: member md126 still active

  [  784.636276] (sd-exec-[3360]:
  /usr/lib/systemd/system-shutdown/mdadm.finalrd failed with exit status
  1.

  [  784.647309] shutdown[1]: Unable to finalize remaining file systems,
  MD devices, ignoring.

  [  986.549746] INFO: task shutdown:1 blocked for more than 122
  seconds.

  [  986.556158]       Not tainted 6.8.0-31-generic #31-Ubuntu

  [  986.561586] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
  disables this message.

  [  986.569440] task:shutdown        state:D stack:0     pid:1
  tgid:1     ppid:0      flags:0x00004002

  [  986.578769] Call Trace:

  [  986.581236]  <TASK>

  [  986.583352]  __schedule+0x27c/0x6b0

  [  986.586864]  schedule+0x33/0x110

  [  986.590111]  stop_sync_thread+0x135/0x1b0

  [  986.594143]  ? __pfx_autoremove_wake_function+0x10/0x10

  [  986.599388]  __md_stop_writes+0x19/0xf0

  [  986.603240]  md_notify_reboot+0x93/0x160

  [  986.607186]  notifier_call_chain+0x5e/0xe0

  [  986.611301]  blocking_notifier_call_chain+0x41/0x70

  [  986.616201]  kernel_restart+0x21/0xa0

  [  986.619881]  __do_sys_reboot+0x156/0x250

  [  986.623824]  __x64_sys_reboot+0x1b/0x30

  [  986.627680]  x64_sys_call+0x223c/0x25c0

  [  986.631535]  do_syscall_64+0x7f/0x180

  [  986.635218]  ? irqentry_exit+0x43/0x50

  [  986.638990]  ? exc_page_fault+0x94/0x1b0

  [  986.642932]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

  [  986.648003] RIP: 0033:0x7da6531dea07

  [  986.651623] RSP: 002b:00007fff2babce18 EFLAGS: 00000246 ORIG_RAX:
  00000000000000a9

  [  986.659211] RAX: ffffffffffffffda RBX: 0000000000000003 RCX:
  00007da6531dea07

  [  986.666362] RDX: 0000000001234567 RSI: 0000000028121969 RDI:
  00000000fee1dead

  [  986.673509] RBP: 00007fff2babd050 R08: 0000000000000069 R09:
  0000000000000000

  [  986.680662] R10: 0000000000000000 R11: 0000000000000246 R12:
  0000000000000001

  [  986.687808] R13: 0000000000000000 R14: 0000000000000000 R15:
  0000000001234567

  [  986.694956]  </TASK>

  [  986.697665] Kernel panic - not syncing: hung_task: blocked tasks

  [  986.703732] CPU: 0 PID: 982 Comm: khungtaskd Not tainted
  6.8.0-31-generic #31-Ubuntu

  [  986.711510] Hardware name: Intel Corporation WilsonCity/WilsonCity,
  BIOS WLYDCRB1.SYS.0020.P84.2103030140 03/03/2021

  [  986.722061] Call Trace:

  [  986.724529]  <TASK>

  [  986.726639]  dump_stack_lvl+0x48/0x70

  [  986.730326]  dump_stack+0x10/0x20

  [  986.733661]  panic+0x35f/0x3c0

  [  986.736738]  check_hung_uninterruptible_tasks+0x279/0x320

  [  986.742157]  ? __pfx_watchdog+0x10/0x10

  [  986.746011]  watchdog+0xad/0xb0

  [  986.749164]  kthread+0xef/0x120

  [  986.752317]  ? __pfx_kthread+0x10/0x10

  [  986.756080]  ret_from_fork+0x44/0x70

  [  986.760081]  ? __pfx_kthread+0x10/0x10

  [  986.764201]  ret_from_fork_asm+0x1b/0x30

  [  986.768473]  </TASK>

  [  986.771197] Kernel Offset: 0x1c00000 from 0xffffffff81000000
  (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

  [  986.913304] ---[ end Kernel panic - not syncing: hung_task: blocked
  tasks ]---

  
  This issue is fixed upstream in md. Please apply this patch:
  https://github.com/torvalds/linux/commit/1ddeeb2a058d7b2a58ed9e820396b4ceb715d529

  However, the customer also sees another issue: with this patch applied,
  reboot is delayed (but does not hang). I cannot reproduce it on my
  platform, but I know that a newer kernel (6.11-rc4) fixes all of his
  issues.

  If possible, please rebase to 6.11-rc4
  (https://kernel.ubuntu.com/mainline/v6.11-rc4/), because all of the
  customer's issues are fixed there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2079038/+subscriptions


