Public bug reported:

For reference, here is the stack of systemd-udevd seen in the hang:

[ 1558.214013] INFO: task systemd-udevd:1778 blocked for more than 120 seconds.
[ 1558.214318] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1558.214556] systemd-udevd D 00003fff8dbdf7a0 0 1778 1 0x00040000
[ 1558.214637] Call Trace:
[ 1558.214673] [c000000004ad3790] [c0000000007aac20] schedule_timeout+0x180/0x2f0 (unreliable)
[ 1558.214779] [c000000004ad3960] [c0000000000158d0] __switch_to+0x200/0x350
[ 1558.214870] [c000000004ad39c0] [c0000000007adbb4] __schedule+0x414/0x9e0
[ 1558.214961] [c000000004ad3a90] [c0000000003b4e54] blk_mq_freeze_queue_wait+0x64/0xd0
[ 1558.215107] [c000000004ad3af0] [d000000034011964] nvme_revalidate_disk+0xd4/0x3a0 [nvme]
[ 1558.215386] [c000000004ad3b90] [c0000000003c2398] rescan_partitions+0x98/0x390
[ 1558.215508] [c000000004ad3c60] [c0000000003bb7ac] __blkdev_reread_part+0x9c/0xd0
[ 1558.215599] [c000000004ad3c90] [c0000000003bb818] blkdev_reread_part+0x38/0x70
[ 1558.215935] [c000000004ad3cc0] [c0000000003bc334] blkdev_ioctl+0x3b4/0xb80
[ 1558.216016] [c000000004ad3d20] [c0000000002cbcd0] block_ioctl+0x70/0x90
[ 1558.216114] [c000000004ad3d40] [c000000000296b38] do_vfs_ioctl+0x458/0x740
[ 1558.216192] [c000000004ad3dd0] [c000000000296ee4] SyS_ioctl+0xc4/0xe0
[ 1558.216275] [c000000004ad3e30] [c00000000000a17c] system_call+0x38/0xb4

It appears that systemd-udevd is triggering every time HTX writes to the boot sector (partition table) of the raw drive, and this is causing the revalidate calls which expose the issue with the block driver mq freeze. With a partition table on each drive, HTX will no longer be writing the partition table and no longer triggering systemd to re-read the partition table and try to freeze I/O.

The fix for this is provided by the following upstream commit:

  966d2b0 percpu-refcount: fix reference leak during percpu-atomic transition

which needs to be pulled into 16.04 (as well as newer releases).
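
For anyone trying to reproduce the hang without HTX: the trace enters the kernel through SyS_ioctl -> blkdev_ioctl -> blkdev_reread_part, which is the path taken by the BLKRRPART ("re-read partition table") ioctl. The following is only a minimal sketch of issuing that ioctl by hand, not what udev or HTX literally run. The function names are made up for illustration, and the device path in the usage note is an assumption. BLKRRPART is _IO(0x12, 95); PowerPC encodes the "no data" direction differently from x86, so the request number is computed per architecture.

```python
import fcntl
import os
import platform


def blkrrpart_request(machine=None):
    """Return the BLKRRPART ioctl number, _IO(0x12, 95), for an architecture.

    `machine` defaults to platform.machine(), e.g. "x86_64" or "ppc64le".
    """
    machine = machine or platform.machine()
    # powerpc (like mips, alpha and sparc) defines _IOC_NONE as 1 with the
    # direction field starting at bit 29; most other architectures use 0.
    if machine.startswith(("ppc", "powerpc", "mips", "alpha", "sparc")):
        direction = 1 << 29
    else:
        direction = 0
    return direction | (0x12 << 8) | 95


def reread_partition_table(path):
    """Ask the kernel to re-read the partition table of the disk at `path`.

    Needs root (CAP_SYS_ADMIN). On a kernel affected by this bug the ioctl
    may block in blk_mq_freeze_queue_wait, as in the trace above.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, blkrrpart_request())
    finally:
        os.close(fd)
```

Calling reread_partition_table("/dev/nvme0n1") as root (the device name is an example) while I/O is in flight on the drive should exercise the same revalidate path. It raises OSError if the kernel rejects the request, for instance when a partition on the disk is in use.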
** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New

** Tags: architecture-ppc64le bugnameltc-148242 severity-critical targetmilestone-inin16042

** Tags added: architecture-ppc64le bugnameltc-148242 severity-critical targetmilestone-inin16042

** Changed in: ubuntu
     Assignee: (unassigned) => Taco Screen team (taco-screen-team)

** Package changed: ubuntu => linux (Ubuntu)

-- 
https://bugs.launchpad.net/bugs/1662673

Title:
  systemd-udevd hung in blk_mq_freeze_queue_wait testing unpartitioned NVMe drive

Status in linux package in Ubuntu:
  New