So it may be that the write-back throttling (wbt) for the underlying devices is getting confused about what the exact throttle rates are for these devices and is somehow getting stuck. It may be worth experimenting by disabling the throttling and seeing if this gets I/O working again.
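If there are several devices in the pool, the experiment can be scripted. The following is only a rough sketch, not a tested fix: it writes a value to the queue/wbt_lat_usec sysfs file of each named device (0 disables wbt, -1 resets it to the default). The function name set_wbt and the SYSFS_BLOCK override are made up for illustration, and sda/sdb are placeholders for whatever devices back your pool. The file may be absent if the kernel was built without writeback throttling, so missing or unwritable files are skipped rather than treated as errors.

```shell
#!/bin/sh
# Sketch only: set wbt_lat_usec for a list of block devices.
# SYSFS_BLOCK is a hypothetical override so the function can be tried
# against a scratch directory; on a real system it is /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# set_wbt VALUE DEV...
#   VALUE 0  disables write-back throttling for each DEV
#   VALUE -1 resets each DEV back to the kernel default
set_wbt() {
    value="$1"
    shift
    for dev in "$@"; do
        knob="$SYSFS_BLOCK/$dev/queue/wbt_lat_usec"
        if [ -w "$knob" ]; then
            # printf rather than echo so that "-1" is never taken as an option
            printf '%s\n' "$value" > "$knob" &&
                echo "$dev: wbt_lat_usec set to $value"
        else
            echo "$dev: $knob missing or not writable, skipped" >&2
        fi
    done
}

# Example (run as root; sda and sdb are placeholder device names):
#   set_wbt 0 sda sdb     # disable wbt
#   set_wbt -1 sda sdb    # back to the default
```

The setting does not persist across reboots, so it would need to be reapplied early in boot for the experiment to cover the whole uptime.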
For example, to disable wbt for a device /dev/sda use:

  echo 0 | sudo tee /sys/block/sda/queue/wbt_lat_usec

and if you need to reset it back to the default:

  echo -1 | sudo tee /sys/block/sda/queue/wbt_lat_usec

Use the appropriate block device name for each of the block devices you have attached. It may even be worth setting wbt_lat_usec to 0 for all the block devices in your pool as early as possible after boot to see if this helps.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1889110

Title:
  zfs pool locks and see "INFO: task txg_sync:4307 blocked for more than 120 seconds. "

Status in zfs-linux package in Ubuntu:
  Incomplete

Bug description:
  ZFS filesystem becomes unresponsive and subsequent NFS shares unresponsive. ESXi sees all paths down.

  See this error 3 times in a row.

  [184383.479511] INFO: task txg_sync:4307 blocked for more than 120 seconds.
  [184383.479565] Tainted: P IO 5.4.0-42-generic #46-Ubuntu
  [184383.479607] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [184383.479655] txg_sync D 0 4307 2 0x80004000
  [184383.479658] Call Trace:
  [184383.479670] __schedule+0x2e3/0x740
  [184383.479673] schedule+0x42/0xb0
  [184383.479676] schedule_timeout+0x152/0x2f0
  [184383.479683] ? __next_timer_interrupt+0xe0/0xe0
  [184383.479685] io_schedule_timeout+0x1e/0x50
  [184383.479697] __cv_timedwait_common+0x15e/0x1c0 [spl]
  [184383.479702] ? wait_woken+0x80/0x80
  [184383.479710] __cv_timedwait_io+0x19/0x20 [spl]
  [184383.479816] zio_wait+0x11b/0x230 [zfs]
  [184383.479905] ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184383.479983] dsl_pool_sync+0xbc/0x410 [zfs]
  [184383.480069] spa_sync_iterate_to_convergence+0xe0/0x1c0 [zfs]
  [184383.480156] spa_sync+0x312/0x5b0 [zfs]
  [184383.480245] txg_sync_thread+0x27a/0x310 [zfs]
  [184383.480334] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
  [184383.480344] thread_generic_wrapper+0x83/0xa0 [spl]
  [184383.480347] kthread+0x104/0x140
  [184383.480356] ? clear_bit+0x20/0x20 [spl]
  [184383.480358] ? kthread_park+0x90/0x90
  [184383.480361] ret_from_fork+0x35/0x40

  Then nfsd hangs as well.

  [184866.787445] INFO: task nfsd:6585 blocked for more than 120 seconds.
  [184866.787485] Tainted: P IO 5.4.0-42-generic #46-Ubuntu
  [184866.787526] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [184866.787573] nfsd D 0 6585 2 0x80004000
  [184866.787575] Call Trace:
  [184866.787578] __schedule+0x2e3/0x740
  [184866.787675] ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184866.787678] schedule+0x42/0xb0
  [184866.787685] cv_wait_common+0x133/0x180 [spl]
  [184866.787688] ? wait_woken+0x80/0x80
  [184866.787695] __cv_wait+0x15/0x20 [spl]
  [184866.787764] dmu_tx_wait+0x1ee/0x210 [zfs]
  [184866.787834] dmu_tx_assign+0x49/0x70 [zfs]
  [184866.787929] zfs_write+0x461/0xd40 [zfs]
  [184866.788025] ? atomic_sub_return.constprop.0+0xd/0x20 [zfs]
  [184866.788033] ? atomic_dec+0xd/0x20 [spl]
  [184866.788116] ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184866.788122] ? __d_obtain_alias+0x36/0x90
  [184866.788217] zpl_write_common_iovec+0xad/0x120 [zfs]
  [184866.788313] zpl_iter_write_common+0x8e/0xb0 [zfs]
  [184866.788409] zpl_iter_write+0x56/0x90 [zfs]
  [184866.788413] do_iter_readv_writev+0x14f/0x1d0
  [184866.788416] do_iter_write+0x84/0x1a0
  [184866.788418] vfs_iter_write+0x19/0x30
  [184866.788442] nfsd_vfs_write+0xe0/0x480 [nfsd]
  [184866.788454] nfsd_write+0x7a/0x160 [nfsd]
  [184866.788458] ? kmem_cache_alloc+0x16d/0x230
  [184866.788472] nfsd3_proc_write+0xc3/0x170 [nfsd]
  [184866.788483] nfsd_dispatch+0xd6/0x220 [nfsd]
  [184866.788508] svc_process_common+0x3af/0x700 [sunrpc]
  [184866.788527] ? svc_sock_secure_port+0x16/0x30 [sunrpc]
  [184866.788538] ? nfsd_svc+0x2d0/0x2d0 [nfsd]
  [184866.788557] svc_process+0xd9/0x110 [sunrpc]
  [184866.788568] nfsd+0xe8/0x150 [nfsd]
  [184866.788570] kthread+0x104/0x140
  [184866.788581] ? nfsd_destroy+0x60/0x60 [nfsd]
  [184866.788583] ? kthread_park+0x90/0x90
  [184866.788585] ret_from_fork+0x35/0x40

  Linux zfs-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  root@zfs-01:/# lsb_release -rd
  Description: Ubuntu 20.04 LTS
  Release: 20.04