Public bug reported:

Livelock between ZFS evict and writeback threads

[Impact]
ZIO pipeline stalls, causing ZFS workloads to hang indefinitely

[Description]
For certain ZFS workloads, hung task timeouts appear in the kernel logs
because zil_commit() stalls. The cause is that zfs_zget() does not check
whether a znode has been marked for deletion before attempting to access it,
so zfs_get_data() ends up in a constant "retry loop" if that znode has
already been unlinked. An example stack trace follows:

[72742.051703] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[72742.070429] mysqld          D    0  5713   2881 0x00000320
[72742.073220] Call Trace:
[72742.075305]  __schedule+0x24e/0x880
[72742.090436]  schedule+0x2c/0x80
[72742.090438]  schedule_preempt_disabled+0xe/0x10
[72742.090441]  __mutex_lock.isra.5+0x276/0x4e0
[72742.090547]  ? dmu_tx_destroy+0x105/0x130 [zfs]
[72742.090555]  __mutex_lock_slowpath+0x13/0x20
[72742.115374]  ? __mutex_lock_slowpath+0x13/0x20
[72742.132266]  mutex_lock+0x2f/0x40
[72742.134207]  zil_commit_impl+0x1b0/0x1b30 [zfs]
[72742.150428]  ? spl_kmem_alloc+0x115/0x180 [spl]
[72742.152622]  ? mutex_lock+0x12/0x40
[72742.154819]  ? zfs_refcount_add_many+0x9a/0x100 [zfs]
[72742.171450]  zil_commit+0xde/0x150 [zfs]
[72742.173687]  zfs_fsync+0x77/0xe0 [zfs]
[72742.175044]  zpl_fsync+0x80/0x110 [zfs]
[72742.191690]  vfs_fsync_range+0x51/0xb0
[72742.193876]  do_fsync+0x3d/0x70
[72742.195126]  SyS_fsync+0x10/0x20
[72742.211059]  do_syscall_64+0x73/0x130
[72742.214078]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

It's possible to hit this issue due to a race between the ZFS evict and
writeback threads. If the z_iput task is trying to evict a znode that's
currently sitting in the writeback thread, both will "livelock" each
other and stall the ZIO pipeline, causing other ZFS operations (such as
zil_commit) to hang indefinitely.

This has been documented and fixed upstream in PR#9583 [0]. We need to
pull two fixes from upstream: the first one fixes the zfs_zget() issue
in the writeback thread, while the second fixes a regression on
O_TMPFILE descriptors caused by the first one.

Upstream patches:
 - Break out of zfs_zget early if unlinked znode (41e1aa2a06f8) [1]
 - Check for unlinked znodes after igrab() (0c46813805f4) [2]
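
To illustrate the retry loop and the kind of check these patches add, here
is a minimal toy model in C. It is a sketch based only on the description
and patch titles above, not the upstream ZFS code: toy_znode, toy_igrab()
and toy_zget() are hypothetical names, and the retry budget exists only so
that the model terminates (the loop described in this report has no such
bound).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a znode; not the real ZFS structure. */
struct toy_znode {
    bool unlinked;  /* znode has been marked for deletion */
    bool evicting;  /* eviction has started, so no new reference can be taken */
};

/* Models igrab(): returns NULL once eviction of the inode has started. */
static struct toy_znode *toy_igrab(struct toy_znode *zp)
{
    return zp->evicting ? NULL : zp;
}

/*
 * Models the lookup retry loop.
 *
 * check_unlinked = false: keep retrying whenever the grab fails.
 * check_unlinked = true:  after a failed grab, give up with ENOENT if the
 *                         znode is already unlinked.
 *
 * In this model the eviction never completes, because the evicting side is
 * assumed to be waiting on the caller. max_retries is an artificial bound
 * added for the demo.
 *
 * Returns 0 on success, ENOENT for an unlinked znode under eviction, or
 * EAGAIN when the retry budget runs out. *retries_out receives the number
 * of retries performed.
 */
static int toy_zget(struct toy_znode *zp, bool check_unlinked,
                    int max_retries, int *retries_out)
{
    int retries = 0;

    assert(zp != NULL && retries_out != NULL);
    for (;;) {
        *retries_out = retries;
        if (toy_igrab(zp) != NULL)
            return 0;           /* reference taken, lookup succeeds */
        if (check_unlinked && zp->unlinked)
            return ENOENT;      /* break out instead of retrying */
        if (retries == max_retries)
            return EAGAIN;      /* demo-only bound reached */
        retries++;
    }
}
```

In this model, a znode that is both unlinked and evicting uses up the whole
retry budget when check_unlinked is false, and returns ENOENT on the first
pass when it is true. A znode that is unlinked but not being evicted still
succeeds, because the check only runs after the grab has failed; this is
meant to mirror the O_TMPFILE case mentioned above.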

[Test Case]
Because this is a race condition, the issue has been hard to reproduce
consistently. The race window between evict() and the ZFS writeback thread is
quite narrow, but users have reported hitting it after some hours of running
LXD-containerized MySQL workloads.

[Regression Potential]
These patches have been tested both in the ZFS test suite and in production 
environments, so the potential for further regressions should be low.
Additional regressions would likely cause issues with the ZFS writeback/commit 
and IO pipeline, so they should be spotted fairly quickly.

[0] https://github.com/zfsonlinux/zfs/pull/9583
[1] https://github.com/zfsonlinux/zfs/commit/41e1aa2a06f8
[2] https://github.com/zfsonlinux/zfs/commit/0c46813805f4

** Affects: zfs-linux (Ubuntu)
     Importance: Undecided
     Assignee: Heitor Alves de Siqueira (halves)
         Status: Confirmed

** Affects: zfs-linux (Debian)
     Importance: Unknown
         Status: Unknown


** Tags: sts

** Bug watch added: Debian Bug tracker #946610
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946610

** Also affects: zfs-linux (Debian) via
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946610
   Importance: Unknown
       Status: Unknown

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1856084

Title:
  Livelock between ZFS evict and writeback threads

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1856084/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
