I've redeployed the original server and re-tested this with Jammy. I
left the script running for a couple of days and managed to fill up
the ZFS storage without running into the object number issues.
ubuntu@wringer-wooster:~$ zfs version
zfs-2.1.5-1ubuntu6~22.04.6
zfs-kmod-2.1.5-1ubuntu6~22.04.6
ubuntu@wringer-wooster:~$ apt policy zfs-dkms
zfs-dkms:
  Installed: 2.1.5-1ubuntu6~22.04.6
  Candidate: 2.1.5-1ubuntu6~22.04.6
  Version table:
 *** 2.1.5-1ubuntu6~22.04.6 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/universe amd64 Packages
        100 /var/lib/dpkg/status
     2.1.5-1ubuntu6~22.04.5 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 Packages
     2.1.5-1ubuntu6~22.04.4 500
        500 http://archive.ubuntu.com/ubuntu jammy-security/universe amd64 Packages
     2.1.2-1ubuntu3 500
        500 http://archive.ubuntu.com/ubuntu jammy/universe amd64 Packages
ubuntu@wringer-wooster:~$ zpool iostat
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
pooltest    6.86T   128G      0      0  8.70K  4.39K
ubuntu@wringer-wooster:~$ zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
pooltest       6.86T   474K    24K  /pooltest
pooltest/data  6.86T   474K  6.86T  /pooltest/data
** Tags removed: verification-needed verification-needed-jammy
** Tags added: verification-done verification-done-jammy
https://bugs.launchpad.net/bugs/2115683
Title:
ZFS hangs when writing to pools with high object count
Status in zfs-linux package in Ubuntu:
Fix Released
Status in zfs-linux source package in Jammy:
Fix Committed
Status in zfs-linux source package in Noble:
Fix Committed
Bug description:
[Impact]
ZFS pools become completely unresponsive, with in-flight I/O stalling and
kernel spews similar to the one below:
crash> bt -s 835
PID: 835 TASK: ffff9ef78c6d2880 CPU: 1 COMMAND: "txg_quiesce"
#0 [ffffaf7242e53ce8] __schedule+648 at ffffffffbcc01248
#1 [ffffaf7242e53d90] schedule+46 at ffffffffbcc0165e
#2 [ffffaf7242e53db0] cv_wait_common+258 at ffffffffc05224a2 [spl]
#3 [ffffaf7242e53e18] __cv_wait+21 at ffffffffc0522505 [spl]
#4 [ffffaf7242e53e28] txg_quiesce+384 at ffffffffc06f3f70 [zfs]
#5 [ffffaf7242e53e78] txg_quiesce_thread+205 at ffffffffc06f40bd [zfs]
#6 [ffffaf7242e53ec0] thread_generic_wrapper+100 at ffffffffc052d314 [spl]
#7 [ffffaf7242e53ee8] kthread+214 at ffffffffbbb32ce6
#8 [ffffaf7242e53f28] ret_from_fork+70 at ffffffffbba66b76
#9 [ffffaf7242e53f50] ret_from_fork_asm+27 at ffffffffbba052ab
This typically happens when creating new files on ZFS pools whose
object numbers have grown past 2^32. Due to a bug in the object
allocation function dmu_object_alloc_impl(), values beyond the 32-bit
threshold get silently truncated, causing the function to keep trying
to allocate space in chunks that are already full.
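To make the truncation concrete, here is a minimal stand-alone sketch (not
the actual OpenZFS code; the macro definitions are simplified and the
variable names are illustrative) comparing the old P2ALIGN macro with its
typed replacement when the alignment operand is held in a 32-bit unsigned
type:

#include <stdint.h>
#include <stdio.h>

/* Simplified forms of the sysmacros.h helpers. */
#define P2ALIGN(x, align)              ((x) & -(align))
#define P2ALIGN_TYPED(x, align, type)  ((type)(x) & -(type)(align))

int main(void)
{
        uint64_t object = (1ULL << 32) + 12345;  /* object number just past 2^32 */
        uint32_t chunk = 128;                    /* alignment in a 32-bit unsigned type */

        /*
         * -(uint32_t)128 wraps to 0xFFFFFF80 and is zero-extended before
         * the AND, so the upper 32 bits of the object number are cleared.
         */
        printf("P2ALIGN:       %llu\n",
            (unsigned long long)P2ALIGN(object, chunk));

        /* Forcing the arithmetic into a 64-bit type keeps the high bits. */
        printf("P2ALIGN_TYPED: %llu\n",
            (unsigned long long)P2ALIGN_TYPED(object, chunk, uint64_t));

        return 0;
}

Built and run, the first printf reports 12288 (the object number has
collapsed back below 2^32), while the typed variant reports 4294979584 as
expected; the truncated value is what makes the allocator keep scanning
chunks that are already full.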
[Test Plan]
We've been able to consistently reproduce this on ZFS pools with a very high
object number count. Using the attached zfs_write_unified.py script, we can
cause a pool to hang due to this bug within a couple of days. Below is a
high-level summary of the test procedure:
1. Create a ZFS pool with total capacity above 2TB (this is required so that
we can hit the high objnum count):
ubuntu@wringer-wooster:~$ zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
pooltest       6.22T   660G    96K  /pooltest
pooltest/data  6.21T   660G  6.21T  /pooltest/data
2. Run zfs_write_unified.py against the test pool (a rough sketch of the
kind of workload this script drives is included at the end of this section):
ubuntu@wringer-wooster:~# python3 zfs_write_unified.py . $(nproc)
3. Monitor pool throughput with `zpool iostat` or similar, until no
new transactions get synced to disk (or until a kernel spew similar to
the one above starts getting logged).
Once the pool has enough objects, the problem will manifest almost
immediately. It's very easy to verify this fix by running
zfs_write_unified.py on an affected pool, as `zpool iostat` will keep
reporting disk activity instead of stalling.
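The zfs_write_unified.py script itself is attached to the bug and is not
reproduced here. Purely as an illustration of the kind of workload it
drives (several processes creating and writing small files in parallel so
that the pool's object count keeps growing), a rough C stand-in could look
like the following; the target directory and process count are arbitrary:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS      8                   /* roughly $(nproc) in the run above */
#define TARGET_DIR  "/pooltest/data"    /* dataset mountpoint under test */

static void writer(int id)
{
        char path[256], buf[4096];
        memset(buf, 'x', sizeof(buf));

        /* Keep creating small files until the pool fills or I/O stalls. */
        for (unsigned long i = 0; ; i++) {
                snprintf(path, sizeof(path), "%s/w%d-%lu", TARGET_DIR, id, i);
                int fd = open(path, O_CREAT | O_WRONLY, 0644);
                if (fd < 0) {
                        perror("open");         /* e.g. ENOSPC once full */
                        exit(1);
                }
                if (write(fd, buf, sizeof(buf)) < 0)
                        perror("write");
                close(fd);
        }
}

int main(void)
{
        for (int i = 0; i < NPROCS; i++) {
                if (fork() == 0)
                        writer(i);
        }
        while (wait(NULL) > 0)
                ;
        return 0;
}

Each small file consumes at least one dnode, so object numbers climb
steadily; on an unpatched kernel the writers eventually stall once
allocation crosses the 32-bit boundary, while on a patched kernel they
simply run until the pool fills up.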
[Where problems could occur]
The fix is fairly straightforward: we're replacing the P2ALIGN macro with an
equivalent that performs the alignment arithmetic on an explicitly typed
64-bit operand (P2ALIGN_TYPED). This shouldn't affect existing pools, as this
code is only exercised when creating new objects (files, directories,
snapshots, etc).
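For anyone cross-checking the patch, the substitution has the following
shape (illustrative only, not the literal diff; the variable names stand in
for the alignment arithmetic done in dmu_object_alloc_impl()):

/* before: the negation can wrap in 32 bits and mask off the high bits */
object = P2ALIGN(object, dnodes_per_chunk);

/* after: the alignment arithmetic is forced into a 64-bit type */
object = P2ALIGN_TYPED(object, dnodes_per_chunk, uint64_t);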
We should test the write path extensively after this change, to make
sure there are no other hangs when using the new P2ALIGN_TYPED macro.
Any potential regression due to this change will affect the object
allocation path, so we would see similar kernel spews stating that
`txg_sync` or `txg_quiesce` are hanging:
[179404.940783] INFO: task txg_quiesce:2203494 blocked for more than 122 seconds.
[179404.944987] Tainted: P OE 6.8.0-1020-aws #22~22.04.1-Ubuntu
[179404.949205] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Other info]
This fix has been upstream since May 2024 and is included in ZFS releases
starting with 2.2.5. As such, Jammy and Noble are affected, while Ubuntu
releases starting with Oracular already carry the fix.