** Description changed:

- I have a server that has been running its data volume using ZFS in 20.04
- without any problem. The volume is using ZFS encryption and a raidz1-0
- configuration. I performed a scrub operations before the upgrade and it
- did not find any problem. After the reboot for the upgrade, I was
- welcomed with the following message:
+ [ Impact ]
+ Upgrading from 20.04 to 22.04 causes encrypted pools to become unmountable.
+ This is due to broken accounting metadata causing checksum errors on decrypt,
+ which makes ZFS error out early with ECKSUM.
+
+ [ Test Plan ]
+ This issue needs specific accounting metadata on the zpool to be broken, and
+ as such is somewhat tricky to reproduce organically. A regular test plan for
+ an affected pool should be:
+ 1. Set up an encrypted zpool under 20.04
+ 2. Upgrade the system to 22.04 (e.g. using the do-release-upgrade script)
+ 3. Verify that the zpool fails to mount under 22.04 (zpool status will likely
+ point to ZFS-8000-8A "Corrupted data" [0])
+
+ [0] https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A/
+
+ Thankfully, upstream has included a test scenario for this in the ZFS test
+ suite, which is run during the build. The
+ tests/zfs-tests/tests/functional/userquota/13709_reproducer.bz2 file is taken
+ directly from upstream, and corresponds to an encrypted zpool with the
+ required (broken) metadata to reproduce this issue. If the ZFS test suite
+ passes, this should give us a strong signal that this issue is fixed.
+
+ [ Where problems could occur ]
+ Although I've backported the upstream test, it'd be great to have
+ confirmation from affected users that this patch resolves the issue.
+ Additionally, we should also perform upgrades on non-affected zpools as well
+ as non-encrypted zpools, to ensure no regressions have been introduced.
+
+ Considering this change affects the encrypt/decrypt code paths, problems
+ could arise when creating new encrypted zpools, as well as when mounting
+ zpools that have previously been encrypted.
+
+ [ Other Info ]
+ This SRU includes slightly more than the minimal changes mentioned in the
+ SRU policy, as I've also backported one of upstream's tests for encrypted
+ pools. This includes a new test script (userspace_encrypted_13709.ksh), as
+ well as a binary zpool dump (13709_reproducer.bz2) that I've added under
+ d/s/include-binaries.
+
+ Considering this issue causes zpools to become unmountable, I think it's
+ worth including these in the standard ZFS test suite (similar to an
+ autopkgtest scenario for a high-risk regression). These tests are already
+ included in later releases of zfs-linux, and as such only Jammy is affected
+ by this regression.
+ --
+
+ [ Original Description ]
+ I have a server that has been running its data volume using ZFS in 20.04
+ without any problem. The volume is using ZFS encryption and a raidz1-0
+ configuration. I performed a scrub operation before the upgrade and it did
+ not find any problem. After the reboot for the upgrade, I was welcomed with
+ the following message:

status: One or more devices has experienced an error resulting in data
- corruption. Applications may be affected.
+ corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
- entire pool from backup.
- see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
+ entire pool from backup.
+ see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A

The volumes still do not have any checksum error but there are 5 zvols
that are not accessible.
zpool status displays a line similar to the one below for each of the five:

- errors: Permanent errors have been detected in the following files:
-
-         tank/data/data:<0x0>
+ errors: Permanent errors have been detected in the following files:
+
+         tank/data/data:<0x0>

I run a scrub and it has not identified any problem, but the error
messages are not there and the data is still not available. There are
10+ other zvols in the zpool that do not have any kind of problem. I
have been unable to identify any correlation between the zvols that are
failing.

I have seen people reporting similar problems on GitHub after the 20.04
to 22.04 upgrade (see https://github.com/openzfs/zfs/issues/13763). I
wonder how widespread the problem will be as more people upgrade to
22.04. I will try to downgrade the version of zfs in the system and
report back.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: zfsutils-linux 2.1.4-0ubuntu0.1
ProcVersionSignature: Ubuntu 5.15.0-46.49-generic 5.15.39
Uname: Linux 5.15.0-46-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Sat Aug 20 22:24:54 2022
ProcEnviron:
- TERM=screen-256color
- PATH=(custom, no user)
- XDG_RUNTIME_DIR=<set>
- LANG=en_US.UTF-8
- SHELL=/bin/bash
+ TERM=screen-256color
+ PATH=(custom, no user)
+ XDG_RUNTIME_DIR=<set>
+ LANG=en_US.UTF-8
+ SHELL=/bin/bash
SourcePackage: zfs-linux
UpgradeStatus: Upgraded to jammy on 2022-08-20 (0 days ago)
modified.conffile..etc.sudoers.d.zfs: [inaccessible: [Errno 13] Permission denied: '/etc/sudoers.d/zfs']
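As a rough illustration of the [ Test Plan ] above, the reproduction steps
could look like the shell sketch below. The pool layout, device paths and the
tank/data dataset name are placeholder assumptions, not taken from this report,
and, as the test plan notes, a freshly created pool will not necessarily grow
the broken accounting metadata, so an organically affected pool (or the
upstream 13709_reproducer.bz2 image) may still be needed to see the failure.

  # On the 20.04 (focal) system: create an encrypted raidz1 pool and dataset.
  # Pool, dataset and device names are illustrative only.
  $ sudo zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd
  $ sudo zfs create -o encryption=aes-256-gcm -o keyformat=passphrase tank/data
  $ sudo zpool scrub tank           # should complete without errors on 20.04

  # Upgrade to 22.04 (jammy) and reboot.
  $ sudo do-release-upgrade

  # Back on 22.04: load keys and try to mount; an affected pool fails here.
  $ sudo zfs load-key -a
  $ sudo zfs mount -a
  $ zpool status -v tank            # affected pools point to ZFS-8000-8A

The same load-key / mount / status sequence can be reused with the patched
packages for the regression checks mentioned under [ Where problems could
occur ], both on previously affected pools and on freshly created encrypted
and non-encrypted pools.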
** Also affects: zfs-linux (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: zfs-linux (Ubuntu Jammy)
     Assignee: (unassigned) => Heitor Alves de Siqueira (halves)

** Changed in: zfs-linux (Ubuntu Jammy)
   Importance: Undecided => High

** Changed in: zfs-linux (Ubuntu Jammy)
       Status: New => Incomplete

** Changed in: zfs-linux (Ubuntu Jammy)
       Status: Incomplete => In Progress

** Changed in: zfs-linux (Ubuntu)
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1987190

Title:
  ZFS unrecoverable error after upgrading from 20.04 to 22.04.1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1987190/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs