TLDR:
I hit the "device total_bytes should be at most X but found Y" error, which
prevented me from mounting my btrfs filesystem. It was apparently caused by
an HDD's usable capacity (and thus its partition size) shrinking by about
1 MB (perhaps the HDD detected bad sectors). "btrfs check" found no errors.
I had to downgrade to kernel 5.10 (with ubuntu-mainline-kernel.sh) to be
able to mount the filesystem. Once it was mounted, I fixed it by running
"btrfs filesystem resize max". After that I could boot kernel 5.15 again.


What happened:

I had been running btrfs in RAID1 for many years. Originally /dev/sdb1
and /dev/sdc1 were part of the btrfs volume. After getting low on disk
space, I repartitioned /dev/sda1 and /dev/sdd1, added them to the same
btrfs volume, and balanced the filesystem. All 4 disks were the same
size and model. (I had bought all 4 disks at the same time, but
originally used sda and sdd just for some backups.)

But the next time I rebooted the system, the partition sda1 had
disappeared and the btrfs volume failed to mount due to a missing
device. I recreated the sda1 partition using the same parted command as
before ("mkpart my-btrfs btrfs 4MiB 100%"). Now btrfs no longer warned
about a missing device. I ran "btrfs check" and it found no errors in
the filesystem:


# btrfs check /dev/sdb1
Opening filesystem to check...
Checking filesystem on /dev/sdb1
UUID: 1ed4397c-bac2-43b7-8e00-fec7bfcecee6
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 8583317286912 bytes used, no error found
total csum bytes: 8369873676
total tree bytes: 10528210944
total fs tree bytes: 1233027072
total extent tree bytes: 295829504
btree space waste bytes: 790336946
file data blocks allocated: 10413926699008
 referenced 10416270606336


However, trying to mount the volume would fail:


# mount /mnt/data
mount: /mnt/data: wrong fs type, bad option, bad superblock on /dev/sdd1, 
missing codepage or helper program, or other error.

# tail /var/log/kern.log
Feb  1 16:38:24 omega kernel: [11659.827098] BTRFS info (device sdb1): using 
crc32c (crc32c-generic) checksum algorithm
Feb  1 16:38:24 omega kernel: [11659.827120] BTRFS info (device sdb1): disk 
space caching is enabled
Feb  1 16:38:24 omega kernel: [11659.827123] BTRFS info (device sdb1): has 
skinny extents
Feb  1 16:38:24 omega kernel: [11659.920248] BTRFS error (device sdb1): device 
total_bytes should be at most 8001557626880 but found 8001558675456
Feb  1 16:38:24 omega kernel: [11659.920318] BTRFS error (device sdb1): failed 
to read chunk tree: -22
Feb  1 16:38:24 omega kernel: [11659.934266] BTRFS error (device sdb1): 
open_ctree failed


I was using kernel 5.15.0. I downgraded to kernel 5.10.130 using 
ubuntu-mainline-kernel.sh, and with that kernel I was able to mount the volume.

Looking at the disk sizes, the disks sdb, sdc and sdd were
8001563222016B each. But sda was about 1MB smaller than the others at
8001562140160B. sda's SMART data didn't report any reallocated sectors
or similar, but for some reason the disk was now smaller. All the disks
were the same model.
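
For reference, the "about 1MB" figure follows directly from the raw disk
sizes (a quick shell sketch; the byte values are the ones parted printed):

```shell
# Whole-disk sizes as reported by parted, in bytes.
full=8001563222016     # sdb, sdc, sdd
shrunk=8001562140160   # sda after the incident
# The disk lost 1081856 bytes, i.e. slightly more than 1 MiB (1048576 B).
echo $((full - shrunk))
```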


# parted /dev/sda unit B print
Model: ATA WDC WD80EFZX-68U (scsi)
Disk /dev/sda: 8001562140160B
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End             Size            File system  Name      Flags
 1      4194304B  8001561821183B  8001557626880B  btrfs        my-btrfs

# parted /dev/sdb unit B print
Model: ATA WDC WD80EFZX-68U (scsi)
Disk /dev/sdb: 8001563222016B
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End             Size            File system  Name     Flags
 1      1048576B  8001562869759B  8001561821184B  btrfs        primary

# parted /dev/sdc unit B print
Model: ATA WDC WD80EFZX-68U (scsi)
Disk /dev/sdc: 8001563222016B
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End             Size            File system  Name     Flags
 1      1048576B  8001562869759B  8001561821184B  btrfs        primary

# parted /dev/sdd unit B print
Model: ATA WDC WD80EFZX-68U (scsi)
Disk /dev/sdd: 8001563222016B
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End             Size            File system  Name      Flags
 1      4194304B  8001562869759B  8001558675456B  btrfs        my-btrfs


When I asked btrfs for the size of each device, it reported sda1's size as
8001558675456 bytes, whereas parted reported the sda1 partition as
8001557626880 bytes. This matches "BTRFS error (device sdb1): device
total_bytes should be at most 8001557626880 but found 8001558675456".
Confusingly, the error message didn't point at sda1 at all; it named sdb1,
i.e. the filesystem's first device.
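
The failing check itself is plain arithmetic: the mount is refused when a
device item's recorded total_bytes exceeds the size of the underlying
partition. Plugging in the two numbers from the error message (a sketch of
the comparison, not btrfs's actual code):

```shell
total_bytes=8001558675456   # size recorded in the btrfs device item (devid 3)
partition=8001557626880     # actual size of /dev/sda1 per parted
# The recorded size overshoots the real partition by exactly 1 MiB
# (1048576 bytes), so kernel 5.15's stricter check refuses the mount.
echo $((total_bytes - partition))
```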


# btrfs filesystem show --raw
Label: 'data'  uuid: 1ed4397c-bac2-43b7-8e00-fec7bfcecee6
        Total devices 4 FS bytes used 8583317413888
        devid    1 size 8001561821184 used 4288558399488 path /dev/sdb1
        devid    2 size 8001561821184 used 4288558399488 path /dev/sdc1
        devid    3 size 8001558675456 used 4296041037824 path /dev/sda1
        devid    4 size 8001558675456 used 4296041037824 path /dev/sdd1


After the volume was mounted, I was able to run "btrfs filesystem resize". That 
fixed the device size reported by btrfs, so that both btrfs and parted agreed 
that the sda1 partition was 8001557626880 bytes.


# btrfs filesystem resize 3:max /mnt/data/
Resize device id 3 (/dev/sda1) from 7.28TiB to max

# btrfs filesystem show --raw
Label: 'data'  uuid: 1ed4397c-bac2-43b7-8e00-fec7bfcecee6
        Total devices 4 FS bytes used 8583317413888
        devid    1 size 8001561821184 used 4288558399488 path /dev/sdb1
        devid    2 size 8001561821184 used 4288558399488 path /dev/sdc1
        devid    3 size 8001557626880 used 4296041037824 path /dev/sda1
        devid    4 size 8001558675456 used 4296041037824 path /dev/sdd1
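
This also shows why only devid 3 needed resizing: the mount-time check is
"recorded size <= partition size", and devid 4's recorded size, though
larger than devid 3's, exactly matches its partition (sdd1 is
8001558675456 bytes per parted). A small sketch of that comparison, with
the sizes taken from the outputs above:

```shell
# check <recorded> <partition> <devid>: passes iff recorded <= partition
check() { [ "$1" -le "$2" ] && echo "devid $3: ok" || echo "devid $3: too big"; }
check 8001557626880 8001557626880 3   # /dev/sda1, after the resize
check 8001558675456 8001558675456 4   # /dev/sdd1, unchanged
```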


After that I was able to reboot to kernel 5.15.0 and mount the btrfs volume 
normally.

Running "btrfs scrub" found 258 corruption errors on sda1 and fixed them
all. The scrub logs in kern.log reported 10 corruptions, all within 1MB
of each other:


Feb  2 00:57:30 omega kernel: [13245.463818] BTRFS warning (device sdb1): 
checksum error at logical 8373201805312 on dev /dev/sda1, physical 
1404488392704, root 259, inode 27758561, offset 54009856, length 4096, links 1 
(path: xxxxxx)
Feb  2 00:57:30 omega kernel: [13245.465320] BTRFS error (device sdb1): bdev 
/dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Feb  2 00:57:30 omega kernel: [13245.471929] BTRFS error (device sdb1): fixed 
up error at logical 8373201805312 on dev /dev/sda1

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1931790

Title:
  Unable to mount btrfs RAID 1 filesystem after reboot - Error - device
  total_bytes should be at most X but found Y

Status in btrfs package in Ubuntu:
  Confirmed
Status in btrfs-progs package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi,

  I'm unable to mount my newly created btrfs RAID 1 filesystem after a
  reboot. I performed the steps below with my 2 x 1 TB drives:

  * created a new btrfs filesystem
  mkfs.btrfs /dev/sdd1

  * mounted it to /mnt/datanew
  mount -t btrfs /dev/sdd1 /mnt/datanew

  * copied over data from my previous NTFS volume on /dev/sde1 (which I
  intended to add later as a mirror) - took 3.5 hrs

  * added the second drive to the filesystem
  btrfs device add /dev/sde1 /mnt/datanew

  * started a balance to convert to RAID 1 - took 8 to 10 hours,
  overnight

  btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/datanew

  * rebooted next morning

  I'm seeing the errors below:
  [  792.045980] BTRFS info (device sdd1): allowing degraded mounts
  [  792.045988] BTRFS info (device sdd1): disk space caching is enabled
  [  792.045990] BTRFS info (device sdd1): has skinny extents
  [  792.047498] BTRFS error (device sdd1): device total_bytes should be at 
most 1000201841152 but found 1000203804672
  [  792.049207] BTRFS error (device sdd1): failed to read chunk tree: -22
  [  792.051525] BTRFS error (device sdd1): open_ctree failed
  * A manual mount additionally shows the error below:
  mount: /mnt/datanew: wrong fs type, bad option, bad superblock on /dev/sde1, 
missing codepage or helper program, or other error.

  * Verified the data seems intact (apparently no errors) and the
  superblock seems fine; tried the btrfs rescue commands, which found no
  issues to fix

  * Found several hits when searching but not many solutions, except
  "btrfs filesystem resize max", which doesn't work since I'm unable to
  mount in the first place:
  btrfs filesystem resize max /mnt/datanew/
  ERROR: not a btrfs filesystem: /mnt/datanew/

  Any suggestions? Thanks

  ProblemType: Bug
  DistroRelease: Ubuntu 21.04
  Package: ubuntu-release-upgrader-core 1:21.04.13
  ProcVersionSignature: Ubuntu 5.11.0-17.18-generic 5.11.12
  Uname: Linux 5.11.0-17-generic x86_64
  ApportVersion: 2.20.11-0ubuntu65.1
  Architecture: amd64
  CasperMD5CheckResult: pass
  CrashDB: ubuntu
  Date: Sun Jun 13 11:17:02 2021
  InstallationDate: Installed on 2021-05-16 (27 days ago)
  InstallationMedia: Kubuntu 21.04 "Hirsute Hippo" - Release amd64 (20210420)
  PackageArchitecture: all
  ProcEnviron:
   LANGUAGE=en_IN:en
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_IN
   SHELL=/bin/bash
  SourcePackage: ubuntu-release-upgrader
  Symptom: ubuntu-release-upgrader
  UpgradeStatus: No upgrade log present (probably fresh install)


