Package: linux-image-current-odroidxu4
Version: 26.2.1
Severity: important
X-Debbugs-Cc: [email protected]

Dear Maintainer,

I run a home backup server with a 1TB SSD attached to an ODroid HC1 board.  
There is no redundancy, but I do keep a limited history of snapshots, taken 
hourly and trimmed down to a small number of hourly, daily, weekly, monthly, 
and yearly snapshots.  There should be at most about 20 snapshots existing at a 
time.  I have a daily cron job to run a balance operation.  The mount point is 
`/backup`.  Relevant line from /etc/fstab:


/dev/mapper/backup /backup btrfs noatime,noauto,nodev,nosuid,noexec,compress=zlib:9 0 0


Command used for daily balance:


btrfs balance start --full-balance '/backup'


The filesystem seems to be only about half full:


ansible@backup:~ $ sudo btrfs filesystem df --si /backup
Data, single: total=430.57GB, used=428.24GB
System, DUP: total=67.11MB, used=81.92kB
Metadata, DUP: total=3.22GB, used=2.48GB
GlobalReserve, single: total=536.87MB, used=49.15kB
ansible@backup:~ $


My best understanding is that the messages that follow indicate that something 
is attempting to address space beyond 16TB, and that seems like a bug because 
my disk is only 1TB total.  I would appreciate anyone with a firmer grasp of 
the topic at least confirming whether that is an accurate reading of the 
warnings and errors.  If not, I might suggest that the warnings and errors 
could be improved, but I am open to correction if I just failed to understand 
what they are telling me.
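
As a rough check of that reading, the arithmetic behind the "16T" figure can be 
sketched as follows.  The 4 KiB page size and the 32-bit page index are 
assumptions, not values taken from this system, and `describe` is a 
hypothetical helper name.  The address is the one printed in the dmesg excerpt 
further down.

```python
# Sketch: compare the extent buffer address from dmesg with the page cache
# limit of a 32-bit kernel.
# Assumptions (not confirmed for this board): 4 KiB pages, 32-bit page index.

PAGE_SIZE = 4096   # assumed page size in bytes
INDEX_BITS = 32    # assumed width of the page cache index
TIB = 2 ** 40      # bytes per TiB

# Highest byte offset (exclusive) that a 32-bit page index can address.
limit_bytes = (2 ** INDEX_BITS) * PAGE_SIZE

# From "extent buffer 18905857064960 is beyond 32bit page cache limit".
extent_buffer = 18905857064960


def describe(addr, limit=limit_bytes):
    """Return the address in TiB and whether it is at or beyond the limit."""
    return {"tib": addr / TIB, "beyond_limit": addr >= limit}


info = describe(extent_buffer)
print(f"limit: {limit_bytes // TIB} TiB")
print(f"extent buffer: {info['tib']:.2f} TiB, beyond limit: {info['beyond_limit']}")
```

Under those assumptions the limit works out to 16 TiB and the reported address 
sits a little above 17 TiB.  One possible explanation, offered as an assumption 
rather than a confirmed diagnosis, is that the number is a btrfs logical 
address rather than an offset on the device, and that logical addresses can 
grow past the device size as balance relocates chunks.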

The problem seems to be triggered by any attempt to rebalance metadata or 
system data with a usage limit of 1 or higher.  Excerpt from a command session:


ansible@backup:~ $ time sudo btrfs balance start -musage=1 /backup; echo $?
ERROR: error during balancing '/backup': Read-only file system
There may be more info in syslog - try dmesg | tail

real    0m0.304s
user    0m0.040s
sys     0m0.082s
1
ansible@backup:~ $ sudo umount /backup && sudo mount -o skip_balance /backup && sudo btrfs balance cancel /backup
ansible@backup:~ $ time sudo btrfs balance start -f -susage=1 /backup; echo $?
ERROR: error during balancing '/backup': Read-only file system
There may be more info in syslog - try dmesg | tail

real    0m0.321s
user    0m0.023s
sys     0m0.096s
1
ansible@backup:~ $ 


Excerpt from dmesg from that time:


[114023.693265] BTRFS info (device dm-0): balance: start -musage=1 -susage=1
[114023.695656] BTRFS error (device dm-0): extent buffer 18905857064960 is beyond 32bit page cache limit
[114023.695669] BTRFS error (device dm-0): reached 32bit limit for logical addresses
[114023.695677] BTRFS error (device dm-0): due to page cache limit on 32bit systems, metadata beyond 16T can't be accessed
[114023.695686] BTRFS error (device dm-0): please consider upgrading to 64bit kernel/hardware
[114023.695699] ------------[ cut here ]------------
[114023.695706] WARNING: CPU: 4 PID: 30371 at fs/btrfs/space-info.h:208 
btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
[114023.695813] Modules linked in: aes_arm_bs crypto_simd dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft uas usb_storage scsi_mod scsi_common gpio_keys
[114023.695959] CPU: 4 PID: 30371 Comm: btrfs Tainted: G        W          
6.6.122-current-odroidxu4 #1
[114023.695969] Hardware name: Samsung Exynos (Flattened Device Tree)
[114023.695980]  unwind_backtrace from show_stack+0x10/0x14
[114023.696000]  show_stack from dump_stack_lvl+0x40/0x4c
[114023.696014]  dump_stack_lvl from __warn+0x78/0x154
[114023.696025]  __warn from warn_slowpath_fmt+0x1b4/0x1bc
[114023.696035]  warn_slowpath_fmt from 
btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
[114023.696127]  btrfs_space_info_update_bytes_may_use [btrfs] from 
btrfs_block_rsv_release+0x1f4/0x2f4 [btrfs]
[114023.696300]  btrfs_block_rsv_release [btrfs] from 
btrfs_alloc_tree_block+0x118/0x6b4 [btrfs]
[114023.696465]  btrfs_alloc_tree_block [btrfs] from 
btrfs_force_cow_block+0x148/0xa60 [btrfs]
[114023.696628]  btrfs_force_cow_block [btrfs] from btrfs_cow_block+0xe4/0x2bc 
[btrfs]
[114023.696792]  btrfs_cow_block [btrfs] from btrfs_search_slot+0x6dc/0xc08 
[btrfs]
[114023.696957]  btrfs_search_slot [btrfs] from btrfs_update_device+0xa4/0x248 
[btrfs]
[114023.697122]  btrfs_update_device [btrfs] from 
btrfs_chunk_alloc_add_chunk_item+0xcc/0x638 [btrfs]
[114023.697287]  btrfs_chunk_alloc_add_chunk_item [btrfs] from 
reserve_chunk_space+0xec/0x180 [btrfs]
[114023.697451]  reserve_chunk_space [btrfs] from check_system_chunk+0x6c/0x74 
[btrfs]
[114023.697615]  check_system_chunk [btrfs] from 
btrfs_inc_block_group_ro+0x228/0x234 [btrfs]
[114023.697780]  btrfs_inc_block_group_ro [btrfs] from 
btrfs_relocate_block_group+0x94/0x48c [btrfs]
[114023.697945]  btrfs_relocate_block_group [btrfs] from 
btrfs_relocate_chunk+0x3c/0x180 [btrfs]
[114023.698108]  btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 
[btrfs]
[114023.698272]  btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
[114023.698436]  btrfs_ioctl [btrfs] from sys_ioctl+0x288/0xc60
[114023.698533]  sys_ioctl from ret_fast_syscall+0x0/0x54
[114023.698547] Exception stack(0xf2575fa8 to 0xf2575ff0)
[114023.698556] 5fa0:                   00000000 00000003 00000003 c4009420 
bebe7fd0 bebe7f70
[114023.698565] 5fc0: 00000000 00000003 bebe7fd0 00000036 005bfb78 bebe86f4 
00000001 0058b190
[114023.698572] 5fe0: 00000036 bebe7f58 b6d718b1 b6cdf736
[114023.698580] ---[ end trace 0000000000000000 ]---
[114023.698802] BTRFS error (device dm-0): extent buffer 18905857064960 is 
beyond 32bit page cache limit
[114023.698821] ------------[ cut here ]------------
[114023.698827] WARNING: CPU: 4 PID: 30371 at fs/btrfs/block-group.c:2783 
btrfs_create_pending_block_groups+0x674/0x67c [btrfs]
[114023.698925] BTRFS: Transaction aborted (error -75)
[114064.918054] Modules linked in: aes_arm_bs crypto_simd dm_crypt 
cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc 
binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs 
blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft 
uas usb_storage scsi_mod scsi_common gpio_keys
[114064.918234] CPU: 5 PID: 30410 Comm: btrfs Tainted: G        W          
6.6.122-current-odroidxu4 #1
[114064.918245] Hardware name: Samsung Exynos (Flattened Device Tree)
[114064.918256]  unwind_backtrace from show_stack+0x10/0x14
[114064.918272]  show_stack from dump_stack_lvl+0x40/0x4c
[114064.918284]  dump_stack_lvl from __warn+0x78/0x154
[114064.918296]  __warn from warn_slowpath_fmt+0x130/0x1bc
[114064.918307]  warn_slowpath_fmt from 
btrfs_create_pending_block_groups+0x674/0x67c [btrfs]
[114064.918399]  btrfs_create_pending_block_groups [btrfs] from 
__btrfs_end_transaction+0x38/0x2a0 [btrfs]
[114064.918567]  __btrfs_end_transaction [btrfs] from 
btrfs_inc_block_group_ro+0x1f0/0x234 [btrfs]
[114064.918733]  btrfs_inc_block_group_ro [btrfs] from 
btrfs_relocate_block_group+0x94/0x48c [btrfs]
[114064.918899]  btrfs_relocate_block_group [btrfs] from 
btrfs_relocate_chunk+0x3c/0x180 [btrfs]
[114064.919065]  btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 
[btrfs]
[114064.919231]  btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
[114064.919397]  btrfs_ioctl [btrfs] from sys_ioctl+0x288/0xc60
[114064.919491]  sys_ioctl from ret_fast_syscall+0x0/0x54
[114064.919505] Exception stack(0xf26b9fa8 to 0xf26b9ff0)
[114064.919516] 9fa0:                   00000000 00000003 00000003 c4009420 
beae9fc0 beae9f60
[114064.919527] 9fc0: 00000000 00000003 beae9fc0 00000036 0057fb78 beaea6f1 
00000001 0054b190
[114064.919537] 9fe0: 00000036 beae9f48 b6d118b1 b6c7f736
[114064.919614] ---[ end trace 0000000000000000 ]---
[114064.919632] BTRFS: error (device dm-0: state A) in 
btrfs_create_pending_block_groups:2783: errno=-75 unknown
[114064.919647] BTRFS info (device dm-0: state EA): forced readonly
[114064.920025] ------------[ cut here ]------------
[114064.920039] WARNING: CPU: 5 PID: 30410 at fs/btrfs/space-info.h:208 
btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
[114064.920182] Modules linked in: aes_arm_bs crypto_simd dm_crypt 
cpufreq_conservative cpufreq_userspace cpufreq_powersave sunrpc zram zsmalloc 
binfmt_misc evdev sg nfnetlink ip_tables x_tables ipv6 autofs4 btrfs 
blake2b_generic xor xor_neon raid6_pq libcrc32c sd_mod t10_pi crc64_rocksoft 
uas usb_storage scsi_mod scsi_common gpio_keys
[114064.920385] CPU: 5 PID: 30410 Comm: btrfs Tainted: G        W          
6.6.122-current-odroidxu4 #1
[114064.920398] Hardware name: Samsung Exynos (Flattened Device Tree)
[114064.920409]  unwind_backtrace from show_stack+0x10/0x14
[114064.920426]  show_stack from dump_stack_lvl+0x40/0x4c
[114064.920441]  dump_stack_lvl from __warn+0x78/0x154
[114064.920454]  __warn from warn_slowpath_fmt+0x1b4/0x1bc
[114064.920466]  warn_slowpath_fmt from 
btrfs_space_info_update_bytes_may_use+0x9c/0x1c0 [btrfs]
[114064.920559]  btrfs_space_info_update_bytes_may_use [btrfs] from 
btrfs_block_rsv_release+0x1f4/0x2f4 [btrfs]
[114064.920729]  btrfs_block_rsv_release [btrfs] from 
btrfs_trans_release_chunk_metadata+0x2c/0x40 [btrfs]
[114064.920900]  btrfs_trans_release_chunk_metadata [btrfs] from 
__btrfs_end_transaction+0x38/0x2a0 [btrfs]
[114064.921070]  __btrfs_end_transaction [btrfs] from 
btrfs_inc_block_group_ro+0x1f0/0x234 [btrfs]
[114064.921239]  btrfs_inc_block_group_ro [btrfs] from 
btrfs_relocate_block_group+0x94/0x48c [btrfs]
[114064.921406]  btrfs_relocate_block_group [btrfs] from 
btrfs_relocate_chunk+0x3c/0x180 [btrfs]
[114064.921572]  btrfs_relocate_chunk [btrfs] from btrfs_balance+0x864/0x13b8 
[btrfs]
[114064.921741]  btrfs_balance [btrfs] from btrfs_ioctl+0x248c/0x28c4 [btrfs]
[114064.921913]  btrfs_ioctl [btrfs] from sys_ioctl+0x288/0xc60
[114064.922009]  sys_ioctl from ret_fast_syscall+0x0/0x54
[114064.922024] Exception stack(0xf26b9fa8 to 0xf26b9ff0)
[114064.922036] 9fa0:                   00000000 00000003 00000003 c4009420 
beae9fc0 beae9f60
[114064.922046] 9fc0: 00000000 00000003 beae9fc0 00000036 0057fb78 beaea6f1 
00000001 0054b190
[114064.922057] 9fe0: 00000036 beae9f48 b6d118b1 b6c7f736
[114064.922067] ---[ end trace 0000000000000000 ]---
[114064.922111] BTRFS info (device dm-0: state EA): balance: ended with status: 
-30
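
The numeric codes in the excerpt above ("error -75", "errno=-75 unknown", 
"status: -30") can be decoded with a small sketch like the one below.  The 
table is hard-coded from the generic Linux errno headers so that it does not 
depend on the machine it runs on; `LINUX_ERRNO`, `LINES`, and `decode` are 
hypothetical names, and the sample lines are copied from the excerpt with the 
mail wrapping undone.

```python
import re

# Generic Linux errno values (asm-generic errno-base.h and errno.h).
# Hard-coded on purpose: other platforms number these differently.
LINUX_ERRNO = {
    30: ("EROFS", "Read-only file system"),
    75: ("EOVERFLOW", "Value too large for defined data type"),
}

# Lines taken from the dmesg excerpt, timestamps dropped.
LINES = [
    "BTRFS: Transaction aborted (error -75)",
    "BTRFS: error (device dm-0: state A) in "
    "btrfs_create_pending_block_groups:2783: errno=-75 unknown",
    "BTRFS info (device dm-0: state EA): balance: ended with status: -30",
]

# Match a negative code that follows one of the three prefixes seen above.
PATTERN = re.compile(r"(?:error |errno=|status: )-(\d+)")


def decode(line):
    """Return (code, name, description) for the first code found, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, text = LINUX_ERRNO.get(code, ("?", "not in this table"))
    return code, name, text


for line in LINES:
    print(decode(line))
```

Read this way, -75 is EOVERFLOW, which fits the messages about the 32-bit 
limit, and -30 is EROFS, which matches the "Read-only file system" error 
printed by the btrfs command after the filesystem was forced read-only.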


I tried balancing data, starting with low usage limits and moving up to no 
limit.  That all went smoothly.  The problem still occurred when trying to 
balance metadata, but the extent buffer number named in dmesg errors changed.  
I do not know whether that is significant or whether balancing data should be 
expected to be of any benefit in this situation.  I did not keep full context 
from dmesg, but these are the commands issued:


btrfs balance start -dusage=1 /backup
btrfs balance start -dusage=10 /backup
btrfs balance start -dusage=50 /backup
btrfs balance start -dusage=75 /backup
btrfs balance start -dusage=85 /backup
btrfs balance start -dusage=95 /backup
btrfs balance start -d /backup


Most of those were pretty fast.  The last one took several hours.

I also checked my SSD for SMART errors, running a short self test and then a
long one.  No errors are reported.

I do not know whether this is an architecture-specific bug.  It may be specific 
to 32-bit systems, or to 32-bit ARM, or it may not.

I do not know whether this is an upstream bug, but it seems likely.  The 
behavior in question seems to be related to the btrfs kernel component.

The recommended fix, going by the kernel messages, is to attach the SSD to a 
64-bit system to complete any operations requiring addresses that the 32-bit 
system cannot handle.  I have not done that yet, and I will hold off for a 
while, as it seems like that would only be a temporary fix.

If I can gather any more relevant information while the system is still in this 
state, please let me know.


-- System Information:
Debian Release: 13.4
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: armhf (armv7l)

Kernel: Linux 6.6.122-current-odroidxu4 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_WARN
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL 
set to en_US.UTF-8), LANGUAGE=en_US.UTF-8
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

-- no debconf information
