Another problem that seems potentially related to
http://thread.gmane.org/gmane.comp.file-systems.openzfs.devel/2911/focus=2917
but it could be something different.
So far I have reproduced it only on FreeBSD.
panic: solaris assert: used > 0 || dsl_dir_phys(dd)->dd_used_breakdown[type] >= -used, file: /usr/devel/svn/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c, line: 1389
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe004db89410
vpanic() at vpanic+0x182/frame 0xfffffe004db89490
panic() at panic+0x43/frame 0xfffffe004db894f0
assfail() at assfail+0x1a/frame 0xfffffe004db89500
dsl_dir_diduse_space() at dsl_dir_diduse_space+0x200/frame 0xfffffe004db89580
dsl_dataset_clone_swap_sync_impl() at dsl_dataset_clone_swap_sync_impl+0x3f5/frame 0xfffffe004db89690
dsl_dataset_rollback_sync() at dsl_dataset_rollback_sync+0x11d/frame 0xfffffe004db897f0
dsl_sync_task_sync() at dsl_sync_task_sync+0xef/frame 0xfffffe004db89820
dsl_pool_sync() at dsl_pool_sync+0x45b/frame 0xfffffe004db89890
spa_sync() at spa_sync+0x7c7/frame 0xfffffe004db89ad0
txg_sync_thread() at txg_sync_thread+0x383/frame 0xfffffe004db89bb0
After the panic, the pool is left in a state where running the same
rollback command (zfs rollback zroot2/test/4@1) reproduces the same panic.
Some data from the affected datasets:
zroot2/test/4 used 1076224 -
zroot2/test/4 referenced 51200 -
zroot2/test/4 usedbysnapshots 1024 -
zroot2/test/4 usedbydataset 40960 -
zroot2/test/4 usedbychildren 1034240 -
zroot2/test/4 origin zroot2/test/1@4 -
zroot2/test/4@1 used 1024 -
zroot2/test/4@1 referenced 50176 -
zroot2/test/4@1 clones zroot2/test/3/2/2/4 -
zroot2/test/1@4 used 24576 -
zroot2/test/1@4 referenced 50176 -
zroot2/test/1@4 clones zroot2/test/4 -
zroot2/test/1 used 318464 -
zroot2/test/1 referenced 50176 -
zroot2/test/1 usedbysnapshots 25600 -
zroot2/test/1 usedbydataset 50176 -
zroot2/test/1 usedbychildren 242688 -
zroot2/test/1 origin - -
Using a debugger, I determined that zroot2/test/4 has a deadlist of size
40960. So, in dsl_dataset_clone_swap_sync_impl() dused is calculated as:
dused = 50176 + 0 - (51200 + 40960) = -41984
This value then gets passed to
dsl_dir_diduse_space(origin_head->ds_dir, DD_USED_HEAD,
dused, dcomp, duncomp, tx);
And dd_used_breakdown[DD_USED_HEAD] is 40960 there, so the assertion
fires to prevent the counter from going negative.
I am not sure how the datasets came to this state.
I can provide any additional data that can be queried from the pool and
from the crash dump.
--
Andriy Gapon