Control: reassign -1 src:linux

Dear Håkan,
thanks for reporting back and testing!

* Håkan T Johansson <f96h...@chalmers.se> [220801 19:31]:
> On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:
>
> > I can't see a difference that should matter from userspace.
> >
> > I have stared a bit at the kernel code... there have been quite some
> > changes and fixes in this area. Which kernel version were you
> > running when testing this?
> >
> > Could you retry on something >= 5.9? I.e. some version with patch
> > 08fc1ab6d748ab1a690fd483f41e2938984ce353.
>
> I believe that I was running 5.10 (bullseye).
>
> It looks like 5.18 (from backports) does not show the issue! (i.e. works)

Okay, I think we are now clearly in "this is not an mdadm bug per
se" territory (-> reassigning to src:linux).

[..]
> This time I did get some dmesg BUG output as well (attached).
> It does not seem to be the same backtrace on two occurrences.
>
> I also noticed that the BUG: report in dmesg does not happen directly
> when doing 'mdadm --examine --scan --config=partitions'. It rather
> occurs when some activity happens on the host filesystem, e.g.
> a 'touch /root/a' command.
>
> host:
> linux-image-5.18.0-0.bpo.1-amd64 5.18.2-1~bpo11+1
>
> (did not re-install anything else, except upgraded zfs, also from
> backports (since pure bullseye would not compile with 5.18))
>
> Does not exhibit the problem.
>
> I have tried with both kernels several times, and it was repeatable that
> 5.10 got stuck while 5.18 does not show issues.

It's good that this now works in 5.18. However, I'm not sure how we
should find the commit fixing this: in 5.14, lots of block layer code
was shuffled around/refactored.

If you have the time, maybe trying the various kernel versions
between 5.10 and 5.18 would be a good start. If they are not in
backports anymore, they should still be at
http://snapshot.debian.org/package/linux/

> Reminder: to get the issue, /dev/ should not be mounted in the chroot.
> With /dev/ mounted, 5.10 also works.
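For the version search, a binary search would keep the number of
reboots small. Below is a minimal Python sketch of the idea, not a
tested procedure. It assumes the versions are sorted oldest to newest
and that once one version works, all later ones do too. The `works`
callback is hypothetical: in practice it stands for "boot that kernel,
run the reproducer in the chroot without /dev/ mounted, and report the
result by hand". The SERIES list is only illustrative and is not the
set of packages actually available on snapshot.debian.org.

```python
from typing import Callable, Optional, Sequence


def first_working(versions: Sequence[str],
                  works: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest version for which works() is true.

    Assumes `versions` is sorted oldest to newest and that the results
    look like: broken, ..., broken, working, ..., working.
    Returns None if no version works.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            hi = mid        # fix is at mid or earlier
        else:
            lo = mid + 1    # fix is strictly after mid
    return versions[lo] if lo < len(versions) else None


# Illustrative list of upstream series between the two tested kernels
# (assumption: not checked against what is packaged for Debian).
SERIES = ["5.10", "5.11", "5.12", "5.13", "5.14",
          "5.15", "5.16", "5.17", "5.18"]
```

With nine candidates this needs at most four test boots. If the
results turn out not to be monotonic (e.g. a version in between is
broken for an unrelated reason), the search result is not meaningful
and a linear walk through the versions is the safer choice.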
I'll see if I can repro this on 5.10, but need to find a box first.

Best,
Chris

> [mån aug 1 15:53:08 2022] BUG: kernel NULL pointer dereference, address:
> 0000000000000010
> [mån aug 1 15:53:08 2022] #PF: supervisor read access in kernel mode
> [mån aug 1 15:53:08 2022] #PF: error_code(0x0000) - not-present page
> [mån aug 1 15:53:08 2022] PGD 0 P4D 0
> [mån aug 1 15:53:08 2022] Oops: 0000 [#1] SMP PTI
> [mån aug 1 15:53:08 2022] CPU: 2 PID: 284256 Comm: cron Tainted: P
> OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
> [mån aug 1 15:53:08 2022] Hardware name: Dell Computer Corporation PowerEdge
> 2850/0T7971, BIOS A04 09/22/2005
> [mån aug 1 15:53:08 2022] RIP:
> 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
> [mån aug 1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55
> 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45
> 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
> [mån aug 1 15:53:08 2022] RSP: 0018:ffffae27c059fd60 EFLAGS: 00010246
> [mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: ffff9d1b94505480 RCX:
> ffff9d1bc52e5e38
> [mån aug 1 15:53:08 2022] RDX: ffff9d1bc13782d8 RSI: 0000000000000c14 RDI:
> ffffffffc096feb0
> [mån aug 1 15:53:08 2022] RBP: ffff9d1bc52e5e38 R08: ffff9d1be04d5230 R09:
> 0000000000000001
> [mån aug 1 15:53:08 2022] R10: ffff9d1bc985f000 R11: 000000000000001d R12:
> ffff9d1bc13782d8
> [mån aug 1 15:53:08 2022] R13: ffff9d1be04d5000 R14: 0000000000000c14 R15:
> ffff9d1bc13782d8
> [mån aug 1 15:53:08 2022] FS: 00007fed5ecb1840(0000)
> GS:ffff9d1cd7c80000(0000) knlGS:0000000000000000
> [mån aug 1 15:53:08 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [mån aug 1 15:53:08 2022] CR2: 0000000000000010 CR3: 00000001a46d8000 CR4:
> 00000000000006e0
> [mån aug 1 15:53:08 2022] Call Trace:
> [mån aug 1 15:53:08 2022] ext4_orphan_del+0x23f/0x290 [ext4]
> [mån aug 1 15:53:08 2022] ext4_evict_inode+0x31f/0x630 [ext4]
> [mån aug 1 15:53:08 2022] evict+0xd1/0x1a0
> [mån aug 1 15:53:08 2022] __dentry_kill+0xe4/0x180
> [mån aug 1 15:53:08 2022] dput+0x149/0x2f0
> [mån aug 1 15:53:08 2022] __fput+0xe4/0x240
> [mån aug 1 15:53:08 2022] task_work_run+0x65/0xa0
> [mån aug 1 15:53:08 2022] exit_to_user_mode_prepare+0x111/0x120
> [mån aug 1 15:53:08 2022] syscall_exit_to_user_mode+0x28/0x140
> [mån aug 1 15:53:08 2022] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [mån aug 1 15:53:08 2022] RIP: 0033:0x7fed5eea2d77
> [mån aug 1 15:53:08 2022] Code: 44 00 00 48 8b 15 19 a1 0c 00 f7 d8 64 89 02
> b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f
> 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e9 a0 0c 00 f7 d8 64 89 02 b8
> [mån aug 1 15:53:08 2022] RSP: 002b:00007ffd50452818 EFLAGS: 00000202
> ORIG_RAX: 0000000000000003
> [mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: 000055dab4578910 RCX:
> 00007fed5eea2d77
> [mån aug 1 15:53:08 2022] RDX: 00007fed5ef6e8a0 RSI: 0000000000000000 RDI:
> 0000000000000006
> [mån aug 1 15:53:08 2022] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 00007fed5ef6dbe0
> [mån aug 1 15:53:08 2022] R10: 000000000000006f R11: 0000000000000202 R12:
> 00007fed5ef6f4a0
> [mån aug 1 15:53:08 2022] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000001
> [mån aug 1 15:53:08 2022] Modules linked in: msr autofs4 nfsd auth_rpcgss
> nfsv3 nfs_acl nfs lockd grace sunrpc nfs_ssc fscache xt_mac xt_length
> xt_recent xt_multiport xt_tcpudp xt_state xt_conntrack nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables loop dcdbas
> radeon zfs(POE) zunicode(POE) zzstd(OE) ttm zlua(OE) zavl(POE) icp(POE)
> drm_kms_helper iTCO_wdt intel_pmc_bxt cec iTCO_vendor_support zcommon(POE)
> watchdog znvpair(POE) intel_powerclamp ipmi_si drm pcspkr spl(OE)
> ipmi_devintf serio_raw ipmi_msghandler rng_core i2c_algo_bit sg evdev
> e752x_edac button overlay ext4 crc16 mbcache jbd2 btrfs blake2b_generic
> raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor
> raid6_pq libcrc32c crc32c_generic raid0 multipath linear raid1 sd_mod sr_mod
> cdrom ata_generic md_mod mptspi mptscsih ata_piix libata mptbase
> scsi_transport_spi nvme ehci_pci uhci_hcd nvme_core ehci_hcd t10_pi scsi_mod
> lpc_ich crc_t10dif crct10dif_generic psmouse usbcore e1000 crct10dif_common
> [mån aug 1 15:53:08 2022] usb_common video
> [mån aug 1 15:53:08 2022] CR2: 0000000000000010
> [mån aug 1 15:53:08 2022] ---[ end trace 4fd9ed73d190bc2a ]---
> [mån aug 1 15:53:08 2022] RIP:
> 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
> [mån aug 1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55
> 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45
> 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
> [mån aug 1 15:53:08 2022] RSP: 0018:ffffae27c059fd60 EFLAGS: 00010246
> [mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: ffff9d1b94505480 RCX:
> ffff9d1bc52e5e38
> [mån aug 1 15:53:08 2022] RDX: ffff9d1bc13782d8 RSI: 0000000000000c14 RDI:
> ffffffffc096feb0
> [mån aug 1 15:53:08 2022] RBP: ffff9d1bc52e5e38 R08: ffff9d1be04d5230 R09:
> 0000000000000001
> [mån aug 1 15:53:08 2022] R10: ffff9d1bc985f000 R11: 000000000000001d R12:
> ffff9d1bc13782d8
> [mån aug 1 15:53:08 2022] R13: ffff9d1be04d5000 R14: 0000000000000c14 R15:
> ffff9d1bc13782d8
> [mån aug 1 15:53:08 2022] FS: 00007fed5ecb1840(0000)
> GS:ffff9d1cd7c80000(0000) knlGS:0000000000000000
> [mån aug 1 15:53:08 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [mån aug 1 15:53:08 2022] CR2: 0000000000000010 CR3: 00000001a46d8000 CR4:
> 00000000000006e0
> [mån aug 1 18:57:57 2022] BUG: kernel NULL pointer dereference, address:
> 0000000000000010
> [mån aug 1 18:57:57 2022] #PF: supervisor read access in kernel mode
> [mån aug 1 18:57:57 2022] #PF: error_code(0x0000) - not-present page
> [mån aug 1 18:57:57 2022] PGD 0 P4D 0
> [mån aug 1 18:57:57 2022] Oops: 0000 [#1] SMP PTI
> [mån aug 1 18:57:57 2022] CPU: 2 PID: 4427 Comm: touch Tainted: P
> OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
> [mån aug 1 18:57:57 2022] Hardware name: Dell Computer Corporation PowerEdge
> 2850/0T7971, BIOS A04 09/22/2005
> [mån aug 1 18:57:57 2022] RIP:
> 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
> [mån aug 1 18:57:57 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55
> 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab 57 e9 e5 48 8b 45
> 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
> [mån aug 1 18:57:57 2022] RSP: 0018:ffffc2b08062fb78 EFLAGS: 00010246
> [mån aug 1 18:57:57 2022] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> ffff9daed0440068
> [mån aug 1 18:57:57 2022] RDX: ffff9daec0fb53b8 RSI: 0000000000000469 RDI:
> ffffffffc0896c80
> [mån aug 1 18:57:57 2022] RBP: ffff9daed0440068 R08: ffff9daed07f7138 R09:
> 0000000000000000
> [mån aug 1 18:57:57 2022] R10: ffff9daec4c2ef08 R11: 0000000000000000 R12:
> ffff9daec0fb53b8
> [mån aug 1 18:57:57 2022] R13: ffff9daee013d800 R14: 0000000000000469 R15:
> ffff9daee013d800
> [mån aug 1 18:57:57 2022] FS: 00007febc0a915c0(0000)
> GS:ffff9dafd7c80000(0000) knlGS:0000000000000000
> [mån aug 1 18:57:57 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [mån aug 1 18:57:57 2022] CR2: 0000000000000010 CR3: 0000000106616000 CR4:
> 00000000000006e0
> [mån aug 1 18:57:57 2022] Call Trace:
> [mån aug 1 18:57:57 2022] ? __ext4_handle_dirty_metadata+0x51/0x1a0 [ext4]
> [mån aug 1 18:57:57 2022] __ext4_new_inode+0x925/0x1690 [ext4]
> [mån aug 1 18:57:57 2022] ext4_create+0x106/0x1b0 [ext4]
> [mån aug 1 18:57:57 2022] path_openat+0xde1/0x1080
> [mån aug 1 18:57:57 2022] do_filp_open+0x88/0x130
> [mån aug 1 18:57:57 2022] ? getname_flags.part.0+0x29/0x1a0
> [mån aug 1 18:57:57 2022] ? __check_object_size+0x136/0x150
> [mån aug 1 18:57:57 2022] do_sys_openat2+0x97/0x150
> [mån aug 1 18:57:57 2022] __x64_sys_openat+0x54/0x90
> [mån aug 1 18:57:57 2022] do_syscall_64+0x33/0x80
> [mån aug 1 18:57:57 2022] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [mån aug 1 18:57:57 2022] RIP: 0033:0x7febc09b9be7
> [mån aug 1 18:57:57 2022] Code: 25 00 00 41 00 3d 00 00 41 00 74 47 64 8b 04
> 25 18 00 00 00 85 c0 75 6b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f
> 05 <48> 3d 00 f0 ff ff 0f 87 95 00 00 00 48 8b 4c 24 28 64 48 2b 0c 25
> [mån aug 1 18:57:57 2022] RSP: 002b:00007ffedb21a7f0 EFLAGS: 00000246
> ORIG_RAX: 0000000000000101
> [mån aug 1 18:57:57 2022] RAX: ffffffffffffffda RBX: 00007ffedb21aaa8 RCX:
> 00007febc09b9be7
> [mån aug 1 18:57:57 2022] RDX: 0000000000000941 RSI: 00007ffedb21ae94 RDI:
> 00000000ffffff9c
> [mån aug 1 18:57:57 2022] RBP: 00007ffedb21ae94 R08: 0000000000000000 R09:
> 0000000000000000
> [mån aug 1 18:57:57 2022] R10: 00000000000001b6 R11: 0000000000000246 R12:
> 0000000000000941
> [mån aug 1 18:57:57 2022] R13: 00007ffedb21ae94 R14: 0000000000000000 R15:
> 0000000000000000
> [mån aug 1 18:57:57 2022] Modules linked in: msr autofs4 nfsd auth_rpcgss
> nfsv3 nfs_acl nfs lockd grace sunrpc nfs_ssc fscache xt_mac xt_length
> xt_recent xt_multiport xt_tcpudp xt_state xt_conntrack nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables loop radeon
> zfs(POE) ttm zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) drm_kms_helper
> iTCO_wdt cec icp(POE) intel_pmc_bxt dcdbas iTCO_vendor_support ipmi_si
> watchdog zcommon(POE) znvpair(POE) intel_powerclamp drm spl(OE) ipmi_devintf
> pcspkr ipmi_msghandler i2c_algo_bit sg serio_raw rng_core e752x_edac evdev
> button overlay ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456
> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
> libcrc32c crc32c_generic raid0 multipath linear raid1 sd_mod sr_mod cdrom
> ata_generic md_mod ata_piix libata nvme mptspi mptscsih nvme_core uhci_hcd
> ehci_pci e1000 ehci_hcd t10_pi crc_t10dif psmouse mptbase usbcore
> crct10dif_generic scsi_transport_spi scsi_mod lpc_ich crct10dif_common
> [mån aug 1 18:57:57 2022] usb_common video
> [mån aug 1 18:57:57 2022] CR2: 0000000000000010
> [mån aug 1 18:57:57 2022] ---[ end trace 284590a68ce9a232 ]---
> [mån aug 1 18:57:57 2022] RIP:
> 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
> [mån aug 1 18:57:57 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55
> 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab 57 e9 e5 48 8b 45
> 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
> [mån aug 1 18:57:57 2022] RSP: 0018:ffffc2b08062fb78 EFLAGS: 00010246
> [mån aug 1 18:57:57 2022] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> ffff9daed0440068
> [mån aug 1 18:57:57 2022] RDX: ffff9daec0fb53b8 RSI: 0000000000000469 RDI:
> ffffffffc0896c80
> [mån aug 1 18:57:57 2022] RBP: ffff9daed0440068 R08: ffff9daed07f7138 R09:
> 0000000000000000
> [mån aug 1 18:57:57 2022] R10: ffff9daec4c2ef08 R11: 0000000000000000 R12:
> ffff9daec0fb53b8
> [mån aug 1 18:57:57 2022] R13: ffff9daee013d800 R14: 0000000000000469 R15:
> ffff9daee013d800
> [mån aug 1 18:57:57 2022] FS: 00007febc0a915c0(0000)
> GS:ffff9dafd7c80000(0000) knlGS:0000000000000000
> [mån aug 1 18:57:57 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [mån aug 1 18:57:57 2022] CR2: 0000000000000010 CR3: 0000000106616000 CR4:
> 00000000000006e0
> [mån aug 1 19:24:19 2022] EXT4-fs error (device md127):
> ext4_validate_inode_bitmap:105: comm touch: Corrupt inode bitmap -
> block_group = 0, inode_bitmap = 494
> [mån aug 1 19:24:19 2022] Aborting journal on device md127-8.
> [mån aug 1 19:24:19 2022] EXT4-fs (md127): Remounting filesystem read-only