** Description changed:

Subject: raid1: Fix NULL pointer de-reference in process_checks()
BugLink: https://bugs.launchpad.net/bugs/2112519

[Impact]

A null pointer dereference was found in raid1 during failure mode testing.
A raid1 array was set up, filled with data, and a check operation was
started. While the check was underway, all underlying iSCSI disks were
forcefully disconnected (the md array had been created with --failfast),
and the following kernel oops occurred:

md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
md/raid1:: dm-1: unrecoverable I/O read error for block 527616
md/raid1:: dm-0: unrecoverable I/O read error for block 527744
BUG: kernel NULL pointer dereference, address: 0000000000000040
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
SMP NOPTI
CPU: 3 PID: 19372 Comm: md_1t889zmbfni_ Kdump: loaded Not tainted 6.8.0-1029-aws #31-Ubuntu
Hardware name: Amazon EC2 m6a.xlarge/, BIOS 1.0 10/16/2017
RIP: 0010:process_checks+0x25e/0x5e0 [raid1]
Code: 8e 19 01 00 00 48 8b 85 78 ff ff ff b9 08 00 00 00 48 8d 7d 90 49 8b 1c c4 49 63 c7 4d 8b 74 c4 50 31 c0 f3 48 ab 48 89 5d 88 <4c> 8b 53 40 45 0f b6 4e 18 49 8b 76 40 49 81 7e 38 a0 04 7c c0 75
RSP: 0018:ffffb39e8142bcb8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000004 RDI: ffffb39e8142bd50
RBP: ffffb39e8142bd80 R08: ffff9a2e001ea000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a2e0cd63280
R13: ffff9a2e50d1f800 R14: ffff9a2e50d1f000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9a3128780000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 00000001035b2004 CR4: 00000000003706f0
Call Trace:
 <TASK>
 ? show_regs+0x6d/0x80
 ? __die+0x24/0x80
 ? page_fault_oops+0x99/0x1b0
 ? do_user_addr_fault+0x2e0/0x660
 ? exc_page_fault+0x83/0x190
 ? asm_exc_page_fault+0x27/0x30
 ? process_checks+0x25e/0x5e0 [raid1]
 ? process_checks+0x125/0x5e0 [raid1]
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? ___ratelimit+0xc7/0x130
 sync_request_write+0x1c8/0x1e0 [raid1]
 raid1d+0x13a/0x3f0 [raid1]
 ? srso_alias_return_thunk+0x5/0xfbef5
 md_thread+0xae/0x190
 ? __pfx_autoremove_wake_function+0x10/0x10
 ? __pfx_md_thread+0x10/0x10
 kthread+0xda/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x47/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>

What happens is that process_checks() loops through all the available disks
to find a primary source with intact data; here every disk is missing, and
we should not move forward without a valid primary source.

[Fix]

This was fixed in 6.15-rc3 with:

commit b7c178d9e57c8fd4238ff77263b877f6f16182ba
Author: Meir Elisha <meir.eli...@volumez.com>
Date: Tue Apr 8 17:38:08 2025 +0300
Subject: md/raid1: Add check for missing source disk in process_checks()
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b7c178d9e57c8fd4238ff77263b877f6f16182ba

This has already been applied to focal, jammy and plucky through upstream
-stable. Currently noble and oracular are lagging behind and are not yet up
to the -stable release that contains the fix.

Bug focal:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111448
Bug jammy:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111606
Bug plucky:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2111268

[Testcase]

You don't need to set up a full iSCSI environment; you can make a local VM
and then forcefully remove the underlying disks using libvirt.

Create a VM and attach three scratch disks:

$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
vda     253:0    0   10G  0 disk
├─vda1  253:1    0    9G  0 part /
├─vda14 253:14   0    4M  0 part
├─vda15 253:15   0  106M  0 part /boot/efi
└─vda16 259:0    0  913M  0 part /boot
vdb     253:16   0  372K  0 disk
vdc     253:32   0    3G  0 disk
vdd     253:48   0    3G  0 disk
vde     253:64   0    3G  0 disk

Create a raid1 array:

$ sudo mdadm --create --failfast --verbose /dev/md0 --level=1 --raid-devices=3 /dev/vdc /dev/vdd /dev/vde

Make a filesystem and mount it:

$ sudo mkfs.xfs /dev/md0
$ sudo mkdir /mnt/disk
$ sudo mount /dev/md0 /mnt/disk

Fill the array with files:

$ cd /mnt/disk
$ for n in {1..1000}; do sudo dd if=/dev/urandom of=file$(printf %03d "$n").bin bs=1024 count=$((RANDOM)); done

Start a check:

$ sudo mdadm --action=check /dev/md0
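
Optionally, confirm the check is actually underway before detaching anything;
watching /proc/mdstat or mdadm --detail should show an active check:

$ watch -n1 cat /proc/mdstat
$ sudo mdadm --detail /dev/md0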

Use virt-manager / libvirt to detach the disks, and watch dmesg.
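
If the scratch disks are libvirt devices, they can be dropped from the host
with virsh. This is a sketch: the domain name "raid1-vm" is just a placeholder
for your VM's name, and --live assumes the disks were attached to the running
guest:

$ virsh detach-disk raid1-vm vdc --live
$ virsh detach-disk raid1-vm vdd --live
$ virsh detach-disk raid1-vm vde --live

Inside the guest, follow the kernel log with:

$ sudo dmesg -w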

Test kernels are available in the following ppa:

https://launchpad.net/~mruffell/+archive/ubuntu/sf411666-test

If you install the test kernel, the null pointer dereference no longer
occurs.

[Where problems can occur]

We are changing the logic so that if all of the reads fail in
process_checks() and we have no valid primary source, we disable recovery,
mark an error, free the bio and exit out. Previously we would have just
continued onward and run into the null pointer dereference.

This really only affects situations where all of the backing disks are lost.
That isn't too uncommon, though, particularly when all of the disks are
network storage and a network issue causes access to them to be lost. Things
should remain as they are if at least one valid primary source disk exists.

If a regression were to occur, it would affect raid1 arrays only, and only
during check/repair operations.

A workaround would be to disable check or repair operations on the md array
until the issue is fixed.

[Other info]

Upstream mailing list discussion:

V1:
https://lore.kernel.org/linux-raid/712ff6db-6b01-be95-a394-266be08a1...@huaweicloud.com/T/
V2:
https://lore.kernel.org/linux-raid/20250408143808.1026534-1-meir.eli...@volumez.com/T/

** Summary changed:

- raid1: Fix NULL pointer de-reference in process_checks()
+ raid1: Fix NULL pointer dereference in process_checks()

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2112519

Title:
  raid1: Fix NULL pointer dereference in process_checks()

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Jammy:
  Fix Committed
Status in linux source package in Noble:
  In Progress
Status in linux source package in Oracular:
  In Progress
Status in linux source package in Plucky:
  Fix Committed
Status in linux source package in Questing:
  Fix Released