This bug is missing log files that will aid in diagnosing the problem.
From a terminal window, please run:

apport-collect 2098056

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that and change the
bug status to 'Confirmed'.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2098056

Title:
  RAID getting corrupted while running ZFS IOs.

Status in linux package in Ubuntu:
  New

Bug description:
  Steps to reproduce:

  1. Power on the NVMe-oF enclosure.
  2. Discover and connect the drives.
  3. Create two zpools, one from the even-numbered drives and one from the odd-numbered drives.
  4. Start I/O on both pools.
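  For reference, the steps above could look roughly like the following. This is a hypothetical sketch: the discovery address, device names, pool names, and fio parameters are placeholders, not values taken from this report.

```shell
# Hypothetical reproduction sketch -- address, device names, and fio
# parameters below are placeholders, not values from this report.

# Step 2: discover and connect the NVMe/TCP drives from the enclosure.
nvme discover -t tcp -a 10.0.0.10 -s 4420
nvme connect-all -t tcp -a 10.0.0.10 -s 4420

# Step 3: create two zpools from even- and odd-numbered namespaces.
zpool create pool_even /dev/nvme0n1 /dev/nvme2n1 /dev/nvme4n1
zpool create pool_odd  /dev/nvme1n1 /dev/nvme3n1 /dev/nvme5n1

# Step 4: drive sustained I/O on both pools (fio shown as one option).
fio --name=even --directory=/pool_even --rw=randwrite --bs=128k \
    --size=8G --numjobs=8 --time_based --runtime=3600 &
fio --name=odd  --directory=/pool_odd  --rw=randwrite --bs=128k \
    --size=8G --numjobs=8 --time_based --runtime=3600 &
wait
```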

  Observation:

  1. A call trace is observed while running ZFS I/O: "failed to send request -5" appears and the drive goes into a continuous reconnect loop.
  2. The issue is seen on Ubuntu 24.04.1 with kernel 6.8.0-49-generic.

  From the kernel ring buffer (dmesg):

  [Tue Feb 11 05:25:55 2025] ------------[ cut here ]------------
  [Tue Feb 11 05:25:55 2025] WARNING: CPU: 10 PID: 114873 at 
net/core/skbuff.c:7006 skb_splice_from_iter+0x139/0x370
  [Tue Feb 11 05:25:55 2025] Modules linked in: nvme_tcp nvme_keyring nvme 
xt_tcpudp nft_compat nf_tables qrtr cfg80211 binfmt_misc zfs(PO) spl(O) 
intel_rapl_msr intel_rapl_common intel_uncore_frequency 
intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp 
coretemp kvm_intel dell_wmi dell_smbios dell_wmi_descriptor kvm video mgag200 
ledtrig_audio irqbypass sparse_keymap dcdbas joydev input_leds mei_me 
i2c_algo_bit mei acpi_power_meter rapl intel_cstate lpc_ich ipmi_ssif mac_hid 
acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler mxm_wmi sch_fq_codel 
dm_multipath nvme_fabrics msr nvme_core nvme_auth efi_pstore nfnetlink 
dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 mlx5_ib ib_uverbs macsec ib_core mlx5_core 
crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic 
ghash_clmulni_intel sha256_ssse3 sha1_ssse3 mlxfw psample tls pci_hyperv_intf 
tg3
  pata_acpi wmi hid_generic usbhid hid aesni_intel
  [Tue Feb 11 05:25:55 2025]  crypto_simd cryptd
  [Tue Feb 11 05:25:55 2025] CPU: 10 PID: 114873 Comm: kworker/10:2H Tainted: P 
          O       6.8.0-49-generic #49-Ubuntu
  [Tue Feb 11 05:25:55 2025] Hardware name: Dell Inc. PowerEdge R730/072T6D, 
BIOS 2.7.1 001/22/2018
  [Tue Feb 11 05:25:55 2025] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
  [Tue Feb 11 05:25:55 2025] RIP: 0010:skb_splice_from_iter+0x139/0x370
  [Tue Feb 11 05:25:55 2025] Code: 39 e1 48 8b 53 08 49 0f 47 cc 49 89 cd f6 c2 
01 0f 85 c0 01 00 00 66 90 48 89 da 48 8b 12 80 e6 08 0f 84 8e 00 00 00 4d 89 
fe <0f> 0b 49 c7 c0 fb ff ff ff 48 8b 85 68 ff ff ff 41 01 46 70 41 01
  [Tue Feb 11 05:25:55 2025] RSP: 0018:ffffb216769d7a38 EFLAGS: 00010202
  [Tue Feb 11 05:25:55 2025] RAX: 0000000000000000 RBX: fffff74820347000 RCX: 
0000000000001000
  [Tue Feb 11 05:25:55 2025] RDX: 0017ffffc0000840 RSI: 0000000000000000 RDI: 
0000000000000000
  [Tue Feb 11 05:25:55 2025] RBP: ffffb216769d7ae0 R08: 0000000000000000 R09: 
0000000000000000
  [Tue Feb 11 05:25:55 2025] R10: 0000000000000000 R11: 0000000000000000 R12: 
0000000000001000
  [Tue Feb 11 05:25:55 2025] R13: 0000000000001000 R14: ffff9c22fccbfe00 R15: 
ffff9c22fccbfe00
  [Tue Feb 11 05:25:55 2025] FS:  0000000000000000(0000) 
GS:ffff9c347f680000(0000) knlGS:0000000000000000
  [Tue Feb 11 05:25:55 2025] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [Tue Feb 11 05:25:55 2025] CR2: 00007d79ae7af000 CR3: 0000002a226e4001 CR4: 
00000000003706f0
  [Tue Feb 11 05:25:55 2025] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
  [Tue Feb 11 05:25:55 2025] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
  [Tue Feb 11 05:25:55 2025] Call Trace:
  [Tue Feb 11 05:25:55 2025]  <TASK>
  [Tue Feb 11 05:25:55 2025]  ? show_regs+0x6d/0x80
  [Tue Feb 11 05:25:55 2025]  ? __warn+0x89/0x160
  [Tue Feb 11 05:25:55 2025]  ? skb_splice_from_iter+0x139/0x370
  [Tue Feb 11 05:25:55 2025]  ? report_bug+0x17e/0x1b0
  [Tue Feb 11 05:25:55 2025]  ? handle_bug+0x51/0xa0
  [Tue Feb 11 05:25:55 2025]  ? exc_invalid_op+0x18/0x80
  [Tue Feb 11 05:25:55 2025]  ? asm_exc_invalid_op+0x1b/0x20
  [Tue Feb 11 05:25:55 2025]  ? skb_splice_from_iter+0x139/0x370
  [Tue Feb 11 05:25:55 2025]  ? skb_splice_from_iter+0xd5/0x370
  [Tue Feb 11 05:25:55 2025]  tcp_sendmsg_locked+0x352/0xd70
  [Tue Feb 11 05:25:55 2025]  ? tcp_push+0x159/0x190
  [Tue Feb 11 05:25:55 2025]  ? tcp_sendmsg_locked+0x9c4/0xd70
  [Tue Feb 11 05:25:55 2025]  tcp_sendmsg+0x2c/0x50
  [Tue Feb 11 05:25:55 2025]  inet_sendmsg+0x42/0x80
  [Tue Feb 11 05:25:55 2025]  sock_sendmsg+0x118/0x150
  [Tue Feb 11 05:25:55 2025]  nvme_tcp_try_send_data+0x16e/0x4d0 [nvme_tcp]
  [Tue Feb 11 05:25:55 2025]  nvme_tcp_try_send+0x23c/0x300 [nvme_tcp]
  [Tue Feb 11 05:25:55 2025]  nvme_tcp_io_work+0x40/0xe0 [nvme_tcp]
  [Tue Feb 11 05:25:55 2025]  process_one_work+0x178/0x350
  [Tue Feb 11 05:25:55 2025]  worker_thread+0x306/0x440
  [Tue Feb 11 05:25:55 2025]  ? __pfx_worker_thread+0x10/0x10
  [Tue Feb 11 05:25:55 2025]  kthread+0xf2/0x120
  [Tue Feb 11 05:25:55 2025]  ? __pfx_kthread+0x10/0x10
  [Tue Feb 11 05:25:55 2025]  ret_from_fork+0x47/0x70
  [Tue Feb 11 05:25:55 2025]  ? __pfx_kthread+0x10/0x10
  [Tue Feb 11 05:25:55 2025]  ret_from_fork_asm+0x1b/0x30
  [Tue Feb 11 05:25:55 2025]  </TASK>
  [Tue Feb 11 05:25:55 2025] ---[ end trace 0000000000000000 ]---
  [Tue Feb 11 05:25:55 2025] nvme nvme8: failed to send request -5
  [Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 5 (9005) type 4 opcode 0x2 
(I/O Cmd) QID 11 timeout
  [Tue Feb 11 05:26:25 2025] nvme nvme8: starting error recovery
  [Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 6 (c006) type 4 opcode 0x1 
(I/O Cmd) QID 11 timeout
  [Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 7 (d007) type 4 opcode 0x2 
(I/O Cmd) QID 11 timeout
  [Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 11 (700b) type 4 opcode 0x1 
(I/O Cmd) QID 11 timeout
  [Tue Feb 11 05:26:25 2025] nvme nvme8: I/O tag 12 (300c) type 4 opcode 0x1 
(I/O Cmd) QID 11 timeout
  [Tue Feb 11 05:26:25 2025] nvme nvme32: failed to send request -5
  [Tue Feb 11 05:26:25 2025] nvme nvme8: Reconnecting in 10 seconds...
  [Tue Feb 11 05:26:25 2025] nvme nvme32: starting error recovery
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:25 2025] nvme nvme32: Reconnecting in 10 seconds...
  [Tue Feb 11 05:26:36 2025] nvme nvme8: queue_size 128 > ctrl sqsize 16, 
clamping down
  [Tue Feb 11 05:26:36 2025] nvme nvme8: creating 16 I/O queues.
  [Tue Feb 11 05:26:36 2025] nvme nvme32: queue_size 128 > ctrl sqsize 16, 
clamping down
  [Tue Feb 11 05:26:36 2025] nvme nvme32: creating 16 I/O queues.
  [Tue Feb 11 05:26:36 2025] nvme nvme8: mapped 16/0/0 default/read/poll queues.
  [Tue Feb 11 05:26:36 2025] nvme nvme8: Successfully reconnected (1 attempt)
  [Tue Feb 11 05:26:36 2025] nvme nvme8: failed to send request -5
  [Tue Feb 11 05:26:36 2025] nvme nvme32: mapped 16/0/0 default/read/poll 
queues.
  [Tue Feb 11 05:26:36 2025] nvme nvme8: starting error recovery
  [Tue Feb 11 05:26:36 2025] nvme_ns_head_submit_bio: 55 callbacks suppressed
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] block nvme8n1: no usable path - requeuing I/O
  [Tue Feb 11 05:26:36 2025] nvme nvme32: Successfully reconnected (1 attempt)
  [Tue Feb 11 05:26:36 2025] nvme nvme8: reading non-mdts-limits failed: -4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2098056/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
