Hi Mohammad,

I have completed backporting both commits and I have built test kernels for each.

Can you install the test kernels and run the reproducer with Mellanox hardware? My lab doesn't have any Mellanox NICs, so I can't test myself.

Please note these test kernels are NOT SUPPORTED by Canonical, and are for TESTING PURPOSES ONLY. ONLY Install in a dedicated test environment.

The first test kernel is for:

Subject: net/mlx5e: Rx, Fixup skb checksum for packets with tail padding
Commit: 0aa1d18615c163f92935b806dcaff9157645233a
Backport: https://paste.ubuntu.com/p/Svfp8DQGRP/

Instructions to install (On a bionic system)

1) sudo add-apt-repository ppa:mruffell/lp1854842-test
2) sudo apt-get update
3) sudo apt install linux-image-unsigned-4.15.0-72-generic linux-modules-4.15.0-72-generic linux-modules-extra-4.15.0-72-generic linux-headers-4.15.0-72 linux-headers-4.15.0-72-generic
4) sudo reboot
5) uname -rv
4.15.0-72-generic #81+hf1854842v20191207b2-Ubuntu SMP Sat Dec 7 09:51:02 UTC 2019

Run the reproducer and see if the problem is fixed or not.

The second test kernel is for:

Subject: net/mlx5e: Rx, Fix checksum calculation for new hardware
Commit: db849faa9bef993a1379dc510623f750a72fa7ce
Backport: https://paste.ubuntu.com/p/QNPPSDf5Tm/

Instructions to install (On a bionic system)

1) sudo add-apt-repository ppa:mruffell/lp1854842-test-2
2) sudo apt-get update
3) sudo apt install linux-image-unsigned-4.15.0-72-generic linux-modules-4.15.0-72-generic linux-modules-extra-4.15.0-72-generic linux-headers-4.15.0-72 linux-headers-4.15.0-72-generic
4) sudo reboot
5) uname -rv
4.15.0-72-generic #81+hf1854842v20191209b1-Ubuntu SMP Sun Dec 8 23:37:16 UTC 2019

Run the reproducer and see if the problem is fixed or not.

Hopefully one of these kernels / commits fixes the problem. If it does, I will submit the commit to the Ubuntu kernel mailing list for SRU.

If neither of the kernels fix the problem, we will have to continue debugging.

Can you please test the kernels for me and let me know how they go?
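Since both test kernels share the same 4.15.0-72-generic version string, it is easy to lose track of which one is booted. Here is a minimal sketch of a check for that, assuming the build tag appears in `uname -rv` exactly as in the expected output shown in step 5 of each set of instructions. The helper name identify_test_kernel and the descriptions in EXPECTED_TAGS are illustrative only and are not part of any Ubuntu tooling.

```python
import re
import subprocess

# Build tags copied from the expected `uname -rv` output in the instructions.
# The descriptions are informal labels, not official names.
EXPECTED_TAGS = {
    "hf1854842v20191207b2": "test kernel 1 (0aa1d18615c1, tail padding fixup)",
    "hf1854842v20191209b1": "test kernel 2 (db849faa9bef, checksum fix for new hardware)",
}


def identify_test_kernel(uname_rv):
    """Return a label for the test kernel named in `uname -rv` output, or None."""
    # Matches e.g. "#81+hf1854842v20191207b2-Ubuntu" and captures the hotfix tag.
    match = re.search(r"#\d+\+(hf1854842v\d{8}b\d+)-Ubuntu", uname_rv)
    if match is None:
        return None
    return EXPECTED_TAGS.get(match.group(1))


if __name__ == "__main__":
    # Query the running kernel and report which test build, if any, is booted.
    output = subprocess.run(
        ["uname", "-rv"], capture_output=True, text=True, check=True
    ).stdout
    print(identify_test_kernel(output) or "not running an lp1854842 test kernel")
```

Running the script after each reboot, before starting the reproducer, should make it clear which backport a given result belongs to.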
Thanks, Matthew -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1854842 Title: mlx5_core reports hardware checksum error for padded packets on Mellanox NICs Status in linux package in Ubuntu: Fix Released Status in linux source package in Bionic: In Progress Bug description: Hi, we have the following issue which affects a lot of our customers this issue fixes upstream and need to add the fixes to ubuntu 18.04. Mlx5 driver: Tail padding HW Checksum crash in Ubuntu 18.04 kernel Ubuntu-4.15.0-72 Crach log: [ 785.337368] Call Trace: [ 785.337372] <IRQ> [ 785.337388] dump_stack+0x63/0x8e [ 785.337397] netdev_rx_csum_fault+0x38/0x40 [ 785.337403] __skb_checksum_complete+0xbc/0xd0 [ 785.337408] nf_ip_checksum+0xc3/0xf0 [ 785.337417] icmp_error+0x27d/0x310 [nf_conntrack_ipv4] [ 785.337431] nf_conntrack_in+0x15a/0x510 [nf_conntrack] [ 785.337437] ? __skb_checksum+0x68/0x330 [ 785.337441] ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4] [ 785.337449] nf_hook_slow+0x48/0xc0 [ 785.337452] ? skb_send_sock+0x50/0x50 [ 785.337460] ip_rcv+0x301/0x360 [ 785.337463] ? inet_del_offload+0x40/0x40 [ 785.337468] __netif_receive_skb_core+0x432/0xb80 [ 785.337473] __netif_receive_skb+0x18/0x60 [ 785.337477] ? __netif_receive_skb+0x18/0x60 [ 785.337481] netif_receive_skb_internal+0x45/0xe0 [ 785.337483] napi_gro_receive+0xc5/0xf0 [ 785.337517] mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core] [ 785.337524] ? 
enqueue_task_rt+0x1b4/0x2e0 [ 785.337546] mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core] [ 785.337566] mlx5e_napi_poll+0x9d/0x290 [mlx5_core] [ 785.337569] net_rx_action+0x140/0x3a0 [ 785.337574] __do_softirq+0xe4/0x2d4 [ 785.337580] irq_exit+0xc5/0xd0 [ 785.337583] do_IRQ+0x86/0xe0 [ 785.337588] common_interrupt+0x8c/0x8c [ 785.337590] </IRQ> [ 785.337598] RIP: 0010:cpuidle_enter_state+0xa4/0x2f0 [ 785.337600] RSP: 0018:ffffad8d8329fe68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9 [ 785.337604] RAX: ffff8a6c7f7e1840 RBX: 000000b6d9bf6a06 RCX: 000000000000001f [ 785.337605] RDX: 000000b6d9bf6a06 RSI: ffd4a4b4c86359ce RDI: 0000000000000000 [ 785.337607] RBP: ffffad8d8329fea8 R08: 0000000000000004 R09: 0000000000021080 [ 785.337609] R10: ffffad8d8329fe38 R11: 0056b80166a42400 R12: ffff8a6c7f7ece18 [ 785.337610] R13: 0000000000000005 R14: ffffffffaff73438 R15: 0000000000000000 [HOW TO REPRODUCE]: with scapy on the sender side please run the following commands: 1) a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='127.0.0.1')/ICMP()/Padding(load='\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe') 2) sendp(a, iface='enp6s0f0') 3) check the dmesg i the receiver side [ADDITIONAL INFO]: This issue fixes upstream by the following set of patches: net/mlx5e: Rx, Fix checksum calculation for new hardware --> db849faa9bef993a1379dc510623f750a72fa7ce net/mlx5e: Rx, Check ip headers sanity - > 0318a7b7fcad9765931146efa7ca3a034194737c net/mlx5e: Rx, Fixup skb checksum for packets with tail padding --> 0aa1d18615c163f92935b806dcaff9157645233a net/mlx5e: XDP, Avoid checksum complete when XDP prog is loaded --> 5d0bb3bac4b9f6c22280b04545626fdfd99edc6b mlx5: fix get_ip_proto() --> 
ef6fcd455278c2be3032a346cc66d9dd9866b787 net/mlx5e: Allow reporting of checksum unnecessary --> b856df28f9230a47669efbdd57896084caadb2b3 net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets --> fe1dc069990c1f290ef6b99adb46332c03258f38 net/mlx5e: Set ECN for received packets using CQE indication --> f007c13d4ad62f494c83897eda96437005df4a91 net/mlx5e: Add likely to the common RX checksum flow --> 63a612f984a1fae040ab6f1c6a0f1fdcdf1954b8 net/mlx5e: CHECKSUM_COMPLETE offload for VLAN/QinQ packets --> f938daeee95eb36ef6b431bf054a5cc6cdada112 attached the /var/log/kern.log file. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1854842/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp