Hi Mohammad,

Thanks for verifying the kernel again, and good to see that the problem is fixed when you run padded traffic over your NICs.

The SRU cycle has completed, and the 4.15.0-88-generic kernel has been released to -updates. You can go ahead and tell your affected customers to upgrade to this kernel to fix their checksum problems.

That's all from me for now. Let me know if you need anything else.

Thanks,
Matthew

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1854842

Title:
  mlx5_core reports hardware checksum error for padded packets on Mellanox NICs

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  BugLink: https://bugs.launchpad.net/bugs/1854842

  [Impact]

  On machines equipped with Mellanox NICs, in this particular case Mellanox 5 series NICs using the mlx5_core driver, there is a kernel splat when sending large IP packets which have padding at the end.

  enp6s0f0: hw csum failure
  CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic
  Call Trace:
  <IRQ>
  dump_stack+0x63/0x8e
  netdev_rx_csum_fault+0x38/0x40
  __skb_checksum_complete+0xbc/0xd0
  nf_ip_checksum+0xc3/0xf0
  icmp_error+0x27d/0x310 [nf_conntrack_ipv4]
  nf_conntrack_in+0x15a/0x510 [nf_conntrack]
  ? __skb_checksum+0x68/0x330
  ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
  nf_hook_slow+0x48/0xc0
  ? skb_send_sock+0x50/0x50
  ip_rcv+0x301/0x360
  ? inet_del_offload+0x40/0x40
  __netif_receive_skb_core+0x432/0xb80
  __netif_receive_skb+0x18/0x60
  ? __netif_receive_skb+0x18/0x60
  netif_receive_skb_internal+0x45/0xe0
  napi_gro_receive+0xc5/0xf0
  mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core]
  ? enqueue_task_rt+0x1b4/0x2e0
  mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core]
  mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
  net_rx_action+0x140/0x3a0
  __do_softirq+0xe4/0x2d4
  irq_exit+0xc5/0xd0
  do_IRQ+0x86/0xe0
  common_interrupt+0x8c/0x8c
  </IRQ>

  This bug is a further attempt to fix these splats, as there have been previous fixes in LP #1840854 and a series of commits which landed in 4.15.0-67 (LP #1847155) as a part of upstream -stable patches.

  This bug will also fix the same problems on the new Mellanox CX6 and Bluefield hardware, which has already been enabled via previous upstream -stable patches which landed in LP #1847155.

  [Fix]

  This particular issue was fixed for Mellanox series 5 drivers in the following commits:

  commit 0aa1d18615c163f92935b806dcaff9157645233a
  Author: Saeed Mahameed <sae...@mellanox.com>
  Date: Tue Mar 12 00:24:52 2019 -0700
  Subject: net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

  This commit required a minor backport. It was selected for upstream -stable in 4.19.76 and 5.0.10. It appears to have been omitted from "Bionic update: upstream stable patchset 2019-10-07", which is LP #1847155, probably because it required a backport.

  commit db849faa9bef993a1379dc510623f750a72fa7ce
  Author: Saeed Mahameed <sae...@mellanox.com>
  Date: Fri May 3 13:14:59 2019 -0700
  Subject: net/mlx5e: Rx, Fix checksum calculation for new hardware

  This commit required a minor backport. It was selected for upstream -stable in 5.1.21 and 5.2.4. It has already been applied to the disco kernel, as part of stable updates.

  [Testcase]

  The following scapy script will reproduce this issue.
Run from the machine with the Mellanox series 5 NIC: 1) a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='127.0.0.1')/ICMP()/Padding(load='\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe\xfe') 2) sendp(a, iface='enp6s0f0') 3) Check dmesg on the reciever side. The example uses localhost, so check dmesg. I have built some test kernels, which are available here: https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test This kernel contains 0aa1d18615c163f92935b806dcaff9157645233a. and https://launchpad.net/~mruffell/+archive/ubuntu/lp1854842-test-2 This kernel contains db849faa9bef993a1379dc510623f750a72fa7ce. If you install the test kernels the issue is resolved. [Regression Potential] The changes are limited to the mlx5_core driver, and only modify how packet checksums are calculated when padding is involved. Both patches have been accepted and published by upstream -stable, and are widely accepted by the community. Because of this, I believe the risk of regression is low. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1854842/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp
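
For anyone who wants to see what the testcase frame looks like on the wire, here is a rough sketch in plain Python that builds a similar frame by hand (Ethernet / IPv4 / ICMP echo request / trailing 0xfe bytes) without scapy. It is not the scapy script from the testcase and not the driver's logic. The function names, the all-zero source MAC, the 127.0.0.1 source address, the IP id and TTL values, and the default padding length of 64 are all assumptions made for illustration. The point it shows is that the padding sits after the end of the IP packet (beyond the IP total length), so a checksum taken over the whole received payload differs from one taken over the IP packet alone.

```python
import socket
import struct

ETH_HLEN = 14       # Ethernet header length in bytes
ETH_P_IP = 0x0800   # EtherType for IPv4


def csum_partial(data: bytes) -> int:
    """Folded 16-bit one's-complement sum (RFC 1071 style), not inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def inet_checksum(data: bytes) -> int:
    """Internet checksum: inverted one's-complement sum."""
    return ~csum_partial(data) & 0xFFFF


def build_padded_frame(pad_len: int = 64, src_mac: bytes = bytes(6)) -> bytes:
    """Build a broadcast Ethernet frame with tail padding after the IP packet.

    pad_len and src_mac are illustrative assumptions, not values from the bug.
    """
    # ICMP echo request: type 8, code 0, checksum, id 0, seq 0.
    icmp = struct.pack("!BBHHH", 8, 0, 0, 0, 0)
    icmp = struct.pack("!BBHHH", 8, 0, inet_checksum(icmp), 0, 0)

    lo = socket.inet_aton("127.0.0.1")
    total_len = 20 + len(icmp)  # IP total length excludes the padding
    fields = [0x45, 0, total_len, 1, 0, 64, socket.IPPROTO_ICMP, 0, lo, lo]
    ip = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = inet_checksum(ip)  # fill in the header checksum
    ip = struct.pack("!BBHHHBBH4s4s", *fields)

    eth = b"\xff" * 6 + src_mac + struct.pack("!H", ETH_P_IP)
    return eth + ip + icmp + b"\xfe" * pad_len


def tail_padding_len(frame: bytes) -> int:
    """Bytes in the frame that lie beyond the IP total length."""
    (total_len,) = struct.unpack_from("!H", frame, ETH_HLEN + 2)
    return len(frame) - ETH_HLEN - total_len


def send_frame(frame: bytes, iface: str) -> None:
    """Send a raw frame on iface (Linux only, needs root / CAP_NET_RAW)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as s:
        s.bind((iface, 0))
        s.send(frame)
```

To try it, something like send_frame(build_padded_frame(), 'enp6s0f0') run as root should put a comparable frame on the wire, after which dmesg can be checked as in step 3. Comparing csum_partial(frame[14:]) with csum_partial(frame[14:14 + 28]) shows the two sums disagree once padding is present, which is the kind of mismatch the driver and the stack have to account for consistently.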