On 16.11.2018 at 21:06, Cong Wang wrote:
On Thu, Nov 15, 2018 at 8:50 PM Herbert Xu <herb...@gondor.apana.org.au> wrote:
On Thu, Nov 15, 2018 at 06:23:38PM -0800, Cong Wang wrote:
Normally if the hardware's partial checksum is valid then we just
trust it and send the packet along.  However, if the partial
checksum is invalid we don't trust it and we will compute the
whole checksum manually which is what ends up in sum.
Not sure if I understand partial checksum here, but it is the
CHECKSUM_COMPLETE case which I am trying to fix, not
CHECKSUM_PARTIAL.
What I meant by partial checksum is the checksum produced by the
hardware on RX.  In the kernel we call that CHECKSUM_COMPLETE.
CHECKSUM_PARTIAL is the absence of the substantial part of the
checksum which is something we use in the kernel primarily for TX.

Yes the names are confusing :)
Yeah, understood. The hardware provides skb->csum in this case, but
we keep adjusting it each time when we change skb->data.
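To illustrate the adjustment described above, here is a minimal sketch in Python, not kernel code. It assumes a 16-bit ones' complement (Internet) checksum, a made-up 14-byte header of even length, and a made-up payload; the helper names fold, csum16, csum_sub and same_csum are hypothetical. In the kernel this role is played by helpers such as skb_postpull_rcsum().

```python
def fold(total: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum16(data: bytes) -> int:
    """Ones' complement sum of big-endian 16-bit words (odd length is zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)))


def csum_sub(csum: int, removed: int) -> int:
    """Remove `removed` from `csum`: subtracting is adding the complement."""
    return fold(csum + (~removed & 0xFFFF))


def same_csum(a: int, b: int) -> bool:
    """0x0000 and 0xFFFF both represent zero in ones' complement arithmetic."""
    return a % 0xFFFF == b % 0xFFFF


header = bytes(range(14))            # hypothetical 14-byte header (even length)
payload = b"hypothetical payload"    # hypothetical payload
hw_csum = csum16(header + payload)   # what the NIC would report for the whole frame

# After the header is pulled, the stored sum must stop covering it.
adjusted = csum_sub(hw_csum, csum16(header))
print(same_csum(adjusted, csum16(payload)))
```

The sketch only handles an even-length pull; pulling an odd number of bytes would shift the byte lanes of everything after it and needs extra care.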


So, in other words, a checksum *match* is intended to detect
this HW RX checksum fault?
Correct.  Or more likely it's probably a bug in either the driver
or, if there is overlaying code such as VLAN, then in that code.

Basically if the RX checksum is buggy, it's much more likely to
cause a valid packet to be rejected than to cause an invalid packet
to be accepted, because we still verify that checksum against the
pseudoheader.  So we only attempt to catch buggy hardware/drivers
by doing a second manual verification for the case where the packet
is flagged as invalid.
Hmm, now I see how it works. Actually it uses the difference between
these two checks as the difference between the hardware checksum and
skb_checksum().
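A rough sketch of the two-stage check discussed above, written in Python rather than kernel C. It assumes an ICMP-style message (checksum field at bytes 2..3, no pseudoheader, so pseudo defaults to 0); verify_rx, with_checksum and the sample packets are hypothetical names and data, not kernel APIs. In the kernel the warning itself comes from netdev_rx_csum_fault(), reached via __skb_checksum_complete() as in the trace in this thread.

```python
def fold(total: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum16(data: bytes) -> int:
    """Ones' complement sum of big-endian 16-bit words (odd length is zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    return fold(sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)))


def verify_rx(packet: bytes, hw_csum: int, pseudo: int = 0):
    """Return (accepted, hw_fault) for one received packet."""
    # Stage 1: if the hardware sum checks out, trust it and move on.
    if fold(hw_csum + pseudo) == 0xFFFF:
        return True, False
    # Stage 2: the hardware sum looks invalid, so recompute in software.
    if fold(csum16(packet) + pseudo) == 0xFFFF:
        # Software says the packet is fine: the hardware/driver sum was wrong.
        return True, True
    return False, False


def with_checksum(msg: bytes) -> bytes:
    """Fill bytes 2..3 so that the whole message sums to 0xFFFF."""
    body = msg[:2] + b"\x00\x00" + msg[4:]
    c = ~csum16(body) & 0xFFFF
    return body[:2] + c.to_bytes(2, "big") + body[4:]


good = with_checksum(b"\x08\x00\x00\x00\x00\x01\x00\x01ping")  # made-up echo request
bad = good[:-1] + bytes([good[-1] ^ 0x01])                      # one corrupted bit

print(verify_rx(good, csum16(good)))  # correct hardware sum
print(verify_rx(good, 0x1234))        # bogus hardware sum on a valid packet
print(verify_rx(bad, csum16(bad)))    # genuinely corrupted packet
```

Only the second call reports a fault: the packet is valid in software while the hardware sum disagrees, which is the case the "hw csum failure" warning is meant to catch.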

I will send a patch to add a comment there to avoid confusion.


Sure, my case is nearly the same as Pawel's, except that I have no vlan:
https://marc.info/?l=linux-netdev&m=154086647601721&w=2
Can you please provide your backtrace?
I already did:
https://marc.info/?l=linux-netdev&m=154092211305599&w=2

Note, the offending commit has been backported to 4.14, which
is why I saw this warning. I have no idea why it was backported
in the first place; it is just an optimization and doesn't fix any bug,
IMHO.

Also, it is much harder for me to reproduce than for Pawel, who
saw the warning every second. Sometimes I need 1 hour to trigger
it; sometimes other people here need 10+ hours to trigger it.

By the way, I changed the network controller for the vlans where I was receiving the rx csum failures to an 82599 with the ixgbe driver.

With mellanox:

[91584.359273] vlan980: hw csum failure
[91584.359278] CPU: 54 PID: 0 Comm: swapper/54 Not tainted 4.20.0-rc1+ #2
[91584.359279] Call Trace:
[91584.359282]  <IRQ>
[91584.359290]  dump_stack+0x46/0x5b
[91584.359296]  __skb_checksum_complete+0x9b/0xb0
[91584.359301]  icmp_rcv+0x51/0x1f0
[91584.359305]  ip_local_deliver_finish+0x49/0xd0
[91584.359307]  ip_local_deliver+0xb7/0xe0
[91584.359309]  ? ip_sublist_rcv_finish+0x50/0x50
[91584.359310]  ip_rcv+0x96/0xc0
[91584.359313]  __netif_receive_skb_one_core+0x4b/0x70
[91584.359315]  netif_receive_skb_internal+0x2f/0xc0
[91584.359316]  napi_gro_receive+0xb0/0xd0
[91584.359320]  mlx5e_handle_rx_cqe+0x78/0xd0
[91584.359321]  mlx5e_poll_rx_cq+0xc4/0x970
[91584.359323]  mlx5e_napi_poll+0xab/0xcb0
[91584.359325]  net_rx_action+0xd9/0x300
[91584.359328]  __do_softirq+0xd3/0x2d9
[91584.359333]  irq_exit+0x7a/0x80
[91584.359334]  do_IRQ+0x72/0xc0
[91584.359336]  common_interrupt+0xf/0xf
[91584.359337]  </IRQ>
[91584.359340] RIP: 0010:mwait_idle+0x74/0x1b0
[91584.359342] Code: ae f0 31 d2 65 48 8b 04 25 80 4c 01 00 48 89 d1 0f 01 c8 48 8b 00 48 c1 e8 03 83 e0 01 0f 85 26 01 00 00 48 89 c1 fb 0f 01 c9 <65> 8b 2d 95 8e 6b 7e 0f 1f 44 00 00 65 48 8b 04 25 80 4c 01 00 f0
[91584.359343] RSP: 0018:ffffc900034f3ec0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
[91584.359344] RAX: 0000000000000000 RBX: 0000000000000036 RCX: 0000000000000000
[91584.359345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[91584.359346] RBP: 0000000000000036 R08: 0000000000000000 R09: 0000000000000000
[91584.359346] R10: 00000001008b49bb R11: 0000000000000c00 R12: 0000000000000000
[91584.359347] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[91584.359352]  do_idle+0x19f/0x1c0
[91584.359354]  ? do_idle+0x4/0x1c0
[91584.359355]  cpu_startup_entry+0x14/0x20
[91584.359360]  start_secondary+0x165/0x190
[91584.359364]  secondary_startup_64+0xa4/0xb0


With intel no errors.



Let me see if I can add vlan on my side to make it more reproducible,
it seems hard as our switch doesn't use vlan either.

We have warnings with conntrack involved too; I can provide those as well
if you are interested.

I am inclined to revert it for -stable; at least that is what I plan to do
on my side unless there is a fix coming soon.

Thanks.
