From: Eric Dumazet <eric.duma...@gmail.com>
Date: Tue, 28 Feb 2017 10:34:50 -0800

> From: Eric Dumazet <eduma...@google.com>
>
> While playing with mlx4 hardware timestamping of RX packets, I found
> that some packets were received by TCP stack with a ~200 ms delay...
>
> Since the timestamp was provided by the NIC, and my probe was added
> in tcp_v4_rcv() while in BH handler, I was confident it was not
> a sender issue, or a drop in the network.
>
> This would happen with a very low probability, but hurting RPC
> workloads.
>
> A NAPI driver normally arms the IRQ after the napi_complete_done(),
> after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
> it.
>
> Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
> while IRQ are not disabled, we might have later an IRQ firing and
> finding this bit set, right before napi_complete_done() clears it.
>
> This can happen with busy polling users, or if gro_flush_timeout is
> used. But some other uses of napi_schedule() in drivers can cause this
> as well.
 ...
> Signed-off-by: Eric Dumazet <eduma...@google.com>

Applied, thanks Eric.
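
For readers following the thread, the ordering problem described in the quoted text can be sketched as a small single-threaded toy model. This is only an illustration under stated assumptions, not kernel code: the struct toy_napi and every toy_* function are hypothetical names invented here, a plain bool stands in for the NAPI_STATE_SCHED bit, and the model assumes the device raises no further interrupt until the driver re-arms it.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; none of these names exist in the kernel. */
struct toy_napi {
    bool sched;        /* stands in for the NAPI_STATE_SCHED bit */
    bool irq_armed;    /* device is allowed to raise an interrupt */
    int  pending;      /* packets sitting in the RX ring */
    int  polls_queued; /* polls scheduled by the hard IRQ handler */
};

/* Try to take ownership of polling; fails if the bit is already set. */
bool toy_schedule_prep(struct toy_napi *n)
{
    if (n->sched)
        return false;
    n->sched = true;
    return true;
}

/* Hard IRQ handler: the interrupt is consumed even if the bit is taken. */
void toy_hard_irq(struct toy_napi *n)
{
    n->irq_armed = false;
    if (toy_schedule_prep(n))
        n->polls_queued++;
}

/* The current owner is done and clears the bit. */
void toy_complete(struct toy_napi *n)
{
    n->sched = false;
}

/* Replays the problematic ordering; returns the number of stranded packets. */
int toy_race(void)
{
    struct toy_napi n = { .sched = false, .irq_armed = true,
                          .pending = 0, .polls_queued = 0 };

    toy_schedule_prep(&n); /* e.g. a busy poller grabs the bit, IRQ still armed */
    n.pending = 1;         /* packet lands after the poller last checked the ring */
    toy_hard_irq(&n);      /* IRQ fires, finds the bit set, schedules nothing */
    toy_complete(&n);      /* the bit is cleared right afterwards */

    /* Nobody owns the bit, no poll is queued, and the IRQ is not armed. */
    if (!n.sched && n.polls_queued == 0 && !n.irq_armed)
        return n.pending;
    return 0;
}
```

Calling toy_race() from a main() returns 1 in this model: the packet stays in the ring until some later event triggers a poll, which matches the kind of delayed delivery the quoted description reports.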