On Sun, 2017-02-12 at 18:31 +0200, Tariq Toukan wrote:
> On 09/02/2017 6:56 PM, Eric Dumazet wrote:
> >> Default, out of box.
> >
> > Well. Please report :
> >
> > ethtool -l eth0
> > ethtool -g eth0
>
> $ ethtool -g p1p1
> Ring parameters for p1p1:
> Pre-set maximums:
> RX:             8192
> RX Mini:        0
> RX Jumbo:       0
> TX:             8192
> Current hardware settings:
> RX:             1024
> RX Mini:        0
> RX Jumbo:       0
> TX:             512
We are using 4096 slots per RX queue; this is why I could not reproduce your results.

A single TCP flow can easily have more than 1024 MSS waiting in its receive queue (the typical receive window on Linux is 6MB/2).

I mentioned that having a slightly inflated skb->truesize might have an impact in some workloads (charging 2048 bytes per MSS instead of 1536), but this is not related to mlx4 and should be tweaked in the TCP stack instead, since this 2048-byte (half a page on x86) strategy is now widespread.
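For reference, a minimal sketch of the arithmetic and of how the ring could be enlarged, assuming the p1p1 interface from the report above and a ~1460-byte MSS (exact figures depend on MTU and window tuning):

# Rough arithmetic (assumes ~1460-byte MSS; actual values vary):
#   receive window ~ 6 MB / 2 = 3 MB
#   3 MB / 1460 bytes ~= 2150 segments outstanding for one flow,
#   well above the 1024 RX descriptors currently configured.

# Inspect current/maximum ring sizes, then raise the RX ring:
ethtool -g p1p1
ethtool -G p1p1 rx 4096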