Greetings!

During the migration from kernel 3.14 to 4.19, we noticed a regression in network performance. Under exactly the same conditions, the standard deviation of the latency is more than double what it was before on the Realtek RTL8111/8168B (10ec:8168) with the r8169 driver.

Kernel 3.14:
# netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
    313.37

Kernel 4.19:
# netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
    632.96
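
For a quick comparison, the ratio between the two standard deviations can be computed as below. This is only a sketch using the two values reported above; `ratio` is a helper name made up for illustration.

```shell
# Hypothetical helper: print new/old rounded to two decimals.
ratio() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

ratio 313.37 632.96   # prints 2.02
```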

In contrast, we noticed small performance improvements with other, non-Realtek network cards (igb, tg3), which suggested a possible driver-related bug.

However, after bisecting, I ended up at the following commit, which was introduced in kernel 4.17 and modifies net/ipv4:

    commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
    Author: Eric Dumazet <eduma...@google.com>
    Date:   Mon Feb 19 11:56:47 2018 -0800

        tcp: switch to GSO being always on

Could you please help me clarify whether GSO should always be on for my device, or whether this change only affects TCP? According to ethtool, GSO is always off, and "ethtool -K eth0 gso on" has no effect unless I switch SG on first.

    # ethtool -k eth0
    Offload parameters for eth0:
    Cannot get device udp large send offload settings: Operation not supported
    rx-checksumming: on
    tx-checksumming: off
    scatter-gather: off
    tcp-segmentation-offload: off
    udp-fragmentation-offload: off
    generic-segmentation-offload: off
    generic-receive-offload: on
    large-receive-offload: off
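
To make it easier to check the SG/GSO dependency described above, here is a minimal sketch that pulls the state of one feature out of `ethtool -k` output. The helper name `feature_state` is hypothetical (it is not part of ethtool), and the usage lines assume the interface is eth0 and that the commands are run as root.

```shell
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# state ("on" or "off") of the feature named in $1.
feature_state() {
    awk -F': *' -v name="$1" '
        {
            key = $1
            sub(/^[ \t]+/, "", key)      # drop leading indentation
            if (key == name) {
                split($2, word, " ")     # ignore suffixes such as "[fixed]"
                print word[1]
                exit
            }
        }'
}

# Example usage on a live system (assumes eth0, needs root):
#   ethtool -k eth0 | feature_state scatter-gather
#   ethtool -K eth0 sg on && ethtool -K eth0 gso on
#   ethtool -k eth0 | feature_state generic-segmentation-offload
```

The second usage line simply mirrors the observation above that GSO can only be switched on once SG is on; it is not meant as a recommended configuration.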

I validated that reverting "tcp: switch to GSO being always on" successfully brings back the better performance for the r8169 driver.

I'm sure that reverting that commit is not the optimal solution, so I would like to kindly ask for help shedding some light on this issue.

Best regards,
Juliana.
