Re: r8169: Performance regression and latency instability

Juliana Rodrigueiro Mon, 19 Aug 2019 09:05:07 -0700

Hi!

First of all: Thank you everyone for the input.


Here is some more info about my NIC. (Using the latest ethtool)

# ethtool -i eth0 ; ifconfig eth0
driver: r8169
version:
firmware-version: rtl8168h-2_0.0.2 02/26/15
expansion-rom-version:
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
eth0      Link encap:Ethernet  HWaddr <hidden>
          inet addr:<hidden>  Bcast:<hidden>  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27392501 errors:0 dropped:0 overruns:0 frame:0
          TX packets:24647212 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:33702173568 (31.3 GiB)  TX bytes:35865124147 (33.4 GiB)


On 8/16/19 9:12 PM, Heiner Kallweit wrote:

> Indeed, here we're talking about changes in linux-next, and Juliana's issue is
> with 4.19. However I'd appreciate if Juliana could test with linux-next and
> different combinations of the NETIF_F_xxx features.


I also tested the latest linux-next (20190819) and, unfortunately, the
results did not improve for me. They are about the same as with all the
kernel versions I tested from 4.17 onwards.

# netperf -v 2 -P 0 -H <netserver-ip>,4 -I 99,5 -t omni -l 1 -- -OSTDDEV_LATENCY -m 64K -d Send

627.99

Running linux-next I have the following defaults (shortened for simplicity):

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
... (all off from here)


There are quite a few possible combinations to go through. I executed my
test with SG, TSO, GSO, RX, TX individually disabled, but the results
were all the same or slightly worse.
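
For anyone who wants to repeat the per-feature runs, here is a minimal
sketch of how they could be scripted. It is not the exact procedure I
used. The variable names (IFACE, NETSERVER, DRY_RUN) and the helper
functions are made up for illustration, and the feature list assumes the
short names "sg tso gso rx tx" accepted by "ethtool -K". By default it
only prints the commands; set DRY_RUN=0 to actually run them as root.

```shell
#!/bin/sh
# Sketch: disable one offload feature at a time, run the netperf test,
# then switch the feature back on.
# Assumptions (not from the original report): variable names, helper
# functions, and that each feature was "on" before the run.

IFACE="${IFACE:-eth0}"
NETSERVER="${NETSERVER:-<netserver-ip>}"   # placeholder, fill in your own
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the commands
FEATURES="sg tso gso rx tx"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run the latency test once with a single feature switched off.
test_feature() {
    feature="$1"
    run ethtool -K "$IFACE" "$feature" off
    run netperf -v 2 -P 0 -H "$NETSERVER,4" -I 99,5 -t omni -l 1 -- \
        -OSTDDEV_LATENCY -m 64K -d Send
    # Dependent features may have changed too; verify with "ethtool -k".
    run ethtool -K "$IFACE" "$feature" on
}

for f in $FEATURES; do
    echo "== $f off =="
    test_feature "$f"
done
```

Each run prints one STDDEV_LATENCY value, so the output can be compared
directly against the baseline number above.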

Until I find the root cause, we will have to keep the "tcp: switch to
GSO being always on" patch reverted for production, which is not ideal.

Any other ideas how I could debug this issue?


Best regards,
Juliana.
