Here is a set of patches targeting performance improvements on various platforms and protocols.

Our main target was rx performance on IOMMU systems, notably the NVIDIA
Jetson TX2 and NVIDIA Xavier platforms.

We introduce a page reuse strategy to better deal with IOMMU DMA mapping
costs. With it we see 80-90% page reuse under some test configurations on
UDP traffic.

This also shows good improvements on other systems with IOMMU hardware,
like AMD Ryzen.

We've also improved the TCP LRO configuration parameters, allowing packets
to coalesce better.

Page reuse tests were carried out using iperf3, iperf2, netperf and pktgen,
mainly on UDP traffic, with various packet lengths.

Jetson TX2, UDP, Default MTU:
RX Lost Datagrams
  Before: Max: 69%  Min: 68%  Avg: 68.5%
  After:  Max: 41%  Min: 38%  Avg: 39.2%
Maximum throughput
  Before: 1.27 Gbits/sec
  After:  2.41 Gbits/sec

AMD Ryzen 5 2400G, UDP, Default MTU:
RX Lost Datagrams
  Before: Max: 12%   Min: 4.5%  Avg: 7.17%
  After:  Max: 6.2%  Min: 2.3%  Avg: 4.26%

Igor Russkikh (6):
  net: aquantia: optimize rx path using larger preallocated skb len
  net: aquantia: optimize rx performance by page reuse strategy
  net: aquantia: Introduce rx refill threshold value
  net: aquantia: Make RX default frame size 2K
  net: aquantia: Increase rx ring default size from 1K to 2K
  net: aquantia: enable driver build for arm64 or compile_test

Nikita Danilov (1):
  net: aquantia: improve LRO configuration

 drivers/net/ethernet/aquantia/Kconfig        |   3 +-
 .../net/ethernet/aquantia/atlantic/aq_cfg.h  |  10 +-
 .../net/ethernet/aquantia/atlantic/aq_nic.c  |   1 +
 .../net/ethernet/aquantia/atlantic/aq_nic.h  |   1 +
 .../net/ethernet/aquantia/atlantic/aq_ring.c | 187 +++++++++++++-----
 .../net/ethernet/aquantia/atlantic/aq_ring.h |  34 +++-
 .../net/ethernet/aquantia/atlantic/aq_vec.c  |   3 +
 .../aquantia/atlantic/hw_atl/hw_atl_a0.c     |   4 -
 .../aquantia/atlantic/hw_atl/hw_atl_b0.c     |  16 +-
 .../atlantic/hw_atl/hw_atl_b0_internal.h     |   2 +-
 .../aquantia/atlantic/hw_atl/hw_atl_llh.c    |  15 ++
 .../aquantia/atlantic/hw_atl/hw_atl_llh.h    |   6 +
 .../atlantic/hw_atl/hw_atl_llh_internal.h    |  13 ++
 13 files changed, 223 insertions(+), 72 deletions(-)

-- 
2.17.1
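
For readers unfamiliar with the idea, the page reuse strategy mentioned in the cover text can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from this series: struct rx_slot, struct rx_stats, rx_slot_map() and rx_slot_refill() are hypothetical names, and the "mapping" is simulated with a flag rather than a real DMA API call. The assumed rule is that a page whose reference count has dropped back to 1 (only the driver holds it) can be handed to the hardware again without a new IOMMU mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one rx ring slot backed by a single page. */
struct rx_slot {
	int refcount;	/* 1 means only the driver still holds the page */
	bool mapped;	/* true while a (simulated) DMA mapping exists */
};

/* Counters used to observe how often the mapping cost is avoided. */
struct rx_stats {
	unsigned long reused;
	unsigned long remapped;
};

/* Simulated expensive step: allocate a fresh page and map it for DMA. */
static void rx_slot_map(struct rx_slot *slot, struct rx_stats *st)
{
	slot->refcount = 1;
	slot->mapped = true;
	st->remapped++;
}

/*
 * Refill one slot. Returns true when the existing page and its mapping
 * were reused, false when a fresh page had to be mapped.
 */
static bool rx_slot_refill(struct rx_slot *slot, struct rx_stats *st)
{
	assert(slot != NULL && st != NULL);

	if (slot->mapped && slot->refcount == 1) {
		/* Nobody else holds the page: keep the mapping as is. */
		st->reused++;
		return true;
	}

	/*
	 * Either the slot was never filled or the page is still
	 * referenced elsewhere: switch the slot to a fresh page.
	 */
	rx_slot_map(slot, st);
	return false;
}
```

In this model, the ratio reused / (reused + remapped) corresponds to the page reuse percentage quoted above; how the real driver tracks references and partial page usage is not shown here.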