From: Manish Chopra <manish.cho...@cavium.com>
Date: Thu, 17 May 2018 12:05:00 -0700

> This patch makes use of build_skb() throughout the driver's receive
> data path [HW GRO flow and non-HW GRO flow]. With this, the driver can
> build the skb directly from the page segments which are already mapped
> to the hardware, instead of allocating a new skb via netdev_alloc_skb()
> and memcpy'ing the data, which is quite costly.
>
> This really improves performance in terms of CPU utilization (with the
> same or slightly higher rx throughput). CPU utilization is significantly
> reduced [almost half] in the non-HW GRO flow, where for every incoming
> MTU-sized packet the driver had to allocate an skb, memcpy the headers,
> etc. Additionally, in that flow it also gets rid of a bunch of additional
> overheads [eth_get_headlen() etc.] needed to split headers and data in
> the skb.
>
> Tested with:
> system: 2 sockets, 4 cores per socket, hyperthreading, 2x4x2=16 cores
> iperf [server]: iperf -s
> iperf [client]: iperf -c <server_ip> -t 500 -i 10 -P 32
>
> HW GRO off - w/o build_skb(), throughput: 36.8 Gbits/sec
>
> Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:  all    0.59    0.00   32.93    0.00    0.00   43.07    0.00    0.00   23.42
>
> HW GRO off - with build_skb(), throughput: 36.9 Gbits/sec
>
> Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:  all    0.70    0.00   31.70    0.00    0.00   25.68    0.00    0.00   41.92
>                                                         ^^^^^                   ^^^^^
>
> HW GRO on - w/o build_skb(), throughput: 36.9 Gbits/sec
>
> Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:  all    0.86    0.00   24.14    0.00    0.00    6.59    0.00    0.00   68.41
>
> HW GRO on - with build_skb(), throughput: 37.5 Gbits/sec
>
> Average:  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:  all    0.87    0.00   23.75    0.00    0.00    6.19    0.00    0.00   69.19
>
> Signed-off-by: Ariel Elior <ariel.el...@cavium.com>
> Signed-off-by: Manish Chopra <manish.cho...@cavium.com>

Looks great, applied, thank you.
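
For anyone wanting to quantify the "HW GRO off" comparison quoted above, the short sketch below computes the relative change in %soft and in overall busy CPU from the two mpstat averages. It is only an illustration: the dictionary and function names are made up here, and treating "busy" CPU as 100 - %idle is an assumption of the sketch, not something stated in the patch description.

```python
# mpstat averages copied from the two "HW GRO off" runs quoted above.
# Names are hypothetical; "busy" = 100 - %idle is an assumption of this sketch.
without_build_skb = {"sys": 32.93, "soft": 43.07, "idle": 23.42}
with_build_skb = {"sys": 31.70, "soft": 25.68, "idle": 41.92}


def relative_drop(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


busy_before = 100.0 - without_build_skb["idle"]
busy_after = 100.0 - with_build_skb["idle"]

soft_drop = relative_drop(without_build_skb["soft"], with_build_skb["soft"])
busy_drop = relative_drop(busy_before, busy_after)

print(f"%soft reduced by {soft_drop:.1%}")
print(f"busy CPU (100 - %idle) reduced by {busy_drop:.1%}")
```

The same helper can be pointed at the "HW GRO on" rows to see that the difference there is much smaller.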