On Wed, 31 Jan 2018 14:53:32 +0100
Björn Töpel <[email protected]> wrote:
> Below are the results in Mpps of the I40E NIC benchmark runs for 64
> byte packets, generated by commercial packet generator HW that is
> generating packets at full 40 Gbit/s line rate.
>
> XDP baseline numbers without this RFC:
> xdp_rxq_info --action XDP_DROP 31.3 Mpps
> xdp_rxq_info --action XDP_TX 16.7 Mpps
>
> XDP performance with this RFC i.e. with the buffer allocator:
> XDP_DROP 21.0 Mpps
> XDP_TX 11.9 Mpps
>
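For reference, the per-packet cost the buffer allocator adds can be derived from the Mpps deltas quoted above (a quick back-of-the-envelope sketch; rates are the ones you posted):

```python
# Translate a Mpps rate into per-packet cost in nanoseconds:
# 1 / (mpps * 1e6) seconds = 1e3 / mpps ns.
def ns_per_packet(mpps):
    return 1e3 / mpps

# XDP_DROP: 31.3 Mpps without the RFC vs 21.0 Mpps with it
drop_overhead = ns_per_packet(21.0) - ns_per_packet(31.3)
# XDP_TX: 16.7 Mpps vs 11.9 Mpps
tx_overhead = ns_per_packet(11.9) - ns_per_packet(16.7)

print(f"XDP_DROP: +{drop_overhead:.1f} ns/pkt")  # ~15.7 ns
print(f"XDP_TX:   +{tx_overhead:.1f} ns/pkt")    # ~24.2 ns
```

That is a fairly large per-packet budget to give up at these rates.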
> AF_PACKET V4 performance from previous RFC on 4.14-rc7:
> Benchmark V2 V3 V4 V4+ZC
> rxdrop 0.67 0.73 0.74 33.7
> txpush 0.98 0.98 0.91 19.6
> l2fwd 0.66 0.71 0.67 15.5
My numbers from before:
V4+ZC
rxdrop 35.2 Mpps
txpush 20.7 Mpps
l2fwd 16.9 Mpps
> AF_XDP performance:
> Benchmark XDP_SKB XDP_DRV XDP_DRV_ZC (all in Mpps)
> rxdrop 3.3 11.6 16.9
> txpush 2.2 NA* 21.8
> l2fwd 1.7 NA* 10.4
The numbers on my system are better than on your system. Compared to
my own before-results, txpush is almost the same, and the l2fwd gap is
smaller for me.
The surprise is the drop in 'rxdrop' performance.
XDP_DRV_ZC
rxdrop 22.0 Mpps
txpush 20.9 Mpps
l2fwd 14.2 Mpps
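In per-packet terms the rxdrop regression against my earlier V4+ZC run is significant (same simple arithmetic, using the quoted Mpps numbers):

```python
# Per-packet cost in nanoseconds for a given Mpps rate.
def ns_per_packet(mpps):
    return 1e3 / mpps

# rxdrop: 35.2 Mpps (my earlier V4+ZC run) vs 22.0 Mpps (XDP_DRV_ZC now)
regression = ns_per_packet(22.0) - ns_per_packet(35.2)
print(f"rxdrop regression: +{regression:.1f} ns/pkt")  # ~17.0 ns
```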
BUT it also seems you have generally slowed down the XDP_DROP results
for i40e:
Before:
$ sudo ./xdp_bench01_mem_access_cost --dev i40e1
XDP_action pps pps-human-readable mem
XDP_DROP 35878204 35,878,204 no_touch
After this patchset:
$ sudo ./xdp_bench01_mem_access_cost --dev i40e1
XDP_action pps pps-human-readable mem
XDP_DROP 28992009 28,992,009 no_touch
And when also reading the packet data:
$ sudo ./xdp_bench01_mem_access_cost --dev i40e1 --read
XDP_action pps pps-human-readable mem
XDP_DROP 25107793 25,107,793 read
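Converted to per-packet cost, the general XDP_DROP slowdown and the extra cost of touching the data come out roughly as (arithmetic on the rates above, rounded to two decimals of Mpps):

```python
# Per-packet cost in nanoseconds for a given Mpps rate.
def ns_per_packet(mpps):
    return 1e3 / mpps

# XDP_DROP no_touch: ~35.88 Mpps before vs ~28.99 Mpps after the patchset
slowdown = ns_per_packet(28.99) - ns_per_packet(35.88)
# Extra cost of --read (touching packet data): ~25.11 Mpps
read_cost = ns_per_packet(25.11) - ns_per_packet(28.99)

print(f"general slowdown: +{slowdown:.1f} ns/pkt")  # ~6.6 ns
print(f"read cost:        +{read_cost:.1f} ns/pkt")  # ~5.3 ns
```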
BTW, see you soon in Brussels (FOSDEM18) ...
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
Raw output behind my XDP_DRV_ZC numbers above:

$ sudo ./xdpsock --rxdrop --interface=i40e1 --queue=11
[...]
i40e1:11 rxdrop
pps pkts 60.01
rx 22,040,099 1,322,572,352
tx 0 0
$ sudo ./xdpsock --txonly --interface=i40e1 --queue=11
[...]
i40e1:11 txonly
pps pkts 239.03
rx 0 0
tx 20,937,885 5,004,790,500
$ sudo ./xdpsock --l2fwd --interface=i40e1 --queue=11
[...]
i40e1:11 l2fwd
pps pkts 152.02
rx 14,244,719 2,165,460,044
tx 14,244,718 2,165,459,915
My before results:
$ sudo ./bench_all.sh
You might want to change the parameters in ./bench_all.sh
i40e1 cpu5 duration 30s zc 16
i40e1 v2 rxdrop duration 29.27s rx: 62959986pkts @ 2150794.94pps
i40e1 v3 rxdrop duration 29.18s rx: 68470248pkts @ 2346658.86pps
i40e1 v4 rxdrop duration 29.45s rx: 68900864pkts @ 2339633.99pps
i40e1 v4 rxdrop zc duration 29.36s rx: 1033722048pkts @ 35206198.62pps
i40e1 v2 txonly duration 29.16s tx: 63272640pkts @ 2169632.53pps.
i40e1 v3 txonly duration 29.14s tx: 62531968pkts @ 2145714.21pps.
i40e1 v4 txonly duration 29.48s tx: 40587316pkts @ 1376761.87pps.
i40e1 v4 txonly zc duration 29.36s tx: 608794761pkts @ 20738953.62pps.
i40e1 v2 l2fwd duration 29.19s rx: 57532736pkts @ 1970885.56pps
tx: 57532672pkts @ 1970883.37pps.
i40e1 v3 l2fwd duration 29.16s rx: 57675961pkts @ 1978149.64pps
tx: 57675897pkts @ 1978147.44pps.
i40e1 v4 l2fwd duration 29.51s rx: 29732pkts @ 1007.58pps
tx: 28708pkts @ 972.88pps.
i40e1 v4 l2fwd zc duration 29.32s rx: 497528256pkts @ 16969091.01pps
tx: 497527296pkts @ 16969058.27pps.