On 2015-06-26 11:04, Tom Herbert wrote:
I am testing the simplest configuration, which has 1 TCP flow generated
by iperf from a VM connected to a Linux bridge with a vxlan tunnel
interface. The 10G NIC (82599 ES) has multiple receive queues, but in
this simple test that is likely immaterial (because the tuple on which
it hashes would be fixed). The real difference in performance appears
to be whether or not vxlan GRO is performed by software.
Please do "ethtool -k vxlan0" or whatever interface is for vxlan.
Ensure GRO is "on"; if not, enable it on the interface by "ethtool -K
vxlan0 gro on". Run iperf and do tcpdump on the vxlan interface to
verify GRO is being done. If we are seeing performance degradation
when GRO is being done at tunnel versus device that would be a
different problem than no GRO being done at all.
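The GRO check suggested above can be scripted so that it is easy to repeat on every interface in the path. This is only a minimal sketch: the helper name gro_state is made up for illustration, and it assumes the usual "feature: state" layout that "ethtool -k" prints.

```shell
# Print the GRO state ("on", "off", "off [fixed]", ...) found in
# `ethtool -k` output supplied on stdin.
# gro_state is a hypothetical helper name, not part of ethtool.
gro_state() {
    awk -F': ' '$1 ~ /generic-receive-offload$/ { print $2 }'
}

# Usage on a live system (interface names taken from this thread):
#   for dev in int4 br_perf vxlan0; do
#       printf '%s: %s\n' "$dev" "$(ethtool -k "$dev" | gro_state)"
#   done
```

If any interface reports "off", "ethtool -K <dev> gro on" turns it on, unless the state is marked [fixed].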
Here are more details on the test.
GRO is "on" on the device and the tunnel. tcpdump on the vxlan interface
shows un-aggregated packets:
[root@ramu1 tracing]# tcpdump -i vxlan0
<snip>
ptions [nop,nop,TS val 1972850548 ecr 193703], length 1398
14:14:38.911955 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.],
seq 224921449:224922847, ack 1, win 221, options [nop,nop,TS val
1972850548 ecr 193703], length 1398
14:14:38.911957 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.],
seq 224922847:224924245, ack 1, win 221, options [nop,nop,TS val
1972850548 ecr 193703], length 1398
14:14:38.911958 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.],
seq 224924245:224925643, ack 1, win 221, options [nop,nop,TS val
1972850548 ecr 193703], length 1398
14:14:38.911959 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.],
seq 224925643:224927041, ack 1, win 221, options [nop,nop,TS val
1972850548 ecr 193703], length 1398
In the kernel trace I don't see "vxlan_gro_receive" being hit at all.
[root@localhost ~]# ./iperf -s -i 2
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 1.1.1.11 port 5001 connected with 1.1.1.21 port 44135
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 503 MBytes 2.11 Gbits/sec
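Rather than eyeballing a capture, the payload sizes can be summarised to tell whether GRO aggregated anything. The following is a rough sketch, not something that was run for this report: seg_stats is a hypothetical helper name, and it assumes tcpdump's default (non-verbose) text output, where each packet line carries a single "length N" field.

```shell
# Read tcpdump text output on stdin and report the count, mean and
# maximum of the TCP payload sizes (the number after "length").
# Pure ACKs (length 0) are ignored. seg_stats is a hypothetical name.
seg_stats() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "length" && $(i + 1) + 0 > 0) {
                len = $(i + 1) + 0
                n++; sum += len
                if (len > max) max = len
            }
    }
    END { if (n) printf "segments=%d mean=%d max=%d\n", n, sum / n, max }'
}

# Usage on a live system:
#   tcpdump -i vxlan0 -nn -c 1000 tcp | seg_stats
```

A maximum no larger than one MTU-sized payload (1398 bytes in this setup) means no aggregation was seen on that interface; values in the tens of kilobytes mean GRO is working.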
With the proposed patch (and everything else remaining the same) tcpdump
shows aggregated frames like this:
[root@ramu1 perf]# tcpdump -i vxlan0
<snip>
14:29:50.961380 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.],
seq 24565681:24629989, ack 1, win 221, options [nop,nop,TS val
1973762616 ecr 4294793113], length 64308
14:29:50.961506 IP 1.1.1.11.commplex-link > 1.1.1.21.44138: Flags [.],
ack 24629989, win 21888, options [nop,nop,TS val 4294793113 ecr
1973762616], length 0
14:29:50.961463 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.],
seq 24629989:24694297, ack 1, win 221, options [nop,nop,TS val
1973762616 ecr 4294793113], length 64308
14:29:50.961518 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.],
seq 24694297:24758605, ack 1, win 221, options [nop,nop,TS val
1973762616 ecr 4294793113], length 64308
14:29:50.961655 IP 1.1.1.11.commplex-link > 1.1.1.21.44138: Flags [.],
ack 24694297, win 21932, options [nop,nop,TS val 4294793113 ecr
1973762616], length 0
14:29:50.961626 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [P.],
seq 24758605:24822913, ack 1, win 221, options [nop,nop,TS val
1973762616 ecr 4294793113], length 64308
[root@localhost ~]# ./iperf -s -i 2
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 1.1.1.11 port 5001 connected with 1.1.1.21 port 44136
[ ID] Interval Transfer Bandwidth
[ 4] 0.0- 2.0 sec 1.64 GBytes 7.04 Gbits/sec
[ 4] 2.0- 4.0 sec 1.98 GBytes 8.48 Gbits/sec
[ 4] 4.0- 6.0 sec 1.98 GBytes 8.52 Gbits/sec
[ 4] 6.0- 8.0 sec 1.99 GBytes 8.53 Gbits/sec
The kernel trace shows vxlan_gro_receive being hit.
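To put the two captures side by side: the frame sizes give the number of wire segments merged into each aggregated frame. A small worked calculation, using only the lengths shown in the tcpdump output above (the function name is made up for illustration):

```shell
# Number of whole segments of size $2 that fit into a frame of size $1.
# segs_per_frame is a hypothetical helper name.
segs_per_frame() {
    echo $(( $1 / $2 ))
}

segs_per_frame 64308 1398    # prints 46
echo $(( 64308 % 1398 ))     # prints 0, so the frame is exactly 46 segments
```

So with the patch the stack above the tunnel handles one frame where it previously handled 46.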
Topology:
---------
VM1 ---bridge (br_perf)---vxlan0----10Gnic(int4)-----10Gnic---vxlan0----bridge (br_perf)---VM2
MTUs:
VM (1450)
br_perf (9000)
vxlan0 (9000)
int4 (9000)
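The VM MTU of 1450 also accounts for the 1398-byte segments in the un-aggregated capture. A quick check, assuming IPv4 with no IP options, a 20-byte TCP header, and the 12 bytes of TCP options (nop,nop,TS) visible in the tcpdump output; tcp_payload is a hypothetical helper name:

```shell
# TCP payload per segment for a given MTU, assuming:
#   20 bytes IPv4 header (no IP options)
#   20 bytes TCP header
#   12 bytes TCP options (nop,nop,timestamp)
# tcp_payload is a hypothetical helper name.
tcp_payload() {
    mtu=$1
    echo $(( mtu - 20 - 20 - 12 ))
}

tcp_payload 1450    # prints 1398
```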
Hw/Sw Adapter, Drivers
-----------------------
02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
[root@ramu1 ~]# ethtool -i int4
driver: ixgbe
version: 4.0.1-k-rh7.1
firmware-version: 0x80000208
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
Config HOST1:
-------------
[root@ramu1 perf]# cat docfg.sh
ip link del vxlan0
ip link set dev br_perf down
brctl delbr br_perf
brctl addbr br_perf
ip link set dev br_perf up
ip link add vxlan0 mtu 9000 type vxlan id 1 l2miss l3miss rsc proxy nolearning dstport 8472
ip link set dev vxlan0 up
brctl addif br_perf vxlan0
ip neigh add 1.1.1.21 lladdr 52:54:00:17:c8:4d dev vxlan0 nud permanent
bridge fdb replace 52:54:00:17:c8:4d dev vxlan0 self permanent dst 10.50.117.216
Config VM1:
-----------
eth0 IP,MAC: 1.1.1.11, 52:54:00:6c:53:61
CPU affinity for both VMs
--------------------------
[root@ramu1 perf]# virsh vcpuinfo centos-6.5
VCPU: 0
CPU: N/A
State: N/A
CPU time N/A
CPU Affinity: ---y------------------------------------
Iptables disabled on bridges
------------------------------
echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables
Offload settings (both hosts are at default)
--------------------------------------------
[root@ramu1 perf]# ethtool -k int4
Features for int4:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: on [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: on [fixed]
[root@ramu1 perf]# ethtool -k br_perf
Features for br_perf:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [requested on]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [requested on]
tx-fcoe-segmentation: off [requested on]
tx-gre-segmentation: on
tx-ipip-segmentation: on
tx-sit-segmentation: on
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
[root@ramu1 perf]# ethtool -k vxlan0
Features for vxlan0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
[root@ramu1 perf]#