On 2015-06-26 11:04, Tom Herbert wrote:
I am testing the simplest configuration which has 1 TCP flow generated by
iperf from a VM connected to a linux bridge with a vxlan tunnel interface.
The 10G nic (82599 ES) has multiple receive queues, but in this simple
test, it is likely immaterial (because, the tuple on which it hashes would
be fixed). The real difference in performance appears to be whether or not
vxlan gro is performed by software.


Please do "ethtool -k vxlan0" (or whichever interface is used for vxlan).
Ensure GRO is "on"; if not, enable it on the interface with "ethtool -K
vxlan0 gro on". Run iperf and do a tcpdump on the vxlan interface to
verify GRO is being done. If we are seeing performance degradation
when GRO is being done at tunnel versus device that would be a
different problem than no GRO being done at all.

Here are more details on the test.

GRO is "on" on both the device and the tunnel, but tcpdump on the vxlan interface shows un-aggregated packets:

[root@ramu1 tracing]# tcpdump -i vxlan0
<snip>
ptions [nop,nop,TS val 1972850548 ecr 193703], length 1398
14:14:38.911955 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.], seq 224921449:224922847, ack 1, win 221, options [nop,nop,TS val 1972850548 ecr 193703], length 1398
14:14:38.911957 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.], seq 224922847:224924245, ack 1, win 221, options [nop,nop,TS val 1972850548 ecr 193703], length 1398
14:14:38.911958 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.], seq 224924245:224925643, ack 1, win 221, options [nop,nop,TS val 1972850548 ecr 193703], length 1398
14:14:38.911959 IP 1.1.1.21.44134 > 1.1.1.11.commplex-link: Flags [.], seq 224925643:224927041, ack 1, win 221, options [nop,nop,TS val 1972850548 ecr 193703], length 1398

In the kernel trace I don't see "vxlan_gro_receive" being hit at all.
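
For anyone who wants to repeat this check, a rough sketch of how the hits could be counted from saved function-tracer output follows. The helper name count_hits is made up for illustration, and the commented-out steps assume the standard ftrace files under /sys/kernel/debug/tracing; they are not copied from my test scripts.

```shell
# Count how many lines of saved ftrace output mention a given function.
# Reads the trace text on stdin; prints 0 when the function never appears.
count_hits() {
    # -c counts matching lines, -w matches the name as a whole word only.
    # grep exits non-zero on zero matches, so "|| true" keeps callers happy.
    grep -cw -- "$1" || true
}

# Assumed live usage (needs root and a mounted debugfs):
#   cd /sys/kernel/debug/tracing
#   echo vxlan_gro_receive > set_ftrace_filter
#   echo function > current_tracer
#   ... run iperf ...
#   count_hits vxlan_gro_receive < trace
```

A count of 0 after an iperf run would match what is described above: the vxlan GRO handler is never entered.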

[root@localhost ~]# ./iperf -s -i 2
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 1.1.1.11 port 5001 connected with 1.1.1.21 port 44135
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec   503 MBytes  2.11 Gbits/sec


With the proposed patch (and everything else remaining the same) tcpdump shows aggregated frames like this:

[root@ramu1 perf]# tcpdump -i vxlan0
<snip>
14:29:50.961380 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.], seq 24565681:24629989, ack 1, win 221, options [nop,nop,TS val 1973762616 ecr 4294793113], length 64308
14:29:50.961506 IP 1.1.1.11.commplex-link > 1.1.1.21.44138: Flags [.], ack 24629989, win 21888, options [nop,nop,TS val 4294793113 ecr 1973762616], length 0
14:29:50.961463 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.], seq 24629989:24694297, ack 1, win 221, options [nop,nop,TS val 1973762616 ecr 4294793113], length 64308
14:29:50.961518 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [.], seq 24694297:24758605, ack 1, win 221, options [nop,nop,TS val 1973762616 ecr 4294793113], length 64308
14:29:50.961655 IP 1.1.1.11.commplex-link > 1.1.1.21.44138: Flags [.], ack 24694297, win 21932, options [nop,nop,TS val 4294793113 ecr 1973762616], length 0
14:29:50.961626 IP 1.1.1.21.44138 > 1.1.1.11.commplex-link: Flags [P.], seq 24758605:24822913, ack 1, win 221, options [nop,nop,TS val 1973762616 ecr 4294793113], length 64308
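
Rather than eyeballing the dump, the average payload size of the data segments gives a quick yes/no on whether aggregation is happening. The following is only a sketch: avg_len is a hypothetical helper name, and it assumes the default one-line-per-packet tcpdump text format shown above.

```shell
# Average TCP payload length of the data segments in tcpdump text output.
# Reads tcpdump lines on stdin; pure ACKs (length 0) are skipped.
avg_len() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "length") {
                n = $(i + 1) + 0
                if (n > 0) { sum += n; cnt++ }
            }
        }
    } END {
        if (cnt) { printf "%d\n", sum / cnt } else { print 0 }
    }'
}

# Assumed usage against a live capture (interface name from this setup):
#   tcpdump -n -c 1000 -i vxlan0 tcp | avg_len
```

A result close to the sender's segment size (1398 here) means no aggregation; a result in the tens of kilobytes means GRO is merging segments.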


[root@localhost ~]# ./iperf -s -i 2
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 1.1.1.11 port 5001 connected with 1.1.1.21 port 44136
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0- 2.0 sec  1.64 GBytes  7.04 Gbits/sec
[  4]  2.0- 4.0 sec  1.98 GBytes  8.48 Gbits/sec
[  4]  4.0- 6.0 sec  1.98 GBytes  8.52 Gbits/sec
[  4]  6.0- 8.0 sec  1.99 GBytes  8.53 Gbits/sec

The kernel trace shows vxlan_gro_receive being hit.
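
To put a number on the difference between the two runs, the per-interval figures can be averaged. This is a sketch only; mean_gbits is a hypothetical helper, and it assumes every interval is reported in Gbits/sec as in the output above.

```shell
# Mean of the per-interval bandwidth figures in iperf server output.
# Reads iperf text on stdin; only lines reported in Gbits/sec are used.
mean_gbits() {
    awk '$NF == "Gbits/sec" { sum += $(NF - 1); n++ }
         END { if (n) { printf "%.2f\n", sum / n } else { print "0.00" } }'
}

# Interval lines copied from the two iperf runs above.
without_patch=$(printf '%s\n' \
    '[  4]  0.0- 2.0 sec   503 MBytes  2.11 Gbits/sec' | mean_gbits)
with_patch=$(printf '%s\n' \
    '[  4]  0.0- 2.0 sec  1.64 GBytes  7.04 Gbits/sec' \
    '[  4]  2.0- 4.0 sec  1.98 GBytes  8.48 Gbits/sec' \
    '[  4]  4.0- 6.0 sec  1.98 GBytes  8.52 Gbits/sec' \
    '[  4]  6.0- 8.0 sec  1.99 GBytes  8.53 Gbits/sec' | mean_gbits)
echo "without patch: $without_patch Gbits/sec, with patch: $with_patch Gbits/sec"
```

The first interval of the patched run includes ramp-up, so the steady-state gain is somewhat larger than the plain mean suggests.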


Topology:
---------

VM1 ---bridge (br_perf)---vxlan0----10Gnic(int4)-----10Gnic---vxlan0----bridge (br_perf)---VM2

MTUs:
  VM        (1450)
  br_perf   (9000)
  vxlan0    (9000)
  int4      (9000)
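
The VM MTU also accounts for the segment sizes seen in the tcpdump output. The arithmetic below is a sketch under stated assumptions: an IPv4 header without options (20 bytes), a base TCP header (20 bytes), and 12 bytes of TCP options for the timestamp plus two NOPs, which matches the "nop,nop,TS" options in the dumps.

```shell
# Payload per segment as sent by the VM, derived from its MTU.
vm_mtu=1450
ip_hdr=20      # assumption: IPv4, no IP options
tcp_hdr=20     # base TCP header
tcp_opts=12    # timestamp option (10 bytes) padded with two NOPs
payload=$((vm_mtu - ip_hdr - tcp_hdr - tcp_opts))
echo "payload per segment: $payload"

# Size of one aggregated frame as reported by tcpdump with the patch.
agg=64308
segments=$((agg / payload))
remainder=$((agg % payload))
echo "segments per aggregated frame: $segments (remainder $remainder)"
```

So each 64308-byte frame seen with the patch is a whole number of 1398-byte segments merged by GRO.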

Hw/Sw Adapter, Drivers
-----------------------

02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)

[root@ramu1 ~]# ethtool -i int4
driver: ixgbe
version: 4.0.1-k-rh7.1
firmware-version: 0x80000208
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

Config HOST1:
-------------

[root@ramu1 perf]# cat docfg.sh
ip link del vxlan0
ip link set dev br_perf down
brctl delbr br_perf

brctl addbr br_perf
ip link set dev br_perf up
ip link add vxlan0 mtu 9000 type vxlan id 1 l2miss l3miss rsc proxy nolearning dstport 8472
ip link set dev vxlan0 up
brctl addif br_perf vxlan0

ip neigh add 1.1.1.21 lladdr 52:54:00:17:c8:4d dev vxlan0 nud permanent

bridge fdb replace 52:54:00:17:c8:4d dev vxlan0 self permanent dst 10.50.117.216


Config VM1:
-----------
eth0 IP,MAC:  1.1.1.11, 52:54:00:6c:53:61

CPU affinity for both VMs
--------------------------
[root@ramu1 perf]# virsh vcpuinfo centos-6.5
VCPU:           0
CPU:            N/A
State:          N/A
CPU time        N/A
CPU Affinity:   ---y------------------------------------

Iptables disabled on bridges
------------------------------
echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables

Offload settings (both hosts are at default)
--------------------------------------------

[root@ramu1 perf]# ethtool -k int4
Features for int4:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: on [fixed]
        tx-checksum-sctp: on
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: on [fixed]



[root@ramu1 perf]# ethtool -k br_perf
Features for br_perf:
rx-checksumming: off [fixed]
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [requested on]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [requested on]
tx-fcoe-segmentation: off [requested on]
tx-gre-segmentation: on
tx-ipip-segmentation: on
tx-sit-segmentation: on
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]







[root@ramu1 perf]# ethtool -k vxlan0
Features for vxlan0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: off [fixed]
        tx-checksum-ip-generic: on
        tx-checksum-ipv6: off [fixed]
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: on
        tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
[root@ramu1 perf]#
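
To compare a single feature across the three listings above without reading every line, a small filter along these lines could be used. It is a sketch: feature_state is a hypothetical helper name, and it only parses the text that "ethtool -k" prints.

```shell
# Print the state of one feature from "ethtool -k" output read on stdin.
# Example result: "on", "off" or "off [fixed]".
feature_state() {
    awk -v feat="$1" -F': ' '
        { gsub(/^[ \t]+/, "", $1) }   # sub-features are indented
        $1 == feat { print $2 }
    '
}

# Assumed usage on the host (interface names from this setup):
#   for dev in int4 br_perf vxlan0; do
#       printf '%s: ' "$dev"
#       ethtool -k "$dev" | feature_state generic-receive-offload
#   done
```

With the listings above this reports "on" for all three interfaces, which is why the missing aggregation is not a configuration issue.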













