Hi Siva,

Thank you for your quick and thoughtful response.

I will ask about the default MTU for the veth interface to see if the
user increased it themselves.

I'm not sure I completely understand what you mean about largesend
offload being disabled after retransmits. I'm also not completely sure
if it's largesend offload or just large packets that are causing issues.
If I have understood correctly (e.g.
https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.performance/tcp_large_send_offload.htm)
large-send offload is what Linux would call TCP Segmentation Offload
(TSO) - does that match your understanding?

Here's my concern. The code I'm looking at (let's look at Zesty, so
v4.10) is in ibmveth.c, ibmveth_poll(). There we see:

 if (length > netdev->mtu + ETH_HLEN) {
        ibmveth_rx_mss_helper(skb, mss, lrg_pkt);
        adapter->rx_large_packets++;
 }

Then ibmveth_rx_mss_helper() has the following - setting GSO on
regardless of the large_pkt bit:

 /* if mss is not set through Large Packet bit/mss in rx buffer,
  * expect that the mss will be written to the tcp header checksum.
  */
 tcph = (struct tcphdr *)(skb->data + offset);
 if (lrg_pkt) {
        skb_shinfo(skb)->gso_size = mss;
 } else if (offset) {
        skb_shinfo(skb)->gso_size = ntohs(tcph->check);
        tcph->check = 0;
 }

It looks to me that Linux will interpret a packet from the veth adaptor
as a GSO/GRO packet based only on whether or not the size of the
received packet is greater than the Linux-side MTU plus the header size
- not based on whether AIX thinks it is transmitting an LSO packet.

To put it another way - if I have understood correctly - there are two
ways we could end up with a GSO/GRO packet coming out of a veth adaptor.
The ibmveth_rx_mss_helper path is taken when the size of the packet is
greater than MTU+ETH_HLEN, which can happen when:

 1) The AIX end has turned on LSO, so the large_packet bit is set
 2) Large-send is off in AIX but there is a mismatched MTU between AIX
    and Linux
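
To make sure I am describing the same thing you are, here is a small
standalone model of those two cases. It is only a sketch of my reading
of the v4.10 code, not the driver itself: classify_rx() and the
mss_source enum are names I made up for illustration, and it assumes
the frame is TCP (so that 'offset' in ibmveth_rx_mss_helper() would be
non-zero).

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_HLEN 14 /* Ethernet header length, same value as the kernel */

/* Hypothetical labels for where gso_size would be taken from. */
enum mss_source {
	MSS_NOT_GSO,         /* frame fits the MTU: helper is never called */
	MSS_FROM_DESCRIPTOR, /* large_packet bit set: mss from the rx buffer */
	MSS_FROM_TCP_CHECK,  /* bit clear: TCP checksum field is read as mss */
};

/*
 * Models only the size test in ibmveth_poll() and the lrg_pkt branch
 * in ibmveth_rx_mss_helper(). Assumption: the frame carries TCP.
 */
static enum mss_source classify_rx(uint32_t length, uint32_t mtu,
				   bool lrg_pkt)
{
	if (length <= mtu + ETH_HLEN)
		return MSS_NOT_GSO;

	return lrg_pkt ? MSS_FROM_DESCRIPTOR : MSS_FROM_TCP_CHECK;
}
```

In this model, classify_rx(9014, 1500, true) is case 1 and
classify_rx(9014, 1500, false) is case 2: the frame is still treated as
GSO, with the mss read from the TCP checksum field, even though AIX
never marked it as a large-send packet.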

In the first case, you say that AIX will turn off largesend, which
will fix the issue. But in the second case, if I have understood
correctly, AIX will not be able to do anything. Unless you are saying
that AIX will dynamically reduce the MTU for a connection in the
presence of a number of re-transmits?

This isn't necessarily wrong behaviour from AIX - Linux can't do
anything in this situation either; a 'hop' that can participate in Path
MTU Discovery would be needed.

If I understand it, then, the optimal configuration would be for the AIX
LPAR to set an MTU of 1500/9000 and turn on LSO for veth on the AIX side
- does that sound right?

Thanks again!
Regards,
Daniel

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1692538

Title:
  Ubuntu 16.04.02: ibmveth: Support to enable LSO/CSO for Trunk VEA

Status in The Ubuntu-power-systems project:
  In Progress
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Zesty:
  Fix Released
Status in linux source package in Artful:
  Fix Released

Bug description:
  
  == SRU Justification ==
  Commit 66aa0678ef is requested to fix four issues with the ibmveth driver.
  The issues are as follows:
  - Issue 1: ibmveth doesn't support largesend and checksum offload features
  when configured as "Trunk".
  - Issue 2: SYN packet drops seen at destination VM. When the packet
  originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to IO
  server's inbound Trunk ibmveth, on validating "checksum good" bits in ibmveth
  receive routine, SKB's ip_summed field is set with CHECKSUM_UNNECESSARY flag.
  - Issue 3: First packet of a TCP connection will be dropped, if there is
  no OVS flow cached in datapath.
  - Issue 4: ibmveth driver doesn't have support for SKBs with frag_list.

  The details for the fixes to these issues are described in the commit's
  git log.



  == Comment: #0 - BRYANT G. LY <b...@us.ibm.com> - 2017-05-22 08:40:16 ==
  ---Problem Description---

   - Issue 1: ibmveth doesn't support largesend and checksum offload features
     when configured as "Trunk". Driver has explicit checks to prevent
     enabling these offloads.

   - Issue 2: SYN packet drops seen at destination VM. When the packet
     originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to
     IO server's inbound Trunk ibmveth, on validating "checksum good" bits
     in ibmveth receive routine, SKB's ip_summed field is set with
     CHECKSUM_UNNECESSARY flag. This packet is then bridged by OVS (or Linux
     Bridge) and delivered to outbound Trunk ibmveth. At this point the
     outbound ibmveth transmit routine will not set "no checksum" and
     "checksum good" bits in transmit buffer descriptor, as it does so only
     when the ip_summed field is CHECKSUM_PARTIAL. When this packet gets
     delivered to destination VM, TCP layer receives the packet with checksum
     value of 0 and with no checksum related flags in ip_summed field. This
     leads to packet drops, so TCP connections never complete successfully.

   - Issue 3: First packet of a TCP connection will be dropped, if there is
     no OVS flow cached in datapath. OVS while trying to identify the flow,
     computes the checksum. The computed checksum will be invalid at the
     receiving end, as ibmveth transmit routine zeroes out the pseudo
     checksum value in the packet. This leads to packet drop.

   - Issue 4: ibmveth driver doesn't have support for SKBs with frag_list.
     When Physical NIC has GRO enabled and when OVS bridges these packets,
     OVS vport send code will end up calling dev_queue_xmit, which in turn
     calls validate_xmit_skb.
     In validate_xmit_skb routine, the larger packets will get segmented into
     MSS sized segments, if SKB has a frag_list and if the driver to which
     they are delivered to doesn't support NETIF_F_FRAGLIST feature.
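
  The checksum-flag sequence in Issue 2 can be sketched as a small
  standalone model. This is only an illustration of the behaviour as
  described above, before the fix: rx_ip_summed(), tx_desc_flags() and
  the DESC_* bits are hypothetical names and values, not the driver's
  own. The CHECKSUM_* values are the ones from include/linux/skbuff.h.

```c
#include <stdint.h>

/* ip_summed values as defined in include/linux/skbuff.h */
#define CHECKSUM_NONE        0
#define CHECKSUM_UNNECESSARY 1
#define CHECKSUM_COMPLETE    2
#define CHECKSUM_PARTIAL     3

/* Hypothetical descriptor bits; names and values are illustrative only. */
#define DESC_NO_CSUM   0x2u
#define DESC_CSUM_GOOD 0x1u

/* Inbound Trunk side: "checksum good" bits map to CHECKSUM_UNNECESSARY. */
static int rx_ip_summed(int csum_good)
{
	return csum_good ? CHECKSUM_UNNECESSARY : CHECKSUM_NONE;
}

/*
 * Outbound Trunk side as described in Issue 2 (pre-fix): the checksum
 * bits are set in the descriptor only for CHECKSUM_PARTIAL.
 */
static uint32_t tx_desc_flags(int ip_summed)
{
	if (ip_summed == CHECKSUM_PARTIAL)
		return DESC_NO_CSUM | DESC_CSUM_GOOD;

	return 0;
}
```

  Under these assumptions, a bridged packet arrives at the outbound
  side as tx_desc_flags(rx_ip_summed(1)), which returns 0: no checksum
  bits reach the destination VM, matching the drops described in
  Issue 2.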

  Contact Information = Bryant G. Ly/b...@us.ibm.com

  ---uname output---
  4.8.0-51.54

  Machine Type = p8

  ---Debugger---
  A debugger is not configured

  ---Steps to Reproduce---
   Increases performance greatly

  The patch has been accepted upstream:
  https://patchwork.ozlabs.org/patch/764533/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1692538/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
