Jennifer,

On 25/05/2018 at 23:44, Jennifer Dahm wrote:
During testing, I discovered that the Zynq GEM hardware overwrites all
outgoing UDP packet checksums, which is illegal in packet forwarding
cases. This happens both with and without the checksum-zeroing
behavior  introduced  in  007e4ba3ee137f4700f39aa6dbaf01a71047c5f6
("net: macb: initialize checksum when using checksum offloading"). The
only solution to both the small packet bug and the packet forwarding
bug that I can find is to disable TX checksum offloading entirely.

Are the bugs listed above present in all revisions of the GEM IP, only for some revisions?
Is there an errata that describe this issue for the Zynq GEM?


There's still the possibility that these bugs are actually with the
driver software and not with the hardware. I've found several places
where the checksum is set to 0xFFFF (the incorrect checksum found in
small packets) when something goes wrong, and I can imagine a buggy
driver writing over the checksum blindly when TX checksum offloading
is enabled.

I would like feedback on two things:
1. Is it possible that the two bugs described above are caused by the
    driver and not by the hardware? If so, where should I look to
    implicate the driver?

Rafal,
Do you want to comment on this part?

2. Is this a problem we care enough about to completely disable TX
    checksum offloading?

I would prefer to keep it enabled if possible. I mean I'm not against such a workaround if the HW is not able to handle such cases but offloading a CPU is always good when we have "low end" processors like ARM9 at 200MHz like the older AT91 that use this driver...

Here is the testing procedure I used to reproduce these bugs on my
machine. Specifically, without this patchset, step 9 fails. Without
007e4ba3ee, step 8 also fails.

1. Set up the test environment:
   a. Acquire a Zynq device with two ethernet ports. This is the DUT.
   b. Acquire a USB-Ethernet adapter.
   c. Acquire two ethernet cables.
   d. Connect one Ethernet port on the DUT to your computer's network
      switch.
   e. Connect the other Ethernet port to the USB-Ethernet adapter and
      plug that adapter into your computer.
   f. Set up a Linux VM to send packets through the DUT. I recommend
      using a VM here so that you can easily detach it from the primary
      network to force outgoing traffic through the DUT.
   g. Set up a computer with a packet inspecting program to receive and
      inspect packets. This doesn't need to be a VM. For the purposes
      of this test, I'll be using a Windows instance with WireShark.
2. Load the kernel you want to test onto the DUT, making sure to
    include the `bridge` module.
3. Set up a bridge on the DUT. The following commands on the DUT
    should work, replacing `eth0` and `eth1` with the two ethernet
    interfaces on the DUT:
    ```
    brctl addbr test
    brctl addif test eth0 eth1
    ifconfig eth0 0.0.0.0
    ifconfig eth1 0.0.0.0
    dhclient test -v
    ```
4. Disconnect the Linux VM from your host computer's network and
    connect it to the USB-Ethernet adapter in order to force outgoing
    network traffic through the DUT. If necessary, run dhclient on the
    Linux VM to acquire an IP address.
5. Ensure that you can reach your Windows instance from your Linux VM
    through the DUT (e.g. ping).
6. Start WireShark on your Windows instance and start monitoring
    traffic on a specific, unused port (e.g. 61557).
7. Using netcat, send a few not-tiny UDP packets from your Linux VM to
    your Windows instance to ensure that valid UDP packets are properly
    forwarded. Ex:
    ```
    echo "hello world" | netcat -u <WindowsIP> 61557
    ```
    Inspect these packets to ensure that the data arrived intact and
    that the checksum looks reasonable (i.e. not 0x0000 or 0xFFFF).
8. Using netcat, send a few tiny UDP packets (2 bytes or fewer) from
    Linux VM to your Windows instance to ensure that the checksum is
    reasonable. Ex:
    ```
    echo "h" | netcat -u <WindowsIP> 61557
    ```
9. Using a custom program, send UDP packets with broken checksums
    (e.g. 0xABCD) from your Linux VM to your Windows instance. Inspect
    these packets with WireShark and make sure that the packet arrived
    with the same checksum you sent it with.

For step 9, I wrote a C program using the Linux socket API that will
send a properly formatted UDP packet with the payload "Hello!" and a
(broken) checksum of 0xABCD to port 61557 on the host provided at the
command line. I can send the full program if you would like, but here
is the important part of it:
```
struct custom_udp {
        int16_t s_port;
        int16_t d_port;
        int16_t length;
        int16_t check;
        char data[];
};

int send_message(int sockfd, in_port_t port, const char *message) {
        struct custom_udp *frame;
        int16_t message_len;
        int16_t frame_len;
        int ret;

        message_len = strlen(message) * sizeof(char);
        frame_len = sizeof(struct custom_udp) + message_len;
        frame = malloc(frame_len);
        frame->s_port = htons(0);
        frame->d_port = htons(port);
        frame->length = htons(frame_len);
        frame->check = htons(0xABCD);
        memmove(frame->data, message, message_len);

        ret = write(sockfd, frame, frame_len);
        free(frame);

        return ret;
}

Thanks a lot for the detailed testing procedure. It will definitively help us reproduce the bug.

As a conclusion, I would say that we can use this patch as a last resort if we are sure that:
1/ the driver doesn't do weird things with TX CSUM in the code
2/ the hw is unable to handle these particular cases.


Best regards,
  Nicolas


```

Jennifer Dahm (1):
   net/macb: Disable TX checksum offloading on all Zynq-7000

  drivers/net/ethernet/cadence/macb.h      |  1 +
  drivers/net/ethernet/cadence/macb_main.c | 11 ++++++++---
  2 files changed, 9 insertions(+), 3 deletions(-)



--
Nicolas Ferre

Reply via email to