AM625: IPv6 UDP checksum offload produces illegal checksum==0

Part Number: AM625
Other Parts Discussed in Thread: SK-AM62


Hi experts,

my primary issue was a reproducible timeout when downloading a large file via tftp from AM625 Linux user space. Interestingly, the download works over an IPv4 connection to the tftp server (my laptop), but now I'm trying to do it over an IPv6 link-local connection. Note that my setup therefore uses UDP over IPv6.

When debugging with Wireshark running on the tftp server, I noticed that the download timed out exactly when the checksum of the UDP packets sent from the AM625 to my laptop hit zero. (The UDP payloads were ACK messages with packet IDs counting up, hence the UDP checksum counted down from some initial value towards zero.) It seems that my laptop discarded the ACK with UDP checksum 0, as mandated by RFC 2460, which describes UDP over IPv6:

      o  Unlike IPv4, when UDP packets are originated by an IPv6 node,
         the UDP checksum is not optional.  That is, whenever
         originating a UDP packet, an IPv6 node must compute a UDP
         checksum over the packet and the pseudo-header, and, if that
         computation yields a result of zero, it must be changed to hex
         FFFF for placement in the UDP header.  IPv6 receivers must
         discard UDP packets containing a zero checksum, and should log
         the error. [emphasis mine]
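
For reference, the rule quoted above can be sketched in a few lines of Python. This is only an illustration of the RFC 2460 computation, not what the AM625 hardware or the Linux kernel actually runs; the function names `ones_complement_sum` and `udp6_checksum` are made up for this example.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """Return the 16-bit ones' complement sum of data (zero-padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp6_checksum(src: str, dst: str, sport: int, dport: int, payload: bytes) -> int:
    """Compute the UDP checksum for IPv6, including the 0x0000 -> 0xFFFF rule."""
    length = 8 + len(payload)  # UDP header plus payload
    # IPv6 pseudo-header: source, destination, 32-bit length, 3 zero bytes, next header (17 = UDP)
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", length, 17)
    )
    header = struct.pack("!HHHH", sport, dport, length, 0)  # checksum field zeroed
    checksum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    return checksum or 0xFFFF  # a computed zero must be transmitted as 0xFFFF
```

The last line is the step that appears to be missing on the wire in this thread: a sender that skips it will emit 0x0000 whenever the sum over pseudo-header, header, and payload happens to be 0xFFFF.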


As a temporary workaround, I could avoid the issue by running

ethtool --offload eth0 tx off
before the tftp download; but, if I understand correctly, this will burden the CPU with all Internet checksum calculations for transmitted TCP and UDP traffic later on in production.

To verify the issue, I wrote a small Python script that sends a single small UDP packet whose payload was adapted (with some trial and error) to result in a UDP checksum of zero.

import socket

ip6 = "fe80::443a:64c4:2cd8:fd26"  # link-local address of the destination
dev = "eth0"  # outgoing interface
addr = f"{ip6}%{dev}"  # link-local addresses need a scope (zone) ID
port = 8648
# Resolve the scoped address into a sockaddr tuple usable by sendto()
family, type_, proto, canonname, sockaddr = socket.getaddrinfo(
    addr,
    port,
    family=socket.AF_INET6,
    type=socket.SOCK_DGRAM,
    proto=socket.IPPROTO_UDP,
).pop()
sock_send = socket.socket(family=family, type=type_, proto=proto)
sock_send.sendto(b"t0z-\r\n", sockaddr)  # results in UDP checksum == 0 in my setup

Indeed, the hardware offloading logic seems to handle the zero case incorrectly (it should substitute 0xffff for 0x0000). Instead, the UDP packet got sent with a UDP checksum of zero, as Wireshark showed on the receiving side.
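
The receiver-side check can also be expressed in code. The following is a minimal sketch, assuming a raw IPv6 packet (starting at the IPv6 header) with no extension headers; the function name `has_zero_udp_checksum` is hypothetical and not part of any library.

```python
import struct

IPV6_HEADER_LEN = 40  # fixed IPv6 header size in bytes
UDP_NEXT_HEADER = 17  # protocol number for UDP


def has_zero_udp_checksum(ipv6_packet: bytes) -> bool:
    """Return True if the packet is UDP directly over IPv6 with checksum 0x0000."""
    if len(ipv6_packet) < IPV6_HEADER_LEN + 8:
        return False  # too short to hold IPv6 and UDP headers
    version = ipv6_packet[0] >> 4
    next_header = ipv6_packet[6]
    if version != 6 or next_header != UDP_NEXT_HEADER:
        return False  # not plain UDP over IPv6 (extension headers are not handled)
    # The UDP checksum is the fourth 16-bit field of the UDP header
    (checksum,) = struct.unpack_from("!H", ipv6_packet, IPV6_HEADER_LEN + 6)
    return checksum == 0
```

Packets for which this returns True are the ones an RFC 2460 compliant receiver is expected to drop.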

Question: Did we miss some configuration of the AM625 hardware that changes the calculation of the IPv6 UDP checksum in the special case (0 --> 0xffff)? Or is this a hardware bug, so that we must resort to software-calculated UDP checksums?

Thanks in advance and Best Regards,
Lukas Rauber

  • Hello Lukas,

    Thanks for explaining the details of your debug process. First, we have a couple of questions regarding your test setup:

    1. Are you testing this on a custom designed board with AM625 or a TI AM62x SK-EVM?

    2. What Linux Processors SDK version are you using?

    3. If you are finding this issue due to testing on a custom designed board, are you able to see the same issue on a TI AM62x SK-EVM?

    4. What is your test topology? (i.e. Is your AM62x device under test directly connected via eth0 to your Laptop with no switch in between? No other ethernet ports on the AM62x device is connected during the test?)

    -Daolin

  • Hello Daolin,

    thank you for your quick response. Regarding your questions:

    1. We use a Phytec SoM (phyCORE-AM62x), which has an AM625 SoC on it. The SoM is mounted on our custom designed board.
    2. We have lately switched to Phytec's BSP-Yocto-Ampliphy-AM62x-PD24.1.0 board support package, which -- according to their Release Notes -- ships with:
      • Linux TI vendor kernel v6.6.32 (based on TI tag 10.00.08)
      • U-Boot TI vendor v2024.04 (based on TI tag 10.00.08)
      • Yocto 5.0.3 (Scarthgap)
      • Qt 6.6
    3. I will try to perform the test with a TI AM62 SK-EVM during the day.
    4. One ethernet port (eth0) is directly connected to my laptop. The other port (eth1) is not used. The ports are not bridged (there is no `br0` network device).

    The device boots from SD card, since the issue occurred during our bootstrapping process (from SD card to eMMC).

    I accessed the shell on the AM625 via UART0, using a USB UART adapter that is also connected to my laptop.

  • Update:

    3.) Indeed, the issue can also be found on the TI SK-AM62 with the TI image `tisdk-default-image-am62xx-evm-10.01.10.04.rootfs.wic.xz  — 1017665 K`:

  • Hi Lukas, 

    >>>3.) Indeed, the issue can also be found on the TI SK-AM62 with the TI image `tisdk-default-image-am62xx-evm-10.01.10.04.rootfs.wic.xz  — 1017665 K`:

    Thanks for confirming. 

    Can you share the following? Please do this on the TI SK-AM62 evm as that will be easier for me to look into this issue on my side.

    1. Results of "ethtool -k eth0" before and after the "ethtool --offload eth0 tx off"?

    2. Results of ethtool -S eth0 before and after the illegal UDP checksum message?

    3. When sending the UDP packets, these packets were sent from your host laptop to the AM62x device under test? (i.e. the AM62x is receiving the UDP packets or is it the one sending the UDP packets?)

    4. If you sent UDP packets with 0 checksum from your host laptop to the AM62x device under test, do you see issues with the UDP rx checksum offload?

    -Daolin

  • No problemo:

    1. `ethtool -k eth0`:
      root@am62xx-evm:~# ethtool -k eth0
      Features for eth0:
      rx-checksumming: on
      tx-checksumming: on
      tx-checksum-ipv4: off [fixed]
      tx-checksum-ip-generic: on
      tx-checksum-ipv6: off [fixed]
      tx-checksum-fcoe-crc: off [fixed]
      tx-checksum-sctp: off [fixed]
      scatter-gather: on
      tx-scatter-gather: on
      tx-scatter-gather-fraglist: off [fixed]
      tcp-segmentation-offload: off
      tx-tcp-segmentation: off [fixed]
      tx-tcp-ecn-segmentation: off [fixed]
      tx-tcp-mangleid-segmentation: off [fixed]
      tx-tcp6-segmentation: off [fixed]
      generic-segmentation-offload: on
      generic-receive-offload: on
      large-receive-offload: off [fixed]
      rx-vlan-offload: off [fixed]


    2. `ethtool -S eth0`:
      root@am62xx-evm:~# ethtool -S eth0
      NIC statistics:
      p0_rx_good_frames: 27
      p0_rx_broadcast_frames: 6
      p0_rx_multicast_frames: 21
      p0_rx_crc_errors: 0
      p0_rx_oversized_frames: 0
      p0_rx_undersized_frames: 0
      p0_ale_drop: 0
      p0_ale_overrun_drop: 0
      p0_rx_octets: 4698
      p0_tx_good_frames: 41
      p0_tx_broadcast_frames: 24
      p0_tx_multicast_frames: 17
      p0_tx_octets: 6067
      p0_tx_64B_frames: 7
      p0_tx_65_to_127B_frames: 38
      p0_tx_128_to_255B_frames: 9
      p0_tx_256_to_511B_frames: 14
      p0_tx_512_to_1023B_frames: 0
      p0_tx_1024B_frames: 0
      • Note: the Python script `udp_chksum.py` is as follows:
        import socket

        ip6 = "fe80::443a:64c4:2cd8:fd26"
        dev = "eth0"
        addr = f"{ip6}%{dev}"
        port = 8648
        family, type_, proto, canonname, sockaddr = socket.getaddrinfo(
            addr,
            port,
            family=socket.AF_INET6,
            type=socket.SOCK_DGRAM,
            proto=socket.IPPROTO_UDP,
        ).pop()
        sock_send = socket.socket(family=family, type=type_, proto=proto)
        msg = bytearray(b"\x00\x00\r\n")
        sock_send.sendto(msg, sockaddr)
        # check the packet on the other end (e.g. with Wireshark),
        # and observe the UDP checksum
        c = int(input("Enter observed UDP checksum (hex): "), 16)
      • As expected, the (filtered) Wireshark output observed on my laptop shows one valid and one illegal UDP packet.
    3. No: the AM62x sends the UDP packets to my laptop.
    4. I did not observe my laptop sending illegal UDP packets, although I haven't looked into that closely, because the initial problem in my tftp use case was the ACK packets sent by the AM625. I expect my laptop's hardware to change UDP checksums of zero to 0xffff before sending them on the wire, as mandated by RFC 768 (UDP) and RFC 2460 (IPv6).
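
The listing of `udp_chksum.py` above stops after reading the observed checksum, so how it continues is not shown. One way such a value could be used, offered here only as an assumption, follows from ones' complement arithmetic: the checksum `c` is the complement of the sum `S` over pseudo-header, header, and payload, so replacing a zero 16-bit payload word with `c` raises the sum to `S + ~S = 0xFFFF`, whose complement is 0x0000. The helper name `zero_checksum_payload` is hypothetical.

```python
def zero_checksum_payload(payload: bytes, observed_checksum: int) -> bytes:
    """Return a payload whose computed UDP checksum becomes 0x0000.

    Assumes the first two payload bytes were 0x00 0x00 when
    observed_checksum was captured, and that addresses, ports, and
    the rest of the payload stay unchanged for the second packet.
    """
    if len(payload) < 2 or payload[:2] != b"\x00\x00":
        raise ValueError("payload must start with two zero bytes")
    # UDP payload starts on a 16-bit boundary, so bytes 0-1 form one checksum word
    return observed_checksum.to_bytes(2, "big") + payload[2:]


# Example with a made-up observed value of 0x1234
msg = zero_checksum_payload(b"\x00\x00\r\n", 0x1234)
```

A compliant sender would then transmit 0xFFFF for `msg`, while a sender that skips the substitution would transmit 0x0000.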
  • Hi Lukas,

    Thanks once again for sharing these details. I will need some time to look into this internally (to discuss it with some colleagues and to replicate the issue). My plan is to work on this tomorrow and provide an update on Friday.

    In the meantime, have you looked into how much the CPU load increases when "ethtool --offload eth0 tx off" is used? As I understand it, the CPU load would increase with the tx-checksum offload disabled, but not by a lot. If the CPU load still meets your requirements, this might be sufficient as a workaround until the root cause of the issue is tracked down.

    -Daolin
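
One rough way to put a number on the CPU load question is to sample the aggregate `cpu` line of `/proc/stat` before and after a transfer, once with tx offload on and once with it off. The sketch below assumes a Linux target with Python available; the helper names `read_cpu_line` and `cpu_busy_fraction` are made up for this example, and the result covers all CPUs and all activity on the system, not just checksum work.

```python
def read_cpu_line(path: str = "/proc/stat") -> str:
    """Return the aggregate 'cpu' line (first line) of /proc/stat."""
    with open(path) as f:
        return f.readline()


def cpu_busy_fraction(before: str, after: str) -> float:
    """Return the fraction of non-idle CPU time between two 'cpu' line samples."""

    def parse(line: str) -> tuple[int, int]:
        # Fields: user nice system idle iowait irq softirq steal (guest fields ignored)
        fields = [int(x) for x in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        return sum(fields), idle

    total0, idle0 = parse(before)
    total1, idle1 = parse(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        raise ValueError("samples must be taken at different times, in order")
    return 1.0 - (idle1 - idle0) / elapsed
```

Typical use would be `before = read_cpu_line()`, then the tftp download, then `cpu_busy_fraction(before, read_cpu_line())`, repeated for both offload settings so the two fractions can be compared.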

  • Update:

    I was able to replicate the same issue using your python script. I will be creating an internal bug ticket on this to get it resolved. My suspicion is that the below part of the CPSW driver is not handling the case of IPv6 UDP transmit checksum being 0x0 and making sure the checksum gets sent as 0xffff. It may take some time for the developer to get a fix for this so feel free to try and modify the CPSW driver (am65-cpsw-nuss.c) yourself to see if you can get the tx offload checksum to work.

    https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/net/ethernet/ti/am65-cpsw-nuss.c?h=ti-rt-linux-6.6.y&id=93a76530316a3d8cc2d82c3deca48424fee92100#n1031  

    Original patch that covered tx-checksum-offload: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/drivers/net/ethernet/ti/am65-cpsw-nuss.c?h=ti-rt-linux-6.6.y&id=93a76530316a3d8cc2d82c3deca48424fee92100 

    The errata mentioned in this part of the driver seems to only apply to AM65x devices and is not on the errata list for AM62x: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/net/ethernet/ti/am65-cpsw-nuss.c?h=ti-rt-linux-6.6.y&id=93a76530316a3d8cc2d82c3deca48424fee92100#n1683 

    -Daolin