AM6442: USB device configure cause Linux system hang

Part Number: AM6442
Other Parts Discussed in Thread: DP83869

Tool/software:

Hello E2E Team, 

We are facing an issue with the AM6442 SR2 USB controller. The controller sometimes causes a system hang when we configure the USB device via our setup script. Only a hard power cycle recovers the system. 

am6442 System: 

  • ti am6442 SoC SR2 
  • Linux 6.1.20+rt 
  • Sysfw: v09.00.07
  • Bootdevice: eMMC

 

Testsetup: 

A custom AM6442 board is connected to a Win10 PC via USB 2.0. The PC tries to establish an SSH connection via the RNDIS Ethernet device. Once the connection is established, the Windows system sends a reboot command to the device. After the device restarts, the Windows system tries to connect again via USB RNDIS. The USB cable is not removed during the test; the Windows 10 PC and the device stay connected for the whole test. The system hang is not reliably reproducible: sometimes it takes 7 reboots, sometimes 500+. 

 

The following script configures the USB controller at Linux init time as a device, with CDC-ECM and RNDIS. The controller is configured as "otg" in the device-tree of the Linux-kernel.

Script abstract: 

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

# config 1 is for CDC

mkdir -p configs/c.1
echo "${NET_USB_ATTR}" > configs/c.1/bmAttributes
echo "${NET_USB_PWR}" > configs/c.1/MaxPower
mkdir -p "configs/c.1/strings/${CONFIG_USB_LANGID}"
echo "CDC" > "configs/c.1/strings/${CONFIG_USB_LANGID}/configuration"
mkdir -p functions/ecm.usb0

mkdir -p configs/c.2
echo "${NET_USB_ATTR}" > configs/c.2/bmAttributes
echo "${NET_USB_PWR}" > configs/c.2/MaxPower
mkdir -p "configs/c.2/strings/${CONFIG_USB_LANGID}"
echo "RNDIS" > "configs/c.2/strings/${CONFIG_USB_LANGID}/configuration"

mkdir -p functions/rndis.usb0

# On Windows 7 and later, the RNDIS 5.1 driver would be used by default,
# but it does not work very well. The RNDIS 6.0 driver works better. In
# order to get this driver to load automatically, we have to use a
# Microsoft-specific extension of USB.

echo "1" > os_desc/use
echo "${MS_VENDOR_CODE}" > os_desc/b_vendor_code
echo "${MS_QW_SIGN}" > os_desc/qw_sign

init_mac_leases 3 
local host_mac; host_mac=$(get_mac "${NET_USB_INTERFACE}" "${NET_USB_CONFIGFILE}" "HOST_")
log_info "Initializing usb net for interface ${NET_USB_INTERFACE} with HOST_MAC ${host_mac} and serial number ${serial_number}"
echo "${host_mac}" > functions/rndis.usb0/host_addr
local mac; mac=$(get_mac "${NET_USB_INTERFACE}" "${NET_USB_CONFIGFILE}")
log_info "Setting up interface ${NET_USB_INTERFACE} with IP ${NET_USB_IP} and MAC ${mac}"
echo "${mac}" > functions/ecm.usb0/dev_addr
echo "${mac}" > functions/ecm.usb0/host_addr
echo "${mac}" > functions/rndis.usb0/dev_addr
echo "${mac}" > functions/rndis.usb0/host_addr

echo "${MS_COMPAT_ID}" > functions/rndis.usb0/os_desc/interface.rndis/compatible_id
echo "${MS_SUBCOMPAT_ID}" > functions/rndis.usb0/os_desc/interface.rndis/sub_compatible_id

ln -s functions/ecm.usb0 configs/c.1
ln -s functions/rndis.usb0 configs/c.2
ln -s configs/c.2 os_desc

# add appropriate usb device to the gadget (AM64x specific)
echo "f400000.usb" > UDC

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After the "echo "f400000.usb" > UDC" command, the system hangs. 

When the system hangs, Linux stops working entirely: no communication via UART, Ethernet, or USB is possible.

We checked the silicon errata for the AM6442 and added the fix for "i2409 - USB: USB2 PHY locks up due to short suspend". Sadly, that didn't resolve the problem.  

After reading the Cadence silicon errata notes (https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/usb/cdns3/cdns3-gadget.c?h=ti-rt-linux-6.1.y-cicd),

we removed the power management code from the cdns3-plat.c driver: 

https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/drivers/usb/cdns3/cdns3-plat.c?h=ti-rt-linux-6.1.y-cicd#n328 

Sadly, this didn't help with the hang at init time. With the PM code removed the system boots, but there is no communication on the USB device, and reconfiguring the device via the script causes the system to hang. 

We found this E2E post, which may relate to our problem: 

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1319731/am6422-usb-controller-about-phyrst_a_enable-register? 

Best regards,

Stefan

  • Hi Stefan,

    I have assigned your query to our expert. He is out of office today, so expect a response tomorrow when he is in office or early next week.

    Apologies for the delay.

    Best Regards,

    Suren

  • Hi Stefan,

    Urgency on this issue noted. Our responses may be slower than usual as the key expert has limited access next week. 

    A few follow-up questions:

    1) Is there any way to see whether a different host gives different results, e.g. Windows 11 or a Linux host?

    2) Do you have more data on the failure: how many boards were tried and how many are failing? Do you think it is some sort of marginality in software or silicon, or is every board susceptible to it with prolonged reboot tests?

    3) Is power cycling done properly, with no noise or glitches, and no glitches in the software start-up sequence on the host side? Several customers do USB cable connect/disconnect via MCCI-like products for robustness testing. Is your robustness testing mostly around powering the host on/off?

    4) Is it possible for you to replicate the issue on the TI EVM?

  • Hi Mukul,

     

    According to your follow-up questions:

     

    1: We currently evaluate only on Win10 systems, but we can set up a Linux (Debian) test as well.

     

    2: We see the failure on different custom boards using the same kernel and rootfs. We have two hardware (product) variants that use the AM64 device, and the problem is seen on both. In total we have seen the problem on three custom boards. We have not run any other long-term USB tests. Sadly, the board stops working after the crash; even the Linux UART console buffer is not written out, so we do not have any snapshot of the error itself.

    We configure the USB core for two different descriptors. Since we found the Cadence comment in the Linux driver regarding the buffer init, we now focus on the USB core hardware init. We also see the error if we configure only one descriptor.

     

    3: The AM64 board uses a PMIC for the core voltages and separate converters for the DDR4 voltages. The host PC is not power cycled, so the host keeps the same power state during the whole test. We do not disconnect the cable during the reboot test.

     

    4: We have not tested on the am64-evm board yet; we plan to set up the test with the am64-evm.

    BR 
    Stefan

  • Hi Stefan,

    I am in a full-day training this week and next week, and don't have bandwidth for any EVM hands-on work. But please let me know once you are able to replicate the issue on the AM64x GPEVM, then I will look into it.

    I do remember a customer who had an AM64x USB device mode issue in Linux (but I cannot recall at this moment whether it was a dead lockup like the one you observed). That issue only happened when USB0 dr_mode in the kernel device tree was configured as "peripheral"; it didn't happen with dr_mode = "otg". ("otg" is the default setting for USB0 in the SDK kernel.)

  • Hi Bin, 

    we are currently setting up two am64-evm boards connected to different PC systems, we came back with the test results. We run the USB controller in dr_mode = "otg"

    BR 
    Stefan 

  • Hi Stefan,

    we came back with the test results.

    Do you mean you are able to reproduce the issue on two AM64x-EVM?

  • Hi Bin,

    Hi Mukul,

    yes, we are able to reproduce the error on different am64-evm boards with different PC systems. Here are our setups:

    Setup 1: (Win 10 with VMWare Debian)

    The reboot script is running in the VMWare

    am64-evm

    • ti am6442 SoC SR2
    • Linux 6.1.20+rt
    • Sysfw: v09.00.08
    • Bootdevice: eMMC

    Custom Hardware

    • ti am6442 SoC SR2
    • Linux 6.1.20+rt
    • Sysfw: v09.00.08
    • Bootdevice: eMMC

     

    Setup 2: (Win 10)

    The reboot script is running on Windows 

    2 different am64-evm boards

    • ti am6442 SoC SR2
    • Linux 6.1.20+rt
    • Sysfw: v09.00.08
    • Bootdevice: eMMC

    BR 

    Stefan 

  • Hi Stefan,

    Thanks for the details. I am still in training this week, and will be back at work next week to try to reproduce the issue on my EVM and look into it. Due to the work accumulated during these two weeks of training, my progress will be slower next week. I will keep you posted.

  • Hi Stefan,

    Setup 1: (Win 10 with VMWare Debian)

    The reboot script is running in the VMWare

    It appears that in this setup the AM64x USB RNDIS gadget is enumerated by, and communicates with, the Debian running in VMWare on Win10, not Win10 itself, right?

    I am asking because I don't have a Windows PC to test with; I only have access to a Linux PC to connect my AM64x GPEVM to.

  • Hi Bin,

    yes, that's right, the USB stack is running on Linux. 

    BR 

    Stefan 

  • Hi Stefan,

    Thanks for confirming. I will try to replicate the issue with AM64x GPEVM connecting with Linux USB host early next week.

  • Hi Stefan,

    Sorry for the late response.

    I am unable to reproduce the issue on my AM64x GPEVM. I don't see which Processor SDK version uses kernel v6.1.20, so I used SDK v9.0.0.3 which has kernel v6.1.33.

    I see your USB gadget config uses ECM and RNDIS gadget functions, so I used the following USB gadget config script to create the composite gadget with both ECM and RNDIS functions. It generates two USB ethernet interfaces on both the EVM and Linux PC. I then assigned different subnet IP addresses to both (usb0: 192.168.3.x, usb1: 192.168.4.x), and I can ping and ssh to the EVM.

    After rebooting the EVM from the ssh session and repeating the same setup, I can ssh from the Linux PC to the EVM again using either usb0 or usb1.

    Do you see what I did differently from what you tested?

    5852.usbconfigfs.sh.txt
    #!/bin/bash
    # $1: -d - tear down
    #FUNCS=("mass_storage.usb0")
    #FUNCS=("hid.usb0")
    #FUNCS=("uvc.usb0")
    #FUNCS=("uac1.usb0")
    #FUNCS=("uac2.usb0")
    #FUNCS=("SourceSink.usb0")
    #FUNCS=("uvc.usb0" "hid.usb0")
    #FUNCS=("acm.usb0" "ncm.usb0" "acm.usb1")
    #FUNCS=("uac1.usb0" "hid.usb0")
    #FUNCS=("uac2.usb0" "hid.usb0")
    #FUNCS=("acm.usb0" "acm.usb1")
    FUNCS=("ecm.usb0" "rndis.usb0")
    CFS=/sys/kernel/config/usb_gadget
    VID="0x1d6d"
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

  • BTW, I just flashed the default WIC image "tisdk-default-image-am64xx-evm.wic.xz" from the SDK to an SD card, and copied the configfs script to the /home/root/ directory. Nothing else is changed in Linux.

  • Hi Bin,

    Thank you for your support! Our setup differs slightly from the default image as we need to use the latest sysfw to enable pru_eth in MII mode. We encounter a hang issue after numerous reboots, sometimes up to 800. Initially, the reboots proceed without any problems or hangs.

  • Hi Stefan,

    Thanks. Now I understand the hang issue only happens after hundreds of reboots, but

    Our setup differs slightly from the default image as we need to use the latest sysfw

    Have you tested if the issue happens with the same sysfw from the SDK? I am trying to narrow the components for reproducing the symptom.

  • Hi Bin, 

    Have you tested if the issue happens with the same sysfw from the SDK? I am trying to narrow the components for reproducing the symptom.

    Not yet, because of the MII topic. I can set up a test, but this needs some time, ~1 week.

  • Hi Stefan,

    . I can setup a test, but this need some time ~1 Week

    Looking forward to the result.

    Meanwhile, if there is anything I can do on my setup, please provide exact instructions based on any of the SDK versions. Please note that I have my own configfs script (attached above). The one you provided in your first post is missing the definitions of some macros.

  • Dear Bin,

    Unfortunately, we are unable to test the RNDIS functionality with the TI image for the AM64-EVM. Our primary objective is to connect the system to Windows-based systems. This is the reason we have set up RNDIS via config fs. Without this setup, the Windows USB stack would load the serial driver, necessitating a manual configuration of the Windows 10 USB driver.

    Here are the necessary USB configurations for compatibility with Windows:

    ms_compat_id="RNDIS": This matches the Windows RNDIS Drivers.
    ms_subcompat_id="5162001": This matches the Windows RNDIS 6.0 Driver.
    To ensure you have the most up-to-date information regarding our setup, we have built a custom Linux distribution using Yocto. However, it’s important to note that we are not using a Poky distribution.
    Our setup includes:

    - Kernel: 6.1.20+rt (with Debian RT patches)
    - U-Boot: 2021.1
    - Sysfw: v9.00.08

    Best Regards,
    Stefan
  • Hello Bin,

    I’ve attached the script from our Windows PC for your reference (Powershell script). We reboot the board by establishing an SSH connection via RNDIS. The purpose of this test is to verify if the USB interface is operational and if the board is up and ready.

    Best Regards,
    Stefan

     

    # Path to Plink executable
    $plinkPath = "plink"
    # Remote host details
    $remoteHost = "192.168.200.1"
    $username = "root"
    $password = "root"
    # Counter for the number of reboots
    $rebootCount = 0
    # Function to establish SSH connection and reboot the remote system
    function Reboot-RemoteSystem {
    param (
    [string]$rhost,
    [string]$user,
    [string]$pass
    )
    $command = "echo y | $plinkPath -ssh $user@$remoteHost -pw $pass reboot"
    $output = Invoke-Expression -Command $command
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

  • Hi Stefan,

    I don't have access to a Windows PC. My computers are all Linux based.

    I modified my script attached above (usbconfigfs.sh.txt):

    FUNCS=("ecm.usb0" "rndis.usb0")

    to

    FUNCS=("rndis.usb0")

    so that the USB gadget has only the rndis function; my Linux PC can still enumerate it using the "rndis_host" driver.

    I understand you have to use USB gadget configfs, but can you please try to reproduce the issue with a Linux host? Then I can replicate it on my side and debug it.
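On a Linux host, one way to confirm which driver claimed the gadget's network interface is to read the driver symlink in sysfs. The following is a minimal sketch, assuming the interface is named usb0 (the name may differ, for example with predictable interface naming). The function name and the optional sysfs-root argument are illustrative and do not come from the thread.

```shell
#!/bin/bash
# Print the kernel driver bound to a network interface.
# $1: interface name (assumed "usb0"; check `ip link` for the real name)
# $2: sysfs root, defaults to /sys (only present so the sketch can be
#     exercised against a fake tree)
net_iface_driver() {
    local iface="$1"
    local sysroot="${2:-/sys}"
    local link="${sysroot}/class/net/${iface}/device/driver"

    if [ ! -e "${link}" ]; then
        echo "no driver bound to ${iface}" >&2
        return 1
    fi
    # The symlink target ends in the driver name, e.g. .../drivers/rndis_host
    basename "$(readlink "${link}")"
}
```

Calling `net_iface_driver usb0` on the host should print `rndis_host` for an RNDIS-only gadget; an ECM function is normally claimed by `cdc_ether`.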

  • Hello Bin,

    Here’s a short update. As discussed with the FAE, we were able to reproduce the error with the default image from TI.
    The result is slightly different, but in the end, the system hangs. Here is the test setup:

    Hardware:
    am-64evm (ti am6442 SoC SR2)
    SD Card Boot
    Software:
    Processor SDK LINUX AM64x Yocto - SD card image:
    Version: 09.02.01.10
    Release date: May 30, 2024
    MD5 checksum: 268b8be6f8533b45ffb7dac811297e54
    Kernel version: 6.1.83-rt28-ti-rt-g96b0ebd82722
    PC:
    Windows 10, Linux tests are starting now.
    Modifications:
    Added a shell script that configures USB to CDC/RNDIS
    Added a systemd service file to run the script
    Enabled the systemd service
    Setup Script:
    Name: setup-rndis.sh
    Location: root@am64xx-evm:/lib/systemd
    Permissions and size: -rwxr-xr-x 1 root root 5034 Apr 29 04:54 setup-rndis.sh
    Systemd Service File:

    Name: setup-rndis.service
    Location: root@am64xx-evm:/lib/systemd/system
    Permissions and size: -rw-r--r-- 1 root root 249 Apr 29 04:34 setup-rndis.service
    Win 10 script: 
    # Path to Plink executable
    $plinkPath = "plink"
    # Remote host details
    $remoteHost = "192.168.200.1"
    $username = "root"
    $password = "root"
    #$password = ""
    # Counter for the number of reboots
    $rebootCount = 0
    # Function to establish SSH connection and reboot the remote system
    function Reboot-RemoteSystem {
    param (
    [string]$rhost,
    [string]$user,
    [string]$pass
    )
    $command = "echo y | $plinkPath -ssh $user@$remoteHost -pw $pass /sbin/reboot"
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    setup-rndis.sh
    #!/bin/sh
    #RNDIS config for WIN 7, WIN 10, Linux and Mac OS X
    set -e
    # command line parameters
    command="$1" # "up" or "down"
    udc_device="f400000.usb" # a udc device name, such as "musb-hdrc.1.auto"
    config_home="/sys/kernel/config/"
    g="/sys/kernel/config/usb_gadget/AM64xRef"
    usb_up() {
    usb_ver="0x0200" # USB 2.0
    dev_class="2"
    vid="0x1919"
    pid="0x0815"
    device="0x3001"
    mfg="SICK AG"
    prod="AM64xRef"
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    setup-rndis.service
    [Unit]
    Description=Setup RNDIS Devices at Boot
    After=network.target
    [Service]
    Type=oneshot
    ExecStart=/lib/systemd/setup-rndis.sh start
    ExecStop=/lib/systemd/setup-rndis.sh stop
    RemainAfterExit=yes
    Type=oneshot
    [Install]
    WantedBy=multi-user.target
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

  • Hi Stefan,

    I guess I didn't state it clearly.

    I should be able to run any Linux setup on an AM64x EVM with instructions from you, but my challenge is on the USB host side, as I don't have access to a Windows PC.

    So could you keep the AM64x side software setup the same, and only change the host to a Linux PC? If the issue also happens with the Linux PC, I can replicate it and debug.

  • Hi Bin, 

    I just finished the test with a Linux PC (Ubuntu 2022.02). After a few reboots (~50), the system hangs: 

    Log from ti am64-evm:


    [ OK ] Finished File System Check on /dev/mmcblk1p1.
    Mounting /run/media/boot-mmcblk1p1...
    [ OK ] Mounted /run/media/boot-mmcblk1p1.
    [ OK ] Started Network Configuration.
    Starting Wait for Network to be Configured...
    Starting Network Name Resolution...
    [ 9.581008] remoteproc remoteproc15: powering up 300b4000.pru
    [ 9.583266] remoteproc remoteproc15: Booting fw image ti-pruss/am65x-sr2-pru0-prueth-fw.elf, size 40816
    [ 9.583310] remoteproc remoteproc15: unsupported resource 5
    [ 9.583339] remoteproc remoteproc15: remote processor 300b4000.pru is now up
    [ 9.583383] remoteproc remoteproc16: powering up 30084000.rtu
    [ 9.584635] remoteproc remoteproc16: Booting fw image ti-pruss/am65x-sr2-rtu0-prueth-fw.elf, size 30888
    [ 9.584692] remoteproc remoteproc16: remote processor 30084000.rtu is now up
    [ 9.584731] remoteproc remoteproc7: powering up 3008a000.txpru
    [ 9.586137] remoteproc remoteproc7: Booting fw image ti-pruss/am65x-sr2-txpru0-prueth-fw.elf, size 36672
    [ 9.586198] remoteproc remoteproc7: remote processor 3008a000.txpru is now up
    [ 9.588990] pps pps1: new PPS source ptp2
    [ 9.658029] am65-cpsw-nuss 8000000.ethernet eth1: PHY [mdio_mux-0.1:03] driver [TI DP83869] (irq=POLL)
    [ 9.658063] am65-cpsw-nuss 8000000.ethernet eth1: configuring for phy/rgmii-rxid link mode
    [ 9.705028] am65-cpsw-nuss 8000000.ethernet eth0: PHY [8000f00.mdio:00] driver [TI DP83867] (irq=POLL)
    [ 9.705562] am65-cpsw-nuss 8000000.ethernet eth0: configuring for phy/rgmii-rxid link mode
    [ OK ] Started Network Name Resolution.
    [ OK ] Reached target Network.
    [ OK ] Reached target Host and Network Name Lookups.
    Starting Avahi mDNS/DNS-SD Stack...
    Starting Enable and configure wl18xx bluetooth stack...
    Starting containerd container runtime...
    [ OK ] Started Netperf Benchmark Server.
    [ OK ] Started NFS status monitor for NFSv2/3 locking..
    Starting Setup RNDIS Devices at Boot...
    Starting Simple Network Ma…ent Protocol (SNMP) Daemon....
    Starting Permit User Sessions...
    [ OK ] Finished Enable and configure wl18xx bluetooth stack.
    [ OK ] Started Avahi mDNS/DNS-SD Stack.
    [ OK ] Finished Permit User Sessions.
    [ OK ] Started Getty on tty1.
    [ 10.361531] using random self ethernet address
    [ 10.361546] using random host ethernet address
    [ OK ] Started Serial Getty on ttyS2.
    [ OK ] Reached target Login Prompts.
    Starting Synchronize System and HW clocks...
    [ 10.435187] using random self ethernet address
    [ 10.435203] using random host ethernet address
    [FAILED] Failed to start Synchronize System and HW clocks.
    See 'systemctl status sync-clocks.service' for details.
    [ 10.492666] usb0: HOST MAC 00:06:77:12:ab:50
    [ 10.492687] usb0: MAC 00:06:77:12:ab:50


    Hardware:
    am-64evm (ti am6442 SoC SR2)
    SD Card Boot
    Software:
    Processor SDK LINUX AM64x Yocto - SD card image:
    Version: 09.02.01.10
    Release date: May 30, 2024
    MD5 checksum: 268b8be6f8533b45ffb7dac811297e54
    Kernel version: 6.1.83-rt28-ti-rt-g96b0ebd82722
    PC:
    Linux Ubuntu 2022.02 
    Setup Script:
    Name: setup-rndis.sh
    Location: root@am64xx-evm:/lib/systemd
    Permissions and size: -rwxr-xr-x 1 root root 5034 Apr 29 04:54 setup-rndis.sh
    Systemd Service File:
    Name: setup-rndis.service
    Location: root@am64xx-evm:/lib/systemd/system
    Permissions and size: -rw-r--r-- 1 root root 249 Apr 29 04:34 setup-rndis.service
    Ubuntu script: 
    #!/bin/bash
    # Remote host credentials
    REMOTE_HOST="192.168.199.1"
    REMOTE_USER="root"
    REMOTE_PASSWORD="root"
    # Initialize the reboot counter
    REBOOT_COUNT=0
    # Function to connect and reboot the remote host
    reboot_remote_host() {
    sshpass -p $REMOTE_PASSWORD ssh -o StrictHostKeyChecking=no $REMOTE_USER@$REMOTE_HOST '/sbin/reboot'
    if [ $? -eq 0 ]; then
    REBOOT_COUNT=$((REBOOT_COUNT + 1))
    echo "Reboot successful. Current reboot count: $REBOOT_COUNT"
    else
    echo "Failed to connect or reboot the remote host."
    return 1
    fi
    }
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    setup-rndis.sh
    #!/bin/sh
    #RNDIS config for WIN 7, WIN 10, Linux and Mac OS X
    set -e
    # command line parameters
    command="$1" # "up" or "down"
    udc_device="f400000.usb" # a udc device name, such as "musb-hdrc.1.auto"
    config_home="/sys/kernel/config/"
    g="/sys/kernel/config/usb_gadget/AM64xRef"
    usb_up() {
    usb_ver="0x0200" # USB 2.0
    dev_class="2"
    vid="0x1919"
    pid="0x0815"
    device="0x3001"
    mfg="SICK AG"
    prod="AM64xRef"
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    setup-rndis.service
    [Unit]
    Description=Setup RNDIS Devices at Boot
    After=network.target
    [Service]
    Type=oneshot
    ExecStart=/lib/systemd/setup-rndis.sh start
    ExecStop=/lib/systemd/setup-rndis.sh stop
    RemainAfterExit=yes
    Type=oneshot
    [Install]
    WantedBy=multi-user.target
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  • Hi Stefan,

    Bin is out of office until the end of next week. Please expect a delayed response.

  • Hi Stefan,

    Thanks for the details. I am able to follow them and run the same test on my EVM. It has been running for 58 reboots now and hasn't triggered the issue yet. I will leave it running overnight.

    Meanwhile, can you please apply the following kernel patch and test it on your setup to see if it resolves the issue? This patch fixes a kernel driver bug that causes a busy loop with the g_printer gadget driver, which leads to a Linux system hang on AM64x. I reviewed the kernel ECM and RNDIS gadget drivers, and the bug appears to be applicable here too.

    0001-usb-cdns3-fix-linked-list-corruption.patch

  • Hi Bin, 

    Great. We see up to ~800 reboots until the error occurs; the variation is quite high. 
    I will add the patch to our custom kernel because I don’t have the TI Yocto setup running right now. 

    BR

    Stefan

  • Hi Stefan 

    I am guessing it is hard to tell whether the patch improved things at all , if variation is high? Are you doing these tests on your board(s) or EVM or both?

  • Hi Stefan,

    The controller sometimes causes a system hang if we configure the USB device via our setup script. Only a hard power cycle help to reset the system. 

    Do you mean using warm reset, such as grounding RESET_REQz pin, doesn't reset the system when the lockup happens?

  • Hi Stefan,

    Somehow the USB network between my AM64x EVM and the Linux PC is not reliable; the test often got stuck in the sshpass calls. So I changed the PC-side script to make the test more reliable (mainly reducing the sleep time and running a ping command before sshpass).
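A probe-before-ssh step of this kind could look roughly like the sketch below. It is not the modified script used in this test; the function name, the `PROBE_CMD` and `PROBE_INTERVAL` variables, and the retry count are assumptions chosen for illustration.

```shell
#!/bin/bash
# Wait until a host answers a probe before attempting sshpass/ssh.
# PROBE_CMD and PROBE_INTERVAL are hypothetical knobs for this sketch.
PROBE_CMD="${PROBE_CMD:-ping -c 1 -W 1}"   # one echo request, 1 s timeout
PROBE_INTERVAL="${PROBE_INTERVAL:-1}"      # seconds between attempts

# $1: host address, $2: maximum number of attempts (default 60)
# Prints the attempt number that succeeded; returns 1 if none did.
wait_for_host() {
    local host="$1" max_tries="${2:-60}" i
    for ((i = 1; i <= max_tries; i++)); do
        # PROBE_CMD is intentionally unquoted so its options are split
        if ${PROBE_CMD} "${host}" >/dev/null 2>&1; then
            echo "${i}"
            return 0
        fi
        sleep "${PROBE_INTERVAL}"
    done
    return 1
}
```

In the reboot loop, the ssh call would then only run once the probe succeeds, for example `wait_for_host "$REMOTE_HOST" && sshpass -p "$REMOTE_PASSWORD" ssh -o StrictHostKeyChecking=no "$REMOTE_USER@$REMOTE_HOST" '/sbin/reboot'`.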

    I am not a network expert, but I am concerned about the following kernel message on the Linux PC side:

    IPv6: usb0: IPv6 duplicate address fe80::206:77ff:fe12:ab50 used by 00:06:77:12:ab:50 detected!

    So I changed the EVM script setup-rndis.sh to use different MAC addresses on the EVM and the PC, as follows, to remove this kernel message on the host. Please let me know if this change is OK.

    mac1="00:06:77:12:AB:50"
    mac2="00:06:77:12:AB:51"

    dev_mac1="${mac1}"
    host_mac1="${mac2}"
    dev_mac2="${mac1}"
    host_mac2="${mac2}"
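The duplicate-address message is consistent with both ends of the link deriving the same IPv6 link-local address from the same MAC. The sketch below shows that derivation, assuming the interfaces use the EUI-64 scheme rather than stable-privacy addressing; the function name is hypothetical.

```shell
#!/bin/bash
# Derive the EUI-64 based IPv6 link-local address from a MAC address:
# flip the universal/local bit (0x02) of the first octet and insert
# ff:fe between the third and fourth octets.
# Note: zero groups are printed as "0"; no further "::" compression is done.
mac_to_link_local() {
    local o1 o2 o3 o4 o5 o6
    IFS=: read -r o1 o2 o3 o4 o5 o6 <<< "$1"
    printf 'fe80::%x:%x:%x:%x\n' \
        $(( ((0x${o1} ^ 0x02) << 8) | 0x${o2} )) \
        $(( (0x${o3} << 8) | 0xff )) \
        $(( 0xfe00 | 0x${o4} )) \
        $(( (0x${o5} << 8) | 0x${o6} ))
}

mac_to_link_local 00:06:77:12:AB:50   # fe80::206:77ff:fe12:ab50
mac_to_link_local 00:06:77:12:AB:51   # fe80::206:77ff:fe12:ab51
```

With distinct device and host MACs, the two ends get distinct link-local addresses, so duplicate address detection on the host has nothing to report.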

  • Hi Stefan, here is an update from my side:

    I was able to reproduce the reboot lockup multiple times. Most of the time (20+ times) the lockup happened during the Linux shutdown phase, but twice it happened during Linux bootup (at "setup-rndis.sh start").

    I then further simplified the test setup: no host ssh connection, just repeatedly running "setup-rndis.sh start" and "setup-rndis.sh stop" without rebooting Linux, while keeping the USB cable connected to the USB host. This makes the lockup happen very quickly (within a few minutes).
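A loop of that shape could be sketched as follows. The script path is taken from the setup described earlier in the thread; the iteration count, the delay, and the function name are assumptions, since the exact values used in this test are not stated.

```shell
#!/bin/bash
# Repeatedly bind and unbind the gadget without rebooting Linux.
# SETUP_SCRIPT defaults to the path used earlier in this thread; the
# iteration count and delay are arbitrary values for this sketch.
SETUP_SCRIPT="${SETUP_SCRIPT:-/lib/systemd/setup-rndis.sh}"

# $1: number of start/stop cycles (default 1000)
# $2: delay in seconds after each step (default 2)
stress_gadget() {
    local iterations="${1:-1000}" delay="${2:-2}" i
    for ((i = 1; i <= iterations; i++)); do
        echo "iteration ${i}: start"
        "${SETUP_SCRIPT}" start || return 1
        sleep "${delay}"
        echo "iteration ${i}: stop"
        "${SETUP_SCRIPT}" stop || return 1
        sleep "${delay}"
    done
}
```

Running `stress_gadget` on the EVM console with the USB cable attached prints the iteration number before each step, so the last line on the UART gives a rough indication of where the lockup occurred.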

    I also removed the RNDIS gadget function from setup-rndis.sh so that only ECM is used. The lockup still happens.

    The patch 0001-usb-cdns3-fix-linked-list-corruption.patch I provided here last Thursday doesn't seem to help; the lockup still happens with this patch in my test.

    I am now able to reliably and quickly reproduce the lockup, and will start debugging it on Monday.

  • Hi Bin,
    I tested the patch file and encountered a system hang during boot with the am64-evm. I agree with your assessment; the issue doesn’t seem to be related to RNDIS or ECM. Instead, it appears to be linked to the USB core itself. For testing, a valid USB configuration was needed, and RNDIS or ECM was a straightforward way to establish a traceable USB communication.

    Please let me know if I can assist further!

    BR
    Stefan



  • I am guessing it is hard to tell whether the patch improved things at all , if variation is high? Are you doing these tests on your board(s) or EVM or both?
    Hi Mukul,
    We’ve observed the issue on various custom boards and on two am64-evms, all with the following configuration:
        TI AM6442 SoC SR2
        Linux 6.1.20+rt
        Sysfw: v09.00.07
        Boot device: 
    • eMMC on custom boards
    • SD on am64-evm
    BR
    Stefan
  • Do you mean using warm reset, such as grounding RESET_REQz pin, doesn't reset the system when the lockup happens?
    Hi Bin,

    we do not use a warm reset in general for our designs. In our design, RESET_REQz is pulled up to the 3V3 power rail.

    BR
    Stefan
  • Hi Stefan,

    The lockup appears to happen at random places in either cdns3_gadget_udc_start() or cdns3_gadget_udc_stop(), when the USB configfs script does echo "${udc_device}" > ${g}/UDC or echo "" > ${g}/UDC respectively. I will continue debugging...
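To narrow down which of the two writes is in flight when the system stops, the bind and unbind steps can be wrapped so that a marker is written to the kernel log before and after each one. This is only a sketch: `GADGET_DIR` and `MARKER_SINK` are assumed names, the default gadget path is the one from the setup script in this thread, and there is no guarantee the last marker reaches the console before a hard lockup.

```shell
#!/bin/bash
# Bind/unbind a configfs gadget with a marker before and after each step.
# GADGET_DIR and MARKER_SINK are assumptions for this sketch.
GADGET_DIR="${GADGET_DIR:-/sys/kernel/config/usb_gadget/AM64xRef}"
MARKER_SINK="${MARKER_SINK:-/dev/kmsg}"   # lines written here appear in dmesg

mark() {
    echo "gadget-test: $*" >> "${MARKER_SINK}"
}

# $1: UDC name, e.g. f400000.usb
udc_bind() {
    mark "binding $1"
    echo "$1" > "${GADGET_DIR}/UDC"    # leads to cdns3_gadget_udc_start()
    mark "bound $1"
}

udc_unbind() {
    mark "unbinding"
    echo "" > "${GADGET_DIR}/UDC"      # leads to cdns3_gadget_udc_stop()
    mark "unbound"
}
```

If the log ends with "binding f400000.usb" and no matching "bound" line, the hang occurred while the write to UDC was still in progress.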

  • Hi Bin, 

    we see the system hang at the same point: 

    After the "echo "f400000.usb" > UDC" the system hang

    as mentioned in the first post. 

  • Hi Stefan,

    Yes, about half of the lockup cases I see occur at "echo f400000.usb > UDC", which triggers the kernel function cdns3_gadget_udc_start(). I also see that JTAG is unable to connect to DDR or any other AM64x peripherals when the lockup happens. It seems the SoC is already in a bad state at that point. I am continuing to debug the issue, but I am not sure how soon the root cause will be found and the issue resolved. Have you considered using a watchdog timer in Linux to reset the system when the lockup happens? This could be plan B for your project until the issue is resolved.
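One possible way to arm a watchdog on a systemd-based image is a drop-in that sets `RuntimeWatchdogSec`, so that systemd services the hardware watchdog and the SoC is reset if the system stops responding. The sketch below assumes the kernel exposes a watchdog device and that systemd is PID 1; the function name, the drop-in file name, and the 30 s timeout are illustrative, and the thread does not say which watchdog mechanism was actually evaluated.

```shell
#!/bin/bash
# Write a systemd drop-in that enables the hardware watchdog.
# $1: timeout (default 30s, an arbitrary value for this sketch)
# $2: drop-in directory (default /etc/systemd/system.conf.d)
enable_runtime_watchdog() {
    local timeout="${1:-30s}"
    local conf_dir="${2:-/etc/systemd/system.conf.d}"
    mkdir -p "${conf_dir}"
    printf '[Manager]\nRuntimeWatchdogSec=%s\n' "${timeout}" \
        > "${conf_dir}/10-watchdog.conf"
}
```

After calling `enable_runtime_watchdog` as root, the setting takes effect on the next boot (or after `systemctl daemon-reexec`).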

  • Hi Bin, 

    we ran tests on the am64-evm, and the watchdog workaround can work. We are starting to evaluate it on our custom hardware.

    BR 

    Stefan 

  • Hi Stefan,

    Thanks for the update. We continue debugging it and will keep you posted.