We have found that on our TQMa64xxL SoM, U-Boot occasionally hangs during MDIO initialization of the internal CPSW controller: roughly 1 in 1000 boots fails.
The hang occurs in cpsw_mdio_init(), right after the following piece of code:
    /* set enable and clock divider */
    writel(cpsw_mdio->div | CONTROL_ENABLE | CONTROL_FAULT | CONTROL_FAULT_ENABLE,
           &cpsw_mdio->regs->control);
    wait_for_bit_le32(&cpsw_mdio->regs->control,
                      CONTROL_IDLE, false, CPSW_MDIO_TIMEOUT, true);

    /*
     * wait for scan logic to settle:
     * the scan time consists of (a) a large fixed component, and (b) a
     * small component that varies with the mii bus frequency.  These
     * were estimated using measurements at 1.1 and 2.2 MHz on tnetv107x
     * silicon.  Since the effect of (b) was found to be largely
     * negligible, we keep things simple here.
     */
    mdelay(1);
Here, wait_for_bit_le32() calls readl() on cpsw_mdio->regs->control in a loop until the IDLE flag is cleared. This loop terminates successfully, but any subsequent access to the MDIO controller, for example a later cpsw_mdio_wait_for_user_access(), hangs indefinitely and stops the boot process completely. In fact, simply inserting another readl() of cpsw_mdio->regs->control right after the mdelay() in the code above reproduces the issue, which gives us the impression that the MDIO controller somehow locks up during this 1 ms delay.
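For anyone not familiar with the helper, the polling step behaves roughly like the sketch below. This is a host-side mock and not the U-Boot implementation: fake_control, mock_readl() and poll_for_bit() are hypothetical names made up for illustration, and the bit position used for CONTROL_IDLE is an assumption. Note that a bounded loop like this can only time out if each read actually returns. In the failure we see, the register access itself never completes, so no poll limit or timeout in software gets a chance to trigger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position; check the driver source for the real definition. */
#define CONTROL_IDLE (1u << 31)

/* Hypothetical stand-in for the memory-mapped MDIO control register. */
static uint32_t fake_control = CONTROL_IDLE;

/* The mock clears IDLE on the third read to imitate the controller starting up. */
static unsigned int reads_until_busy = 3;

/* Mock of readl(): on real hardware this is a bus access that may stall. */
static uint32_t mock_readl(const uint32_t *reg)
{
	if (reads_until_busy > 0 && --reads_until_busy == 0)
		fake_control &= ~CONTROL_IDLE;
	return *reg;
}

/*
 * Poll until all bits in mask are set (set == true) or cleared
 * (set == false), giving up after max_polls reads.
 * Returns 0 on success and -1 if the condition was never met.
 */
static int poll_for_bit(const uint32_t *reg, uint32_t mask, bool set,
			unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		uint32_t val = mock_readl(reg);

		if (!set)
			val = ~val;
		if ((val & mask) == mask)
			return 0;
	}
	return -1;
}
```

Calling poll_for_bit(&fake_control, CONTROL_IDLE, false, 10) on the mock succeeds once the IDLE bit has been cleared, which mirrors the part of the init sequence that works for us. The hang only shows up on the next access after the delay.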
We are currently using the latest commit of ti-u-boot-2021.01 (tag 08.03.00.005), and were able to reproduce the issue with SYSFW 2021.05 and 2022.01 (we have not tested other SYSFW versions). In addition, we tried the following changes without success:
- Increase the delay from 1 ms to 2 ms
- Remove the CONTROL_FAULT_ENABLE flag (as Linux does not set it either)
In summary, nothing we tried fixed the issue, and we have not found a way to detect or work around it either, as any access to the MDIO controller in this state completely hangs the CPU.