We're seeing frequent hangs during shutdown/reboot on our TQMa5728/MBa57xx board on ti-rt-linux-5.10.y (tag cicd.2022.11.02.13.24.41-rt), as well as on an older branch based on ti-rt-linux-5.4.y. I have not tested other kernel versions.
On the 5.10 kernel, I managed to narrow the issue down to the shutdown of the PCIe controller and PHY:
device_shutdown()
  pci_device_shutdown()
    pcie_portdrv_remove()
      pcie_port_device_remove()
        pci_disable_device()
          do_pci_disable_device()
            pci_write_config_word() // A
  platform_drv_shutdown()
    dra7xx_pcie_shutdown()
      dra7xx_pcie_disable_phy()
        phy_power_off()
          ti_pipe3_power_off()
            regmap_update_bits() // B
Location B is where the hang actually happens. However, there seems to be some interaction between disabling the controller and disabling the PHY: adding a delay of 1 ms (or a synchronous printk to a serial console) anywhere between locations A and B makes the issue go away, or at least makes it unlikely enough that I have not been able to observe it since, whereas it previously happened in roughly 50% of shutdowns. Adding the delay *before* location A has no effect.
Other information that might be relevant:
- Our Device Tree enables both PCIe controllers. The hang can happen during the shutdown of either of them.
- Nothing is connected to the two PCIe ports.
- Our config enables PREEMPT_RT.