Hello,
I'm using an AM5749 on a custom board where I enabled the second PCIe controller (PCIe_SS2) along with PCIe_SS1.
Each controller therefore uses a single PCIe lane.
I reproduced the issue with the TI 5.10 kernel and also with the upstream 6.1.8 kernel. PCIe_SS2 is enabled with the following device tree overrides:
&axi1 {
	status = "okay";
};
&pcie2_rc {
	status = "okay";
};
&pcie2_phy {
	status = "okay";
};
Sometimes (it is relatively easy to reproduce) the second PCI bridge is not identified properly:
# lspci
0000:00:00.0 PCI bridge: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge
0000:02:00.0 Unassigned class [ff00]: Hilscher GmbH CIFX 50E-DP(M/S)
0001:00:00.0 Non-VGA unclassified device: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
The issue only appears on PCIe_SS2.
Indeed, the device class reported for its root port is 0x0:
[ 1212.493316] pci 0001:00:00.0: [104c:8888] type 01 class 0x000000 <<<< ??? KO
[ 1212.493347] pci 0001:00:00.0: reg 0x10: [mem 0x00000000-0x000fffff pref]
[ 1212.493377] pci 0001:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff pref]
[ 1212.493438] pci 0001:00:00.0: supports D1
[ 1212.493438] pci 0001:00:00.0: PME# supported from D0 D1 D3hot
[ 1212.517578] PCI: bus1: Fast back to back transfers enabled
[ 1212.517608] pci 0001:00:00.0: PCI bridge to [bus 01]
Earlier in the dmesg log there is this line:
[ 3.215606] dra7-pcie 51800000.pcie: Phy link never came up
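To measure how often this happens across reboots, the link message can be checked from a script. The following is a minimal sketch, assuming the driver prints either "Phy link never came up" or "Link up" for each controller as in the logs in this post; the function name `pcie_link_status` is hypothetical.

```shell
#!/bin/sh
# Report the link status of one dra7-pcie controller from a kernel log on stdin.
# The function name is hypothetical; the matched strings are the ones seen in
# the dmesg output quoted in this post.
pcie_link_status() {
    ctrl="$1"
    # Keep only the lines printed for the requested controller.
    log=$(grep -F "dra7-pcie $ctrl:" || true)
    if printf '%s\n' "$log" | grep -qF "Phy link never came up"; then
        echo "link-down"
    elif printf '%s\n' "$log" | grep -qF "Link up"; then
        echo "link-up"
    else
        echo "unknown"
    fi
}
```

Typical use on the target would be `dmesg | pcie_link_status 51800000.pcie`, run once per boot.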
Also, lspci reports "Invalid class 0000 for header type 01", and there is probably an issue with the PCIe BAR mapping (Memory at <unassigned>):
# lspci -nv
0001:00:00.0 0000: 104c:8888 (rev 01)
!!! Invalid class 0000 for header type 01
Flags: fast devsel, IRQ 255
Memory at <unassigned> (32-bit, prefetchable) [virtual] [size=1M] <<<<
Memory at <unassigned> (32-bit, prefetchable) [virtual] [size=64K] <<<<
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: [disabled]
Prefetchable memory behind bridge: [disabled]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
When it's working, the device class is 0x060400:
[ 2.316070] pci 0001:00:00.0: [104c:8888] type 01 class 0x060400
and the PCIe link comes up:
[ 2.315917] dra7-pcie 51800000.pcie: Link up
# lspci
0000:00:00.0 PCI bridge: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge
0000:02:00.0 Unassigned class [ff00]: Hilscher GmbH CIFX 50E-DP(M/S)
0001:00:00.0 PCI bridge: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
0001:01:00.0 Network controller: Qualcomm Device 1103 (rev 01)
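The good and bad cases can also be told apart without parsing lspci output, by reading the class code of the root port from sysfs. This is a minimal sketch, assuming the standard `/sys/bus/pci/devices/<BDF>/class` attribute; the helper names `is_pci_bridge_class` and `check_root_port` are hypothetical.

```shell
#!/bin/sh
# Return success if the class code (as printed by sysfs, e.g. "0x060400")
# belongs to the PCI-to-PCI bridge class 0x0604xx.
is_pci_bridge_class() {
    case "$1" in
        0x0604??) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: check the class of one device, e.g. 0001:00:00.0.
check_root_port() {
    dev="/sys/bus/pci/devices/$1"
    class=$(cat "$dev/class" 2>/dev/null) || { echo "$1: not present"; return 2; }
    if is_pci_bridge_class "$class"; then
        echo "$1: class $class OK"
    else
        echo "$1: class $class is not a PCI bridge"
        return 1
    fi
}
```

For the failing case described here, `check_root_port 0001:00:00.0` would be expected to report a class that is not a PCI bridge.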
I found a previous report of a similar issue, but without a clear explanation of the cause or of how it was fixed:
https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1017941/dra722-pcie-either-pcie1-or-pcie2-is-not-detected-properly
I noticed that the LTSSM state differs between PCIe1 and PCIe2 when the issue occurs (read from the PCIECTRL_TI_CONF_DEVICE_CMD register):
# devmem2 0x51002104 w
Read at address 0x51002104 (0xb6f0e104): 0x00000045
# devmem2 0x51802104 w
Read at address 0x51802104 (0xb6f67104): 0x00000000 <<<
When it's working:
# devmem2 0x51002104 w ; devmem2 0x51802104 w
Read at address 0x51002104 (0xb6f73104): 0x00000045
Read at address 0x51802104 (0xb6f7b104): 0x00000045
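To make these raw values easier to compare, the register can be split into its fields. This is only a sketch: the bit layout used below (bit 0 = LTSSM_EN, bits 7:2 = LTSSM_STATE) is an assumption that should be checked against the TRM, and the function name `decode_device_cmd` is hypothetical.

```shell
#!/bin/sh
# Decode a PCIECTRL_TI_CONF_DEVICE_CMD value given as hex or decimal.
# ASSUMED layout (verify against the TRM): bit 0 = LTSSM_EN, bits 7:2 = LTSSM_STATE.
decode_device_cmd() {
    val=$(( $1 ))
    ltssm_en=$(( val & 1 ))
    ltssm_state=$(( (val >> 2) & 0x3f ))
    printf 'ltssm_en=%d ltssm_state=0x%02x\n' "$ltssm_en" "$ltssm_state"
}
```

Under that assumption, `decode_device_cmd 0x00000045` gives LTSSM_EN=1 with state 0x11, while `decode_device_cmd 0x00000000` gives LTSSM_EN=0, which would mean link training is not enabled on PCIe2 at the time of the read rather than stuck in an intermediate state.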
I tried removing the device and rescanning the PCI bus, without success:
echo "1" > /sys/bus/pci/devices/0001\:00\:00.0/remove; sleep 1; echo "1" > /sys/bus/pci/rescan
On the 6.1.8 kernel, this crashes the system:
[ 142.966583] pci 0000:01:00.0: PCI bridge to [bus 02]
[ 142.971588] pci 0000:01:00.0: bridge window [mem 0x20200000-0x202fffff]
[ 142.985382] pci 0001:01:00.0: [17cb:1103] type 00 class 0x028000
[ 142.991485] pci 0001:01:00.0: reg 0x10: [mem 0x00000000-0x001fffff 64bit]
[ 142.998779] pci 0001:01:00.0: PME# supported from D0 D3hot D3cold
[ 143.005035] pci 0001:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0001:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link)
[ 143.044342] pci 0001:01:00.0: BAR 0: assigned [mem 0x00200000-0x003fffff 64bit]
[ 143.051757] pci 0001:00:00.0: PCI bridge to [bus 01]
[ 143.056793] pci 0001:00:00.0: bridge window [mem 0x00200000-0x003fffff]
[ 143.063995] ath11k_pci 0001:01:00.0: BAR 0: assigned [mem 0x00200000-0x003fffff 64bit]
[ 143.072052] pci 0001:00:00.0: can't enable device: BAR 0 [mem 0x00000000-0x000fffff pref] not claimed
[ 143.081359] pci 0001:00:00.0: Error enabling bridge (-22), continuing
[ 143.087890] ath11k_pci 0001:01:00.0: enabling device (0000 -> 0002)
On the 5.10 kernel, where the ath11k driver doesn't support the Qualcomm module, lspci is able to list the module after the rescan, but the PCI bridge is still not correctly detected (Non-VGA unclassified device):
# lspci
0000:00:00.0 PCI bridge: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
0000:01:00.0 PCI bridge: Texas Instruments XIO2001 PCI Express-to-PCI Bridge
0000:02:00.0 Unassigned class [ff00]: Hilscher GmbH CIFX 50E-DP(M/S)
0001:00:00.0 Non-VGA unclassified device: Texas Instruments Multicore DSP+ARM KeyStone II SOC (rev 01)
0001:01:00.0 Network controller: Qualcomm Device 1103 (rev 01)
Do you have any clue?
Best regards,
Romain