
DRA829V: PCI Endpoint MSI / Outbound transaction Issue

Part Number: DRA829V
Other Parts Discussed in Thread: DRA829

We have a custom board based on the DRA829 that needs to be configured as a PCIe endpoint device. In our solution, we are using a Qualcomm processor as the Root Complex device. We have configured the EP using the link below as reference,

software-dl.ti.com/.../PCIe_End_Point.html

With this configuration, the Root Complex is able to detect the EP, and we are able to perform BAR-based reads and writes from the host.

The issue is seen when we try to raise an MSI or legacy interrupt, or to issue any read/write, from the TI DRA829 endpoint. All outbound transactions from the TI device fail, and the host subsequently brings the PCIe link down.

When an MSI or legacy interrupt is asserted from the endpoint, the host does not receive it. On debugging, we see that the EP controller changes the link status. The MSI interrupt address and the interrupt vector are correct as seen from the EP configuration space; however, the memory write transaction to the RC fails. The same happens when a DMA-based read or write transaction is issued to the host.
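
The point that an MSI is itself an outbound memory write can be made concrete: per the PCI MSI capability, the endpoint posts the Message Data (with the low bits replaced by the vector number) to the Message Address programmed by the host. A minimal sketch of that computation (plain Python, with hypothetical example values; the thread's `msi_interrupts = 32` corresponds to MME = 5, i.e. 32 vectors):

```python
# Sketch: how the MSI posted memory write is formed from the MSI capability
# fields. For vector n, the low log2(N) bits of Message Data are replaced by n,
# where N is the number of vectors granted via Multiple Message Enable (MME).

def msi_write(msg_addr_hi, msg_addr_lo, msg_data, mme, vector):
    """Return (address, data) of the posted write that signals `vector`."""
    nvec = 1 << mme                      # number of vectors granted by the host
    assert 0 <= vector < nvec
    address = (msg_addr_hi << 32) | msg_addr_lo
    data = (msg_data & ~(nvec - 1)) | vector
    return address, data

# Hypothetical example values (not taken from this thread's hardware):
addr, data = msi_write(0x0, 0xfee00000, 0x4020, mme=5, vector=3)
print(hex(addr), hex(data))   # 0xfee00000 0x4023
```

If this write fails like any other outbound transaction, the MSI is lost, which is consistent with the symptoms described above.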

An exception is also seen when a read or write is performed using memcpy_fromio() or memcpy_toio() in drivers/pci/endpoint/functions/pci-epf-test.c.

In our case, a Wi-Fi EP device connected to another Qualcomm port is working fine, so we do not suspect the Root Complex configuration.

TI Kernel version : 4.19.94

Exception seen on the EP when a read/write using memcpy_fromio/memcpy_toio is issued from the EP:

ERROR: Unhandled External Abort received on 0x80000000 at EL3!
ERROR: exception reason=0 syndrome=0xbf000000
PANIC in EL3.
x30 = 0x0000000070004644
x0 = 0x0000000000000000
x1 = 0x0000000000000060
x2 = 0x0000000000000060
x3 = 0x000000000000000b
x4 = 0x0000000000000062
x5 = 0x0000000000000008
x6 = 0x000000000000003b
x7 = 0x0000000000000000
x8 = 0xffff00000dc6fc70
x9 = 0x0000000041023060
x10 = 0x000000000000073d
x11 = 0x656e696c20717269
x12 = 0x6164202037373520
x13 = 0x3034337830206174
x14 = 0x3020726464612020
x15 = 0xffffffffffffffff
x16 = 0x0000000000000000
x17 = 0x0000000000000000
x18 = 0xffff8008415f7980
x19 = 0x0000000000000000
x20 = 0x00000000bf000000
x21 = 0x0000000000000000
x22 = 0x0000000000000000
x23 = 0xffff000008d76cf0
x24 = 0x0000000000000000
x25 = 0x0000000000000000
x26 = 0xffff000008b1e490
x27 = 0xffff000008d76370
x28 = 0x0000000000000000
x29 = 0x000000007000a520
scr_el3 = 0x000000000000073d
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000002c0
mair_el3 = 0x00000000004404ff
spsr_el3 = 0x0000000040000085
elr_el3 = 0xffff00000889ab58
ttbr0_el3 = 0x000000007000f0e0
esr_el3 = 0x00000000bf000000
far_el3 = 0x0000000000000000
spsr_el1 = 0x0000000040000005
elr_el1 = 0xffff000008108b08
spsr_abt = 0x0000000000000000
spsr_und = 0x0000000000000000
spsr_irq = 0x0000000000000000
spsr_fiq = 0x0000000000000000
sctlr_el1 = 0x0000000034d5d91d
actlr_el1 = 0x0000000000000000
cpacr_el1 = 0x0000000000300000
csselr_el1 = 0x0000000000000000
sp_el1 = 0xffff00000dc6f960
esr_el1 = 0x0000000056000000
ttbr0_el1 = 0x00000008c28a6000
ttbr1_el1 = 0x0262000080e20000
mair_el1 = 0x0000bbff440c0400
amair_el1 = 0x0000000000000000
tcr_el1 = 0x00000034f5507510
tpidr_el1 = 0x0000800876e40000
tpidr_el0 = 0x0000000000000000
tpidrro_el0 = 0x0000000000000000
par_el1 = 0x0000000000000000
mpidr_el1 = 0x0000000080000000
afsr0_el1 = 0x0000000000000000
afsr1_el1 = 0x0000000000000000
contextidr_el1 = 0x0000000000000000
vbar_el1 = 0xffff000008081800
cntp_ctl_el0 = 0x0000000000000005
cntp_cval_el0 = 0x0000021c39479daf
cntv_ctl_el0 = 0x0000000000000000
cntv_cval_el0 = 0x0000000000000000
cntkctl_el1 = 0x00000000000000e6
sp_el0 = 0x000000007000a520
isr_el1 = 0x0000000000000040
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000001b00000040
cpumerrsr_el1 = 0x0000000000000000
l2merrsr_el1 = 0x0000000000000000
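
For reference, the esr_el3 value in the dump above can be decoded by hand. A minimal sketch (plain Python) of the ARMv8 ESR field split shows that 0xbf000000 is EC 0x2f (SError interrupt) with ISS bit 24 (IDS) set, i.e. an implementation-defined syndrome, which is consistent with an external/bus abort on the failing outbound access:

```python
# Decode an ARMv8 ESR_ELx value into its EC / IL / ISS fields.
def decode_esr(esr):
    ec  = (esr >> 26) & 0x3f    # Exception Class
    il  = (esr >> 25) & 0x1     # Instruction Length bit
    iss = esr & 0x1ffffff       # Instruction Specific Syndrome
    return ec, il, iss

ec, il, iss = decode_esr(0xbf000000)
print(hex(ec), il, hex(iss))   # 0x2f 1 0x1000000
# EC 0x2f = SError interrupt; ISS bit 24 (IDS) set means the syndrome is
# implementation defined -- typically reported for an external (bus) abort.
```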

Device tree node:


pcie0: pcie@2900000 {
	compatible = "ti,j721e-pcie";
	reg = <0x00 0x02900000 0x00 0x1000>,
	      <0x00 0x02907000 0x00 0x400>,
	      <0x0 0x02905000 0x0 0x00000400>;
	reg-names = "intd_cfg", "user_cfg", "vmap";
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;
	ti,syscon-pcie-ctrl = <&pcie0_ctrl>;
	max-link-speed = <3>;
	num-lanes = <2>;
	power-domains = <&k3_pds 239 TI_SCI_PD_EXCLUSIVE>;
	clocks = <&k3_clks 239 1>;
	clock-names = "fck";

	pcie0_ep: pcie-ep@d000000 {
		compatible = "ti,j721e-cdns-pcie-ep";
		reg = <0x00 0x0d000000 0x00 0x00800000>,
		      <0x40 0x00000000 0x01 0x00000000>;
		reg-names = "reg", "addr_space";
		cdns,max-outbound-regions = <16>;
		max-functions = /bits/ 8 <6>;
		max-virtual-functions = /bits/ 8 <0x4 0x4 0x4 0x4 0x0 0x0>;
		dma-coherent;
	};
};


Additional configuration on board boot-up:

mount -t configfs none /sys/kernel/config
cd /sys/kernel/config/pci_ep/
mkdir functions/pci_epf_test/func1
echo 0x104c > functions/pci_epf_test/func1/vendorid
echo 0xb00d > functions/pci_epf_test/func1/deviceid
echo 32 > functions/pci_epf_test/func1/msi_interrupts
echo 2 > functions/pci_epf_test/func1/msix_interrupts
ln -s functions/pci_epf_test/func1 controllers/d000000.pcie-ep/

echo 1 > controllers/d000000.pcie-ep/start

Could you please provide pointers on how to debug this issue effectively?

  • Brilly, 

    I am not sure whether you have made additional progress on the debug. From the symptoms, I could not tell whether you were able to verify your EP outbound address translation. One guess is that the iATU registers get corrupted and the transaction tried to access some protected memory region in the EP, which crashed the EP.

    The quickest debug approach may be to follow the description in Sec.

       12.2.3.4.3.2 PCIe Outbound Address Translation

    of the TRM: read the register values before the EP transaction, and see if they are correct.

    A side note - legacy interrupt is not enabled by default, per:

       https://e2e.ti.com/support/processors/f/791/p/900609/3347914#3347914

    But MSI should be functional.

    regards

    Jian

  • Thank you Jian for the reply. I am yet to find a clue to the issue.

    For the outbound window, PCIE0_DAT1 is being used, with address range 0x4000000000 - 0x40FFFFFFFF.

    On debugging an outbound write transaction, the address allocated in PCIe space is 0x4000020000. The register values before the EP transaction are:
    CDNS_PCIE_AT_OB_REGION_PCI_ADDR1  0
    CDNS_PCIE_AT_OB_REGION_PCI_ADDR0  f9876007

    Verified the above address against the address allocated by the RC.

    CDNS_PCIE_AT_OB_REGION_CPU_ADDR1  40
    CDNS_PCIE_AT_OB_REGION_CPU_ADDR0  20007    

    The above PCIe address is in region PCIE0_DAT1, so I guess the PCIe outbound window is configured properly? Is there any additional configuration associated?

    Also, do you see any reason why the MSI may not be generated by the EP? The MSI interrupt can also be considered an outbound memory write transaction to the host. The MSI address and MSI vector values are correct as seen from the endpoint configuration space; I have verified the MSI vector and address by directly triggering the interrupt from the host.
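
    As a sanity check, the CDNS_PCIE_AT_OB_REGION_* values quoted above can be decoded using the field layout of the mainline Cadence PCIe driver (drivers/pci/controller/pcie-cadence.h: the low 6 bits of ADDR0 hold the number of translated address bits minus one, and the remaining bits hold the address). A quick sketch in Python, assuming that layout:

```python
# Decode Cadence PCIe outbound-region ADDR0/ADDR1 register pairs, assuming the
# field layout of the CDNS_PCIE_AT_OB_REGION_*_ADDR0 macros in pcie-cadence.h:
#   bits [5:0]  = number of address bits passed through, minus one
#   bits [31:8] = address bits [31:8]
def decode_region(addr0, addr1):
    nbits = (addr0 & 0x3f) + 1
    base  = (addr1 << 32) | (addr0 & 0xffffff00)
    return base, nbits

# Values quoted in this thread:
pci_base, pci_nbits = decode_region(0xf9876007, 0x0)   # PCI_ADDR0 / PCI_ADDR1
cpu_base, cpu_nbits = decode_region(0x00020007, 0x40)  # CPU_ADDR0 / CPU_ADDR1
print(hex(pci_base), pci_nbits)   # 0xf9876000 8
print(hex(cpu_base), cpu_nbits)   # 0x4000020000 8
# So CPU address 0x40_0002_0000 (inside PCIE0_DAT1) translates to PCIe address
# 0xf9876000, with an 8-bit (256-byte) window.
```

    Under this reading the region programming itself looks self-consistent with the addresses quoted in the thread.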

  • Brilly, 

    Your CPU address falls into the PCIE0_DAT1 region; that is fine. But these addresses need to be translated to the correct PCIe addresses, which should be configured by the PCIE_CORE_ATU_* registers, as defined in:

      12.2.3.5.5 PCIE_CORE_AXI Registers

    There are 32 sets of registers corresponding to 32 regions, for each of the LP and HP ports. You are using the LP port. So a quick check may be to dump all registers starting at address 0D40 0000h and confirm that

        PCIE_CORE_ATU_WRAPPER_OB_i_ADDR0

        PCIE_CORE_ATU_WRAPPER_OB_i_ADDR1

    match the RC's inbound PCIe address. The reason I asked you to inspect all 32 sets of registers is that a corrupted ATU table may have the same AXI address or the same OB address across multiple entries.

    I am not sure about the two register names you mentioned:

       CDNS_PCIE_AT_OB_REGION_CPU_ADDR1  40
       CDNS_PCIE_AT_OB_REGION_CPU_ADDR0  20007 

    Are these the exact register names as in the TRM? Somehow I was not able to find them by those names.

    regards

    jian
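
    Jian's suggestion above (inspect all 32 outbound region entries and look for duplicated AXI or OB addresses) can be scripted once the registers are dumped. A hedged sketch in Python, assuming a 0x20 per-region stride from 0x0D400000 (inferred from the register offsets quoted later in this thread; verify against the TRM) and operating on dumped values rather than touching hardware:

```python
# Sketch: given dumped (region -> (axi_addr, ob_addr)) values for the 32
# outbound regions, flag regions that share the same AXI or OB address --
# the corruption pattern described above. The register stride is an
# assumption based on the offsets quoted in this thread.
from collections import defaultdict

ATU_BASE, REGION_STRIDE = 0x0D400000, 0x20

def region_addr0_offset(i):
    """Assumed offset of outbound region i's first ADDR register."""
    return ATU_BASE + i * REGION_STRIDE

def find_duplicates(regions):
    """regions: dict {region_index: (axi_addr, ob_addr)}; zero entries are skipped."""
    seen = defaultdict(list)
    for i, (axi, ob) in sorted(regions.items()):
        if axi:
            seen[("axi", axi)].append(i)
        if ob:
            seen[("ob", ob)].append(i)
    return {k: v for k, v in seen.items() if len(v) > 1}

# Hypothetical dump: regions 1 and 5 accidentally share the same OB address.
dump = {0: (0x4000000000, 0xfee00000),
        1: (0x4000020000, 0xf9876000),
        5: (0x4000a00000, 0xf9876000)}
for (kind, addr), regs in find_duplicates(dump).items():
    print(f"duplicate {kind} address {addr:#x} in regions {regs}")
# duplicate ob address 0xf9876000 in regions [1, 5]
```

    A clean dump should report no duplicates across the 32 entries; any hit would point at the corrupted-table scenario.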

  • Hi Jian,

    For MSI, outbound region 0 is used, and for data transfer, region 1. I tried dumping the registers as you suggested. Below are the values obtained for the registers as updated by the controller driver.

    Register dump for Region 1 after setting the outbound window and before starting the data transfer.

    PCIE_CORE_ATU_WRAPPER_OB_i_ADDR0  0xD400020 f9874007
    PCIE_CORE_ATU_WRAPPER_OB_i_ADDR1  0xD400024 0

    PCIE_CORE_ATU_WRAPPER_OB_i_DESC0  0xD400028 2
    PCIE_CORE_ATU_WRAPPER_OB_i_DESC1  0xD40002C 0

    PCIE_CORE_ATU_WRAPPER_OB_i_AXI_ADDR0 0xD400038 20007
    PCIE_CORE_ATU_WRAPPER_OB_i_AXI_ADDR1 0xD40003C 40

    The RC inbound address is updated correctly, as seen from the RC driver.

    Note: I was not able to dump the registers using the devmem tool; dumping the AXI registers with devmem triggered a PCIe link down.

  • Brilly, 

    Sorry for the extended delay. Could you let me know whether you were able to find any issues with the EP outbound address translation? Otherwise, we will review all registers for these two regions, including the PCIE_CORE_ATU_WRAPPER_OB_i_DESC0/1/2 registers, to make sure the correct regions are associated.

    regards

    jian