PCIe operation with FPGA

Hello!

I am stuck at a dead end with my design. I have seen that some people here were in a similar situation, so I very much hope they won't mind sharing a hint.

In our system we have a C6670 connected to a Spartan-6 FPGA over PCIe. The DSP operates as Root Complex and the FPGA runs the endpoint. We have managed to use the PIO example to read and write individual registers within the FPGA with one-DWORD TLPs. The next bottleneck is performance. With one-DWORD TLPs the link utilization is poor: write performance is about 40 MBps, while read is just 2 MBps. So as the next step we are trying to implement bus-mastering DMA in the endpoint. So far I have compiled the xapp1052 example for the FPGA and proven on a PC that the design can make read and write accesses from the FPGA side. Now I am trying to implement the same with the DSP. In particular, I have found that setting up the DMA design in the FPGA does produce some activity: using ChipScope, I can at least see that multi-DWORD TLPs are issued. So it should only be a question of proper BAR configuration and address translation setup to get those writes to complete.
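To put a rough number on why one-DWORD TLPs utilize the link so poorly, here is a minimal sketch of the payload share per TLP. It assumes a Gen1/Gen2 link and a 3-DW header without ECRC, which gives about 20 bytes of per-packet overhead (framing, sequence number, header, LCRC). The function name and the constant are my own, and DLLP traffic, flow control and read round-trip latency are not modelled, so this is only an upper bound.

```c
#include <stdint.h>

/* Assumed per-TLP overhead on a Gen1/Gen2 link:
 * 1 B start framing + 2 B sequence number + 12 B (3-DW) header
 * + 4 B LCRC + 1 B end framing = 20 B. */
#define TLP_OVERHEAD_BYTES 20u

/* Share of the bytes on the wire that is payload, in percent (truncated). */
unsigned tlp_efficiency_percent(unsigned payload_bytes)
{
    return (100u * payload_bytes) / (payload_bytes + TLP_OVERHEAD_BYTES);
}
```

Under these assumptions a 4-byte payload uses about 16% of the raw bandwidth, while a 128-byte payload uses about 86%, which is the motivation for moving to multi-DWORD TLPs.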

My major concern is that bus-mastering DMA is expected to raise an interrupt to the DSP upon transfer completion. Without going into further justification, I am trying to set up MSI. I have found on this forum how to set up the DSP for MSI reception. A self-write to the DSP register does seem to trigger the MSI interrupt and call the ISR. However, I cannot make the FPGA trigger an MSI to the DSP. Ultimately, I see that the cfg_interrupt_msienable signal from the FPGA's integrated endpoint block remains inactive. I understand that activating the MSI feature requires extra steps. The minimum required steps are setting the MSI enable bit in the MSI capability structure and then turning on the bus master enable bit in the Command register of the endpoint. I am trying to do that. As suggested in the documentation, I read the register of interest, modify the values and write them back. However, my feeling is that reading remote registers does not work as expected. Consider the following fragment:

pcieRegisters_t     setRegs;
pcieMsiCapReg_t     MsiCap;

memset( &setRegs, 0, sizeof(setRegs) );
memset( &MsiCap,  0, sizeof(MsiCap)  );

setRegs.msiCap  = &MsiCap;

if ( pcie_RET_OK != Pcie_readRegs(handle, pcie_LOCATION_REMOTE, &setRegs) )...

When I monitor this access with ChipScope, no activity happens within the FPGA. I use cfg_rd_en_n, cfg_rd_wr_done_n, and trn_rsof_n as triggers. So, again, no activity happens in the FPGA when that code executes. Moreover, the MsiCap structure reads as all zeros, though the MSI capability should identify itself with capId=0x05. Because of these two facts (no activity and a zero value read) I suspect that the register read does not happen properly.

Next, when I attempt to set the MSI enable bit like this:

MsiCap.msiEn    = 1;
if ( pcie_RET_OK !=  Pcie_writeRegs(handle, pcie_LOCATION_REMOTE, &setRegs) )...

memset( &MsiCap,  0, sizeof(MsiCap)  );
if ( pcie_RET_OK != Pcie_readRegs(handle, pcie_LOCATION_REMOTE, &setRegs) )...

a subsequent read of MsiCap shows MsiCap.msiEn as 1; however, no activity happens in the FPGA.

Next, if I attempt to read the Command register, there is still no activity in the FPGA. Only on a write to the Command register do I see activity in the FPGA. The following code

StatusCmd.busMs = 1;    // Enable bus mastering
StatusCmd.dis   = 1;    // Set DIS to disable Legacy Intr --> use MSI
StatusCmd.resp  = 1;
StatusCmd.memSp = 1;
if ( pcie_RET_OK !=  Pcie_writeRegs(handle, pcie_LOCATION_REMOTE, &setRegs) )...

triggers the following activity in the FPGA:

cfg_rd_wr_done_n validates the output on cfg_do when asserted low. cfg_dwaddr is the address, in DWORDs, of the config space register being accessed. So address 012 is byte offset 0x048, which is the origin of the MSI capability record, and 0x00805805 is the value of that register being read. The 05 in the least significant byte is the MSI capability ID. A few paragraphs above, the attempt to read the MSI capability returned zero there. The next byte, 58, is the pointer to the next record and seems valid, but it was also zero in the earlier attempt to read this capability. Within 0080 the only bit asserted is the capability to generate 64-bit addresses. The LSb of that field is exactly the msiEn bit, and it remains zero, i.e. the capability was not activated.
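For reference, the decoding described above can be written out as a small sketch. The struct and function names are hypothetical and only for illustration; the field layout is the standard first DWORD of an MSI capability (capability ID in bits 7:0, next pointer in bits 15:8, Message Control in bits 31:16, with MSI Enable as bit 0 and 64-bit address capable as bit 7 of Message Control).

```c
#include <stdint.h>

/* Fields of the first DWORD of an MSI capability structure. */
typedef struct {
    uint8_t  cap_id;          /* bits 7:0,  0x05 for MSI           */
    uint8_t  next_ptr;        /* bits 15:8, next capability offset */
    uint16_t msg_ctrl;        /* bits 31:16, Message Control       */
    int      msi_enabled;     /* Message Control bit 0             */
    int      addr64_capable;  /* Message Control bit 7             */
} msi_header_t;

/* cfg_dwaddr counts DWORDs, so the byte offset is four times larger. */
unsigned dword_to_byte_offset(unsigned dwaddr)
{
    return dwaddr * 4u;
}

/* Split the raw DWORD seen on cfg_do into its fields. */
msi_header_t decode_msi_header(uint32_t dw)
{
    msi_header_t h;
    h.cap_id         = (uint8_t)(dw & 0xFFu);
    h.next_ptr       = (uint8_t)((dw >> 8) & 0xFFu);
    h.msg_ctrl       = (uint16_t)(dw >> 16);
    h.msi_enabled    = (h.msg_ctrl & 0x0001u) != 0;
    h.addr64_capable = (h.msg_ctrl & 0x0080u) != 0;
    return h;
}
```

Feeding it 0x00805805 gives capability ID 0x05, next pointer 0x58, 64-bit capable set and MSI enable clear, which matches what I read off the ChipScope capture by hand.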

The access to 017 (0x5C) is to the PCIe capability and seems malformed, as the C0 capability ID seems to be invalid. The access to 019 (0x64) is to the PCIe Link Capabilities register, and the values there seem reasonable.
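One cross-check that may be worth doing here, as a sketch under an assumption: if the PCIe capability structure really starts at 0x58, as the next pointer read from the MSI capability suggests, then the offsets inside it follow the standard layout (0x00 header, 0x04 Device Capabilities, 0x08 Device Control/Status, 0x0C Link Capabilities). The enum and function names below are hypothetical.

```c
#include <stdint.h>

/* Registers at the start of a PCI Express capability structure. */
enum pcie_cap_reg {
    PCIE_CAP_HEADER,        /* offset 0x00: cap ID, next ptr, PCIe caps */
    PCIE_DEV_CAPS,          /* offset 0x04: Device Capabilities         */
    PCIE_DEV_CTRL_STATUS,   /* offset 0x08: Device Control / Status     */
    PCIE_LINK_CAPS,         /* offset 0x0C: Link Capabilities           */
    PCIE_CAP_OTHER          /* anything else, or outside the structure  */
};

/* Classify a cfg_dwaddr access relative to an assumed capability base
 * (a byte offset such as 0x58). */
enum pcie_cap_reg classify_pcie_cap_access(unsigned dwaddr, unsigned cap_base)
{
    unsigned byte_off = dwaddr * 4u;
    if (byte_off < cap_base)
        return PCIE_CAP_OTHER;
    switch (byte_off - cap_base) {
    case 0x00: return PCIE_CAP_HEADER;
    case 0x04: return PCIE_DEV_CAPS;
    case 0x08: return PCIE_DEV_CTRL_STATUS;
    case 0x0C: return PCIE_LINK_CAPS;
    default:   return PCIE_CAP_OTHER;
    }
}
```

With a base of 0x58, address 017 lands on Device Capabilities rather than on the capability header, so the C0 in the low byte would not have to be a capability ID. Address 019 lands on Link Capabilities, consistent with what I see.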

So, after all, my major problem is that I cannot activate MSI in the FPGA, as seen from the MSI capability output and from the cfg_interrupt_msienable signal. Another concern is that Pcie_readRegs() does not seem to trigger actual reads on the endpoint.

Please comment or suggest what else to check. I would appreciate any hint.
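To make the intended sequence explicit, here is a minimal sketch of the read-modify-write steps I am trying to perform on the endpoint. It is not the TI LLD API: it operates on a plain 256-byte array standing in for the endpoint's configuration space, and all helper names are hypothetical. In the real system each access would have to be a configuration read or write TLP issued by the RC. The register offsets and bit positions are the standard PCI ones.

```c
#include <stdint.h>

#define CFG_COMMAND       0x04u        /* Command register (16 bit)     */
#define CFG_STATUS        0x06u        /* Status register (16 bit)      */
#define CFG_CAP_PTR       0x34u        /* Capabilities pointer (8 bit)  */
#define STATUS_CAP_LIST   (1u << 4)    /* capability list present       */
#define CMD_MEM_SPACE     (1u << 1)
#define CMD_BUS_MASTER    (1u << 2)
#define CMD_INTX_DISABLE  (1u << 10)
#define CAP_ID_MSI        0x05u
#define MSI_CTRL_ENABLE   (1u << 0)    /* Message Control bit 0         */

/* Hypothetical accessors on a simulated, little-endian config space. */
static uint16_t rd16(const uint8_t *cfg, unsigned off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void wr16(uint8_t *cfg, unsigned off, uint16_t v)
{
    cfg[off]     = (uint8_t)(v & 0xFFu);
    cfg[off + 1] = (uint8_t)(v >> 8);
}

/* Walk the capability list; return the byte offset of capability `id`,
 * or 0 if it is not found. */
unsigned find_capability(const uint8_t *cfg, uint8_t id)
{
    if (!(rd16(cfg, CFG_STATUS) & STATUS_CAP_LIST))
        return 0;
    unsigned off = cfg[CFG_CAP_PTR] & 0xFCu;
    for (int guard = 0; off != 0 && guard < 48; guard++) {
        if (cfg[off] == id)
            return off;
        off = cfg[off + 1] & 0xFCu;   /* follow the next pointer */
    }
    return 0;
}

/* Set MSI Enable, then memory space, bus master and INTx disable.
 * Returns 0 on success, -1 if no MSI capability is found. */
int enable_msi_and_bus_master(uint8_t *cfg)
{
    unsigned msi = find_capability(cfg, CAP_ID_MSI);
    if (msi == 0)
        return -1;
    wr16(cfg, msi + 2, (uint16_t)(rd16(cfg, msi + 2) | MSI_CTRL_ENABLE));
    wr16(cfg, CFG_COMMAND, (uint16_t)(rd16(cfg, CFG_COMMAND)
         | CMD_MEM_SPACE | CMD_BUS_MASTER | CMD_INTX_DISABLE));
    return 0;
}
```

The sketch leaves out programming the MSI message address and data registers, which the host also has to do before the endpoint can deliver an interrupt to the right place.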

Thanks in advance.

  • Hi,

    In the MCSDK PCIe sample example, two DSP EVMs are used to test the PCIe driver. DSP 1 is configured as a Root Complex and DSP 2 is configured as End Point. Once the PCIe link is established, the following sequence of actions will happen:
    1. DSP 1 sends data to DSP 2 - DSP 2 waits to receive all the data
    2. DSP 2 sends the data back to DSP 1
    3. DSP 1 waits to receive all the data
    4. DSP 1 verifies if the received data matches the sent data and declares test pass or fail.

    Have you implemented the same sequence on the EP (FPGA) side?

    Please go through the wiki below for PCIe-related FAQs and resources.
    processors.wiki.ti.com/.../PCI_Express_(PCIe)_Resource_Wiki_for_Keystone_Devices

    Thanks,
  • Hello,
    Sure, the sample from MCSDK was used as a starting point, and we used it for DWORD-by-DWORD access. However, the actual design is quite different. One thing to note is the way the devices are configured. In the MCSDK example both RC and EP run on DSPs, so one DSP configures its PCIe subsystem (including, for example, the BARs) as RC, and the other DSP configures itself as EP. With an FPGA that is not possible: the PCIe subsystem on the FPGA side has to be configured by the RC. In other words, the RC configures its local BARs and the remote BARs in the EP. In computer systems that is done by host software during the enumeration process. Yes, I know TI does not provide enumeration code for RC; I picked up the idea from a contributor on this forum. With that I was able to configure the remote BARs in the FPGA, and they respond properly to accesses.
    Another big issue is that most samples are written for DWORD-by-DWORD access. The performance of that method is unacceptable; DMA with multi-DWORD payload TLPs is the answer. But again, when testing DSP against DSP, one can use DMA on the DSP side to perform efficient writes. With an FPGA that is again not the case. The simple design can handle just one-DWORD accesses, either read or write, in PIO mode. Any multi-DWORD TLP handling assumes DMA and bus mastering on the FPGA side; at least, no one has reported implementing multi-DWORD PIO. So I have implemented the reference design by Xilinx. They provide a Windows driver and application to verify it, and that seemed to work fine. Now that it comes to making a similar application on the DSP, I am stuck in the situation described.
    That application is bus-mastering DMA. The host system running the RC writes to the EP application registers and triggers a transfer to be carried out by the DMA block in the FPGA. That block makes (or rather, is supposed to make) reads from the DSP and writes to the DSP. Upon transfer completion the DMA engine signals an interrupt to the host.
    Technically, I understand the sequence of steps required. The trouble is that the observed response does not match my expectations. For instance, I don't see activity on the configuration interface of the FPGA's endpoint when configuration space registers are read. As I mentioned in the original post, the values read from the peripheral seem to be wrong (no capId, no next pointer), and multiple accesses are observed in the FPGA only after a write to the Command register of the endpoint.