C6678 DSP PCIe issue with credits

We have a performance issue with the PCIe lanes of the Shannon DSP.

We are running the following SW lib (Version #: 0x01000003; string PCIE LLD Revision: 01.00.00.03)

We have connected a Lattice FPGA with a Lattice PCIe core using a single PCIe lane.

When setting up a DMA data transmission from the FPGA to the DSP via PCIe, the achievable data rate is lower than expected. Looking at the traffic with a logic analyzer, we found that the reason is the fairly low number of credits (0x2F) that the DSP's PCIe interface issues to the FPGA.

We have found no documentation about how to influence this.

Any help appreciated.

  • Markus,

    May I ask what data rate you achieve and what you expected?

    And how are the PCIe link (single lane, but Gen1 or Gen2) and the EDMA (DBS = 64 B or 128 B, ACNT/BCNT) configured for the transfer?

    The Throughput Performance Guide discusses PCIe throughput and the factors affecting it. It gives realistic data rates for the C66x PCIe module with 2 lanes, Gen2, and different EDMA DBS configurations, which you can use as a reference for comparison.

    The credits are managed automatically. We can check whether the other factors (section 11.1 of the Throughput Performance Guide) affect your data rate and how to get better performance, so please share more information about the configuration. Thanks a lot.

    Sincerely,

    Steven

  • Hi Steven,

    Please let me continue this thread and answer your questions.

    Our connection between FPGA and DSP is one lane at 2.5 Gbps. We do not use the EDMA; a DMA engine inside the FPGA transfers 256-byte packets. Throughput is slightly above 1.2 Gbps; I had expected, or at least hoped for, about 1.6 Gbps or more.

    I read the Throughput Performance Guide, but it only covers a connection between two DSPs, where the internal EDMA limits the transfer. This is not the case in our implementation.
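For reference, the expectation above can be checked against a rough ceiling for one Gen1 lane. This is a minimal sketch, not a figure from the thread: it assumes 8b/10b line coding, a 3-DW memory-write header, and 20 bytes of per-TLP overhead (framing, sequence number, header, LCRC), and it ignores DLLP traffic and flow-control stalls. The function name `payload_ceiling_gbps` is made up for illustration.

```python
# Rough upper bound on posted-write payload throughput for PCIe Gen1.
LINE_RATE_GBPS = 2.5          # Gen1 raw signalling rate per lane
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding used by Gen1 and Gen2

# Assumed per-TLP overhead in bytes: framing (2) + sequence number (2)
# + 3-DW memory-write header (12) + LCRC (4). A 4-DW header adds 4 more.
TLP_OVERHEAD_BYTES = 2 + 2 + 12 + 4


def payload_ceiling_gbps(payload_bytes: int, lanes: int = 1) -> float:
    """Payload throughput ceiling, ignoring DLLPs and flow-control stalls."""
    link_gbps = LINE_RATE_GBPS * ENCODING_EFFICIENCY * lanes
    return link_gbps * payload_bytes / (payload_bytes + TLP_OVERHEAD_BYTES)


if __name__ == "__main__":
    ceiling = payload_ceiling_gbps(256)
    print(f"ceiling for 256-byte writes: {ceiling:.2f} Gbps")
    print(f"measured 1.2 Gbps is {1.2 / ceiling:.0%} of that ceiling")
```

Under these assumptions the ceiling for 256-byte writes lies above 1.6 Gbps, so the expectation looks reasonable for the link itself and the shortfall has to come from somewhere else, such as credit stalls.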

    Watching the posted data credits offered to my VHDL implementation, I see that only two packets of 256 bytes are transferred before transmission pauses because not enough credits are available. After a credit update the next two packets are transmitted, and so on. Sometimes the update releases only part of the credits rather than all of them, but this does not slow the overall transmission.
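The two-packet pattern described above matches simple credit arithmetic. In PCIe flow control one data credit covers 4 DW (16 bytes) of payload, so a 256-byte write consumes 16 data credits. The sketch below assumes that the 0x2F value seen on the logic analyzer is the advertised posted data credit count; the thread does not state this explicitly. The helper names are hypothetical.

```python
import math

DATA_CREDIT_BYTES = 16  # PCIe flow control: one data credit covers 4 DW


def data_credits_per_tlp(payload_bytes: int) -> int:
    """Posted data credits one memory write of the given payload consumes."""
    return math.ceil(payload_bytes / DATA_CREDIT_BYTES)


def tlps_per_credit_window(credit_limit: int, payload_bytes: int) -> tuple[int, int]:
    """Return (whole TLPs that fit, credits left over) for one credit window."""
    return divmod(credit_limit, data_credits_per_tlp(payload_bytes))


if __name__ == "__main__":
    advertised = 0x2F  # assumed to be the posted data credit count
    fit, left = tlps_per_credit_window(advertised, 256)
    print(f"{advertised} credits: {fit} TLPs of 256 bytes, {left} credits unused")
```

With 47 credits, two 256-byte writes fit and 15 credits remain, one short of what a third write would need. That would explain why the transmitter stalls after every second packet until a credit update arrives.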

    The number of posted header credits is always greater than zero and does not influence the transmission.

    Questions are:

    • Can we influence the number of credits?
    • Can we influence the credit update period?
    • Are there more credits or a shorter update period when using 2 lanes?

    Regards,

    Frank

  • Frank,

    I think flow control credits in the C66x PCIe module are managed automatically, so there seems to be no control over the number of credits or the update period.

    Using 2 lanes will definitely give you more throughput. The credits might be doubled as well.

    Another thing you could try is to reduce the data burst size from 256 bytes to 128 bytes, if you can control the DMA in the FPGA. We can then see whether requiring fewer credits per transaction also helps the throughput.
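To get a feel for what a smaller burst size could change, the sketch below compares how many payload bytes fit into one credit window for different write sizes. It assumes the reported 0x2F is the posted data credit limit and that one data credit covers 16 bytes; `bytes_per_credit_window` is a made-up helper, not part of the PCIe LLD.

```python
DATA_CREDIT_BYTES = 16   # one PCIe data credit covers 16 bytes of payload
CREDIT_LIMIT = 0x2F      # value from the logic analyzer, assumed to be posted data credits


def bytes_per_credit_window(payload_bytes: int, credit_limit: int = CREDIT_LIMIT) -> int:
    """Payload bytes that can be sent before the transmitter runs out of data credits."""
    credits_per_tlp = -(-payload_bytes // DATA_CREDIT_BYTES)  # ceiling division
    return (credit_limit // credits_per_tlp) * payload_bytes


if __name__ == "__main__":
    for size in (64, 128, 256):
        print(f"{size:3d}-byte writes: {bytes_per_credit_window(size)} bytes per window")
```

Smaller writes waste fewer leftover credits, so more payload fits into each window. They also carry more header overhead per byte and use more header credits, so whether throughput actually improves depends on how quickly the DSP returns credits and would need to be measured.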

    Sincerely,

    Steven