Hello,
I'm currently trying to optimize PCIe transfers between two TDA2xx devices. The test I'm running can be found here: "<PROCESSOR_SDK_ROOT>\ti_components\drivers\pdk_01_08_00_16\packages\ti\csl\example\pcie". One TDA2xx serves as RC and the other serves as EP. The test sends 16MB of data from RC to EP; the data is then looped back from EP to RC, where it is compared to the original data. If the two are identical, the test is successful. The system EDMA is used to perform the transfers.
In its current state, without any modifications, the test reports a PCIe throughput of 370 MBps when both RC and EP use Gen2 transfer speed.
After reading through some documentation, I found out that, to achieve maximum performance, these requirements need to be met:
1) The EDMA transfer parameters are to be set so that the transfer controller submits the least possible number of transfer requests
2) The EDMA TC default burst size is to be equal to the PCIe payload size
As for bullet 1), I referred to a table found in SPRAC21.
In the test, the system EDMA's TC0 is used. I have used EDMA before, so I was able to experiment with the (A/B/C)CNT values. For example, the original test, which has ACNT=16K, BCNT=16M/16K (i.e. 1024) and CCNT=1 (values that do not comply with the SPRAC21 table), gives a throughput of 370 MBps. When I set the values to ACNT=128, BCNT=128 and CCNT=1024 (as recommended in the table), I get 330 MBps, which is worse. What else should I try regarding the transfer parameters only?
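To make the comparison between the two configurations concrete, here is a minimal sketch of the count arithmetic. It is plain C, not PDK code: the `edma_counts_t` type and the function names are hypothetical, and it assumes the usual EDMA3 rule that an A-synchronized transfer moves ACNT bytes per transfer request while an AB-synchronized transfer moves ACNT*BCNT bytes per transfer request. Which synchronization mode the PDK example actually uses is not stated above, so both are shown.

```c
#include <stdint.h>

/* Hypothetical container for the three EDMA counts; not a PDK type. */
typedef struct {
    uint32_t acnt; /* bytes per array   */
    uint32_t bcnt; /* arrays per frame  */
    uint32_t ccnt; /* frames per block  */
} edma_counts_t;

/* Total number of bytes described by one set of counts. */
static inline uint64_t edma_total_bytes(edma_counts_t c)
{
    return (uint64_t)c.acnt * c.bcnt * c.ccnt;
}

/* Number of transfer requests needed to move the whole block.
 * ab_sync != 0: one request per frame (AB-synchronized).
 * ab_sync == 0: one request per array (A-synchronized). */
static inline uint64_t edma_num_trs(edma_counts_t c, int ab_sync)
{
    return ab_sync ? (uint64_t)c.ccnt : (uint64_t)c.bcnt * c.ccnt;
}

/* Number of bytes moved by a single transfer request. */
static inline uint64_t edma_bytes_per_tr(edma_counts_t c, int ab_sync)
{
    return ab_sync ? (uint64_t)c.acnt * c.bcnt : (uint64_t)c.acnt;
}
```

Under these assumptions, both configurations describe the same 16 MB. ACNT=16K, BCNT=1024, CCNT=1 gives 1024 requests of 16 KB in A-sync mode or a single 16 MB request in AB-sync mode. ACNT=128, BCNT=128, CCNT=1024 gives 131072 requests of 128 B in A-sync mode or 1024 requests of 16 KB in AB-sync mode.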
Regarding bullet 2), that's where I need some clarification.
- I cannot confirm that the EDMA's TC0 is indeed using a DBS of 128B. The DBS setting appears to be defined by the CTRL_CORE_CONTROL_IO_1_TC0_DEFAULT_BURST_SIZE_MASK macro in "<PROCESSOR_SDK_ROOT>\ti_components\drivers\pdk_01_08_00_16\packages\ti\csl\soc\tda2xx\hw_ctrl_core.h", but that macro is not used anywhere, and I don't know where to find the code that configures the TCs.
- When we say PCIe payload size, does that mean the size of the data sent in a TLP, excluding the headers that are added to it? Also, where can I set the payload size of the PCIe transfers? I suppose it can be 128B at most, which I concluded from a snippet found in SPRABK5B. Both of my devices are KeyStone, aren't they (they are TDA2xx)?
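For reference on the payload-size question, here is a small sketch of how the PCI Express Base Specification encodes Max_Payload_Size. It counts only the data payload of a TLP, not the header. The field positions below (Device Capabilities bits 2:0, Device Control bits 7:5) come from the specification; the function and macro names are hypothetical, and how to read these registers on TDA2xx through the PDK is not shown. The efficiency helper takes the per-TLP overhead as a parameter; 20 bytes is an assumed figure for a Gen1/Gen2 link with a 3DW header and no ECRC (12 B header, 2 B sequence number, 4 B LCRC, 2 B framing).

```c
#include <stdint.h>

/* Field positions per the PCI Express Base Specification. */
#define PCIE_DEVCAP_MPS_SUPPORTED_MASK 0x00000007u /* Device Capabilities [2:0] */
#define PCIE_DEVCTL_MPS_MASK           0x00E0u     /* Device Control [7:5]      */
#define PCIE_DEVCTL_MPS_SHIFT          5u

/* Convert a 3-bit payload-size code to bytes: 0 -> 128, 1 -> 256, ... 5 -> 4096.
 * Codes 6 and 7 are reserved; 0 is returned for them. */
static inline uint32_t pcie_mps_code_to_bytes(uint32_t code)
{
    return (code <= 5u) ? (128u << code) : 0u;
}

/* Max_Payload_Size currently programmed in the Device Control register. */
static inline uint32_t pcie_devctl_mps_bytes(uint16_t devctl)
{
    return pcie_mps_code_to_bytes((devctl & PCIE_DEVCTL_MPS_MASK) >> PCIE_DEVCTL_MPS_SHIFT);
}

/* Largest payload the device reports it supports (Device Capabilities). */
static inline uint32_t pcie_devcap_mps_supported_bytes(uint32_t devcap)
{
    return pcie_mps_code_to_bytes(devcap & PCIE_DEVCAP_MPS_SUPPORTED_MASK);
}

/* Fraction of the bytes on the link that are payload, for a given
 * payload size and per-TLP overhead (both in bytes). */
static inline double pcie_payload_efficiency(uint32_t payload_bytes, uint32_t overhead_bytes)
{
    uint32_t total = payload_bytes + overhead_bytes;
    return (total == 0u) ? 0.0 : (double)payload_bytes / (double)total;
}
```

With the assumed 20 bytes of overhead, a 128B payload works out to 128/148, roughly 86% of the link bytes carrying data. This ignores DLLPs, flow control and read-completion behaviour, so it is only an upper bound on the payload share.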
I'm currently learning about PCIe and how it works, so please forgive my confusion. I would just like some help optimizing this particular test, specifically where in the code all these parameters are set and how to set them correctly.
Sorry for the long post and thank you in advance.
Regards,
Nick