AM5728: PCIe EDMA burst size

Part Number: AM5728

Hi

I am using a custom AM5728 board with ti-processor-sdk-rtos-am57xx-evm-04.03.00.05 and ti-processor-sdk-linux-am57xx-evm-04.03.00.05.

We have an EDMA test demo that uses the path ARM IPC -> DSP -> EDMA -> PCIe -> FPGA/C6678.

By changing DSP_SYS_BUS_CONFIG[1:0] we can change the EDMA default burst size (DBS), and we measured the following rates (a register-write sketch follows the table):

| EP device | Link status | Write rate (MB/s) | Read rate (MB/s) | DBS (bytes) |
|-----------|-------------|-------------------|------------------|-------------|
| FPGA      | Gen2 x1     | 211.04            | 145.46           | 16          |
| FPGA      | Gen2 x1     | 290.91            | 219.98           | 32          |
| FPGA      | Gen2 x1     | 359.86            | 299.30           | 64          |
| FPGA      | Gen2 x1     | 359.86            | 300.75           | 128         |
| C6678     | Gen2 x1     | 211.04            | 178.38           | 16          |
| C6678     | Gen2 x1     | 290.91            | 256.20           | 32          |
| C6678     | Gen2 x1     | 359.86            | 327.66           | 64          |
| C6678     | Gen2 x1     | 357.79            | 325.94           | 128         |
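For reference, here is a minimal sketch of the kind of DBS change described above, assuming a memory-mapped read-modify-write of DSP_SYS_BUS_CONFIG from DSP code. The register address below is a placeholder and the [1:0] encoding (0/1/2/3 selecting 16/32/64/128 bytes) is an assumption; check the AM572x TRM for the actual address and field definition.

```c
#include <stdint.h>

/* Placeholder address: look up the actual DSP_SYS_BUS_CONFIG location
 * in the AM572x TRM before using this. */
#define DSP_SYS_BUS_CONFIG_ADDR   (0x01D00000u + 0x48u)   /* hypothetical */

/* Assumed encoding of bits [1:0]: the thread only states that these bits
 * select the EDMA default burst size (16/32/64/128 bytes). */
typedef enum { DBS_16B = 0u, DBS_32B = 1u, DBS_64B = 2u, DBS_128B = 3u } DbsCode;

static void setEdmaDefaultBurstSize(DbsCode dbs)
{
    volatile uint32_t *reg = (volatile uint32_t *)DSP_SYS_BUS_CONFIG_ADDR;
    uint32_t val = *reg;

    val &= ~0x3u;           /* clear the DBS field, bits [1:0] */
    val |=  (uint32_t)dbs;  /* program the new burst-size code */
    *reg = val;
}
```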

rate = 5 Gb/s * num_lanes * 8/10 * MPS / (MPS + TLP overhead (24 bytes))
 * write rate = 5 Gb/s * 1 * 8/10 * 64 / (64 + 24)  = 363.64 MB/s
 * read rate  = 5 Gb/s * 1 * 8/10 * 128 / (128 + 24) = 421.05 MB/s
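The same calculation expressed in code, for clarity (this simply restates the formula above; the 24-byte figure is the per-TLP overhead assumed in the post):

```c
#include <stdio.h>

/* Theoretical PCIe throughput per the formula above:
 * rate = 5 Gb/s * lanes * 8/10 * payload / (payload + 24-byte TLP overhead) */
static double pcie_rate_mb_s(int lanes, int payload_bytes)
{
    const double line_rate_mb_s = 5.0e9 / 8.0 / 1.0e6;   /* 625 MB/s raw per lane */
    const double encoding = 8.0 / 10.0;                   /* Gen2 8b/10b encoding  */
    const double efficiency =
        (double)payload_bytes / (payload_bytes + 24.0);   /* TLP overhead          */

    return line_rate_mb_s * lanes * encoding * efficiency;
}

int main(void)
{
    printf("write (64B payload):  %.2f MB/s\n", pcie_rate_mb_s(1, 64));   /* ~363.64 */
    printf("read  (128B payload): %.2f MB/s\n", pcie_rate_mb_s(1, 128));  /* ~421.05 */
    return 0;
}
```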
 
Q1: From the test data above, the write rate has essentially reached the theoretical limit, but there is still a noticeable gap in the read rate. Also, with DBS set to 128, the measured rate is the same as with DBS = 64. Does this mean that a DBS setting of 128 is actually transferred as 64-byte bursts?
If you have any suggestions, please let me know.
Thanks.
  • What is your configuration of the following PCIe settings:

    MRRS - Max_Read_Request_Size

    MPS - Max_Payload_Size

  • Hi B.C.

    Thanks for your reply.

    From the lspci tool, MPS is 128 bytes and MRRS is 512 bytes (a decode sketch follows this reply).

    From the datasheet, the inbound MPS is 256 bytes and the outbound MPS is 64 bytes.

    Thanks.
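    For reference, lspci derives these numbers from the PCIe Device Control register: bits [7:5] are Max_Payload_Size and bits [14:12] are Max_Read_Request_Size, both encoded as 128 << n bytes per the PCIe base specification. A small decode sketch, assuming you already have the 16-bit register value:

    ```c
    #include <stdint.h>
    #include <stdio.h>

    /* Decode MPS and MRRS from the PCIe Device Control register. */
    static unsigned mps_bytes(uint16_t devctl)  { return 128u << ((devctl >> 5) & 0x7); }
    static unsigned mrrs_bytes(uint16_t devctl) { return 128u << ((devctl >> 12) & 0x7); }

    int main(void)
    {
        uint16_t devctl = 0x2000;  /* example: MPS code 0 (128 B), MRRS code 2 (512 B) */
        printf("MPS  = %u bytes\n", mps_bytes(devctl));   /* 128 */
        printf("MRRS = %u bytes\n", mrrs_bytes(devctl));  /* 512 */
        return 0;
    }
    ```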

  • Thanks. Can you test EDMA throughput alone (without PCIe) by doing a memory-to-memory copy on the AM5728? This will help isolate whether it is an EDMA or a PCIe issue.

  • Hi, B.C.

    I think you are right: with EDMA + PCIe together, things get strange. I have done an L2SRAM -> DDR3 test before; here is my test data:

     

    | Transfer            | Write rate (MB/s) | Read rate (MB/s) | DBS (bytes) |
    |---------------------|-------------------|------------------|-------------|
    | DSP1 L2SRAM -> DDR3 | 1197.29           | 798.19           | 16          |
    | DSP1 L2SRAM -> DDR3 | 2305.90           | 1518.52          | 32          |
    | DSP1 L2SRAM -> DDR3 | 2829.96           | 2706.92          | 64          |
    | DSP1 L2SRAM -> DDR3 | 3112.96           | 3276.80          | 128         |

    By the way, I set the MPU frequency to 1.5 GHz, the DSP frequency to 750 MHz, and the DDR3 frequency to 1066 MHz with a 32-bit bus width, so the theoretical bandwidth should be about 4 GB/s.
    If my calculations are correct, we are close to the theoretical performance, and there is still a large difference between DBS = 64 and DBS = 128.
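    As a sanity check of that figure, here is the arithmetic in code, assuming "1066 MHz" refers to the DDR3-1066 data rate (1066 MT/s) on the 32-bit bus:

    ```c
    #include <stdio.h>

    int main(void)
    {
        /* Assumption: "1066 MHz" means the DDR3-1066 data rate, i.e. 1066 MT/s. */
        const double transfers_per_s = 1066e6;
        const double bus_bytes = 32 / 8;   /* 32-bit DDR3 data bus */

        double bw_mb_s = transfers_per_s * bus_bytes / 1e6;
        printf("theoretical DDR3 bandwidth: %.0f MB/s (~4 GB/s)\n", bw_mb_s);  /* ~4264 */
        return 0;
    }
    ```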

    This is very strange. The maximum inbound payload size of the PCIe controller is 256 bytes, so 128 bytes should not be hitting a hardware limit.

  • Hi B.C.,

    Any update here?

  • CY,

    The overall read bandwidth for EDMA is most sensitive to read latency. In this scenario, with PCIe (high latency) on the read side of the transfer, it looks like the higher latency limits performance before the actual wire rate does (a rough illustration of this effect follows this reply).

    Regards,

    Kyle
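    To illustrate the effect described above, here is a rough back-of-the-envelope model (the latency and outstanding-request numbers below are illustrative assumptions, not measurements): sustained read bandwidth is roughly the number of bytes that can be in flight divided by the round-trip read latency, so a higher-latency read target caps throughput well below the wire rate.

    ```c
    #include <stdio.h>

    /* Rough model: read bandwidth is bounded by bytes in flight over latency.
     * bw ~= outstanding_requests * burst_bytes / round_trip_latency */
    static double read_bw_mb_s(int outstanding, int burst_bytes, double latency_us)
    {
        return (double)outstanding * burst_bytes / latency_us;  /* bytes/us == MB/s */
    }

    int main(void)
    {
        /* Illustrative numbers only. */
        printf("low-latency target  (~0.3 us): %.0f MB/s\n", read_bw_mb_s(4, 128, 0.3));
        printf("high-latency target (~1.5 us): %.0f MB/s\n", read_bw_mb_s(4, 128, 1.5));
        return 0;
    }
    ```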

  • Hi, Kyle

    Thanks for your reply.

    1) I agree with you: the EDMA read path seems to hit a bottleneck, and the latency is larger.

    2) The AM5728 RTOS SDK has an EDMA PCIe demo which shows a write rate of 337 MB/s and a read rate of 331 MB/s (5 Gb/s, 1 lane). Does that mean the write rate reaches 98% of the theoretical rate while the read rate only reaches 78%?

    3) The following forum link seems to have reached a preliminary conclusion that the maximum DBS is 64 on the Keystone architecture. Does that conclusion have any relevance on the DaVinci-based architecture?

    Thanks

    CY

  • 2. I think you miscalculated the write percentage, but your basic understanding appears correct: 337/363 = 93%.

    3. No, I don't believe so. The underlying architectures of these two devices are different.

    Regards,

    Kyle

  • Hi, Kyle

    This information is enough for us.

    Thanks,

    CY.