Hi,
In the “Throughput Performance Guide for C66x KeyStone Devices” (section 12, PCIe), it is stated that memory read (MRd) performs better than memory write (MWr).
In my test (PCIe between two Shannon devices) I measured:
- Using the CPU (data size = 4 MB): MWr throughput = 1.06 Gbps vs. MRd throughput = 0.05 Gbps
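For reference, a throughput figure like the ones above can be derived from the number of bytes moved and the elapsed time. This is only a minimal sketch: the helper name `throughput_gbps` is hypothetical, and it assumes Gbps means 10^9 bits per second and that the elapsed time has already been measured (for example with a cycle counter) and converted to seconds.

```c
#include <stdint.h>

/* Hypothetical helper (not from any TI library): converts a transfer size in
 * bytes and an elapsed time in seconds into Gbps, taking 1 Gbps = 10^9 bit/s.
 * Returns 0.0 for a non-positive elapsed time to avoid dividing by zero. */
static double throughput_gbps(uint64_t bytes, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    return ((double)bytes * 8.0) / seconds / 1e9;
}
```

With this convention, a 4 MB (4 * 1024 * 1024 bytes) transfer that completes in about 31.7 ms corresponds to roughly 1.06 Gbps.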
May I ask:
1- With the CPU, is MWr faster than MRd because it is a posted transaction, so I send the data (request TLP) without waiting for a completion? Or does the CPU also receive a completion (an interrupt) for a memory write, indicating that the completer has received the data (transaction OK)? Or is there some other reason?
2- In contrast, using EDMA (MSMC to MSMC), MRd seems to be faster than MWr. Why is there this difference? I would expect MWr to be faster, since it is posted.
3- Is it normal for the performance to depend on the data size?
E.g. using EDMA0/1/2 with DBS=128:
--- with data < 4 KB, MWr is faster than MRd
--- with data > 4 KB, MRd is faster than MWr
In contrast, with DBS=64, MWr is always faster than MRd.
4- The “EDMA User Guide” states that EDMA0 is optimized for MSMC/DDR transfers, but in my tests EDMA1/2 perform better than EDMA0 for PCIe MWr with data sizes > 1 KB (up to 4 MB). Conversely, EDMA0 performs better with data sizes < 1 KB.
However, for MRd I get the same throughput with EDMA0, EDMA1, and EDMA2.
So why does EDMA0 behave differently for MWr?
I would appreciate a more detailed explanation, please.