Hello,
For research purposes, I am measuring how long stores and loads take to execute when targeting the DDR3 memory and the OCMC RAM. To do so, I execute them in streams and record the time using either the performance counters on the ARM Cortex-A15 cores or the time stamp register on the C66x DSPs. Everything runs bare-metal with the data caches disabled. I am using the default Sitara AM5728 GEL configuration files.
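A minimal sketch of this kind of measurement is shown here to make the method concrete. It is illustrative only and not the exact code behind the numbers in this post. The `read_cycles_fn` hook, the function names, and the `overhead` parameter are assumptions: on the target, the hook would wrap the Cortex-A15 PMU cycle counter or the C66x time stamp register, and a 32-bit counter value is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hook returning the current 32-bit cycle count. On the target
 * it would wrap the Cortex-A15 PMU cycle counter or the C66x time stamp
 * register; that platform-specific part is not shown here. */
typedef uint32_t (*read_cycles_fn)(void);

/* Average cycles per access from two counter samples.
 * Unsigned subtraction keeps the result valid across one counter wrap.
 * `overhead` is the separately measured cost of sampling the counter. */
static inline uint32_t cycles_per_access(uint32_t start, uint32_t end,
                                         uint32_t overhead, uint32_t count)
{
    assert(count > 0);
    uint32_t elapsed = end - start;
    elapsed = (elapsed > overhead) ? (elapsed - overhead) : 0;
    return elapsed / count;
}

/* Time a stream of `count` word stores to `dst`.
 * No memory barrier is issued, so buffered writes may not be fully counted. */
static inline uint32_t time_store_stream(volatile uint32_t *dst, uint32_t count,
                                         read_cycles_fn read_cycles,
                                         uint32_t overhead)
{
    uint32_t start = read_cycles();
    for (uint32_t i = 0; i < count; i++) {
        dst[i] = i;
    }
    uint32_t end = read_cycles();
    return cycles_per_access(start, end, overhead, count);
}

/* Time a stream of `count` word loads from `src`.
 * The volatile qualifier keeps the compiler from removing the reads. */
static inline uint32_t time_load_stream(const volatile uint32_t *src,
                                        uint32_t count,
                                        read_cycles_fn read_cycles,
                                        uint32_t overhead)
{
    uint32_t sink = 0;
    uint32_t start = read_cycles();
    for (uint32_t i = 0; i < count; i++) {
        sink += src[i];
    }
    uint32_t end = read_cycles();
    (void)sink;
    return cycles_per_access(start, end, overhead, count);
}
```

In such a setup, `dst` or `src` would point into the DDR3 or OCMC RAM address range, and the same loop would be run once per memory and per core type.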
When I measure stores and loads from the C66x to the DDR3 memory, I obtain around 8 and 178 cycles respectively. This is consistent with results that I have obtained on other Texas Instruments MPSoCs such as the Keystone II. However, I am confused by the results for the Cortex-A15. In this case, I get around 58 cycles for stores and 27 cycles for loads. The store value is surprising for two reasons: (1) it is very different from the one obtained on the Keystone II, and (2) I would have expected stores to take less time than loads. I observe the same behavior when using the DDR3 memory controller performance counters.
In addition, when measuring the store time from the Cortex-A15 to the OCMC RAM, I see that it is much higher than the store time to the DDR3 memory. This does not happen with the DSPs on this SoC, nor with either core type when targeting the MSMC SRAM on the Keystone II.
Therefore, my questions are the following:
- Is this caused by an incorrect register configuration, or is it the expected behavior?
- It seems wrong to me that writing to the on-chip RAM takes longer than writing to the SDRAM. What should I expect here?
Thank you in advance for your help!
Regards,