66AK2H12: About MSMC & DDR Starvation Counters and DDR CoS

Part Number: 66AK2H12

Hello all, I hope you are all fine.
I have been experimenting with the MSMC and DDR starvation counters and DDR CoS for a while. My main goal is to change the throughput allocated to specific cores. I first experimented with the MDMAARB register, and the results were OK. Now I am experimenting with the starvation counters and CoS.
My program works as follows:
Caching is enabled on all of the cores. All cores start transferring data simultaneously: L2 to MSMC, MSMC to L2, L2 to DDR, or DDR to L2. In the shared memory regions, each core accesses a different portion of the memory. At the beginning and end of each transfer, cores initiate their CSL timers to record the time. They then calculate the throughput from that time and the data size. To avoid measurement errors due to caching, I write back the cache to DDR in the DDR transfer cases. My transfer size is 16KB for each core, and I ran the project with different numbers of cores.
I expected to observe different throughput results for those cores after changing their register values, but I did not observe any difference. The results were very similar to the default case, where the starvation counters and CoS settings are left at TI's default values.
What could be causing this? Is it expected? How can I change the throughput allocated to a core with those registers?
Best Regards.

  • Hi,

    >>>> cores initiate their CSL timers to record the time. >>>> Isn't TSCL used for the timestamp? TSCL counts CPU cycles, whereas the timer runs at CPU/6.

    >>>> My transfer size is 16KB for each core >>>> Is this an EDMA transfer or a CPU transfer? Do you have multiple transfers in parallel or just one transfer at a time? What throughput numbers did you get?

    Regards, Eric

  • Yes, I use TSCL. I read it at the start of the memcpy and again after the transfer completes. Then I take the difference and compute the throughput from it and the data size.

    I use CPU transfers, running in parallel on multiple cores simultaneously. I also have EDMA versions of these tests, but I will stick to the CPU for now.

    For example, let's say 7 cores do simultaneous DDR to L2 SRAM transfers, with DDR configured as cacheable. For each core, I get around 1500MB/s throughput values. Modifying the MSMC SBND counters or the DDR CoS settings and values does not affect this value much. For example, I gave one core the best counter values and the rest of the cores the worst values.

    The difference is only ±30MB/s with those values.
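The throughput arithmetic described in the reply above can be sketched in C as follows. This is a minimal, host-runnable sketch rather than the poster's actual code: the function names are hypothetical, the CPU clock frequency is an assumption that the caller must supply for their own board, and the two 32-bit counter values stand in for TSCL reads taken before and after the memcpy. The factor of 6 for timer ticks comes from Eric's remark that the timer runs at CPU/6.

```c
#include <stdint.h>

/* Cycles elapsed between two 32-bit counter reads (e.g. TSCL before and
 * after the copy). Unsigned subtraction gives the right answer even if
 * the counter wrapped once between the two reads. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert timer ticks to CPU cycles, assuming the timer runs at CPU/6
 * as stated in the thread. */
uint64_t timer_ticks_to_cpu_cycles(uint64_t ticks)
{
    return ticks * 6u;
}

/* Throughput in MB/s (1 MB = 10^6 bytes) for `bytes` moved in `cycles`
 * CPU cycles. cpu_hz is an assumption supplied by the caller; it is not
 * read from the hardware. Returns 0.0 if no cycles elapsed. */
double throughput_mb_per_s(uint32_t bytes, uint32_t cycles, double cpu_hz)
{
    if (cycles == 0u) {
        return 0.0;
    }
    double seconds = (double)cycles / cpu_hz;
    return ((double)bytes / seconds) / 1.0e6;
}
```

For a 16KB transfer, a call such as `throughput_mb_per_s(16384u, elapsed_cycles(t0, t1), cpu_hz)` yields the per-core figure. If the timestamps come from a timer rather than TSCL, the tick count would first need to pass through `timer_ticks_to_cpu_cycles`, otherwise the reported throughput would be too high by a factor of 6.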

  • Hi,

    >>>>> For each core, I get around 1500MB/s throughput values. >>>>> Can you double-check this number? I am not sure you can get such high throughput using the CPU. I don't have the numbers on hand, but I think it is in the ballpark of hundreds of MB/s. Are you able to use a bigger data chunk, like 64KB or 128KB, for the memcpy?

    There are some numbers for DDR CoS on KeyStone I devices; that test uses EDMA. See the attached PDF. You may see the same on KeyStone II if using EDMA.

    Regards, Eric

     7367.sprabk5a_throughput.pdf

  • I get similar results with data chunks of 64KB or 128KB as well. My results seem reasonable to me when I consider this document (page 36):

    6283.8371.K2 SOC Memory Performance.doc

  • Is there a reason for this issue?