priority levels in MSMC arbitration

hi,

I have a question with MSMC of C6657.

I plan for the 2 cores to share the same memory area in MSM. Core1 should be able to access this memory at any time, with higher priority than all other masters, and Core0 should not access it while Core1 is accessing it.

So I found the "MSMC Bandwidth Management" section in the MSMC user guide, quoted below, but I still wonder what the priority level refers to.

The arbitration scheme attempts to allocate accesses fairly to requestors at the same
priority level. However, it is not sufficient to ensure a bound on the wait times
experienced by lower priority requests. Consequently, requestors could be starved for
access when there is heavy traffic at higher priority levels. To avoid indefinite starvation
for lower priority requests, the MSMC features a bandwidth management scheme that
limits starvation times.

Does the priority level mean the priorities of the bus masters (CorePac0, CorePac1, EDMA3, etc.), which can be configured by chip-level registers?

If not, could you let me know what the priority level means in the MSMC and how to configure it for the scenario above?

Thank you in advance.

bai

  • Bai,

    There is not a direct hardware method to implement the priority that you describe above.

    One way to implement this is to use the HW Semaphore module.

    1. Select one of the available HW Semaphore bits and assign that bit to this resource.
    2. Whenever CorePac1 is ready to access this region of shared MSMC SRAM, CorePac1 will request that Semaphore bit. It should always be available, but CorePac1 should make sure it owns the Semaphore bit before continuing.
    3. When CorePac1 is finished with its accesses to MSMC SRAM, it will release the Semaphore.
    4. CorePac0 must request the same Semaphore bit before each access that it wants to do to this shared region of MSMC SRAM. It must then immediately perform the access and then immediately release the Semaphore in case CorePac1 wants to use it.

There may be other, simpler methods that fit your requirements. Flags or BIOS SEM semaphores may be useful, or you may find other methods in the MultiCore Software Development Kit (MCSDK) that can help you.

    Regards,
    RandyP

  • Hi,

    I want to know the default priority of the CorePacs to access the memory.

At the moment I have the cache disabled, and therefore the prefetch buffer as well. So I have a situation where two cores are accessing the same region of the SL2 memory, and I see memory contention. I want to know what the priority is and how to change it.

    Regards,

    Miguel

  • Miguel,

What are you working on that you would need to be concerned with this level of detail? For almost any development, the great majority of the work can be accomplished by taking all the defaults and not worrying about these details.

What do you mean by "I see a memory contention"? Do you mean that you recognize a situation in which there could be memory contention? Or are you taking benchmarks that prove some conflicts are slowing down one CorePac when a second is running? Or do you mean that an application you are developing is running into bandwidth issues when accessing the same resource because of memory contention?

    If you are working on something that has brought you to this deep level of concern, the answer is not as simple as "the default priority is 1" and "you change it by writing to the CPUARBE register". Changing the priorities will not mean that one high-priority core owns the resource and will never be held off, and it will not mean that another low-priority core will only get the resource if no other master wants it.

    The KeyStone architecture is designed to implement very high memory bandwidth so that as many masters as possible can be getting access to any resources. All of the memory targets and the TeraNet are designed to keep data moving as much as we can.

    There can be issues that come up, so there are priority registers and bandwidth management registers. To understand the whole mechanism, you will need to go through the online training for the KeyStone devices. And you will need to learn the contents of several of our documents. These are complex devices, so there are not simple answers to the detailed questions about the inner workings of the device.

You can go to TI.com and search for KeyStone training online. That would be a good place to start.

    Then you will want to study the Data Manual for your selected device, the MSMC Controller User Guide, and the C66x CorePac User Guide. You can start on those by searching in your pdf-reader for the keyword "priority" (no quotes).

    Regards,
    RandyP

  • Hi Randy,

    Thanks for your reply.

Let me give you more details about my system. First of all, for some special reasons I have disabled the data cache on the KeyStone device, and therefore the prefetch buffer is disabled as well.

Now I have two processes that, when executed sequentially, have roughly the same workload, ~50ms each. They operate on data in the same memory region. When I run the two processes in parallel, one takes around 100ms while the other still takes 50ms. From the execution graph you get the feeling that one is somehow blocking the other.

I think this is expected behavior, given that I am using neither the cache nor the prefetch buffer. What I was wondering is how the arbitration in the MSMC works in this case.

Maybe you can confirm my ideas or clarify the situation somehow.

    Thanks,

    Miguel

  • Miguel,

    Unless there is something in the code holding off the second processor for 50ms, I cannot imagine a way that simple data memory contention could cause 100% blocking during the execution of the first process. And if the blocking were less than 100%, then the second process would take less than 100ms to complete, especially since it has no contention after the first process completes in 50ms. But you are seeing what you are seeing, so I am at a loss to explain it.

    My opinion is that it is a big mistake to have cache turned off. Use some other method of handling coherency between the two CorePacs.

    If you want to understand arbitration, there are several links that I suggested above. I will not try to duplicate that documentation here.

    Regards,
    RandyP

  • Hi Randy,

The processes themselves are identical, and each is just one line of code that performs a mathematical computation. The case where the cache and prefetch buffer are disabled is where one takes 100ms and the other 50ms, as I said. When I enable the cache, both processes take about the same time to execute. So, concretely, I wanted to confirm two things: given that the cache and prefetch buffer are disabled, and that both processes perform a computation simultaneously on the same memory addresses, is memory contention expected here? And if yes, I wanted to understand the arbitration process in the KeyStone in this situation.

The cache is disabled only for experiments on the accuracy of performance estimation. Of course, for a final system this will not be the case.

    Thanks,

    Miguel

  • Miguel,

    Have you tried changing the register I mentioned above to see if the contention is arbitrated any differently?

    Regards,
    RandyP