OMAP-L137 Shared Region cache coherency

Dom Banger

Other Parts Discussed in Thread: OMAP-L137, OMAPL138

Hello,

When I am using a shared region based on SysLink to communicate between the ARM and the DSP, do I have to consider cache coherency on the DSP and/or on the ARM? My shared region is specified as follows in the *.cfg file:

SharedRegion.setEntryMeta(1,
    new SharedRegion.Entry({
        name:           "MessageQ Buffers",
        base:           SR1Mem.base,
        len:            SR1Mem.len,
        ownerProcId:    MultiProc.getIdMeta("HOST"),
        cacheEnable:    false,
        isValid:        true
    })
);

Does the line cacheEnable: false mean that I deactivate cache functionality for this memory area for both the ARM and the DSP?

Thanks in advance,

Dom

over 12 years ago

0 judahvang over 12 years ago

TI__Mastermind 32475 points

Dom,

Yes, you have to consider cache coherency. The SharedRegion setting does not configure the hardware. It simply tells SharedRegion whether or not to make cache calls in its APIs. So the bottom line is that if the cache is really enabled for your shared region, you need to set "cacheEnable" to true.

Judah

0 Dom Banger over 12 years ago in reply to judahvang

Intellectual 355 points

Hello judahvang,

Thanks for your answer. So do I have to set cacheEnable to true for the shared memory of the OMAP-L137 EVM board? And are both the ARM side and the DSP side cached? When working with DSPLINK, I read that only the DSP side is cached. Which of the SharedRegion API calls need to know whether the memory is cached? I simply want to read and write a shared memory buffer from both the DSP and the ARM, so do I have to invalidate/write back the cache on both sides?

Best regards,

Dom

0 judahvang over 12 years ago in reply to Dom Banger

TI__Mastermind 32475 points

Dom,

I don't know if you have to in your case. Just depends on the App. My best guess is that you need to set it on the DSP but not the ARM. Basically, if you can determine if your memory is being cached then set it to true...otherwise set it to false.

There are many APIs within IPC which check this SharedRegion config to determine whether a cache call is needed or not. MessageQ_put and NameServer_get are two examples.

To answer your last question: I think the default would be...Cache enabled on the DSP so you need to use the Cache APIs to make memory coherent. I think the ARM by default does not have the Cache enabled. But again, this could also depend on whether Cache is enabled by your App.

Judah

0 Dom Banger over 12 years ago in reply to judahvang

Intellectual 355 points

Judah,

Thanks again. So I think I have to set cacheEnable in my app and take care of cache coherency myself, because I am writing to the shared region directly without using any IPC function. One remaining question: could you please tell me why one should use the "Cache_wbInv" function? I understand the use of "Cache_wb" when I have written changes to memory and want to write them back from the cache to the underlying memory, and I understand the use of "Cache_inv" when another processor may have changed the memory area and I want to update my cache to make these changes available to me. But why should one do both at the same time using "Cache_wbInv"?

Thanks in advance,

Dom

0 Ramsey over 12 years ago in reply to Dom Banger

TI__Genius 12025 points

Dom,

When using SysLink, the default behaviour for SharedRegion memory is non-cacheable on the host (ARM) side and cacheable on the DSP side. Therefore, you only need to manage the cache from the DSP side. Remember, that the cacheable property of memory is set independently for each processor. I realize this is confusing. When configuring the SharedRegion, all the properties are used on the DSP side (including the cacheEnable property). However, when configuring the ARM side, the cacheEnable property is ignored. If you want to enable cache for SharedRegion memory on the ARM side, see the SysLink_params symbol in the ti/syslink/SysLink.h file.

If you want to disable cache for SharedRegion memory on the DSP, you will also need to configure the MAR bits for that memory. Keep in mind the 16 MB alignment requirements. If you have a recent SysLink 2.10 release, look in the examples folder for an example on how to do this. The OMAPL138 is close enough for your device.
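To see why the 16 MB alignment matters: on the C674x DSP each MAR register controls cacheability for one 16 MB block of the address space, so the register index is the address shifted right by 24 bits. The following is a minimal host-side sketch of that arithmetic only. The helper names and the example addresses are hypothetical and are not part of SysLink or SYS/BIOS.

```c
#include <assert.h>
#include <stdint.h>

/* Each MAR register covers one 16 MB block (2^24 bytes). */
#define MAR_BLOCK_SHIFT 24u
#define MAR_BLOCK_SIZE  (1u << MAR_BLOCK_SHIFT)

/* Index of the MAR register that covers a given address. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_BLOCK_SHIFT);
}

/* Index of the MAR register that covers the last byte of a region. */
unsigned mar_last_index(uint32_t base, uint32_t len)
{
    assert(len > 0u);
    return mar_index(base + (len - 1u));
}

/*
 * Returns 1 if the region fills whole 16 MB blocks, so that changing
 * its MAR bits cannot affect any neighbouring memory; 0 otherwise.
 */
int region_owns_its_mar_blocks(uint32_t base, uint32_t len)
{
    return len > 0u
        && (base % MAR_BLOCK_SIZE) == 0u
        && (len % MAR_BLOCK_SIZE) == 0u;
}
```

For example, a hypothetical 1 MB shared region at 0xC2000000 falls into MAR 194 but does not own that block, so disabling cache through that MAR bit would also make the other 15 MB of the block non-cacheable on the DSP.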

~Ramsey

0 judahvang over 12 years ago in reply to Dom Banger

TI__Mastermind 32475 points

Dom,

In fact, Cache_wbInv is very useful. For example, let's say you have a buffer that you share between ARM and DSP. Assume this buffer gets cached on the DSP. When the DSP passes the buffer to the ARM, it would be beneficial to do a Cache_wbInv() as opposed to a Cache_wb() if it expects some data back in the same buffer. If you do only a Cache_wb(), then you would need to do a Cache_inv() when receiving the buffer back. This additional call to Cache_inv() can be avoided by doing Cache_wbInv().
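The round trip described above can be sketched with a small host-side toy model. This is not the SYS/BIOS Cache API: every type and function below (ToyBuffer, toy_wb, toy_inv, toy_wb_inv, round_trip) is a hypothetical stand-in that only mimics the effect of Cache_wb, Cache_inv and Cache_wbInv on one buffer, assuming the ARM accesses the memory uncached and the DSP does not touch the buffer while the ARM owns it.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 128

/* Toy model: one shared buffer plus the DSP's cached copy of it. */
typedef struct {
    unsigned char memory[BUF_SIZE]; /* what the uncached ARM sees */
    unsigned char cached[BUF_SIZE]; /* the DSP's cached copy      */
    int valid;                      /* cached copy present?       */
    int dirty;                      /* cached copy modified?      */
} ToyBuffer;

/* Load the cached copy from memory on a miss. */
void toy_fill(ToyBuffer *b)
{
    if (!b->valid) {
        memcpy(b->cached, b->memory, BUF_SIZE);
        b->valid = 1;
        b->dirty = 0;
    }
}

/* DSP accesses go through the cached copy. */
void dsp_write(ToyBuffer *b, int off, unsigned char v)
{
    toy_fill(b);
    b->cached[off] = v;
    b->dirty = 1;
}

unsigned char dsp_read(ToyBuffer *b, int off)
{
    toy_fill(b);
    return b->cached[off];
}

/* Stand-in for a write-back: memory is updated, the copy stays valid. */
void toy_wb(ToyBuffer *b)
{
    if (b->valid && b->dirty) {
        memcpy(b->memory, b->cached, BUF_SIZE);
        b->dirty = 0;
    }
}

/* Stand-in for an invalidate: the cached copy is discarded. */
void toy_inv(ToyBuffer *b)
{
    b->valid = 0;
    b->dirty = 0;
}

/* Stand-in for write-back-invalidate (a single operation on hardware). */
void toy_wb_inv(ToyBuffer *b)
{
    toy_wb(b);
    toy_inv(b);
}

/* DSP sends 0x11, ARM replies 0x22 in place, DSP reads the reply. */
unsigned char round_trip(int use_wb_inv, int inv_on_return)
{
    ToyBuffer b;
    memset(&b, 0, sizeof b);
    dsp_write(&b, 0, 0x11);
    if (use_wb_inv) toy_wb_inv(&b); else toy_wb(&b);
    b.memory[0] = 0x22;              /* uncached ARM write */
    if (inv_on_return) toy_inv(&b);
    return dsp_read(&b, 0);
}
```

In this model, round_trip(1, 0) and round_trip(0, 1) both return the ARM's reply, while round_trip(0, 0) returns the stale DSP value, which is the extra invalidate that the combined call saves.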

Judah

0 Dom Banger over 12 years ago in reply to judahvang

Intellectual 355 points

Hey Judah and Ramsey,

thanks for your reply.

1. Ramsey, I think I do not need to enable caching on the ARM side and will live with caching enabled on the DSP side, because the 16 MB alignment is much more than I need for communication between ARM and DSP. So I will maintain coherency manually for the few cache lines I need.

2. Judah, does Cache_wbInv() implement a sequential call of Cache_wb() and Cache_inv(), or is there a performance advantage? I do not know if I understand the cache feature correctly, but please imagine the following scenario: The DSP and the ARM use the same shared memory region for communication in both directions. The DSP writes some bytes to the shared region, which is cached on the DSP side. The ARM writes some bytes to the same shared memory region, at different byte positions but on the same cache line. Then the DSP wants its written bytes to become available on the ARM side and wants to read the bytes written by the ARM, so the DSP calls Cache_wbInv() for the cache line, which includes both the bytes from the DSP for the ARM and the bytes from the ARM for the DSP. Which will be processed first? Will the cache of the DSP be written back to memory first, overwriting the bytes from the ARM in the shared memory, or will the cache be invalidated first, so that it is read again from the shared memory into the cache of the DSP, overwriting the bytes written by the DSP? I hope I have explained my problem with Cache_wbInv() correctly. Did I misunderstand the function of Cache_wbInv() or the cache feature itself?
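The scenario above can be explored with a host-side toy model of a single write-back cache line. It is only a sketch under stated assumptions: dirty state is tracked per whole line, the DSP allocates the line on its write before the ARM writes, and the ARM accesses memory uncached. All names (ToyLine, line_write, line_read, line_wb_inv and the two scenario functions) are hypothetical and do not belong to any TI API.

```c
#include <assert.h>
#include <string.h>

#define LINE_SIZE 128 /* assumed DSP L2 cache line size */

typedef struct {
    unsigned char data[LINE_SIZE];
    int valid;
    int dirty;
} ToyLine;

/* On a miss, load the whole line from memory. */
void line_fill(ToyLine *c, const unsigned char *mem)
{
    if (!c->valid) {
        memcpy(c->data, mem, LINE_SIZE);
        c->valid = 1;
        c->dirty = 0;
    }
}

void line_write(ToyLine *c, const unsigned char *mem, int off, unsigned char v)
{
    line_fill(c, mem);
    c->data[off] = v;
    c->dirty = 1; /* the whole line is now considered dirty */
}

unsigned char line_read(ToyLine *c, const unsigned char *mem, int off)
{
    line_fill(c, mem);
    return c->data[off];
}

/* Write the line back if dirty, then discard it. */
void line_wb_inv(ToyLine *c, unsigned char *mem)
{
    if (c->valid && c->dirty)
        memcpy(mem, c->data, LINE_SIZE);
    c->valid = 0;
    c->dirty = 0;
}

/* DSP and ARM bytes share one line: returns what the DSP reads back. */
unsigned char shared_line_result(void)
{
    unsigned char mem[LINE_SIZE] = {0};
    ToyLine dsp = {{0}, 0, 0};
    line_write(&dsp, mem, 0, 0xD5); /* DSP byte, line snapshot taken */
    mem[64] = 0xA7;                 /* uncached ARM write            */
    line_wb_inv(&dsp, mem);         /* stale snapshot hits memory    */
    return line_read(&dsp, mem, 64);
}

/* DSP and ARM bytes live in separate lines. */
unsigned char separate_lines_result(void)
{
    unsigned char mem[2 * LINE_SIZE] = {0};
    ToyLine tx = {{0}, 0, 0}, rx = {{0}, 0, 0};
    line_write(&tx, mem, 0, 0xD5);
    mem[LINE_SIZE] = 0xA7;          /* ARM writes to its own line */
    line_wb_inv(&tx, mem);
    line_wb_inv(&rx, mem + LINE_SIZE);
    return line_read(&rx, mem + LINE_SIZE, 0);
}
```

In this ordering the shared-line case loses the ARM's byte, because the write-back copies the DSP's older snapshot of the whole line over it, while the separate-lines case delivers it intact.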

Best regards,

Dom

0 judahvang over 12 years ago in reply to Dom Banger

TI__Mastermind 32475 points

Dom,

Sorry for the late response. I don't know exactly what you mean by "sequential call", but Cache_wbInv() has its own set of hardware registers that it utilizes. The API itself does not call Cache_wb() and then Cache_inv().

You need to make sure you align the buffers to the minimum cache alignment requirement for both DSP and ARM. I believe this value is at least 128 bytes, since that is the cache line size for the L2 cache on the DSP. This makes sure that when you perform any cache operation, it is performed only on the buffer you want, no more and no less. You should never have a scenario like the one you describe in your post, where part of a buffer spans into another cache line. Again, make sure your buffers are aligned to the cache line size to prevent the scenario you are describing above.
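A minimal sketch of that alignment rule, assuming the 128-byte L2 line size mentioned above. The helper names and the SharedMailbox layout are hypothetical illustrations, not SysLink types. The layout only keeps the two directions in separate lines if the structure itself is placed at a line-aligned address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte cache line (DSP L2), as stated in the post. */
#define CACHE_LINE_SIZE 128u

/* Round a byte count up to a whole number of cache lines. */
size_t round_up_to_line(size_t nbytes)
{
    return (nbytes + CACHE_LINE_SIZE - 1u) & ~(size_t)(CACHE_LINE_SIZE - 1u);
}

/* Returns 1 if the address starts on a cache line boundary. */
int is_line_aligned(const void *p)
{
    return ((uintptr_t)p & (CACHE_LINE_SIZE - 1u)) == 0u;
}

/*
 * Hypothetical layout for a small bidirectional mailbox: each
 * direction gets its own full cache line, so a cache operation on one
 * direction never touches bytes that belong to the other.
 */
typedef struct {
    uint8_t dsp_to_arm[CACHE_LINE_SIZE]; /* written by DSP only */
    uint8_t arm_to_dsp[CACHE_LINE_SIZE]; /* written by ARM only */
} SharedMailbox;
```

A cache call on the DSP would then cover exactly one member, for example the address of dsp_to_arm with a byte count of round_up_to_line(sizeof the member).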

Judah

0 Dom Banger over 12 years ago in reply to judahvang

Intellectual 355 points

Judah,

Thanks for your response. I understand what you mean by keeping the buffers in separate cache lines for ARM and DSP. I think this link gives some more information on the cache topic and helps to understand the Cache_wbInv() function: http://processors.wiki.ti.com/index.php/Cache_Management

Best regards,

Dom
