AM263P4: Cache and MPU issues related to AM263P4

Part Number: AM263P4
Other Parts Discussed in Thread: SYSCONFIG

Question 1:

In the syscfg configuration of AM263P4, are the Region Attributes configurations limited to only certain combinations, rather than all possible combinations of Memory type, Shareability, Cacheability, and Cache policy?

 

I would like to confirm whether my understanding of the Region Attributes combinations provided in syscfg is correct, particularly the highlighted entries.

Question 2:

According to the TRM manual, the last 32 bytes of every 512KB must be configured as non-cached. However, in the fastboot example, this region is set to shareable. Isn't shareable equivalent to cached + shareable?

Question 3:

In the fastboot example, the RAM is configured as Cached. After performing relevant Flash operations, data consistency between the Cache and RAM is achieved using CacheP_wb to synchronize the data. In non-fastboot SBL (Secondary Boot Loader), the RAM region is configured as shareable. Given that shareable implies cached, does this also require the use of CacheP_wb to synchronize data between the Cache and RAM to ensure consistency?

If a region is configured as shareable, it allows multiple processing cores to share the data, whereas cached does not. Is this the reason why configuring the RAM region as cached prevents the R5 core from accessing OTP data? However, this raises another question: in the fastboot example, the RAM is set to Cached, so how can the HSM firmware read from Flash also be shared with the HSM core?

Question 4:

Does AM263P4 currently only support Write-Back and Write-Allocate cache policies? So, whenever the RAM region is configured as cached, data consistency can only be maintained through Write-Back. Does this mean that automatic data consistency can theoretically only be achieved via Write-Through? How can I determine which data will be processed by the Cache, i.e., when to perform manual data synchronization (Write-Back)?

  • Hi,

    Apologies for the delayed response. I'll try to answer your questions one by one.

    In the syscfg configuration of AM263P4, are the Region Attributes configurations limited to only certain combinations, rather than all possible combinations of Memory type, Shareability, Cacheability, and Cache policy?

    Let me discuss this implementation in the SDK with the team and get back to you on why this is the case.

    I would like to confirm whether my understanding of the Region Attributes combinations provided in syscfg is correct, particularly the highlighted entries.

    I have summarized this in the table below:

    You can also refer to Section 7 of the ARM Cortex-R5F TRM (/cfs-file/__key/communityserver-discussions-components-files/908/1067.DDI0460C_5F00_cortexr5_5F00_trm.pdf)

    According to the TRM manual, the last 32 bytes of every 512KB must be configured as non-cached. However, in the fastboot example, this region is set to shareable. Isn't shareable equivalent to cached + shareable?

    No, configuring an MPU region as shareable does not automatically make it cached. Shareability and cacheability are independent attributes that can be configured separately. As the table above shows, a region can be shareable and non-cacheable.

    However, looking at the auto-generated ti_dpl_config.c code, I observe that when a region is marked as Shared, it also gets marked as "Cacheable". I'll let the team know about this bug (jira.itg.ti.com/.../MCUSDK-14961).
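    To illustrate that the S bit is separate from the cacheability bits, here is a minimal sketch that packs the attribute fields into a register word. It assumes the Cortex-R5 MPU Region Access Control Register layout (XN in bit 12, AP in bits 10:8, TEX in bits 5:3, S in bit 2, C in bit 1, B in bit 0); please verify this against the ARM TRM. The type and function names are hypothetical and are not SDK APIs.

```c
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of the TI SDK. */
typedef struct {
    uint8_t tex;        /* TEX[2:0] type extension field */
    uint8_t shareable;  /* S bit */
    uint8_t cacheable;  /* C bit */
    uint8_t bufferable; /* B bit */
    uint8_t ap;         /* AP[2:0] access permissions */
    uint8_t xn;         /* execute-never bit */
} RegionAttr;

/* Pack the fields using the assumed bit positions described above. */
uint32_t region_attr_pack(RegionAttr a)
{
    return ((uint32_t)(a.xn & 1u) << 12) |
           ((uint32_t)(a.ap & 7u) << 8) |
           ((uint32_t)(a.tex & 7u) << 3) |
           ((uint32_t)(a.shareable & 1u) << 2) |
           ((uint32_t)(a.cacheable & 1u) << 1) |
           ((uint32_t)(a.bufferable & 1u));
}
```

    Under these assumptions, a shareable non-cacheable Normal region (TEX=0b001, C=0, B=0, S=1) has 0x0C in its low attribute bits, while a non-shareable write-back, write-allocate region (TEX=0b001, C=1, B=1, S=0) has 0x0B. The S bit changes independently of C and B.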

    Does AM263P4 currently only support Write-Back and Write-Allocate cache policies? So, whenever the RAM region is configured as cached, data consistency can only be maintained through Write-Back. Does this mean that automatic data consistency can theoretically only be achieved via Write-Through? How can I determine which data will be processed by the Cache, i.e., when to perform manual data synchronization (Write-Back)?

    Yes, for the AM263P4 with ARM R5F cores, the SDK configuration primarily supports Write-Back and Write-Allocate (WB, WA) cache policies when a memory region is configured as cacheable.

    When using Write-Back policy, data consistency isn't maintained automatically between cache and memory. The cache will hold modified data until the cache line is evicted, which means:

    1. Data consistency can only be automatically maintained with Write-Through policy, which would write to both cache and memory simultaneously.
    2. For Write-Back policy, you need manual cache maintenance operations (data synchronization) in these scenarios:
    - When DMA or another processor needs to read memory that might have modified data in your CPU's cache
    - When your CPU needs to read memory that might have been modified by DMA or another processor

    To determine when to perform manual data synchronization:

    - For CPU-to-memory: Use CacheP_wb() or CacheP_wbInv() before another master accesses memory written by CPU
    - For memory-to-CPU: Use CacheP_inv() before CPU reads memory modified by another master
    - For shared memory regions: Apply appropriate synchronization at critical points where consistency matters
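    The rules above can be illustrated with a toy host-side model of a single write-back cache line sitting in front of one word of RAM. This is only a sketch to show why the maintenance calls are needed; all names are hypothetical, and the three maintenance functions merely mimic what CacheP_wb(), CacheP_inv() and CacheP_wbInv() are described as doing.

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model: one word of RAM and one cache line in front of it. */
typedef struct {
    uint32_t mem;    /* value in RAM, as seen by DMA or another core */
    uint32_t line;   /* value held in the CPU cache line */
    bool     valid;  /* the line holds a copy of this address */
    bool     dirty;  /* the line is newer than RAM */
} ToyCache;

/* CPU write with write-back, write-allocate: only the cache is updated. */
void cpu_write(ToyCache *c, uint32_t v)
{
    c->line = v;
    c->valid = true;
    c->dirty = true;
}

/* CPU read: a hit returns the cached copy, a miss fills the line from RAM. */
uint32_t cpu_read(ToyCache *c)
{
    if (!c->valid) {
        c->line = c->mem;
        c->valid = true;
        c->dirty = false;
    }
    return c->line;
}

/* Another bus master accesses RAM directly and never sees the cache. */
uint32_t dma_read(const ToyCache *c) { return c->mem; }
void dma_write(ToyCache *c, uint32_t v) { c->mem = v; }

/* Write-back: push dirty data to RAM and keep the line. */
void toy_wb(ToyCache *c)
{
    if (c->valid && c->dirty) {
        c->mem = c->line;
        c->dirty = false;
    }
}

/* Invalidate: drop the line without writing it to RAM. */
void toy_inv(ToyCache *c)
{
    c->valid = false;
    c->dirty = false;
}

/* Write-back, then invalidate. */
void toy_wb_inv(ToyCache *c)
{
    toy_wb(c);
    toy_inv(c);
}
```

    In this model, a value written with cpu_write() is invisible to dma_read() until toy_wb() runs, and a value written with dma_write() is invisible to cpu_read() on a cache hit until toy_inv() runs. Note also that toy_inv() on a dirty line discards the CPU's own write, which is one reason to write back before invalidating.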

    Regards,
    Shaunak

  • In the fastboot example, the RAM is configured as Cached. After performing relevant Flash operations, data consistency between the Cache and RAM is achieved using CacheP_wb to synchronize the data. In non-fastboot SBL (Secondary Boot Loader), the RAM region is configured as shareable. Given that shareable implies cached, does this also require the use of CacheP_wb to synchronize data between the Cache and RAM to ensure consistency?

    With the current SDK implementation, a region marked as Sharable is also marked as cached by default, but the opposite is not true: if I mark the region as Cached, it is not shareable. Let me discuss this potential bug with the team.

    Ideally, if the region is "shared and non-cached", we don't need CacheP_wb, but if it is "shared and cached", we do need it.

  • Thank you for your reply! I still have a few questions regarding your response:

    Question 1: 

    According to the introduction in the ARMv7-M Architecture Reference Manual link, I revisited the combinations of Memory type, Shareability, Cacheability, and Cache policy. The second column in my summary shows the preset options provided by TI's syscfg. Is my understanding correct?

      

    So, if I set a memory region as Shareable in TI's SysConfig, does it necessarily imply that it is also Cacheable? If I wish to use a non-preset configuration, must I use the method shown below? Is there corresponding documentation explaining the meaning of each parameter?

    Based on this, shouldn't the correct approach for Question 2 be to set the last 32 bytes of each 512KB segment as Non-Cached instead of Shareable? Regarding Question 3, is it indeed necessary to use functions like CacheP_wb for data synchronization? Furthermore, for the latter part of Question 3, why can the HSM firmware still read data from Flash and load it into the HSM's M4 core when the memory is configured as Cached (i.e., Non-Shareable)? I thought the Shareable attribute is required to allow multiple cores to share data?

    Question 2: 

    For the AM263P4, which memory regions must be configured as Strongly Ordered, which must be set as Device, and which should be Normal memory type?

    Question 3:
    You mentioned that to synchronize data from the CPU cache to RAM, both CacheP_wb() and CacheP_wbInv() can be used. Based on my understanding, wouldn't CacheP_wbInv() be a more robust option? Because it not only writes the data back but also forces the CPU to fetch the latest data from RAM afterward. And if another core modifies the data in RAM, we can only use CacheP_inv() to force the CPU to retrieve the updated data from RAM, right?

  • Hi

    The second column in my summary shows the preset options provided by TI's syscfg. Is my understanding correct?

    Yes, the presets are Strongly ordered, Cached, Non-Cached, Sharable. I've shared the default behavior of all the presets below:

    Strongly ordered:

    Cacheable:

    Non-Cacheable:

    Sharable:

    You can infer the exact config from the ARM Table screenshot attached below.

    If some other config is needed, you can select the advanced config and set it as per your requirements. Please make sure you follow the table below from ARM for correct configurations. Configure the C, B, S and TEX bits appropriately.
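    As a quick cross-check when choosing custom values, here is a minimal sketch that classifies a TEX/C/B combination. It follows the ARMv7-R encoding as I understand it from the ARM table; the enum and function names are hypothetical, and the result should always be confirmed against the table itself.

```c
#include <stdint.h>

/* Hypothetical classification of a TEX/C/B combination. */
typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE_SHAREABLE,
    MEM_DEVICE_NON_SHAREABLE,
    MEM_NORMAL_NON_CACHEABLE,
    MEM_NORMAL_WT_NO_WA,      /* write-through, no write-allocate */
    MEM_NORMAL_WB_NO_WA,      /* write-back, no write-allocate */
    MEM_NORMAL_WB_WA,         /* write-back, write-allocate */
    MEM_NORMAL_SPLIT_POLICY,  /* TEX = 0b1BB: separate outer/inner policies */
    MEM_RESERVED_OR_IMPDEF    /* reserved or implementation defined */
} MemKind;

MemKind decode_tex_c_b(uint8_t tex, uint8_t c, uint8_t b)
{
    tex &= 7u;
    c &= 1u;
    b &= 1u;

    if (tex & 4u) {
        return MEM_NORMAL_SPLIT_POLICY;
    }

    /* Combine the fields as TEX[1:0]:C:B for the remaining encodings. */
    switch ((tex << 2) | (c << 1) | b) {
    case 0x0: return MEM_STRONGLY_ORDERED;      /* TEX=000 C=0 B=0 */
    case 0x1: return MEM_DEVICE_SHAREABLE;      /* TEX=000 C=0 B=1 */
    case 0x2: return MEM_NORMAL_WT_NO_WA;       /* TEX=000 C=1 B=0 */
    case 0x3: return MEM_NORMAL_WB_NO_WA;       /* TEX=000 C=1 B=1 */
    case 0x4: return MEM_NORMAL_NON_CACHEABLE;  /* TEX=001 C=0 B=0 */
    case 0x7: return MEM_NORMAL_WB_WA;          /* TEX=001 C=1 B=1 */
    case 0x8: return MEM_DEVICE_NON_SHAREABLE;  /* TEX=010 C=0 B=0 */
    default:  return MEM_RESERVED_OR_IMPDEF;
    }
}
```

    For Normal memory the S bit then selects shareable or non-shareable on top of the result; for the Strongly-ordered and Device encodings the shareability is fixed by the encoding itself.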

    Regards,
    Shaunak

  • So, if I set a memory region as Shareable in TI's SysConfig, does it necessarily imply that it is also Cacheable?

    When the "Shareable" preset is used, SysConfig marks the region as cacheable.

  • Question 3:
    You mentioned that to synchronize data from the CPU cache to RAM, both CacheP_wb() and CacheP_wbInv() can be used. Based on my understanding, wouldn't CacheP_wbInv() be a more robust option? Because it not only writes the data back but also forces the CPU to fetch the latest data from RAM afterward. And if another core modifies the data in RAM, we can only use CacheP_inv() to force the CPU to retrieve the updated data from RAM, right?

    Yes, you are right

  • Question 2: 

    For the AM263P4, which memory regions must be configured as Strongly Ordered, which must be set as Device, and which should be Normal memory type?

    Strongly ordered memory is used for regions where access order is critical and must be strictly maintained. All accesses happen in program order with no optimization.

    Should be configured for:
    - Memory-mapped I/O with strict ordering requirements
    - External FIFO devices
    - Memory regions where side-effects of access timing are critical
    - Inter-processor communication regions where exact access order matters

    Device memory is designed for peripheral registers and control modules. It preserves access size and partial order.

    Should be configured for:
    - All peripheral register spaces:
    - GPIO registers
    - Timer controllers
    - UART/SPI/I2C controller registers
    - DMA control registers
    - Control modules
    - Configuration registers
    - Interrupt controllers, etc

    Normal memory is used for typical code and data storage with standard memory behavior (TCM, OCRAM, flash).

    Regards,
    Shaunak

  • Thank you very much for your previous reply. I still have one question unanswered, which is the latter part of question 1. Why can the HSM firmware still read data from Flash and load it into the M4 core of the HSM when the RAM is configured as cached (Non Sharable) in the Fastboot example? My understanding is that only the Shareable attribute can allow data access between multiple cores (such as R5_Core0-R5_Core3, HSM M4).

  • Hello mingzhe dai,

    Thank you very much for your previous reply. I still have one question unanswered, which is the latter part of question 1. Why can the HSM firmware still read data from Flash and load it into the M4 core of the HSM when the RAM is configured as cached (Non Sharable) in the Fastboot example? My understanding is that only the Shareable attribute can allow data access between multiple cores (such as R5_Core0-R5_Core3, HSM M4).

    The R5 MPU does not guard the memory against other bus masters. We mark the region as cached and non-shareable in the R5's MPU configuration, but this does not prevent the M4 core from accessing the region.

    Also, please note that the HSM firmware does not read directly from flash. The R5 always reads from flash and stores the data in its RAM. In the fast boot case, application authentication is parallelized: the R5 core copies the application image from flash to RAM while the HSM core authenticates it.

    Please check this app note for understanding on faster boot: https://www.ti.com/lit/ab/spradm8/spradm8.pdf?ts=1759215348745&ref_url=https%253A%252F%252Fwww.ti.com%252Fproduct%252FAM263P4

    When a region is marked as cached and shared, it means that cache coherency is maintained. If data is only copied from flash to RAM and no update (write) is made to the marked area, it should be okay even if shared is not enabled.

    Regards,

    Aswin

  • Thank you very much for your reply.

    1. What do you mean by "cache consistency is maintained"? Does it refer to automatically synchronizing data in the Cache to ensure data consistency? However, in your previous response, you mentioned that the write-back and write-allocate (WB, WA) cache strategies of the AM263P4 require manual consistency maintenance.

    2. Theoretically, if I need to perform inter-core communication (between R5 and M4), would setting the memory as shared (cached + shared) be more appropriate than cached (cached + no shared)?

    3. Based on my previous research, the shared attribute allows data access among multiple cores (such as R5_Core0-R5_Core3 and HSM M4). However, according to your explanation, the no shared attribute does not prevent other cores (like the M4 core) from accessing the same data. This seems contradictory to my understanding, and I'm quite confused about this point.

    Once again, thank you very much for your help!

  • Hi mingzhe dai,

    What do you mean by "cache consistency is maintained"? Does it refer to automatically synchronizing data in the Cache to ensure data consistency? However, in your previous response, you mentioned that the write-back and write-allocate (WB, WA) cache strategies of the AM263P4 require manual consistency maintenance.

    Data is not automatically synchronized. With a write-back cache configuration in the MPU, when data held in the cache is modified, the new value is not written to its original memory location immediately. The original location is updated when a cache write-back is performed, which is done manually. This is relevant when the region is accessed by multiple bus masters: if multiple bus masters access a cached location, the cache write-back needs to be done manually.

    Theoretically, if I need to perform inter-core communication (between R5 and M4), would setting the memory as shared (cached + shared) be more appropriate than cached (cached + no shared)?

    It is necessary to configure this region as shared.

    Based on my previous research, the shared attribute allows data access among multiple cores (such as R5_Core0-R5_Core3 and HSM M4). However, according to your explanation, the no shared attribute does not prevent other cores (like the M4 core) from accessing the same data. This seems contradictory to my understanding, and I'm quite confused about this point.

    Could you please tell me what the non-shared attribute means here? We only have the shared attribute. Also, referring to this table will be useful when dealing with combinations of attributes; it shows how the MPU will behave for a given configuration.

    The S, B, C and TEX values can be obtained from SysConfig or from the ti_dpl_config.c file.

    Regards,

    Aswin

  • Thank you for your response.
    My actual question is regarding the routine where FastBoot configures the RAM region as Cached.

    In TI's SysConfig, "Cached" is equivalent to "Cached + Non-shareable"—these two attributes are bound together. Theoretically, they shouldn't be mutually bound.

    In the FastBoot routine, the HSM firmware is relocated from Flash to RAM. After verification, the R5 core and M4 core perform IPC communication to transfer the HSM firmware to the M4 core. Based on your explanation, since this involves inter-core communication, it should be configured as Shareable. So why does this routine still function correctly?

    Additionally, according to the TRM, the last 32 bytes of every 512KB RAM block in the AM263P4 need to be configured as Non-cached. However, in the latest SBL (Secondary Bootloader) routine, these specific regions are still set as Shared (which in TI's SysConfig implies Shared + Cached, as they are bound). Is this an error? Also, why do routines like the "Hello World" example not specifically configure MPU for these regions? Is it unnecessary?

    Once again, thank you very much for your help!

  • Hi Mingzhe dai,

    In the FastBoot routine, the HSM firmware is relocated from Flash to RAM. After verification, the R5 core and M4 core perform IPC communication to transfer the HSM firmware to the M4 core. Based on your explanation, since this involves inter-core communication, it should be configured as Shareable. So why does this routine still function correctly?

    The question is valid. In this use case, even though the RAM is kept as cached, we do a cache write-back invalidate before HSM-related transactions, so cache coherency is maintained.
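    The ordering described here can be sketched as follows. This is a host-compilable illustration and not the actual SBL code: the CacheP_wbInv() signature and the CacheP_TYPE_ALL name are assumed to match kernel/dpl/CacheP.h in the MCU+ SDK, the stand-in definitions below only record the call order, and publish_buffer() and notify_other_master() are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins so the sketch compiles without the SDK. On target,
 * remove these and include <kernel/dpl/CacheP.h> instead. The constant's
 * value here is a placeholder for host builds. */
#define CacheP_TYPE_ALL (0x000Fu)

static int g_step;        /* running step counter for this sketch */
static int g_wbInvStep;   /* step at which the cache maintenance ran */
static int g_notifyStep;  /* step at which the other master was notified */

void CacheP_wbInv(void *addr, uint32_t size, uint32_t type)
{
    (void)addr;
    (void)size;
    (void)type;
    g_wbInvStep = ++g_step;
}

/* Hypothetical hook that tells the other bus master the buffer is ready. */
void notify_other_master(void)
{
    g_notifyStep = ++g_step;
}

/* Copy data into a cached RAM buffer, then hand it to another master. */
void publish_buffer(uint8_t *dst, const uint8_t *src, uint32_t len)
{
    memcpy(dst, src, len);                    /* data may exist only in the cache */
    CacheP_wbInv(dst, len, CacheP_TYPE_ALL);  /* push it out to RAM first */
    notify_other_master();                    /* only then let the reader in */
}
```

    The point of the sketch is the order: the write-back invalidate comes after the last CPU write to the buffer and before the other master is told to read it.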

    Is this an error? Also, why do routines like the "Hello World" example not specifically configure MPU for these regions? Is it unnecessary?

    The example does not use the entire RAM area. That is why this is not explicitly configured.
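    For reference, the regions in question (the last 32 bytes of every 512 KB block) can be enumerated with a small helper. This is a sketch with hypothetical names; the base address and size are parameters, it assumes the 512 KB blocks are counted from the base address, and the actual values must come from the TRM memory map.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_SIZE (512u * 1024u)  /* 512 KB block, per the TRM rule quoted in the question */
#define TAIL_SIZE  (32u)           /* last 32 bytes of each block */

/* Write the start address of the last 32 bytes of each complete 512 KB
 * block in [base, base + size) into out[]. Returns the number of entries
 * written, at most max. */
size_t tail_region_starts(uint32_t base, uint32_t size, uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t off;

    for (off = BLOCK_SIZE; off <= size && n < max; off += BLOCK_SIZE) {
        out[n++] = base + off - TAIL_SIZE;
    }
    return n;
}
```

    As an example only, with an assumed base of 0x70000000 and an assumed size of 3 MB, this yields six addresses from 0x7007FFE0 to 0x702FFFE0, each of which would need its own 32-byte non-cached MPU region if the application uses that part of the RAM.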

    Regards,

    Aswin