Incorrect Cache Enable/Disable procedure in RM57 Technical Reference Manual?

Other Parts Discussed in Thread: RM57L843, HALCOGEN

Hey all,


I was at my wit's end trying to figure out why my RM57x was seemingly corrupting memory whenever I disabled and subsequently re-enabled its caches.  I believe I found a solution to my problem and was hoping to run it past someone more knowledgeable.

I started out by using the cache enable/disable code provided in the RM57 Family User's Guide, spnu562.  This code did not work.  Instead, it corrupted recently written values.

http://www.ti.com/lit/ug/spnu562/spnu562.pdf

Disable caches - spnu562 p.392 section 9.3.1

MRC p15, 0, r1, c1, c0, 0    ; Read system control register configuration data
BIC r1, r1, #0x1 << 12       ; Instruction cache disable
BIC r1, r1, #0x1 << 2        ; Data cache disable
DSB
MRC p15, 0, r1, c1, c0, 0    ; Disable cache RAMs

Invalidate and enable caches - spnu562 p.393 section 9.3.1

MRC p15, 0, r1, c1, c0, 1    ; Read auxiliary control register
BIC r1, r1, #0x1 << 5        ; Bit is default set to disable ECC.  Clearing bit 5.
MCR p15, 0, r1, c1, c0, 1    ; Enable ECC, generate abort on ECC errors, enable hardware recovery.
MRC p15, 0, r1, c1, c0, 0    ; Read system control register configuration data
ORR r1, r1, #0x1 << 12       ; Instruction cache enable
ORR r1, r1, #0x1 << 2        ; Data cache enable
DSB
MCR p15, 0, r0, c15, c5, 0   ; Invalidate entire data cache
MCR p15, 0, r0, c7, c5, 0    ; Invalidate entire instruction cache
MCR p15, 0, r1, c1, c0, 0    ; Enable cache RAM
ISB                          ; You must issue an ISB instruction to flush the pipeline.  This ensures that all subsequent instruction fetches see the effect of enabling the instruction cache.

This code does not agree with the ARM Cortex-R5 documentation on how to enable and disable caches.  It also does not work.  Here is a partial list of suspicious-looking things:

  • When disabling caches, the last instruction should logically be "MCR" to store the modified register values back to the CP15.  The provided code instead uses "MRC" meaning that the modified register values are not stored back to the CP15 and the cache is never disabled.
  • When disabling a write-back cache, user code is required to "clean" the cache, copying any changed memory values from cache into the backing RAM.  The provided code does not do this, so when cache is re-enabled, any dirty values in the cache will take precedence over changes made to memory while the cache was disabled.
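The first point can be illustrated on a host machine with a small C model. This is only a sketch: `cp15_model_t`, `disable_as_printed` and `disable_corrected` are hypothetical names, and the struct stands in for the real CP15 system control register rather than touching any hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCTLR_C (1u << 2)    /* data cache enable bit */
#define SCTLR_I (1u << 12)   /* instruction cache enable bit */

/* Hypothetical stand-in for the CP15 system control register. */
typedef struct {
    uint32_t sctlr;
} cp15_model_t;

/* Mirrors the sequence as printed: MRC, BIC, BIC, DSB, MRC. */
void disable_as_printed(cp15_model_t *cp)
{
    assert(cp != NULL);
    uint32_t r1 = cp->sctlr;   /* MRC: read the register */
    r1 &= ~SCTLR_I;            /* BIC: clear I in the local copy */
    r1 &= ~SCTLR_C;            /* BIC: clear C in the local copy */
    r1 = cp->sctlr;            /* second MRC: reloads r1, nothing is written back */
    (void)r1;
}

/* Same sequence with the final instruction changed to MCR. */
void disable_corrected(cp15_model_t *cp)
{
    assert(cp != NULL);
    uint32_t r1 = cp->sctlr;   /* MRC: read the register */
    r1 &= ~(SCTLR_I | SCTLR_C);
    cp->sctlr = r1;            /* MCR: write the modified value back */
}
```

In this model the printed sequence leaves `sctlr` unchanged, while the corrected one clears only the I and C bits.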

Following the recommendations in the ARM Cortex-R5 Technical Reference Manual and the ARM Architecture Reference Manual, I came up with some different code instead and tested it on the RM57L843 in my RM57x HDK.  This new code seems to work correctly in a debugger.

ARM Cortex-R5 Technical Reference Manual - http://infocenter.arm.com/help/topic/com.arm.doc.ddi0460d/DDI0460D_cortex_r5_r1p2_trm.pdf

ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0406c/index.html

// ----- Cache manipulation -----

// Enable the instruction and data caches w/o ESM.
// Based on the ARM Cortex-R5 Technical Reference Manual, page 231 (8-31, section 8.5.5)
#define CACHE_ON \
    asm("mov r0, #0"); \
    asm("MRC p15, #0, R1, c1, c0, #0"); \
    asm("ORR R1, R1, #0x1 << 12"); \
    asm("ORR R1, R1, #0x1 << 2"); \
    asm("DSB"); \
    asm("MCR p15, #0, r0, c15, c5, #0"); \
    asm("DSB"); \
    asm("MCR p15, #0, r0, c7, c5, #0"); \
    asm("DSB"); \
    asm("MCR p15, #0, R1, c1, c0, #0"); \
    asm("ISB")

// Enable the instruction and data caches w/ ESM.
// Based on the ARM Cortex-R5 Technical Reference Manual, page 231 (8-31, section 8.5.5)
#define CACHE_ON_ESM \
    asm("MOV r0, 0"); \
    asm("MRC p15, 0, r1, c1, c0, 1"); \
    asm("BIC r1, r1, #0x1 << 5"); \
    asm("MCR p15, 0, r1, c1, c0, 1"); \
    asm("MRC p15, 0, r1, c1, c0, 0"); \
    asm("ORR r1, r1, #0x1 << 12"); \
    asm("ORR r1, r1, #0x1 << 2"); \
    asm("DSB"); \
    asm("MCR p15, 0, r0, c15, c5, 0"); \
    asm("DSB"); \
    asm("MCR p15, 0, r0, c7, c5, 0"); \
    asm("DSB"); \
    asm("MCR p15, 0, r1, c1, c0, 0"); \
    asm("ISB")

// Disable the instruction and data caches. (based on a combination of spnu562 9.3.1 and the ARM Cortex-R5 Technical Reference Manual).
// (ISB added from the ARM Cortex-R5 Technical Reference Manual, p. 232 (8-32, section 8.5.5),
// "Disabling or enabling instruction cache").
#define CACHE_OFF \
    asm("MRC p15, 0, r1, c1, c0, 0"); \
    asm("BIC r1, r1, #0x1 << 12"); \
    asm("BIC r1, r1, #0x1 << 2"); \
    asm("DSB"); \
    asm("MCR p15, 0, r1, c1, c0, 0"); \
    asm("ISB")

// Clean the data cache.
// ARM Architecture Reference Manual
// B2.2.7 (B2-1286) Performing Cache Maintenance Operations
#define CACHE_CLEAN \
    asm("MRC p15, 1, R0, c0, c0, 1"); \
    asm("ANDS R3, R0, #0x07000000"); \
    asm("MOV R3, R3, LSR #23"); \
    asm("BEQ Finished"); \
    asm("MOV R10, #0"); \
    asm("Loop1"); \
    asm("ADD R2, R10, R10, LSR #1"); \
    asm("MOV R1, R0, LSR R2"); \
    asm("AND R1, R1, #7"); \
    asm("CMP R1, #2"); \
    asm("BLT Skip"); \
    asm("MCR p15, 2, R10, c0, c0, 0"); \
    asm("ISB"); \
    asm("MRC p15, 1, R1, c0, c0, 0"); \
    asm("AND R2, R1, #7"); \
    asm("ADD R2, R2, #4"); \
    asm("LDR R4, =0x3FF"); \
    asm("ANDS R4, R4, R1, LSR #3"); \
    asm("CLZ R5, R4"); \
    asm("MOV R9, R4"); \
    asm("Loop2"); \
    asm("LDR R7, =0x00007fff"); \
    asm("ANDS R7, R7, R1, LSR #13"); \
    asm("Loop3"); \
    asm("ORR R11, R10, R9, LSL R5"); \
    asm("ORR R11, R11, R7, LSL R2"); \
    asm("MCR p15, 0, R11, c7, c10, 2");\
    asm("SUBS R7, R7, #1"); \
    asm("BGE Loop3"); \
    asm("SUBS R9, R9, #1"); \
    asm("BGE Loop2"); \
    asm("Skip"); \
    asm("ADD R10, R10, #2"); \
    asm("CMP R3, R10"); \
    asm("BGT Loop1"); \
    asm("DSB"); \
    asm("Finished")

void cache_clean(void)
{
    CACHE_CLEAN;
}

// ----- Test harness -----

#define STORE_123 \
    asm("EOR R4,R4,R4"); \
    asm("ADD R4,R4,1"); \
    asm("ADD R5,R4,1"); \
    asm("ADD R6,R5,1"); \
    asm("STMDB SP!, {R4,R5,R6}")

#define STORE_456 \
    asm("EOR R4,R4,R4"); \
    asm("ADD R4,R4,4"); \
    asm("ADD R5,R4,1"); \
    asm("ADD R6,R5,1"); \
    asm("STMDB SP!, {R4,R5,R6}")

#define STORE_789 \
    asm("EOR R4,R4,R4"); \
    asm("ADD R4,R4,7"); \
    asm("ADD R5,R4,1"); \
    asm("ADD R6,R5,1"); \
    asm("STMDB SP!, {R4,R5,R6}")

#define LOAD \
    asm("LDMIA SP!, {R4,R5,R6}")

#define CLEAR_REGS \
    asm("EOR R4,R4,R4"); \
    asm("MOV R5,R4"); \
    asm("MOV R6,R4");

int main()
{
    // At startup, cache is off.
    // Store 1,2,3 straight into RAM then read it back.
    STORE_123;
    LOAD;

    // Now let's try enabling the cache and writing some new values.
    CACHE_ON_ESM;
    STORE_456;
    // The value in memory (cached) is 4,5,6.

    CACHE_OFF;
    // Now the cache is ignored.
    // The value in memory (RAM) is 1,2,3.

    cache_clean();
    // Dirty data is copied from the cache back to RAM.
    // The value in memory (RAM) is 4,5,6.  All is well.
    LOAD;

    // never return!
    while( true );
}

To test the code, step through it with a debugger and watch the value in memory switch between {1,2,3} and {4,5,6} as the value is written to cache and RAM, and as cache is enabled/disabled.

It is possible to reproduce the behavior of the SPNU562 code by commenting out cache_clean().  You will notice that when cache is re-enabled, the values {1,2,3} wind up in memory despite the more recent write of {4,5,6} to the cache.
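For readers without the hardware at hand, the effect described above can be approximated with a toy C model of a single write-back cache line. This is a sketch under stated assumptions: `model_t`, `run_sequence` and the other names are hypothetical, the model assumes write-allocate behaviour, and it assumes that every enable invalidates the cache first (as both the SPNU562 sequence and CACHE_ON do). It says nothing about real Cortex-R5 timing or ECC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one RAM word shadowed by one write-back cache line. */
typedef struct {
    uint32_t ram;      /* backing RAM contents */
    uint32_t line;     /* cached copy of the word */
    bool     valid;    /* line holds a copy of the word */
    bool     dirty;    /* line is newer than RAM */
    bool     enabled;  /* models the data cache enable bit */
} model_t;

void model_store(model_t *m, uint32_t v)
{
    assert(m != NULL);
    if (m->enabled) {          /* write-back: only the cache line is updated */
        m->line = v;
        m->valid = true;
        m->dirty = true;
    } else {
        m->ram = v;            /* cache off: the store goes straight to RAM */
    }
}

uint32_t model_load(const model_t *m)
{
    assert(m != NULL);
    return (m->enabled && m->valid) ? m->line : m->ram;
}

/* Clean: copy dirty data back to RAM (allowed here even with the cache off). */
void model_clean(model_t *m)
{
    assert(m != NULL);
    if (m->valid && m->dirty) {
        m->ram = m->line;
        m->dirty = false;
    }
}

/* Enable as in the thread: invalidate first, then switch the cache on. */
void model_enable(model_t *m)
{
    assert(m != NULL);
    m->valid = false;          /* any dirty data still in the line is discarded */
    m->dirty = false;
    m->enabled = true;
}

void model_disable(model_t *m)
{
    assert(m != NULL);
    m->enabled = false;
}

/* Replays the test harness, then re-enables the cache and reads the value. */
uint32_t run_sequence(bool clean_after_disable)
{
    model_t m = {0};
    model_store(&m, 123);      /* cache off: 123 lands in RAM */
    model_enable(&m);
    model_store(&m, 456);      /* 456 lands only in the cache line */
    model_disable(&m);
    if (clean_after_disable) {
        model_clean(&m);
    }
    model_enable(&m);          /* the invalidate drops whatever was not cleaned */
    return model_load(&m);
}
```

In this model `run_sequence(false)` returns 123 (the newer write is lost) and `run_sequence(true)` returns 456, which matches the behaviour seen in the debugger.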

Here is what I am wondering:

  • Is the ARM documentation that I used a reliable source of information on the Cortex cores in the RM57?
  • Would you agree that SPNU562 seems to be incorrect?
  • ...if so, any chance of changing SPNU562 to match the code in the ARM documentation?

Thanks for any guidance,

Peter

  • Peter,
    You are clearly right in saying that the last instruction must be an MCR (write) rather than an MRC (read). I will respond to your procedure.
    So, this means that the procedure laid out in the TRM is incorrect. I'll work to correct that.
    I am sorry that we provided you with the wrong code sequence.
    Best Regards,
    Kevin Lavery
  • Peter,
    TI's HalCoGen tool has assembly instructions for enabling and disabling the cache (as well as for cache invalidation). Your procedures CACHE_ON and CACHE_OFF match the assembly functions in HalCoGen. I feel confident that these functions are correct.

    I am still looking at CACHE_CLEAN.

    Finally, what does "_ESM" denote? I am not familiar with this abbreviation; also, I am not yet comfortable with the CACHE_ON_ESM function.

    Best regards,
    Kevin Lavery
  • Hi Kevin,

    Thanks for your time.

    You asked about the name "_ESM".  That is a misleading name and a mistake on my part. That macro is intended to enable the CPU cache with error checking.  I know that the macro does not affect the Error Signaling Module at all, and I should have named it "_ECC" instead.

    In any case, CACHE_ON_ESM is almost identical to the "invalidate and enable cache with ECC error checking" example in SPNU562 section 9.3.1, the only differences being that:

    • I zeroed the register r0 at the start of the code, even though that register is never used except as a placeholder.
    • I added "DSB" instructions between the final three MCRs.  (These are probably overkill, as all the examples I could find use just one DSB immediately before the MCR that enables the cache.)

    I have not tested that this code snippet actually enables cache ECC.  I know that it turns the cache on and off, but that's it.  Interestingly, the ARM Cortex-R5 Technical Reference Manual provides very different-looking code for enabling/disabling error correction - see http://infocenter.arm.com/help/topic/com.arm.doc.ddi0460d/DDI0460D_cortex_r5_r1p2_trm.pdf, Page 232 (8-32).

    So I'm not too sure about this snippet either; I suppose you could consider it part of my question!

  • Peter,
    I found the Auxiliary Control Register definition. Therefore, following the comments, if you want to:
    - enable ECC
    - generate abort on ECC errors and
    - enable hardware recovery
    you need to make the three-bit field (CEC in Auxiliary Control Register) equal to 000b or 001b.

    Since the default value for the bit-field is 100b, clearing bit 5 changes 100b to 000b ASSUMING THAT THE FIELD IS IN ITS DEFAULT SETTING.
    A more robust way to do this is:
    MRC p15, 0, r1, c1, c0, 1 ; Read auxiliary control register
    BIC r1, r1, #0x7 << 3 ; Clear bits 5:3
    MCR p15, 0, r1, c1, c0, 1 ; Enable ECC, generate abort on ECC errors, enable hardware recovery.

    I still owe you a response about clean. Otherwise, I think I am almost done with your questions.
    Thanks again for pointing out the errant code.

    Best Regards,
    Kevin Lavery
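The difference between clearing only bit 5 and clearing the whole CEC field in the reply above can be checked on a host with plain C bit arithmetic. This is a minimal sketch: the macro and function names are hypothetical, and the `uint32_t` argument simply stands in for the value read from the auxiliary control register.

```c
#include <assert.h>
#include <stdint.h>

#define ACTLR_CEC_SHIFT 3u
#define ACTLR_CEC_MASK  (0x7u << ACTLR_CEC_SHIFT)   /* bits 5:3 */

/* Extract the three-bit CEC field. */
uint32_t cec_get(uint32_t actlr)
{
    return (actlr & ACTLR_CEC_MASK) >> ACTLR_CEC_SHIFT;
}

/* Return actlr with the CEC field replaced by cec (0..7). */
uint32_t cec_set(uint32_t actlr, uint32_t cec)
{
    assert(cec <= 0x7u);
    return (actlr & ~ACTLR_CEC_MASK) | (cec << ACTLR_CEC_SHIFT);
}

/* Mirrors "BIC r1, r1, #0x1 << 5": gives 000b only if CEC started at 100b. */
uint32_t cec_clear_bit5(uint32_t actlr)
{
    return actlr & ~(0x1u << 5);
}

/* Mirrors "BIC r1, r1, #0x7 << 3": gives 000b from any starting value. */
uint32_t cec_clear_field(uint32_t actlr)
{
    return actlr & ~ACTLR_CEC_MASK;
}
```

Starting from the default 100b both functions give 000b. Starting from, say, 110b, `cec_clear_bit5` leaves 010b while `cec_clear_field` still gives 000b, and neither touches bits outside 5:3.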
  • Peter,
    Last comment ...
    I think that the asm(" statements typically require a space in the first column (except for labels). This would not have compiled with TI tools.
    Also, the LDR R4 =0x3FF is not supported with TI tools.

    Re-iterating something from the previous post, the modification of the auxiliary control register is key because ECC is disabled for the cache until you enable it with this sequence.

    Best regards,
    Kevin Lavery
  • Kevin Lavery said:
    I think that the asm(" statements typically require a space in the first column (except for labels). This would not have compiled with TI tools.
    Also, the LDR R4 =0x3FF is not supported with TI tools.

    I think the asm was written for the GNU tools.

    In case it is useful for someone else, the following is the cache_clean code changed to be valid for the TI compiler:

    // Clean the data cache.
    // ARM Architecture Reference Manual
    // B2.2.7 (B2-1286) Performing Cache Maintenance Operations
    asm("CONST_3FF  .word 0x3ff");
    asm("CONST_7FFF .word 0x00007fff");
    
    #define CACHE_CLEAN \
        asm(" MRC p15, #1, r0, c0, c0, #1");  \
        asm(" ANDS R3, R0, #0x07000000");     \
        asm(" MOV R3, R3, LSR #23");          \
        asm(" BEQ Finished");                 \
        asm(" MOV R10, #0");                  \
        asm("Loop1:");                        \
        asm(" ADD R2, R10, R10, LSR #1");     \
        asm(" MOV R1, R0, LSR R2");           \
        asm(" AND R1, R1, #7");               \
        asm(" CMP R1, #2");                   \
        asm(" BLT Skip");                     \
        asm(" MCR p15, #2, R10, c0, c0, #0"); \
        asm(" ISB");                          \
        asm(" MRC p15, #1, R1, c0, c0, #0");  \
        asm(" AND R2, R1, #7");               \
        asm(" ADD R2, R2, #4");               \
        asm(" LDR R4,  CONST_3FF");           \
        asm(" ANDS R4, R4, R1, LSR #3");      \
        asm(" CLZ R5, R4");                   \
        asm(" MOV R9, R4");                   \
        asm("Loop2:");                        \
        asm(" LDR R7, CONST_7FFF");           \
        asm(" ANDS R7, R7, R1, LSR #13");     \
        asm("Loop3:");                        \
        asm(" ORR R11, R10, R9, LSL R5");     \
        asm(" ORR R11, R11, R7, LSL R2");     \
        asm(" MCR p15, #0, R11, c7, c10, #2");\
        asm(" SUBS R7, R7, #1");              \
        asm(" BGE Loop3");                    \
        asm(" SUBS R9, R9, #1");              \
        asm(" BGE Loop2");                    \
        asm("Skip:");                         \
        asm(" ADD R10, R10, #2");             \
        asm(" CMP R3, R10");                  \
        asm(" BGT Loop1");                    \
        asm(" DSB");                          \
        asm("Finished:")
    
    void cache_clean(void)
    {
        CACHE_CLEAN;
    }
    

    I performed a brief test of the cache_clean() function in a HALCoGen project for a TMS570LC4357 in which the cache was enabled in the HALCoGen project, and by comparing the dap and CortexR5 memory browser views confirmed that cache_clean() was causing data in the cache to be flushed to RAM.

    [Since the dap view is of the actual RAM, if there is modified data only in the CortexR5 cache the dap and CortexR5 memory browser views show different contents for the same address]
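The set/way loop in CACHE_CLEAN builds the operand for each "MCR p15, 0, R11, c7, c10, 2" from fields of the cache size ID register. The C sketch below reproduces that arithmetic on a host so the operand values can be inspected. The names (`cache_geom_t`, `decode_ccsidr`, `setway_operand`) are hypothetical, and the register value used in the example afterwards is an assumed geometry, not one read from a device.

```c
#include <assert.h>
#include <stdint.h>

/* Cache geometry decoded from a cache size ID register value. */
typedef struct {
    uint32_t line_shift;  /* log2(line length in bytes) */
    uint32_t max_way;     /* associativity - 1 */
    uint32_t max_set;     /* number of sets - 1 */
} cache_geom_t;

/* Same field extraction as the assembly: bits 2:0, 12:3 and 27:13. */
cache_geom_t decode_ccsidr(uint32_t ccsidr)
{
    cache_geom_t g;
    g.line_shift = (ccsidr & 0x7u) + 4u;
    g.max_way    = (ccsidr >> 3) & 0x3FFu;
    g.max_set    = (ccsidr >> 13) & 0x7FFFu;
    return g;
}

/* Portable count-leading-zeros, standing in for the CLZ instruction. */
static uint32_t clz32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0u) {
        return 32u;
    }
    while ((x & 0x80000000u) == 0u) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Operand for a clean by set/way: way in the top bits, set above the
 * line offset, zero-based cache level in bits 3:1. */
uint32_t setway_operand(const cache_geom_t *g, uint32_t level,
                        uint32_t way, uint32_t set)
{
    assert(way <= g->max_way && set <= g->max_set);
    uint32_t way_shift = clz32(g->max_way);
    /* A direct-mapped cache gives a shift of 32, which C does not define. */
    uint32_t way_bits = (way_shift < 32u) ? (way << way_shift) : 0u;
    return way_bits | (set << g->line_shift) | (level << 1);
}
```

As an example, assume a 32 KB, 4-way cache with 32-byte lines, whose geometry fields encode as 0x001FE019 (the attribute bits 31:28 are left at zero here). That decodes to a line shift of 5, a maximum way of 3 and a maximum set of 255, so the loop starts at operand 0xC0001FE0 for level 0 and counts down to 0x00000000.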