AM2432: Example of R5F Memory ECC single and double bit error injection.

Part Number: AM2432

Tool/software:

Hello

Do you have any examples of R5F Memory ECC single and double-bit error injection programs?

We are implementing the R5F.RAM-T1 - Software Test for Memory ECC diagnostics, which requires introducing errors.

SDK: mcu_plus_sdk_am243x_09_00_00_35

Thanks

Jimmy

  • Hello Jimmy,

    My understanding is that you are already able to test the TCM memories by injecting errors. For the R5F cache memories, you can refer to the FAQ below.

    https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1427813/faq-how-to-test-ecc-for-r5f-cache-memories-on-am6x-am243-devices

    The FAQ is mostly generic to any R5F core; you only need to use the corresponding ECC aggregator for AM243x to inject errors.

    Regards,

    Nihar Potturu. 

  • Hi Potturu

    I have read the post, and there are several points where I am not sure my understanding is correct.

    1. The error-injection process in the post matches the injection process in SDL, so to inject errors I only need to call the SDL library and pass in the correct ID.
    2. Regardless of whether it's ATCM or DCACHE, any attempt to insert a double-bit error will result in a data abort.
    3. R5 ECC automatically corrects single-bit errors without requiring manual intervention.

    Please confirm if my understanding is accurate.

    Thanks

    Jimmy

  • Hi Potturu

    I followed the guidance in the post and ran a DCache fault injection test, which uncovered these issues:

    1. During the injection process, writing 0x28 to address 0x3f00d014 had no effect.

    2. After execution completes, an ESM interrupt is triggered but never exits; it instead loops at this location, comparing register values as follows.

    3. I tried to correct this by writing 0 to address 0x3f00d014 after entering the interrupt handler, but these writes were still ineffective. Moreover, I could not find register 0x3f00d014 in the AM2432 TRM.

    Please help me understand what might be causing these issues.

    Thanks

    Jimmy

  • Hello Jimmy,

    Regardless of whether it's ATCM or DCACHE, any attempt to insert a double-bit error will result in a data abort.

    Yes, your understanding is correct.

    R5 ECC automatically corrects single-bit errors without requiring manual intervention.

    That is correct.
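    To illustrate the two answers above (single-bit errors are corrected, double-bit errors are only detected), here is a minimal host-side sketch of a SECDED code: an extended Hamming code over a 4-bit value. It is not the R5F's actual ECC scheme or width; the names `secded_encode`, `secded_decode`, and `ecc_status_t` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this illustration only. */
typedef enum { ECC_OK = 0, ECC_CORRECTED = 1, ECC_UNCORRECTABLE = 2 } ecc_status_t;

/* Returns 1 if an odd number of bits are set in v. */
static uint8_t parity8(uint8_t v)
{
    v ^= (uint8_t)(v >> 4);
    v ^= (uint8_t)(v >> 2);
    v ^= (uint8_t)(v >> 1);
    return (uint8_t)(v & 1u);
}

/* Encode 4 data bits into an 8-bit codeword.
 * Bit positions 1, 2, 4 hold Hamming parity; 3, 5, 6, 7 hold data;
 * position 0 holds an overall parity bit over the other seven. */
uint8_t secded_encode(uint8_t nibble)
{
    uint8_t d0 = nibble & 1u, d1 = (nibble >> 1) & 1u;
    uint8_t d2 = (nibble >> 2) & 1u, d3 = (nibble >> 3) & 1u;
    uint8_t cw = (uint8_t)((d0 << 3) | (d1 << 5) | (d2 << 6) | (d3 << 7));
    cw |= (uint8_t)((d0 ^ d1 ^ d3) << 1); /* covers positions 1, 3, 5, 7 */
    cw |= (uint8_t)((d0 ^ d2 ^ d3) << 2); /* covers positions 2, 3, 6, 7 */
    cw |= (uint8_t)((d1 ^ d2 ^ d3) << 4); /* covers positions 4, 5, 6, 7 */
    cw |= parity8(cw);                    /* make the total parity even */
    return cw;
}

/* Decode a codeword. On ECC_UNCORRECTABLE the output value is not valid. */
ecc_status_t secded_decode(uint8_t cw, uint8_t *nibble_out)
{
    assert(nibble_out != NULL);
    uint8_t syndrome = 0;
    for (uint8_t pos = 1; pos < 8; pos++) {
        if (cw & (1u << pos)) {
            syndrome ^= pos; /* XOR of set positions is 0 for a valid word */
        }
    }
    ecc_status_t status = ECC_OK;
    if (parity8(cw)) {
        /* Odd overall parity: one bit flipped. The syndrome is its position
         * (0 means the overall parity bit itself), so flip it back. */
        cw ^= (uint8_t)(1u << syndrome);
        status = ECC_CORRECTED;
    } else if (syndrome != 0) {
        /* Even overall parity but non-zero syndrome: two bits flipped. */
        status = ECC_UNCORRECTABLE;
    }
    *nibble_out = (uint8_t)(((cw >> 3) & 1u) | (((cw >> 5) & 1u) << 1) |
                            (((cw >> 6) & 1u) << 2) | (((cw >> 7) & 1u) << 3));
    return status;
}
```

    Flipping any one bit of an encoded word yields `ECC_CORRECTED` with the original value restored, while flipping any two bits yields `ECC_UNCORRECTABLE`, which is the case that surfaces as a data abort on the R5F.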

    3. I tried to correct this by writing 0 to address 0x3f00d014 after entering the interrupt handler, but these writes were still ineffective. Moreover, I could not find register 0x3f00d014 in the AM2432 TRM.

    Use the code below to clear the register at offset 0x14:

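    // Add the code below inside the ISR/abort handler whenever an ECC error is detected.

    // if (error == R5F cache ECC error) {
        // Registers are accessed through volatile pointers so the compiler
        // cannot drop the writes or hoist the read out of the polling loop.
        volatile uint32_t *eccCtrl = (volatile uint32_t *)0x3f00d014;   // ECC control register
        *eccCtrl = 0x0;                                                 // Stop the error injection
        volatile uint32_t *eccVector = (volatile uint32_t *)0x3f00d008; // ECC vector register
        *eccVector = 0x148000;
        // Poll the read-done bit (bit 24) to ensure the ECC aggregator is properly updated
        while (((*eccVector >> 24) & 0x1) != 1)
        {
            ;
        }
    // }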

    You will be able to come out of the ISR only after the 0x3f00d014 register gets set to 0. This will stop the error injection. 

    0x3f00d000 is the base address of Wkup R5 ECC Aggr. 
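    The clear-and-poll sequence above can also be written with a bounded poll, so that a handler cannot spin forever if the read-done bit never sets. This is only a sketch: the offsets, the 0x148000 value, and bit 24 come from this thread, while `reg_io_t`, `ecc_stop_injection`, and the `sim_*` helpers are hypothetical names. The `sim_*` functions are a host-side stand-in for the hardware so the logic can be tried off-target; on the device, `read32`/`write32` would wrap volatile pointer accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ECC_AGGR_VECTOR_OFFSET 0x08u       /* ECC vector register (from the thread) */
#define ECC_AGGR_CTRL_OFFSET   0x14u       /* ECC control register (from the thread) */
#define ECC_VECTOR_READ_CMD    0x148000u   /* value written in the thread's snippet */
#define ECC_VECTOR_RD_DONE     (1u << 24)  /* read-done bit polled in the thread */

/* Hypothetical register-access hooks so the same logic runs on target or host. */
typedef struct {
    uint32_t (*read32)(uintptr_t addr);
    void     (*write32)(uintptr_t addr, uint32_t value);
} reg_io_t;

/* Clear the control register, trigger the read, and poll read-done.
 * Returns 0 on success, -1 if the bit was not seen within max_polls reads. */
int ecc_stop_injection(const reg_io_t *io, uintptr_t aggr_base, uint32_t max_polls)
{
    assert(io != NULL && io->read32 != NULL && io->write32 != NULL);
    io->write32(aggr_base + ECC_AGGR_CTRL_OFFSET, 0u);
    io->write32(aggr_base + ECC_AGGR_VECTOR_OFFSET, ECC_VECTOR_READ_CMD);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (io->read32(aggr_base + ECC_AGGR_VECTOR_OFFSET) & ECC_VECTOR_RD_DONE) {
            return 0;
        }
    }
    return -1;
}

/* ---- Host-side stand-in for the aggregator (illustration only) ---- */
static uint32_t sim_regs[8];        /* covers offsets 0x00..0x1C */
static int sim_hw_responds = 1;     /* set to 0 to simulate a stuck aggregator */

static uint32_t sim_read32(uintptr_t addr)
{
    return sim_regs[(addr & 0x1Fu) / 4u];
}

static void sim_write32(uintptr_t addr, uint32_t value)
{
    uint32_t offset = (uint32_t)(addr & 0x1Fu);
    if (offset == ECC_AGGR_VECTOR_OFFSET && sim_hw_responds) {
        value |= ECC_VECTOR_RD_DONE; /* pretend the hardware finished the read */
    }
    sim_regs[offset / 4u] = value;
}

const reg_io_t sim_io = { sim_read32, sim_write32 };
```

    A handler built this way would call `ecc_stop_injection(&io, 0x3f00d000u, limit)` and treat a `-1` return as a separate failure, rather than staying inside the ISR.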

    You can find the descriptions of the registers at offsets 0x14 and 0x8 below:

    Regards,

    Nihar Potturu.

  • Hi Potturu

    I cleared the value at address 0x3f00d014 in the interrupt handler, now it is 0, but I am still entering the interrupt handler continuously.

    When ESM0_CFG:CFG_HI_PRI's LVL is not set to 0xFFFF, it becomes impossible to exit the ESM processing routine.

    In addition, if it's not an injected DCache fault, would normal faults require clearing the 0x3f00d014 register? Normally, such events trigger only once.

    Please help me.

    Thanks

    Jimmy

  • Hello Jimmy,

    In addition, if it's not an injected DCache fault, would normal faults require clearing the 0x3f00d014 register? Normally, such events trigger only once.

    Normally, it will not be required to clear the 0x3f00d014 register.

    I cleared the value at address 0x3f00d014 in the interrupt handler, now it is 0, but I am still entering the interrupt handler continuously.

    Can you try running your example from non-cached memory and placing only the 32KB arrays "a" and "b" in cached DDR/MSRAM?

    Regards,

    Nihar Potturu.

  • Hi Potturu

    Regarding the ECC injection handling for cache, we are modifying our code tests.

    The reply above states that injecting a double-bit ECC error into the R5F results in a data abort. Given this, do we still need to run the Memory ECC diagnostic test following the safety manual's steps for R5F.RAM-T1 - Software Test of Memory ECC? If so, how do we exit the data abort after running the double-bit injection test? Are there any example programs?

    Thanks

    Jimmy

  • Hi Jimmy,

        I have added two APIs to disable and enable aborts here: /cfs-file/__key/communityserver-discussions-components-files/908/sdl_5F00_ecc_5F00_utils.S

    /cfs-file/__key/communityserver-discussions-components-files/908/sdl_5F00_ecc_5F00_utils.h

        void SDL_ECC_UTILS_enableABORT(void);
        void SDL_ECC_UTILS_disableABORT(void);

         Please check and give feedback. Thanks.

    Linjun

  • Hi Linjun

    I tested the code after adding it, and when I injected the fault before calling the shutdown function, the system still entered an abort state.

    Having reviewed the R5 manual, this piece of code appears to switch between Thumb state and ARM state, which has nothing to do with the abort process. Furthermore, the manual explicitly states that these settings cannot be modified.

    Thanks

    Jimmy

  • Hello Jimmy,

    When you inject a double-bit error, you will get a data abort, and that cannot be avoided.

    You can instead modify the abort handler to deal with an injected error differently. Here is a small idea:

    1. Write a specific marker value to some memory (MSRAM/DDR) before injecting the error. (No other program may write to this address.)

    2. In the abort handler, check whether the ESM error event is set and the marker from step 1 is present in the memory. If yes, you can come out of the abort handler directly, as it is an injected error.

    I am attaching the pseudocode below:

    // R5F application:

    HW_WR_REG32(0x80000000, 0x12345678); // Write a known marker value to an address in DDR
    SDL_INJECT_ERROR(TCMA, 2 bit);       // Pseudocode: inject a 2-bit error into TCM


    // Abort handler:
    // If the ESM error shows a TCM ECC error and the marker value at 0x80000000 is as expected
    if (error == TCM error && HW_RD_REG32(0x80000000) == 0x12345678)
    {
        DebugP_log("Error injection successful error detected\r\n");
        HW_WR_REG32(0x80000000, 0x0); // Clear the marker so that an actual error gets caught
    }
    else // Usual abort handler
    {
        ...
        ...
        ...
    }
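    The marker check in the pseudocode can be isolated into a small decision function, sketched below so it can be exercised on a host before being wired into a real abort handler. The names `ecc_test_arm`, `ecc_classify_abort`, and `abort_kind_t` are hypothetical; the marker value is the one from the pseudocode. How the TCM ECC error flag is read from the ESM is left to the caller and is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ECC_INJECT_MARKER 0x12345678u /* marker value from the pseudocode */

/* Hypothetical result type for this illustration. */
typedef enum { ABORT_INJECTED_TEST = 0, ABORT_REAL_FAULT = 1 } abort_kind_t;

/* Call right before injecting the error: arms the marker word. */
void ecc_test_arm(volatile uint32_t *marker)
{
    assert(marker != NULL);
    *marker = ECC_INJECT_MARKER;
}

/* Call from the abort handler. tcm_ecc_error_flagged is whatever the
 * caller derived from the ESM status (assumption: obtained elsewhere). */
abort_kind_t ecc_classify_abort(volatile uint32_t *marker, bool tcm_ecc_error_flagged)
{
    assert(marker != NULL);
    if (tcm_ecc_error_flagged && *marker == ECC_INJECT_MARKER) {
        *marker = 0u; /* disarm so a later, genuine error takes the normal path */
        return ABORT_INJECTED_TEST;
    }
    return ABORT_REAL_FAULT;
}
```

    On the target, `marker` would point at the reserved word in MSRAM/DDR (0x80000000 in the pseudocode). Because the marker is cleared on the first match, a second abort without re-arming is treated as a real fault.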

    Can you let me know if this kind of implementation solves your issue? 

    Regards,

    Nihar Potturu. 

  • Hi Potturu

    Thank you for your response, I understand this process.

    However, I'm still curious about how to return to the normal program in a 'data abort' scenario? Does the original program stack information get saved after entering 'data abort'? Are there any sample programs that can identify user input language?

    Thanks

    Jimmy