RM48L952: SafeTI: SRAM test SRAM_PAR_ADDR_CTRL_SELF_TEST looks to fail in case FIQ interrupts are enabled

Jarkko Silvasti

Part Number: RM48L952
Other Parts Discussed in Thread: TMS570LS1224

I am currently implementing run-time checks with SafeTI (the same tests work during start-up).

All FEE tests work OK (including fault injections), so I moved on to the SRAM tests and immediately received an error. The SRAM_PAR_ADDR_CTRL_SELF_TEST test is causing problems.

SL_SelfTest_SRAM() returns ST_FAIL in SL_SelfTest_Result* sram_stResult

The reason for the failure is this check (RAM write part):
            if ((TCRAM_RAMERRSTAT_WADDRPAR_FAIL == (uint32)(sl_tcram1REG->RAMERRSTATUS & TCRAM_RAMERRSTAT_WADDRPAR_FAIL)) && /* Write parity error on B1 */
                    (TCRAM_RAMERRSTAT_WADDRPAR_FAIL == (uint32)(sl_tcram2REG->RAMERRSTATUS & TCRAM_RAMERRSTAT_WADDRPAR_FAIL)) && /* Write parity error on B2 */
                    (0 != (sl_esmREG->SSR2 & ((uint32)1U << ESM_G2ERR_B0TCM_ADDPAR))) &&    /* B1 Parity error */
                    (0 != (sl_esmREG->SSR2 & ((uint32)1U << ESM_G2ERR_B1TCM_ADDPAR)))) {    /* B2 parity error */
                *sram_stResult = ST_PASS;
            } else {
                *sram_stResult = ST_FAIL;
            }

The failing term is the sl_tcram2REG check: the status register holds 0x100, whereas 0x200 is expected.
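
To make the two values easier to compare, here is a minimal decoding sketch. It assumes that 0x200 is the write address parity flag (the value TCRAM_RAMERRSTAT_WADDRPAR_FAIL must have for the check above to expect it) and that 0x100 is the read address parity flag (consistent with the read parity error reported later in this thread). The macro, enum and function names are hypothetical, not SafeTI names, and the bit positions should be verified against the device TRM.

```c
#include <stdint.h>

/* Assumed bit masks, inferred from the values quoted in this thread.
 * 0x200: expected by the test (write address parity failure).
 * 0x100: observed on tcram2 (assumed read address parity failure).
 * Verify both against the device TRM before relying on them. */
#define SKETCH_RADDR_PAR_FAIL 0x100U
#define SKETCH_WADDR_PAR_FAIL 0x200U

typedef enum {
    ADDR_PAR_NONE = 0,
    ADDR_PAR_READ_ONLY,
    ADDR_PAR_WRITE_ONLY,
    ADDR_PAR_READ_AND_WRITE
} addr_par_kind_t;

/* Classify a raw RAMERRSTATUS value by which address parity flags are set. */
addr_par_kind_t classify_addr_parity(uint32_t ramerrstatus)
{
    int rd = (ramerrstatus & SKETCH_RADDR_PAR_FAIL) != 0U;
    int wr = (ramerrstatus & SKETCH_WADDR_PAR_FAIL) != 0U;

    if (rd && wr) {
        return ADDR_PAR_READ_AND_WRITE;
    }
    if (wr) {
        return ADDR_PAR_WRITE_ONLY;
    }
    if (rd) {
        return ADDR_PAR_READ_ONLY;
    }
    return ADDR_PAR_NONE;
}
```

Under these assumptions, the tcram1 value 0x200 classifies as a write-only failure (what the test wants), while the tcram2 value 0x100 classifies as a read-only failure, which would point to a RAM read having happened on B2 while parity was still overridden.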

From this line
sramEccTestBuff[0] = 0xAAAAAAAABBBBBBBBUL; /* Generate write parity error on B1 & B2 */

the code jumps to the ESM interrupt handler (sl_esm_high_intr_handler()), and by the time the following line is reached (after the CPU registers have been pushed to the stack, which is in RAM), 0x100 is already present in the tcram2 status:
esmOffH = (uint8)sl_esmREG->IOFFHR;

So it looks most likely that this test is not meant to be run while FIQ interrupts are enabled, which means it cannot be run during run time of the SW.

I double-checked the findings by printing the registers just before the register content is checked (this should be safe here, based on the comments in the preceding lines):
            /* Restore parity, so that we can use the stack */
            /*SAFETYMCUSW 134 S MR: 12.2 <APPROVED> Comment_5*/
            sl_tcram1REG->RAMCTRL = (sl_tcram1REG->RAMCTRL & 0xF0FFFFFFU);
            /*SAFETYMCUSW 134 S MR: 12.2 <APPROVED> Comment_5*/
            sl_tcram2REG->RAMCTRL = (sl_tcram2REG->RAMCTRL & 0xF0FFFFFFU);
            DBG_PRINT( "TCRAM1_STAT: 0x%x, TCRAM2_STAT: 0x%x\r\n", sl_tcram1REG->RAMERRSTATUS, sl_tcram2REG->RAMERRSTATUS );
            if ((TCRAM_RAMERRSTAT_WADDRPAR_FAIL == (uint32)(sl_tcram1REG->RAMERRSTATUS & TCRAM_RAMERRSTAT_WADDRPAR_FAIL)) && /* Write parity error on B1 */
                    (TCRAM_RAMERRSTAT_WADDRPAR_FAIL == (uint32)(sl_tcram2REG->RAMERRSTATUS & TCRAM_RAMERRSTAT_WADDRPAR_FAIL)) && /* Write parity error on B2 */
                    (0 != (sl_esmREG->SSR2 & ((uint32)1U << ESM_G2ERR_B0TCM_ADDPAR))) &&    /* B1 Parity error */
                    (0 != (sl_esmREG->SSR2 & ((uint32)1U << ESM_G2ERR_B1TCM_ADDPAR)))) {    /* B2 parity error */
                *sram_stResult = ST_PASS;
            } else {
                *sram_stResult = ST_FAIL;
            }

And the result is the same as what the debugger shows...
TCRAM1_STAT: 0x200, TCRAM2_STAT: 0x100<CR><LF>

I also tried printing after the if statement, and the result is the same...

It is not mentioned anywhere that this test cannot be run while FIQ interrupts are enabled. (I cannot see any other meaningful difference to the start-up phase test, since this test disables interrupts by itself, so a context switch should not happen while the test is executing.)

Is this interpretation correct? If yes, could you please list all SafeTI tests which should not be run during run time? PBISTs and so on are quite obvious, but tests like this one are not.

over 6 years ago

QJ Wang over 6 years ago

TI__Guru 186196 points

Hello Jarkko,

We got your question, and will review it and give you our feedback soon.

Regards,
QJ

Jarkko Silvasti over 6 years ago in reply to QJ Wang

Expert 1395 points

Just noticed something

After the read part has been tested, there is a code comment which says NO RAM ACCESS:
        if (ST_PASS == *sram_stResult) {
            /* Override parity (actually flip).. NO RAM ACCESS from this point (except intentional errors) */
            /*SAFETYMCUSW 134 S MR: 12.2 <APPROVED> Comment_5*/
            sl_tcram1REG->RAMCTRL = (sl_tcram1REG->RAMCTRL & 0xF0FFFFFFU) | TCRAM_RAMCTRL_ADDR_PARITY_OVER;
            /*SAFETYMCUSW 134 S MR: 12.2 <APPROVED> Comment_5*/
            sl_tcram2REG->RAMCTRL = (sl_tcram2REG->RAMCTRL & 0xF0FFFFFFU) | TCRAM_RAMCTRL_ADDR_PARITY_OVER;
            sramEccTestBuff[0] = 0xAAAAAAAABBBBBBBBUL; /* Generate write parity error on B1 & B2 */
            sramEccTestBuff[1] = 0xBBBBBBBBAAAAAAAAUL;

And the ESM handler uses RAM for this test: it reads the test activity flag from RAM and then increments and sets variables.

        case ESM_G2ERR_B0TCM_ADDPAR:
           if (TRUE == SL_FLAG_GET(SRAM_PAR_ADDR_CTRL_SELF_TEST)) {
               callbackCancelCount++;
               cancelCallback = TRUE;
            }

        case ESM_G2ERR_B1TCM_ADDPAR:
           if (TRUE == SL_FLAG_GET(SRAM_PAR_ADDR_CTRL_SELF_TEST)) {
               callbackCancelCount++;
               cancelCallback = TRUE;
            }
           callbkParam1 = sl_tcram2REG->RAMPERADDR;
            /* No status to clear here */
            break;

esmGroup2Handler also uses RAM when it stacks the registers:
static void esmGroup2Handler (uint8 esmChannel)
{
esmGroup2Handler:
       0xb878: 0xe92d 0x41f0 PUSH.W    {R4-R8, LR}
       0xb87c: 0x0004         MOVS      R4, R0
    boolean cancelCallback = FALSE;

Based on this I made a little experiment, which was successful. (If you call esmGroup2Handler and return from there right after these cases, it won't work, since the registers are stacked on the function call, which uses RAM.)
        else if (esmOffH >= 32u) {
            /* Group 2 channel 0 to 31 */
            esmChannel = esmOffH - 32u;
            if( esmChannel == ESM_G2ERR_B0TCM_ADDPAR || esmChannel == ESM_G2ERR_B1TCM_ADDPAR )
            {
                return; // NO RAM ACCESS
            }
            else
            {
                esmGroup2Handler(esmChannel);
            }
        }

With this modification the test PASSed. Of course this modification cannot be used as is, since it now also masks real errors (or, more correctly, it does not inform the application). Basically, the test activity information would have to be kept in a CPU register, peripheral RAM or somewhere else outside the RAM if this test needs to be run at run time...

Is the purpose of SafeTI that integrators work as debuggers for TI? We really can't wait days to get feedback. For example, the reason for this behavior should have been discoverable by an expert in a couple of minutes or hours, yet the question has been open for a couple of days. Last time it took over 2 weeks to get confirmation of a really simple bug in the DMA test...

There is a really urgent need for a proper user manual which states ALL the bugs and describes how every function should be used, and when and when not to use it. Obviously, for example, this test cannot be run if FIQ is enabled, since the handler uses RAM... Do you guys really test the code at all? If yes, then why are these constraints not stated in the manual, for example that this test must not be run if FIQ is enabled, meaning it cannot be run during run time?

Is it enough that this test is run only at start-up? Since we cannot reliably map unique identifiers to SafeTI tests, we have made a policy that every test is run (for example in SRAM), in order to be able to claim that all unique identifiers which are said to be covered by running the _SRAM function are done. This is again one really big "bug" in the manual: it only states the function, not the arguments for it.

- is this test RAM13?

At least some tests must be run during run time (diagnosing the diagnostics), and it looks like this ESM channel is tested only by this test. So do we now need to drop some unique identifiers concerning SRAM from the FMEDA Excel sheet, since this test is executed only at start-up? And please DO NOT answer that "it depends on your application".

Jarkko Silvasti over 6 years ago in reply to Jarkko Silvasti

Expert 1395 points

It looks like that "test fix" doesn't work either, even though it looked to work at first sight, since sl_esm_high_intr_handler() also uses the stack on entry... I am also wondering why this behaves a bit differently depending on whether you step with the debugger or just run the code with the debugger; basically the test always fails, but at a different part of the code...

Maybe an asm handler should be implemented and temporarily switched into the place of sl_esm_high_intr_handler to prevent stack usage. In the asm handler: restore parity, then check if the test flag is active; if yes, return from FIQ, else call sl_esm_high_intr_handler() since this is some real error? Or something like that. This should then mimic the behavior of the start-up phase, where FIQ is not enabled at all...
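
The decision such a handler would make can be sketched in plain C as follows. This models only the branching logic, not a working FIQ handler: on the target it would have to be written in assembly with the test flag held somewhere outside TCRAM, and parity would have to be restored before anything else. The channel numbers are placeholders and all names are hypothetical, not SafeTI APIs. The channel check is an addition taken from the earlier experiment, so that unrelated errors raised during the test are still forwarded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder ESM group 2 channel numbers for the B0/B1 TCM address parity
 * errors. These values are assumptions for illustration only; on target use
 * the library's ESM_G2ERR_B0TCM_ADDPAR and ESM_G2ERR_B1TCM_ADDPAR. */
#define SKETCH_G2_B0TCM_ADDPAR 10U
#define SKETCH_G2_B1TCM_ADDPAR 12U

typedef enum {
    FIQ_ACTION_RETURN,       /* expected test error: return from FIQ quietly */
    FIQ_ACTION_FULL_HANDLER  /* anything else: forward to the normal handler */
} fiq_action_t;

/* Decide what the temporary handler should do for a given group 2 channel.
 * addr_par_test_active stands in for the test activity flag, which on the
 * target must not live in the RAM under test. */
fiq_action_t decide_fiq_action(uint8_t esm_channel, bool addr_par_test_active)
{
    bool is_addr_par = (esm_channel == SKETCH_G2_B0TCM_ADDPAR) ||
                       (esm_channel == SKETCH_G2_B1TCM_ADDPAR);

    if (is_addr_par && addr_par_test_active) {
        return FIQ_ACTION_RETURN;
    }
    return FIQ_ACTION_FULL_HANDLER;
}
```

With the test flag cleared, an address parity error is always forwarded, so real errors are not masked outside the test window.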

Bharat@Honeywell over 6 years ago in reply to Jarkko Silvasti

Prodigy 180 points

Hello TI,

I am also facing the same issue on the TMS570LS1224 with the self-test API call below.
retVal = SL_SelfTest_SRAM(SRAM_PAR_ADDR_CTRL_SELF_TEST, TRUE, &failInfoTCMRAM);

I am getting a read parity error on B2 instead of a write parity error on B2.

The self-test API passed when I swapped the two buffer writes below in the SL_SelfTest_SRAM() function.
sramEccTestBuff[1] = 0xBBBBBBBBAAAAAAAAUL;
sramEccTestBuff[0] = 0xAAAAAAAABBBBBBBBUL;

It is very strange behaviour, and I don't understand how it passed.
Can you please provide more information on the usage of the self-test API below?
SL_SelfTest_SRAM(SRAM_PAR_ADDR_CTRL_SELF_TEST, TRUE, &failInfoTCMRAM);

Regards,
Bharat Mallela

Chuck Davenport over 6 years ago in reply to Bharat@Honeywell

TI__Guru 59540 points

Hello Bharat,

I am also currently investigating why this change would cause the test case to pass. I will come back when I have an answer.
