TMS570LS3137: Simulation of SRAM double-bit ECC error

Chuck Wong

Part Number: TMS570LS3137
Other Parts Discussed in Thread: HALCOGEN

Hello there,

In our project, ECC is enabled to detect SRAM double-bit errors. We're looking for a way to simulate a double-bit ECC error and observe its effect.

However, upon reading SPNU499c paragraph 6.6, Emulation / Debug Mode Behavior of the TCRAM module, it looks like it is not possible to force this error with the debugger in debug mode.

What techniques are there to confirm that SRAM ECC is correctly enabled and that it takes effect (such as asserting the nERROR pin) when an error is detected, much like the flash SECDED case, where Flash Error Detection and Correction Control Register 1 is used and a predefined address located in the OTP is read?

Many thanks!

over 3 years ago

0 Frank Livingston over 3 years ago

TI__Mastermind 29778 points

Hi, our expert is out of the office. Please expect a delayed response.

0 QJ Wang over 3 years ago in reply to Frank Livingston

TI__Guru**** 198766 points

Hi Chuck,

You can inject an SRAM ECC error. Please refer to the function checkB0RAMECC() in sys_selftest.c:

void checkB0RAMECC(void)
{
    volatile uint64 ramread = 0U;
    volatile uint32 regread = 0U;
    uint32 tcram1ErrStat, tcram2ErrStat = 0U;

    uint64 tcramA1_bk = tcramA1bit;
    uint64 tcramA2_bk = tcramA2bit;
    volatile uint32 i;
/* USER CODE BEGIN (36) */
/* USER CODE END */

    /* enable writes to ECC RAM, enable ECC error response */
    tcram1REG->RAMCTRL = 0x0005010AU;
    tcram2REG->RAMCTRL = 0x0005010AU;

    /* the first 1-bit error will cause an error response */
    tcram1REG->RAMTHRESHOLD = 0x1U;
    tcram2REG->RAMTHRESHOLD = 0x1U;

    /* allow SERR to be reported to ESM */
    tcram1REG->RAMINTCTRL = 0x1U;
    tcram2REG->RAMINTCTRL = 0x1U;

    /* cause a 1-bit ECC error */
    _coreDisableRamEcc_();
    tcramA1bitError ^= 0x1U;
    _coreEnableRamEcc_();

    /* disable writes to ECC RAM */
    tcram1REG->RAMCTRL = 0x0005000AU;
    tcram2REG->RAMCTRL = 0x0005000AU;

    /* read from location with 1-bit ECC error */
    ramread = tcramA1bit;

    /* Check for error status */
    tcram1ErrStat = tcram1REG->RAMERRSTATUS & 0x1U;
    tcram2ErrStat = tcram2REG->RAMERRSTATUS & 0x1U;
    /*SAFETYMCUSW 139 S MR:13.7 <APPROVED> "LDRA Tool issue" */
    /*SAFETYMCUSW 139 S MR:13.7 <APPROVED> "LDRA Tool issue" */
    if((tcram1ErrStat == 0U) && (tcram2ErrStat == 0U))
    {
        /* TCRAM module does not reflect 1-bit error reported by CPU */
        selftestFailNotification(CHECKB0RAMECC_FAIL1);
    }
    else
    {
        /* clear SERR flag */
        tcram1REG->RAMERRSTATUS = 0x1U;
        tcram2REG->RAMERRSTATUS = 0x1U;

        /* clear status flags for ESM group1 channels 26 and 28 */
        esmREG->SR1[0U] = 0x14000000U;
    }

    /* enable writes to ECC RAM, enable ECC error response */
    tcram1REG->RAMCTRL = 0x0005010AU;
    tcram2REG->RAMCTRL = 0x0005010AU;

    /* cause a 2-bit ECC error */
    _coreDisableRamEcc_();
    tcramA2bitError ^= 0x3U;
    _coreEnableRamEcc_();

    /* read from location with 2-bit ECC error this will cause a data abort to be generated */
    ramread = tcramA2bit;

    /* delay before restoring the ram value */
    /*SAFETYMCUSW 134 S MR:12.2 <APPROVED> "Wait for few clock cycles (Value of i not used)" */
    /*SAFETYMCUSW 134 S MR:12.2 <APPROVED> "Wait for few clock cycles (Value of i not used)" */
    for(i=0U;i<10U;i++)
    {
    }/* Wait */

    regread = tcram1REG->RAMUERRADDR;
    regread = tcram2REG->RAMUERRADDR;

    /* disable writes to ECC RAM */
    tcram1REG->RAMCTRL = 0x0005000AU;
    tcram2REG->RAMCTRL = 0x0005000AU;

    /* Compute correct ECC */
    tcramA1bit = tcramA1_bk;
    tcramA2bit = tcramA2_bk;

/* USER CODE BEGIN (37) */
/* USER CODE END */
}
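
As a side note on the ESM constant in the listing above, the following host-side sketch shows how 0x14000000 follows from channels 26 and 28. It is not part of HALCoGen: the helper name is made up for illustration, and it assumes that channel n corresponds to bit n of the status register, which is what the comment and the constant suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value written to esmREG->SR1[0U] in the listing above. */
#define ESM_SR1_CLEAR_VALUE 0x14000000U

/* Hypothetical helper: build a status mask with one bit per channel number.
 * Assumes channel n maps to bit n of a 32-bit status register. */
static uint32_t esm_channel_mask(const unsigned *channels, size_t count)
{
    uint32_t mask = 0U;
    for (size_t i = 0U; i < count; i++) {
        assert(channels[i] < 32U);            /* must fit in one register */
        mask |= (uint32_t)1U << channels[i];
    }
    return mask;
}
```

Calling esm_channel_mask() with the channel list {26, 28} should reproduce the value used in the listing.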

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Hi QJ,

We don't use HALCoGen in our development, so I find the above function hard to decode. Is there another explanation, or a place in SPNU499c that shows the exact sequence and procedure for simulating a double-bit SRAM ECC error?

Thanks.

0 QJ Wang over 3 years ago in reply to Chuck Wong

TI__Guru**** 198766 points

Hi Chuck,

The TCRAM is protected by ECC allowing the CPU to correct any single-bit errors and detect double-bit errors within a 64-bit value. The error correction codes (ECC) are stored in the RAM memory space as well. For every 64-bit read from the RAM, an 8-bit ECC is also read by the CPU on its ECC bus.

The ECC memory can also be accessed directly. Writes to the ECC space are enabled by setting the ECC_WR_EN bit of the RAMCTRL register. A two-bit ECC error can be injected as follows:

0. Back up the SRAM content at the address you want to test, for example 0x08000032 (aligned 64-bit)

1. Enable ECC write: RAMCTRL->ECC_WR_EN = 1;

2. Disable RAM ECC

3. Toggle 2 bits of the ECC code (at 0x08400032) for the SRAM content at 0x08000032

4. Enable RAM ECC

5. Disable ECC write to SRAM: RAMCTRL->ECC_WR_EN = 0;

6. Read the SRAM content (0x08000032): reading from a location with a 2-bit ECC error will cause a data abort

7. Clear the ESM error in the data abort handler
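
The ordering of steps 0 to 5 can be sketched on a host PC with a simple model. Everything below is an assumption made for illustration: the struct is a stand-in for the real registers and memory, not a TI definition, and steps 6 and 7 (the faulting read and the abort handler) only make sense on the target, so they are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the hardware state touched by steps 0-5.
 * Every field is an assumption for illustration, not a TI definition. */
typedef struct {
    uint32_t data_word;      /* word in the data space                */
    uint32_t ecc_word;       /* matching word in the ECC space        */
    int      ecc_wr_en;      /* models RAMCTRL.ECC_WR_EN              */
    int      cpu_ecc_check;  /* models the CPU-side RAM ECC enable    */
} mock_tcram_t;

/* Count set bits, used to confirm that exactly two bits were flipped. */
static unsigned popcount32(uint32_t v)
{
    unsigned n = 0U;
    while (v != 0U) {
        v &= v - 1U;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Steps 0-5 in order. Returns the backup taken in step 0. */
static uint32_t inject_two_bit_ecc_error(mock_tcram_t *ram)
{
    uint32_t backup = ram->data_word;  /* 0. back up the data word      */
    ram->ecc_wr_en = 1;                /* 1. enable ECC writes          */
    ram->cpu_ecc_check = 0;            /* 2. disable RAM ECC checking   */
    if (ram->ecc_wr_en) {
        ram->ecc_word ^= 0x3U;         /* 3. toggle two ECC bits        */
    }
    ram->cpu_ecc_check = 1;            /* 4. re-enable RAM ECC checking */
    ram->ecc_wr_en = 0;                /* 5. disable ECC writes         */
    return backup;
}
```

On the target, each assignment to the model would become a register write, a CP15 access, or a pointer access to the ECC space; the model only checks that the data word is untouched and that exactly two ECC bits end up flipped.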

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Hi QJ,

That's much clearer. Thank you. I still have an issue with the following code segment: as soon as the third line is executed (ram = *(u64 *) 0x08000032;), an Abort exception is generated. Could you enlighten me on the cause?

                case 28:    u64 ram;
                            int ACTLR;
                            ram = *(u64 *) 0x08000032;
                            RAMCTRL1_bit.ECC_WR_EN = 1;

                            ACTLR = __MRC(15,0,1,0,1);
                            ACTLR &= 0xF3FFFFFF;
                            __MCR(15,0,ACTLR,1,0,1);

                            asm(" nop");
                            asm(" nop");
                            asm(" nop");

                            *(u64 *) 0x08400032 = ram ^ 0x3;

                            ACTLR = __MRC(15,0,1,0,1);
                            __MCR(15,0,ACTLR|0x0C000000,1,0,1);

                            asm(" nop");
                            asm(" nop");
                            asm(" nop");

                            RAMCTRL1_bit.ECC_WR_EN = 0;
                            *(u64 *) 0x08000032 = ram;
                            break;

0 QJ Wang over 3 years ago in reply to Chuck Wong

TI__Guru**** 198766 points

Is the SRAM initialized? The RAM memory can be initialized by using the dedicated auto-initialization hardware. All RAM data memory is initialized to zeros and the ECC memory is initialized to the correct ECC value for zeros, that is, 0Ch.

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Yes, all locations of the SRAM were HW auto-initialized.

What is strange is that if every "u64" is changed to "u32" in the source code, I can step all the way to the end without an abort exception. It looks like 32-bit reads and writes do not cause a data abort.

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Does the following screenshot make sense for initialized ECC? Some locations are not quite 0x0C.

0 QJ Wang over 3 years ago in reply to Chuck Wong

TI__Guru**** 198766 points

Chuck Wong said:
third line is executed (ram = *(u64 *) 0x08000032;), an Abort exception is generated. Could you enlighten me on the cause?

I am sorry. 0x08000032 is not aligned to a 64-bit memory boundary.

Please try ram = *(uint64 *) 0x08000038;

and change this line *(u64 *) 0x08400032 = ram ^ 0x3;

to *(u64 *) 0x08400032 ^= 0x3;
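
To make the alignment point concrete, here is a small host-side sketch. The two base addresses are assumptions inferred from the address pairs used in this thread (a data address of the form 0x0800xxxx paired with an ECC address of the form 0x0840xxxx), and the helper names are made up for illustration, so please check them against the device documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bases, inferred from the address pairs quoted in this thread. */
#define TCRAM_DATA_BASE 0x08000000U
#define TCRAM_ECC_BASE  0x08400000U

/* True when the address sits on a 64-bit (8-byte) boundary. */
static int is_aligned_64bit(uint32_t addr)
{
    return (addr & 0x7U) == 0U;
}

/* Hypothetical helper: ECC-space address paired with a data-space address,
 * assuming a fixed offset between the two spaces. */
static uint32_t ecc_addr_for(uint32_t data_addr)
{
    assert(data_addr >= TCRAM_DATA_BASE && data_addr < TCRAM_ECC_BASE);
    return data_addr - TCRAM_DATA_BASE + TCRAM_ECC_BASE;
}
```

With these helpers, 0x08000032 fails the alignment check while 0x08000038 passes, and under the assumed offset the ECC location paired with 0x08000038 would be 0x08400038.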

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Good day QJ,

My test with the proposed changes didn't work: no exception was generated after the whole sequence.

I've now changed the data address to 0x08000040 and the ECC address to 0x08400040 to ease memory inspection; both addresses are 8-byte aligned. This is the new sequence:

Backup RAM at 0x08000040
Disable RAM ECC checking
Enable ECC RAM write
Toggle 2 bits at 0x08400040
Disable ECC RAM write
Enable RAM ECC checking
Read from 0x08000040 to generate data abort exception: Nothing happens here
Restore RAM 0x08000040

Below is the modified code segment. What could be wrong?

I will provide memory views captures if you want them.

case 32:    u64 ram, err;               // local variables
            u32 ACTLR;

            ram = *(u64 *) 0x08000040;  // backup SRAM contents

            ACTLR = __MRC(15,0,1,0,1);  // disable RAM ECC by
            ACTLR &= 0xF3FFFFFF;        // clearing bits 26 & 27
            __MCR(15,0,ACTLR,1,0,1);    // for even and odd banks

            RAMCTRL1_bit.ECC_WR_EN = 1; // enable ECC RAM write
            *(u64 *) 0x08400040 ^= 0x3; // toggle 2 bits of ECC code (0x08400040) of SRAM contents at 0x08000040
            RAMCTRL1_bit.ECC_WR_EN = 0; // disable ECC RAM write

            ACTLR = __MRC(15,0,1,0,1);  // enable RAM ECC by
            ACTLR |= 0x0C000000;        // setting bits 26 & 27
            __MCR(15,0,ACTLR,1,0,1);    // for even and odd banks

            err = *(u64 *) 0x08000040;  // read to generate data abort exception
            *(u64 *) 0x08000040 = ram;  // restore SRAM contents
            break;
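
For reference, the two ACTLR constants in the code above can be cross-checked with a short host-side sketch. It only relies on the bit numbers given in the comments (bits 26 and 27); the macro and function names are made up for illustration, and the CP15 read and write themselves are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 26 and 27, as named in the comments of the code above. */
#define ACTLR_RAM_ECC_BITS (((uint32_t)1U << 26) | ((uint32_t)1U << 27))

/* Value to write back to ACTLR to disable RAM ECC checking. */
static uint32_t actlr_ram_ecc_off(uint32_t actlr)
{
    return actlr & ~ACTLR_RAM_ECC_BITS;
}

/* Value to write back to ACTLR to enable RAM ECC checking. */
static uint32_t actlr_ram_ecc_on(uint32_t actlr)
{
    return actlr | ACTLR_RAM_ECC_BITS;
}
```

The mask works out to 0x0C000000 and its complement to 0xF3FFFFFF, matching the literals used with __MRC and __MCR.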

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Hello QJ,

Please get back to me; any help would be appreciated.

Chuck.

0 QJ Wang over 3 years ago in reply to Chuck Wong

TI__Guru**** 198766 points

Hi Chuck,

The sequence looks good. What are the values of the other bit fields of the RAMCTRL register? Are ECC detection (bits 3~0) and parity checking (bits 19~16) enabled? ECC detection and parity checking are enabled by default.
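
A quick way to answer this from a register dump is to split the RAMCTRL value into the fields mentioned above. The sketch below is host-side and illustrative only: the struct and function names are hypothetical, and the position of ECC_WR_EN (bit 8) is an assumption inferred from the two RAMCTRL values in the HALCoGen listing earlier in this thread (0x0005010A and 0x0005000A). It extracts the fields but does not interpret them; the meaning of each value is defined in the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* Raw RAMCTRL fields discussed in this thread. */
typedef struct {
    uint32_t ecc_detect;  /* bits 3..0                  */
    uint32_t ecc_wr_en;   /* bit 8 (assumed position)   */
    uint32_t parity;      /* bits 19..16                */
} ramctrl_fields_t;

/* Split a RAMCTRL value into its raw fields without interpreting them. */
static ramctrl_fields_t decode_ramctrl(uint32_t value)
{
    ramctrl_fields_t f;
    f.ecc_detect = value & 0xFU;
    f.ecc_wr_en  = (value >> 8) & 0x1U;
    f.parity     = (value >> 16) & 0xFU;
    return f;
}
```

Applied to the values from the HALCoGen listing, both decode to 0xA in bits 3..0 and 0x5 in bits 19..16, and they differ only in the write-enable bit.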

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Hi QJ,

Just before executing the instruction err = *(u64 *) 0x08000040:

A related question: since this ECC check results in a double-bit uncorrectable SRAM error, why should I expect a Data Abort exception instead of an ESM Group3 error with the nERROR output signal asserted?

Should I also configure RAMCTRL2? Why are there two sets of registers?

Thanks.

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Besides that, reading address 0x08400040 returns 0x0c0c0c0c0c0c0c0c, so it seems to me that:

This instruction has no effect: *(u64 *) 0x08400040 ^= 0x3;
This is not the right location.

0 QJ Wang over 3 years ago in reply to Chuck Wong

TI__Guru**** 198766 points

Chuck Wong said:
why should I expect a Data Abort exception instead of an ESM Group3 error with nERROR output signal asserted?

A two-bit ECC error generates an abort to the CPU, pulls nERROR low, and sets ESM 3.3/3.5.

Chuck Wong said:
Should I also configure RAMCTRL2? Why there are two sets of registers?

The BTCM interface (for RAM) is divided into two parts: B0TCM and B1TCM. RAMCTRL1 is for B0TCM, and RAMCTRL2 is for B1TCM.

+1 QJ Wang over 3 years ago in reply to QJ Wang

TI__Guru**** 198766 points

*(volatile uint32 *)0x08400040U ^= 0x3;

0 Chuck Wong over 3 years ago in reply to QJ Wang

Genius 3950 points

Thank you QJ for your support,

In the end, the problem was the 64-bit read from memory address 0x08000040. It should instead be a 32-bit read aligned to the 8-byte boundary, regardless of whether the volatile keyword is used when toggling the 2 ECC bits.

In summary, 32-bit reads and writes shall be used for all accesses, NOT 64-bit.
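
One way to follow this rule when backing up and restoring a 64-bit location is to treat it as two 32-bit words. This is only a sketch of the idea, run here against an ordinary buffer; the type and function names are made up, and on the target it is still worth checking the disassembly to confirm the compiler really emits two 32-bit accesses.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit halves of one 64-bit location, in address order. */
typedef struct {
    uint32_t first;   /* word at the 8-byte aligned address */
    uint32_t second;  /* word at that address + 4           */
} word_pair_t;

/* Save a 64-bit location using two 32-bit reads. */
static word_pair_t backup_words(const volatile uint32_t *loc)
{
    word_pair_t saved;
    saved.first  = loc[0];
    saved.second = loc[1];
    return saved;
}

/* Restore a 64-bit location using two 32-bit writes. */
static void restore_words(volatile uint32_t *loc, word_pair_t saved)
{
    loc[0] = saved.first;
    loc[1] = saved.second;
}
```

On the target, loc would be a pointer such as (volatile uint32_t *)0x08000040U instead of a local buffer.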

Best regards!
