TMS320F2800157: RAM ECC Test

AHashem

Hello

I have the following questions about the sdl_ex_ram_ecc_parity_test.c file:

1- In runCorrectableECCTest and runUncorrectableECCTest, why do you follow this sequence?

1. Walk through an M0 RAM location until every bit has had a single-bit error injected into it.
2. Walk through an M0 RAM location until every ECC bit for the lower 16 bits (bits 6:0) has had an error injected into it.
3. Walk through an M0 RAM location until every ECC bit for the upper 16 bits (bits 14:8) has had an error injected into it.

2- Why did you choose to test only M0 rather than both M0 and M1? My project uses M0 and M1. Should I implement tests for both, or is testing M0 enough?
3- My project also uses LS0 and LS1. Am I required to run the RAM ECC tests on them as well?

4- Why did you choose to run the parity tests on LS1 and the ECC tests on M0, rather than covering all the memories in both the parity test and the ECC test?
5- Why do you load the original data into m0Data (restore m0Data) on every iteration of the for loop instead of once at the end of the test?

Thanks,

over 1 year ago

0 Whitney Dewey over 1 year ago

TI__Guru 54355 points

1. To test the ECC logic thoroughly, we test error detection in every bit of the data and the ECC. This gives the test higher diagnostic coverage than simply testing a bit or two.

2. We just demonstrated M0 as an example, but you should repeat the test for each RAM block you use, so yes, you should also test M1.

3. LS0 and LS1 do not have ECC. They only have parity, so you should do a test of the parity logic.

4. Memories either have ECC or parity protection, but not both. There's a table in the device datasheet that shows which type of protection each memory has.

5. Writing the original value back to the memory location will update the ECC to the correct value again (i.e. overwriting the injected error) before injecting an error to a different bit. Although I suppose you could argue that this isn't needed in the runCorrectableECCTest() function since the hardware has already corrected the error.
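To illustrate point 5, here is a minimal host-side sketch, not the SDL example itself. It models only the raw contents of one 32-bit word and ignores the ECC bits and any hardware correction, which is a simplifying assumption. The function names are hypothetical. It shows that restoring the original value on every pass keeps each injection a true single-bit error, while skipping the restore lets the flipped bits accumulate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts the bits in which two words differ. */
static unsigned bits_different(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    unsigned count = 0U;

    while (diff != 0U) {
        count += (unsigned)(diff & 1U);
        diff >>= 1;
    }
    return count;
}

/* Walks a single-bit flip across a simulated word and returns the largest
 * number of corrupted bits present at any step of the walk. */
static unsigned worst_case_corruption(bool restore_each_pass)
{
    const uint32_t original = 0xAAAA5555UL;
    uint32_t word = original;            /* simulated raw RAM contents */
    unsigned worst = 0U;

    for (unsigned bit = 0U; bit < 32U; bit++) {
        word ^= (uint32_t)1U << bit;     /* inject an error into one bit */

        unsigned flipped = bits_different(word, original);
        assert(flipped >= 1U);           /* an error is always present here */
        if (flipped > worst) {
            worst = flipped;
        }

        if (restore_each_pass) {
            word = original;             /* rewrite the known-good value */
        }
    }
    return worst;
}
```

In this model, `worst_case_corruption(true)` never sees more than one corrupted bit, whereas `worst_case_corruption(false)` ends with every bit of the word flipped.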

Whitney

0 AHashem over 1 year ago in reply to Whitney Dewey

Intellectual 270 points

Thank you for the answer,

In the following steps:
1. Walk through an M0 RAM location until every bit has had a single-bit error injected into it.
2. Walk through an M0 RAM location until every ECC bit for the lower 16 bits (bits 6:0) has had an error injected into it.
3. Walk through an M0 RAM location until every ECC bit for the upper 16 bits (bits 14:8) has had an error injected into it.
Why is the test on the data bits done in one pass, while the test on the ECC bits is done in two phases, one for bits 6:0 and another for bits 14:8?

0 Whitney Dewey over 1 year ago in reply to AHashem

TI__Guru 54355 points

The loops were split just because the ECC bits for lower data, upper data, and address are not adjacent to each other--there's a bit skipped between them.

The TRM has a table showing the mapping.
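As a rough sketch of why two loops fall out of that layout, the helper below builds the walking one-bit masks for a 7-bit ECC field at a given bit offset. The macro and function names are hypothetical, and the offsets (0 for the lower word, 8 for the upper word) are taken from the bit ranges quoted in this thread, so please check them against the TRM table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout from the thread: ECC for the lower data word in bits 6:0,
 * ECC for the upper data word in bits 14:8, with bit 7 skipped in between. */
#define ECC_FIELD_WIDTH 7U
#define ECC_LOWER_SHIFT 0U
#define ECC_UPPER_SHIFT 8U

/* Fills masks[] with one single-bit mask per ECC bit of the field that
 * starts at bit position 'shift'. Returns the number of masks written. */
static size_t build_ecc_masks(unsigned shift, uint32_t *masks, size_t capacity)
{
    size_t count = 0U;

    assert(masks != NULL);
    for (unsigned bit = 0U; (bit < ECC_FIELD_WIDTH) && (count < capacity); bit++) {
        masks[count] = (uint32_t)1U << (shift + bit);
        count++;
    }
    return count;
}
```

Calling it once with `ECC_LOWER_SHIFT` and once with `ECC_UPPER_SHIFT` gives two runs of seven masks with a gap at bit 7, which is why a single contiguous loop would not fit.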

Whitney

0 AHashem over 1 year ago in reply to Whitney Dewey

Intellectual 270 points

Thank you for the answer,

In runCorrectableECCTest, in the part that checks errorAddr in the third loop (ECC bits 14:8), why do you add 1UL to (uint32_t)&m0Data?

if(errorAddr != (((uint32_t)&m0Data) + 1UL))

0 AHashem over 1 year ago in reply to Whitney Dewey

Intellectual 270 points

In m0Data = 0xAAAA5555U;, could I know why you chose this address?
If I am testing M1, what address range can I use to perform the same test on M1? Do you have any specific recommendations I should consider when developing the same test for M1?

0 Whitney Dewey over 1 year ago in reply to AHashem

TI__Guru 54355 points

AHashem said:
In runCorrectableECCTest, in the part that checks errorAddr in the third loop (ECC bits 14:8), why do you add 1UL to (uint32_t)&m0Data?

As noted in the TRM, those ECC bits are for the upper word of the data, so we add 1 specifically to address the upper 16 bits of the 32-bit variable m0Data.
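A small sketch of that address arithmetic, with a hypothetical helper name: on the C28x each address selects a 16-bit word, so a 32-bit variable occupies two consecutive addresses, and the expected error address depends on which ECC field the injected bit belongs to. The bit ranges are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the word address at which an error is expected to be reported
 * when ECC bit 'ecc_bit' of a 32-bit variable at 'base_addr' is flipped.
 * Bits 6:0 protect the lower 16-bit word, bits 14:8 the upper one. */
static uint32_t expected_error_addr(uint32_t base_addr, unsigned ecc_bit)
{
    /* Bit 7 is skipped in the layout and bits above 14 are not data ECC. */
    assert((ecc_bit <= 14U) && (ecc_bit != 7U));

    return (ecc_bit >= 8U) ? (base_addr + 1UL) : base_addr;
}
```

So for the third loop the comparison uses the base address plus one, while the second loop compares against the base address itself.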

AHashem said:
In m0Data = 0xAAAA5555U;, could I know why you chose this address?

0xAAAA5555 is just an arbitrary value assigned to the memory location, not an address, so you can pick anything. The address of m0Data is determined by the linker, although the application tells it to place the variable in M0 using #pragma DATA_SECTION(m0Data, "ramm0").

AHashem said:
What address range can I use to perform the same test on M1? Do you have any specific recommendations I should consider when developing the same test for M1?

You can do something similar for M1 where you define a section within M1 in your linker command file and then use #pragma DATA_SECTION to tell the linker to place your test variable in that section. Your device datasheet will tell you what address range is applicable to M1RAM.
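A minimal sketch of the C side might look like the following. The variable name m1Data, the section name "ramm1", and the helper function are assumptions for illustration; the section name has to match an output section that your linker command file allocates to the M1 range. The pragma is guarded so the snippet also builds with a host compiler.

```c
#include <assert.h>
#include <stdint.h>

/* "ramm1" is a hypothetical section name. It must match a section that the
 * linker command file maps into the M1 RAM address range. */
#if defined(__TMS320C28XX__)
#pragma DATA_SECTION(m1Data, "ramm1")
#endif
volatile uint32_t m1Data;

/* Writes the test pattern to the M1 test word and returns the value read
 * back. Any value works; this one mirrors the M0 example. */
static uint32_t init_m1_test_word(void)
{
    const uint32_t pattern = 0xAAAA5555UL;

    m1Data = pattern;
    assert(m1Data == pattern);   /* sanity check before injecting errors */
    return m1Data;
}
```

After building, the map file should confirm that m1Data landed inside M1 before you pass its address to the error injection code.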

Whitney

0 AHashem over 1 year ago in reply to Whitney Dewey

Intellectual 270 points

I followed the same method as for M0 to test M1: the same init steps, and I created a random value and placed it in RAMM1 using #pragma DATA_SECTION as in the example.

The behaviour I see is that the correctable error interrupt does not fire, and even the CPURDERR bit is not set, when I try to inject a single-bit fault.

Note: CEINTFLAG equals 0

Do I need to consider anything different than M0? Do I need to enable correctable error interrupt for RAMM1?

0 Whitney Dewey over 1 year ago in reply to AHashem

TI__Guru 54355 points

AHashem said:
Do I need to consider anything different than M0? Do I need to enable correctable error interrupt for RAMM1?

There shouldn't be anything different other than the error injection location and the MEMCFG_SECT_M1 value being passed to the error injection function. One thing you do need to look out for though is the location of the stack. If your stack is in M1, you may run into issues where the code doesn't work correctly because stack memory is in test mode and therefore not behaving normally. Instead of calling the MemCfg_setTestMode() function, it may help to rewrite it as a bare-metal register write to avoid stack use while test mode is enabled.
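One way to approach the bare-metal write is a macro that expands inline rather than a function call. The sketch below is only an illustration under stated assumptions: simTestReg is a stand-in variable rather than the real register, and the 2-bit field width, the shift, and the mode value are placeholders that need to be checked against the TRM and the driverlib source. On hardware you would also need to replicate any register protection handling that the driverlib function performs.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the real test-mode register. On hardware this would be a
 * volatile access to the MemCfg register named in the TRM. */
static volatile uint32_t simTestReg;

/* Assumed width of the test-mode field for one RAM block. */
#define TEST_FIELD_MASK 0x3UL

/* Read-modify-write of one test-mode field. Written as a macro so that it
 * expands inline instead of making a function call. */
#define SET_TEST_MODE(reg, shift, mode)                                  \
    do {                                                                 \
        (reg) = ((reg) & ~(TEST_FIELD_MASK << (shift))) |                \
                (((uint32_t)(mode) & TEST_FIELD_MASK) << (shift));       \
    } while (0)

/* Reads the field back so the caller can confirm the selected mode. */
#define GET_TEST_MODE(reg, shift) (((reg) >> (shift)) & TEST_FIELD_MASK)

/* Example use on the stand-in register: select mode 1 in the field at
 * bit offset 2 and leave all other fields untouched. */
static uint32_t demo_set_mode(void)
{
    simTestReg = 0xFFUL;
    SET_TEST_MODE(simTestReg, 2U, 1U);
    assert(GET_TEST_MODE(simTestReg, 2U) == 1UL);
    return simTestReg;
}
```

A macro removes the call itself, but the compiler may still use the stack for temporaries, so it is worth inspecting the generated assembly for the code that runs while test mode is enabled.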

AHashem said:
in correctable error function you did not check on range 16:23

The region is for the address ECC. Address ECC errors are always uncorrectable.

Whitney
