Problem to test R4 ECC

joy liu


Hello,

I read an earlier post about forcing an R4 ECC single-bit error. After following the sequence provided by the TI experts, I still cannot produce a single-bit ECC error. Below is what I did:

1. Set up the even RAM registers (error count = 0, error threshold = 1, error interrupt control enabled).

2. Disable the ECC checking logic inside the CPU through CP15 instructions. The code was provided in a TI post.

3. Set up the TCRAM wrapper register to disable ECC and enable ECC writes.

4. Get the RAM address of a 64-bit variable and add the RAM ECC offset (0x00400000) to it.

5. Flip one bit at this RAM ECC location to force a single-bit error.

6. Set up the RAM wrapper register to enable ECC and disable ECC writes.

7. Enable the ECC checking logic inside the CPU through CP15 instructions. The code was provided in a TI post.

8. Read the 64-bit variable in RAM to trigger the ECC single-bit error.
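
The address arithmetic in steps 4 and 5 can be sketched as follows. This is a host-side illustration only, assuming the 0x00400000 ECC offset quoted in step 4 and one ECC location per 8-byte aligned 64-bit double word. The names `TCRAM_ECC_OFFSET` and `ecc_shadow_address` are hypothetical and do not come from any TI header.

```c
#include <assert.h>
#include <stdint.h>

/* Offset between a TCRAM data address and its ECC location, as quoted in this thread. */
#define TCRAM_ECC_OFFSET 0x00400000u

/* Hypothetical helper: returns the ECC location that shadows the 64-bit
 * double word containing data_addr. The address is first rounded down to an
 * 8-byte boundary, assuming ECC is kept per 64-bit double word. */
static uint32_t ecc_shadow_address(uint32_t data_addr)
{
    uint32_t dword_base = data_addr & ~(uint32_t)0x7u;

    /* Guard against wrap-around for addresses near the top of the map. */
    assert(dword_base <= UINT32_MAX - TCRAM_ECC_OFFSET);

    return dword_base + TCRAM_ECC_OFFSET;
}
```

With this helper, a variable at 0x08006468 maps to 0x08406468, which matches the pair of addresses quoted later in this thread.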

I checked the TCRAM registers. The single-error occurrence field did not change from 0 to 1 after the code above executed.

What is wrong? How can I make this work?

Thank you.

Joy

over 13 years ago

0 Luc Baudoin over 13 years ago

TI__Intellectual 1125 points

Hello Joy,

Can you provide us the code that you wrote to make an ECC error happen?

Best regards,

TI Forum Team

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello,

Below is the source code. The two subroutines in the asm file (Disable_RAM_R4_ECC and Enable_RAM_R4_ECC) were provided by TI.

In the C file:

R4_Ram_Even_Registers.Single_Error_Interrupt_Status = 1;

R4_Ram_Even_Registers.Single_Error_Cntr = 0;

R4_Ram_Even_Registers.Single_Error_Threshold = 3;

R4_Ram_Even_Registers.Single_Error_Interrupt_Cntrol = 1;

for ( i = 0; i < 3; i++)

{

InsertBadRamDoubleWordR4(&R4_Ram_Even_Registers, &Ram_Even_Address[i], (unsigned 32)0x1);

/* do a RAM read */ test_word = Ram_Even_Address[i];

}

in H file

#define ECC_WRITE_ENABLE (1<<8)

#define ECC_DISABLE 0x5

in asm file

:ABIPX:InsertBadRamDoubleWordR4: .asmfunc

PUSH {R4-R6, LR}

BL :ABIPX: Disable_RAM_R4_ECC ;disable ECC in Core

; disable ECC in RAM Wrapper

LDR R3, [R0, #0]

MOV R5, #(ECC_WRITE_ENABLE | ECC_DISABLE) ;(1<<8) | 0x5

BIC R4, R3, #0xF

ORR R4, R4, R5

STR R4, [R0, #0x00400000]

;make sure VBUS write to RAMW is completed

LDR R4, [R0, #0x00400000]

; add ECC offset to the target address

MOV R6, #0x00400000

ADD R1, R1, R6

; load ECC value into R4, R5

LDRD R4, R5, [R1]

;EXOR to corrupt single bit or double bits

EOR R5, R5, R2

;Write ECC value back to the address

STRD R4, R5, [R1]

DMB

DSB

; re-enable ECC in RAM Wrapper

STR R3, [R0, #0]

; make sure VBUS write to RAMW is completed

LDR R3, [R0, #0]

; re-enable ECC in CPU

BL :ABIPX: Enable_RAM_R4_ECC

POP {R4-R6, PC}

.endasmfunc

:ABIPX:Disable_RAM_R4_ECC: .asmfunc

; disable ECC for memories

MRC P15, #0, R4, C1, C0, #1

MVN R5, #0x1 << 26

AND R4, R4, R5

MVN R5, #0x1 << 27

DMB

MCR P15, #0, R4, C1, C0, #1

ISB

MRC P15, #0, R4, C1, C0, #1

DMB

MCR P15, #0, R4, C1, C0, #1

ISB

;disable Event on Event bus

MRC P15, #0, R4, C9, C12, #0

MVN R5, #00000010 ; clear 4th bit of PMNC register

AND R4, R4, R5

DMB

MCR P15, #0, R4, C9, C12, #0

ISB

BX LR

.endasmfunc

:ABIPX:Enable_RAM_R4_ECC: .asmfunc

; Enable Events on EVent Bus

MRC P15, #0, R4, C9, C12, #0

ORR R4, R4, #00000010

DMB

MCR P15, #0, R4, C9, C12, #0

ISB

; enable ECC for memories

MRC P15, #0, R4, C1, C0, #1

ORR R4, R4, #0x1 << 26

DMB

MCR P15, #0, R4, C1, C0, #1

ISB

MRC P15, #0, R4, C1, C0, #1

ORR R4, R4, #0x1 << 27

DMB

MCR P15, #0, R4, C1, C0, #1

ISB

BX LR

.endasmfunc

Thank you.

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hello Joy,

Looking at the code generated by HalCoGen I can see the following sequence:

1- Initialize CPU RAM and the corresponding ECC locations using memory hardware initialization

2- Enable CPU RAM ECC

3- Enable write to ECC RAM, enable ECC error response, allow single error to be reported to ESM

4- Create a 1 bit error on a line, disable write to ECC RAM

5- Read location with 1 bit error

6- Clear SERR flag and status flag for ESM

This is all in sys_selftest.c:

/* enable writes to ECC RAM, enable ECC error response */

tcram1REG->RAMCTRL = 0x0005010A;

tcram2REG->RAMCTRL = 0x0005010A;

/* the first 1-bit error will cause an error response */

tcram1REG->RAMTHRESHOLD = 0x1;

tcram2REG->RAMTHRESHOLD = 0x1;

/* allow SERR to be reported to ESM */

tcram1REG->RAMINTCTRL = 0x1;

tcram2REG->RAMINTCTRL = 0x1;

/* cause a 1-bit ECC error */

tcramB1bitError ^= 0x1;

/* disable writes to ECC RAM */

tcram1REG->RAMCTRL = 0x0005000A;

tcram2REG->RAMCTRL = 0x0005000A;

/* read from location with 1-bit ECC error */

ramread = tcramB1bit;

/* SERR not set in TCRAM1 or TCRAM2 modules */

if (!((tcram1REG->RAMERRSTATUS & 1) || (tcram2REG->RAMERRSTATUS & 1)))

{

/* TCRAM module does not reflect 1-bit error reported by CPU */

tcramClass2Error();

}

else

{

/* clear SERR flag */

tcram1REG->RAMERRSTATUS = 0x1;

tcram2REG->RAMERRSTATUS = 0x1;

/* clear status flags for ESM group1 channels 26 and 28 */

esmREG->ESTATUS1[0] = 0x14000000;

}
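
The constants in the listing above can be decoded with a few bit masks. This is a minimal sketch, assuming only what the thread itself states: bit 8 is the ECC write enable, the low nibble is 0xA when ECC detection is enabled and 0x5 when it is disabled, and 0x14000000 covers ESM group 1 channels 26 and 28. The macro and function names are hypothetical, and the upper RAMCTRL bits (the 0x0005 part) are deliberately left uninterpreted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from values quoted in this thread; names are hypothetical. */
#define RAMCTRL_ECC_WRITE_ENABLE (1u << 8) /* same bit as ECC_WRITE_ENABLE in the H file */
#define RAMCTRL_ECC_DETECT_MASK  0xFu      /* low nibble of RAMCTRL */
#define RAMCTRL_ECC_DETECT_ON    0xAu      /* reported in the thread as detection enabled */
#define RAMCTRL_ECC_DETECT_OFF   0x5u      /* same value as ECC_DISABLE in the H file */

/* True when writes to the ECC RAM are enabled (bit 8 set). */
static bool ramctrl_ecc_write_enabled(uint32_t ramctrl)
{
    return (ramctrl & RAMCTRL_ECC_WRITE_ENABLE) != 0u;
}

/* True when the low nibble holds the "detection enabled" key. */
static bool ramctrl_ecc_detect_enabled(uint32_t ramctrl)
{
    return (ramctrl & RAMCTRL_ECC_DETECT_MASK) == RAMCTRL_ECC_DETECT_ON;
}

/* Build a status mask with the bits for two channel numbers (0..31) set. */
static uint32_t esm_channel_mask(unsigned ch_a, unsigned ch_b)
{
    return (1u << ch_a) | (1u << ch_b);
}
```

Read this way, 0x0005010A and 0x0005000A differ only in the ECC write enable bit, and ECC detection stays enabled in both.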

0 Luc Baudoin over 13 years ago in reply to Luc Baudoin

TI__Intellectual 1125 points

Hello Joy, I forgot to add Enable CPU Event Export...

So it goes like this:

_coreEnableEventBusExport_();

_memoryInit_(0x1);

_coreEnableRamEcc_();

tcram1REG->RAMCTRL = 0x0005010A;

tcram2REG->RAMCTRL = 0x0005010A;

tcram1REG->RAMTHRESHOLD = 0x1;

tcram2REG->RAMTHRESHOLD = 0x1;

tcram1REG->RAMINTCTRL = 0x1;

tcram2REG->RAMINTCTRL = 0x1;

tcramA1bitError ^= 0x1;

tcram1REG->RAMCTRL = 0x0005000A;

tcram2REG->RAMCTRL = 0x0005000A;

ramread = tcramA1bit;

0 Luc Baudoin over 13 years ago in reply to Luc Baudoin

TI__Intellectual 1125 points

Hello Joy,

Let me know whether the HalCoGen sequence solves your issue. I am able to see the single-bit error occurrence bit being set.

Thanks and regards,

Luc

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

Could you please send the code for coreEnableEventBusExport? I couldn't find it in the RAMECC_CODE_TI package.

Thanks,

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hello Joy,

Here it is. Note that in the HalCoGen bundle it is located in the source sys_core.asm.

.def _coreEnableEventBusExport_

.asmfunc

_coreEnableEventBusExport_

stmfd sp!, {r0}

mrc p15, #0x00, r0, c9, c12, #0x00

orr r0, r0, #0x10

mcr p15, #0x00, r0, c9, c12, #0x00

ldmfd sp!, {r0}

bx lr

.endasmfunc
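
The routine above is a read-modify-write that sets bit 4 (mask 0x10) of the register accessed through c9, c12, 0. A minimal C sketch of the same bit manipulation on a plain variable is shown below. It does not touch CP15, and the names are hypothetical.

```c
#include <stdint.h>

#define PMNC_EXPORT_BIT  4u                      /* bit set by orr r0, r0, #0x10 */
#define PMNC_EXPORT_MASK (1u << PMNC_EXPORT_BIT) /* 0x10 */

/* Set the export bit, leaving all other bits unchanged. */
static uint32_t pmnc_enable_export(uint32_t pmnc)
{
    return pmnc | PMNC_EXPORT_MASK;
}

/* Clear the export bit, leaving all other bits unchanged. */
static uint32_t pmnc_disable_export(uint32_t pmnc)
{
    return pmnc & ~PMNC_EXPORT_MASK;
}

/* Nonzero when the export bit is set. */
static int pmnc_export_enabled(uint32_t pmnc)
{
    return (pmnc & PMNC_EXPORT_MASK) != 0u;
}
```

Note that in C the literals 0x10, 10 and 010 denote 16, 10 and 8, and only the first has bit 4 set. How an assembler reads a literal such as #00000010 depends on its own radix rules, so it may be worth checking the value actually encoded for the mask in the earlier Enable/Disable listing.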

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

This code is already inside the Enable_ECC/Disable_ECC assembly routines I am using. The only difference is that my code uses r4 and your code uses r0.

While debugging the code, I verified that the registers are loaded with 8 bytes of 0x0c from the RAM ECC location, which corresponds to a RAM location holding the value 0. Flipping one bit changes one byte of the register value to 0x0d, and the 8 bytes are then written back to the same RAM ECC location. Afterwards, the value in the corresponding RAM location is still 0, which means the RAM ECC value no longer matches the RAM value, so a correction should happen and the counter register should increase by 1. But this did not happen, which is the problem.

Since the assembly code that enables event bus export and core ECC does not operate on general-purpose registers, I can't see what is going on while debugging. Could you let me know how I can debug this and what to check to find out why the code did not change the counter register value?

Thanks,

Joy

0 joy liu over 13 years ago in reply to joy liu

Prodigy 175 points

Hello Luc,

While debugging, I noticed something I don't understand.

Previously, I set the RAM location to 0 for all 8 bytes, and the corresponding ECC value was 0x0c for all 8 bytes. To debug, I changed the value at the same RAM location:

location: 0x08006468 -- value: 0x00000000 0x01010101

Using registers, I checked the ECC at the corresponding location:

location: 0x08406468 -- value: 0x0c0c0c0c 0x0c0c0c0c

I thought the ECC should change to a value matching the new RAM value, but it doesn't. This seems to suggest that the ECC is not updated according to the RAM value. Yet the TCRAM wrapper control register holds 0x0000000A, meaning ECC detection is enabled.

I don't understand why this happens. Any thoughts or comments on this?

Thanks,

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

2604.ECC_issue.zip

Hello Joy,

Please find attached ECC code that generates a single-bit error.

Please set a breakpoint in sys_selftest.c at the first if/else statement in function checkB0RAMECC and run. You should run into the else portion listed hereafter.

Regarding the occurrence counter not updating, this is likely due to the threshold being set to 1. As soon as the occurrence count reaches the threshold, the occurrence counter is reset to 0. I slightly modified the code to change the threshold to 3 and could see it updating, as shown in the attached picture.
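
The counter behaviour described above can be modelled in a few lines of C. This is a simplified software model based only on the description in this post (the counter resets to 0 when it reaches the threshold), not on the hardware documentation. The type and function names are hypothetical, and a threshold of 0 is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t threshold;   /* models RAMTHRESHOLD */
    uint32_t occurrences; /* models the single-bit error occurrence counter */
    bool     serr;        /* models the SERR status flag */
} serr_model_t;

/* Record one corrected single-bit error. When the count reaches the
 * threshold, the counter is reset to 0 and the status flag is raised. */
static void serr_model_record(serr_model_t *m)
{
    m->occurrences++;
    if (m->occurrences >= m->threshold) {
        m->occurrences = 0u;
        m->serr = true;
    }
}
```

In this model, a threshold of 1 makes the counter read 0 again immediately after the first error, while a threshold of 3 lets it read 1 and then 2 before it resets on the third error.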

0 Luc Baudoin over 13 years ago in reply to Luc Baudoin

TI__Intellectual 1125 points

Hello Joy,

Regarding your latest post: changing a value in RAM with ECC enabled should change the ECC value. This holds only with ECC enabled, proper CPU configuration, and so on.

I am not sure whether you are reading values through the debug interface or from your running code. To avoid confusion, I strongly advise against using the memory browser window.

To help you further, I would need your entire code, compiling and running, and a pointer to where I should break it.

Best regards,

Luc

0 Luc Baudoin over 13 years ago in reply to Luc Baudoin

TI__Intellectual 1125 points

Hello Joy,

Please let me know if you need more help with the R4 ECC.

If I answered your question please close the thread as answered.

Thanks and best regards,

Luc

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

Sorry, I didn't get notification emails for your last two posts until this one. I will take a look and run your code to check the result and get a better understanding of my problem. While debugging this ECC problem, I noticed that about half of the time my testing generated a multiple-bit error (DED) instead of a single-bit error (SEC). I don't know why.

My question is: is it possible that the flipped ECC bit is not one of the 8 ECC bits generated from the 64 data bits, so that either no SEC event occurs or a DED event occurs? There are only 8 ECC bits for each 64 bits of data.
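
As background to this question: in a textbook SECDED code, any single flipped bit among the 64 data bits and the 8 check bits is classified as a single, correctable error, and any two flipped bits as a double error. The sketch below demonstrates this with a generic extended Hamming code. It is an illustration under stated assumptions only: the check-bit mapping actually used by the device is not reproduced here, and all names are hypothetical.

```c
#include <stdint.h>

typedef enum { ECC_OK, ECC_SINGLE, ECC_DOUBLE } ecc_result_t;

/* XOR of all bits in v: 1 if an odd number of bits are set. */
static unsigned parity64(uint64_t v)
{
    unsigned p = 0u;
    while (v != 0u) {
        p ^= (unsigned)(v & 1u);
        v >>= 1;
    }
    return p;
}

/* XOR of the codeword positions (3, 5, 6, 7, 9, ...) of all set data bits.
 * Positions that are powers of two are reserved for the check bits. */
static uint8_t hamming7(uint64_t data)
{
    uint8_t acc = 0u;
    unsigned pos = 1u;
    for (unsigned bit = 0u; bit < 64u; pos++) {
        if ((pos & (pos - 1u)) == 0u) {
            continue; /* power of two: check-bit position, skip it */
        }
        if ((data >> bit) & 1u) {
            acc ^= (uint8_t)pos;
        }
        bit++;
    }
    return acc; /* highest position used is 71, so 7 bits suffice */
}

/* Check byte: bits 0..6 are Hamming check bits, bit 7 is overall parity. */
static uint8_t secded_encode(uint64_t data)
{
    uint8_t h = hamming7(data);
    unsigned overall = parity64(data) ^ parity64(h);
    return (uint8_t)(h | (overall << 7));
}

/* Classify a stored (data, check) pair. */
static ecc_result_t secded_check(uint64_t data, uint8_t check)
{
    uint8_t syndrome = (uint8_t)(hamming7(data) ^ (check & 0x7Fu));
    unsigned overall = parity64(data) ^ parity64(check);
    if (overall != 0u) {
        return ECC_SINGLE; /* odd number of flips: treated as one correctable bit */
    }
    return (syndrome == 0u) ? ECC_OK : ECC_DOUBLE;
}
```

In this generic code, flipping exactly one of the 8 stored check bits is reported as a single error, never as a double error. A double error appears only when two bits of the 72-bit word differ from a valid codeword, for example when the stored check byte and the data were both changed.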

I will check out your code and get back to you soon.

Thank you.

Joy

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

I integrated your code into our software and tested it. Below are the problems I am having with TCRAM ECC using your code.

1. You use hard-coded ECC memory locations to inject the error bit. But in our software, ECC error injection only works at locations 0x08400000 and 0x08400008 and does not work at 0x08400010 and 0x08400018. I guess this might be due to how those RAM locations are used.

We have a mailbox from 0x08000000 to 0x0800000f, and the calibration section starts at 0x08000010. I am not sure whether the software uses the mailbox, but the calibration section is completely filled with data. I don't know why the ECC section for that memory didn't work when injecting an error bit; it might be related to the next problem.

2. When using memory module hardware initialization (memoryInit(1)) for R4 RAM, the software crashes, even though it is called in privileged mode at the beginning of the application, just after configuring the system clock. So it can't be used. Our software does call memoryInit for other memories such as DMA, NHET, ADC and even M3 RAM, but not for R4 RAM. Since the software can inject an error bit at the beginning of the ECC memory without it, I assume the software does not need memoryInit.

To solve my problem, would you be able to use a RAM variable and add the ECC offset (0x00400000) to access the corresponding ECC location, inject an error bit, and generate an ESM event? This is how I tried to access the ECC location to inject an error bit. The variable is aligned to 16 bytes using either the linker command file or asm code, but I couldn't get it working. If you can, please let me know. If not, what do you suggest for identifying the ECC location for injection without a hard-coded #define? That way, I can define a RAM variable and then access its ECC location to inject an error bit for testing.

Thank you.

Joy

0 joy liu over 13 years ago in reply to joy liu

Prodigy 175 points

Hello Luc,

I found the reason the memoryInit routine causes the crash: it wipes out the stack, so when the routine returns, it returns to address 0. Our software uses R4 RAM to start the application. I added memoryInit at the beginning of the entry routine. Because it wiped out the stack, the software lost the return address after calling memoryInit and went back to address 0, which is the boot code. This is why our current software uses memoryInit to initialize other modules but not RAM.

Since I got the ESM SEC event without memoryInit when using the hard-coded #defines 0x08400000 and 0x08400008, I think memoryInit is not necessary for ECC error bit injection. I would appreciate it if you could help me use a variable located in RAM to access the corresponding ECC memory location and inject single-bit or double-bit errors to cause an ESM event, or provide another workable solution to my problem.

Thank you.

Joy

0 joy liu over 13 years ago in reply to joy liu

Prodigy 175 points

Hello Luc,

I have done quite a bit of testing and found that flipping a single bit causes a DED ESM event instead of an SEC ESM event in most cases when accessing ECC memory through a RAM variable plus the ECC offset. When using a #define to access ECC memory as in your code, our software still sometimes generates a DED ESM event instead of an SEC ESM event when flipping a single bit.

I want to keep you updated on my test results in case the information helps clear up this issue.

Thanks,

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hello Joy,

Can you send me your full project with the instructions to reproduce your issue?

Thanks and regards,

Luc

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

Sorry for the late reply; I have been in training until the end of the week. I attached the zipped project to this post but am not sure whether you will be able to receive it.

Inside the project, click START AUTOMAKER to launch the automaker for downloading and launch CCS for debugging. The main source code is the routine PreInitDynamicRamEccCheckSvc in the file r4\app\Os\source\Dynamic_ram_ecc_check.c. From that routine, you can search for other routines and variables. The routine is called while the M3 is still in reset. The file start_application.asm is the entry point. It calls InitSystemServiceSvc, which in turn calls initRamMemorySvc, the routine that performs hardware initialization for memories other than R4 RAM. Currently, the code accesses the ECC through RAM variables plus the ECC offset (0x00400000), which can generate a multiple-bit error ESM event. Let me know what you find and whether you have any problems or questions running the code. Below is how I did my testing and verification:

1. Do a build (Rebuild).

2. Download using JTAG.

3. Launch CCS.

4. Check the variables to verify whether an ESM event occurred (R4_Sec_Even_Event, R4_Sec_Odd_Event, M3_Sec_Event, M3_Ded_Event, R4_Sec_Even_Event, ...). These are the debug variables for ESM events; they are defined in esm_handlers.c.

Thank you,

Joy

0 joy liu over 13 years ago in reply to joy liu

Prodigy 175 points

Hello Luc,

I haven't heard from you since my last post. Did you receive the software I sent, and have you had a chance to look at it? Do you have any comments on this issue?

Thanks,

Joy

0 joy liu over 13 years ago in reply to joy liu

Prodigy 175 points

Hello Luc,

One more thing. Could you please provide your response to this issue and delete the software I sent you? It is TRW confidential and proprietary, and it must not be disclosed further, as that would violate company policy.

Thank you.

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hello Joy,

Sorry for the delay. I am happy to look into your project if you still need my assistance. I couldn't see any attachment; I assume you removed it per your latest message.

I will add you in the Friends feature so that we can communicate privately for sensitive data.

Regards,

Luc

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

I am not quite sure how posting with an attached file works. When replying to a thread, there is an attach-file icon, and I used it to attach the zipped project before posting the response. But I cannot see the attachment now in my post. I thought that somehow only TI employees could see the attachment, but it looks like you didn't receive it either.

Anyway, I still can't generate the expected ESM event for a single-bit flip at an R4 RAM ECC location. Instead, it generates a DED event. I have done many bench tests on this, all with the same result. Could you provide some explanation?

If we want to test the ECC feature during normal execution, can we flip a single bit or two bits in R4 RAM ECC to generate a data abort periodically? From my bench testing, we can, as long as we reset the Single-Bit Error Occurrences Counter Register before injecting the error bits. Otherwise, the result is unpredictable (with the single-bit error correction threshold set to 1). This seems very strange, because the threshold is only used for single-bit error correction and has nothing to do with double-bit errors. But this is what happened in my bench test. Do you have any explanation? If this is not the right way to test, could you suggest the correct way to test R4 RAM ECC?

Thanks,

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hello Joy,

My understanding is that this thread is closed. Can you confirm by verifying the answer?

Thanks and regards,
Luc

0 joy liu over 13 years ago in reply to Luc Baudoin

Prodigy 175 points

Hello Luc,

The issue is not resolved. When the threshold is set to a value larger than one, a single-bit error at a TCRAM ECC location generates a DED ESM event, which it is not supposed to.

regards,

Joy

0 Luc Baudoin over 13 years ago in reply to joy liu

TI__Intellectual 1125 points

Hi Joy,

I believe you are under direct support with TI. If that is the case can you work the issue with your direct contact and close this thread?

Please let me know if I don't have the correct understanding.

Thanks and regards,

Luc
