TMS570LS3137: FLASH ECC

Muhammet Bagoglu

Part Number: TMS570LS3137
Other Parts Discussed in Thread: HALCOGEN

Hi,

I'm developing a safety project for avionics using the TMS570LS3137. My IDE is CCS. I tried the ECC features on the evaluation board. I have tested the RAM ECC check function; it worked, and I observed it in debug mode.

Now I want to try flash ECC. First, I should mention that I'm using HALCoGen and that flash ECC is enabled in HALCoGen. I then observed that sys_startup.c calls the checkFlashECC() function. There is no problem up to this point.

But I cannot understand how to change a bit in flash. I think I need to change a bit in flash memory so that the flash ECC checker detects the change and raises the error interrupt. Please correct me if this scenario is wrong.

You can see the checkFlashECC() function below.

void checkFlashECC(void)
{
    /* Routine to check operation of ECC logic inside CPU for accesses to program flash */
    volatile uint32 flashread = 0U;

    /* USER CODE BEGIN (40) */
    /* USER CODE END */

    /* Flash Module ECC Response enabled */
    flashWREG->FEDACCTRL1 = 0x000A060AU;

    _coreEnableFlashEcc_();

    /* Enable diagnostic mode and select diag mode 7 */
    flashWREG->FDIAGCTRL = 0x00050007U;

    /* Select ECC diagnostic mode, single-bit to be corrupted */
    flashWREG->FPAROVR = 0x00005A01U;

    /* Set the trigger for the diagnostic mode */
    flashWREG->FDIAGCTRL |= 0x01000000U;

    /* read a flash location from the mirrored memory map */
    flashread = flashBadECC1;

    /* disable diagnostic mode */
    flashWREG->FDIAGCTRL = 0x000A0007U;

    /* this will have caused a single-bit error to be generated and corrected by CPU */
    /* single-bit error not captured in flash module */
    /*SAFETYMCUSW 139 S MR:13.7 <APPROVED> "Hardware status bit read check" */
    if ((flashWREG->FEDACSTATUS & 0x2U) == 0U)
    {
        selftestFailNotification(CHECKFLASHECC_FAIL1);
    }
    else
    {
        /* clear single-bit error flag */
        flashWREG->FEDACSTATUS = 0x2U; // By giving a value of 2 to that register, how do we clear the bit?

        /* clear ESM flag */
        esmREG->SR1[0U] = 0x40U;

        /* Enable diagnostic mode and select diag mode 7 */
        flashWREG->FDIAGCTRL = 0x00050007U;

        /* Select ECC diagnostic mode, two bits of ECC to be corrupted */
        flashWREG->FPAROVR = 0x00005A03U;

        /* Set the trigger for the diagnostic mode */
        flashWREG->FDIAGCTRL |= 0x01000000U;

        /* read from flash location from mirrored memory map this will cause a data abort */
        flashread = flashBadECC2;

        /* Read FUNCERRADD register */
        flashread = flashWREG->FUNCERRADD;

        /* disable diagnostic mode */
        flashWREG->FDIAGCTRL = 0x000A0007U;
    }

    /* USER CODE BEGIN (41) */
    /* USER CODE END */
}

Do I need an extra function or extra code to check flash ECC? This function runs in sys_startup.c, so it runs before main, but I also called it in the infinite loop. Nothing changed, so it does not seem to work.

Also, can you recommend test code for flash ECC on this MCU (F021)?

over 1 year ago

0 jagadish gundavarapu over 1 year ago

TI__Guru 70681 points

Hi Muhammet,

You can download the Hercules SafeTI Diagnostic Library from the link below:

SAFETI_DIAG_LIB Driver or library | TI.com

This library contains the code for the various diagnostic routines available on Hercules controllers.

Once you have downloaded it, open the user guide and search for the flash ECC error routines.

That search gives you the list of flash ECC error routines.

Next, import the diagnostic code for your controller into CCS.

From there you can search for and open the code for the corresponding error diagnostics.

The library includes code that creates a single-bit error in flash; you can find all the other routines you need in the same way.

--
Thanks & Regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

Thank you for your support.

But I still don't understand how to implement flash ECC checking in our main code.

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

You can also simply do the following:

In HALCoGen, enable the check box "Enable Flash ECC Check".

If you enable this option, the "checkFlashECC" function is generated in the code.

This function is then called in sys_startup.c to test flash ECC after power-on reset.

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

Hi Jagadish,

I generated the flash ECC code with HALCoGen. But as you know, checkFlashECC() is called only in sys_startup.c.

So I have a few questions; could you please answer them?

1. I think calling this function only in sys_startup.c is insufficient. Shouldn't we call it in an infinite loop in main?

2. I don't understand how to combine and test the checkFlashECC function with the code from the safety library.
Can you please clarify this?

3. In the checkFlashECC function there is an if statement that checks the FEDACSTATUS register. This if statement directs us to an empty function (selftestFailNotification) when there is a 1-bit error in flash. What should I do in this function?
I also see that the condition of this if statement is never met. Can you suggest how to test it?

4. The checkFlashECC function contains the following operation. Writing 2 to the FEDACSTATUS register sets the ERR_ZERO_FLG bit, but nothing here resets it. Can you explain this as well?
flashWREG->FEDACSTATUS = 0x2U;

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Muhammet Bagoglu said:
1. I think calling this function only in sys_startup.c is insufficient. Shouldn't we call it in an infinite loop in main?

That depends on your application requirements.

Yes, if you want, you can call this function periodically, at a rate based on your application requirements.

Muhammet Bagoglu said:
2. I don't understand how to combine and test the checkFlashECC function with the code from the safety library.
Can you please clarify this?

If you generate the flash ECC test code from HALCoGen, there is no need to also include the safety library code for the flash ECC test.

Muhammet Bagoglu said:
3. In the checkFlashECC function there is an if statement that checks the FEDACSTATUS register. This if statement directs us to an empty function (selftestFailNotification) when there is a 1-bit error in flash. What should I do in this function?

That is also a user requirement.

For example, you can either reset the controller or log the failure and take action based on the log.
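As a rough illustration of the "log it and take action" option, a notification hook could record the failure flag and then hand over to an application-defined safe-state action. This is only a sketch: demoSelftestFailNotification, fail_log, fail_count and enter_safe_state are hypothetical names that are not part of HALCoGen, and the real selftestFailNotification() prototype should be taken from the generated code.

```c
#include <stddef.h>
#include <stdint.h>

#define FAIL_LOG_DEPTH 8U /* hypothetical log size */

/* Hypothetical failure log; a real project might keep this in a
 * memory region that survives a warm reset. */
static volatile uint32_t fail_log[FAIL_LOG_DEPTH];
static volatile uint32_t fail_count = 0U;

/* Hypothetical hook for the application's reaction (reset, safe state, ...).
 * Left NULL here so the sketch only logs. */
static void (*enter_safe_state)(void) = NULL;

/* Sketch of a self-test failure hook: store the flag, count the event,
 * then call the application-defined action if one is registered. */
void demoSelftestFailNotification(uint32_t flag)
{
    if (fail_count < FAIL_LOG_DEPTH) {
        fail_log[fail_count] = flag;
    }
    fail_count++;

    if (enter_safe_state != NULL) {
        enter_safe_state();
    }
}
```

Whether the registered action resets the controller or only reports the failure is the application decision mentioned above.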

Muhammet Bagoglu said:
4. The checkFlashECC function contains the following operation. Writing 2 to the FEDACSTATUS register sets the ERR_ZERO_FLG bit, but nothing here resets it. Can you explain this as well?

We can do a software reset; again, that depends on the application requirements.
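On the register write itself: the HALCoGen comments ("clear single-bit error flag", "clear ESM flag") suggest that these status bits are write-1-to-clear, meaning that writing a 1 to a flag position clears that flag and writing 0 leaves it alone. Assuming that behaviour (the TRM register description is the authority), the effect of such a write can be modelled on a PC as below. w1c_write is a hypothetical helper used only for this illustration; it does not touch any hardware.

```c
#include <stdint.h>

/* Models what a write-1-to-clear status register holds after a write:
 * every bit written as 1 is cleared, every bit written as 0 is unchanged. */
uint32_t w1c_write(uint32_t current_flags, uint32_t written_value)
{
    return current_flags & ~written_value;
}
```

With this model, a register that currently holds 0x6 (bits 1 and 2 set) holds 0x4 after 0x2U is written to it, so only the flag at bit 1 is cleared.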

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

jagadish gundavarapu said:

If you generate the code from HALCoGen to test the flash ECC then again, no need to include the safety library code for flash ECC test.

I have already generated the flash and RAM ECC code from HALCoGen. But how can I verify that flash or RAM errors are actually corrected? Can I test ECC in the debugger?

0 Muhammet Bagoglu over 1 year ago in reply to Muhammet Bagoglu

Prodigy 55 points

I mean, I want to change a flash or RAM variable in a debug session and observe the interrupt being called. Is that possible? Can you guide me through a scenario where I can do this? I need to validate the checkFlashECC() and checkRAMECC() functions somehow and see that the processor actually corrects or detects the bit error.

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

For RAM you can do this easily using variables. For flash, however, we cannot change the memory content easily, which is why there are diagnostic registers. Using those diagnostic registers, we can create ECC-related faults in flash and run the tests.

Refer to the checkFlashECC routine and the corresponding part of the TRM to understand this procedure.

You can also have a look at the thread below:

TMS570LS1224: FEE ECC Selftest triggering ESM notification (TI E2E support forums)

There we tested ECC errors in the FEE region of flash.
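To make the pattern in checkFlashECC() easier to see: the two FPAROVR writes differ only in the low byte (0x01 for "single-bit to be corrupted", 0x03 for "two bits of ECC to be corrupted"). Assuming, as those comments suggest, that the low byte selects which ECC bits are flipped and that the upper bits (0x5A00) are a fixed key/configuration pattern, the value could be composed as in the sketch below. demoFparovrValue and DEMO_FPAROVR_UPPER are hypothetical names, and the exact field layout should be checked against the TRM.

```c
#include <stdint.h>

/* Upper bits copied verbatim from the values used in checkFlashECC();
 * their field meaning is an assumption to be confirmed in the TRM. */
#define DEMO_FPAROVR_UPPER 0x00005A00U

/* Builds an FPAROVR value from a mask of ECC bits to flip
 * (0x01 -> one bit, 0x03 -> two bits, per the HALCoGen comments). */
uint32_t demoFparovrValue(uint8_t ecc_flip_mask)
{
    return DEMO_FPAROVR_UPPER | (uint32_t)ecc_flip_mask;
}
```

On the target, the result would be written to flashWREG->FPAROVR between the FDIAGCTRL enable and trigger writes, in the same order as in checkFlashECC().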

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

First, I checked the RAM ECC. I went through the checkRAMECC() function thoroughly and I think I understand it.
I realize that checkRAMECC() checks only two RAM addresses, for one-bit and two-bit errors.
Is there an easy way to check all addresses in RAM at once or do I have to start at address 0x08400000 and check it with a for loop up to address 0x0803FFFF?

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

Muhammet Bagoglu said:
Is there an easy way to check all addresses in RAM at once or do I have to start at address 0x08400000 and check it with a for loop up to address 0x0803FFFF?

There is no need to test the entire RAM. If a problem occurs in any region of RAM, in either the data or the ECC portion, the ECC check runs in the background whenever you read that region, and it will fail. You will then get a single-bit or double-bit error, depending on how many bits are affected.

So it is not necessary to create ECC errors for all RAM locations. If you still want to do it, I don't think there is an easier way; you would need to follow the same procedure for every location.
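Since the ECC check happens on every read, one simple way to have the hardware look at a whole region (without injecting any errors) is a plain read sweep. The sketch below assumes 64-bit ECC words and uses a hypothetical function name, demoRamReadSweep; on the target the base pointer and word count would come from the device memory map, while on a PC it can be exercised on an ordinary array.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads every 64-bit word in [base, base + words) once. The volatile
 * qualifier keeps the compiler from optimising the reads away; on the
 * target each read lets the ECC logic check that word.
 * Returns the number of words read. */
size_t demoRamReadSweep(const volatile uint64_t *base, size_t words)
{
    size_t i;

    for (i = 0U; i < words; i++) {
        (void)base[i]; /* value is discarded, only the access matters */
    }
    return words;
}
```

Any error found during such a sweep would be reported through the same ESM channels or abort path as for a normal access.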

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

I think I understand. The checkRAMECC() function tests the ECC logic on the RAM addresses it uses, so I don't need to check all ECC RAM addresses.
So do you think I need to call the checkRAMECC() function in an infinite loop? I think there is no need for that either. Can you confirm this?

I have found another issue. What you see below is the checkRAMECC() function generated by HALCoGen. I commented out the code that generates the one-bit error in this function (line 2046 in the code picture), but the code that generates the two-bit error still executes (lines 2084 and 2085 in the code picture), so a two-bit error is generated in ECC RAM. I could not get an interrupt when the two-bit error occurred. Can you help with this?

Note: The configuration set in HALCoGen is as follows.

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

Muhammet Bagoglu said:
So do you think I need to call the checkRAMECC() function in an infinite loop? I think there is no need for that either. Can you confirm this?

Yes, no need to do that either.

Muhammet Bagoglu said:
I could not get an interrupt when the two-bit error occurred. Can you help with this?

I will check this and update you.

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

Hi Jagadish,

I went through the datasheet and found the following details on the double-bit error:

What do you think about this?

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

That is an interesting finding.

Can you please try using SCI to send data to the COM port? That is, if the code enters the interrupt, send data over SCI as an indicator. This way we can find out whether the interrupt is generated without using debug mode.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

I tried with SCI, but it failed. When I forced only a single-bit error, I observed that the ESM1 channel 26 and 28 interrupts were entered. I then tried to force the ESM2 channel 6 and channel 8 interrupts by commenting out the single-bit error injection so that only the double-bit error is generated, and by also commenting out the part that repairs the ECC RAM so that the error stays uncorrected, but it failed.

I leave the code below for your review; please guide me.
Maybe I am making a mistake in creating the uncorrectable error. Can you give me some ideas about this?

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

I need some time to test this on my end. In the meantime, can you try the following?

I believe we faced a similar issue with CAN RAM in the past:

TMS570LC4357: CAN ECC single-bit self-test sets DEFLG instead of SEFLG in CAN ECC CS register (TI E2E support forums)

Can you try the workaround given in the above thread?

Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

Hi Jagadish,

I tried the suggestion you mentioned, TCRAM auto-initialization.

After applying auto-initialization: I forced a double-bit error and then read the address, which raised ESM group 3 channel 3 and channel 5 (dabort.asm). After that, when the code reaches the auto-initialization line, the double-bit error is corrected automatically.

Before applying auto-initialization: I forced a double-bit error and then read the address, which raised ESM group 3 channel 3 and channel 5 (dabort.asm). After that, it entered the infinite loop in dabort.asm.

What I actually want is to route the error to either the low-level or the high-level interrupt, as with the single-bit error. But as I said, a double-bit error raises ESM group 3. Isn't it possible to redirect a double-bit error to an ESM group 1 or group 2 error?

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

Apologies for the delay in my response.

Muhammet Bagoglu said:
What I actually want is to route the error to either the low-level or the high-level interrupt, as with the single-bit error. But as I said, a double-bit error raises ESM group 3. Isn't it possible to redirect a double-bit error to an ESM group 1 or group 2 error?

Uncorrectable ECC errors in the SRAM or TCRAM always generate group 3 (channel 3 and/or channel 5) errors only.

For more details, refer to the thread below as well:

TMS570LS0432: ESM Group2 and Group3 for uncorrectable memory error (TI E2E support forums)

An important point is that group 3 errors, which have high severity, do not generate interrupts; they always drive the ERROR pin low to indicate the error. Please refer to the corresponding sections of the TRM.

As I understand it, the reason interrupts are not provided for uncorrectable errors is as follows.

If there is an uncorrectable error in flash or RAM, proper code execution cannot be guaranteed. Executing code that contains uncorrectable errors can lead to an abort, so a proper interrupt cannot be guaranteed for group 3 errors.

So it is not possible to redirect group 3 errors to an interrupt handler.

--
Thanks & regards,
Jagadish.

0 Muhammet Bagoglu over 1 year ago in reply to jagadish gundavarapu

Prodigy 55 points

Thank you for your response.

But as you can see below, uncorrectable errors also appear under Group2. I am wondering why this is not happening.

Another question: does an ESM Group3 error take precedence over Group2?

I would also like to point out that when I create an uncorrectable double-bit error in ECC RAM, execution goes to dabort.asm, completes all operations there, and then the code continues from where it left off. In other words, the code keeps running at run time while there is a double-bit error in ECC RAM, and the error remains in ECC RAM. We can only notice this in debug.
It is impossible to notice this at run time, which is why I think it should generate an interrupt. From the image I shared above, I understand that the Group2 ESM error should be activated. Please guide me if my thinking is wrong.

0 Muhammet Bagoglu over 1 year ago in reply to Muhammet Bagoglu

Prodigy 55 points

I understood the difference between group2 and group3 from the link you provided. So how can I recognize at run time that an uncorrectable double-bit error has occurred? Right now execution is redirected to the dabort.asm function, and after that function completes the code continues to run in the infinite loop. But I can see in debug that there is a permanent error in the ECC.

Can you guide me on this issue?

0 jagadish gundavarapu over 1 year ago in reply to Muhammet Bagoglu

TI__Guru 70681 points

Hi Muhammet,

In the case of Group3 (uncorrectable) errors, we cannot guarantee proper code execution, right?

We can identify Group3 errors using the ERROR pin status register and the ESMSR3 register.

But how can we take action on these group3 errors? We cannot guarantee that the action code itself is free of uncorrectable errors; it could be affected as well.

This is the reason customers use an external monitor:

TMS570LS1224: TMS570LS - reset upon nERROR (TI E2E support forums)

The external monitor can watch the nERROR signal and reset the controller if an uncorrectable error has occurred.
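As a small sketch of what such a check could look like in software, the helper below tests a value read from ESMSR3 for the two group 3 channels mentioned earlier in this thread for uncorrectable RAM ECC errors (channel 3 and channel 5). The function and macro names are hypothetical. How the register is exposed in the HALCoGen headers (I believe as esmREG->SR1[2U], with the error pin status in esmREG->EPSR) is an assumption to verify against the generated esm header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Group 3 channels reported in this thread for uncorrectable RAM ECC errors. */
#define DEMO_ESM_G3_CH3 (1UL << 3)
#define DEMO_ESM_G3_CH5 (1UL << 5)

/* Returns true if the given ESMSR3 value shows channel 3 and/or channel 5 set. */
bool demoGroup3RamEccErrorSeen(uint32_t esmsr3)
{
    return (esmsr3 & (DEMO_ESM_G3_CH3 | DEMO_ESM_G3_CH5)) != 0U;
}
```

Such a check only helps while the CPU can still execute reliably, which is the limitation discussed in the rest of this reply.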

--
Thanks & regards,
Jagadish.