TMS570LC4357: CAN ECC single-bit self-test sets DEFLG instead of SEFLG in CAN ECC CS register

Cameron Fruit

Part Number: TMS570LC4357

I've integrated the Diagnostic Library source code into my current project, and I am running only the SL_SelfTest_CAN CAN_ECC_TEST_MODE_1BIT test on DCAN1 RAM. When the 1-bit fault is triggered by reading from the corrupted data location, the fault is usually flagged with DEFLG instead of SEFLG in the DCAN1 ECC CS register (although it is sometimes correctly flagged with SEFLG). Pseudo-code for my startup procedure is below. Any guidance on how to correct this issue is greatly appreciated. Thanks!

Initialize Core Registers

Initialize Stack Pointers

Enable clocks to peripherals and release peripheral reset

Identify the source of the last reset

Initialize memory

Enable CPU Event Export

Check if there were ESM group3 errors during power-up

Initialize System - Clock, Flash, device-level multiplexing and I/O multiplexing settings

Back up DCAN1 Control register

Clear CAN_INIT_BIT, and enable ECC in DCAN1 Control register

Run SL_SelfTest_CAN CAN_ECC_TEST_MODE_1BIT test on DCAN1

Restore DCAN1 Control register

over 2 years ago

0 jagadish gundavarapu over 2 years ago

TI__Guru 70681 points

Hi Cameron,

I recently worked on a similar issue; please check my answer in the thread below.

TMS570LC4357: Double bit error when injecting single bit error into CAN-RAM (ESM 1.73 and 1.21)

Please go through the answer in that thread.

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Thank you for the quick reply, Jagadish. Sorry that I could not verify your answer to the other thread since it is closed.

I went over the example code you provided and it looks the same as what I'm doing, but I am still seeing the DEFLG slightly more often than SEFLG when the single bit error is generated. Any other suggestions?

Thanks!

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

This test should not be performed in debug mode, because the message RAM representation differs between debug/suspend mode and RDA mode. The test already uses RDA mode, so if it is run in debug mode the memory representation switches between the two modes, and additional RAM errors can be generated (both single-bit and double-bit errors).

So, to run this test without that problem, I used SCI: I transmitted the test status over SCI after performing the test.

The SCI output in my test code consists of 10 bytes in total: the first 4 bytes are the word before corruption, the next 4 bytes are the word after corruption, and the last 2 bytes are the "ECC Diagnostic Status Register".

The last two bytes are interpreted as follows:

00 01 - Single Bit Error

01 00 - Double Bit Error

01 01 - Single and double bit error
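
The 10-byte report described above can be decoded on the receiving side along the following lines. This is only a sketch: the names EccReport, parse_report and classify_status are hypothetical, and it assumes each field is sent most significant byte first, so that "00 01" is the value 0x0001 (SEFLG, bit 0) and "01 00" is 0x0100 (DEFLG, bit 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag positions as used in this thread: SEFLG is bit 0, DEFLG is bit 8. */
#define ECC_SEFLG_MASK 0x0001u
#define ECC_DEFLG_MASK 0x0100u

/* Hypothetical container for one decoded 10-byte report. */
typedef struct {
    uint32_t before;  /* word read before corruption */
    uint32_t after;   /* word read after corruption */
    uint16_t status;  /* two status bytes from the ECC diagnostic status register */
} EccReport;

typedef enum {
    ECC_NO_ERROR,
    ECC_SINGLE_BIT,
    ECC_DOUBLE_BIT,
    ECC_SINGLE_AND_DOUBLE
} EccResult;

/* Assemble a 32-bit value from 4 bytes, most significant byte first (assumed order). */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Split the 10 received bytes into the three fields of the report. */
EccReport parse_report(const uint8_t *frame)
{
    EccReport r;
    assert(frame != NULL);
    r.before = be32(&frame[0]);
    r.after  = be32(&frame[4]);
    r.status = (uint16_t)(((uint32_t)frame[8] << 8) | (uint32_t)frame[9]);
    return r;
}

/* Map the two status bytes to the four cases listed above. */
EccResult classify_status(uint16_t status)
{
    int sbe = (status & ECC_SEFLG_MASK) != 0u;
    int dbe = (status & ECC_DEFLG_MASK) != 0u;

    if (sbe && dbe) { return ECC_SINGLE_AND_DOUBLE; }
    if (dbe)        { return ECC_DOUBLE_BIT; }
    if (sbe)        { return ECC_SINGLE_BIT; }
    return ECC_NO_ERROR;
}
```

Comparing the before and after fields of the same report also shows directly which data bits were flipped by the injection.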

Below is the output I got when injecting a single-bit error.

Below is the output I got when injecting a double-bit error.

You can switch between single- and double-bit errors at the line of code shown below.

Here is the code; please go through it:

1440.CAN_ECC_TEST_MODE_1BIT_LC43.zip

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Thanks for your reply.

I am not running in debug mode and I use RDA for the test. I corrupt the data with CAN_FLIP_DATA_1BIT which represents (uint32)0x1u. Unfortunately I'm still frequently seeing the DEFLG bit being set in the ECC Diagnostic Status Register instead of the SEFLG bit.

Cameron

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

Can you please attach your code?

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Jagadish,

Unfortunately I can't attach my code but I am using the SL_SelfTest_CAN() function for the TMS570LC4357 without any changes from sl_selftest.c in the SafeTI Diagnostic Library version 2.4.0 (downloaded here https://www.ti.com/lit/sw/spnc043b/spnc043b.zip?ts=1669743014857&ref_url=https%253A%252F%252Fwww.ti.com%252Ftool%252FSAFETI_DIAG_LIB%253FkeyMatch%253DSAFETY%2BDIAGNOSTICS%2BLIBRARY ).

For now I am only using the CAN_ECC_TEST_MODE_1BIT test for DCAN1.

Thanks,

Cameron

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

I attached my code earlier; did you get a chance to look at it?

Is your code the same as mine, or are there any differences?

If possible, can you please test with my code, or else make the necessary modifications to my code and let me know?

Thanks

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

I was able to slightly modify your SL_SelfTest_CAN() function as follows and test it in my code. Instead of clearing the errors in ECC_CS and ECCDIAG_STAT, I checked their values later, so I was able to verify that I am still seeing DEFLG (bit 8) being set instead of SEFLG (bit 0) more than half the time in both ECC_CS and ECCDIAG_STAT.

I tested it with Level 0 Register Optimization as well as No Optimization configurations with the same result.

Also I noticed that your code defined msgNo as 1u while I had it defined as 2u. I tried your function both ways and still saw DEFLG being flagged more frequently than SEFLG.

Any suggestions?

Thanks,

Cameron

-----------------------

boolean SL_SelfTest_CAN2(SL_DCAN_Instance SL_DCAN_Instance)
{
volatile uint32* data;
register uint32 ramread32;
uint32 dataVal;
volatile boolean testPassed =0;

if(SL_DCAN_Instance == SL_DCAN1)
{
sl_canREG = canREG1;
dcanRAMBase = (uint32 *)&canRAM1;
}

data = (volatile uint32 *)((((uint32)dcanRAMBase) + (msgNo*0x20u)));
data++; /* advance to the next 32-bit word; "data = data++" is undefined behavior in C */

/* disable SECDED - write to PMD in CANCTL */
BF_SET(sl_canREG->CTL, CAN_CTRL_SECDED_DIS, CAN_CTRL_SECDED_START, CAN_CTRL_SECDED_LENGTH);

/* set Test bit to enable Test Mode (required for selecting test mode - RDA) */
BIT_SET(sl_canREG->CTL, BIT(CAN_CTRL_TEST_EN));

/* set Init bit (enter software initialization mode) and avoid conflicts with Message Handler
* This step is required before entering RDA mode */
BIT_SET(sl_canREG->CTL, BIT(CAN_CTRL_INIT));

/* enable Ram Direct Access (RDA) */
BIT_SET(sl_canREG->TEST, BIT(CAN_TEST_RDA_EN));

/* backup DATA stored at this location */
dataVal = *data;
//sciSendByte(sciREG1,(uint8_t)(*data>>24));
//sciSendByte(sciREG1,(uint8_t)(*data>>16));
//sciSendByte(sciREG1,(uint8_t)(*data>>8));
//sciSendByte(sciREG1,(uint8_t)*data);

/* enable SECDED diagnostic mode */
BF_SET(sl_canREG->ECCDIAG, CAN_ECCDIAG_SECDED_EN, 0, 4);

/* enable ECC single bit error event */
BF_SET(sl_canREG->ECC_CS , CAN_ECC_CS_SBE_EVT_EN, CAN_ECC_CS_SBE_EVT_START, CAN_ECC_CS_SBE_EVT_LENGTH);

/* corrupt the data */
*data ^= CAN_FLIP_DATA_2BIT;

/* enable SECDED */
BF_SET(sl_canREG->CTL, CAN_CTRL_SECDED_EN, CAN_CTRL_SECDED_START, CAN_CTRL_SECDED_LENGTH);

//(void)SL_FLAG_SET(testType);

/* create fault */
ramread32 = *data;
//sciSendByte(sciREG1,(uint8_t)(*data>>24));
//sciSendByte(sciREG1,(uint8_t)(*data>>16));
//sciSendByte(sciREG1,(uint8_t)(*data>>8));
//sciSendByte(sciREG1,(uint8_t)*data);

//sciSendByte(sciREG1,(uint8_t)(sl_canREG->ECCDIAG_STAT>>8));
//sciSendByte(sciREG1,(uint8_t)sl_canREG->ECCDIAG_STAT);

if(GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_SBERR) ==
(esmREG->SR7[0] & GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_SBERR))) {
testPassed = TRUE;
/* clear ESM error status */
esmREG->SR7[0] |= GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_SBERR);
}

if(GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_UNCORR) ==
(esmREG->SR1[0] & GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_UNCORR))) {
testPassed = TRUE;
/* clear ESM error status */
esmREG->SR1[0] |= GET_ESM_BIT_NUM(ESM_G1ERR_CAN1_ECC_UNCORR);
}

//added the following to stop optimizing out ramread32
ramread32 = ramread32;

/* disable SECDED - write to PMD in CANCTL */
BF_SET(sl_canREG->CTL, CAN_CTRL_SECDED_DIS, CAN_CTRL_SECDED_START, CAN_CTRL_SECDED_LENGTH);

//ccf mcu sbit do not clear any of the ECC_CS or ECCDIAG_STAT error status bits
/* clear the single bit error status */
//sl_canREG->ECC_CS = BIT(CAN_ECC_SBERR);
//sl_canREG->ECCDIAG_STAT = BIT(CAN_ECC_SBERR);

/* clear the double bit error status */
//sl_canREG->ECC_CS = BIT(CAN_ECC_UNCORR_ERR);
//sl_canREG->ECCDIAG_STAT = BIT(CAN_ECC_UNCORR_ERR);

/* disable SECDED - write to PMD in CANCTL */
BF_SET(sl_canREG->CTL, CAN_CTRL_SECDED_DIS, CAN_CTRL_SECDED_START, CAN_CTRL_SECDED_LENGTH);
/* Restore data; in other cases data should be auto corrected. */
*data = dataVal;
/* enable SECDED */
BF_SET(sl_canREG->CTL, CAN_CTRL_SECDED_EN, CAN_CTRL_SECDED_START, CAN_CTRL_SECDED_LENGTH);

/* disable diagnostic mode */
BF_SET(sl_canREG->ECCDIAG, CAN_ECCDIAG_SECDED_DIS, 0, 4);
/* Disable Ram Direct Access (RDA) */
BIT_CLEAR(sl_canREG->TEST, BIT(CAN_TEST_RDA_EN));

return testPassed;

}

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi,

I was on vacation last week. Let me verify your code and come back to you.

Thanks & regards,
Jagadish.

0 jagadish gundavarapu over 2 years ago in reply to jagadish gundavarapu

TI__Guru 70681 points

Hi Cameron,

I was on vacation last week, so I didn't get a chance to look into your code. I will start verifying it now and will provide an update soon.

Thanks & regards,
Jagadish.

0 jagadish gundavarapu over 2 years ago in reply to jagadish gundavarapu

TI__Guru 70681 points

Hi Cameron,

Cameron Fruit said:
I am not running in debug mode and I use RDA for the test

How are you verifying the error flag bits (SEFLG and DEFLG)? I can see that your code does not use any serial printing, so you must be checking the flag status in debug mode, right?

Can you please inject the error with CAN_FLIP_DATA_1BIT and send the error flags over SCI?

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Actually, I'm using SCI to send serial output with the ECC_CS and ECCDIAG_STAT contents after running the above code. They continue to show that DEFLG is frequently being set instead of SEFLG.

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

So you mean that even though you are flipping a single bit using CAN_FLIP_DATA_1BIT, you are getting the "DEFLG" error, right?

One more thing: where are you calling the test function? Are you calling it inside the while(1) loop or outside it, as shown below?

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Correct, I am flipping the single bit using CAN_FLIP_DATA_1BIT (which equals 1), and I am not calling the test function within the while loop.

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

I can see the issue you mentioned; it happens for some addresses. Let me debug further and discuss with the internal team before providing an update.

Thanks & regards,
Jagadish.

0 jagadish gundavarapu over 2 years ago in reply to jagadish gundavarapu

TI__Guru 70681 points

Hi Cameron,

The internal team says that if we alter the data bit, it might also alter the two ECC bits. So we should check whether the other altered bits we observe are ECC bits or not.

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Hi Jagadish,

I have ensured that only bit 0 is being altered in the data during the test. I collected data from several trials to show this as follows, cycling power between each trial.

Trial 1:

ECC_CS: DEFLG is NOT set. SEFLG is set.

ECCDIAG_STAT: DEFLG is NOT set. SEFLG is set.

data backup before corruption (dataVal): 5b3dff5d

data after corruption (ramread32): 5b3dff5c

Trial 2:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 593dff5d

data after corruption (ramread32): 593dff5c

Trial 3:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 533dff5d

data after corruption (ramread32): 533dff5c

Trial 4:

ECC_CS: DEFLG is NOT set. SEFLG is set.

ECCDIAG_STAT: DEFLG is NOT set. SEFLG is set.

data backup before corruption (dataVal): 533dff5d

data after corruption (ramread32): 533dff5c

Trial 5:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 791dfe5d

data after corruption (ramread32): 791dfe5c

Trial 6:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 733dff5d

data after corruption (ramread32): 733dff5c

Trial 7:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 593dff5d

data after corruption (ramread32): 593dff5c

Trial 8:

ECC_CS: DEFLG is NOT set. SEFLG is set.

ECCDIAG_STAT: DEFLG is NOT set. SEFLG is set.

data backup before corruption (dataVal): 513dff5d

data after corruption (ramread32): 513dff5c

Trial 9:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 513dff5d

data after corruption (ramread32): 513dff5c

Trial 10:

ECC_CS: DEFLG is set. SEFLG is NOT set.

ECCDIAG_STAT: DEFLG is set. SEFLG is NOT set.

data backup before corruption (dataVal): 5b3dfe5d

data after corruption (ramread32): 5b3dfe5c
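
The ten trials above can be cross-checked with a few lines of host-side C. This is only a sketch for checking the numbers: the Trial type and the helper names are hypothetical, and the values are copied from the trial output as posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One trial: the word before and after corruption, and which flag was reported. */
typedef struct {
    uint32_t before;
    uint32_t after;
    int deflg;  /* 1 if DEFLG was set, 0 if SEFLG was set */
} Trial;

/* Values copied from trials 1 to 10 above. */
static const Trial trials[] = {
    {0x5b3dff5du, 0x5b3dff5cu, 0},
    {0x593dff5du, 0x593dff5cu, 1},
    {0x533dff5du, 0x533dff5cu, 1},
    {0x533dff5du, 0x533dff5cu, 0},
    {0x791dfe5du, 0x791dfe5cu, 1},
    {0x733dff5du, 0x733dff5cu, 1},
    {0x593dff5du, 0x593dff5cu, 1},
    {0x513dff5du, 0x513dff5cu, 0},
    {0x513dff5du, 0x513dff5cu, 1},
    {0x5b3dfe5du, 0x5b3dfe5cu, 1},
};
#define TRIAL_COUNT (sizeof trials / sizeof trials[0])

/* Number of bit positions in which a and b differ. */
unsigned flipped_bits(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    unsigned count = 0u;
    while (diff != 0u) {
        count += (unsigned)(diff & 1u);
        diff >>= 1;
    }
    return count;
}

/* Returns 1 if every trial differs from its backup in bit 0 only. */
int all_trials_flip_only_bit0(const Trial *t, size_t n)
{
    size_t i;
    assert(t != NULL);
    for (i = 0; i < n; i++) {
        if ((t[i].before ^ t[i].after) != 0x1u) {
            return 0;
        }
    }
    return 1;
}

/* Number of trials in which DEFLG was reported. */
unsigned count_deflg(const Trial *t, size_t n)
{
    size_t i;
    unsigned count = 0u;
    assert(t != NULL);
    for (i = 0; i < n; i++) {
        if (t[i].deflg) {
            count++;
        }
    }
    return count;
}
```

Run over this table, the check confirms that every trial flips bit 0 only, and that DEFLG was reported in 7 of the 10 trials. Two further details are visible in the table itself: the backed-up word changes from one power cycle to the next, and trials 3 and 4 (and likewise 8 and 9) have identical data words but different flags, so the data word alone does not determine the outcome.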

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

Thank you for providing more data. Agreed, something is definitely wrong with the SECDED of the DCAN RAM.

Give me some time to discuss this with the internal team and get back to you.

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Hi Jagadish,

Any updates?

Thanks,

Cameron

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

Apologies for the delay. I am trying to contact the internal team about this. Please expect some delay.

Thanks & regards,
Jagadish.

0 Cameron Fruit over 2 years ago in reply to jagadish gundavarapu

Prodigy 55 points

Thanks for letting me know. Appreciate any help you can provide.

Cameron

0 jagadish gundavarapu over 2 years ago in reply to Cameron Fruit

TI__Guru 70681 points

Hi Cameron,

I appreciate your patience. I will try to get a response from the internal team by the end of this week so that we can resolve the issue this week.

Thanks & regards,
Jagadish.

0 Max Wittekind over 2 years ago in reply to jagadish gundavarapu

Expert 1875 points

Hi Cameron, Hi Jagadish,

Just wanted to let you know that we are seeing similar issues with CAN ECC.
We get both the 1-bit and the 2-bit ESM error when testing ECC, and removing the injected ECC error usually does not seem to work correctly.
To work around this issue, we currently just reset the 2-bit errors as well. After the test, we auto-init the CAN RAM as a clean-up step. (We only test during power-on.)

0 Cameron Fruit over 2 years ago in reply to Max Wittekind

Prodigy 55 points

Thanks for the update, Max. May I ask how you implement the auto init of the CAN RAM?

0 Max Wittekind over 2 years ago in reply to Cameron Fruit

Expert 1875 points

Sure. Be aware that an auto-init should be done after activating ECC and before doing the test.
This does not fix the 2-bit error when injecting a 1-bit error, at least for us.

Using the auto-init again after testing is done is just a way to clean up the CAN RAM.
That said, you have to do the following:

1. Enable "Global memory hardware initialization" in the MINITGCR register (TRM Chapter 2.5.1.21)
2. Set the bits in the MSINENA register that activate the auto-init of the CAN RAM (TRM Chapter 2.5.1.22)
3. Wait for the MINIDONE bit in the MSTCGSTAT register to be set (TRM Chapter 2.5.1.23)
4. Turn off auto-init

The sequence is described in TRM 2.2.4.2; you need Table 2-7 to determine which bits to set.
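
The four steps above can be sketched in C roughly as follows. Treat this as an illustration under stated assumptions rather than verified code: MemInitRegs is a hypothetical stand-in for the three system registers so that the sequence can be exercised off-target, the key values 0xA (enable) and 0x5 (disable) and the MINIDONE position (bit 8) should be checked against the TRM chapters listed above, and the RAM select mask is left as a parameter because it has to be taken from Table 2-7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three system registers used by the sequence. */
typedef struct {
    volatile uint32_t MINITGCR;   /* memory hardware initialization global control */
    volatile uint32_t MSINENA;    /* memory self-initialization enable */
    volatile uint32_t MSTCGSTAT;  /* status register holding MINIDONE */
} MemInitRegs;

/* Assumed values; verify against the TRM before use. */
#define MINITGCR_ENABLE_KEY   0xAu
#define MINITGCR_DISABLE_KEY  0x5u
#define MSTCGSTAT_MINIDONE    (1u << 8)

/*
 * Run the auto-init sequence for the RAMs selected by ram_select_mask.
 * max_polls bounds the wait so that a missing MINIDONE cannot hang start-up.
 * Returns 1 if MINIDONE was seen, 0 if the poll limit was reached.
 */
int can_ram_auto_init(MemInitRegs *regs, uint32_t ram_select_mask, uint32_t max_polls)
{
    uint32_t i;
    int done = 0;

    assert(regs != NULL);

    /* 1. Enable global memory hardware initialization. */
    regs->MINITGCR = MINITGCR_ENABLE_KEY;

    /* 2. Select the RAMs to initialize (bits from TRM Table 2-7). */
    regs->MSINENA = ram_select_mask;

    /* 3. Wait for MINIDONE. */
    for (i = 0u; i < max_polls; i++) {
        if ((regs->MSTCGSTAT & MSTCGSTAT_MINIDONE) != 0u) {
            done = 1;
            break;
        }
    }

    /* 4. Turn auto-init off again, whether or not it completed. */
    regs->MINITGCR = MINITGCR_DISABLE_KEY;

    return done;
}
```

On the target, the same three accesses would go to the real system registers instead of this struct. The bounded poll is a design choice of the sketch, not part of the steps above; an unbounded wait on MINIDONE is the simpler alternative if a hang during start-up is acceptable.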

0 Cameron Fruit over 2 years ago in reply to Max Wittekind

Prodigy 55 points

Hi Max and Jagadish,

I added auto-init of CAN RAM just after activating ECC and before running the test, and this appears to solve the issue on my end.

Thanks very much for your help,

Cameron