TMS570LC4357: ESM Group 2 Bit 3 Error -- Cortex-R5 Core Fatal Bus Error

Pashan None

Genius 5660 points

Part Number: TMS570LC4357

Hello Support,

How does the ESM Group 2 Bit 3 Error -- Cortex-R5 Core Fatal Bus Error occur?

Does it mean CCM-R5 will not create any fault?

Essentially, how should I understand the meaning of the ESM Group 2 Bit 3 Error? Any details will be helpful.

Thank you.

Regards

Pashan

over 8 years ago

0 Chuck Davenport over 8 years ago

TI__Guru 59540 points

This ESM error is used on the LC4357 because of the new bus structure associated with the Cortex-R5F. As you know, the memories (SRAM and Flash) are located on the L2 bus interface. When a critical error occurs, such as an uncorrectable ECC error, the error is captured and flagged through this ESM channel. This is unique to the LC4357 because of the bus structure and the latency in the error notification to the CPU. Because of this latency, it is not possible to differentiate between an uncorrectable error in RAM and one in Flash. For more information, see the discussion of event bus usage relative to Flash and SRAM in the TRM.
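
As a rough illustration of the flagging described above, here is a minimal C sketch that tests whether channel 3 is set in a snapshot of the ESM group 2 status word. The function and macro names are hypothetical, chosen only for this example. How the status word is actually read from the device (register address or driver accessor) is left to the caller and should be taken from the TRM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Channel number from the thread title: ESM Group 2, Bit 3.
 * The macro name is hypothetical, chosen for this sketch. */
#define ESM_G2_CPU_FATAL_BUS_ERROR_CH 3u

/* Returns true if `channel` (0..31) is set in a snapshot of an ESM
 * group status word. Out-of-range channels are reported as not set. */
bool esm_channel_is_set(uint32_t group_status, unsigned channel)
{
    if (channel >= 32u) {
        return false;
    }
    return ((group_status >> channel) & 1u) != 0u;
}

/* Convenience wrapper for the Cortex-R5 fatal bus error channel. */
bool esm_g2_fatal_bus_error(uint32_t group2_status)
{
    return esm_channel_is_set(group2_status, ESM_G2_CPU_FATAL_BUS_ERROR_CH);
}
```

A caller would pass in the value it read from the group 2 status register, for example `esm_g2_fatal_bus_error(status)`. Note that a set bit says only that an uncorrectable error was signaled; per the explanation above, it does not say whether the source was RAM or Flash.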

0 Pashan None over 8 years ago in reply to Chuck Davenport

Genius 5660 points

Hello Chuck,

So, when one gets the ESM Group 2 Bit 3 Error, at least one of the following registers will have its Double-Bit ECC Error Flag set:

L2RAMW Error Status Register (RAMERRSTATUS)
Flash Global Error and Status Register (FEDAC_GBLSTATUS)

Is that a correct statement?
Please confirm.

Assume the setup of the interface is done correctly according to the TRM and ARM document.

Thank you.
Regards
Pashan

0 Charles Tsai over 8 years ago in reply to Pashan None

TI__Guru 191906 points

Hello Pashan,

When an uncorrectable ECC error is detected on the CPU's AXI-M bus, the CPU signals this error event via its event bus interface. The EVENTBUS(48) is then directly connected to the ESM GP2.3. There are no flags captured in either the L2RAMW or the L2FMC. As explained by Chuck, the LC4357 is a cache-based architecture. It is different from LS31xx-type devices. The time from when the L2FMC flash wrapper returns the data to the CPU's AXI interface until the error is detected is no longer deterministic in terms of cycles. The flash/RAM wrapper can no longer associate an ECC error detected and signaled on the event bus with the error address that caused it. Therefore, there is no longer an uncorrectable error address register in the flash/RAM wrapper either.

In my opinion, the Cortex-R5 architecture also has a flaw here. When an uncorrectable ECC error is detected, the CPU only signals the error event via the event bus; it does not output the associated error address on its boundary. We can only route the event bus(48) to the ESM. No error address is recorded. You CANNOT use the data fault address register in the CPU. That is only used for faults detected on the ATCM/BTCM, not on AXI.
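
To make the consequence concrete, here is a minimal C sketch of how a diagnostic record for this event might be filled in. The struct, function, and field names are all hypothetical. The only facts taken from the explanation above are that the event arrives on ESM group 2 channel 3 and that no error address is available, either from the CPU or from the flash/RAM wrappers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical diagnostic record for the fatal bus error event. */
typedef struct {
    bool     fatal_bus_error; /* ESM group 2, channel 3 was flagged */
    bool     address_valid;   /* always false: no address is recorded */
    uint32_t address;         /* meaningless unless address_valid */
} bus_fault_record_t;

/* Builds a record from a snapshot of the ESM group 2 status word.
 * The data fault address register is deliberately not consulted,
 * because it does not apply to errors detected on the AXI bus. */
bus_fault_record_t make_bus_fault_record(uint32_t esm_group2_status)
{
    bus_fault_record_t rec;

    rec.fatal_bus_error = ((esm_group2_status >> 3) & 1u) != 0u;
    rec.address_valid   = false;
    rec.address         = 0u;
    return rec;
}
```

The point of the explicit `address_valid` field is that any logging or recovery code built on such a record cannot accidentally treat a stale address as the location of the fault.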

0 Pashan None over 8 years ago in reply to Charles Tsai

Genius 5660 points

Hello Charles,

Thank you for your detailed explanation of the connections between IP blocks.

So, under what condition will Bit 13 of the FEDAC_GBLSTATUS register in the TMS570LC4357 device be SET?

What does Implicit Read mean in the TRM description of Bit 13 of the FEDAC_GBLSTATUS register?

Eagerly waiting to hear from you Charles.
Thank you.
Regards
Pashan

0 Charles Tsai over 8 years ago in reply to Pashan None

TI__Guru 191906 points

Hello Pashan,
Bit 13 of FEDAC_GBLSTATUS is related to the implicit reads that the wrapper performs on the flash bank before the CPU is even released from reset. After system reset, the wrapper does some reads from the flash bank's OTP sector. These include reading the AJSM password and the device configuration. These reads are NOT CPU reads; the CPU is still held in reset. Inside the flash wrapper there is ECC logic that detects ECC faults for these implicit reads. After these two reads are done, the CPU is released from reset.
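
As a small illustration, startup code could check this flag in a value read from FEDAC_GBLSTATUS once the CPU is running. The sketch below is a minimal C example under stated assumptions: the macro and function names are hypothetical, only the bit position (13) comes from this thread, and reading the register itself is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position from this thread; the macro name is hypothetical. */
#define FEDAC_GBLSTATUS_IMPLICIT_READ_BIT 13u

/* Returns true if the implicit-read ECC flag is set in a snapshot of
 * FEDAC_GBLSTATUS, meaning the flash wrapper saw an ECC fault during
 * its own OTP reads while the CPU was still held in reset. */
bool flash_implicit_read_ecc_fault(uint32_t fedac_gblstatus)
{
    return ((fedac_gblstatus >> FEDAC_GBLSTATUS_IMPLICIT_READ_BIT) & 1u) != 0u;
}
```

Because these reads are performed by the wrapper and not by the CPU, this flag is separate from the ESM Group 2 Bit 3 event discussed earlier in the thread.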

0 Pashan None over 8 years ago in reply to Charles Tsai

Genius 5660 points

Hello Charles,

Really appreciate all these explanations and insights you are providing.
So, if I see ESM Group 2 Bit 3 as SET in the system, then essentially it is a Silicon Error within the device, assuming Flash ECC Contents are correct everywhere.
Is that correct?
Please confirm.
Thank you.
Regards
Pashan

0 Charles Tsai over 8 years ago in reply to Pashan None

TI__Guru 191906 points

Hello Pashan,

Yes, your understanding is correct. If the flash bank is populated with data and the corresponding ECC, and the CPU then detects an error, the error can be attributed to faults in the flash bank or even in the logic between the flash bank and the CPU's AXI interface. Between the flash bank and the CPU are the flash wrapper and the CPU Interconnect Subsystem. You could have a stuck-at fault on the digital datapath. This will be caught too.
