Bus ECC ESM error signaled on TMS570LC4357 with cache enabled when reading from a partially programmed cache line in flash

Other Parts Discussed in Thread: TMS570LC4357, LAUNCHXL2-570LC43, HALCOGEN

Using the following I have been getting ESM errors:

  • Targeting a TMS570LC4357 in a LAUNCHXL2-570LC43
  • Using HALCoGen 04.05.02 to generate the project with the cache enabled
  • CCS 6.1.3.00033 with Hercules Emulation 6.0.7
  • Auto ECC Generation enabled in the CCS Flash Settings
  • TI ARM compiler v15.12.2

As programs were being developed, after some code changes the ERR LED on the LAUNCHXL2-570LC43 would illuminate. From looking at the ESM registers in the CCS debugger, the ESM errors occurred due to the Cortex-R5F reporting a "Bus ECC" fatal bus error event to ESM group 2 channel 3.

The attached project repeatably generates a "Bus ECC" fatal bus error ESM event when programmed with CCS using Auto ECC Generation. The user code which has been added to the HALCoGen generated code is the following in HL_sys_main.c, which writes the contents of some ESM status registers over SCI1:

/* Include Files */

#include "HL_sys_common.h"

/* USER CODE BEGIN (1) */
#include <stdio.h>
#include <string.h>

#include "HL_esm.h"
#include "HL_sci.h"
/* USER CODE END */

/** @fn void main(void)
*   @brief Application main function
*   @note This function is empty by default.
*
*   This function is called after startup.
*   The user can use this function to implement the application.
*/

/* USER CODE BEGIN (2) */

#define UART sciREG1

static void sciDisplayText(sciBASE_t *sci, uint8 *text, uint32 length)
{
    while (length--)
    {
        while ((sci->FLR & 0x4) == 4); /* wait until busy */
        sciSendByte(sci, *text++);     /* send out text   */
    }
}
/* USER CODE END */

void main(void)
{
/* USER CODE BEGIN (3) */
	char buffer[132] = "Inital";

	sciInit ();
	sprintf (buffer, "esmError()=%u\n\r", esmError());
	sciDisplayText (UART, (uint8_t *) buffer, strlen(buffer));

	sprintf (buffer, "Group 1 ESM status: Chans 95-64=0x%llx  Chans 63-0=0x%llx\n\r",
			esmGetStatusUpper (0, 0xffffffffLL),  esmGetStatus (0, 0xffffffffffffffffLL));
	sciDisplayText (UART, (uint8_t *) buffer, strlen(buffer));

	sprintf (buffer, "Group 2 ESM status: Chans 31-0=0x%llx\n\r",
			 esmGetStatus (1, 0xffffffff));
	sciDisplayText (UART, (uint8_t *) buffer, strlen(buffer));

	sprintf (buffer, "Group 3 ESM status: Chans 31-0=0x%llx\n\r",
			 esmGetStatus (2, 0xffffffff));
	sciDisplayText (UART, (uint8_t *) buffer, strlen(buffer));
/* USER CODE END */
}

When main is reached the ERR LED on the launchpad is off. When stepping over the char buffer[132] = "Inital" statement the ERR LED turns on, meaning an ESM error has occurred. By single stepping the code which initializes the buffer[] array, I found that the "Bus ECC" error occurs when the CPU performs a read from the word at address 0x00006460.

The linker map shows that address 0x00006460 is 5 bytes away from the end of the .const section, and that there is a 27 byte "gap" between the end of the .const section and the start of the .cinit section:

SEGMENT ALLOCATION MAP

run origin  load origin   length   init length attrs members
----------  ----------- ---------- ----------- ----- -------
00000000    00000000    000060d0   000060d0    r-x
  00000000    00000000    00000020   00000020    r-x .intvecs
  00000020    00000020    000060b0   000060b0    r-x .text
000060e0    000060e0    00000385   00000385    r--
  000060e0    000060e0    00000385   00000385    r-- .const
00006480    00006480    0000003c   0000003c    r--
  00006480    00006480    0000003c   0000003c    r-- .cinit
08001500    08001500    00000068   00000000    rw-
  08001500    08001500    00000050   00000000    rw- .bss
  08001550    08001550    00000018   00000000    rw- .data

When verbose mode is turned on in the CCS flash settings, the output appears to show that CCS doesn't attempt to program the "gap" between the .const and .cinit sections:

CortexR5: Verifying Flash @ Address 0x00000000 of length 0x000060D0
CortexR5: Writing Flash @ Address 0x000060e0 of Length 0x00000385
CortexR5: Verifying Flash @ Address 0x000060E0 of length 0x00000388
CortexR5: Writing Flash @ Address 0x00006480 of Length 0x0000003c
CortexR5: Verifying Flash @ Address 0x00006480 of length 0x0000003C

What I think is happening is that when the CPU attempts to read the word at address 0x00006460, due to the cache being enabled an entire cache line of 32 bytes from address 0x00006460 to 0x0000647f is read from flash. However, because the last 27 bytes of the cache line haven't been programmed by CCS this causes an ECC error to be detected on the 32 bytes read from flash.

The HL_sys_link.cmd file which has been generated by HALCoGen contains the following section definitions:

    .text   align(32) : {} > FLASH0 | FLASH1
    .const  align(32) : {} > FLASH0 | FLASH1
    .cinit  align(32) : {} > FLASH0 | FLASH1
    .pinit  align(32) : {} > FLASH0 | FLASH1

Note the use of the align(32) directive, which aligns the start address of the sections to a 32 byte boundary but doesn't pad the sections to a multiple of 32 bytes. I manually edited the HL_sys_link.cmd file to use the palign(32) directive instead, which also pads the sections to a multiple of 32 bytes:

    .text   palign(32) : {} > FLASH0 | FLASH1
    .const  palign(32) : {} > FLASH0 | FLASH1
    .cinit  palign(32) : {} > FLASH0 | FLASH1
    .pinit  palign(32) : {} > FLASH0 | FLASH1

Following this change to the project the "Bus ECC" ESM error no longer occurred. However, this is not a permanent change since the sections modified in HL_sys_link.cmd are outside of the USER CODE blocks and so will be lost upon regenerating the project in HALCoGen.

I am not sure which of the following is the right fix:

  1. The HALCoGen linker command files should use palign(32) rather than align(32) for the sections.
  2. The CCS Auto ECC Generation should ensure ECC is set for a multiple of 32 bytes for sections which straddle 32 byte boundaries.

3113.TMS570LC4357_flash_ecc_investigation.zip

  • Hi Chester,
    Your understanding is correct: a cache line refill is 32 bytes, so it fetches bytes for which no corresponding ECC has been programmed. The CCS auto-ECC does not generate ECC for these gaps or for any unimplemented areas in the flash. I believe using palign in the linker command file should solve the majority of issues of this type. However, I'm afraid that you might run into the same type of issue in the future because the CPU can generate speculative accesses, that is, accesses to a location that is later not taken into its pipeline. If the location targeted by a speculative access does not have corresponding ECC, a bus ECC error will occur again. The recommendation is to use linker generated ECC so that the .out file contains both the program image of your application and the ECC for the unimplemented areas. If you do this, you do not need to use auto-ECC in CCS since the .out file already contains the ECC.
  • I added the modified HL_sys_link.cmd from that Wiki page to the project, and changed the CCS Flash Loader Settings as per the Wiki page. However, my example program still generated the same Bus ECC ESM error as before.

    From the CCS flash verbose output, there is still a gap in the main flash writes:

    CortexR5: GEL Output: 	Memory Map Setup for Flash @ Address 0x0
    CortexR5: GEL Output: 	Memory Map Setup for Flash @ Address 0x0 due to System Reset
    CortexR5: GEL Output: 	Memory Map Setup for Flash @ Address 0x0 due to System Reset
    CortexR5: Writing Flash @ Address 0x00000000 of Length 0x000060d0
    CortexR5: Erasing Flash Bank 0, Sector 0
    CortexR5: Erasing Flash Bank 0, Sector 1
    CortexR5: Erasing Flash Bank 0, Sector 2
    CortexR5: Erasing Flash Bank 0, Sector 3
    CortexR5: Erasing Flash Bank 0, Sector 4
    CortexR5: Erasing Flash Bank 0, Sector 5
    CortexR5: Erasing Flash Bank 0, Sector 6
    CortexR5: Erasing Flash Bank 0, Sector 7
    CortexR5: Erasing Flash Bank 0, Sector 8
    CortexR5: Erasing Flash Bank 0, Sector 9
    CortexR5: Erasing Flash Bank 0, Sector 10
    CortexR5: Erasing Flash Bank 0, Sector 11
    CortexR5: Erasing Flash Bank 0, Sector 12
    CortexR5: Erasing Flash Bank 0, Sector 13
    CortexR5: Erasing Flash Bank 0, Sector 14
    CortexR5: Erasing Flash Bank 0, Sector 15
    CortexR5: Erasing Flash Bank 1, Sector 0
    CortexR5: Erasing Flash Bank 1, Sector 1
    CortexR5: Erasing Flash Bank 1, Sector 2
    CortexR5: Erasing Flash Bank 1, Sector 3
    CortexR5: Erasing Flash Bank 1, Sector 4
    CortexR5: Erasing Flash Bank 1, Sector 5
    CortexR5: Erasing Flash Bank 1, Sector 6
    CortexR5: Erasing Flash Bank 1, Sector 7
    CortexR5: Erasing Flash Bank 1, Sector 8
    CortexR5: Erasing Flash Bank 1, Sector 9
    CortexR5: Erasing Flash Bank 1, Sector 10
    CortexR5: Erasing Flash Bank 1, Sector 11
    CortexR5: Erasing Flash Bank 1, Sector 12
    CortexR5: Erasing Flash Bank 1, Sector 13
    CortexR5: Erasing Flash Bank 1, Sector 14
    CortexR5: Erasing Flash Bank 1, Sector 15
    CortexR5: Erasing Flash Bank 7, Sector 0
    CortexR5: Erasing Flash Bank 7, Sector 1
    CortexR5: Erasing Flash Bank 7, Sector 2
    CortexR5: Erasing Flash Bank 7, Sector 3
    CortexR5: Erasing Flash Bank 7, Sector 4
    CortexR5: Erasing Flash Bank 7, Sector 5
    CortexR5: Erasing Flash Bank 7, Sector 6
    CortexR5: Erasing Flash Bank 7, Sector 7
    CortexR5: Erasing Flash Bank 7, Sector 8
    CortexR5: Erasing Flash Bank 7, Sector 9
    CortexR5: Erasing Flash Bank 7, Sector 10
    CortexR5: Erasing Flash Bank 7, Sector 11
    CortexR5: Erasing Flash Bank 7, Sector 12
    CortexR5: Erasing Flash Bank 7, Sector 13
    CortexR5: Erasing Flash Bank 7, Sector 14
    CortexR5: Erasing Flash Bank 7, Sector 15
    CortexR5: Erasing Flash Bank 7, Sector 16
    CortexR5: Erasing Flash Bank 7, Sector 17
    CortexR5: Erasing Flash Bank 7, Sector 18
    CortexR5: Erasing Flash Bank 7, Sector 19
    CortexR5: Erasing Flash Bank 7, Sector 20
    CortexR5: Erasing Flash Bank 7, Sector 21
    CortexR5: Erasing Flash Bank 7, Sector 22
    CortexR5: Erasing Flash Bank 7, Sector 23
    CortexR5: Erasing Flash Bank 7, Sector 24
    CortexR5: Erasing Flash Bank 7, Sector 25
    CortexR5: Erasing Flash Bank 7, Sector 26
    CortexR5: Erasing Flash Bank 7, Sector 27
    CortexR5: Erasing Flash Bank 7, Sector 28
    CortexR5: Erasing Flash Bank 7, Sector 29
    CortexR5: Erasing Flash Bank 7, Sector 30
    CortexR5: Erasing Flash Bank 7, Sector 31
    CortexR5: Writing Flash @ Address 0x000060e0 of Length 0x00000385
    CortexR5: Writing Flash @ Address 0x00006480 of Length 0x0000003c
    CortexR5: Writing Flash @ Address 0xf0400000 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0407ff0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf040ffe0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0417fd0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf041ffc0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0427fb0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf042ffa0 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0437f90 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf043ff80 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0447f70 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf044ff60 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0457f50 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf045ff40 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0467f30 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf046ff20 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf0477f10 of Length 0x00007ff0
    CortexR5: Writing Flash @ Address 0xf047ff00 of Length 0x00000100
    CortexR5: GEL Output: 	Memory Map Setup for Flash @ Address 0x0 due to System Reset

    As well as the modification to enable linker ECC in the HL_sys_link.cmd file, I also changed the align(8) directives to palign(8) and the Bus ECC ESM error stopped occurring. My example program has a .const section which without padding is 0x385 bytes long, i.e. not a multiple of 64 bits (8 bytes). I think that having a section whose length is not a multiple of the 64-bit flash bus width may be causing CCS to program data which results in an ECC error.

    [I did have the CCS flash "Align program segments to 64-bit memory regions" setting checked when the Bus ECC ESM error was occurring with linker ECC and using the align(8) directives]

    My example project with the HL_sys_link.cmd for linker ECC and palign(8) is attached. 8838.TMS570LC4357_flash_ecc_investigation.zip

  • Hi Chester,
    You are right that you need to use the palign as illustrated in the two wiki pages.

    Description of palign() copied from SPNU118
    8.5.4.2.5 Alignment With Padding
    As with align, you can tell the linker to place an output section at an address that falls on an n-byte boundary, where n is a power of 2, by using the palign keyword. In addition, palign ensures that the size of the section is a multiple of its placement alignment restrictions, padding the section size up to such a boundary, as needed
  • Charles Tsai said:
    You are right that you need to use the palign as illustrated in the two wiki pages.

    The information on the two Wiki pages is a bit disjointed:

    1. The http://processors.wiki.ti.com/index.php/Linker_Generated_ECC page which mentions palign() doesn't contain a modified HL_sys_link.cmd file.
    2. The http://processors.wiki.ti.com/index.php/LAUNCHXL2-570LC43-RM57L:_LinkerECCRecommendation page contains a link to a modified HL_sys_link.cmd file with Linker ECC enabled. However, that page doesn't mention palign() and the linked HL_sys_link.cmd file uses align rather than palign.

    Perhaps the Wiki pages could be improved to avoid confusion.

    Also, since linker generated ECC is the recommendation for TMS570LC4357 devices, should the template linker command files in HALCoGen be updated to use linker generated ECC and palign?

    Edit: Correct device name

  • Hi Chester,
    Thanks for your feedback on the wiki page. It could be made clearer. I agree that the linker command file should have been updated with the linker generated ECC option for the LC4357 device. Perhaps part of the hesitance in the decision is that most user projects will do fine without linker generated ECC. Since linker generated ECC deviates from other Hercules devices, it may sometimes create confusion when loading the .out into CCS, especially for users who use both the LC4357 and other devices, as you will need to make sure that auto-ECC is disabled in the CCS flash settings for the LC4357. We will have an internal discussion to align on this topic and update this post.
  • Charles Tsai said:
    Since linker generated ECC deviates from other Hercules devices, it may sometimes create confusion when loading the .out into CCS, especially for users who use both the LC4357 and other devices, as you will need to make sure that auto-ECC is disabled in the CCS flash settings for the LC4357.

    One other comment is that the "Changes to CCS Flash Loader Settings" section of the Wiki page says that to program a TMS570LC4357 .out file containing linker generated ECC, the Flash Verification Settings should be set to 'None'.

    I can confirm that if I don't set the Flash Verification Settings to 'None', CCS 6.1.3 reports a verification failure and aborts the download. My concern is that with verification set to 'None', would the user be aware of an actual flash programming failure?

    I am not sure if the need to set Flash Verification Settings to 'None' is a limitation of the TMS570LC4357 device or of the CCS / flash programming software.

  • Hi Chester,
    I think this is an issue with the CCS flash loader. I think the way CCS works is that verification starts as soon as the program image is loaded to the flash. The .out file contains the program image first, with the ECC appended after it. At the point where the program image has been loaded to the flash, the corresponding ECC has not yet been programmed. If verification starts at that point, it will fail because the ECC is not yet programmed. If verification starts after both the program image and the ECC have been programmed, the problem should go away. This problem is supposed to be addressed in the new CCS release. I will check on this.