TMS570LC4357: Unexpected ECC errors (ESM 2.3) when flashing start of flash using f021 api

Part Number: TMS570LC4357
Other Parts Discussed in Thread: RM57L843

Hello,

I'm using the F021 API to flash the internal FPROM of a TMS570LC4357 device on an HDK board, using automatic ECC generation.

I have a boot flashed in sectors 0 and 1 (the first 32 KB), which includes the interrupt and exception vectors. The application begins at sector 12 (offset 1 MB). It uses the F021 library so that it can reflash either itself or the boot.

The application runs entirely from RAM (text and const for all functions). Flash should only be accessed to fetch the primary interrupt handler, which consists of a branch to the real handler code in RAM.

I always flash in 8 KB chunks, globally masking interrupts while each chunk is programmed.

I succeed in flashing the application itself without any errors.

But when I try to flash the boot, I get an unexpected bus error (ESM 2.3, red LED) during the process. Nonetheless, the boot flashing completes correctly: after a power-on reset, the updated boot executes and no further error occurs.

I also looked at the Cortex events through the PMU and noticed 0x6e and 0x71 events for the first three 8 KB chunks written, and none for the last 8 KB chunk. I also tried disabling the CPU cache, which did not change the overall behaviour but only increased the number of events recorded by the PMU.

The ESM indicates both 1.4 and 2.3 errors (which seems consistent with the events recorded above), while the EPC shows only a single address recorded in the CAM FIFO (0x5308).

How can I debug this and find out what is causing (I suppose) an unexpected access to the flash?

Thanks,

Dominique

  • Hello Dominique,

    The interrupt vectors of a Cortex-R device are located at the beginning of flash sector 0. Before performing a flash operation (erase/write) from your application, please disable the interrupts (IRQ, FIQ) in your application code.
  • Thank you for your response.

    I'm well aware that the Cortex executes the interrupt vectors at 0x18 / 0x1c for IRQ and FIQ. Consequently, I use the following programming sequence:

    disable_irq(); /* disable interrupts at cortex level (write bit I in CPSR) */
    Fapi_issueAsyncCommandWithAddress(Fapi_EraseSector, sector_address);
    while (FAPI_CHECK_FSM_READY_BUSY == Fapi_Status_FsmBusy);
    Fapi_flushPipeline();
    bytes = BANK_WIDTH;
    while (count > 0)
    {
        Fapi_issueProgrammingCommand(dest, src, bytes, 0, 0, Fapi_AutoEccGeneration);
        while (FAPI_CHECK_FSM_READY_BUSY == Fapi_Status_FsmBusy);
        src += bytes;
        dest += bytes;
        count -= bytes;
    }
    enable_irq();


    In addition, if an IRQ were taken while sector 0 is blank, I suppose the CPU would crash since no valid instruction would be fetched!

    In contrast, what I see is that the programming terminates successfully, whereas ESM errors 1.4 and 2.3 occur during the programming.

    Also please note that I do not use FIQ (it is disabled in the Cortex from reset, and I never enable it).

    I did try enabling the FIQ to catch the call to the handler at 0x1C with a JTAG probe, to find out when the ESM 2.3 error occurs. According to the Cortex LR register, it comes from the Fapi_waitDelay() function, which executes from external SDRAM. So I'm a bit confused and don't know where to look further.

    Dominique
  • Hello Dominique,

    The F021 API user guide says that the Fapi_waitDelay() function is deprecated and should not be used in new projects.

    Fapi_flushPipeline() flushes the FMC pipeline buffers. This function assumes that it can read from flash addresses 0, 0x100, 0x200 and 0x300. After you erase sector 0, its ECC area is also erased, so reading any address in sector 0 will cause an ECC error.

    You don't need to call this function between the flash erase and flash program APIs.
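
To make the point about Fapi_flushPipeline() concrete, here is a minimal host-side sketch (not part of the F021 API) that checks how many of the four read addresses named above fall inside an erased range. The 16 KB size for sector 0 is taken from the sector sizes quoted in this thread, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Addresses that Fapi_flushPipeline() is said to read, per the reply above. */
static const uint32_t flush_read_addrs[] = { 0x000u, 0x100u, 0x200u, 0x300u };

/* True if addr lies inside the half-open range [start, start + size). */
bool addr_in_range(uint32_t addr, uint32_t start, uint32_t size)
{
    assert(size != 0u);
    /* Unsigned subtraction wraps to a huge value when addr < start. */
    return (uint32_t)(addr - start) < size;
}

/* Count how many of the flush reads would land in the given erased range. */
size_t flush_reads_hitting(uint32_t start, uint32_t size)
{
    size_t hits = 0;
    size_t n = sizeof flush_read_addrs / sizeof flush_read_addrs[0];

    for (size_t i = 0; i < n; ++i) {
        if (addr_in_range(flush_read_addrs[i], start, size)) {
            ++hits;
        }
    }
    return hits;
}
```

Under these assumptions, flush_reads_hitting(0x00000000, 0x4000) returns 4, while flush_reads_hitting(0x00100000, 0x20000) returns 0 (0x20000 is only a placeholder size for the application sector). As far as this one function is concerned, that would be consistent with the application flashing cleanly while the boot does not.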
  • Hello,

    I apologize for not having provided feedback on the latest proposed solution to the problem described in the original thread. I understand it has been locked in the "TI thinks resolved" state, but unfortunately it is not resolved.

    Our application has changed a bit: the boot, flashed in the first three 16 KB sectors, is now functionally much more stable and seldom needs reflashing, so the problem is less critical. Additionally, I found a workaround: disabling ECC event export from the Cortex to the ESM in the PMCR register while flashing the boot.
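
As a rough illustration of that workaround, the sketch below shows only the bit manipulation involved. It assumes the export control is the X bit (bit 4) of the ARMv7 PMCR, which on target is accessed through CP15 (c9, c12, 0). The helper name is hypothetical, and the actual register access is left out so the snippet can run on a host.

```c
#include <stdint.h>

/* PMCR.X (bit 4): event export enable in the ARMv7 Performance Monitor
 * Control Register. Assumed here to be the bit behind "ECC event export". */
#define PMCR_X_EXPORT_ENABLE (1u << 4)

/* Hypothetical helper: return pmcr with the export bit set or cleared,
 * leaving every other bit untouched. */
uint32_t pmcr_set_export(uint32_t pmcr, int enable)
{
    if (enable) {
        return pmcr | PMCR_X_EXPORT_ENABLE;
    }
    return pmcr & ~PMCR_X_EXPORT_ENABLE;
}
```

On target, the current value would be read with MRC p15, 0, Rt, c9, c12, 0, saved, written back with the bit cleared before the flash operation, and restored afterwards. Presumably, while export is disabled, genuine ECC events from the CPU are not forwarded to the ESM either, so the window should be kept as short as possible.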

    But the problem is not really solved. I knew Fapi_waitDelay() was deprecated and had no use for it in my own code, but it is called internally by TI code in Fapi_flushPipeline(). According to the register context I captured on FIQ, the error came from there, although this code was not accessing flash.

    Following the latest advice from QJ Wang, I removed the unnecessary call to Fapi_flushPipeline() between erase and program. But the ESM 2.3 error still occurs.

    Now, in the FIQ context (breaking with the JTAG debugger at the FIQ handler (0x1c) and looking at the registers), I find that the error occurs somewhere in my code in RAM, on a sequence of instructions that definitely does not perform any flash access.

    It seems the error occurs asynchronously, so the FIQ attached to the ESM is of no help in investigating where the error comes from.

    Thanks

    Dominique

  • Hello Dominique,

    For the TMS570LC4357 and RM57L843 devices, ECC protection of Flash and SRAM memories is always enabled. This differs from Hercules devices in the TMS570LS and RM4 series, where ECC is disabled after reset and remains disabled until enabled by software. It means that you are a lot more likely to see the RED ERR LED go on during development. A common reason for this is that erased flash is full of ECC errors! So if you erase the entire (4 MB) flash, then program sectors 0/1 with your bootloader, most of the flash will still contain ECC errors. All it takes to trip the RED error LED is a read from an area of flash left with ECC errors. This can happen unintentionally (the CPU performs speculative pre-fetches).

    The solution to frequent ECC-related errors during development is to program correct ECC values even for the locations in the main flash array that are left unused. The easiest way to achieve this is to use the linker, rather than the loader, to generate the ECC data.

    One more thing: when your bootloader programs the application to flash, it erases the flash sectors first. The unused portion of each erased sector contains ECC errors. The solution is to pad the application with 0xFFFFFFFF until it reaches the sector boundary.

    Here are my examples:
  • 1. Using the linker to generate ECC for the whole flash:

    /*----------------------------------------------------------------------------*/
    /* Linker Settings */

    --retain="*(.intvecs)"

    /* USER CODE BEGIN (1) */
    /* USER CODE END */

    /*----------------------------------------------------------------------------*/
    /* Memory Map */
    MEMORY
    {
    /* USER CODE BEGIN (2) */
    /* USER CODE END */

    VECTORS (X) : origin=0x00000000 length=0x00000020 vfill = 0xffffffff
    FLASH0 (RX) : origin=0x00000020 length=0x001FFFE0 vfill = 0xffffffff
    FLASH1 (RX) : origin=0x00200000 length=0x00200000 vfill = 0xffffffff
    SRAM (RWX) : origin=0x08002000 length=0x0002D000
    STACK (RW) : origin=0x08000000 length=0x00002000

    /* USER CODE BEGIN (3) */
    ECC_VEC (R) : origin=(0xf0400000 + (start(VECTORS) >> 3))
    length=(size(VECTORS) >> 3)
    ECC={algorithm=algoL2R5F021, input_range=VECTORS}

    ECC_FLA0 (R) : origin=(0xf0400000 + (start(FLASH0) >> 3))
    length=(size(FLASH0) >> 3)
    ECC={algorithm=algoL2R5F021, input_range=FLASH0 }

    ECC_FLA1 (R) : origin=(0xf0400000 + (start(FLASH1) >> 3))
    length=(size(FLASH1) >> 3)
    ECC={algorithm=algoL2R5F021, input_range=FLASH1 }
    /* USER CODE END */

    }

    /* USER CODE BEGIN (4) */
    ECC
    {
    algoL2R5F021 : address_mask = 0xfffffff8 /* Address Bits 31:3 */
    hamming_mask = R4 /* Use R4/R5 built-in mask */
    parity_mask = 0x0c /* Set which ECC bits are Even and Odd parity */
    mirroring = F021 /* RM57Lx and TMS570LCx are built on F021 */
    }
    /* USER CODE END */

    /*----------------------------------------------------------------------------*/
    /* Section Configuration */
    SECTIONS
    {
    /* USER CODE BEGIN (5) */
    /* USER CODE END */
    .intvecs : {} > VECTORS

    /* The root directory is \Debug */
    flashAPI:
    {
    .\F021_Flash_API\02.01.01\source\Fapi_UserDefinedFunctions.obj (.text)
    .\source\bl_flash.obj (.text)
    // .\source\bl_dcan.obj (.text)
    --library= ..\..\..\F021_Flash_API\02.01.01\F021_API_CortexR4_BE_L2FMC.lib (.text)
    } palign=8 load = FLASH0 |FLASH1, run = SRAM, LOAD_START(apiLoadStart), RUN_START(apiRunStart), SIZE(apiLoadSize)

    .text : {} palign=8 > FLASH0 |FLASH1 /*Initialized executable code and constants*/
    .const : {} palign=8 load=FLASH0 |FLASH1, run = SRAM, LOAD_START(constLoadStart), RUN_START(constRunStart), SIZE(constLoadSize) /*Initialized constant data (e.g. const flash_sectors[..] = )*/
    .cinit : {} palign=8 > FLASH0 |FLASH1 /*Initialized global and static variables*/
    .pinit : {} palign=8 > FLASH0 |FLASH1
    .data : {} > SRAM
    .bss : {} > SRAM
    .sysmem : {} > SRAM

    /* USER CODE BEGIN (6) */

    /* USER CODE END */
    }
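
The origin and length expressions of the ECC_VEC, ECC_FLA0 and ECC_FLA1 ranges in the linker file above boil down to simple arithmetic: one ECC byte per 8 bytes of flash, placed from 0xf0400000 upwards. The sketch below restates that arithmetic in C so it can be checked on a host. The base address and the shift come from the linker file; the function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Base of the ECC region, as used in the ECC_* memory ranges above. */
#define ECC_BASE 0xF0400000u

/* ECC address matching a flash address: ECC_BASE + (start >> 3).
 * The flash address is expected to be 8-byte aligned. */
uint32_t ecc_origin(uint32_t flash_start)
{
    assert((flash_start & 7u) == 0u);
    return ECC_BASE + (flash_start >> 3);
}

/* Number of ECC bytes covering flash_size bytes of flash: size >> 3. */
uint32_t ecc_length(uint32_t flash_size)
{
    assert((flash_size & 7u) == 0u);
    return flash_size >> 3;
}
```

For the ranges above this gives 0xF0400000 with 4 ECC bytes for VECTORS, 0xF0400004 with 0x3FFFC bytes for FLASH0, and 0xF0440000 with 0x40000 bytes for FLASH1, so the three ECC ranges are contiguous.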
  • 2. Example code to pad the application with 0xFFFFFFFF in the application's linker cmd file. The bootloader then generates the ECC when programming this application to flash.

    /*----------------------------------------------------------------------------*/
    /* Linker Settings */

    --retain="*(.intvecs)"

    /* USER CODE BEGIN (1) */
    /* USER CODE END */

    /*----------------------------------------------------------------------------*/
    /* Memory Map */

    MEMORY
    {
    /* USER CODE BEGIN (2) */
    /* USER CODE END */
    VECTORS (X) : origin=0x00010020 length=0x00000020
    FLASH_CODE (RX) : origin=0x00010040 length=0x8000 - 0x40 fill=0xFFFFFFFF /* sector 4/5 for application */
    FLASH0 (RX) : origin=0x00018000 length=0x00200000 - 0x18000
    FLASH1 (RX) : origin=0x00200000 length=0x00200000
    STACKS (RW) : origin=0x08000000 length=0x00001500
    RAM (RW) : origin=0x08001500 length=0x0007EB00

    /* USER CODE BEGIN (3) */

    // VECTORS (X) : origin=0x00010020 length=0x00000020
    // FLASH_CODE (RX) : origin=0x00010040 length=0x8000 - 0x40 fill=0xFFFFFFFF /*sector 4/5*/
    // FLASH0 (RX) : origin=0x00018000 length=0x00200000 - 0x18000

    /* USER CODE END */
    }

    /* USER CODE BEGIN (4) */
    /* USER CODE END */


    /*----------------------------------------------------------------------------*/
    /* Section Configuration */

    SECTIONS
    {
    /* USER CODE BEGIN (5) */
    /* USER CODE END */
    .intvecs : {} > VECTORS
    .text align(32) : {} > FLASH_CODE
    .const align(32) : {} > FLASH_CODE
    .cinit align(32) : {} > FLASH_CODE
    .pinit align(32) : {} > FLASH_CODE
    .bss : {} > RAM
    .data : {} > RAM
    .sysmem : {} > RAM


    /* USER CODE BEGIN (6) */
    // .text align(32) : {} > FLASH_CODE
    // .const align(32) : {} > FLASH_CODE
    // .cinit align(32) : {} > FLASH_CODE
    // .pinit align(32) : {} > FLASH_CODE

    /* USER CODE END */
    }

    With these approaches, the ESM errors should be avoided.

    BTW, it is better to disable the interrupts in your bootloader just before jumping to the application.
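
The reply above pads the application in the linker command file. As a variant (not what the reply describes), the same padding could be applied to an image buffer before it is programmed, for example inside a bootloader. The sketch below assumes the image sits in a RAM buffer with spare capacity. The function names are hypothetical and no F021 call is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round len up to the next multiple of sector_size. */
size_t padded_length(size_t len, size_t sector_size)
{
    assert(sector_size != 0);
    size_t rem = len % sector_size;
    return (rem == 0) ? len : len + (sector_size - rem);
}

/* Fill the tail of buf with 0xFF up to the next sector boundary.
 * Returns the padded length, or 0 if the image is empty or the
 * padded image would not fit in capacity bytes. */
size_t pad_image(uint8_t *buf, size_t len, size_t capacity, size_t sector_size)
{
    size_t total = padded_length(len, sector_size);

    if (total > capacity) {
        return 0;
    }
    memset(buf + len, 0xFF, total - len);
    return total;
}
```

Programming the padded length with Fapi_AutoEccGeneration would then leave valid ECC across the whole sector, which is the effect the fill=0xFFFFFFFF directive is meant to achieve at link time.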
  • Hello,

    Thank you for your reply. I had figured out that ECC is always enabled on flash and is wrong on an erased sector. For that reason I take special care when flashing: masking interrupts before erasing and unmasking only after reprogramming a full sector, as well as executing all my code from RAM. I would expect that to work.

    The first sector is entirely flashed (all 16 KB), so padding with 0xFFFFFFFF is not applicable.

    As you mentioned for read accesses: "This can happen unintentionally (the CPU performs speculative pre-fetches)." This is what I suspect is happening, but if this is the case, it is a bug in my software, since it is designed to avoid that by executing from RAM and masking IRQ, and I HAVE TO find it. I'm writing the operating system layer for a safety-critical application and cannot rely on misunderstandings and workarounds.

    So now my question becomes: what means of investigation can I use to find my bug? I thought the FIQ triggered by the ESM after the ECC error would lead me close to the code that performs the unwanted access, but it does not: the interrupt always occurs while executing code in RAM that does not access flash.

    Thanks and regards

    Dominique
  • Hello Dominique,

    Please refer to this thread.
  • Hello,

    I am afraid the link is missing?

    Thanks and Regards,
    Dominique
  • Hello,

    This thread states that a data prefetch can occur anywhere within the address space.

    Please let me know if my understanding is wrong:

    • if the access targets an address that is not physically valid or is not allowed by the MPU, the abort exception will be masked and nothing happens
    • if the access targets a valid and allowed address, it will occur normally. So if it lands on an unprogrammed flash address, it will necessarily trigger the ECC error.

    If this is right, we can deduce that the entire 4 MB of internal flash must be fully programmed (with padding as necessary):

    • either using an external means (JTAG probe)
    • or, if flashing with embedded code (bootloader...), ECC bus error event export must be disabled at Cortex level during the flash operation, since the flashing software itself could trigger such a speculative access unpredictably and fail.

    Maybe one could simply ignore the ECC error when flashing and clear it at the end of the operation (not an option on our custom board design though, because nERROR is tied to reset), or maybe configure the L2FMC to ignore ECC faults from erased sectors?

    What is the preferred/recommended way of managing this issue?

    Thanks and regards

    Dominique