FMC triggers no abort on access to invalid flash ECC area

Other Parts Discussed in Thread: TMS570LS3137

Hi all,

I'm currently trying to track down a major problem in our software. It seems that a stray pointer is accessing memory regions that it should not. However, nothing happens when these forbidden accesses occur.

For example, I would expect a data abort as soon as a flash location with an uncorrectable ECC error is accessed. Normally this mechanism works well, but in this particular case nothing happens. BTW: the ESM flags are set.

When I stop the program's execution with the debugger, I can see that the FMC's registers for the addresses of correctable and uncorrectable errors are filled with values. In my case it's 0x10000 for the correctable and 0x10008 for the uncorrectable one. Both are in an unused bootloader sector.

Now the strange thing I observed:
When the error occurs, the (undefined) register at address 0xFFF87300 changes its value to 0x00060000. This register is not documented anywhere. A read/write breakpoint on this location does not work either. How can the content of this register be interpreted?

I have attached the contents of the ESM and FMC registers as Intel HEX files so that a "good" and a "bad" run can be compared.

The device is a TMS570LS3137 with F021 flash module.

Kind regards,
Michael

8103.runtime_bug_registers.zip

  • Hello Michael,

     An uncorrectable ECC error will result in an abort. To find out what caused an abort, you can look at the instruction or data fault status register and the instruction or data fault address register in the CPU. Since you have a data read, you would look at the DFSR and DFAR. Since you are not getting an abort, you should be getting a correctable error. Are you seeing bit 6 of ESM Group 1 getting set? Note that if you really want a correctable error to also cause an abort, you can set bit 2 and bit 3 of the secondary auxiliary control register in the CPU. This will disable ECC correction.
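Once a DFSR value has been read in an abort handler or with the debugger, a small helper can make it easier to interpret. The following is only a minimal host-side sketch: the function names are hypothetical, and the three status encodings are taken from the ARMv7-R fault status scheme as I understand it, so they should be checked against the fault status table in the Cortex-R4 TRM before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* On the target, the raw values would be read from CP15, for example:
 *   MRC p15, 0, <Rt>, c5, c0, 0   ; DFSR
 *   MRC p15, 0, <Rt>, c6, c0, 0   ; DFAR
 * This sketch only decodes a DFSR value that has already been read. */

/* The 5-bit fault status is DFSR[10] concatenated with DFSR[3:0]. */
unsigned dfsr_fault_status(uint32_t dfsr)
{
    return (unsigned)(((dfsr >> 10) & 1u) << 4) | (unsigned)(dfsr & 0xFu);
}

/* DFSR[11] is set when the aborted access was a write. */
int dfsr_was_write(uint32_t dfsr)
{
    return (int)((dfsr >> 11) & 1u);
}

/* Assumed encodings (verify against the Cortex-R4 TRM). */
const char *dfsr_describe(uint32_t dfsr)
{
    unsigned status = dfsr_fault_status(dfsr);
    assert(status < 32u); /* the status field is 5 bits wide */

    switch (status) {
    case 0x08: return "synchronous external abort";
    case 0x18: return "asynchronous parity/ECC error";
    case 0x19: return "synchronous parity/ECC error";
    default:   return "other (see the fault status table in the TRM)";
    }
}
```

For a synchronous fault, DFAR holds the address of the access that aborted, which can then be compared with the error address captured by the flash module.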

     The value 0x00060000 in the register at 0xFFF87300 is expected. It is a status register. When bits 18 and 17 are set, it means that the error status and error address registers in the flash module are frozen and new error events will not affect them. The flash module is implemented such that if there are multiple back-to-back error events, it only captures the error address of the first one. Once the error address has been read, a new error address will be captured if another error is detected.
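When matching a raw register value against bit descriptions, it helps to list the set bit positions explicitly. The sketch below does that for an arbitrary 32-bit value; the helper name is hypothetical and it says nothing about what the bits mean in the flash module. For 0x00060000 it reports bits 17 and 18.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the positions of all set bits in `value` into `out`, lowest bit
 * first, and returns how many were found. `out` must hold 32 entries. */
size_t set_bit_positions(uint32_t value, unsigned out[32])
{
    size_t count = 0;

    assert(out != NULL);
    for (unsigned bit = 0; bit < 32; ++bit) {
        if (value & (UINT32_C(1) << bit)) {
            out[count++] = bit;
        }
    }
    return count;
}
```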

    regards,

    Charles

  • Hi Michael,

    Here is some info that could help to clarify the condition that you are describing:

    Literature Number: SPNU499B
    November 2012–Revised August 2013
    Section:
    5.3.1 SECDED Initialization
    .......
    The ECC values for all of the ATCM program memory space (flash banks 0 through 6) must be programmed into the flash before SECDED is enabled. This can be done by generating the correct values of the ECC with an external tool such as nowECC or may be generated by the programming tool. The Cortex R4 CPU may generate speculative fetches to any location within the ATCM memory space. A speculative fetch to a location with invalid ECC, which is subsequently not used, will not create an abort, but will set the ESM flags for a correctable or uncorrectable error. An uncorrectable error will unconditionally cause the nERROR pin to toggle low. Therefore care must be taken to generate the correct ECC for the entire ATCM space including the holes between sections and any unused or blank flash areas.
    .......

    As you can see, this can happen if the CPU generates speculative fetches and the ECC has not been calculated for the entire ATCM program memory space (flash banks 0 through 6), e.g. for the holes between sections of Bank 0.
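To see which areas would need a fill value and ECC, one could list the holes in the memory map. The following is a minimal host-side sketch, not a TI tool: the type, the function name, and the section layout in the usage note are all hypothetical, and the section list is assumed to be sorted and non-overlapping (for example, taken from the linker map file).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end). */
typedef struct {
    uint32_t start;
    uint32_t end;
} range_t;

/* Finds the parts of `flash` that are not covered by any entry in
 * `sections`. `sections` must be sorted by start address, must not
 * overlap, and must lie inside `flash`. Stores at most `max_gaps`
 * gaps in `gaps` and returns the number stored. */
size_t find_unprogrammed_gaps(range_t flash, const range_t *sections, size_t n,
                              range_t *gaps, size_t max_gaps)
{
    size_t count = 0;
    uint32_t cursor = flash.start;

    for (size_t i = 0; i < n; ++i) {
        assert(sections[i].start >= cursor);
        assert(sections[i].end >= sections[i].start);
        assert(sections[i].end <= flash.end);

        if (sections[i].start > cursor && count < max_gaps) {
            gaps[count++] = (range_t){ cursor, sections[i].start };
        }
        cursor = sections[i].end;
    }
    /* Everything after the last section is also a hole. */
    if (cursor < flash.end && count < max_gaps) {
        gaps[count++] = (range_t){ cursor, flash.end };
    }
    return count;
}
```

With a made-up layout of sections at 0x0 to 0x20, 0x20 to 0x8000 and 0x20000 to 0x40000 in a flash range of 0x0 to 0x300000, this reports the holes 0x8000 to 0x20000 and 0x40000 to 0x300000. Addresses such as 0x10000 and 0x10008 would fall into the first hole, so that area would need valid ECC generated by nowECC or the programming tool.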

    ----------------------------------------------------------------------------------------------------------

    Hi Charles,

    Could you please help me to clarify the following items?

    1. Is the information/analysis above correct?

    2. Is it possible to prevent the CPU from generating speculative fetches?

    3. If 2. is not possible:  Is the generation of the ECC for the full ATCM program memory space (flash banks 0 through 6) the only way to avoid the following problem:

    "A speculative fetch to a location with invalid ECC, which is subsequently not used, will not create an abort, but will set the ESM flags for a correctable or uncorrectable error. An uncorrectable error will unconditionally cause the nERROR pin to toggle low."

    Thank you,

    Henry

  • Hi Henry,

    1. The statement that all unused sections of the flash memory must be initialized with valid ECC is correct. I think Michael's observation may very well be related to uninitialized ECC values.

    2. You cannot prevent speculative fetches. The CPU can prefetch the next instructions. If the next instruction is, for example, an LDR, the data to be referenced will be prefetched. If the address pointer for the LDR instruction is not set yet, this speculative data fetch can go to an uninitialized area.

    3. Yes, a non-taken speculative fetch will not result in an abort if the fetch has an ECC error, but it will set the ESM flags.

    regards,

    Charles

     

  • Thanks for your feedback Charles.

    BR,

    Henry