TMS320F28388D: NMI by uncorrectable Error in CM

Part Number: TMS320F28388D
Other Parts Discussed in Thread: C2000WARE

We have a sporadic problem that we need to get to the bottom of.
The software runs from flash on the CM of the TMS320F28388D (FRDCNTL.RWAIT = 2, CPUCLK = 120 MHz).

Sometimes an NMI is triggered and the CM jumps to the interrupt routine "nmiISR1", where I have set a breakpoint.
The registers then always show the following picture:



So the cause of the interrupt is reported as an uncorrectable error at address 0x1FFFCEB8, and it is always this same address. However, this address lies in C1 RAM, which has no ECC. It belongs to the local variable "i" on the stack, and the value of that variable is not the same on every attempt.
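
To cross-check which memory an address such as 0x1FFFCEB8 belongs to, a small lookup like the following can help. It is a minimal sketch: the region bounds are taken from the linker command file posted later in this thread, the 0x2000 size of C1RAM and C0RAM is an assumption inferred from their origins, and `mem_region_t` and `region_of` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Address ranges as declared in the linker command file in this thread.
 * Assumption: C1RAM and C0RAM are 0x2000 each, inferred from their origins;
 * check the sizes against the device data sheet. */
typedef struct {
    const char *name;
    uint32_t    start;
    uint32_t    size;
} mem_region_t;

static const mem_region_t k_regions[] = {
    { "CM flash bank 0", 0x00200000u, 0x00080000u },
    { "C1RAM",           0x1FFFC000u, 0x00002000u },
    { "C0RAM",           0x1FFFE000u, 0x00002000u },
    { "E0RAM",           0x20010000u, 0x00004000u },
};

/* Return the name of the region containing addr, or "unknown".
 * The unsigned subtraction wraps for addr < start, so one compare suffices. */
static const char *region_of(uint32_t addr)
{
    for (size_t i = 0; i < sizeof k_regions / sizeof k_regions[0]; ++i) {
        if (addr - k_regions[i].start < k_regions[i].size) {
            return k_regions[i].name;
        }
    }
    return "unknown";
}
```

With these bounds, `region_of(0x1FFFCEB8u)` yields "C1RAM", which matches the observation above.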



After the interrupt routine, execution always returns to the same place in the code:



Here wIndex is compared against the object table, which is located in E0 RAM, and E0 RAM does have ECC. So the error does not happen at a random location, but always at this place.

If a NOP is inserted here, the error no longer occurs. The error also disappears if FRDCNTL.RWAIT is set to 4.

According to the specification, FRDCNTL.RWAIT=2 should be sufficient for a CPUCLK of 120 MHz.


The NOP workaround is more efficient than setting RWAIT=4, because it avoids two additional clock cycles of wait on every flash access.
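
As a rough illustration of this cost comparison, the extra stall introduced by raising RWAIT can be estimated as below. This is a sketch only: `extra_wait_ps` is a hypothetical helper, and it assumes every affected flash read pays the full additional wait states, which prefetch and caching may hide in practice.

```c
#include <stdint.h>

/* Extra stall time, in picoseconds, added to one flash read that actually
 * incurs the wait states when RWAIT is raised from rwait_old to rwait_new.
 * Assumes rwait_new >= rwait_old and cpuclk_hz > 0. */
static uint64_t extra_wait_ps(uint32_t rwait_old, uint32_t rwait_new,
                              uint32_t cpuclk_hz)
{
    uint64_t extra_cycles = (uint64_t)rwait_new - rwait_old;
    return extra_cycles * 1000000000000ULL / cpuclk_hz;
}
```

For RWAIT going from 2 to 4 at 120 MHz, `extra_wait_ps(2, 4, 120000000)` gives 16666 ps, i.e. roughly 16.7 ns per affected read.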

I can see the same problem on a second target.
But does this workaround cover the whole problem? What is the actual cause of this behavior?

Best regards

Simon 

  • Hello Simon,

    A couple of notes:

    - C1 RAM has parity, and parity errors will manifest as uncorrectable errors. Is this happening only on one device, or have you tested multiple devices?

    - For Flash, be sure that you do not have code placed in the last 16 words of a Flash bank or ECC-capable RAM (per F2838x errata: Memory: Prefetching Beyond Valid Memory)

    Best regards,
    Ibukun

  • Hello Ibukun 

    Thank you for your answer.

    • Yes, this behavior can also be detected on a second target with the same software. So it is not an error that only occurs in one processor.
    • The linker command file has been adjusted so that for all flash sectors and the E0RAM the end address is 16 words before the physical end. However, the problem remains.

         CMBANK0_SECTOR0  : origin = 0x00200008, length = 0x00003FD7
         CMBANK0_SECTOR1  : origin = 0x00204000, length = 0x00003FE0
         CMBANK0_SECTOR2  : origin = 0x00208000, length = 0x00003FE0
         CMBANK0_SECTOR3  : origin = 0x0020C000, length = 0x00003FE0
         CMBANK0_SECTOR4  : origin = 0x00210000, length = 0x0000FFE0
         CMBANK0_SECTOR5  : origin = 0x00220000, length = 0x0000FFE0
         CMBANK0_SECTOR6  : origin = 0x00230000, length = 0x0000FFE0
         CMBANK0_SECTOR7  : origin = 0x00240000, length = 0x0000FFE0
         CMBANK0_SECTOR8  : origin = 0x00250000, length = 0x0000FFE0
         CMBANK0_SECTOR9  : origin = 0x00260000, length = 0x0000FFE0
         CMBANK0_SECTOR10 : origin = 0x00270000, length = 0x00003FE0
         CMBANK0_SECTOR11 : origin = 0x00274000, length = 0x00003FE0
         CMBANK0_SECTOR12 : origin = 0x00278000, length = 0x00003FE0
         CMBANK0_SECTOR13 : origin = 0x0027C000, length = 0x00003FE0
      
         E0RAM            : origin = 0x20010000, length = 0x00003FE0
      


      The question here is whether this restriction also applies between flash sectors that directly follow one another, or whether it is sufficient to keep the 16 words free at the end of the last sector.
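
    The trimmed lengths in the listing above follow a simple pattern, sketched here for checking. The guard value 0x20 is simply the amount subtracted in that listing (whether it equals the errata's "16 words" in this address space is not confirmed here), and `trimmed_length` and `last_used_address` are hypothetical helper names.

    ```c
    #include <stdint.h>

    /* Amount removed from the end of each region in the listing above. */
    #define PREFETCH_GUARD 0x20u

    /* Length to declare for a region whose full physical length is given. */
    static uint32_t trimmed_length(uint32_t physical_length)
    {
        return physical_length - PREFETCH_GUARD;
    }

    /* Highest address the linker may still allocate in the region. */
    static uint32_t last_used_address(uint32_t origin, uint32_t length)
    {
        return origin + length - 1u;
    }
    ```

    For example, a 0x4000 sector becomes 0x3FE0 and a 0x10000 sector becomes 0xFFE0, so CMBANK0_SECTOR13 ends at 0x0027FFDF instead of 0x0027FFFF.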

    So the problem is not solved yet.

    Best regards

    Simon

  • Hello Simon,

    Your issue is indeed a strange one. I've looked through historical threads to see if there is a similar issue; the closest I've seen is one where the user apparently resolved their issue by moving their stack from C1 to C0 (E2E thread "TMS320F28388S: FLUNCERR error causing the CM to reset but UNC_ERR_ADDR_HIGH points to stack"). Your NOP workaround is fine, but I do want to keep this thread open while I try to explore a potential solution with our team.

    To answer the question above: the restriction only applies to the last sector of the Flash bank.

    Thanks,
    Ibukun

  • Hello Ibukun

    That's crazy, the thread you linked describes exactly the same behavior.
    Of course, I also tried moving the stack from C1RAM to C0RAM and removing the NOP again. But with that change the error came back.
    I also suspect that the NOP workaround may only be effective for this particular build, and I am interested in a more detailed investigation of the cause.

    Best regards

    Simon

  • Hello Ibukun

    I checked two more points:

    • After the NMI with the uncorrectable error, execution always returned to a location in the code where an object table in E0RAM was accessed. When this object table is mapped into C0RAM instead, the error still occurs. So the problem has nothing to do with the ECC in E0RAM.
    • If the function in which the uncorrectable error occurs is executed from RAM, the problem no longer occurs. But this seems to be only a local fix, because the function contains nothing but ordinary C code. So the problem could just as well occur in another code area.
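
    Running a function from RAM, as in the second point, is typically done by placing it in the `.TI.ramfunc` section that the linker command file already defines. The following is a minimal sketch under assumptions: `find_object` is a hypothetical stand-in for the real object-table search, and the `RAMFUNC` macro assumes the TI toolchain defines `__TI_COMPILER_VERSION__` and accepts the GCC-style `section` attribute.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Assumption: on the TI compiler this places the function in .TI.ramfunc.
     * On other compilers RAMFUNC expands to nothing, so the sketch still
     * builds and runs on a host. */
    #if defined(__TI_COMPILER_VERSION__)
    #define RAMFUNC __attribute__((section(".TI.ramfunc")))
    #else
    #define RAMFUNC
    #endif

    /* Hypothetical stand-in for the object-table search discussed above:
     * returns the position of wIndex in table, or -1 if it is absent. */
    RAMFUNC int find_object(const uint16_t *table, size_t count, uint16_t wIndex)
    {
        for (size_t i = 0; i < count; ++i) {
            if (table[i] == wIndex) {
                return (int)i;
            }
        }
        return -1;
    }
    ```

    On the target, the section still has to be copied from its load address in flash to its run address in RAM before the first call; the linker command file in this thread exports RamfuncsLoadStart, RamfuncsLoadSize and RamfuncsRunStart for that purpose.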

    Can you identify the cause of this problem more precisely so that a general work around can be implemented?

    Best regards

    Simon

  • Hello Simon,

    Yes, based on what I can see so far, the error being generated is definitely a Flash ECC error, not a RAM error. I am discussing this with our internal Flash experts, but it will likely be a few days before I can get back to you with a response. Please bear with us in the meantime.

    Best regards,
    Ibukun

  • Hello Simon,

    Could you send your linker command file? If necessary, you can send it privately via direct message.

    Best regards,
    Ibukun

  • Hello Ibukun

    Here is the linker command file in use, as a .txt document (uploading .cmd files is not allowed).

    Best regards 

    Simon

    MEMORY
    {
       /* Flash sectors */
       CMBANK0_RESETISR : origin = 0x00200000, length = 0x00000008 /* Boot to Flash Entry Point */
       CMBANK0_SECTOR0  : origin = 0x00200008, length = 0x00003FF7
       CMBANK0_SECTOR1  : origin = 0x00204000, length = 0x00004000
       CMBANK0_SECTOR2  : origin = 0x00208000, length = 0x00004000
       CMBANK0_SECTOR3  : origin = 0x0020C000, length = 0x00004000
       CMBANK0_RESETISRFW : origin = 0x00210000, length = 0x00000008 /* Flash Entry Point for firmware */
       CMBANK0_SECTOR4  : origin = 0x00210008, length = 0x0000FFF8
       CMBANK0_SECTOR5  : origin = 0x00220000, length = 0x00010000
       CMBANK0_SECTOR6  : origin = 0x00230000, length = 0x00010000
       CMBANK0_SECTOR7  : origin = 0x00240000, length = 0x00010000
       CMBANK0_SECTOR8  : origin = 0x00250000, length = 0x00010000
       CMBANK0_SECTOR9  : origin = 0x00260000, length = 0x00010000
       CMBANK0_SECTOR10 : origin = 0x00270000, length = 0x00004000
       CMBANK0_SECTOR11 : origin = 0x00274000, length = 0x00004000
       CMBANK0_SECTOR12 : origin = 0x00278000, length = 0x00004000
       CMBANK0_SECTOR13 : origin = 0x0027C000, length = 0x00003FE0 /* Flash reduced because prefetching beyond valid memory (TMS320F2838x Real-Time MCUs Silicon Errata) */
    
       CRAM             : origin = 0x1FFFC000,   length = 0x00003ffe // KMa
       //C1RAM            : origin = 0x1FFFC000, length = 0x00001FFF
       //C0RAM            : origin = 0x1FFFE000, length = 0x00001FFF
    
       BOOT_RSVD        : origin = 0x20000000, length = 0x00000800 /* Part of S0, BOOT rom will use this for stack */
       SHARED_RAM_SECT  : origin = 0x20000800, length = 0x0000007F /* This section must be in the same place as in the firmware */
       BLOCK_RAM        : origin = 0x20000880, length = 0x0000f780 //SCS
       //S0RAM            : origin = 0x20000800, length = 0x000037FF
       //S1RAM            : origin = 0x20004000, length = 0x00003FFF
       //S2RAM            : origin = 0x20008000, length = 0x00003FFF
       //S3RAM            : origin = 0x2000C000, length = 0x00003FFF
       E0RAM            : origin = 0x20010000, length = 0x00004000
    
       CPU1TOCMMSGRAM0  : origin = 0x20080000, length = 0x00000800
       CPU1TOCMMSGRAM1  : origin = 0x20080800, length = 0x00000800
       CMTOCPU1MSGRAM0  : origin = 0x20082000, length = 0x00000110
       CMTOCPU1MSGRAM01 : origin = 0x20082110, length = 0x000006f0
       CMTOCPU1MSGRAM1  : origin = 0x20082800, length = 0x00000800
       CPU2TOCMMSGRAM0  : origin = 0x20084000, length = 0x00000800
       CPU2TOCMMSGRAM1  : origin = 0x20084800, length = 0x00000800
       CMTOCPU2MSGRAM0  : origin = 0x20086000, length = 0x00000800
       CMTOCPU2MSGRAM1  : origin = 0x20086800, length = 0x00000800
       ECAT_RAM         : origin = 0x400B1000, length = 0x00004000
    }
    
    SECTIONS
    {
       .resetisr        : > CMBANK0_RESETISRFW
       .vftable         : > CMBANK0_SECTOR4   /* Application placed vector table in Flash*/
       .vtable          : > CRAM             /* Application placed vector table in RAM */
       .text            : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5 | CMBANK0_SECTOR6 | CMBANK0_SECTOR7 | CMBANK0_SECTOR8 | CMBANK0_SECTOR9 | CMBANK0_SECTOR10 | CMBANK0_SECTOR11 | CMBANK0_SECTOR12
       .cinit           : > CMBANK0_SECTOR4  | CMBANK0_SECTOR9
       .pinit           : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5
       .switch          : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5
       .sysmem          : > CRAM
    
       .stack           : > CRAM
       .ebss            : > CRAM
       .econst          : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5
       .esysmem         : > CRAM
       .data            : > E0RAM
       .bss             : >> BLOCK_RAM | ECAT_RAM | CRAM
       .const           : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5
    
        MSGRAM_CM_TO_CPU1 : > CMTOCPU1MSGRAM0, type=NOINIT
        MSGRAM_CM_TO_CPU11 : > CMTOCPU1MSGRAM01, type=NOINIT
        MSGRAM_CM_TO_CPU2 : > CMTOCPU2MSGRAM0, type=NOINIT
        MSGRAM_CPU1_TO_CM : > CPU1TOCMMSGRAM0, type=NOINIT
        MSGRAM_CPU2_TO_CM : > CPU2TOCMMSGRAM0, type=NOINIT
    
        SHARED_RAM      : > SHARED_RAM_SECT, type=NOINIT // NOINIT is necessary
        sharedData      : > BLOCK_RAM
    
        .TI.ramfunc : {} LOAD = CMBANK0_SECTOR4,
                         RUN = BLOCK_RAM,
                         LOAD_START(RamfuncsLoadStart),
                         LOAD_SIZE(RamfuncsLoadSize),
                         LOAD_END(RamfuncsLoadEnd),
                         RUN_START(RamfuncsRunStart),
                         RUN_SIZE(RamfuncsRunSize),
                         RUN_END(RamfuncsRunEnd),
                         ALIGN(16)
    
        /* The following section definitions are for DCSM dual core examples */
        ZONE1_RAM       : > CRAM
        UNSECURE_RAM    : > CRAM
        CSMKEY_RAM      : > BLOCK_RAM
    }
    
    /*
    //===========================================================================
    // End of file.
    //===========================================================================
    */
    

  • Hello Simon,

    One recommendation is to ensure that all Flash sections are 128-bit aligned. Use ALIGN(16) in the section declaration. You can use the 2838x_FLASH_lnk_cm.cmd in the device_support/f2838x/common/cmd/ directory in your C2000Ware installation as an example.
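
    To illustrate what 128-bit alignment means for an address, here is a minimal sketch. It assumes byte addressing on the CM, so that 128 bits correspond to 16 address units; `is_flash_aligned` and `align_up` are hypothetical helper names.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* 128 bits expressed in bytes (assumes a byte-addressed memory map). */
    #define FLASH_ALIGN_BYTES 16u

    /* True if addr sits on a 128-bit boundary. */
    static bool is_flash_aligned(uint32_t addr)
    {
        return (addr % FLASH_ALIGN_BYTES) == 0u;
    }

    /* Round addr up to the next 128-bit boundary (unchanged if aligned). */
    static uint32_t align_up(uint32_t addr)
    {
        return (addr + FLASH_ALIGN_BYTES - 1u) & ~(FLASH_ALIGN_BYTES - 1u);
    }
    ```

    Under that assumption, the CMBANK0_SECTOR4 origin 0x00210008 from the linker file above is not on a 128-bit boundary, and an aligned section placed there would start at 0x00210010.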

    Best regards,
    Ibukun

  • Hello Ibukun

    Thank you very much for this tip. I have adjusted the .cmd file accordingly, rebuilt the firmware, and tested it.
    The problem is still exactly the same.

    Best regards 

    Simon

    MEMORY
    {
       /* Flash sectors */
       CMBANK0_RESETISR : origin = 0x00200000, length = 0x00000008 /* Boot to Flash Entry Point */
       CMBANK0_SECTOR0  : origin = 0x00200008, length = 0x00003FF7
       CMBANK0_SECTOR1  : origin = 0x00204000, length = 0x00004000
       CMBANK0_SECTOR2  : origin = 0x00208000, length = 0x00004000
       CMBANK0_SECTOR3  : origin = 0x0020C000, length = 0x00004000
       CMBANK0_RESETISRFW : origin = 0x00210000, length = 0x00000008 /* Flash Entry Point for firmware */
       CMBANK0_SECTOR4  : origin = 0x00210008, length = 0x0000FFF8
       CMBANK0_SECTOR5  : origin = 0x00220000, length = 0x00010000
       CMBANK0_SECTOR6  : origin = 0x00230000, length = 0x00010000
       CMBANK0_SECTOR7  : origin = 0x00240000, length = 0x00010000
       CMBANK0_SECTOR8  : origin = 0x00250000, length = 0x00010000
       CMBANK0_SECTOR9  : origin = 0x00260000, length = 0x00010000
       CMBANK0_SECTOR10 : origin = 0x00270000, length = 0x00004000
       CMBANK0_SECTOR11 : origin = 0x00274000, length = 0x00004000
       CMBANK0_SECTOR12 : origin = 0x00278000, length = 0x00004000
       CMBANK0_SECTOR13 : origin = 0x0027C000, length = 0x00003FE0 /* Flash reduced because prefetching beyond valid memory (TMS320F2838x Real-Time MCUs Silicon Errata) */
    
       CRAM             : origin = 0x1FFFC000,   length = 0x00003ffe // KMa
       //C1RAM            : origin = 0x1FFFC000, length = 0x00001FFF
       //C0RAM            : origin = 0x1FFFE000, length = 0x00001FFF
    
       BOOT_RSVD        : origin = 0x20000000, length = 0x00000800 /* Part of S0, BOOT rom will use this for stack */
       SHARED_RAM_SECT  : origin = 0x20000800, length = 0x0000007F /* This section must be in the same place as in the firmware */
       BLOCK_RAM        : origin = 0x20000880, length = 0x0000f780 //SCS
       //S0RAM            : origin = 0x20000800, length = 0x000037FF
       //S1RAM            : origin = 0x20004000, length = 0x00003FFF
       //S2RAM            : origin = 0x20008000, length = 0x00003FFF
       //S3RAM            : origin = 0x2000C000, length = 0x00003FFF
       E0RAM            : origin = 0x20010000, length = 0x00004000
    
       CPU1TOCMMSGRAM0  : origin = 0x20080000, length = 0x00000800
       CPU1TOCMMSGRAM1  : origin = 0x20080800, length = 0x00000800
       CMTOCPU1MSGRAM0  : origin = 0x20082000, length = 0x00000110
       CMTOCPU1MSGRAM01 : origin = 0x20082110, length = 0x000006f0
       CMTOCPU1MSGRAM1  : origin = 0x20082800, length = 0x00000800
       CPU2TOCMMSGRAM0  : origin = 0x20084000, length = 0x00000800
       CPU2TOCMMSGRAM1  : origin = 0x20084800, length = 0x00000800
       CMTOCPU2MSGRAM0  : origin = 0x20086000, length = 0x00000800
       CMTOCPU2MSGRAM1  : origin = 0x20086800, length = 0x00000800
       ECAT_RAM         : origin = 0x400B1000, length = 0x00004000
    }
    
    SECTIONS
    {
       .resetisr        : > CMBANK0_RESETISRFW, ALIGN(16)
       .vftable         : > CMBANK0_SECTOR4, ALIGN(16)   /* Application placed vector table in Flash*/
       .vtable          : > CRAM             /* Application placed vector table in RAM */
       .text            : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5 | CMBANK0_SECTOR6 | CMBANK0_SECTOR7 | CMBANK0_SECTOR8 | CMBANK0_SECTOR9 | CMBANK0_SECTOR10 | CMBANK0_SECTOR11 | CMBANK0_SECTOR12, ALIGN(16)
       .cinit           : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR9, ALIGN(16)
       .pinit           : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5, ALIGN(16)
       .switch          : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5, ALIGN(16)
       .sysmem          : > CRAM
    
       .stack           : > CRAM
       .ebss            : > CRAM
       .econst          : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5, ALIGN(16)
       .esysmem         : > CRAM
       .data            : > E0RAM
       .bss             : >> BLOCK_RAM | ECAT_RAM | CRAM
       .const           : >> CMBANK0_SECTOR4 | CMBANK0_SECTOR5, ALIGN(16)
    
        MSGRAM_CM_TO_CPU1 : > CMTOCPU1MSGRAM0, type=NOINIT
        MSGRAM_CM_TO_CPU11 : > CMTOCPU1MSGRAM01, type=NOINIT
        MSGRAM_CM_TO_CPU2 : > CMTOCPU2MSGRAM0, type=NOINIT
        MSGRAM_CPU1_TO_CM : > CPU1TOCMMSGRAM0, type=NOINIT
        MSGRAM_CPU2_TO_CM : > CPU2TOCMMSGRAM0, type=NOINIT
    
        SHARED_RAM      : > SHARED_RAM_SECT, type=NOINIT // NOINIT is necessary
        sharedData      : > BLOCK_RAM
    
        .TI.ramfunc : {} LOAD = CMBANK0_SECTOR4,
                         RUN = BLOCK_RAM,
                         LOAD_START(RamfuncsLoadStart),
                         LOAD_SIZE(RamfuncsLoadSize),
                         LOAD_END(RamfuncsLoadEnd),
                         RUN_START(RamfuncsRunStart),
                         RUN_SIZE(RamfuncsRunSize),
                         RUN_END(RamfuncsRunEnd),
                         ALIGN(16)
    
        /* The following section definitions are for DCSM dual core examples */
        ZONE1_RAM       : > CRAM
        UNSECURE_RAM    : > CRAM
        CSMKEY_RAM      : > BLOCK_RAM
    }
    
    /*
    //===========================================================================
    // End of file.
    //===========================================================================
    */
    

  • Hello Simon,

    We don't have any other potential root cause identified for this issue, unfortunately. It seems you have uncovered a corner case timing limitation in the hardware. Any software workaround (e.g. NOPs) that is consistently reproducible should be fine to use.

    Best regards,
    Ibukun

  • Hi Joseph,

    Perfect, with this clear statement from TI and the two possible workarounds, we can work around the problem.

    Thanks a lot and best regards,

    Simon