MSPM0G3507: "Hard Fault" when running images built with the GNU ARM toolchain

Part Number: MSPM0G3507

Hi TI Experts!

I'm seeing weird issues when running images built with the GNU ARM toolchain on the MSPM0G3507. So far I've identified two spots where hard faults occur, but they occur only when both of the following apply:

1. Running "release builds", i.e. compiled with -Os.
2. No debugger is connected.

These conditions make it a little difficult to pinpoint what exactly triggers such a hard fault, but I think I was able to do that for one such occurrence:

In a release build with GNU ARM toolchain arm-gnu-toolchain-14.2.rel1-mingw-w64-i686-arm-none-eabi, the hard fault seems to occur when the routine DL_SYSCTL_configSYSPLL() of TI SDK version 2_03_00_07 writes to SYSCTL->SOCLOCK.SYSPLLPARAM0:

void DL_SYSCTL_configSYSPLL(DL_SYSCTL_SYSPLLConfig *config)
{
    /* PLL configurations are retained in lower reset levels. Set default
     * behavior of disabling the PLL to keep a consistent behavior regardless
     * of reset level. */
    DL_SYSCTL_disableSYSPLL();

    /* Check that SYSPLL is disabled before configuration */
    while ((DL_SYSCTL_getClockStatus() & (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) !=
           (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) {
        ;
    }

    // set SYSPLL reference clock
    DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0,
        ((uint32_t) config->sysPLLRef), SYSCTL_SYSPLLCFG0_SYSPLLREF_MASK);

    // set predivider PDIV (divides reference clock)
    DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1, ((uint32_t) config->pDiv),
        SYSCTL_SYSPLLCFG1_PDIV_MASK);

    // save CPUSS CTL state and disable the cache
    uint32_t ctlTemp = DL_CORE_getInstructionConfig();
    DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED,
        DL_CORE_LITERAL_CACHE_ENABLED);

    // populate SYSPLLPARAM0/1 tuning registers from flash, based on input freq
    SYSCTL->SOCLOCK.SYSPLLPARAM0 =
        *(volatile uint32_t *) ((uint32_t) config->inputFreq);          /* <-- Hard Fault occurs here */
    SYSCTL->SOCLOCK.SYSPLLPARAM1 =
        *(volatile uint32_t *) ((uint32_t) config->inputFreq + (uint32_t) 0x4);

    // restore CPUSS CTL state
    CPUSS->CTL = ctlTemp;

    // set feedback divider QDIV (multiplies to give output frequency)
    DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1,
        ((config->qDiv << SYSCTL_SYSPLLCFG1_QDIV_OFS) &
            SYSCTL_SYSPLLCFG1_QDIV_MASK),
        SYSCTL_SYSPLLCFG1_QDIV_MASK);

    // write clock output dividers, enable outputs, and MCLK source to SYSPLLCFG0
    DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0,
        (((config->rDivClk2x << SYSCTL_SYSPLLCFG0_RDIVCLK2X_OFS) &
             SYSCTL_SYSPLLCFG0_RDIVCLK2X_MASK) |
            ((config->rDivClk1 << SYSCTL_SYSPLLCFG0_RDIVCLK1_OFS) &
                SYSCTL_SYSPLLCFG0_RDIVCLK1_MASK) |
            ((config->rDivClk0 << SYSCTL_SYSPLLCFG0_RDIVCLK0_OFS) &
                SYSCTL_SYSPLLCFG0_RDIVCLK0_MASK) |
            config->enableCLK2x | config->enableCLK1 | config->enableCLK0 |
            (uint32_t) config->sysPLLMCLK),
        (SYSCTL_SYSPLLCFG0_RDIVCLK2X_MASK | SYSCTL_SYSPLLCFG0_RDIVCLK1_MASK |
            SYSCTL_SYSPLLCFG0_RDIVCLK0_MASK |
            SYSCTL_SYSPLLCFG0_ENABLECLK2X_MASK |
            SYSCTL_SYSPLLCFG0_ENABLECLK1_MASK |
            SYSCTL_SYSPLLCFG0_ENABLECLK0_MASK |
            SYSCTL_SYSPLLCFG0_MCLK2XVCO_MASK));

    // enable SYSPLL
    SYSCTL->SOCLOCK.HSCLKEN |= SYSCTL_HSCLKEN_SYSPLLEN_ENABLE;

    // wait until SYSPLL startup is stabilized
    while ((DL_SYSCTL_getClockStatus() & SYSCTL_CLKSTATUS_SYSPLLGOOD_MASK) !=
           DL_SYSCTL_CLK_STATUS_SYSPLL_GOOD) {
        ;
    }
}
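
As an aside for readers of the listing: the DL_Common_updateReg() calls are masked read-modify-write updates of a register field. The following is a minimal host-side sketch of that pattern, not the SDK source; update_reg() and merge_xor() are hypothetical names introduced here for illustration. The XOR form is the shape an optimizing compiler may choose for the same merge (a sequence of eors, ands, eors).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for DL_Common_updateReg(): replace only the bits
 * selected by mask and leave all other bits of the register untouched. */
static inline void update_reg(volatile uint32_t *reg, uint32_t val, uint32_t mask)
{
    assert(reg != NULL);
    uint32_t tmp = *reg;          /* read the current register value */
    tmp &= ~mask;                 /* clear the field                  */
    *reg = tmp | (val & mask);    /* write back with the new field    */
}

/* The same merge in XOR form: (old ^ val) keeps the differing bits,
 * the mask selects the field, and the final XOR applies the change. */
static inline uint32_t merge_xor(uint32_t old, uint32_t val, uint32_t mask)
{
    return ((old ^ val) & mask) ^ old;
}
```

Both forms produce the same register value, so seeing the XOR sequence in an -Os build does not by itself point to a miscompilation.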

To repeat:
The hard fault does not occur when:

1. The code was compiled with -O0. This is also true when I compile only this specific routine with -O0 (using __attribute__((optimize("O0")))).
2. I'm running the code with a debugger connected.

As such, any programming issues (i.e. the argument config holding some faulty data) can be ruled out from my perspective.

The other occurrence seems to appear with GNU ARM toolchain v9.2.1 only, in the FreeRTOS routine vPortSuppressTicksAndSleep() (this requires #define configUSE_TICKLESS_IDLE 1). I'm not sure exactly where in this routine the hard fault occurs in this case.

What makes this occurrence even weirder is the fact that it goes away when I add an arbitrary interrupt handler to my code instead of letting the "Default_Handler" manage it ... even though this interrupt is neither used nor enabled in the firmware!

Sounds crazy, right? However:

I can reproduce this behavior consistently. Likewise, I can consistently get rid of these issues when I apply any of the workarounds mentioned above.

Do you have any explanation for this behavior?

Thanks,
Chris.

  • Hi Chris,

    This looks like a compiler issue.

    1. The code was compiled with -O0. This is also true when I compile only this specific routine with -O0 (using __attribute__((optimize("O0")))).
    2. I'm running the code with a debugger connected.

    Is this reproduced in the SDK example project?

    I can then report it to the TI team for comments, although the fix would probably have to come from the Arm toolchain team.

    I haven't installed this GNU compiler. What I suggest is below:

    Can you add some code alignment inside this API function and see whether the hard fault still occurs? Something like below:

        __NOP();
        
        // save CPUSS CTL state and disable the cache
        uint32_t ctlTemp = DL_CORE_getInstructionConfig();
        DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED,
            DL_CORE_LITERAL_CACHE_ENABLED);
            
        __NOP();
    
        // populate SYSPLLPARAM0/1 tuning registers from flash, based on input freq
        SYSCTL->SOCLOCK.SYSPLLPARAM0 =
            *(volatile uint32_t *) ((uint32_t) config->inputFreq);
        SYSCTL->SOCLOCK.SYSPLLPARAM1 =
            *(volatile uint32_t *) ((uint32_t) config->inputFreq + (uint32_t) 0x4);
        
        __NOP();
        // restore CPUSS CTL state
        CPUSS->CTL = ctlTemp;
        
        __NOP();
        // set feedback divider QDIV (multiplies to give output frequency)
        DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1,
            ((config->qDiv << SYSCTL_SYSPLLCFG1_QDIV_OFS) &
                SYSCTL_SYSPLLCFG1_QDIV_MASK),
            SYSCTL_SYSPLLCFG1_QDIV_MASK);

    Maybe you can use the TI ARM CLANG compiler for development; I believe it will have no issues:

    https://www.ti.com/tool/ARM-CGT?keyMatch=TI%20CLANG&tisearch=universal_search 

    B.R.

    Sal

  • Hi Sal,

    Is this reproduced in the SDK example project?

    No. I'm seeing this issue in the firmware we have written for our custom HW design. I did go to lengths to confirm what I wrote above already and don't have the time to try to replicate it using one of the SDK examples (which probably wouldn't run on our custom HW w/o considerable changes anyway).

    Can you add some code alignment inside this API function and see whether the hard fault still occurs?

    I tried that and the issue goes away. However:
    I have listed ways to work around this issue above already. As such, I know how to work around this specific instance or manifestation of this issue.

    The more interesting questions are:

    1. Why does it occur in the first place?
    2. Are there any other places in your SDK or elsewhere (like in the FreeRTOS code mentioned above) where it might appear as well?

    From my perspective, this indicates some issue in the MSPM0* MCU itself!

    What the two manifestations of this issue seem to have in common is that the core clock or something closely related to it changes.

    The first manifestation changes the SYSPLL configuration (which should have no direct and/or immediate impact on the core clock, but still), whereas the second manifestation in FreeRTOS puts the MCU to sleep by executing the assembler instruction wfi. As I said above, I have not (yet) been able to pinpoint the specific instruction that triggers the hard fault, but it is somewhere around the execution of wfi in the FreeRTOS routine vPortSuppressTicksAndSleep().

    The way in which one can work around this issue indicates that a specific sequence of (assembler) operations in combination with altering the core clock (or something closely related to it) triggers such a hard fault.

    This is what concerns me the most. 

    Maybe you can use the TI ARM CLANG compiler for development; I believe it will have no issues

    I did see the "FreeRTOS issue" (see above) with the TI ARM CLANG compiler once as well, but am no longer able to reproduce it. Regardless:
    We are using the GNU ARM toolchain for lots of other products and don't necessarily want to divert from that.

    Also:
    I have analyzed the assembler code that triggers this hard fault and don't see anything wrong with it. I've attached the file for your reference. It contains:

    1. The values of the lr and pc registers reported in the hard fault.
    2. A snapshot of the C routine DL_SYSCTL_configSYSPLL().
    3. The corresponding assembler code with my personal comments and a somewhat accurate mapping of this code to the C code.

    The hard fault occurs at address 0x00001342.

    DL_SYSCTL_configSYSPLL_gcc_release.txt
    lr = 0x0000BEDF 
    pc = 0x00001342
    
    void DL_SYSCTL_configSYSPLL(DL_SYSCTL_SYSPLLConfig *config)
    {
        /* PLL configurations are retained in lower reset levels. Set default
         * behavior of disabling the PLL to keep a consistent behavior regardless
         * of reset level. */
        DL_SYSCTL_disableSYSPLL();
    
        /* Check that SYSPLL is disabled before configuration */
        while ((DL_SYSCTL_getClockStatus() & (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) !=
               (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) {
            ;
        }
    
        // set SYSPLL reference clock
        DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0,
            ((uint32_t) config->sysPLLRef), SYSCTL_SYSPLLCFG0_SYSPLLREF_MASK);
    
        // set predivider PDIV (divides reference clock)
        DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1, ((uint32_t) config->pDiv),
            SYSCTL_SYSPLLCFG1_PDIV_MASK);
    
        // save CPUSS CTL state and disable the cache
        uint32_t ctlTemp = DL_CORE_getInstructionConfig();
        DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED,
            DL_CORE_LITERAL_CACHE_ENABLED);
    
        // populate SYSPLLPARAM0/1 tuning registers from flash, based on input freq
        SYSCTL->SOCLOCK.SYSPLLPARAM0 =
            *(volatile uint32_t *) ((uint32_t) config->inputFreq);
        SYSCTL->SOCLOCK.SYSPLLPARAM1 =
            *(volatile uint32_t *) ((uint32_t) config->inputFreq + (uint32_t) 0x4);
    
        // restore CPUSS CTL state
        CPUSS->CTL = ctlTemp;
    
        // set feedback divider QDIV (multiplies to give output frequency)
        DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1,
            ((config->qDiv << SYSCTL_SYSPLLCFG1_QDIV_OFS) &
                SYSCTL_SYSPLLCFG1_QDIV_MASK),
            SYSCTL_SYSPLLCFG1_QDIV_MASK);
    
        // write clock output dividers, enable outputs, and MCLK source to SYSPLLCFG0
        DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0,
            (((config->rDivClk2x << SYSCTL_SYSPLLCFG0_RDIVCLK2X_OFS) &
                 SYSCTL_SYSPLLCFG0_RDIVCLK2X_MASK) |
                ((config->rDivClk1 << SYSCTL_SYSPLLCFG0_RDIVCLK1_OFS) &
                    SYSCTL_SYSPLLCFG0_RDIVCLK1_MASK) |
                ((config->rDivClk0 << SYSCTL_SYSPLLCFG0_RDIVCLK0_OFS) &
                    SYSCTL_SYSPLLCFG0_RDIVCLK0_MASK) |
                config->enableCLK2x | config->enableCLK1 | config->enableCLK0 |
                (uint32_t) config->sysPLLMCLK),
            (SYSCTL_SYSPLLCFG0_RDIVCLK2X_MASK | SYSCTL_SYSPLLCFG0_RDIVCLK1_MASK |
                SYSCTL_SYSPLLCFG0_RDIVCLK0_MASK |
                SYSCTL_SYSPLLCFG0_ENABLECLK2X_MASK |
                SYSCTL_SYSPLLCFG0_ENABLECLK1_MASK |
                SYSCTL_SYSPLLCFG0_ENABLECLK0_MASK |
                SYSCTL_SYSPLLCFG0_MCLK2XVCO_MASK));
    
        // enable SYSPLL
        SYSCTL->SOCLOCK.HSCLKEN |= SYSCTL_HSCLKEN_SYSPLLEN_ENABLE;
    
        // wait until SYSPLL startup is stabilized
        while ((DL_SYSCTL_getClockStatus() & SYSCTL_CLKSTATUS_SYSPLLGOOD_MASK) !=
               DL_SYSCTL_CLK_STATUS_SYSPLL_GOOD) {
            ;
        }
    }
    
    
    
    0x000012e8: f7 b5           	push	{r0, r1, r2, r4, r5, r6, r7, lr}
    0x000012ea: 33 4b           	ldr	r3, [pc, #204]	@ (0x13b8 <DL_SYSCTL_configSYSPLL+208>)     -> r3 = SYSCTL_BASE
    0x000012ec: 33 4a           	ldr	r2, [pc, #204]	@ (0x13bc <DL_SYSCTL_configSYSPLL+212>)
    0x000012ee: 34 4c           	ldr	r4, [pc, #208]	@ (0x13c0 <DL_SYSCTL_configSYSPLL+216>)
    0x000012f0: 99 58           	ldr	r1, [r3, r2]
    0x000012f2: 21 40           	ands	r1, r4
    0x000012f4: 99 50           	str	r1, [r3, r2]                                                - DL_SYSCTL_disableSYSPLL();
    0x000012f6: 80 21           	movs	r1, #128	@ 0x80
    0x000012f8: c9 01           	lsls	r1, r1, #7
    0x000012fa: 32 4c           	ldr	r4, [pc, #200]	@ (0x13c4 <DL_SYSCTL_configSYSPLL+220>)     
    0x000012fc: 1c 59           	ldr	r4, [r3, r4]
    0x000012fe: 0c 42           	tst	r4, r1
    0x00001300: fb d0           	beq.n	0x12fa <DL_SYSCTL_configSYSPLL+18>                      - while ((DL_SYSCTL_getClockStatus() & (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) != (DL_SYSCTL_CLK_STATUS_SYSPLL_OFF)) {...
    0x00001302: 01 26           	movs	r6, #1                                                  - DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0, - Start
    0x00001304: 03 27           	movs	r7, #3
    0x00001306: 30 49           	ldr	r1, [pc, #192]	@ (0x13c8 <DL_SYSCTL_configSYSPLL+224>)
    0x00001308: 44 7e           	ldrb	r4, [r0, #25]
    0x0000130a: 0d 68           	ldr	r5, [r1, #0]
    0x0000130c: 6c 40           	eors	r4, r5
    0x0000130e: 34 40           	ands	r4, r6
    0x00001310: 6c 40           	eors	r4, r5
    0x00001312: 0c 60           	str	r4, [r1, #0]                                                - DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG0, - End
    0x00001314: 2d 4c           	ldr	r4, [pc, #180]	@ (0x13cc <DL_SYSCTL_configSYSPLL+228>)     - DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1, - Start
    0x00001316: 45 1c           	adds	r5, r0, #1
    0x00001318: 26 68           	ldr	r6, [r4, #0]
    0x0000131a: ed 7f           	ldrb	r5, [r5, #31]
    0x0000131c: 75 40           	eors	r5, r6
    0x0000131e: 3d 40           	ands	r5, r7
    0x00001320: 75 40           	eors	r5, r6
    0x00001322: 98 26           	movs	r6, #152	@ 0x98
    0x00001324: 25 60           	str	r5, [r4, #0]                                                - DL_Common_updateReg(&SYSCTL->SOCLOCK.SYSPLLCFG1, - End
    0x00001326: 2a 4d           	ldr	r5, [pc, #168]	@ (0x13d0 <DL_SYSCTL_configSYSPLL+232>)     - uint32_t ctlTemp = DL_CORE_getInstructionConfig() - Start
                                                                                                    -> r5 = 0x40400000
    0x00001328: 76 01           	lsls	r6, r6, #5                                              -> r6 = 0x98 << 5 = 0x1300
    0x0000132a: af 59           	ldr	r7, [r5, r6]                                                -> r7 = CPUSS->CTL
    0x0000132c: b4 46           	mov	r12, r6
    0x0000132e: 07 26           	movs	r6, #7
    0x00001330: 37 40           	ands	r7, r6                                                  -> r7 = CPUSS->CTL & (CPUSS_CTL_ICACHE_MASK | CPUSS_CTL_PREFETCH_MASK | CPUSS_CTL_LITEN_MASK)
    0x00001332: 01 97           	str	r7, [sp, #4]                                                - uint32_t ctlTemp = DL_CORE_getInstructionConfig() - End
    0x00001334: 67 46           	mov	r7, r12
    0x00001336: 02 3e           	subs	r6, #2
    0x00001338: ee 51           	str	r6, [r5, r7]                                                - DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED, DL_CORE_LITERAL_CACHE_ENABLED); - End
    0x0000133a: 46 6a           	ldr	r6, [r0, #36]	@ 0x24                                      -> r6 = config->inputFreq
    0x0000133c: 37 68           	ldr	r7, [r6, #0]                                                -> r7 = *config->inputFreq
    0x0000133e: 3d 00           	movs	r5, r7                                                  -> r5 = r7
    0x00001340: 24 4f           	ldr	r7, [pc, #144]	@ (0x13d4 <DL_SYSCTL_configSYSPLL+236>)     -> r7 = &SYSCTL->SOCLOCK.SYSPLLPARAM0
    0x00001342: dd 51           	str	r5, [r3, r7]                                                <--- PC reported in hard fault, write r5 to r3+r7   SYSCTL->SOCLOCK.SYSPLLPARAM0 = 
    0x00001344: 65 46           	mov	r5, r12
    0x00001346: 77 68           	ldr	r7, [r6, #4]
    0x00001348: 23 4e           	ldr	r6, [pc, #140]	@ (0x13d8 <DL_SYSCTL_configSYSPLL+240>)
    0x0000134a: 9f 51           	str	r7, [r3, r6]                                                - SYSCTL->SOCLOCK.SYSPLLPARAM1 = - End
    0x0000134c: 01 9e           	ldr	r6, [sp, #4]                                                - CPUSS->CTL = ctlTemp; Start
    0x0000134e: 20 4f           	ldr	r7, [pc, #128]	@ (0x13d0 <DL_SYSCTL_configSYSPLL+232>)
    0x00001350: 7e 51           	str	r6, [r7, r5]                                                - CPUSS->CTL = ctlTemp; End
    0x00001352: fe 27           	movs	r7, #254	@ 0xfe
    0x00001354: c5 69           	ldr	r5, [r0, #28]
    0x00001356: 26 68           	ldr	r6, [r4, #0]
    0x00001358: ff 01           	lsls	r7, r7, #7
    0x0000135a: 2d 02           	lsls	r5, r5, #8
    0x0000135c: 3d 40           	ands	r5, r7
    0x0000135e: 75 40           	eors	r5, r6
    0x00001360: 3d 40           	ands	r5, r7
    0x00001362: f0 27           	movs	r7, #240	@ 0xf0
    0x00001364: 75 40           	eors	r5, r6
    0x00001366: 25 60           	str	r5, [r4, #0]
    0x00001368: 06 69           	ldr	r6, [r0, #16]
    0x0000136a: c5 68           	ldr	r5, [r0, #12]
    0x0000136c: 3f 03           	lsls	r7, r7, #12
    0x0000136e: 35 43           	orrs	r5, r6
    0x00001370: 46 69           	ldr	r6, [r0, #20]
    0x00001372: 0c 68           	ldr	r4, [r1, #0]
    0x00001374: 35 43           	orrs	r5, r6
    0x00001376: 06 7e           	ldrb	r6, [r0, #24]
    0x00001378: 35 43           	orrs	r5, r6
    0x0000137a: 06 68           	ldr	r6, [r0, #0]
    0x0000137c: 36 04           	lsls	r6, r6, #16
    0x0000137e: 3e 40           	ands	r6, r7
    0x00001380: 35 43           	orrs	r5, r6
    0x00001382: 46 68           	ldr	r6, [r0, #4]
    0x00001384: 80 68           	ldr	r0, [r0, #8]
    0x00001386: 36 07           	lsls	r6, r6, #28
    0x00001388: 36 0c           	lsrs	r6, r6, #16
    0x0000138a: 35 43           	orrs	r5, r6
    0x0000138c: f0 26           	movs	r6, #240	@ 0xf0
    0x0000138e: 00 02           	lsls	r0, r0, #8
    0x00001390: 36 01           	lsls	r6, r6, #4
    0x00001392: 30 40           	ands	r0, r6
    0x00001394: 28 43           	orrs	r0, r5
    0x00001396: 11 4d           	ldr	r5, [pc, #68]	@ (0x13dc <DL_SYSCTL_configSYSPLL+244>)
    0x00001398: 60 40           	eors	r0, r4
    0x0000139a: 28 40           	ands	r0, r5
    0x0000139c: 60 40           	eors	r0, r4
    0x0000139e: 08 60           	str	r0, [r1, #0]
    0x000013a0: 80 21           	movs	r1, #128	@ 0x80
    0x000013a2: 98 58           	ldr	r0, [r3, r2]
    0x000013a4: 49 00           	lsls	r1, r1, #1
    0x000013a6: 01 43           	orrs	r1, r0
    0x000013a8: 99 50           	str	r1, [r3, r2]
    0x000013aa: 80 22           	movs	r2, #128	@ 0x80
    0x000013ac: 92 00           	lsls	r2, r2, #2
    0x000013ae: 05 49           	ldr	r1, [pc, #20]	@ (0x13c4 <DL_SYSCTL_configSYSPLL+220>)
    0x000013b0: 59 58           	ldr	r1, [r3, r1]
    0x000013b2: 11 42           	tst	r1, r2
    0x000013b4: fb d0           	beq.n	0x13ae <DL_SYSCTL_configSYSPLL+198>                     while ((DL_SYSCTL_getClockStatus() & SYSCTL_CLKSTATUS_SYSPLLGOOD_MASK) != DL_SYSCTL_CLK_STATUS_SYSPLL_GOOD) { ...
    0x000013b6: f7 bd           	pop	{r0, r1, r2, r4, r5, r6, r7, pc}                            return;
    
    Addresses - loaded via ldr	rX, [pc, #xxx] instructions
    
    0x000013b8: 00 f0 0a 40     	and.w	r0, r0, #2315255808	@ 0x8a000000        -> SYSCTL, SYSCTL_BASE 
    0x000013bc: 08 11           	asrs	r0, r1, #4                              -> SYSCTL->SOCLOCK.HSCLKEN
    0x000013be: 00 00           	movs	r0, r0
    0x000013c0: ff fe ff ff     	mrc2	15, 7, pc, cr15, cr15, {7}
    0x000013c4: 04 12           	asrs	r4, r0, #8                              -> SYSCTL->SOCLOCK.CLKSTATUS
    0x000013c6: 00 00           	movs	r0, r0
    0x000013c8: 20 01           	lsls	r0, r4, #4
    0x000013ca: 0b 40           	ands	r3, r1
    0x000013cc: 24 01           	lsls	r4, r4, #4
    0x000013ce: 0b 40           	ands	r3, r1
    0x000013d0: 00 00           	movs	r0, r0
    0x000013d2: 40 40           	eors	r0, r0                                  -> Upper 16-bit of CPUSS->CTL
    0x000013d4: 28 11           	asrs	r0, r5, #4                              -> SYSCTL->SOCLOCK.SYSPLLPARAM0
    0x000013d6: 00 00           	movs	r0, r0
    0x000013d8: 2c 11           	asrs	r4, r5, #4                              -> SYSCTL->SOCLOCK.SYSPLLPARAM1
    0x000013da: 00 00           	movs	r0, r0
    0x000013dc: 72 ff 0f 00     	vhadd.u<illegal width 64>	d16, d2, d15
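
    The literal pool above is data that the disassembler happens to print as instructions (hence mnemonics such as mrc2). As a cross-check of the annotations, the sketch below reassembles three of the pool entries as little-endian words and computes the two effective addresses discussed in this thread. The helper names (le32, faulting_store_address, cpuss_ctl_address) are made up for this illustration; the byte values are copied from the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian word from four bytes as shown in the dump. */
static inline uint32_t le32(const uint8_t *b)
{
    assert(b != NULL);
    return (uint32_t) b[0] | ((uint32_t) b[1] << 8) |
           ((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);
}

/* Bytes copied from the literal pool in the dump. */
static const uint8_t pool_13b8[4] = {0x00, 0xf0, 0x0a, 0x40}; /* loaded into r3 */
static const uint8_t pool_13d0[4] = {0x00, 0x00, 0x40, 0x40}; /* loaded into r5 */
static const uint8_t pool_13d4[4] = {0x28, 0x11, 0x00, 0x00}; /* loaded into r7 */

/* Effective address of "str r5, [r3, r7]" at 0x1342: r3 + r7. */
static inline uint32_t faulting_store_address(void)
{
    return le32(pool_13b8) + le32(pool_13d4);
}

/* Effective address of "ldr r7, [r5, r6]" at 0x132a: r5 + (0x98 << 5). */
static inline uint32_t cpuss_ctl_address(void)
{
    return le32(pool_13d0) + (0x98u << 5);
}
```

    With these bytes, the store at 0x1342 targets 0x400B0128 and the CPUSS->CTL accesses target 0x40401300.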
    
     

    Thanks,
    Chris.

  • Hi Chris,

    As implemented in the SDK, the routine disables the cache and then populates the SYSPLLPARAM0/1 tuning registers from flash. So if the compiler messes this up during code optimization, it will result in such issues. [That is why I added the code alignment here for the test.]

    Although it shows the instruction as finished, can you take a look at CPUSS->CTL and see whether it has actually been updated after this assembly code has executed?

    B.R.

    Sal

  • Hi Sal,

    So if the compiler messes this up during code optimization

    I think it's obvious that the compiler didn't "mess up" anything. The (optimized) assembly code generated by TI ARM CLANG for this sequence of operations isn't all that much different:

    	ldr	r4, .LCPI0_2                Address of CPUSS->CTL into r4
    	ldr	r5, [r4]
    	movs	r6, #7
    	ands	r6, r5                  uint32_t ctlTemp = DL_CORE_getInstructionConfig(); -> r6
    	movs	r5, #5
    	str	r5, [r4]                    DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED, DL_CORE_LITERAL_CACHE_ENABLED);
    	ldr	r5, [r0, #36]
    	ldr	r7, [r5]
    	str	r7, [r2, #32]               SYSCTL->SOCLOCK.SYSPLLPARAM0 = *(volatile uint32_t *) ((uint32_t) config->inputFreq);
    	ldr	r5, [r5, #4]
    	str	r5, [r2, #36]               SYSCTL->SOCLOCK.SYSPLLPARAM1 = *(volatile uint32_t *) ((uint32_t) config->inputFreq + (uint32_t) 0x4);
    	str	r6, [r4]                    CPUSS->CTL = ctlTemp;

    Anyway:

    can you take a look at CPUSS->CTL and see whether it has actually been updated after this assembly code has executed?

    That's a little hard to do given the difficulties I have debugging this issue (i.e. it does not occur when a debugger is connected), and I'm not sure how else I could determine the actual value of this register at such an early stage in the init process. Any suggestions are appreciated.

    However:
    If I understood you correctly, you are saying that disabling the instruction cache might get delayed because it is still enabled at this point in time?
    If so, what exactly would be the consequence that leads to the hard fault eventually?

    Furthermore, it looks to me as if it would be the responsibility of the code (i.e. the TI SDK code) to make sure that something like that is avoided and I'm not sure if the __NOP() instructions suggested above are a reliable way to do that.

    In the FreeRTOS code I pointed out above, I see dsb and isb instructions before and after the CPU is put to sleep, but a similar issue still occurs, albeit only with GNU ARM toolchain v9.2.1.

    In other words:
    What is the recommended approach proposed by TI?

  • Hi Chris,

    Oh, I forgot that this does not trigger in debug mode. That makes it hard to observe.

    By the way, how did you find out that the instruction below triggers the hard fault without being in debug mode?

    "PC reported in hard fault, write r5 to r3+r7   SYSCTL->SOCLOCK.SYSPLLPARAM0 = "

    If I understood you correctly, you are saying that disabling the instruction cache might get delayed because it is still enabled at this point in time?
    If so, what exactly would be the consequence that leads to the hard fault eventually?

    That's my suspicion. One relevant piece of information is the MCLK frequency. What is it, and what are the flash wait states?

    Anyway, let me forward your findings to our tool team to see whether they have any suggestions.

    B.R.

    Sal

  • By the way, how did you find out that the instruction below triggers the hard fault without being in debug mode?

    I realized at some point that I can connect a debugger after the hard fault occurred and collect the corresponding exception stack frame. That gave me the lr and pc values. That then allowed for locating the corresponding code using the assembly output of the compiler as well as the disassembly output of the debugger.
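
    The approach described above relies on the eight-word frame that Cortex-M cores (such as the Cortex-M0+ in this device) push onto the active stack on exception entry: r0-r3, r12, lr, pc, xPSR. A minimal sketch of decoding such a frame from raw stack words follows. exception_frame_t and read_exception_frame() are hypothetical names, and finding the right stack pointer (MSP or PSP, and whether the handler itself pushed anything first) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The eight words stacked by the core on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;    /* link register at the time of the exception            */
    uint32_t pc;    /* return address, usually at or near the faulting code  */
    uint32_t xpsr;  /* program status register                               */
} exception_frame_t;

/* Copy a frame out of raw stack memory, for example words read back with a
 * debugger attached after the fault, starting at the stacked frame. */
static inline exception_frame_t read_exception_frame(const uint32_t *sp)
{
    assert(sp != NULL);
    exception_frame_t f;
    f.r0   = sp[0];
    f.r1   = sp[1];
    f.r2   = sp[2];
    f.r3   = sp[3];
    f.r12  = sp[4];
    f.lr   = sp[5];
    f.pc   = sp[6];
    f.xpsr = sp[7];
    return f;
}
```

    The stacked pc can then be looked up in the compiler's assembly output or in the debugger's disassembly, as was done for 0x00001342 in this thread.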

    One relevant piece of information is the MCLK frequency. What is it, and what are the flash wait states?

    At this point in time, the MCU core is still clocked from the internal SYSOSC. Switching to HFCLK and the SYSPLL happens later.

    The flash wait states have been reconfigured to DL_SYSCTL_FLASH_WAIT_STATE_2 already, though, because we are running with MCLK and CPUCLK at 80 MHz, eventually. The complete system init routine follows:

    /*!
     * @brief   Initializes low-level MCU parameters and peripherals.
     *
     * This function configures system PLL and derived clocks, the flash wait states and the BOR threshold.
     */
    static void sysctlInit(void)
    {
        // Set the brownout threshold to the highest available level.
        // This is done to avoid that the application starts running too early.
        // TODO: Why does this happen here while DL_SYSCTL_activateBORThreshold() is invoked at the end? 
        DL_SYSCTL_setBORThreshold(DL_SYSCTL_BOR_THRESHOLD_LEVEL_3);
    
        // 2 flash wait states are required when running at MCLK and CPUCLK of 80 MHz.
        // 1 flash wait state could be used up to 48 MHz.
        // 0 flash wait states could be used up to 24 MHz.
        DL_SYSCTL_setFlashWaitState(DL_SYSCTL_FLASH_WAIT_STATE_2);
    
        // Configure the HFCLK source as HFXT oscillator (i.e. the 16 MHz oscillator on the THB).
        // The second argument specifies a "startup time" in steps of ~64us.
        // The third argument enables a "monitor" that makes sure that the HFXT oscillator is actually up and running.
        // NOTE: This routine would hang in an endless loop if the latter is not true!
        DL_SYSCTL_setHFCLKSourceHFXTParams(DL_SYSCTL_HFXT_RANGE_8_16_MHZ, 10, true);
    
        // Configure the system PLL. As a result, we're running with a CPUCLK of 80 MHz w/ the current configuration.
        DL_SYSCTL_configSYSPLL((DL_SYSCTL_SYSPLLConfig *) &lSYSPLLConfig);
    
        // Configure the ULPCLK divider. The ULPCLK is derived from the same source as CPUCLK and MCLK,
        // but provides an additional divider and drives the PD0 bus clock.
        // The maximum ULPCLK frequency is 40 MHz!
        DL_SYSCTL_setULPCLKDivider(DL_SYSCTL_ULPCLK_DIV_2);
    
        // Switch the MCLK source from the built-in system oscillator (SYSOSC) to the "high speed" clock (HSCLK).
        // NOTE: This macro expands to DL_SYSCTL_switchMCLKfromSYSOSCtoHSCLK
        DL_SYSCTL_setMCLKSource(SYSOSC, HSCLK, DL_SYSCTL_HSCLK_SOURCE_SYSPLL);
    
        // Enable the "external" clock, i.e. the output CLK_OUT, which may be driven from all sorts of internal clocks.
        // In this case we're gating HFCLK to it, i.e. the 16 MHz oscillator on the THB which is also the input to
        // the system PLL. This is for debug/test/verification purposes only.
        DL_SYSCTL_enableExternalClock(DL_SYSCTL_CLK_OUT_SOURCE_ULPCLK, DL_SYSCTL_CLK_OUT_DIVIDE_DISABLE);
    
        // Configure the "Frequency Clock Counter" (FCC):
        // The FCC may be used to measure other clocks in the MCU. In this case we're gating SYSPLLCLK2X as the clock to
        // measure and are using the LFCLK (=32768 Hz clock) to trigger the measurement.
        // The measurement starts and ends with a rising edge of LFCLK (DL_SYSCTL_FCC_TRIG_TYPE_RISE_RISE) and lasts one
        // such period (DL_SYSCTL_FCC_TRIG_CNT_01).
        //
        // An actual measurement is started w/ DL_SYSCTL_startFCC().
        // DL_SYSCTL_isFCCDone() returns true when the measurement is done.
        // The result can be obtained using DL_SYSCTL_readFCC().
        //
        // This is for debug/test/verification purposes only.
        DL_SYSCTL_configFCC(DL_SYSCTL_FCC_TRIG_TYPE_RISE_RISE, DL_SYSCTL_FCC_TRIG_SOURCE_LFCLK, SYSCTL_GENCLKCFG_FCCSELCLK_SYSPLLCLK2X);
        DL_SYSCTL_setFCCPeriods(DL_SYSCTL_FCC_TRIG_CNT_01);
    
        DL_SYSCTL_activateBORThreshold();
    
        return;
    }
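
    The wait-state thresholds stated in the comments of sysctlInit() above can be written down as a small lookup. This is only a sketch of those comments, not an SDK API: flash_wait_states_for() is a hypothetical helper, and the limits are the ones quoted in this post rather than values checked against the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of flash wait states for a given MCLK in Hz,
 * using the thresholds from the comments in sysctlInit().
 * Returns -1 for frequencies above the 80 MHz mentioned there. */
static inline int flash_wait_states_for(uint32_t mclk_hz)
{
    assert(mclk_hz > 0u);              /* a stopped clock is not a valid input */

    if (mclk_hz <= 24000000u) {
        return 0;                      /* up to 24 MHz */
    }
    if (mclk_hz <= 48000000u) {
        return 1;                      /* up to 48 MHz */
    }
    if (mclk_hz <= 80000000u) {
        return 2;                      /* up to 80 MHz */
    }
    return -1;                         /* outside the range covered by the comments */
}
```

    By these thresholds, 32 MHz would need one wait state, so setting two wait states before switching to 80 MHz errs on the safe side.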

    Prior to that, only the following happens:

    void hal_systemInit(void)
    {
        DL_GPIO_reset(GPIOA);
        DL_GPIO_reset(GPIOB);
    
        DL_UART_Main_reset(UART_1_INST);
    
        DL_GPIO_enablePower(GPIOA);
        DL_GPIO_enablePower(GPIOB);
    
        DL_UART_Main_enablePower(UART_1_INST);
    
        // NOTE & TODO:
        // This is before reconfiguring the clocks. As such, we're presumably running at 32 (instead of 80) MHz here.
        // The #defines POWER_STARTUP_DELAY and VREF_READY_DELAY have been produced by TI's IDE.
        // I'm not sure which CPU clock speed it uses to calculate these numbers.
        // To be clarified and verified.
        delay_cycles(POWER_STARTUP_DELAY);
    
        sysctlInit();
        [...]

    Anyway, let me forward your findings to our tool team to see whether they have any suggestions.

    Thx!

  • Hi Chris,

    Thanks for the information. Let me check with team.

    B.R

    Sal

  • Hi Chris,

    Would it be possible for you to perform a quick test?

    It would involve disabling the cache on the device at the very start of sysctlInit(), then poll until the cache has been successfully disabled. Only then proceed with the sysctl configuration. At the very end you can then re-enable the cache.

    I'm wondering whether this could resolve your issue and possibly point to some APIs that are missing the cache-disable steps mentioned by Sal, or whose cache handling is not robust enough against optimization.

  • Hi Henry,

    Would it be possible for you to perform a quick test?

    Sure. However:

    poll until the cache has been successfully disabled.

    How do I do that?

  • Thanks Henry for the comments.

    Hi Chris,

    By the way, please see the disclosure of the cache issue in the errata: https://www.ti.com/lit/er/slaz742b/slaz742b.pdf

    I think this is the root cause of the hard fault.

    B.R.

    Sal

  • @Sal:

    I think this is the root cause of the hard fault.

    Sounds like it, but the workaround is present in the SDK code. It "just" doesn't work as expected w/ the Arm GNU toolchain.

    @Henry:

    I'm still waiting for feedback wrt "poll until the cache has been successfully disabled.".
    I don't know how I can verify that.

    Regards,
    Chris.

  • Hi Chris,

    I am also waiting for Henry's update.

    Maybe you can try the following:

        // save CPUSS CTL state and disable the cache
        uint32_t ctlTemp = DL_CORE_getInstructionConfig();
        DL_CORE_configInstruction(DL_CORE_PREFETCH_ENABLED, DL_CORE_CACHE_DISABLED,
            DL_CORE_LITERAL_CACHE_ENABLED);
            
        while((CPUSS->CTL & CPUSS_CTL_ICACHE_MASK) !=
               DL_CORE_CACHE_DISABLED) {
               ;
        }
        
        ...

    I think this should work.

    By the way, could you forward a minimal example project built with the Arm GNU compiler that reproduces your issue? I think that would make it more efficient to dig into this case.

    B.R.

    Sal

  • Hi All,

    Apologies for the delayed response from my side.

    As for what I was talking about: you can just take what Sal had recommended at the start and place it at the start of your function.

    Then, at the end of the function, re-apply the saved state to re-enable the cache in the CPUSS.

    Thanks

  • It would involve disabling the cache on the device at the very start of sysctlInit(), then poll until the cache has been successfully disabled. Only then proceed with the sysctl configuration. At the very end you can then re-enable the cache.

    Yeah ... that fixes the issue. Which doesn't come as much of a surprise, since inserting the __NOP() instructions (as suggested by Sal) fixed the issue as well. The issue also goes away when I disable/re-enable the cache only before/after invoking DL_SYSCTL_configSYSPLL().

    I think that confirms the relation to errata issue CPU_ERR_01 pointed out by Sal further above (even though the workaround proposed there apparently doesn't work in all cases).

    I can live with that "more encompassing" workaround assuming there are no other issues related to this errata hidden in the TI SDK.
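
    A rough sketch of this "more encompassing" workaround follows: save the CPUSS CTL configuration, switch the instruction cache off, poll until the cache bit reads back as off, do the clock configuration, then restore. The function names are hypothetical, and the register is passed in as a pointer so that the sketch also runs on a host against a plain variable. The position of the cache bit (bit 1) is inferred from this thread (the SDK masks with 7 and writes 5 for "prefetch on, cache off, literal cache on"); which of bits 0 and 2 is prefetch and which is literal cache is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout of the CTL register, see the note above. */
#define CTL_PREFETCH  (1u << 0)
#define CTL_ICACHE    (1u << 1)
#define CTL_LITEN     (1u << 2)
#define CTL_ALL       (CTL_PREFETCH | CTL_ICACHE | CTL_LITEN)

/* Disable the instruction cache, wait until the register reads back with the
 * cache bit cleared, and return the previous configuration for later restore. */
static inline uint32_t icache_disable_and_wait(volatile uint32_t *ctl)
{
    assert(ctl != NULL);

    uint32_t saved = *ctl & CTL_ALL;      /* save the current configuration  */
    *ctl = CTL_PREFETCH | CTL_LITEN;      /* value 5: cache off, rest on      */

    while ((*ctl & CTL_ICACHE) != 0u) {   /* poll until the cache bit is off  */
        ;
    }
    return saved;
}

/* Restore the configuration saved by icache_disable_and_wait(). */
static inline void icache_restore(volatile uint32_t *ctl, uint32_t saved)
{
    assert(ctl != NULL);
    *ctl = saved & CTL_ALL;
}
```

    On the target, the two calls would bracket the body of sysctlInit(), with &CPUSS->CTL as the argument. Whether the read-back actually reflects the state of the cache hardware is exactly the open question raised in this thread, so the polling loop should be seen as a precaution rather than a guarantee.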

    Just to satisfy my curiosity wrt "poll until the cache has been successfully disabled":

    The TRM doesn't indicate that reading CPUSS->CTL would return the actual state of the corresponding caches. As such, I would not expect to read back anything different from what was written to it previously. Apparently that's not true. The TRM also doesn't say what happens in detail when a cache or prefetch gets disabled.

    Can you shed some light on this or point me to other documentation that does?

  • Any feedback, Sal or Henry?

  • Hi Christian,

    Apologies for the delayed response on this.

    The snippet being used ensures that execution is blocked until cache is disabled.

    This will help prevent your application from running into the default handler due to the errata.

  • To add some further context: we are polling to ensure that any further caching beyond the polling point is disabled. Sorry if my previous response caused any confusion.