CC3220MODA: Optimization issues with O2/O3

Part Number: CC3220MODA

Hi TI,

We're experiencing an issue with the UART setup when building with optimization level O2 (or O3). With O0 there is no issue. We're using the GCC Arm toolchain release "9 2019-q4-major".

With optimization enabled, part of the UART setup code seems to disappear. The code section:

static void SetBaudRate(uint32_t Port, uint32_t BaudRate)
{
    uint32_t ulDivUART;
    uint32_t ulUARTClk;

    ulUARTClk = system_Clock_PeripheralClockGet(Port); //system_Clock_GetSysFreq();

    // Is the required baud rate greater than the maximum rate supported
    // without the use of high speed mode?
    if((BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false
    {
        // Enable high speed mode.
        HWREG(Port + UART_O_CTL) |= UART_CTL_HSE;

        // Half the supplied baud rate to compensate for enabling high speed
        // mode.  This allows the following code to be common to both cases.
        BaudRate /= 2;
    }
    else
    {
        // Disable high speed mode.
        HWREG(Port + UART_O_CTL) &= ~(UART_CTL_HSE);
    }

    // Compute the fractional baud rate divider.
    ulDivUART = (((ulUARTClk * 8) / BaudRate) + 1) / 2;

    // Set the baud rate.
    HWREG(Port + UART_O_IBRD) = ulDivUART / 64;
    HWREG(Port + UART_O_FBRD) = ulDivUART % 64;

}

The function "system_Clock_PeripheralClockGet(port)" returns 80000000.
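
As a cross-check on the values that should end up in the registers, the divider arithmetic from SetBaudRate() can be run on a host machine. The sketch below is only a model: the struct and the function name "compute_baud_divider" are hypothetical, and no hardware register is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: holds what SetBaudRate() would decide and write. */
typedef struct {
    int      high_speed;  /* 1 if the HSE branch would be taken */
    uint32_t ibrd;        /* value that would go to UART_O_IBRD */
    uint32_t fbrd;        /* value that would go to UART_O_FBRD */
} baud_divider_t;

/* Mirrors the arithmetic of SetBaudRate() without any register access. */
static baud_divider_t compute_baud_divider(uint32_t uart_clk, uint32_t baud)
{
    baud_divider_t d;

    assert(baud >= 2u);  /* keeps the divisor non-zero even after halving */

    d.high_speed = (baud * 16u) > uart_clk;
    if (d.high_speed) {
        baud /= 2u;
    }

    /* Same rounding as the original: 64 * clk / (16 * baud), to nearest. */
    uint32_t div = (((uart_clk * 8u) / baud) + 1u) / 2u;
    d.ibrd = div / 64u;
    d.fbrd = div % 64u;
    return d;
}
```

For a clock of 80000000 and a baud rate of 115200, this gives high_speed = 0, IBRD = 43 (0x2B) and FBRD = 26 (0x1A).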

01012250:   ldr     r0, [pc, #100]  ; (0x10122b8 <UART_SetConfig+204>)
01012252:   movs    r1, #136        ; 0x88
01012254:   bl      0x100bb3c <assert_failed>
138           SetBaudRate(Port, BaudRate);
01012258:   mov     r0, r5
0101225a:   bl      0x100bbd0 <system_Clock_PeripheralClockGet>
//Register values at this point
// sp = 0x2003ff98
// pc = 0x101225e
// 
// r0 = 0x4c4b400
// r1 = 0x1c200
// r2 = 0x4000c000
// r3 = 0x4000d000
// r5 = 0x4000d000
// r6 = 0x60
// r7 = 0x6
// r8 = 0x0
// r10 = 0x1c200
// r12 = 0x20000000
105           if((BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false
0101225e:   ldr     r3, [r5, #48]   ; 0x30
01012260:   cmp.w   r0, r10, lsl #4
108               HWREG(Port + UART_O_CTL) |= UART_CTL_HSE;
01012264:   orr.w   r7, r7, r4
01012268:   itet    cc
0101226a:   orrcc.w r3, r3, #32
0101226e:   biccs.w r3, r3, #32
01012272:   movcc.w r10, r10, lsr #1
121           ulDivUART = (((ulUARTClk * 8) / BaudRate) + 1) / 2;
01012276:   lsls    r1, r0, #3
01012278:   orr.w   r8, r7, r8
0101227c:   udiv    r1, r1, r10
01012280:   adds    r1, #1
124           HWREG(Port + UART_O_IBRD) = ulDivUART / 64;
01012282:   str     r3, [r5, #48]   ; 0x30
01012284:   add.w   r12, r5, #48    ; 0x30
01012288:   lsrs    r3, r1, #7
0101228a:   orr.w   r6, r8, r6
0101228e:   ubfx    r1, r1, #1, #6
01012292:   str     r3, [r5, #36]   ; 0x24
125           HWREG(Port + UART_O_FBRD) = ulDivUART % 64;
01012294:   str     r1, [r5, #40]   ; 0x28
140           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
01012296:   str     r6, [r5, #44]   ; 0x2c
141           MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl);
01012298:   ldr.w   r3, [r12]
0101229c:   ldr     r2, [sp, #40]   ; 0x28
0101229e:   bic.w   r3, r3, #49152  ; 0xc000
010122a2:   orr.w   r4, r3, r2
010122a6:   str.w   r4, [r12]

Above is the assembly output with the O2 flag set (I've added the register values to the listing for reference). The compiler seems to have removed the line "HWREG(Port + UART_O_CTL) &= ~(UART_CTL_HSE);" and we can't figure out why. Any assistance in explaining this would be appreciated.
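
One possible reading of the listing above, offered as a sketch rather than a confirmed diagnosis: the "itet cc" block at 01012268 appears to hold both branches of the if/else in predicated form. "orrcc.w r3, r3, #32" would be the "|=" path, "biccs.w r3, r3, #32" the "&= ~" path, and the single "str r3, [r5, #48]" at 01012282 the write-back. The host-side model below compares that shape with the source-level branches. The function names and the HSE_BIT macro are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HSE_BIT 0x20u  /* matches the #32 immediate in the listing */

/* Source-level shape: two separate read-modify-write branches. */
static uint32_t ctl_branchy(uint32_t ctl, uint32_t clk, uint32_t *baud)
{
    assert(baud != NULL);
    if ((*baud * 16u) > clk) {
        ctl |= HSE_BIT;
        *baud /= 2u;
    } else {
        ctl &= ~HSE_BIT;
    }
    return ctl;
}

/* Shape of the O2 listing: one compare, then predicated ORR/BIC/LSR. */
static uint32_t ctl_predicated(uint32_t ctl, uint32_t clk, uint32_t *baud)
{
    assert(baud != NULL);
    int lower = clk < (*baud * 16u);                     /* cmp.w r0, r10, lsl #4 -> "cc" */
    ctl   = lower ? (ctl | HSE_BIT) : (ctl & ~HSE_BIT);  /* orrcc.w / biccs.w */
    *baud = lower ? (*baud >> 1) : *baud;                /* movcc.w r10, r10, lsr #1 */
    return ctl;
}
```

Both functions return the same control value and leave the same baud rate behind for any input, which is consistent with the statement having been merged into the conditional block rather than dropped.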

With best regards

Sebastian

  • I forgot to add that the function is called with the arguments Port=0x4000d000 and BaudRate=115200. These values appear in the comments in the assembly, but it is easier to have them stated up front.

  • Could you please try using the TI compiler included with CCS? Does the same issue occur? 

    Furthermore, I noticed the comment "always false". If you take that HWREG statement out of the if-else, does this fix the issue? Can you try moving it out so that it runs regardless of the if-else result?

  • Hi, thanks for the reply.

    We prefer to use the gcc-arm-toolchain since then we can keep a common build environment for all projects. I will perform a test with CCS and check its output.

    Removing this will probably solve the issue at hand, since the code that currently gets run would no longer be present. However, to ensure that the issue doesn't occur anywhere else in the project, we would like to understand why it is optimized out.

    We did resolve the issue by moving the variable outside of the function and not declaring it static (with static the issue remained):

    uint32_t ulDivUART;
    
    static void SetBaudRate(uint32_t Port, uint32_t BaudRate)
    {
        //uint32_t ulDivUART;
        uint32_t ulUARTClk;
    
        ulUARTClk = system_Clock_PeripheralClockGet(Port); //system_Clock_GetSysFreq();
    ...

  • We changed the compiler from "9 2019-q4-major" to "10 2021.10", which I at first thought resolved the issue,
    but after further inspection I realized that some leftover debug code had fooled me.

    The issue is now narrowed down to being solved by a single NOP instruction.
    This needs to be placed between our statement "HWREG(_Port + UART_O_IBRD) = ulDivUART / 64;" and "MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl);"

    If there is no NOP and the code runs at full speed, the register "HWREG(_Port + UART_O_FBRD)" ends up with the wrong value.
    However, if we single-step through the assembly past the line "HWREG(_Port + UART_O_FBRD) = ulDivUART % 64;"
    and then run at full speed, the UARTs work and the values are correct.

    Adding the NOP prevents faulty values in "HWREG(_Port + UART_O_FBRD)" regardless of optimization level (O2/O3).

    The code with the NOP commented out:

    static void SetBaudRate(uint32_t _Port, uint32_t _BaudRate)
    {
        uint32_t ulDivUART;
        uint32_t ulUARTClk;
    
        ulUARTClk = system_Clock_PeripheralClockGet(_Port); //system_Clock_GetSysFreq();
    
        // Is the required baud rate greater than the maximum rate supported
        // without the use of high speed mode?
        if((_BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false
        {
            // Enable high speed mode.
            HWREG(_Port + UART_O_CTL) |= UART_CTL_HSE;
    
            // Half the supplied baud rate to compensate for enabling high speed
            // mode.  This allows the following code to be common to both cases.
            _BaudRate /= 2;
        }
        else
        {
            // Disable high speed mode.
            HWREG(_Port + UART_O_CTL) &= ~(UART_CTL_HSE);
        }
    
        // Compute the fractional baud rate divider.
        ulDivUART = (((ulUARTClk * 8) / _BaudRate) + 1) / 2;
        // Set the baud rate.
        HWREG(_Port + UART_O_IBRD) = ulDivUART / 64;
        //__NOP();//Enabling this makes it work
        HWREG(_Port + UART_O_FBRD) = ulDivUART % 64;
    
    }
    
    void UART_SetConfig(uint32_t Port, uint32_t BaudRate, uint32_t WordLength, uint32_t StopBits, uint32_t Parity, uint32_t Mode, uint32_t HwFlowCtl)//, uint32_t OverSampling, uint32_t OneBitSampling)
    {
        assert_param(IS_UART_BAUDRATE(BaudRate));
        assert_param(IS_UART_WORD_LENGTH(WordLength));
        assert_param(IS_UART_STOPBITS(StopBits));
        assert_param(IS_UART_PARITY(Parity));
        assert_param(IS_UART_MODE(Mode));
        assert_param(IS_UART_PORT(Port));
    
        SetBaudRate(Port, BaudRate);
        HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
        MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl);
    }
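
    The thread does not show the definition of MODIFY_REG. Judging only by the "ldr / bic #49152 / orr / str" sequence in the listings, it behaves like a read, clear, set, write-back on UART_O_CTL. A minimal sketch under that assumption follows. The macro MODIFY_REG_SKETCH, the FLOW_MASK constant and the function name are hypothetical, and a plain variable stands in for the register.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical definition; the real macro is not shown in the thread. */
    #define MODIFY_REG_SKETCH(reg, clear_mask, set_mask) \
        ((reg) = (((reg) & ~(uint32_t)(clear_mask)) | (uint32_t)(set_mask)))

    /* UART_CTL_RTSEN | UART_CTL_CTSEN, taken from "bic.w r3, r3, #49152". */
    #define FLOW_MASK 0xC000u

    /* Clears both flow-control bits, then ORs in the requested setting. */
    static uint32_t apply_flow_control(volatile uint32_t *ctl, uint32_t hw_flow_ctl)
    {
        assert(ctl != NULL);
        MODIFY_REG_SKETCH(*ctl, FLOW_MASK, hw_flow_ctl);
        return *ctl;
    }
    ```

    Because this step reads UART_O_CTL back right after the IBRD, FBRD and LCRH stores, it is the first bus read that follows the baud rate writes in the optimized code.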

    The assembly at O3 with the NOP:

    ......... ...
    106           if((_BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false
    0101542a:   cmp.w   r0, r9, lsl #4
    0101542e:   bcs.n   0x101547a <UART_SetConfig+194>
    109               HWREG(_Port + UART_O_CTL) |= UART_CTL_HSE;
    01015430:   ldr.w   r3, [r8, #48]   ; 0x30
    01015434:   orr.w   r3, r3, #32
    113               _BaudRate /= 2;
    01015438:   mov.w   r9, r9, lsr #1
    109               HWREG(_Port + UART_O_CTL) |= UART_CTL_HSE;
    0101543c:   str.w   r3, [r8, #48]   ; 0x30
    01015440:   add.w   r12, r8, #48    ; 0x30
    122           ulDivUART = (((ulUARTClk * 8) / _BaudRate) + 1) / 2;
    01015444:   lsls    r0, r0, #3
    01015446:   udiv    r1, r0, r9
    0101544a:   adds    r1, #1
    125           HWREG(_Port + UART_O_IBRD) = ulDivUART / 64;
    0101544c:   lsrs    r3, r1, #7
    0101544e:   str.w   r3, [r8, #36]   ; 0x24
    1183        __ASM volatile ("nop");
    01015452:   nop     
    142           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
    01015454:   orrs    r7, r4
    01015456:   orrs    r6, r7
    01015458:   orrs    r5, r6
    128           HWREG(_Port + UART_O_FBRD) = ulDivUART % 64;
    0101545a:   ubfx    r1, r1, #1, #6
    0101545e:   str.w   r1, [r8, #40]   ; 0x28
    01015462:   str.w   r5, [r8, #44]   ; 0x2c
    144           MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl);
    01015466:   ldr.w   r3, [r12]
    0101546a:   ldr     r2, [sp, #40]   ; 0x28
    0101546c:   bic.w   r3, r3, #49152  ; 0xc000
    01015470:   orrs    r3, r2
    01015472:   str.w   r3, [r12]
    145       }
    01015476:   ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
    118               HWREG(_Port + UART_O_CTL) &= ~(UART_CTL_HSE);
    0101547a:   ldr.w   r3, [r8, #48]   ; 0x30
    0101547e:   bic.w   r3, r3, #32
    01015482:   str.w   r3, [r8, #48]   ; 0x30
    01015486:   add.w   r12, r8, #48    ; 0x30
    0101548a:   b.n     0x1015444 <UART_SetConfig+140>
    134           assert_param(IS_UART_BAUDRATE(BaudRate));
    0101548c:   ldr     r0, [pc, #8]    ; (0x1015498 <UART_SetConfig+224>)
    0101548e:   movs    r1, #134        ; 0x86
    01015490:   bl      0x100c608 <assert_failed>
    01015494:   b.n     0x10153cc <UART_SetConfig+20>
    01015496:   nop     
    01015498:   cmp     r3, #244        ; 0xf4
    0101549a:   lsls    r2, r0, #4
    0101549c:   stmia   r0!, {}
    0101549e:   ands    r0, r0
    ......... ...
    

    The assembly at O3 without the NOP:

    ......... ...
    106           if((_BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false
    0101542c:   cmp.w   r0, r6, lsl #4
    01015430:   bcs.n   0x1015472 <UART_SetConfig+186>
    109               HWREG(_Port + UART_O_CTL) |= UART_CTL_HSE;
    01015432:   ldr     r3, [r4, #48]   ; 0x30
    01015434:   orr.w   r3, r3, #32
    113               _BaudRate /= 2;
    01015438:   lsrs    r6, r6, #1
    109               HWREG(_Port + UART_O_CTL) |= UART_CTL_HSE;
    0101543a:   str     r3, [r4, #48]   ; 0x30
    0101543c:   add.w   r12, r4, #48    ; 0x30
    142           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
    01015440:   orr.w   r2, r5, r9
    01015444:   orr.w   r2, r2, r8
    122           ulDivUART = (((ulUARTClk * 8) / _BaudRate) + 1) / 2;
    01015448:   lsls    r1, r0, #3
    0101544a:   udiv    r1, r1, r6
    0101544e:   adds    r1, #1
    125           HWREG(_Port + UART_O_IBRD) = ulDivUART / 64;
    01015450:   lsrs    r3, r1, #7
    142           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
    01015452:   orrs    r2, r7
    128           HWREG(_Port + UART_O_FBRD) = ulDivUART % 64;
    01015454:   ubfx    r1, r1, #1, #6
    125           HWREG(_Port + UART_O_IBRD) = ulDivUART / 64;
    01015458:   str     r3, [r4, #36]   ; 0x24
    128           HWREG(_Port + UART_O_FBRD) = ulDivUART % 64;
    0101545a:   str     r1, [r4, #40]   ; 0x28
    0101545c:   str     r2, [r4, #44]   ; 0x2c
    144           MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl);
    0101545e:   ldr.w   r3, [r12]
    01015462:   ldr     r2, [sp, #40]   ; 0x28
    01015464:   bic.w   r3, r3, #49152  ; 0xc000
    01015468:   orrs    r3, r2
    0101546a:   str.w   r3, [r12]
    145       }
    0101546e:   ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
    118               HWREG(_Port + UART_O_CTL) &= ~(UART_CTL_HSE);
    01015472:   ldr     r3, [r4, #48]   ; 0x30
    01015474:   bic.w   r3, r3, #32
    01015478:   str     r3, [r4, #48]   ; 0x30
    0101547a:   add.w   r12, r4, #48    ; 0x30
    0101547e:   b.n     0x1015440 <UART_SetConfig+136>
    134           assert_param(IS_UART_BAUDRATE(BaudRate));
    01015480:   ldr     r0, [pc, #8]    ; (0x101548c <UART_SetConfig+212>)
    01015482:   movs    r1, #134        ; 0x86
    01015484:   bl      0x100c608 <assert_failed>
    01015488:   b.n     0x10153ce <UART_SetConfig+22>
    0101548a:   nop     
    0101548c:   cmp     r3, #236        ; 0xec
    0101548e:   lsls    r2, r0, #4
    01015490:   stmia   r0!, {}
    01015492:   ands    r0, r0
    ......... ...
    

  • Thanks for discovering this, Sebastian. I can relay this information internally; however, it would be good to see whether this issue occurs with other compilers or is specific to the ones you have mentioned here.

  • Hi, thanks for the feedback. From our perspective this doesn't seem to be a compiler issue but rather a timing issue between certain commands:

    If we compile it in O2 with 10 2021.10 we get the following assembly output (I've added comments with "BREAKPOINT #X"):

    ......... ...
    BREAKPOINT #1
    102           if((BaudRate * 16) > ulUARTClk) // CPU support 3Mbps -> always false		
    0101238c:   cmp.w   r0, r6, lsl #4
    01012390:   bcs.n   0x10123d2 <UART_SetConfig+186>
    105               HWREG(Port + UART_O_CTL) |= UART_CTL_HSE;
    01012392:   ldr     r3, [r4, #48]   ; 0x30
    01012394:   orr.w   r3, r3, #32
    109               BaudRate /= 2;
    01012398:   lsrs    r6, r6, #1
    105               HWREG(Port + UART_O_CTL) |= UART_CTL_HSE;
    0101239a:   str     r3, [r4, #48]   ; 0x30
    0101239c:   add.w   r12, r4, #48    ; 0x30
    141           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
    010123a0:   orr.w   r2, r5, r9
    010123a4:   orr.w   r2, r2, r8
    118           ulDiv = (((ulUARTClk * 8) / BaudRate) + 1) / 2;
    010123a8:   lsls    r1, r0, #3
    010123aa:   udiv    r1, r1, r6
    010123ae:   adds    r1, #1
    121           HWREG(Port + UART_O_IBRD) = ulDiv / 64;
    010123b0:   lsrs    r3, r1, #7
    141           HWREG_UART(Port, UART_O_LCRH) = WordLength | StopBits | Parity | Mode; // Write to this reg also set the new baudrate
    010123b2:   orrs    r2, r7
    127           HWREG(Port + UART_O_FBRD) = ulDiv % 64; 
    BREAKPOINT #3
    010123b4:   ubfx    r1, r1, #1, #6
    121           HWREG(Port + UART_O_IBRD) = ulDiv / 64;
    010123b8:   str     r3, [r4, #36]   ; 0x24
    127           HWREG(Port + UART_O_FBRD) = ulDiv % 64;
    010123ba:   str     r1, [r4, #40]   ; 0x28
    010123bc:   str     r2, [r4, #44]   ; 0x2c
    142           MODIFY_REG(HWREG_UART(Port, UART_O_CTL), UART_CTL_RTSEN | UART_CTL_CTSEN, HwFlowCtl); 
    BREAKPOINT #2
    010123be:   ldr.w   r3, [r12]
    010123c2:   ldr     r2, [sp, #40]   ; 0x28
    010123c4:   bic.w   r3, r3, #49152  ; 0xc000
    010123c8:   orrs    r3, r2
    010123ca:   str.w   r3, [r12]
    143       }
    010123ce:   ldmia.w sp!, {r3, r4, r5, r6, r7, r8, r9, pc}
    114               HWREG(Port + UART_O_CTL) &= ~(UART_CTL_HSE);
    010123d2:   ldr     r3, [r4, #48]   ; 0x30
    010123d4:   bic.w   r3, r3, #32
    010123d8:   str     r3, [r4, #48]   ; 0x30
    010123da:   add.w   r12, r4, #48    ; 0x30
    010123de:   b.n     0x10123a0 <UART_SetConfig+136>
    132           assert_param(IS_UART_BAUDRATE(BaudRate));
    010123e0:   ldr     r0, [pc, #8]    ; (0x10123ec <UART_SetConfig+212>)
    010123e2:   movs    r1, #132        ; 0x84
    010123e4:   bl      0x100bb60 <assert_failed>
    010123e8:   b.n     0x101232e <UART_SetConfig+22>
    010123ea:   nop     
    010123ec:   ldmdb   r4, {r0, r8}
    010123f0:   stmia   r0!, {}
    010123f2:   ands    r0, r0
    ......... ...
    

    If we set breakpoints #1 and #2, run between them, and read the registers, we get: 0x4000D024=0x2B, 0x4000D028=0x26

    But if we set an additional breakpoint at #3 and run from breakpoint to breakpoint, the pause between #3 and #2 results in: 0x4000D024=0x2B, 0x4000D028=0x1A

    The latter values are the correct ones. Adding a NOP between #3 and #2 instead of the breakpoint seems to resolve the issue.
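
    To put the two readings in perspective: 0x4000D024 and 0x4000D028 are Port + 0x24 (UART_O_IBRD) and Port + 0x28 (UART_O_FBRD), going by the "#36" and "#40" store offsets in the listing. The sketch below inverts the divider arithmetic from SetBaudRate() for the non-high-speed case to show which baud rate each register pair implies. The function name "implied_baud" is hypothetical, and the result is rounded down.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical helper: baud rate implied by an IBRD/FBRD pair when
     * high speed mode (HSE) is off. */
    static uint32_t implied_baud(uint32_t uart_clk, uint32_t ibrd, uint32_t fbrd)
    {
        uint32_t div64 = ibrd * 64u + fbrd;  /* divider in units of 1/64 */

        assert(div64 != 0u);

        /* div64 ~= 64 * clk / (16 * baud)  =>  baud ~= 4 * clk / div64 */
        return (uint32_t)((4ull * uart_clk) / div64);
    }
    ```

    With the 80000000 Hz clock from this thread, 0x2B/0x1A corresponds to roughly 115190 baud and 0x2B/0x26 to roughly 114695 baud, about 0.4% below the requested 115200. This only decodes the two readings. It does not explain why the full-speed run ends up with 0x26.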