Compiler/LAUNCHXL2-570LC43: Optimization Level 4 can cause TI ARM v18.12.4.LTS to not respect order of volatile accesses in the C code

Part Number: LAUNCHXL2-570LC43

Tool/software: TI C/C++ Compiler

Section J.3.10, "Qualifiers", of the ARM Optimizing C/C++ Compiler v18.1.0.LTS User's Guide states:

The TI compiler will not reorder two volatile accesses, but it may reorder a volatile and a non-volatile access, so volatile cannot be used to create a critical section.
The TMS570LC4357_emac_status.zip CCS project attached to https://e2e.ti.com/support/microcontrollers/hercules/f/312/p/858377/3179628#3179628 contains the following function designed to sample two volatile registers in a specific order:
static inline void sampleEmacStatus (const bool sample_a_first,
                                     const volatile uint32_t *const in_a, const volatile uint32_t *const in_b,
                                     uint32_t *const out_a, uint32_t *const out_b)
{
    if (sample_a_first)
    {
        *out_a = *in_a;
        *out_b = *in_b;
    }
    else
    {
        *out_b = *in_b;
        *out_a = *in_a;
    }
}

The function is called with the sample_a_first parameter alternating between false and true.
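For reference, the calling pattern described above can be sketched as follows. This is only an illustration that runs on a host: fake_reg_a, fake_reg_b and run_passes are hypothetical names, and the two plain volatile variables stand in for the EMAC registers that the attached project reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the two volatile registers (hypothetical; on the target
 * these would be fixed peripheral addresses). */
static volatile uint32_t fake_reg_a = 0x11111111u;
static volatile uint32_t fake_reg_b = 0x22222222u;

static inline void sampleEmacStatus (const bool sample_a_first,
                                     const volatile uint32_t *const in_a, const volatile uint32_t *const in_b,
                                     uint32_t *const out_a, uint32_t *const out_b)
{
    if (sample_a_first)
    {
        *out_a = *in_a;
        *out_b = *in_b;
    }
    else
    {
        *out_b = *in_b;
        *out_a = *in_a;
    }
}

/* Hypothetical driver: flips the requested sampling order on every pass. */
static void run_passes (const uint32_t num_passes, uint32_t *const last_a, uint32_t *const last_b)
{
    bool sample_a_first = false;
    uint32_t pass;

    for (pass = 0; pass < num_passes; pass++)
    {
        sampleEmacStatus (sample_a_first, &fake_reg_a, &fake_reg_b, last_a, last_b);
        sample_a_first = !sample_a_first;
    }
}
```

The sampled values are the same whichever order is used; only the order of the two loads in the generated code differs, which is why the problem is visible in the disassembly rather than in the results.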

In the Debug configuration of the sample project, where the Optimization Level is Off, the generated assembler shown in the debugger does allow the order of the volatile accesses to be controlled on each pass:

0000d618:   E34F0C52            movt       r0, #0xfc52
0000d61c:   E34FCCF7            movt       r12, #0xfcf7
112           if (sample_a_first)
0000d620:   E3510001            cmp        r1, #1
0000d624:   0A000002            beq        $C$L3
119               *out_b = *in_b;
0000d628:   E59CC000            ldr        r12, [r12]
120               *out_a = *in_a;
0000d62c:   E5901000            ldr        r1, [r0]
0000d630:   EA000001            b          $C$L4
114               *out_a = *in_a;
          $C$L3:
0000d634:   E5901000            ldr        r1, [r0]
115               *out_b = *in_b;
0000d638:   E59CC000            ldr        r12, [r12]
138           hdp_null = hdp == 0;
          $C$L4:

Whereas in the Release configuration of the sample project, where the Optimization Level is 4, the sampleEmacStatus function in the generated assembler appears to have been optimized such that only the first branch of the condition is ever taken:

112           if (sample_a_first)
          $C$L39:
000092d8:   E3570001            cmp        r7, #1
000092dc:   E1A0C005            mov        r12, r5
114               *out_a = *in_a;
000092e0:   E5141004            ldr        r1, [r4, #-4]
115               *out_b = *in_b;
000092e4:   E59604C0            ldr        r0, [r6, #0x4c0]

I think r7 is the register which holds the sample_a_first boolean. There is a test on the value of r7, but there are no branch instructions based upon the comparison result.

I.e. optimization level 4 appears to have removed the code which attempts to alternate the order of access to the volatile registers on each pass. Is this a bug in the compiler, or a limitation of trying to use C code to control the order of access to two volatile registers?
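One possible way to express the same intent is to choose the access order by selecting pointers, so that the source contains a single pair of volatile loads instead of two branches holding the same pair in opposite orders. This is only a sketch: sampleInOrder is a hypothetical name, and it has not been checked against v18.12.4.LTS at optimization level 4, so the generated assembler would still need to be inspected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical alternative to sampleEmacStatus: the order is chosen by
 * selecting pointers, so there is only one pair of volatile loads. */
static inline void sampleInOrder (const bool sample_a_first,
                                  const volatile uint32_t *const in_a, const volatile uint32_t *const in_b,
                                  uint32_t *const out_a, uint32_t *const out_b)
{
    const volatile uint32_t *const first_in  = sample_a_first ? in_a : in_b;
    const volatile uint32_t *const second_in = sample_a_first ? in_b : in_a;
    uint32_t *const first_out  = sample_a_first ? out_a : out_b;
    uint32_t *const second_out = sample_a_first ? out_b : out_a;

    *first_out  = *first_in;   /* first volatile read */
    *second_out = *second_in;  /* second volatile read */
}
```

The arguments are the same as for sampleEmacStatus, so it could be swapped in at the existing call site for comparison.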

  • Thank you for reporting this problem, and supplying a test case.  I can reproduce the same behavior.  It does appear that the compiler generates incorrect code.  I filed the entry CODEGEN-6909 to have this investigated.  You are welcome to follow it with the SDOWP link below in my signature.

    Thanks and regards,

    -George

  • Hello,

    I had a similar issue with << and >> operations in inline functions at optimization level 1, speed 3, which produced incorrect bit shifts. I noticed that _INLINE was not in the project symbols or linker paths. After adding it and reducing the linker optimization to level 0, speed 2, the inline << and >> operations began to work correctly. That was the difference I noted between two projects with the same inline function.

    Note: level 4 must be passed to the linker for the optimizations to take effect. However, the CCS default pre-processor symbol _INLINE=1 was missing from the linker paths even at level 1.