TI-CGT: Invalid assembly code generated using TI-CGT-ARMLLVM to compile C++ for R5F

Part Number: TI-CGT

Hi TI experts,

I have observed some strange behaviour running bare-metal test code on the MCU cores of our Jacinto high (J784S4), and I have come to the conclusion that the compiler is generating invalid assembly from valid C++ source.

Attached are two versions of the same C++ file, along with the compiler settings and the resulting assembly. I have verified the same behaviour on both compiler version 3.2.2 LTS and 4.0.0 LTS.

reordering_bug_v1.txt
#include "platform_abstraction.h"

int main(void)
{
    PlatformUartInit();

    for(int coreId = 2; coreId < 20; coreId++)
    {
        if(PlatformCoreLoad(static_cast<t_CoreId>(coreId)))
            PlatformUartPrintf("Ok\r\n");
        else
            PlatformUartPrintf("Bad\r\n");
    }

    while(true)
        ;
}

/opt/ti-cgt-armllvm_4.0.0.LTS/bin/tiarmclang
-c
-std=c++14
-Wall
-Wextra
-Werror
-mfloat-abi=hard
-mfpu=vfpv3-d16
-mcpu=cortex-r5
-march=armv7-r
-isystem
/opt/ti-cgt-armllvm_4.0.0.LTS/include
-O1
-g
main.cpp
-o
main.o

Disassembly of main.o:

TEXT Section .text.main, 0x44 bytes at 0x00000000
000000:              main:
000000:               .arm
000000:              :
000000: FEFFFFEB         BL              _Z16PlatformUartInitv [0x0]
000004: 005000E3         MOVW            R5, #0
000008: 006000E3         MOVW            R6, #0
00000c: 005040E3         MOVT            R5, #0
000010: 006040E3         MOVT            R6, #0
000014: 0240A0E3         MOV             R4, #2
000018: 00F020E3         MSR             CPSR_, #0
00001c: 00F020E3         MSR             CPSR_, #0
000020: 7400EFE6         UXTB            R0, R4, ROR #0
000024: FEFFFFEB         BL              _Z16PlatformCoreLoad24t_CoreId [0x24]
000028: 0610A0E1         MOV             R1, R6
00002c: 000050E3         CMP             R0, #0
000030: 0510A011         MOVNE           R1, R5
000034: 0100A0E1         MOV             R0, R1
000038: FEFFFFEB         BL              bsp_printf [0x38]
00003c: 014084E2         ADD             R4, R4, #1
000040: F6FFFFEA         B               0x00000020   ; Unconditional branch: loops forever, with no check against the coreId < 20 bound

reordering_bug_v2.txt
#include "platform_abstraction.h"

int main(void)
{
    PlatformUartInit();

    for(int coreId = 2; coreId < 20; coreId++)
    {
        if(PlatformCoreLoad(static_cast<t_CoreId>(coreId)))
            PlatformUartPrintf("Ok\r\n");
        else
            PlatformUartPrintf("Bad\r\n");
    }

    PlatformUartPrintf("Done looping\r\n");

    while(true)
        ;
}

/opt/ti-cgt-armllvm_4.0.0.LTS/bin/tiarmclang
-c
-std=c++14
-Wall
-Wextra
-Werror
-mfloat-abi=hard
-mfpu=vfpv3-d16
-mcpu=cortex-r5
-march=armv7-r
-isystem
/opt/ti-cgt-armllvm_4.0.0.LTS/include
-O1
-g
main.cpp
-o
main.o

Disassembly of main.o:

TEXT Section .text.main, 0x58 bytes at 0x00000000
000000:              main:
000000:               .arm
000000:              :
000000: FEFFFFEB         BL              _Z16PlatformUartInitv [0x0]
000004: 005000E3         MOVW            R5, #0
000008: 006000E3         MOVW            R6, #0
00000c: 005040E3         MOVT            R5, #0
000010: 006040E3         MOVT            R6, #0
000014: 0240A0E3         MOV             R4, #2
000018: 00F020E3         MSR             CPSR_, #0
00001c: 00F020E3         MSR             CPSR_, #0
000020: 7400EFE6         UXTB            R0, R4, ROR #0
000024: FEFFFFEB         BL              _Z16PlatformCoreLoad24t_CoreId [0x24]
000028: 0610A0E1         MOV             R1, R6
00002c: 000050E3         CMP             R0, #0
000030: 0510A011         MOVNE           R1, R5
000034: 0100A0E1         MOV             R0, R1
000038: FEFFFFEB         BL              bsp_printf [0x38]
00003c: 014084E2         ADD             R4, R4, #1
000040: 140054E3         CMP             R4, #20
000044: F5FFFF1A         BNE             0x00000020
000048: 000000E3         MOVW            R0, #0
00004c: 000040E3         MOVT            R0, #0
000050: 0FE0A0E1         MOV             R14, PC                ; Puts 0x58 into LR, so the return from bsp_printf lands just past the end of main and slides into unrelated code
000054: FEFFFFEA         B               bsp_printf [0x54]
 

The compiler appears to be trying to merge together a final empty infinite loop at the end of the main function with whatever precedes it (in the first case, a loop that loads other cores, in the second case a print statement).

Either way, this results in illegal behaviour. In the first case, the loop that should be bounded by coreId < 20 never terminates: no bounds check is generated, so the loop carries on to core IDs that don't exist, causing MCU1_0, which services the load requests, to fail.
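As a rough cross-check of the two listings, the closing branch of each loop can be decoded by hand. The sketch below assumes the standard A32 branch encoding (condition in bits 31..28, a signed 24-bit word offset applied to the instruction address plus 8). The helper names are illustrative only and are not part of any TI tool.

```cpp
#include <cstdint>

// Condition field (bits 31..28) of a 32-bit ARM-state instruction word.
constexpr std::uint32_t cond_field(std::uint32_t insn) { return insn >> 28; }

// Target of an ARM-state B/BL located at `addr`: imm24 is sign-extended,
// multiplied by 4, and added to the instruction address plus 8.
constexpr std::uint32_t branch_target(std::uint32_t addr, std::uint32_t insn) {
    std::int32_t imm24 = static_cast<std::int32_t>(insn & 0x00FFFFFFu);
    if (imm24 & 0x00800000) imm24 -= 0x01000000;  // sign-extend 24 -> 32 bits
    return addr + 8u + static_cast<std::uint32_t>(imm24 * 4);
}

// The listings print bytes in memory order, so "F6FFFFEA" is the word 0xEAFFFFF6.
constexpr std::uint32_t kV1Branch = 0xEAFFFFF6u;  // at 0x40 in reordering_bug_v1
constexpr std::uint32_t kV2Branch = 0x1AFFFFF5u;  // at 0x44 in reordering_bug_v2

static_assert(cond_field(kV1Branch) == 0xE, "v1: condition AL, always taken");
static_assert(cond_field(kV2Branch) == 0x1, "v2: condition NE, taken while R4 != 20");
static_assert(branch_target(0x40, kV1Branch) == 0x20, "v1 jumps back to the loop body");
static_assert(branch_target(0x44, kV2Branch) == 0x20, "v2 jumps back to the loop body");
```

Both branches target 0x20, the top of the loop body. The difference is that the v1 word carries condition AL with no preceding CMP, while the v2 word carries NE after CMP R4, #20.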

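In the second case, the code manually sets a return address beyond the end of the function: MOV R14, PC at 0x50 reads PC as that instruction's address plus 8, so LR (R14) ends up holding 0x58, which is the first address past the 0x58-byte main. On return from the print call, execution slides into whatever code follows (in my case atoi from the standard library) and eventually corrupts the stack enough to crash.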
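The return-address arithmetic for the second case can be written out as a small compile-time check. This is only a sketch: the addresses and section size are copied from the v2 listing, and the function and constant names are made up for illustration.

```cpp
#include <cstdint>

// In ARM (A32) state, an instruction that reads PC as an operand sees its own
// address plus 8.
constexpr std::uint32_t pc_as_operand(std::uint32_t insn_addr) { return insn_addr + 8u; }

constexpr std::uint32_t kMovLrPcAddr = 0x50;  // MOV R14, PC
constexpr std::uint32_t kLastInsn    = 0x54;  // B bsp_printf, last instruction of main
constexpr std::uint32_t kMainSize    = 0x58;  // "TEXT Section .text.main, 0x58 bytes"

// LR receives 0x58, which is the first address after main's final instruction.
static_assert(pc_as_operand(kMovLrPcAddr) == 0x58, "LR value set by MOV R14, PC");
static_assert(pc_as_operand(kMovLrPcAddr) == kMainSize, "LR points one past the end of main");
static_assert(kLastInsn + 4u == kMainSize, "nothing in main follows the tail branch");
```

MOV R14, PC followed by B behaves like a call, so the callee returns to 0x58. Nothing in main has been emitted at that address, which is why execution continues into whatever the linker placed next.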

Can you verify this and let me know whether there will be a compiler fix?

  • I am unable to generate the same assembly code. There must be some detail I am missing. Pick one version. For that C++ source file, please follow the directions in the article "How to Submit a Compiler Test Case".

    Thanks and regards,

    -George

  • Attached is the pre-processed file for compiler 4.0.0 LTS, prepared per the article you linked, along with the command line and disassembly (these are mostly the same as above).

    5047.main.pp.txt

    main.disass.txt
    Disassembly of /home/s58973/FCASW/fcasw-skr-tp15-digital-processor/out/obj/mcu1_1/apps/boot/main.o:
    
    TEXT Section .text.main, 0x48 bytes at 0x00000000
    000000:              main:
    000000:               .arm
    000000:              :
    000000: FEFFFFEB         BL              _Z16PlatformUartInitv [0x0]
    000004: 005000E3         MOVW            R5, #0
    000008: 006000E3         MOVW            R6, #0
    00000c: 005040E3         MOVT            R5, #0
    000010: 006040E3         MOVT            R6, #0
    000014: 0240A0E3         MOV             R4, #2
    000018: 00F020E3         MSR             CPSR_, #0
    00001c: 00F020E3         MSR             CPSR_, #0
    000020: 7400EFE6         UXTB            R0, R4, ROR #0
    000024: FEFFFFEB         BL              _Z16PlatformCoreLoadh [0x24]
    000028: 0610A0E1         MOV             R1, R6
    00002c: 000050E3         CMP             R0, #0
    000030: 0510A011         MOVNE           R1, R5
    000034: 0100A0E1         MOV             R0, R1
    000038: 0410A0E1         MOV             R1, R4
    00003c: FEFFFFEB         BL              _Z18PlatformUartPrintfPKcz [0x3c]
    000040: 014084E2         ADD             R4, R4, #1
    000044: F5FFFFEA         B               0x00000020
    
    DATA Section .rodata.str1.79872835505060279931, 0x1e bytes at 0x00000000
    000000:              :
    000000:              .L.str:
    000000: 63637553         .word 0x63637553
    000004: 66737365         .word 0x66737365
    000008: 796c6c75         .word 0x796c6c75
    00000c: 616f6c20         .word 0x616f6c20
    000010: 20646564         .word 0x20646564
    000014: 65726f63         .word 0x65726f63
    000018: 0d642520         .word 0x0d642520
    00001c: 0000000a         .word 0x0000000a
    
    DATA Section .rodata.str1.2784758399614570051, 0x1b bytes at 0x00000000
    000000:              :
    000000:              .L.str.1:
    000000: 6c696146         .word 0x6c696146
    000004: 74206465         .word 0x74206465
    000008: 6f6c206f         .word 0x6f6c206f
    00000c: 64656461         .word 0x64656461
    000010: 726f6320         .word 0x726f6320
    000014: 64252065         .word 0x64252065
    000018: 00000a0d         .word 0x00000a0d
    
    main.cmd.txt
    /opt/ti/ti-cgt-armllvm_4.0.0.LTS/bin/tiarmclang -c -std=c++14 -Wall -Wextra -Werror -mfloat-abi=hard -mfpu=vfpv3-d16 -mcpu=cortex-r5 -march=armv7-r -isystem /opt/ti/ti-cgt-armllvm_4.0.0.LTS/include -I/tmp/sanitised/include -I/tmp/sanitised/platform_abstraction/include -DSOC_J784S4 -DSOC_SECURITY_GP -DCORE_MCU1_1 -DCORE_TYPE_R5F -DSELF_CORE=MCU1_1 -DJ784S4_EVM -DBOARD_MCU1_1 -DBUILD_MCU -O1 -g -D_DEBUG_=1 -MT /tmp/sanitised/out/obj/mcu1_1/apps/boot/main.o -MT /tmp/sanitised/out/obj/mcu1_1/apps/boot/main.d -MMD -MP -MF /tmp/sanitised/out/obj/mcu1_1/apps/boot/main.d /tmp/sanitised/apps/boot/main.cpp -o /tmp/sanitised/out/obj/mcu1_1/apps/boot/main.o
    

    Thanks,

    Ross
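The two DATA sections in main.disass.txt hold the format strings passed to the print call. Reading each .word as little-endian bytes recovers the text. The following is a minimal sketch of that decoding; decode_words, first_string and second_string are hypothetical helpers, with the word values and section sizes copied from the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: rebuilds a C string from the little-endian .word values
// printed in a listing. `size` is the section size in bytes, including the NUL.
inline std::string decode_words(const std::vector<std::uint32_t>& words, std::size_t size) {
    assert(size <= words.size() * 4);
    std::string text;
    for (std::size_t i = 0; i < size; ++i) {
        // Byte i lives in word i/4, least significant byte first.
        const char c = static_cast<char>((words[i / 4] >> (8 * (i % 4))) & 0xFFu);
        if (c == '\0') break;  // stop at the string terminator
        text.push_back(c);
    }
    return text;
}

// .L.str, 0x1e bytes
inline std::string first_string() {
    return decode_words({0x63637553u, 0x66737365u, 0x796c6c75u, 0x616f6c20u,
                         0x20646564u, 0x65726f63u, 0x0d642520u, 0x0000000au}, 0x1e);
}

// .L.str.1, 0x1b bytes
inline std::string second_string() {
    return decode_words({0x6c696146u, 0x74206465u, 0x6f6c206fu, 0x64656461u,
                         0x726f6320u, 0x64252065u, 0x00000a0du}, 0x1b);
}
```

The strings decode to "Successfully loaded core %d\r\n" and "Failed to loaded core %d\r\n" (spelling as stored in the object file). Each takes a %d argument, which is consistent with the extra MOV R1, R4 before the call in this listing and is one of the places where this test case differs from the "Ok"/"Bad" source shown earlier.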

  • Thank you for the test case.  I am able to reproduce the same behavior.

    This is the known issue EXT_EP-11936. The description of that issue is not a close match to your code, but the root cause is the same. We plan to update the issue description to make it clearer and to show how it applies in more situations, including yours.

    The good news is that it is fixed in version 4.0.2.LTS, which is currently available.  Are you able to upgrade to version 4.0.2.LTS?

    Thanks and regards,

    -George

  • Thanks George,

    I can confirm that both of my cases generate correct code with version 4.0.2. We had also already worked out that putting a nop in the loop avoids the problem, so we'll do that until we can roll out the new compiler version.

    Ross
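For readers who need the interim workaround, one way to write "a nop in the loop" is sketched below. The thread does not show Ross's exact code, so the placement in the final idle loop and the names idle_step and idle_forever are assumptions. The inline assembly uses the GNU-style syntax accepted by Clang-based compilers.

```cpp
// One iteration of the idle loop. The volatile asm statement has a side effect
// the optimiser must preserve, so the enclosing loop is no longer empty.
inline void idle_step() noexcept {
    __asm__ volatile("nop");
}

// Assumed replacement for the bare `while(true) ;` at the end of main.
[[noreturn]] inline void idle_forever() noexcept {
    while (true) {
        idle_step();
    }
}
```

In the test files above, this would mean ending main with a call to idle_forever() instead of the empty loop. Upgrading to 4.0.2.LTS remains the fix confirmed in this thread.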