Hi TI experts,
I have observed strange behaviour when running bare-metal test code on the MCU cores of our Jacinto high (J784S4), and I have concluded that the compiler is generating invalid assembly from valid C++ source.
Below are two versions of the same C++ file, each with its compiler command line and the resulting disassembly. I have verified the same behaviour with both compiler version 3.2.2 LTS and 4.0.0 LTS.
Version 1:

    #include "platform_abstraction.h"

    int main(void)
    {
        PlatformUartInit();
        for(int coreId = 2; coreId < 20; coreId++)
        {
            if(PlatformCoreLoad(static_cast<t_CoreId>(coreId)))
                PlatformUartPrintf("Ok\r\n");
            else
                PlatformUartPrintf("Bad\r\n");
        }
        while(true)
            ;
    }

    /opt/ti-cgt-armllvm_4.0.0.LTS/bin/tiarmclang -c -std=c++14 -Wall -Wextra -Werror -mfloat-abi=hard -mfpu=vfpv3-d16 -mcpu=cortex-r5 -march=armv7-r -isystem /opt/ti-cgt-armllvm_4.0.0.LTS/include -O1 -g main.cpp -o main.o

    Disassembly of main.o:

    TEXT Section .text.main, 0x44 bytes at 0x00000000
    000000:              main:
    000000:              .arm
    000000:              :
    000000: FEFFFFEB     BL _Z16PlatformUartInitv [0x0]
    000004: 005000E3     MOVW R5, #0
    000008: 006000E3     MOVW R6, #0
    00000c: 005040E3     MOVT R5, #0
    000010: 006040E3     MOVT R6, #0
    000014: 0240A0E3     MOV R4, #2
    000018: 00F020E3     MSR CPSR_, #0
    00001c: 00F020E3     MSR CPSR_, #0
    000020: 7400EFE6     UXTB R0, R4, ROR #0
    000024: FEFFFFEB     BL _Z16PlatformCoreLoad24t_CoreId [0x24]
    000028: 0610A0E1     MOV R1, R6
    00002c: 000050E3     CMP R0, #0
    000030: 0510A011     MOVNE R1, R5
    000034: 0100A0E1     MOV R0, R1
    000038: FEFFFFEB     BL bsp_printf [0x38]
    00003c: 014084E2     ADD R4, R4, #1
    000040: F6FFFFEA     B 0x00000020    ; Infinite loop forever beyond the bounds of the loop that should be up to < 20
Version 2:

    #include "platform_abstraction.h"

    int main(void)
    {
        PlatformUartInit();
        for(int coreId = 2; coreId < 20; coreId++)
        {
            if(PlatformCoreLoad(static_cast<t_CoreId>(coreId)))
                PlatformUartPrintf("Ok\r\n");
            else
                PlatformUartPrintf("Bad\r\n");
        }
        PlatformUartPrintf("Done looping\r\n");
        while(true)
            ;
    }

    /opt/ti-cgt-armllvm_4.0.0.LTS/bin/tiarmclang -c -std=c++14 -Wall -Wextra -Werror -mfloat-abi=hard -mfpu=vfpv3-d16 -mcpu=cortex-r5 -march=armv7-r -isystem /opt/ti-cgt-armllvm_4.0.0.LTS/include -O1 -g main.cpp -o main.o

    Disassembly of main.o:

    TEXT Section .text.main, 0x58 bytes at 0x00000000
    000000:              main:
    000000:              .arm
    000000:              :
    000000: FEFFFFEB     BL _Z16PlatformUartInitv [0x0]
    000004: 005000E3     MOVW R5, #0
    000008: 006000E3     MOVW R6, #0
    00000c: 005040E3     MOVT R5, #0
    000010: 006040E3     MOVT R6, #0
    000014: 0240A0E3     MOV R4, #2
    000018: 00F020E3     MSR CPSR_, #0
    00001c: 00F020E3     MSR CPSR_, #0
    000020: 7400EFE6     UXTB R0, R4, ROR #0
    000024: FEFFFFEB     BL _Z16PlatformCoreLoad24t_CoreId [0x24]
    000028: 0610A0E1     MOV R1, R6
    00002c: 000050E3     CMP R0, #0
    000030: 0510A011     MOVNE R1, R5
    000034: 0100A0E1     MOV R0, R1
    000038: FEFFFFEB     BL bsp_printf [0x38]
    00003c: 014084E2     ADD R4, R4, #1
    000040: 140054E3     CMP R4, #20
    000044: F5FFFF1A     BNE 0x00000020
    000048: 000000E3     MOVW R0, #0
    00004c: 000040E3     MOVT R0, #0
    000050: 0FE0A0E1     MOV R14, PC     ; Puts 0x58 into LR, which nop slides into random code on return from bsp_print
    000054: FEFFFFEA     B bsp_printf [0x54]
The compiler appears to be trying to merge the final empty infinite loop at the end of main with whatever precedes it (in the first case, the loop that loads the other cores; in the second case, a print statement).
Either way, this results in incorrect behaviour. In the first case, the loop bounded by coreId < 20 runs forever: no bounds check is generated (the ADD at 0x3c is followed directly by an unconditional B back to 0x20), so the loop carries on to core IDs that don't exist, causing MCU1_0, which services the load requests, to fail.
In the second case, the code manually sets a return address beyond the end of the function. MOV R14, PC at 0x50 places that instruction's address plus 8, i.e. 0x58, in LR (R14), and main is only 0x58 bytes long. When bsp_printf returns, execution continues past the end of main, nop-slides into unrelated code (in my case atoi from the standard library), and eventually corrupts the stack enough to crash.
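The return-address arithmetic can be checked with a small compile-time sketch. It relies on the ARM-state rule that reading PC as an operand yields the current instruction's address plus 8, and uses the section-relative offsets from the second listing. The names below (kArmPcReadOffset, LrAfterMovLrPc and the other constants) are made up for illustration and do not come from the project.

```cpp
#include <cstdint>

// In ARM (A32) state, reading PC as an operand yields the address of the
// current instruction plus 8.
constexpr std::uint32_t kArmPcReadOffset = 8U;

// Hypothetical helper: value that "MOV R14, PC" leaves in LR when the MOV
// itself is located at movAddress.
constexpr std::uint32_t LrAfterMovLrPc(std::uint32_t movAddress)
{
    return movAddress + kArmPcReadOffset;
}

// Section-relative values taken from the second disassembly listing.
constexpr std::uint32_t kMainStart  = 0x00U;
constexpr std::uint32_t kMainSize   = 0x58U;  // ".text.main, 0x58 bytes"
constexpr std::uint32_t kMovAddress = 0x50U;  // "MOV R14, PC"

// First byte after the last instruction of main.
constexpr std::uint32_t kMainEnd = kMainStart + kMainSize;

// LR ends up pointing exactly one past the end of main, so whatever the
// linker places after main is executed when bsp_printf returns.
static_assert(LrAfterMovLrPc(kMovAddress) == 0x58U, "LR should be 0x58");
static_assert(LrAfterMovLrPc(kMovAddress) == kMainEnd,
              "return address is the first byte past main");
```

Both assertions hold for the numbers in the listing, which matches the observation that the return from bsp_printf falls through into whatever follows main.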
Can you confirm this and let me know whether there will be a compiler fix?
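Until that is answered, one workaround that may be worth trying is to give the final loop an observable side effect. C++11 through C++23 allow an implementation to assume that a loop which performs no I/O, volatile access, atomic or synchronization operation eventually terminates, so an empty while(true) loop gives the optimizer that freedom, while a loop that reads a volatile object does not. Whether this rule is what drives the code generation above is an assumption, and the sketch below has not been tried on tiarmclang. SpinWhileNonZero and ParkCoreForever are hypothetical names, not part of platform_abstraction.h.

```cpp
#include <cstdint>

// Hypothetical helper: spin for as long as *flag is non-zero and return the
// number of iterations. Every iteration reads a volatile object, which is an
// observable side effect the optimizer has to preserve.
inline std::uint32_t SpinWhileNonZero(const volatile std::uint32_t* flag)
{
    std::uint32_t spins = 0U;
    while (*flag != 0U)
    {
        ++spins;  // unsigned arithmetic, so wrap-around is well defined
    }
    return spins;
}

// Hypothetical replacement for the bare "while(true);" at the end of main.
[[noreturn]] inline void ParkCoreForever()
{
    static volatile std::uint32_t parked = 1U;  // never cleared
    for (;;)
    {
        // Does not return while parked stays non-zero; the volatile read
        // inside keeps the loop from being treated as side-effect free.
        (void)SpinWhileNonZero(&parked);
    }
}
```

Calling ParkCoreForever() in place of the empty loop keeps the intent (park the core forever) while making the loop body observable. Comparing the disassembly before and after the change would show whether the bounds check and the normal BL to bsp_printf come back.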