Hello,
we noticed some compilation/memory-alignment problems when using -O2 compared to -O1. With -O1 everything works fine. With -O2 we get undefined and prefetch aborts when accessing memory that was previously allocated on a TLSF heap. We also noticed that the problem can be solved by aligning our structs to 8 bytes with the __attribute__((aligned(8))) directive. We use a mix of C and C++.
We wanted to find the root cause of this problem, so we wanted to know which additional compiler flags are set with -O2 compared to -O1. The compiler manual does not say much about which flags are added, only what the optimization levels do: https://software-dl.ti.com/codegen/docs/tiarmclang/rel1_3_0_LTS/compiler_manual/using_compiler/compiler_options/optimization_options.html (the documentation of the 1.3.1 compiler links to 1.3.0 for this topic). Based on a Stack Overflow post, I tried to find out what tiarmclang uses for -O2 like this:
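To illustrate the workaround, here is a minimal sketch. The struct names and fields are made up for this post and are not our real code; it only shows what the attribute changes about a type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct (not from our code base) with its natural alignment. */
struct msg_plain {
    uint32_t id;
    uint16_t len;
};

/* Same members, but forced to 8-byte alignment (the workaround). */
struct msg_aligned {
    uint32_t id;
    uint16_t len;
} __attribute__((aligned(8)));

/* The attribute raises the alignment and pads the size to a multiple of it. */
_Static_assert(_Alignof(struct msg_aligned) == 8, "expected 8-byte alignment");
_Static_assert(sizeof(struct msg_aligned) % 8 == 0, "size must be a multiple of 8");

/* Returns nonzero if p satisfies the alignment of struct msg_aligned. */
int msg_is_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(struct msg_aligned)) == 0;
}
```

Note that the attribute only tells the compiler what alignment it may assume for such objects. For heap objects, the allocator still has to return addresses that actually satisfy it.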
echo 'int;' | ./tiarmclang -xc -O2 - -o /dev/null -\#\#\#
echo 'int;' | ./tiarmclang -xc -O1 - -o /dev/null -\#\#\#
The outputs are:
$ echo 'int;' | ./tiarmclang -xc -O2 - -o /dev/null -\#\#\#
TI Arm Clang Compiler 1.3.1.LTS
Target: arm-ti-none-eabi
Thread model: posix
InstalledDir: C:\ti\ti-cgt-armllvm_1.3.1.LTS\bin
 "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin\\tiarmclang.exe" "-cc1" "-triple" "thumbv7em-ti-none-eabihf" "-emit-obj" "--mrelax-relocations" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-fno-rounding-math" "-mconstructor-aliases" "-nostdsysteminc" "-fno-zero-initialized-in-bss" "-fdef-uninit-in-bss" "-fcommon" "-ffunction-sections" "-fdata-sections" "-fno-delete-null-pointer-checks" "-fwchar-type=int" "-fshort-enums" "-target-cpu" "cortex-m4" "-target-abi" "aapcs" "-fvisibility" "hidden" "-mfloat-abi" "hard" "-fallow-half-arguments-and-returns" "-fno-split-dwarf-inlining" "-debugger-tuning=gdb" "-resource-dir" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib\\clang\\12.0.1" "-internal-isystem" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib\\clang\\12.0.1\\include" "-internal-isystem" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\include\\c" "-O2" "-fdebug-compilation-dir" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin" "-ferror-limit" "19" "-fno-signed-char" "-fgnuc-version=4.2.1" "-vectorize-loops" "-vectorize-slp" "-faddrsig" "-o" "C:\\msys64\\tmp\\--242b98.o" "-x" "c" "-"
 "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin\\tiarmlnk" "-IC:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib" "-o" "nul" "C:\\msys64\\tmp\\--242b98.o" "--start-group" "-llibc++.a" "-llibc++abi.a" "-llibc.a" "-llibsys.a" "-llibsysbm.a" "-llibclang_rt.builtins.a" "-llibclang_rt.profile.a" "--end-group" "--cg_opt_level=2"
and:
$ echo 'int;' | ./tiarmclang -xc -O1 - -o /dev/null -\#\#\#
TI Arm Clang Compiler 1.3.1.LTS
Target: arm-ti-none-eabi
Thread model: posix
InstalledDir: C:\ti\ti-cgt-armllvm_1.3.1.LTS\bin
 "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin\\tiarmclang.exe" "-cc1" "-triple" "thumbv7em-ti-none-eabihf" "-emit-obj" "--mrelax-relocations" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-fno-rounding-math" "-mconstructor-aliases" "-nostdsysteminc" "-fno-zero-initialized-in-bss" "-fdef-uninit-in-bss" "-fcommon" "-ffunction-sections" "-fdata-sections" "-fno-delete-null-pointer-checks" "-fwchar-type=int" "-fshort-enums" "-target-cpu" "cortex-m4" "-target-abi" "aapcs" "-fvisibility" "hidden" "-mfloat-abi" "hard" "-fallow-half-arguments-and-returns" "-fno-split-dwarf-inlining" "-debugger-tuning=gdb" "-resource-dir" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib\\clang\\12.0.1" "-internal-isystem" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib\\clang\\12.0.1\\include" "-internal-isystem" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\include\\c" "-O1" "-fdebug-compilation-dir" "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin" "-ferror-limit" "19" "-fno-signed-char" "-fgnuc-version=4.2.1" "-faddrsig" "-o" "C:\\msys64\\tmp\\--ed7ffc.o" "-x" "c" "-"
 "C:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\bin\\tiarmlnk" "-IC:\\ti\\ti-cgt-armllvm_1.3.1.LTS\\lib" "-o" "nul" "C:\\msys64\\tmp\\--ed7ffc.o" "--start-group" "-llibc++.a" "-llibc++abi.a" "-llibc.a" "-llibsys.a" "-llibsysbm.a" "-llibclang_rt.builtins.a" "-llibclang_rt.profile.a" "--end-group" "--cg_opt_level=1"
Apart from the optimization level itself ("-O2" vs. "-O1", and "--cg_opt_level=2" vs. "--cg_opt_level=1" for the linker) and the temporary file names, there are only two differences: "-vectorize-loops" "-vectorize-slp"
The problem: you can't set these flags yourself; as far as I can tell, they do not exist as options I could pass. And two flags are far too few to account for the optimizations that the manual lists for -O2.
The problem itself may be rooted in TLSF, since it does not seem to have been updated since 2008, but it is said to compile and work fine with -O2 under gcc.
Since I have often had weird undefined and prefetch aborts that vanished when I moved some libs around inside the linker script, I am not sure where exactly the problem is located, but it seems that at some optimization level the alignment of .data objects and heap-allocated objects is not handled correctly.
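To narrow this down, one could check the address of every heap block at run time. The following is only a sketch of that idea: is_aligned and checked_alloc are hypothetical helper names, and malloc() stands in for the TLSF allocation call, which I do not want to misquote here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of alignment.
   alignment must be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1u)) == 0;
}

/* Hypothetical wrapper: malloc() is a stand-in for the TLSF allocation call.
   Returns NULL if the allocation fails or the block is not 8-byte aligned. */
void *checked_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL && !is_aligned(p, 8u)) {
        free(p);   /* misaligned block: release it and report failure */
        return NULL;
    }
    return p;
}
```

On the target, a breakpoint or log message in the misaligned branch would show whether the heap ever returns a block that is not 8-byte aligned.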
Best regards
Felix