AM263P4-Q1: Difference in behaviour of __TI_auto_init function

Part Number: AM263P4-Q1

Hi expert, 

I am posting this question on behalf of my customer. I'd like to discuss a counterintuitive behavior I'm observing with the TI ARM Clang compiler on a Cortex-R5 target (the device is AM263P).

  Setup:
  - Compiler: TI ARM Clang v4.0.x.LTS
  - Target: ARM Cortex-R5 (running from L2 SRAM)
  - Linker initialization model: --rom_model

  Observation:

I'm measuring the cycle count of __TI_auto_init, which is called from the boot code.

When a particular function call is present in main(), the initialization takes fewer cycles (~448K). When that same function call is commented out, initialization takes more cycles (~555K), a difference of ~107K cycles.

This is counterintuitive because removing a function call makes the binary smaller, yet initialization becomes slower.
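For reference, the kind of measurement described above can be sketched as follows. This is a minimal illustration, not the code from the attached project: `measure_cycles`, `read_cycle_counter` and `example_workload` are hypothetical names, and the target branch assumes the Cortex-R5 PMU cycle counter (PMCCNTR) has already been enabled and reset by the boot code. The non-ARM branch is only a stand-in so the sketch compiles off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__arm__)
/* On target: read PMCCNTR (CP15 c9, c13, 0 on ARMv7-R).
 * Assumption: the PMU cycle counter was enabled earlier in the boot code. */
static inline uint32_t read_cycle_counter(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
}
#else
/* Off target: a fake counter that advances by 100 on every read,
 * used only so the sketch can be compiled and exercised on a host. */
static uint32_t fake_counter;
static inline uint32_t read_cycle_counter(void)
{
    fake_counter += 100u;
    return fake_counter;
}
#endif

typedef void (*init_fn_t)(void);

/* Hypothetical placeholder for the routine being profiled
 * (in the real project this would be the __TI_auto_init call). */
void example_workload(void)
{
}

/* Return the number of counter ticks elapsed while fn() runs. */
uint32_t measure_cycles(init_fn_t fn)
{
    assert(fn != NULL);
    uint32_t start = read_cycle_counter();
    fn();
    uint32_t end = read_cycle_counter();
    /* Unsigned subtraction stays correct across a single counter wraparound. */
    return end - start;
}
```

In the actual project the measurement is taken around the __TI_auto_init call in the boot assembly, so the C wrapper above only illustrates the start/stop/subtract pattern.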

  What I've traced so far:

  1. Removing the function call shrinks main() by ~10 bytes.
  2. This causes the linker to shift all functions placed after main() in .text.
  3. The shifted addresses change function pointer values stored in .data (specifically an I/O function table of ~240 bytes).
  4. I also profiled the execution cycles of __TI_auto_init with the cache disabled. The observations were still the same.
  
Context on the function call:
  The function in question is Dio_WriteChannel(), a standard AUTOSAR MCAL Digital I/O driver API used throughout the application for GPIO operations. It is already called multiple times in the
  normal execution flow (pin reads, writes, and toggles as part of the DIO driver validation). The specific instance I'm commenting out is a single additional GPIO write at the end of main(). This
  is not an unusual or synthetic addition; it's a routine driver call that any developer might add or remove during normal application development. The observation is that this trivial, one-line
  source change (adding or removing a single existing API call) causes a ~107K cycle difference in __TI_auto_init, which executes before main() even runs.
  
  I am also attaching the map files from my test: one from the build where this function call is present and one from the build where it is not.

Why this is important for the customer:
The customer is configuring a watchdog timer in their application. This variation in the execution time of the device startup code is affecting their WDG timeout calculations.
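To show why the variation matters for the timeout, here is a rough sketch of the budget arithmetic. It is not the customer's actual calculation: the function names are hypothetical, the 400 MHz core clock and the 20% margin mentioned below are assumptions for illustration, and the cycle counts are the approximate figures quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to microseconds for a core clock given in MHz.
 * Rounds up so the time budget is never underestimated. */
uint32_t cycles_to_us(uint32_t cycles, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0u);
    return (cycles + cpu_mhz - 1u) / cpu_mhz;
}

/* Startup budget in microseconds: take the larger of two observed
 * cycle counts and add a safety margin given in percent (rounded up). */
uint32_t startup_budget_us(uint32_t cycles_a, uint32_t cycles_b,
                           uint32_t cpu_mhz, uint32_t margin_percent)
{
    uint32_t worst = (cycles_a > cycles_b) ? cycles_a : cycles_b;
    uint32_t worst_us = cycles_to_us(worst, cpu_mhz);
    return worst_us + (worst_us * margin_percent + 99u) / 100u;
}
```

Under the assumed 400 MHz clock, the ~107K cycle difference corresponds to roughly 268 us, so a timeout derived from the faster build (~448K cycles) would leave noticeably less headroom on the slower one (~555K cycles). A call such as `startup_budget_us(448000u, 555000u, 400u, 20u)` sizes the budget from the worse of the two measurements.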
  
  Could you please help us understand this behaviour?
  

Attaching the map files:

with-additional-fn-call.txt 

without-additional-fn-call.txt 

 

dio_app_sip.zip 

 

I have also attached the project.

The __TI_auto_init function is here:  Utils/boot_armv7r_asm.asm

image.png

 

This is the additional function call I am referring to: example/DioApp.c

image.png


  Thanks & Regards,
  Aswin