TMS570LC4357: Data Abort on Branch

Part Number: TMS570LC4357
Other Parts Discussed in Thread: SEGGER, HALCOGEN

Issue:

I am having an odd issue where a data abort occurs on a branch instruction.

Setup:

My setup is:

  • A Hercules LaunchPad TMS570LC43X connected to my laptop via the Micro USB cable.
  • A Segger J-Link connected to the JTAG port and to my laptop.
  • HALCoGen-generated driver code that targets the GCC toolchain.

Process:

My simple bootloader starts in FLASH with the HALCoGen startup code that sets up the registers, stacks, mpu, cache, and then jumps to my bootloader's main function. It prints to the UART fine. Drivers seem to work and be compiled properly.

I have a breakpoint set so that right before I jump to the RTOS (in RAM) I can load the RTOS over JTAG and then I continue. This seems to work fine. I can jump to the proper address in RAM and step through instructions and watch the registers get loaded with what they are supposed to be loaded with.

Symptom:

I get to the following spot in the RTOS startup code:

─────────────────────────────────────────────────────────── registers ────
$r0  : 0x8000268  →  0xe0411000  →  0xe0411000
$r1  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r2  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r3  : 0x8011c60  →  0x000000  →  0xea0018e6  →  0xea0018e6
$r4  : 0x200003df  →  0x200003df
$r5  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r6  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r7  : 0x80002d5  →  0x70b508f6  →  0x70b508f6
$r8  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r9  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r10 : 0x000000  →  0xea0018e6  →  0xea0018e6
$r11 : 0x000000  →  0xea0018e6  →  0xea0018e6
$r12 : 0x000000  →  0xea0018e6  →  0xea0018e6
$sp  : 0x8011c60  →  0x000000  →  0xea0018e6  →  0xea0018e6
$lr  : 0x8000278  →  0xe59f004c  →  0xe59f004c
$pc  : 0x8000274  →  0xe12fff17  →  0xe12fff17
$cpsr: [negative zero carry overflow INTERRUPT FAST thumb]
─────────────────────────────────────────────────────────────── stack ────
0x8011c60│+0x0000: 0x000000  →  0xea0018e6  →  0xea0018e6	 ← $r3, $sp
0x8011c64│+0x0004: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c68│+0x0008: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c6c│+0x000c: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c70│+0x0010: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c74│+0x0014: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c78│+0x0018: 0x000000  →  0xea0018e6  →  0xea0018e6
0x8011c7c│+0x001c: 0x000000  →  0xea0018e6  →  0xea0018e6
──────────────────────────────────────────────────────── code:arm:ARM ────
    0x8000268 <bsp_start_vector_table_end+296> sub    r1,  r1,  r0
    0x800026c <bsp_start_vector_table_end+300> ldr    r7,  [pc,  #84]	; 0x80002c8 <bsp_start_hook_0_done+80>
    0x8000270 <bsp_start_vector_table_end+304> add    r7,  r7,  r1
 →  0x8000274 <bsp_start_vector_table_end+308> bx     r7
    0x8000278 <bsp_start_hook_0_done+0> ldr    r0,  [pc,  #76]	; 0x80002cc <bsp_start_hook_0_done+84>
    0x800027c <bsp_start_hook_0_done+4> ldr    r1,  [pc,  #76]	; 0x80002d0 <bsp_start_hook_0_done+88>
    0x8000280 <bsp_start_hook_0_done+8> cmp    r0,  r1
    0x8000284 <bsp_start_hook_0_done+12> beq    0x8000298 <bsp_start_hook_0_done+32>
    0x8000288 <bsp_start_hook_0_done+16> ldm    r1!,  {r2,  r3,  r4,  r5,  r6,  r7,  r8,  r9}
─────────────────────────────────── source:../../../bsps/a[...].S+476 ────
    471	 	ldr	r1, =.Lget_absolute_pc
    472	 .Lget_absolute_pc:
    473	 	sub	r1, r0
    474	 	ldr	r7, =bsp_start_hook_0
    475	 	add	r7, r1
 →  476	 	bx	r7
    477
    478	 	/* Allow bsp_start_hook_0() hooks to jump to this label */
    479	 bsp_start_hook_0_done:
    480
    481	 	/*
───────────────────────────────────────────────────────────── threads ────
[#0] Id 1, stopped 0x8000274 in bsp_start_vector_table_end (), reason: SINGLE STEP
─────────────────────────────────────────────────────────────── trace ────
[#0] 0x8000274 → bsp_start_vector_table_end()

This is a GDB view with the GEF visualization showing the registers, source code, and disassembled code in a single view.

If you look at the "code" section the next instruction is going to branch to the address contained in "r7". The "r7" register contains the address 0x80002d5.
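
For reference, the value in "r7" has bit 0 set. With "bx", bit 0 of the register selects the instruction set state and is not part of the destination address. The sketch below only illustrates that rule; the type and function names are hypothetical and do not come from HALCoGen or the RTOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of decoding the register operand of a BX instruction. */
typedef struct {
    uint32_t address; /* address at which execution continues */
    bool     thumb;   /* true if the core switches to Thumb state */
} bx_target_t;

/* Hypothetical helper: apply the ARM/Thumb interworking rule to a BX operand.
 * Bit 0 selects the state; it is cleared to form the branch address. */
bx_target_t decode_bx_target(uint32_t reg_value)
{
    bx_target_t target;
    target.thumb   = (reg_value & 1u) != 0u;
    target.address = reg_value & ~1u;
    return target;
}
```

Applied to 0x080002d5, this gives a destination of 0x080002d4 in Thumb state, so the branch target is Thumb code even though the branch itself executes in ARM state.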

However, using the GDB command "ni" (next instruction) takes me to the abort handler:

─────────────────────────────────────────────────────────── registers ────
$r0  : 0xfff7e400  →  0x000000  →  0xea0018e6  →  0xea0018e6
$r1  : 0x01c200  →  0xffffd9ff  →  0x000000  →  0xea0018e6  →  0xea0018e6
$r2  : 0x00e898  →   ; <UNDEFINED> instruction: 0xffffffff
$r3  : 0x3000032  →  0x3000032
$r4  : 0xfff7e400  →  0x000000  →  0xea0018e6  →  0xea0018e6
$r5  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r6  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r7  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r8  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r9  : 0x000000  →  0xea0018e6  →  0xea0018e6
$r10 : 0x000000  →  0xea0018e6  →  0xea0018e6
$r11 : 0x000000  →  0xea0018e6  →  0xea0018e6
$r12 : 0x000000  →  0xea0018e6  →  0xea0018e6
$sp  : 0x8001400  →  0x33018333  →  0x33018333
$lr  : 0x003608  →  0xa00000f  →  0xa00000f
$pc  : 0x000010  →  0xeafffffe  →  0xeafffffe
$cpsr: [NEGATIVE zero carry overflow INTERRUPT FAST thumb]
─────────────────────────────────────────────────────────────── stack ────
0x8001400│+0x0000: 0x33018333  →  0x33018333	 ← $sp
0x8001404│+0x0004: 0x62266c33  →  0x62266c33
0x8001408│+0x0008: 0x62636bb3  →  0x62636bb3
0x800140c│+0x000c: 0x681b62a3  →  0x000000  →  0xea0018e6  →  0xea0018e6
0x8001410│+0x0010: 0xb94f2001  →  0xb94f2001
0x8001414│+0x0014: 0xe7b64641  →  0xe7b64641
0x8001418│+0x0018: 0x4620f7ff  →  0x4620f7ff
0x800141c│+0x001c: 0xfc112001  →  0x000000  →  0xea0018e6  →  0xea0018e6
──────────────────────────────────────────────────────── code:arm:ARM ────
          0x4 <undefEntry+0>   b      0x4
          0x8 <svcEntry+0>     b      0x8
          0xc <prefetchEntry+0> b      0xc
●→       0x10 <dataEntry+0>    b      0x10
         0x14 <dataEntry+4>    b      0xdb8 <phantomInterrupt>
         0x18 <dataEntry+8>    ldr    pc,  [pc,  #-432]	; 0xfffffe70
         0x1c <dataEntry+12>   ldr    pc,  [pc,  #-432]	; 0xfffffe74
         0x20 <deregister_tm_clones+0> ldr    r0,  [pc,  #24]	; 0x40 <deregister_tm_clones+32>
         0x24 <deregister_tm_clones+4> ldr    r3,  [pc,  #24]	; 0x44 <deregister_tm_clones+36>
───────────────────────────────────────────────────────────── threads ────
[#0] Id 1, stopped 0x10 in dataEntry (), reason: BREAKPOINT
─────────────────────────────────────────────────────────────── trace ────
[#0] 0x10 → dataEntry()
[#1] 0x3608 → sciSetBaudrate(sci=0xfff7e400, baud=0x1c200)
──────────────────────────────────────────────────────────────────────────

As you can see I am now at the data abort handler.

Another interesting tidbit is that the Segger GDB server reports:

Reading 64 bytes @ address 0x08001400
WARNING: Failed to read memory @ address 0x62266C32
WARNING: Failed to read memory @ address 0x62266C32
Reading 64 bytes @ address 0x62266C00
WARNING: Failed to read memory @ address 0x62266C00
WARNING: Failed to read memory @ address 0x62266C32
Received monitor command: cp15 6 0 0 0
Reading CP15 register (6,0,0,0 = 0x62266C32)
Received monitor command: cp15 5 0 0 0
Reading CP15 register (5,0,0,0 = 0x00001008)

As you can see, it reports an address that isn't even in the same region: it is in the async RAM section, while I am executing in the RAM section.

I think that it is odd that this is occurring on a branch and not a load or store. It occurs at the same place every time.

Investigation:

I have found some data abort debugging articles and forum posts and have tried to follow as many points as I can.

  • Data Fault Status Register (DFSR)
    • Value: 0x00001008
    • Status [10,3:0]: 0b1000
      • Source: Synchronous External Abort
      • FAR Validity: Valid
    • SD [12]: 0x01
      • Only valid for external aborts, which this is
      • 1 = AXI Slave error (SLVERR), or unsupported exclusive access, for example exclusive access using the AHB peripheral port, caused the abort
    • RW [11]:
      • 0: read access caused the abort
  • Data Fault Address Register (DFAR)
    • Value: 0x62266C32
    • This is really bizarre because this is what is reported in the Segger GDB server but I have no code there! I'm not branching there and I never see when that address is in my stack or registers. Where is it coming from?
  • Auxiliary Data Fault Status Register (ADFSR)
    • Value: 0x00000000
    • CacheWay [27:24]
      • Value:
      • Description: The value returned in this field indicates the cache way or ways in which the error occurred.
    • Side [23:22]
      • Value:
      • Description: The value returned in this field indicates the source of the error.
    • Recoverable error [21]
      • Value:
      • Description: The value returned in this field indicates if the error is recoverable.
      • Decoded: 0 = Unrecoverable error.
    • SideExt [20]:
      • Value: 0b0
      • Description: The value returned in this field indicates the source of the error. See Table 4-32 for the encodings.
      • Decoded:
        • Along with Side, this indicates that the source of the error is "Cache/AXIM"
  • CPSR
    • Value: 0x800003d7
    • T [5]: 0
      • Not in thumb mode
    • M [4:0]: 0b10111
      • Mode: Abort Mode
  • SPSR_abt
    • Value: 0x800003df
    • T [5]: 0
      • Not in thumb mode
    • M [4:0]: 0b11111
      • Mode: System Mode
  • The CPSR and SPSR_abt tell me that I am going from regular privileged code execution straight to the abort handler.
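
The field decoding in the list above can be reproduced with a few shifts and masks. This is a minimal C sketch, not vendor code: the struct and function names are hypothetical, and the bit positions are the ones quoted in the list (Status = DFSR[10] joined with DFSR[3:0], RW = DFSR[11], SD = DFSR[12], T = PSR[5], M = PSR[4:0]).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the Data Fault Status Register used in the list above. */
typedef struct {
    uint8_t status; /* 5-bit status: DFSR[10] as the top bit, then DFSR[3:0] */
    bool    write;  /* DFSR[11] RW: true = write access, false = read access */
    bool    sd;     /* DFSR[12] SD: only meaningful for external aborts */
} dfsr_fields_t;

/* Hypothetical helper: split a raw DFSR value into its fields. */
dfsr_fields_t decode_dfsr(uint32_t dfsr)
{
    dfsr_fields_t f;
    f.status = (uint8_t)((((dfsr >> 10) & 1u) << 4) | (dfsr & 0xFu));
    f.write  = ((dfsr >> 11) & 1u) != 0u;
    f.sd     = ((dfsr >> 12) & 1u) != 0u;
    return f;
}

/* Fields of a program status register (CPSR or SPSR) used in the list above. */
typedef struct {
    uint8_t mode;  /* PSR[4:0] M: 0x17 = Abort, 0x1F = System */
    bool    thumb; /* PSR[5] T: true = Thumb state */
} psr_fields_t;

/* Hypothetical helper: extract the mode and Thumb bit from a PSR value. */
psr_fields_t decode_psr(uint32_t psr)
{
    psr_fields_t f;
    f.mode  = (uint8_t)(psr & 0x1Fu);
    f.thumb = ((psr >> 5) & 1u) != 0u;
    return f;
}
```

For the values in this post, decode_dfsr(0x00001008) yields status 0b01000 with RW = 0 and SD = 1, decode_psr(0x800003d7) yields mode 0x17 (Abort), and decode_psr(0x800003df) yields mode 0x1F (System), both with T = 0.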

The relevant MPU regions are mainly brought over from the HALCoGen generated code; however, I changed the RAM region to be executable:

    static const mpu_region_t tms570lc43x_mpu_regions[NUM_MPU_REGIONS] = {
    {
        .enabled = true,
        .region_number = 0,
        .start_address = 0x0,
        .size = MPU_4_GB,
        .type = MPU_NORMAL_OINC_NONSHARED,
        .permissions = MPU_PRIV_NA_USER_NA_NOEXEC,
        .disabled_sub_regions = 0xFF,
    },
    {
        // FLASH
        .enabled = true,
        .region_number = 1,
        .start_address = FLASH_START,
        .size = MPU_4_MB,
        .type = MPU_NORMAL_OIWTNOWA_NONSHARED,
        .permissions = MPU_PRIV_RO_USER_RO_EXEC,
    },
    {
        // RAM
        .enabled = true,
        .region_number = 2,
        .start_address = RAM_START,
        .size = MPU_512_KB,
        .type = MPU_NORMAL_OIWTNOWA_NONSHARED,
        .permissions = MPU_PRIV_RW_USER_RW_EXEC,
    },
    // ..... other non-relevant regions
    };

I override the weak MPU initialization function in the HALCoGen code, iterate through this array, and set up the regions in the same way. I have used the JTAG and the cp15 instructions to verify that the regions are set up as they are specified here.
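
When checking the regions over JTAG, it helps to know what raw value to expect in the region size register. The sketch below computes it under the assumption that the Cortex-R5 MPU Region Size and Enable Register layout applies: enable in bit 0, size encoded as log2(bytes) - 1 in bits [5:1], and the sub-region disable mask in bits [15:8]. The function name is hypothetical, and it takes a plain byte count because the values behind MPU_512_KB and the other constants are not shown in this post.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build the expected region size/enable register value.
 * size_bytes is assumed to be a power of two between 32 bytes and 4 GB;
 * other values are rounded up to the next power of two in that range. */
uint32_t mpu_rser_value(uint64_t size_bytes, uint8_t disabled_sub_regions,
                        bool enabled)
{
    uint32_t log2_size = 5u; /* smallest region is 32 bytes = 2^5 */

    /* Find the smallest power of two that covers size_bytes, capped at 2^32. */
    while (log2_size < 32u && ((uint64_t)1 << log2_size) < size_bytes) {
        log2_size++;
    }

    return ((uint32_t)disabled_sub_regions << 8) /* bits [15:8] */
         | ((log2_size - 1u) << 1)               /* bits [5:1]  */
         | (enabled ? 1u : 0u);                  /* bit  [0]    */
}
```

With these assumptions, an enabled 512 KB region with no sub-regions disabled should read back as 0x25, an enabled 4 MB region as 0x2B, and the 4 GB background region with all eight sub-regions disabled as 0xFF3F.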

I can also use the GDB "examine" command to view and disassemble the code at the address that is originally in "r7", so I don't think this is an issue with accessing the memory or with the code not being there. I'm very stumped. I have a feeling that I am missing some processor configuration, but I can't find anything that relates data aborts to branches.

Any tips are appreciated.

  • Is the RTOS a separate application image which includes device/clock/peripheral initialization, and its own interrupt vector, linker cmd file? Can the RTOS application boot-up correctly if it is programmed to 0x00000000?

  • Data Fault Address Register (DFAR)
    • Value: 0x62266C32
    • This is really bizarre because this is what is reported in the Segger GDB server but I have no code there! I'm not branching there and I never see when that address is in my stack or registers.

    The stack view *after* the data abort is shown does show 0x62266c33 which only differs in that the least significant bit is set:

    ─────────────────────────────────────────────────────────────── stack ────
    0x8001400│+0x0000: 0x33018333 → 0x33018333 ← $sp
    0x8001404│+0x0004: 0x62266c33 → 0x62266c33

    I'm not sure if the GDB view with the GEF visualization is somehow de-referencing pointers which is triggering reads leading to aborts.

  • To the first responder and Chester: I must admit that this entire issue was caused by my own lapse in intelligence.

    I was using the "ni" (next instruction) command in GDB instead of a step command on that branch. However, when I let the program run it still didn't print, because the serial driver wasn't setting the baud rate properly.

    To the first reply: your suggestion got me looking into booting from 0x00000000, and I was getting the same issue at the same place. This made me evaluate my method more, and right away I realized I was using the wrong command.

    The root of my issue was that my serial driver was setting the baud rate incorrectly. The application that I am using is a hello world application, and I believe it causes the abort once it is done.

    Thanks all for taking a look!

  • Chester,

    Good catch! Thanks, but I did end up resolving the issue. I couldn't figure out how to @ your username in my explanation.