Issue:
I am having an odd issue where a data abort occurs on a branch instruction.
Setup:
I have the following:
- A Hercules LaunchPad TMS570LC43X connected to my laptop via the Micro USB cable.
- A Segger J-Link connected to the JTAG port and to my laptop.
- Driver code generated with HALCoGen, targeting the GCC toolchain.
Process:
My simple bootloader starts in FLASH with the HALCoGen startup code, which sets up the registers, stacks, MPU, and cache, and then jumps to my bootloader's main function. It prints to the UART fine, and the drivers appear to compile and work correctly.
I have a breakpoint set so that right before I jump to the RTOS (in RAM) I can load the RTOS over JTAG and then I continue. This seems to work fine. I can jump to the proper address in RAM and step through instructions and watch the registers get loaded with what they are supposed to be loaded with.
Symptom:
I get to the following spot in the RTOS startup code:
─────────────────────────────────────────────────────────── registers ────
$r0  : 0x8000268 → 0xe0411000 → 0xe0411000
$r1  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r2  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r3  : 0x8011c60 → 0x000000 → 0xea0018e6 → 0xea0018e6
$r4  : 0x200003df → 0x200003df
$r5  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r6  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r7  : 0x80002d5 → 0x70b508f6 → 0x70b508f6
$r8  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r9  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r10 : 0x000000 → 0xea0018e6 → 0xea0018e6
$r11 : 0x000000 → 0xea0018e6 → 0xea0018e6
$r12 : 0x000000 → 0xea0018e6 → 0xea0018e6
$sp  : 0x8011c60 → 0x000000 → 0xea0018e6 → 0xea0018e6
$lr  : 0x8000278 → 0xe59f004c → 0xe59f004c
$pc  : 0x8000274 → 0xe12fff17 → 0xe12fff17
$cpsr: [negative zero carry overflow INTERRUPT FAST thumb]
─────────────────────────────────────────────────────────────── stack ────
0x8011c60│+0x0000: 0x000000 → 0xea0018e6 → 0xea0018e6 ← $r3, $sp
0x8011c64│+0x0004: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c68│+0x0008: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c6c│+0x000c: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c70│+0x0010: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c74│+0x0014: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c78│+0x0018: 0x000000 → 0xea0018e6 → 0xea0018e6
0x8011c7c│+0x001c: 0x000000 → 0xea0018e6 → 0xea0018e6
──────────────────────────────────────────────────────── code:arm:ARM ────
    0x8000268 <bsp_start_vector_table_end+296> sub r1, r1, r0
    0x800026c <bsp_start_vector_table_end+300> ldr r7, [pc, #84] ; 0x80002c8 <bsp_start_hook_0_done+80>
    0x8000270 <bsp_start_vector_table_end+304> add r7, r7, r1
 →  0x8000274 <bsp_start_vector_table_end+308> bx r7
    0x8000278 <bsp_start_hook_0_done+0> ldr r0, [pc, #76] ; 0x80002cc <bsp_start_hook_0_done+84>
    0x800027c <bsp_start_hook_0_done+4> ldr r1, [pc, #76] ; 0x80002d0 <bsp_start_hook_0_done+88>
    0x8000280 <bsp_start_hook_0_done+8> cmp r0, r1
    0x8000284 <bsp_start_hook_0_done+12> beq 0x8000298 <bsp_start_hook_0_done+32>
    0x8000288 <bsp_start_hook_0_done+16> ldm r1!, {r2, r3, r4, r5, r6, r7, r8, r9}
─────────────────────────────────── source:../../../bsps/a[...].S+476 ────
    471 ldr r1, =.Lget_absolute_pc
    472 .Lget_absolute_pc:
    473 sub r1, r0
    474 ldr r7, =bsp_start_hook_0
    475 add r7, r1
 →  476 bx r7
    477
    478 /* Allow bsp_start_hook_0() hooks to jump to this label */
    479 bsp_start_hook_0_done:
    480
    481 /*
───────────────────────────────────────────────────────────── threads ────
[#0] Id 1, stopped 0x8000274 in bsp_start_vector_table_end (), reason: SINGLE STEP
─────────────────────────────────────────────────────────────── trace ────
[#0] 0x8000274 → bsp_start_vector_table_end()
This is a GDB view with the GEF visualization showing the registers, source code, and disassembled code in a single view.
If you look at the "code" section the next instruction is going to branch to the address contained in "r7". The "r7" register contains the address 0x80002d5.
However, using the GDB command "ni" (next instruction) takes me to the abort handler:
─────────────────────────────────────────────────────────── registers ────
$r0  : 0xfff7e400 → 0x000000 → 0xea0018e6 → 0xea0018e6
$r1  : 0x01c200 → 0xffffd9ff → 0x000000 → 0xea0018e6 → 0xea0018e6
$r2  : 0x00e898 → ; <UNDEFINED> instruction: 0xffffffff
$r3  : 0x3000032 → 0x3000032
$r4  : 0xfff7e400 → 0x000000 → 0xea0018e6 → 0xea0018e6
$r5  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r6  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r7  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r8  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r9  : 0x000000 → 0xea0018e6 → 0xea0018e6
$r10 : 0x000000 → 0xea0018e6 → 0xea0018e6
$r11 : 0x000000 → 0xea0018e6 → 0xea0018e6
$r12 : 0x000000 → 0xea0018e6 → 0xea0018e6
$sp  : 0x8001400 → 0x33018333 → 0x33018333
$lr  : 0x003608 → 0xa00000f → 0xa00000f
$pc  : 0x000010 → 0xeafffffe → 0xeafffffe
$cpsr: [NEGATIVE zero carry overflow INTERRUPT FAST thumb]
─────────────────────────────────────────────────────────────── stack ────
0x8001400│+0x0000: 0x33018333 → 0x33018333 ← $sp
0x8001404│+0x0004: 0x62266c33 → 0x62266c33
0x8001408│+0x0008: 0x62636bb3 → 0x62636bb3
0x800140c│+0x000c: 0x681b62a3 → 0x000000 → 0xea0018e6 → 0xea0018e6
0x8001410│+0x0010: 0xb94f2001 → 0xb94f2001
0x8001414│+0x0014: 0xe7b64641 → 0xe7b64641
0x8001418│+0x0018: 0x4620f7ff → 0x4620f7ff
0x800141c│+0x001c: 0xfc112001 → 0x000000 → 0xea0018e6 → 0xea0018e6
──────────────────────────────────────────────────────── code:arm:ARM ────
    0x4 <undefEntry+0> b 0x4
    0x8 <svcEntry+0> b 0x8
    0xc <prefetchEntry+0> b 0xc
●→  0x10 <dataEntry+0> b 0x10
    0x14 <dataEntry+4> b 0xdb8 <phantomInterrupt>
    0x18 <dataEntry+8> ldr pc, [pc, #-432] ; 0xfffffe70
    0x1c <dataEntry+12> ldr pc, [pc, #-432] ; 0xfffffe74
    0x20 <deregister_tm_clones+0> ldr r0, [pc, #24] ; 0x40 <deregister_tm_clones+32>
    0x24 <deregister_tm_clones+4> ldr r3, [pc, #24] ; 0x44 <deregister_tm_clones+36>
───────────────────────────────────────────────────────────── threads ────
[#0] Id 1, stopped 0x10 in dataEntry (), reason: BREAKPOINT
─────────────────────────────────────────────────────────────── trace ────
[#0] 0x10 → dataEntry()
[#1] 0x3608 → sciSetBaudrate(sci=0xfff7e400, baud=0x1c200)
──────────────────────────────────────────────────────────────────────────
As you can see I am now at the data abort handler.
Another interesting tidbit is that the Segger GDB server reports:
Reading 64 bytes @ address 0x08001400
WARNING: Failed to read memory @ address 0x62266C32
WARNING: Failed to read memory @ address 0x62266C32
Reading 64 bytes @ address 0x62266C00
WARNING: Failed to read memory @ address 0x62266C00
WARNING: Failed to read memory @ address 0x62266C32
Received monitor command: cp15 6 0 0 0
Reading CP15 register (6,0,0,0 = 0x62266C32)
Received monitor command: cp15 5 0 0 0
Reading CP15 register (5,0,0,0 = 0x00001008)
As you can see, the reported address isn't even in the same region: it falls in the async RAM section, while I am executing from the RAM section.
I think that it is odd that this is occurring on a branch and not a load or store. It occurs at the same place every time.
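For reference on the branch itself, here is a minimal sketch of how BX interprets its register operand on ARMv7, applied to the r7 value from the dump above. Bit 0 of the operand selects the instruction set state after the branch (1 = Thumb, 0 = ARM), and the core fetches from the address with bit 0 cleared. The type and function names are hypothetical and are not taken from HALCoGen or the RTOS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type for illustration only. */
typedef struct {
    uint32_t address; /* address the core fetches from after the branch */
    bool     thumb;   /* true if the core switches to Thumb state */
} bx_target_t;

/* Model of how "BX Rm" interprets its operand:
 * bit 0 selects the instruction set, the remaining bits form the address.
 * An ARM-state target with bit 1 set is UNPREDICTABLE and is not modelled. */
bx_target_t bx_decode(uint32_t rm)
{
    bx_target_t t;
    t.thumb   = (rm & 1u) != 0u;
    t.address = rm & ~UINT32_C(1);
    return t;
}

/* Self-check using the r7 value shown in the register dump. */
void bx_example_from_dump(void)
{
    bx_target_t t = bx_decode(UINT32_C(0x080002d5));
    assert(t.thumb);
    assert(t.address == UINT32_C(0x080002d4));
}
```

Applied to r7 = 0x80002d5, this gives a Thumb-state target at 0x80002d4.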
Investigation:
I have found some data abort debugging articles and forum posts and have tried to follow as many points as I can.
- Data Fault Status Register (DFSR)
  - Value: 0x00001008
  - Status [10,3:0]: 0b01000
    - Source: Synchronous External Abort
    - FAR Validity: Valid
  - SD [12]: 0b1
    - Only valid for external aborts, which this is
    - 1 = AXI Slave error (SLVERR), or unsupported exclusive access, for example exclusive access using the AHB peripheral port, caused the abort
  - RW [11]: 0b0
    - 0: read access caused the abort
- Data Fault Address Register (DFAR)
  - Value: 0x62266C32
This is really bizarre: it is the address reported by the Segger GDB server, but I have no code there! I'm not branching there, and I never see that address in my stack or registers. Where is it coming from?
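The DFSR decode above can be reproduced with a few shifts and masks. The following is a minimal sketch, assuming the field positions listed above (SD in bit 12, RW in bit 11, and the fault status formed from bit 10 followed by bits 3:0). Only a handful of status encodings are named; the helper names are hypothetical and do not come from HALCoGen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for illustration only. */
typedef struct {
    uint8_t status; /* 5-bit fault status: DFSR bit 10 followed by bits 3:0 */
    bool    write;  /* RW, bit 11: true if a write access caused the abort */
    bool    slverr; /* SD, bit 12: true = AXI slave error, false = AXI decode error */
} dfsr_fields_t;

dfsr_fields_t dfsr_decode(uint32_t dfsr)
{
    dfsr_fields_t f;
    f.status = (uint8_t)((((dfsr >> 10) & 1u) << 4) | (dfsr & 0xFu));
    f.write  = ((dfsr >> 11) & 1u) != 0u;
    f.slverr = ((dfsr >> 12) & 1u) != 0u;
    return f;
}

/* Names only a few common encodings; anything else needs the TRM. */
const char *dfsr_status_name(uint8_t status)
{
    switch (status) {
    case 0x00: return "background";
    case 0x01: return "alignment";
    case 0x0D: return "permission";
    case 0x08: return "synchronous external abort";
    case 0x16: return "asynchronous external abort";
    default:   return "other (see the TRM)";
    }
}

/* Self-check using the DFSR value reported by the GDB server. */
void dfsr_example_from_post(void)
{
    dfsr_fields_t f = dfsr_decode(UINT32_C(0x00001008));
    assert(strcmp(dfsr_status_name(f.status), "synchronous external abort") == 0);
    assert(!f.write);
    assert(f.slverr);
}
```

For 0x00001008 this yields a synchronous external abort on a read access with SD set, matching the manual decode.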
- Auxiliary Data Fault Status Register (ADFSR)
  - Value: 0x00000000
  - CacheWay [27:24]
    - Value: 0b0000
    - Description: The value returned in this field indicates the cache way or ways in which the error occurred.
  - Side [23:22]
    - Value: 0b00
    - Description: The value returned in this field indicates the source of the error.
  - Recoverable error [21]
    - Value: 0b0
    - Description: The value returned in this field indicates if the error is recoverable.
    - Decoded: 0 = Unrecoverable error.
  - SideExt [20]
    - Value: 0b0
    - Description: The value returned in this field indicates the source of the error. See Table 4-32 for the encodings.
    - Decoded: Along with Side, this indicates that the source of the error is "Cache/AXIM"
- CPSR
  - Value: 0x800003d7
  - T [5]: 0
    - Not in Thumb mode
  - M [4:0]: 0b10111
    - Mode: Abort Mode
- SPSR_abt
  - Value: 0x800003df
  - T [5]: 0
    - Not in Thumb mode
  - M [4:0]: 0b11111
    - Mode: System Mode
- The CPSR and SPSR_abt tell me that I am going from regular privileged code execution straight to the abort handler.
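The CPSR and SPSR_abt decode above can be checked the same way. This is a minimal sketch, assuming the standard ARMv7 layout used in the list (mode in bits 4:0, T in bit 5); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for illustration only. */
typedef struct {
    uint8_t mode;  /* M, bits 4:0 */
    bool    thumb; /* T, bit 5 */
} psr_fields_t;

psr_fields_t psr_decode(uint32_t psr)
{
    psr_fields_t f;
    f.mode  = (uint8_t)(psr & 0x1Fu);
    f.thumb = ((psr >> 5) & 1u) != 0u;
    return f;
}

/* ARMv7 processor mode encodings. */
const char *psr_mode_name(uint8_t mode)
{
    switch (mode) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "unknown";
    }
}

/* Self-check using the CPSR and SPSR_abt values from the list above. */
void psr_example_from_post(void)
{
    psr_fields_t cpsr = psr_decode(UINT32_C(0x800003d7));
    psr_fields_t spsr = psr_decode(UINT32_C(0x800003df));
    assert(strcmp(psr_mode_name(cpsr.mode), "Abort") == 0 && !cpsr.thumb);
    assert(strcmp(psr_mode_name(spsr.mode), "System") == 0 && !spsr.thumb);
}
```

Both values decode with T clear; the CPSR is in Abort mode and the saved SPSR_abt is in System mode.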
The relevant MPU regions are mainly brought over from the HALCoGen generated code, however, I changed the RAM region to be executable:
static const mpu_region_t tms570lc43x_mpu_regions[NUM_MPU_REGIONS] = {
    {
        .enabled = true,
        .region_number = 0,
        .start_address = 0x0,
        .size = MPU_4_GB,
        .type = MPU_NORMAL_OINC_NONSHARED,
        .permissions = MPU_PRIV_NA_USER_NA_NOEXEC,
        .disabled_sub_regions = 0xFF,
    },
    {
        // FLASH
        .enabled = true,
        .region_number = 1,
        .start_address = FLASH_START,
        .size = MPU_4_MB,
        .type = MPU_NORMAL_OIWTNOWA_NONSHARED,
        .permissions = MPU_PRIV_RO_USER_RO_EXEC,
    },
    {
        // RAM
        .enabled = true,
        .region_number = 2,
        .start_address = RAM_START,
        .size = MPU_512_KB,
        .type = MPU_NORMAL_OIWTNOWA_NONSHARED,
        .permissions = MPU_PRIV_RW_USER_RW_EXEC,
    },
    // ..... other non-relevant regions
};
I override the weak MPU initialization function in the HALCoGen code, iterate through this array, and set up the regions in the same way. I have used JTAG and the cp15 instructions to verify that the regions are set up as specified here.
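One way to cross-check the values read back over JTAG is to compute the expected contents of the MPU Region Size and Enable Register by hand. The sketch below is a hypothetical helper: it is not HALCoGen code and does not use the MPU_* macros above, whose encodings are not shown here. It assumes the ARMv7-R PMSA layout, with the enable bit in bit 0, the size field N in bits 5:1 (region size is 2^(N+1) bytes), and the sub-region disable mask in bits 15:8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Expected value of the MPU Region Size and Enable Register for a region.
 * size_bytes must be a power of two between 32 bytes and 4 GB. */
uint32_t mpu_size_enable_value(uint64_t size_bytes,
                               uint8_t subregion_disable,
                               bool enable)
{
    assert(size_bytes >= 32u && size_bytes <= (UINT64_C(1) << 32));
    assert((size_bytes & (size_bytes - 1u)) == 0u);

    /* Find the exponent such that size_bytes == 2^exponent. */
    uint32_t exponent = 0;
    while ((UINT64_C(1) << exponent) < size_bytes) {
        exponent++;
    }

    /* The size field encodes 2^(N+1) bytes, so N = exponent - 1. */
    uint32_t rsize = exponent - 1u;

    return ((uint32_t)subregion_disable << 8) | (rsize << 1) | (enable ? 1u : 0u);
}
```

Under these assumptions, an enabled 512 KB region with no sub-regions disabled should read back as 0x25, an enabled 4 MB region as 0x2B, and an enabled 4 GB region with all eight sub-regions disabled as 0xFF3F.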
I can also use the GDB "examine" command to view and disassemble the code at the address that is originally in "r7", so I don't think the problem is that the memory is inaccessible or that the code isn't there. I'm stumped. I suspect I am missing some processor configuration, but I can't find anything that relates data aborts to branch instructions.
Any tips are appreciated.