Hi Experts,
I seem to be getting a bus fault in very stable code that runs on several other platforms, including an ST Micro Cortex-M3 controller. I tried to debug it to the best of my limited assembly knowledge but could not work out the cause, so I am looking for help from the ARM gurus on this forum.
I am using GCC for compilation and gdb for debugging. The complete register details and gdb output are below.
The instruction expected to execute next, and the register contents in gdb before it executes:
1: x/i $pc
=> 0x20008e52 <aes_gen_tables+750>:    ldr.w   r3, [r7, #2056] ; 0x808
(gdb) info registers
r0             0x7b52       31570
r1             0xe5         229
r2             0xe5         229
r3             0x7b00       31488
r4             0x2001ac54   536980564
r5             0xa5a5a5a5   -1515870811
r6             0xa5a5a5a5   -1515870811
r7             0x2001a408   536978440
r8             0xa5a5a5a5   -1515870811
r9             0xa5a5a5a5   -1515870811
r10            0xa5a5a5a5   -1515870811
r11            0xa5a5a5a5   -1515870811
r12            0x200162cc   536961740
sp             0x2001a408   0x2001a408
lr             0x20008f8d   536907661
pc             0x20008e53   0x20008e53 <aes_gen_tables+750>
xpsr           0x1000020    16777248
r3 is supposed to be loaded from the address r7 + 2056 bytes (0x2001A408 + 0x808 = 0x2001AC10), so I printed the contents of that address:
(gdb) p/x *0x2001AC10
$1 = 0x66
(gdb) p *0x2001AC10
$2 = 102
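As a cross-check of the address arithmetic, here is a small host-side C sketch. The function name `ldr_effective_address` is hypothetical and only models the base-plus-immediate addressing of `ldr.w r3, [r7, #2056]`; it is not part of the project code.

```c
#include <stdint.h>

/* Effective address of "ldr Rt, [Rn, #imm]": base register plus immediate.
 * Hypothetical helper for illustration; runs on the host, not the target. */
uint32_t ldr_effective_address(uint32_t base, uint32_t imm)
{
    return base + imm;
}
```

With r7 = 0x2001a408 and an immediate of 2056 (0x808), this gives 0x2001ac10, the address printed above.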
I then executed the instruction with the si command in gdb and printed the registers again:
(gdb) si
0x20008e56    444        ((unsigned long)MUL(0x0D, x) << 16) ^
1: x/i $pc
=> 0x20008e56 <aes_gen_tables+754>:    cmp     r3, #0
(gdb) info registers
r0             0x7b52       31570
r1             0xe5         229
r2             0xe5         229
r3             0x66         102
r4             0x2001ac54   536980564
r5             0xa5a5a5a5   -1515870811
r6             0xa5a5a5a5   -1515870811
r7             0x2001a408   536978440
r8             0xa5a5a5a5   -1515870811
r9             0xa5a5a5a5   -1515870811
r10            0xa5a5a5a5   -1515870811
r11            0xa5a5a5a5   -1515870811
r12            0x200162cc   536961740
sp             0x2001a408   0x2001a408
lr             0x20008f8d   536907661
pc             0x20008e57   0x20008e57 <aes_gen_tables+754>
xpsr           0x1000020    16777248
This means r3 was loaded correctly with the value 102. The next instruction (cmp r3, #0) should simply compare r3 with 0, but the moment I step over it gdb hangs and prints nothing further:
(gdb) si
I stopped gdb with CTRL+C to see where it was executing:
Program received signal SIGINT, Interrupt.
FaultISR () at ../../common/startup_gcc.c:279
279    }
1: x/i $pc
=> 0x2000a240 <FaultISR+4>:    b.n     0x2000a240 <FaultISR+4>
Oops. The output above shows that after the cmp instruction the fault handler was hit, and the processor is now spinning in the fault handler's while loop. I dumped the register information again:
(gdb) info registers
r0             0x0          0
r1             0xa0e700     10544896
r2             0xa0e700     10544896
r3             0x20019980   536975744
r4             0x2001ac54   536980564
r5             0xa5a5a5a5   -1515870811
r6             0xa5a5a5a5   -1515870811
r7             0x2001991c   536975644
r8             0xa5a5a5a5   -1515870811
r9             0xa5a5a5a5   -1515870811
r10            0xa5a5a5a5   -1515870811
r11            0xa5a5a5a5   -1515870811
r12            0x200162cc   536961740
sp             0x2001991c   0x2001991c <pui32Stack+4024>
lr             0xfffffff1   -15
pc             0x2000a241   0x2000a241 <FaultISR+4>
xpsr           0x1000023    16777251
(gdb) bt
#0  FaultISR () at ../../common/startup_gcc.c:279
#1  <signal handler called>
#2  0x2000ee5c in xPortPendSVHandler () at ../../third_party/FreeRTOS/source/portable/GCC/ARM_CM4/port.c:317
#3  <signal handler called>
#4  0x2000ed82 in prvPortStartFirstTask () at ../../third_party/FreeRTOS/source/portable/GCC/ARM_CM4/port.c:211
#5  0x2000edb2 in xPortStartScheduler () at ../../third_party/FreeRTOS/source/portable/GCC/ARM_CM4/port.c:244
#6  0x000000fe in ?? ()
#7  0x000000fe in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I see strange values in r1 and r2 (0xa0e700) and am not sure where they come from. I then read the fault status registers (CFSR at 0xE000ED28, BFAR at 0xE000ED38, HFSR at 0xE000ED2C, DFSR at 0xE000ED30, AFSR at 0xE000ED3C):
(gdb) p/x *((unsigned long *)(0xE000ED28))
$3 = 0x8200
(gdb) p/x *((unsigned long *)(0xE000ED38))
$4 = 0xa0e700
(gdb) p/x *((unsigned long *)(0xE000ED2C))
$5 = 0x40000000
(gdb) p/x *((unsigned long *)(0xE000ED30))
$6 = 0x0
(gdb) p/x *((unsigned long *)(0xE000ED3C))
$7 = 0x0
(gdb)
1) CFSR value 0x8200 indicates a Bus Fault (PRECISERR and BFARVALID bits set).
2) BFAR value 0xa0e700 is the faulting address. This is a strange address: I have no SRAM mapped there, and the cmp instruction after which the fault occurred was not expected to access it. The same value is in r1 and r2 after the fault.
3) HFSR value 0x40000000 (FORCED bit set) shows that the fault was escalated to a hard fault.
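The reading of the status registers above can be sketched as a small decoder in C. This is a minimal host-side sketch, assuming the ARMv7-M bit layout for CFSR and HFSR; the macro and function names are hypothetical and are not taken from the vendor headers or the project.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* BusFault status bits inside CFSR (0xE000ED28), ARMv7-M layout. */
#define CFSR_IBUSERR     (1u << 8)   /* instruction bus error           */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error          */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error        */
#define CFSR_UNSTKERR    (1u << 11)  /* fault on exception unstacking   */
#define CFSR_STKERR      (1u << 12)  /* fault on exception stacking     */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address */

/* HardFault status bits in HFSR (0xE000ED2C). */
#define HFSR_VECTTBL     (1u << 1)   /* fault on vector table read      */
#define HFSR_FORCED      (1u << 30)  /* escalated configurable fault    */

/* Nonzero if CFSR reports a precise data bus error with a valid BFAR. */
int is_precise_bus_fault(uint32_t cfsr)
{
    return (cfsr & CFSR_PRECISERR) && (cfsr & CFSR_BFARVALID);
}

/* Nonzero if HFSR reports a fault escalated to HardFault. */
int is_forced_hard_fault(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}

/* Print a short summary of the values read from the target. */
void print_fault_summary(uint32_t cfsr, uint32_t hfsr, uint32_t bfar)
{
    if (is_precise_bus_fault(cfsr))
        printf("precise data bus fault at 0x%08" PRIx32 "\n", bfar);
    if (cfsr & CFSR_IMPRECISERR)
        printf("imprecise data bus fault (BFAR not meaningful)\n");
    if (cfsr & (CFSR_STKERR | CFSR_UNSTKERR))
        printf("bus fault during exception stacking/unstacking\n");
    if (is_forced_hard_fault(hfsr))
        printf("escalated to hard fault (FORCED)\n");
}
```

Calling `print_fault_summary(0x8200, 0x40000000, 0xa0e700)` with the values from the dump reports a precise data bus fault with a valid BFAR, escalated to a hard fault.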
I am trying to understand where the value 0xa0e700 in r1 and r2 comes from and why the cmp instruction appears to cause the fault.
This is open source code that works on various platforms. The fault is really strange to me, and after analyzing it for more than 2 days I have not been able to find the root cause. I suspected a stack overflow, but I noticed that the vApplicationStackOverflowHook callback is never hit.
I would also like to know where, in the getting_started_with_wlan_station gcc example, the stack size is defined that FreeRTOS uses for the spawned task.
Problem is 100% reproducible every time.