SYS/BIOS exception in ti_sysbios_utils_Load_updateCurrentThreadTime__E

Derek Wilson

I'm running into an exception in the code section ti_sysbios_utils_Load_updateCurrentThreadTime__E. I believe this is the function causing the exception because, when the exception is hit, the PC lies within the code range for that function. I have several LOG_print function calls throughout my application, and if I conditionally compile them all out, I do not hit the exception. Also, if I move some of the LOG_print calls to different locations in my code, I do not hit the exception.

I'm having difficulty finding ways to track this down, as I do not have a call stack upon hitting the exception. Upon searching the forums, I thought that the Extended Trace Buffer (ETB) utility would be useful; however, as far as I can tell, it is not available on the C6748.

I'm running on a C6748, with CCS 6.1 and SYS/BIOS 6.40.1.15. I am using the XDS100v3 JTAG emulator.

Any ideas on how to track down what's leading to the exception?

over 8 years ago

0 Derek Wilson over 8 years ago

Intellectual 380 points

Furthermore, the addition of breakpoints completely changes the behavior (with the application crashing earlier or later), and the addition of more trace logging (using LOG_print) completely changes the behavior as well. I just can't get a clear picture of where the application is crashing or what is causing it.

0 judahvang over 8 years ago in reply to Derek Wilson

TI__Mastermind 32475 points

Derek,

Can you tell me what the IRP register is when the exception happens?

It sounds to me like an interrupt is causing corruption, and then when the code gets back into LOG_printf, an exception occurs.

Judah

0 Derek Wilson over 8 years ago in reply to judahvang

Judah,

Thanks for helping me look into this!

That does sound possible, and I hadn't considered that. I do have 2 interrupts occurring which set off DMA events from external memory (an FPGA). The interrupts are asynchronous, and each occurs at approximately 867 Hz. IRP is 0xc0202fa8, and according to my .map file, that return address lies within the Load module's idle function:

c0202e20 000001e0 : BIOS.obj (.text:ti_sysbios_utils_Load_idleFxn__E)
c0203000 000001e0 ti.targets.rts6000.ae674 : Error.oe674 (.text:xdc_runtime_Error_policyDefault__E)

The complete register dump from the console is:

B13=0x0
B14=0xc02632e0 B15=0xc0259e30
B16=0xc02400b8 B17=0x108
B18=0x1a4 B19=0x78
B20=0x8 B21=0x69
B22=0xffffffff B23=0x0
B24=0x0 B25=0x0
B26=0x2 B27=0x90
B28=0x29 B29=0x4000
B30=0x118 B31=0xc021b92c
NTSR=0x1000e
ITSR=0xf
IRP=0xc0202fa8
SSR=0x0
AMR=0x0
RILC=0x24
ILC=0x0
Exception at 0xc0210004
EFR=0x2 NRP=0xc0210004
Internal exception: IERR=0x2
Fetch packet exception
ti.sysbios.family.c64p.Exception: line 256: E_exceptionMax: pc = 0xc0210004, sp = 0xc0259e30.
xdc.runtime.Error.raise: terminating execution

And from the .map file, you can see the code which resides at the exception location:

c020fec0 000000c0 : BIOS.obj (.text:ti_sysbios_utils_Load_logCPULoad__I)
c020ff80 000000c0 : BIOS.obj (.text:ti_sysbios_utils_Load_updateCurrentThreadTime__E)
c0210040 000000c0 ti.uia.loggers.ae674 : LoggerStopMode.oe674 (.text:ti_uia_loggers_LoggerStopMode_flush__E)

0 Derek Wilson over 8 years ago in reply to Derek Wilson

Judah,

Proceeding with your assumption that interrupts were messing things up, I think I have fixed the problem by disabling HWIs and SWIs around the call to LOG_print().

What I had looks like this:

#ifdef DO_LOGGING
#define TRACE2(format, v...) do { Log_print2(Diags_USER1, format, ##v); } while(0)
#else
#define TRACE2(format, v...) ((void)0)
#endif

So in my code, I could just do:

TRACE2("Here's a log statement: val1 = %d, val2 = %d", val1, val2);

I have similar macros for Log_print0() - Log_print6().

I have changed this to:

#ifdef DO_LOGGING
#define TRACE2(format, v... ) do { Hwi_disable(); Swi_disable(); Log_print2(Diags_USER1, format, ##v); Swi_enable(); Hwi_enable(); } while(0)
#else
#define TRACE2(format, v...) ((void)0)
#endif

This seems to have stopped my crashing. However, will there be any unforeseen implications (besides the possible latency associated with disabling interrupts for undetermined amounts of time)? These TRACE2() statements are almost all called from HWI or SWI context. I know that in DSP/BIOS, Swi_disable/Swi_enable could not be called from HWI context, but this has been fixed in SYS/BIOS.

Any thoughts? Thanks!
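
One implication worth checking, offered as a sketch rather than a confirmed diagnosis: an unconditional enable at the end of the macro turns interrupts back on even when the caller already had them disabled. SYS/BIOS also provides key-based calls (Hwi_disable() returns a key that Hwi_restore() takes, and Swi_disable()/Swi_restore() work the same way) that put back whatever state was there before. The host-side simulation below only models that difference with a boolean flag; every sim_ name is a stand-in invented for illustration, not a SYS/BIOS symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the global interrupt enable state (assumption, host-side only). */
static bool sim_gie = true;

/* Models a key-returning disable: report the previous state, then disable. */
static bool sim_disable(void)
{
    bool key = sim_gie;
    sim_gie = false;
    return key;
}

/* Models a key-based restore: put back exactly the state the key recorded. */
static void sim_restore(bool key) { sim_gie = key; }

/* Models an unconditional enable. */
static void sim_enable(void) { sim_gie = true; }

/* Wrapper in the style of the modified TRACE2: disable, log, enable. */
static void trace_with_enable(void)
{
    (void)sim_disable();
    /* the log call would go here */
    sim_enable();
}

/* Key-based variant: disable, log, restore the caller's state. */
static void trace_with_restore(void)
{
    bool key = sim_disable();
    /* the log call would go here */
    sim_restore(key);
}

/* Call trace_fn from a region that already disabled interrupts and report
 * whether interrupts are (wrongly) enabled when it returns. */
static bool enabled_after_trace_in_critical_section(void (*trace_fn)(void))
{
    assert(trace_fn != NULL);
    sim_gie = true;
    bool outer = sim_disable();   /* caller enters its own critical section */
    trace_fn();
    bool seen = sim_gie;          /* state the caller sees after the trace */
    sim_restore(outer);
    return seen;
}
```

In this model, the enable-based wrapper leaves interrupts on inside the caller's critical section, while the key-based wrapper leaves them off as the caller expects.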

0 Derek Wilson over 8 years ago in reply to Derek Wilson

Is Log_print() a re-entrant function? Perhaps the problem wasn't interrupts corrupting something, but rather nested interrupts causing calls to Log_print() to be nested?

0 Derek Wilson over 8 years ago in reply to judahvang

Despite my previous comments, disabling interrupts around calls to Log_print() did not fix the problem. I added in just a couple more lines of code (in a section which doesn't even get executed), and the crashing is back even with interrupts disabled. I'm back to being stumped!

0 judahvang over 8 years ago in reply to Derek Wilson

Derek,

You should be able to call Log_print in Hwi or Swi context. I don't believe that is the problem as you have already confirmed that disabling Hwis and Swis around Log_print did not solve the problem.

Your exception log in the first few posts says it's a "Fetch packet exception". The interesting part is that, according to your memory map, the exception occurred inside a real function. Typically you get this error if you're trying to execute from an area that is not really code but data instead. Is it possible that memory got corrupted in that part of the code? Can you confirm by looking at that memory before the exception and then after the exception to see if perhaps something corrupted it?
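
The before/after comparison can be sketched on the host as follows. This assumes two snapshots of the same region have been captured as arrays of 32-bit words (for example, by saving the memory from the debugger); the function names are illustrative assumptions, not a TI utility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first word that differs between two snapshots
 * of the same memory region, or -1 if the snapshots are identical. */
static long first_changed_word(const uint32_t *before, const uint32_t *after,
                               size_t words)
{
    assert(before != NULL && after != NULL);
    for (size_t i = 0; i < words; i++) {
        if (before[i] != after[i]) {
            return (long)i;
        }
    }
    return -1;
}

/* Convert a word index back to a target address, given the region's base.
 * The result is a candidate address for a hardware watchpoint. */
static uint32_t changed_address(uint32_t region_base, long word_index)
{
    assert(word_index >= 0);
    return region_base + 4u * (uint32_t)word_index;
}

/* Made-up sample data: the third word was overwritten between snapshots. */
static const uint32_t kBefore[4] = { 0x11u, 0x22u, 0x33u, 0x44u };
static const uint32_t kAfter[4]  = { 0x11u, 0x22u, 0x99u, 0x44u };
```

If a word inside a code section changes between the two snapshots, something is writing over program memory, and the address of that word is a natural place to watch.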

Judah

0 Derek Wilson over 8 years ago in reply to judahvang

Thanks, Judah.

Okay, so I opened a memory view showing that section of memory upon connecting to the target, then I went ahead and let the application run. Sure enough, I saw a few changed values in program data right around there. The same memory location seemed to be getting changed every time I hit the exception, so I put a hardware watchpoint at that address, reloaded, reran, and my application halted at the watchpoint. The halt happened upon setting a member variable of an object I had dynamically allocated. So I checked the memory map, and sure enough, my heap starts right before that section of memory, so obviously I have not sized it appropriately.

I went ahead and statically allocated all the large objects, as I've been meaning to get rid of all my dynamic allocation anyways (bad DSP programming practice, I know!!), and for now it seems that I might have fixed the problem. It all makes sense too, so that's good.
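
A minimal sketch of the kind of static allocation I mean is below. The Frame type, the sizes, and the helper names are hypothetical and only illustrate the pattern; they are not my actual objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_WORDS 1024u  /* hypothetical buffer size, in 32-bit words */
#define FRAME_COUNT 2u     /* hypothetical number of large objects */

/* A large object that would previously have been allocated from the heap. */
typedef struct {
    uint32_t data[FRAME_WORDS];
    uint32_t used;
} Frame;

/* Storage is reserved at link time, so its size and address show up in the
 * .map file instead of depending on how the heap was sized. */
static Frame g_frames[FRAME_COUNT];

/* Hand out one of the statically allocated objects, or NULL if out of range. */
static Frame *frame_get(size_t index)
{
    return (index < FRAME_COUNT) ? &g_frames[index] : NULL;
}

/* Append a word with an explicit bounds check; returns 0 on success, -1 if full. */
static int frame_push(Frame *f, uint32_t word)
{
    assert(f != NULL);
    if (f->used >= FRAME_WORDS) {
        return -1;
    }
    f->data[f->used++] = word;
    return 0;
}
```

The point of the pattern is that the linker places the storage, so if it does not fit in the target memory the build fails rather than the application corrupting neighboring memory at run time.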

Thanks for your help! I'll now have something else to look for upon hitting random exceptions. I had verified the stack wasn't getting corrupted but didn't even think about the heap growing too large : - /

I'll go ahead and verify this question as answered...thanks!

0 judahvang over 8 years ago in reply to Derek Wilson

Awesome. Glad you were able to figure it out.
