SYS/BIOS exception in ti_sysbios_utils_Load_updateCurrentThreadTime__E

I'm running into an exception in the code section ti_sysbios_utils_Load_updateCurrentThreadTime__E. I believe this is the function causing the exception because, when the exception is hit, the PC lies within its code range. I have several Log_print function calls throughout my application, and if I conditionally compile them all out, I do not hit the exception. Likewise, if I move some of the Log_print calls to a different location in my code, I do not hit the exception.

I'm having difficulty tracking this down, as I do not have a call stack when the exception hits. From searching the forums, I thought the Extended Trace Buffer (ETB) utility would be useful; however, as far as I can tell, it is not available on the C6748.

I'm running on a C6748, with CCS 6.1 and SYS/BIOS 6.40.1.15.  I am using the XDS100v3 JTAG emulator.

Any ideas on how to track down what's leading to the exception?

  • Furthermore, adding breakpoints completely changes the behavior (the application crashes earlier or later), and adding more trace logging (using Log_print) changes the behavior as well. I just can't get a clear picture of where the application is crashing or what is causing it.
  • Derek,

    Can you tell me what the IRP register is when the exception happens?

    It sounds to me like an interrupt is causing a corruption, and then when the code gets back into Log_print, an exception occurs.

    Judah

  • Judah,

    Thanks for helping me look into this!

    That does sound possible, and I hadn't considered that. I do have 2 interrupts occurring which set off DMA events from external memory (an FPGA). The interrupts are asynchronous and each is at approximately 867 Hz. It looks like IRP=0xc0202fa8, and looking at my .map file, that return address lies within the Idle function:

    c0202e20 000001e0 : BIOS.obj (.text:ti_sysbios_utils_Load_idleFxn__E)
    c0203000 000001e0 ti.targets.rts6000.ae674 : Error.oe674 (.text:xdc_runtime_Error_policyDefault__E)

    The complete register dump from the console is:

    B13=0x0
    B14=0xc02632e0 B15=0xc0259e30
    B16=0xc02400b8 B17=0x108
    B18=0x1a4 B19=0x78
    B20=0x8 B21=0x69
    B22=0xffffffff B23=0x0
    B24=0x0 B25=0x0
    B26=0x2 B27=0x90
    B28=0x29 B29=0x4000
    B30=0x118 B31=0xc021b92c
    NTSR=0x1000e
    ITSR=0xf
    IRP=0xc0202fa8
    SSR=0x0
    AMR=0x0
    RILC=0x24
    ILC=0x0
    Exception at 0xc0210004
    EFR=0x2 NRP=0xc0210004
    Internal exception: IERR=0x2
    Fetch packet exception
    ti.sysbios.family.c64p.Exception: line 256: E_exceptionMax: pc = 0xc0210004, sp = 0xc0259e30.
    xdc.runtime.Error.raise: terminating execution


    And from the .map file, you can see the code which resides at the exception location:

    c020fec0 000000c0 : BIOS.obj (.text:ti_sysbios_utils_Load_logCPULoad__I)
    c020ff80 000000c0 : BIOS.obj (.text:ti_sysbios_utils_Load_updateCurrentThreadTime__E)
    c0210040 000000c0 ti.uia.loggers.ae674 : LoggerStopMode.oe674 (.text:ti_uia_loggers_LoggerStopMode_flush__E)
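The address-to-function lookup done by hand above can be sketched in C. This is only an illustration: the `MapEntry` type and `lookup_symbol` function are hypothetical names, and the table holds just the map rows quoted in this thread (base address, size, symbol).

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a linker .map listing: load address, size, symbol name. */
typedef struct {
    uint32_t base;
    uint32_t size;
    const char *name;
} MapEntry;

/* Rows copied from the .map excerpts quoted in this thread. */
static const MapEntry kMap[] = {
    { 0xc0202e20u, 0x1e0u, "ti_sysbios_utils_Load_idleFxn__E" },
    { 0xc0203000u, 0x1e0u, "xdc_runtime_Error_policyDefault__E" },
    { 0xc020fec0u, 0x0c0u, "ti_sysbios_utils_Load_logCPULoad__I" },
    { 0xc020ff80u, 0x0c0u, "ti_sysbios_utils_Load_updateCurrentThreadTime__E" },
    { 0xc0210040u, 0x0c0u, "ti_uia_loggers_LoggerStopMode_flush__E" },
};

/* Return the symbol whose [base, base + size) range contains addr,
 * or NULL if no row matches. The unsigned subtraction wraps to a large
 * value when addr < base, so a single comparison covers both bounds. */
const char *lookup_symbol(uint32_t addr)
{
    for (size_t i = 0; i < sizeof kMap / sizeof kMap[0]; ++i) {
        if (addr - kMap[i].base < kMap[i].size) {
            return kMap[i].name;
        }
    }
    return NULL;
}
```

With these rows, the NRP value 0xc0210004 resolves to ti_sysbios_utils_Load_updateCurrentThreadTime__E (offset 0x84 into the function) and the IRP value 0xc0202fa8 resolves to ti_sysbios_utils_Load_idleFxn__E.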
  • Judah,

    Proceeding with your assumption that interrupts were messing things up, I think I have fixed the problem by disabling HWIs and SWIs around the call to Log_print().

    What I had looks like this:

    #ifdef DO_LOGGING
    #define TRACE2(format, v...) do { Log_print2(Diags_USER1, format, ##v); } while(0)
    #else
    #define TRACE2(format, v...) ((void)0)
    #endif
    

    So in my code, I could just do:

    TRACE2("Here's a log statement: val1 = %d, val2 = %d", val1, val2);

    I have similar macros for Log_print0() - Log_print6().  

    I have changed this to:

    #ifdef DO_LOGGING
    #define TRACE2(format, v... ) do { Hwi_disable(); Swi_disable(); Log_print2(Diags_USER1, format, ##v); Swi_enable(); Hwi_enable(); } while(0)
    #else
    #define TRACE2(format, v...) ((void)0)
    #endif

    This seems to have stopped my crashing.  However, will there be any unforeseen implications (besides the possible latency associated with disabling interrupts for undetermined amounts of time)?  In other words, these TRACE2() statements are almost all called from HWI or SWI context, and I know that in DSP/BIOS, Swi_disable/Swi_enable could not be called from HWI context, but this has been fixed in SYS/BIOS.

    Any thoughts?  Thanks!
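One implication worth illustrating: a macro that ends with an unconditional enable turns interrupts back on even when the caller had them disabled on entry. SYS/BIOS offers Hwi_disable(), which returns a key, and Hwi_restore(key) for this situation. The sketch below is host-side only: every `sim_` function is a hypothetical stand-in that models a single interrupt-enable flag, not the real SYS/BIOS implementation, and the `TRACE2_ENABLE` / `TRACE2_RESTORE` names are made up for the comparison.

```c
#include <stdbool.h>

/* Hypothetical host-side stand-ins: one global "interrupts enabled" flag. */
static bool g_irq_enabled = true;
static int g_log_calls = 0;

/* Models a disable call that returns the previous state as a key. */
static unsigned sim_hwi_disable(void)
{
    unsigned key = g_irq_enabled ? 1u : 0u;
    g_irq_enabled = false;
    return key;
}

/* Models a restore call: puts back whatever state the key recorded. */
static void sim_hwi_restore(unsigned key)
{
    g_irq_enabled = (key != 0u);
}

/* Models an unconditional enable. */
static void sim_hwi_enable(void)
{
    g_irq_enabled = true;
}

/* Stand-in for the logging call; it only counts invocations. */
static void sim_log_print2(const char *fmt, long a, long b)
{
    (void)fmt; (void)a; (void)b;
    ++g_log_calls;
}

/* Variant ending in an unconditional enable, like the macro above. */
#define TRACE2_ENABLE(fmt, a, b) \
    do { (void)sim_hwi_disable(); sim_log_print2(fmt, a, b); sim_hwi_enable(); } while (0)

/* Variant that saves the previous state and restores it afterwards. */
#define TRACE2_RESTORE(fmt, a, b) \
    do { unsigned key_ = sim_hwi_disable(); sim_log_print2(fmt, a, b); sim_hwi_restore(key_); } while (0)

/* Both helpers start with interrupts already disabled by the caller
 * and report the flag state after the trace macro has run. */
bool irq_enabled_after_enable_variant(void)
{
    g_irq_enabled = false;
    TRACE2_ENABLE("val1 = %d, val2 = %d", 1, 2);
    return g_irq_enabled;
}

bool irq_enabled_after_restore_variant(void)
{
    g_irq_enabled = false;
    TRACE2_RESTORE("val1 = %d, val2 = %d", 1, 2);
    return g_irq_enabled;
}
```

In this model, the enable variant leaves interrupts on after a trace issued inside a caller's critical section, while the restore variant leaves them off as the caller expected.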

  • Is Log_print() a re-entrant function? Perhaps it wasn't that interrupts were messing things up, but rather the nesting of interrupts which was causing Log_print() to be nested?
  • Despite my previous comments, disabling interrupts around calls to Log_print() did not fix the problem. I added in just a couple more lines of code (in a section which doesn't even get executed), and the crashing is back even with interrupts disabled. I'm back to being stumped!
  • Derek,

    You should be able to call Log_print in Hwi or Swi context.  I don't believe that is the problem as you have already confirmed that disabling Hwis and Swis around Log_print did not solve the problem.

    Your exception log in the first few posts says it's a "Fetch packet exception". The interesting part is that, according to your memory map, the exception occurred inside a real function. Typically you get this error when you try to execute from an area that is not really code but data. Is it possible that memory got corrupted in that part of the code? Can you confirm by looking at that memory before the exception and again after it, to see whether something corrupted it?
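The before/after comparison can also be done in software rather than by eye. The following is a minimal sketch, assuming the region of interest is at most 0xc0 bytes (the size of the function in the map excerpt) and that the code memory can be read as ordinary data; the `Snapshot` type and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNAPSHOT_MAX 0xc0u  /* assumed upper bound on the watched region */

/* A saved copy of a memory region plus a pointer to the live bytes. */
typedef struct {
    const volatile uint8_t *live;
    size_t len;
    uint8_t copy[SNAPSHOT_MAX];
} Snapshot;

/* Record the current contents of region. Returns 0 on success,
 * -1 if len exceeds the snapshot buffer. */
int snapshot_take(Snapshot *s, const void *region, size_t len)
{
    if (len > SNAPSHOT_MAX) {
        return -1;
    }
    s->live = (const volatile uint8_t *)region;
    s->len = len;
    memcpy(s->copy, region, len);
    return 0;
}

/* Return the offset of the first byte that differs from the saved copy,
 * or -1 if the region is unchanged. */
ptrdiff_t snapshot_first_change(const Snapshot *s)
{
    for (size_t i = 0; i < s->len; ++i) {
        if (s->live[i] != s->copy[i]) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

Taking the snapshot at startup and checking it periodically gives the offset of the first modified byte, which is a natural address for a hardware watchpoint.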

    Judah 

  • Thanks, Judah.

    Okay, so I opened a memory view of that section upon connecting to the target, then let the application run. Sure enough, I saw a few changed values in program memory right around there. The same memory location seemed to change every time I hit the exception, so I put a hardware watchpoint at that address, reloaded, reran, and the application halted on the watchpoint. The halt happened while setting a member variable of an object I had dynamically allocated. So I checked the memory map, and sure enough, my heap starts right before that section of memory, so obviously I have not sized it appropriately.

    I went ahead and statically allocated all the large objects, as I've been meaning to get rid of all my dynamic allocation anyways (bad DSP programming practice, I know!!), and for now it seems that I might have fixed the problem.  It all makes sense too, so that's good.

    Thanks for your help!  I'll now have something else to look for upon hitting random exceptions.  I had verified the stack wasn't getting corrupted but didn't even think about the heap growing too large : - /

    I'll go ahead and verify this question as answered...thanks!

  • Awesome.  Glad you were able to figure it out.