BIOS error / debugging

CCS 4.2.3.00004

BIOS 6.31.04.27

 

When running some of my code I get an error:

ti.sysbios.gates.GateMutex: line 114: assertion failure: A_badContext: bad calling context. See GateMutex API doc for details.
xdc.runtime.Error.raise: terminating execution

I am not using any gates directly in my code, although I do not know how the other BIOS modules that I am using might.

The call stack in the debug window is just

Thread [main] (Suspended)   
    0 abort() at exit.c:67 0xc065a120

 

How do I get any useful information from a crash like this to find out where in my code the problem is occurring?

Is there another SYS/BIOS module that I should include, or are there parameters that I should set?

 

When I get an exception I get the register dump (with ti.sysbios.family.c64p.Exception.enablePrint set to TRUE).

 

Thanks,

  Peter Steinberg

  • Hi Peter,

    Are you calling printf or System_printf from a Swi or Hwi? Here is a description of that issue and other reasons you might be getting the assert: http://e2e.ti.com/support/embedded/f/355/p/106478/375273.aspx#375273.

    Todd

     

  • The compiler RTS library is not compiled with any (or enough) symbolic debug info.

    Can you try placing a breakpoint at the following symbol?   I think you'll get a valid stack trace, unless one of the called functions on the stack is within the RTS library.

    _xdc_runtime_Assert_raise__I

    The most common reason for this error is an ISR that calls printf() or some other RTS function that requires a task lock.   We register a mutex with the RTS library to make the library reentrant.  But, this mutex can only be used by task-level code by default.

    You can try changing the type of this mutex to be a "stronger" lock via the BIOS.rtsGateType configuration parameter.   You can set it to disable interrupts, for example.

    BIOS.rtsGateType = BIOS.GateHwi;


    This is not advised in general as this would make the RTS library code disable interrupts for a long time, but you can try this to see if the RTS library is the source of your problem.

    My guess is that you have an ISR calling printf() in one of the error legs.

  • There are no "printf" or "System_printf" function calls, unless they are somewhere inside the RTS / BIOS.

     

    I am using SysStd as the system runtime support proxy.

     

    I could possibly be calling Memory_alloc from an SWI; I'll have to walk through the code to see if that is happening.

    Is there an example available of setting up and using a different heap manager for this purpose?

     

    Is there a place in the documentation stating these limitations?

     

    The Bios User Guide .pdf file gives an example that uses System_printf inside a SWI (on page 2-41).  Is this example wrong?

     

    Peter

  • I do not have the symbol "_xdc_runtime_Assert_raise__I" in my loadmap or .out file.

  • I'm sorry.  I gave you the wrong symbol.  Can you try 'xdc_runtime_Error_raiseX__F'

    If you want to allocate memory from a Swi, I would suggest using HeapBuf or HeapMultiBuf, which are more deterministic and manage a linked list of fixed-size memory blocks.  These disable interrupts for a short time to update the list, so they can be called from ISR level.   Check the Memory chapter in the latest BIOS User's Guide for more info.  It is also possible to change the gate type for HeapMem to GateHwi, but this is not a good idea because calls to alloc/free could take a long time with interrupts disabled.
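To illustrate why a fixed-size block heap is predictable enough for Swi/Hwi use, here is a minimal host-side sketch of the free-list idea described above. It is not the HeapBuf source: `Pool`, `pool_init`, `pool_alloc`, `pool_free`, `BLOCK_SIZE`, and `NUM_BLOCKS` are hypothetical names chosen for illustration, and the interrupt disable/restore that a real implementation would need around each list update is only indicated in comments.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 64   /* hypothetical block size in bytes */
#define NUM_BLOCKS 4    /* hypothetical number of blocks in the pool */

/* A free block stores the link to the next free block in its own memory. */
typedef union Block {
    union Block *next;
    uint8_t payload[BLOCK_SIZE];
} Block;

typedef struct {
    Block storage[NUM_BLOCKS];  /* backing memory for all blocks */
    Block *freeList;            /* head of the singly linked free list */
    unsigned numFree;           /* number of blocks currently free */
} Pool;

/* Thread every block onto the free list; storage[0] ends up at the head. */
static void pool_init(Pool *p)
{
    p->freeList = NULL;
    p->numFree = 0;
    for (int i = NUM_BLOCKS - 1; i >= 0; i--) {
        p->storage[i].next = p->freeList;
        p->freeList = &p->storage[i];
        p->numFree++;
    }
}

/* Constant-time allocation: pop the head of the free list.
 * On target, the pop would sit inside a short interrupt-disabled region. */
static void *pool_alloc(Pool *p, size_t size)
{
    if (size > BLOCK_SIZE || p->freeList == NULL) {
        return NULL;    /* request too large, or pool exhausted */
    }
    Block *b = p->freeList;
    p->freeList = b->next;
    p->numFree--;
    return b;
}

/* Constant-time free: push the block back onto the head of the list. */
static void pool_free(Pool *p, void *ptr)
{
    Block *b = (Block *)ptr;
    b->next = p->freeList;
    p->freeList = b;
    p->numFree++;
}
```

Because both operations touch only the head of the list, the time spent with interrupts disabled does not depend on how fragmented the pool is. A variable-size heap cannot offer that, since it has to search for a block that fits.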

    -Karl-

  • System_printf() can be called from a Swi if you use the SysMin system provider.   The default is SysStd which uses C stdio which uses that mutex.

    var SysMin = xdc.useModule('xdc.runtime.SysMin');
    var System = xdc.useModule('xdc.runtime.System');

    System.SupportProxy = SysMin;

  • I verified my code and there are no memory allocation / free calls inside an HWI / SWI.

     

    Breaking at 'xdc_runtime_Error_raiseX__F' gives a stack trace of:

    Thread [main] (Suspended)   
        0 xdc_runtime_Error_raiseX__F(struct xdc_runtime_Error_Block *, unsigned short, char *, int, unsigned int, int, int) at Error.c:113 0xc0652c40   

    And again, there are no printf / System_printf calls in the code.

     

    Peter

  • I suspect that it's the RTS lock that is giving you the problem.  To confirm this and help with the search, can you please change the RTS lock to 'none'?    The code won't be thread-safe, but that should be OK for this experiment.

    Add this to your .cfg file:

    BIOS.rtsGateType = BIOS.NoLocking;


    If the problem goes away, then we know that the problem is that there is a call to one of the RTS functions that is using this mutex.   If the problem remains, then we need to look elsewhere (probably HeapMem).

  • The same error occurs with noLocking set.

     

    Under xdc.runtime.Memory:

    defaultHeapInstance              sysbios.heaps.HeapMem.Instance#0

    defaultHeapSize                     1049576

    In the ROV view for HeapMem, I have one entry:

    address 0xc0500790
    buf 0xc0400250
    align 8
    totalSize 0x100000
    totalFreeSize 0xcc0c8
    largestFreeSize 0xcc0c8

     

  • OK.  Then it must be HeapMem that's the problem.

    Can you add this to your .cfg?   This registers a Hwi (interrupt) gate with HeapMem to allow HeapMem to be called from ISR context.

    var HeapMem  = xdc.useModule('ti.sysbios.heaps.HeapMem');
    var GateHwi  = xdc.useModule('ti.sysbios.gates.GateHwi');

    HeapMem.common$.gate = GateHwi.create();
    HeapMem.common$.gateParams = null;


    If the problem goes away, we know that someone is calling Memory_alloc()->HeapMem_alloc() from the wrong context.  The next step would be to place a breakpoint on 'ti_sysbios_heaps_HeapMem_alloc__E' and use ROV to check the calling context.   Check ROV->BIOS->currentThreadType every time you hit this breakpoint.  The context must be main or task (not Swi or Hwi).

    This is very painful I agree.  We need to figure out how to get that stack backtrace to work out of the box so that this binary search isn't necessary.

     

    -Karl-

  • It does not hit a breakpoint, but fails with:

    ti.sysbios.knl.Clock: line 211: assertion failure: A_badThreadType: Cannot create/delete a Clock from Hwi or Swi thread.
    ti.sysbios.gates.GateMutex: line 114: assertion failure: A_badContext: bad calling context. See GateMutex API doc for details.
    xdc.runtime.Error.raise: terminating execution

    The code is not creating or deleting a timer in an SWI / HWI, but it does start, stop, and reconfigure one in a HWI and a SWI, although that code has been running for a few months.

     

    Peter

  • Hi Peter --

    What APIs are you using to reconfigure the Clock from Hwi and Swi level?   This assert will happen if you call Clock_create() or Clock_construct() from Hwi or Swi level.  Did you change something recently that might have surfaced this?   Calling Clock_construct() from Hwi or Swi should have never worked.  The Clock module manages a linked list of clock objects.  This list is traversed by a function that runs at Swi level.  It is not safe to update this list by another Swi or Hwi.

    The problem with the assert is a known issue which we are fixing for BIOS 6.32.   The root of the problem is that the first assert is triggered at ISR level.  The assert handler calls a function to print the assert message, but this function tries to use a GateMutex, which raises the GateMutex assert because the whole call chain is running at Swi or Hwi level.

    -Karl-

  • The only clock or timer functions called in an HWI or SWI are:

    Clock_getTicks

    Timer_setFunc

    Timer_setPeriodMicroSecs

    Timer_start

    Timer_stop

    but those calls are all in HWI & SWI handlers that have been running properly for months.

  • Hi Peter --

    Are you calling Clock_create() or Clock_construct() somewhere in your app?    The source code for the Clock module can be found in ti/sysbios/knl/Clock.c in your distribution.

    Clock_create/construct call 'Clock_Instance_init()'.   The 2nd assert below is the assert that is triggering.

    /*
     *  ======== Clock_Instance_init ========
     */
    Void Clock_Instance_init(Clock_Object *obj, Clock_FuncPtr func, UInt timeout,
        const Clock_Params *params)
    {
        Queue_Handle clockQ;

        Assert_isTrue((BIOS_clockEnabled == TRUE), Clock_A_clockDisabled);

        Assert_isTrue(((BIOS_getThreadType() != BIOS_ThreadType_Hwi) &&
                       (BIOS_getThreadType() != BIOS_ThreadType_Swi)),
                            Clock_A_badThreadType);

     

    BIOS Semaphore_pend() and Mailbox_post/pend() use Clock_construct internally if the timeout != 0.    Can you search your code for calls to pend/post and see if any of them have a timeout (a timeout value that is not 0 or forever (-1))?  And is it possible for them to be called from Hwi or Swi level?   You might be able to add a BIOS_getThreadType() check for Swi/Hwi at those call sites and catch the problem.
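The call-site check suggested above could be sketched roughly as follows. This is a host-buildable illustration under stated assumptions, not SYS/BIOS code: the `Ctx` enum stands in for the value returned by BIOS_getThreadType(), `NO_WAIT` and `WAIT_FOREVER` stand in for the 0 and forever (-1) timeouts, and `check_pend`, `CHECK_PEND`, `Offender`, `firstOffender`, and `offenderSeen` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the SYS/BIOS thread types so the sketch builds on a host PC.
 * On target, the context would come from BIOS_getThreadType(). */
typedef enum { CTX_HWI, CTX_SWI, CTX_TASK, CTX_MAIN } Ctx;

#define NO_WAIT      0u     /* stand-in for a timeout of 0 */
#define WAIT_FOREVER (~0u)  /* stand-in for a timeout of forever (-1) */

/* Record of the first call site that used a timed pend/post from Hwi/Swi. */
typedef struct {
    const char *file;
    int line;
    Ctx ctx;
    unsigned timeout;
} Offender;

static Offender firstOffender;
static bool offenderSeen = false;

/* Per the thread, only timeouts that are neither 0 nor forever need a Clock. */
static bool timeout_needs_clock(unsigned timeout)
{
    return timeout != NO_WAIT && timeout != WAIT_FOREVER;
}

/* Returns true if the call is safe with respect to the Clock assert.
 * Otherwise records the first offending call site and returns false. */
static bool check_pend(Ctx ctx, unsigned timeout, const char *file, int line)
{
    bool isr = (ctx == CTX_HWI || ctx == CTX_SWI);
    if (isr && timeout_needs_clock(timeout)) {
        if (!offenderSeen) {
            firstOffender.file = file;
            firstOffender.line = line;
            firstOffender.ctx = ctx;
            firstOffender.timeout = timeout;
            offenderSeen = true;    /* a convenient place for a breakpoint */
        }
        return false;
    }
    return true;
}

/* Captures the caller's file and line automatically. */
#define CHECK_PEND(ctx, timeout) \
    check_pend((ctx), (timeout), __FILE__, __LINE__)
```

Placing `CHECK_PEND(...)` just before each pend/post call, with a breakpoint on the line that sets `offenderSeen`, would stop execution while the application's own frames are still on the stack. The stop at abort() does not give that.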

    If that doesn't work, then maybe you can disassemble the Clock_Instance_init function in a disassembly window and put a breakpoint on the assert/true condition and catch it and get a stack backtrace.

    -Karl-

  • The code never calls Clock_create() or Clock_construct().  It does call Timer_create() once in the startup task before the rest of the code is running.

    There are no semaphore or mailbox calls with timeouts that are not 0 in the SWI & HWI handlers.

     

    I don't have Clock_Instance_init in my symbol / map file.  I do have some ti_sysbios_knl_Clock_... functions, but the only instance function I see is ti_sysbios_knl_Clock_Instance_finalize__F

  • Hi Peter --

    I think you are compiling your app using "whole_program" profile which does some aggressive inlining of functions.  I think Clock_Instance_init() is being inlined so you don't see the symbol.   Can you try compiling for "debug" profile?  This setting is in the Build Settings->CCS->RTSC page.

    -Karl-

  • Karl,

      thanks for your help in walking through the internals of SYS/BIOS to help me get a stack trace back into my code.

     

    Peter