Exception handling on the C674x

Other Parts Discussed in Thread: SYSBIOS, OMAP-L138

SYSBIOS 6.34.2.1

We have an NMI that we would like to actually ignore on the DSP side of the OMAP-L138.  The ARM is seeing the signal brought in to a GPIO pin.

The NMI is not considered fatal.

We have been able to hook into the NMI exception but returning from it seems to hang the DSP.   We would like to return cleanly and continue on.

What do we need to do to get this exception to return cleanly?

Thanks,

  • Hi Mike,

    I'm wondering if you can use the following API:

        Exception_setReturnPtr()

    passing the NRP to it, so that execution will return to the code where the exception occurred.

    Steve

  • We will give that a shot and try it out...

    Thanks,

  • sigh - no difference...

    I have included the code (below), the Log Info output (below that), and the Exception console (at the bottom)

    ------------------------------------------------------------

    ////////////////////////////////////////////////
    void nmi_power(UArg dummy)
    ////////////////////////////////////////////////
    {
        /* Persistent counter block shared with the power-down handling code */
        OSU_POWER_DOWN_PERM_T* nmi_count = (OSU_POWER_DOWN_PERM_T*)OSU_UTIL_REVS_PERM_START_ADD;

        nmi_count->dsp_nmi = nmi_count->dsp_nmi + 1;

        /* Ask the Exception module to resume at the NMI return pointer (NRP) */
        Exception_setReturnPtr((ti_sysbios_family_c64p_Exception_FuncPtr)NRP);

        Log_info2("NMI Power drop out occured....NMI current count = 0x%04x, NRP = 0x%08x ",nmi_count->dsp_nmi,NRP);
    }

    ------------------------------------------------------------------------

    Here is what the Log Infos give:

    -------------------------------------------------------------------------

    5144238,ARM9_0,"[../src/OSU_Power_Down.c:157] Power Down Perm [0x62001240] (Init) Assert 0, De-Assert 0, NMI Count = 0",Main Logger,

    9133236289,ARM9_0,[../src/OSU_Power_Down.c:111]  Falling edge Power  interrupt occurred  = 0x0001,Main Logger,

    9182931423,ARM9_0,[../src/OSU_Power_Down.c:118]  Rising edge Power  interrupt occurred  = 0x0001,Main Logger,

    21299263092,C674X_0,"[../src/OSU_FFT_Support.c:1172] NMI Power drop out occured....NMI current count = 0x0001, NRP = 0xc69ae550 ",Main Logger,

    ----------------------------------------------------------------------------------------------------

    Here is what the Exception looks like on the console:

    ------------------------------------------------------------------------------------------------------

    [C674X_0] A0=0x1 A1=0x1
    A2=0x1 A3=0x1f4
    A4=0xc69401f0 A5=0x114
    A6=0x1df A7=0x16000000
    A8=0xf5be3053 A9=0x0
    A10=0xafffe88 A11=0xc69c6780
    A12=0x0 A13=0x0
    A14=0x0 A15=0x0
    A16=0x3 A17=0x0
    A18=0xc6393f68 A19=0x30
    A20=0x61a7f A21=0xc4dc9c80
    A22=0xc4dc9c80 A23=0xb000000
    A24=0x240 A25=0x0
    A26=0x1 A27=0x248
    A28=0x1a4 A29=0x0
    A30=0x0 A31=0x0
    B0=0x0 B1=0x0
    B2=0x0 B3=0xc6994f94
    B4=0x5a1ca B5=0x16872b0
    B6=0x0 B7=0xc69c6e50
    B8=0x0 B9=0xc69c6c1c
    B10=0x14000103 B11=0x0
    B12=0x0 B13=0x0
    B14=0xc69c84c4 B15=0xc6398030
    B16=0x0 B17=0xc69c6510
    B18=0x0 B19=0x80
    B20=0x69 B21=0x0
    B22=0x1c4 B23=0x1ac
    B24=0xc63afff8 B25=0x1fe
    B26=0xc0ffffff B27=0x1f0
    B28=0x200 B29=0x1c4100c
    B30=0x7f B31=0x0
    NTSR=0x1000f
    ITSR=0xf
    IRP=0xc69950a0
    SSR=0x0
    AMR=0x0
    RILC=0x0
    ILC=0x0
    Exception at 0xc69950b8
    EFR=0x80000000 NRP=0xc69950b8
    Legacy NMI Exception
    ti.sysbios.family.c64p.Exception: line 248: E_exceptionMin: pc = 0xc69950b8, sp = 0xc6398030.
    To see more exception detail, use ROV or set 'ti.sysbios.family.c64p.Exception.enablePrint = true;'
    xdc.runtime.Error.raise: terminating execution

  • Hi Mike,

    I've spoken with some folks around the building, and it may be possible to recover from the exception you are hitting.  The issue is, however, that on the C6x, because it has an exposed pipeline, an exception could occur during a branch delay slot.  If this were the case, then returning may just cause a crash or some other random/wrong result.  But if the exception did not occur during a branch delay slot, you should be able to return safely from the exception.

    There is a status bit "IB" in the NTSR register that should indicate this (it's also present in the TSR register).  This bit will get set to a value of 1 if the exception happens inside the branch delay slots.  So, you could use this to detect if you can return safely from the exception (i.e. when it's 0).  But if this bit is 1, then you are pretty much out of luck.

    With regard to my previous suggestion (using Exception_setReturnPtr()) not working, maybe you can try branching to NRP directly and see if the result is any different.

    Lastly, if you haven't already, you might try looking at the following documents:

    TMS320C64x/C64x+ DSP CPU and Instruction Set (spru732j.pdf)

    and

    TMS320C64x+ DSP Megamodule Reference Guide (spru871k.pdf)

    Steve
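
The IB check Steve describes can be sketched in C as below. This is only a host-side illustration: the helper name is hypothetical, the register value is passed in as a plain integer rather than read from the NTSR control register, and the bit position used for IB (bit 15) is an assumption that should be verified against the TSR/NTSR figure in spru732.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: IB is bit 15 of TSR/NTSR; confirm against spru732. */
#define TSR_IB_MASK (1u << 15)

/*
 * Hypothetical helper: returns true when the IB bit is clear, i.e. the
 * exception was not taken inside branch delay slots and a return through
 * NRP may be safe. On the DSP the argument would be the saved NTSR value.
 */
static bool nmi_return_may_be_safe(uint32_t ntsr)
{
    return (ntsr & TSR_IB_MASK) == 0u;
}
```

Under that assumption, the NTSR=0x1000f value in the exception dump above has bit 15 clear, so the helper would report that this particular exception did not occur in a branch delay slot.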

  • My understanding is exactly what you said: some exceptions we can return from, and some we can't.  The hope is that the unrecoverable case would be rare, so we could return from most of them.  Also, the NMI should be rare outside of the testing environment.

    I was hoping there would be a way to maybe shut off or not allow the turning on of the exception engine so that the NMI gets treated like an interrupt?

    I realize I would lose all the other exceptions, but we can't have the HW changed, so it is all trade-offs at this point.

    The NMI is tied to a GPIO pin and to our power supply, which tells us when we are losing power.  We have the ARM do all the necessary cleanup, and then, if power comes back within a given time, we continue on our way.  But the DSP is dying because of the NMI.  So we are just looking for a way to keep it alive and running.

    Thanks so much for your time.

  • Mike Geppert said:
    I was hoping there would be a way to maybe shut off or not allow the turning on of the exception engine so that the NMI gets treated like an interrupt?

    Yes, the NMI can be configured as an interrupt or an exception.  I see this listed as a feature of the C64x+ interrupt controller on page 156 of spru871k:

    "One non-maskable signal that you can use as either an interrupt or an exception (NMI)"

    This is detailed on page 167 in section 7.4.1 CPU – Interrupt Controller Interface:

    "When system exceptions are not enabled in the CPU, the non-maskable interrupt (NMI) acts as an
    interrupt and when received will post a flag to the BIT1 field in the IFR register. When system exceptions
    are enabled in the CPU; however, this flag is not set. Rather, the exception source is identified in the
    exception flag register (EFR) to denote whether the source is NMI, EXCEP, an internal exception, or a
    software exception (SWE/SWENR)."

    Just before this (also on page 167):

    "You can turn on exceptions by setting the global exceptions enable field (GEE) in
    the ITSR register (ITSR). You should enable exceptions prior to enabling any interrupts to ensure that an
    NMI is not received while its mode (exception vs. interrupt) is changing."

    So it looks like you need to ensure that the GEE bit of the ITSR register is turned off in order to disable exceptions.

    Steve
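
As a rough illustration of the quoted passage, the NMI mode can be derived from the GEE bit of a saved task state value. This is a host-side sketch: the type, function, and macro names are hypothetical, and the GEE bit position (bit 2) is an assumption to check against spru732.

```c
#include <stdint.h>

/* ASSUMPTION: GEE is bit 2 of TSR/ITSR; confirm against spru732. */
#define TSR_GEE_MASK (1u << 2)

typedef enum {
    NMI_AS_INTERRUPT,  /* exceptions disabled: NMI posts a flag in IFR */
    NMI_AS_EXCEPTION   /* exceptions enabled: NMI is reported through EFR */
} nmi_mode_t;

/* Hypothetical helper: classify how the CPU would treat an NMI. */
static nmi_mode_t nmi_mode_from_tsr(uint32_t tsr)
{
    return (tsr & TSR_GEE_MASK) ? NMI_AS_EXCEPTION : NMI_AS_INTERRUPT;
}
```

Under that assumption, the ITSR=0xf value in the exception dump above has the GEE bit set, which would be consistent with the NMI being reported as an exception.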

  • Now we're on the same page...

    So how do you do that when using SYSBIOS?

    We tried to NOT include the exception module, but it got included anyway.

    We then tried to pull in the Exception.c file, but hit lots of dependencies and gave up.  I think the last ones were mostly macros that we didn't know and couldn't find out what to set them to, or how to get them set correctly.

    We then wrote a voodoo function to replace the C code that turned the bit on with NOPs.  This seemed to stop the exception, but we don't get interrupts either, and we seem to take exceptions for other reasons...

  • Mike,

    Mike Geppert said:
    We then tried to pull in the Exception.c file, but hit lots of dependencies and gave up.  I think the last ones were mostly macros that we didn't know and couldn't find out what to set them to, or how to get them set correctly.

    I don't think this will work ... you probably need to modify that file in your BIOS installation and then rebuild the BIOS library if you want to make changes to that file.

    Mike Geppert said:
    So how do you do that when using SYSBIOS?

    Looking at the Exception.c file, I see that the GEE is toggled in the function:

    Int Exception_Module_startup(Int phase)
    {
        extern volatile cregister unsigned TSR;
      
        ...

        /* enable EXCEP input to generate an NMI interrupt */
        TSR |= (Exception_TSRXEN | Exception_TSRGEE);

        /* clear EXC bit in TSR from previous exception processing */
        TSR &= ~(Exception_TSREXC);

    ...

    }

    The problem is that, once enabled, the GEE cannot be disabled without resetting the CPU.

    Can you try adding the following setting to your *.cfg file?  This should configure your app to NOT bring in the Exception module, and as a result the code above should not be called.

    var Hwi = xdc.useModule("ti.sysbios.family.c64p.Hwi");

    Hwi.enableException = false;

    Steve

  • We finally got the SYS/BIOS to rebuild.  We commented out the  "TSR |= (Exception_TSRXEN | Exception_TSRGEE);".

    The next problem we ran into is that the NMI was being set up as an interrupt.

    In the HWI.c file we found "Hwi_enableIER(Hwi_module->ierMask);", which was setting the IER to enable the NMI.

    It turns out that Hwi_module->ierMask is set up in the auto-generated MAIN_pe674.c file.

    0x4783 is the value.  We understand all the bits except bit 0 (0x1), which is reserved on the OMAP-L138.  Bit 1 (0x2) is the one we want to stop.

    So we modified HWI.c to mask off the lower 2 bits (& ~0x03), and we seem to run.  We only just got it going, so we have much more testing to do.

    Our lead engineer is a bit nervous about adding code to change the mask, so the question we have is: can we change this 0x4783 to 0x4780 somehow using the CFG file for the ierMask?

    I couldn't find anything in the CDOCs.

    We have not tried to clear the bit outside of SYS/BIOS yet.  That is on our to-do list.  Not sure if this is a write-once bit or not.

    Thanks!
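
The mask arithmetic in this post can be checked with a small host-side C sketch. The helper names are hypothetical; only the values 0x4783, 0x03, and 0x4780 come from the post.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when bit `bit` (0..15) is set in a 16-bit IER mask. */
static bool ier_bit_is_set(uint16_t ier_mask, unsigned bit)
{
    return ((ier_mask >> bit) & 1u) != 0u;
}

/* Clear bit 0 and bit 1, i.e. the "& ~0x03" change described in the post. */
static uint16_t clear_low_two_bits(uint16_t ier_mask)
{
    return (uint16_t)(ier_mask & 0xFFFCu);
}
```

For 0x4783 the set bits are 0, 1, 7, 8, 9, 10, and 14, and clear_low_two_bits(0x4783) yields 0x4780, the value asked about.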

  • Mike Geppert said:
    The next problem we ran into is that the NMI was being set up as an interrupt.

    I thought this was the result you were hoping for?

    -->

    Mike Geppert said:
    I was hoping there would be a way to maybe shut off or not allow the turning on of the exception engine so that the NMI gets treated like an interrupt?

    -------------------------------------------

    Mike Geppert said:

    It turns out that Hwi_module->ierMask is set up in the auto-generated MAIN_pe674.c file.

    0x4783 is the value.  We understand all the bits except bit 0 (0x1), which is reserved on the OMAP-L138.  Bit 1 (0x2) is the one we want to stop.

    So we modified HWI.c to mask off the lower 2 bits (& ~0x03), and we seem to run.  We only just got it going, so we have much more testing to do.

    Our lead engineer is a bit nervous about adding code to change the mask, so the question we have is: can we change this 0x4783 to 0x4780 somehow using the CFG file for the ierMask?

    Have you tried to get the IER mask the way you want at run time by calling Hwi_disableIER()?

    I don't think you can modify bit 1 of IER via the configuration.  When the Hwi module processes the BIOS *.cfg configuration settings (this is done in the JavaScript code of ti/sysbios/family/c64p/Hwi.xs), the first thing it does is enable bit 1:

    function module$static$init(mod, params)
    {
        mod.ierMask = 2;    // enable NMI

    Other IER bits are then set based on Hwi instances and non-dispatched interrupts.  You can see that in Hwi.xs with lines that look like the following:

        if (params.enableInt) {
            mod.ierMask |= 1 << intNum;
        }

    Steve
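
The mask construction in the two Hwi.xs excerpts can be mirrored in C to see why bit 1 always ends up set. This is a host-side sketch, not the SYS/BIOS code: the function name is hypothetical, and the interrupt numbers in the usage note are assumptions chosen only because they reproduce the upper bits of the 0x4783 value mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C mirror of the Hwi.xs logic quoted above. */
static uint16_t build_ier_mask(const unsigned *enabled_ints, size_t count)
{
    uint16_t mask = 2u;  /* mirrors "mod.ierMask = 2;  // enable NMI" */

    for (size_t i = 0; i < count; i++) {
        assert(enabled_ints[i] < 16u);  /* IER has 16 bits */
        /* mirrors "mod.ierMask |= 1 << intNum;" */
        mask = (uint16_t)(mask | (1u << enabled_ints[i]));
    }
    return mask;
}
```

With the assumed interrupt numbers 7, 8, 9, 10, and 14 this returns 0x4782. Bit 1 is set no matter which interrupts are enabled, which matches Steve's point that it cannot be cleared through the configuration; bit 0 of 0x4783 is not accounted for by these excerpts.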

  • Steve,

    Thanks for your suggestions...

    Ideally we would like to disable the NMI completely.  The NMI pin is tied to a GPIO pin and the ARM is doing the function.  So now we simply need to stop the DSP from crashing, basically any way we can.

    As an exception, we have not had any luck returning cleanly at all.  We understand that this is pretty unlikely, as there are conditions where we cannot return at all.  We started down a path to clean things up and then jump to cinit, in the hope that we could come back and reinitialize everything.  Not ideal, but a better alternative.  We seem to be getting stuck in Ipc_start(), which is returning not ready.  It seems the shared RAM is not set up properly.  The DSP is the one that owns it, and we are calling Ipc_detach() and Ipc_stop() before we jump.  The ARM is doing the same thing and also has issues with Ipc_start() returning not ready.

    Next we tried to deal with the NMI as an ISR.  It seems that if we do this, it takes precedence over being an exception.  It turns out the ISR is level triggered, so it gets called hundreds of times, which means nothing else gets done while this is happening.  Our whole system becomes unstable at this point.  Not really sure why.  I am still working on this.

    The next thing was to simply deactivate the ISR by using Hwi_disableIER(0x02).  Here is where we need your help.  It seems that IPC doesn't like this at all.  We lose communication between the ARM and DSP when we do this.  It is very demonstrable: comment out the line and we communicate; put it back and we don't.  I am dumbfounded as to what the connection is between the NMI bit in the IER and IPC.

    We are getting desperate on this.  Apparently respinning the board would take a really long time.

    So it sounds like figuring out what the connection is between the IER 0x02 bit and IPC, and then fixing it if possible, might be our best bet.

    Thanks soooo much for your help.

  • Are you suggesting we modify the JavaScript?

    That seems bad...

    We actually rebuilt the kernel (not that we are happy doing that either; we would really like to modify the CFG files somehow and get the results we want) to shut down both the NMI exception and IFR, and now the IPC does not communicate.

    We don't understand the relationship between the NMI and IPC communication.

    We are using IPC 1.25.02.12 on both the ARM and DSP.

    Thanks.