SYS_abort issue with CCS 5.2

In several previous releases, we had a custom function that replaced SYS_abort with a version that left the SRIO enabled so the DSP could be reset from the SRIO switch.  We have the following entries in the .tcf file to do this:

 

bios.SYS.ABORTFXN = prog.extern("DspAbort__FPce");

bios.SYS.ERRORFXN = prog.extern("DspAbort__FPce");

bios.SYS.EXITFXN = prog.extern("DspAbort__FPce");

 

We also have some test routines that call this function to ensure that if the DSP crashes, it will reset, reload, and come back into operation.

 

   SYS_abort("ForcedSysAbort");

   Logging_Cl::logSevere("ForcedSysAbort");

 

It seems that when we moved from CCS 5.1 up to CCS 5.2, this capability broke.  When this function is called, the DSP keeps operating and the log message following the call gets printed.

  • I tried removing the entries in the .tcf file and the system will now reset on command.  So, it seems like our override function is not getting properly linked.

  • Hi David Boles,

    I'm guessing that your change of CCS versions also caused your BIOS version to change.

    Could you please tell me the version of BIOS used in the two scenarios you described? Also XDC tools and TI compiler (code gen) versions?

    It's possible that something changed to cause this between these product versions.

    Steve

  • For the version with problems:

    Compiler: 7.3.4

    Bios: 5.41.13.42

    XDC: 3.23.3.53

    For the version that was OK (I say "was" since I had to upgrade my tools to work for both our legacy application on the 6455 and our new application on the 6678, and I have not tested the older project with the new toolset), I looked at the previous version of .cproject:

    DSPBIOS_VERSION=5.41.11.38

    We were still trying to keep the same compiler version (7.3.4) although upgrading the tools changed the directory structure.  I am not sure we were/are using the XDC tools for the 6455.

  • Let me add this as well.  The reason for overriding bios.SYS.ABORTFXN, bios.SYS.ERRORFXN, and bios.SYS.EXITFXN was the concern that DSP BIOS did not leave the SRIO interface active.  Since we boot via the interface, disabling it would require a power cycle to be able to reboot.  We wanted a "warm" reboot of the DSP in case it crashed.  To that end, our override function was:

       // Stay in this loop until reset.
       while (true == true)
       {
          // Write back and invalidate all of L2 cache
          CACHE_wbInvAllL2(CACHE_WAIT);

          // Enable all CPPI interrupts
          sSRIO_REGS->RX_CPPI_ICRR = 0;
          sSRIO_REGS->RX_CPPI_ICRR2 = 0;
          sSRIO_REGS->TX_CPPI_ICRR = 0;
          sSRIO_REGS->TX_CPPI_ICRR2 = 0;

          // Enable the Device Reset Interrupt
          sSRIO_REGS->ERR_RST_EVNT_ICRR3 = sINTDST5;

          // Interrupt enable register
          extern cregister volatile unsigned int IER;

          // Control status register.
          extern cregister volatile unsigned int CSR;

          // Global interrupt enable (GIE) and previous GIE.
          static const U32 sCSR_REGISTER = 0x00000003;

          // Enable interrupt 8 (SRIO)
          IER = C64_EINT8;

          // Enable interrupts
          CSR |= sCSR_REGISTER;
       }

    As I have said previously, by removing our override function from config.tcf, the forced abort test function will work.

  • David Boles said:
    As I have said previously, by removing our override function from config.tcf, the forced abort test function will work.

    The default behavior for SYS_abort() is to call UTL_doAbort(), which in turn will call UTL_halt().

    UTL_doAbort() will simply print the string that was passed to it (in this case "ForcedSysAbort") and then call UTL_halt(). UTL_halt() disables all HWIs and then spins forever.  This is meant to trap the program.

    In your above code, you also have a loop that runs forever, but in your case you haven't disabled interrupts.

    While I'm not sure what else your app may be doing, I can think of one possible problem scenario.

    Let's say you had another task running ("task A") which has print statements in it.  Let's say that previously in the application execution, task A ended up pending on a semaphore.  Next, your program runs task B, which calls SYS_abort and is then running the above while (true == true) {...} code.

    Now, say an interrupt comes in and, as a result, an ISR runs that posts the semaphore that task A is pending on.  Your app would then switch back to task A and execute its print statements.

    Is it possible something like this could be happening in your application?

    Steve

  • We are not running any tasks using DSP BIOS.

    I left out some of the code before the loop in DspAbort.  Here is the code before the infinite loop:

       // SRIO register overlay
       static const CSL_SrioRegsOvly sSRIO_REGS = reinterpret_cast<CSL_SrioRegsOvly>(CSL_SRIO_0_REGS);

       // Flags to disable all interrupts
       static const U32 sDISABLE_ALL_FLAGS = 0xFFFFFFFF;

       // Control bits for enabling the reset interrupt
       static const U32 sINTDST5 = 5;

       // Disable all doorbell interrupts
       sSRIO_REGS->DOORBELL_INTR_ROUTE[0].DOORBELL_ICRR = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[0].DOORBELL_ICRR2 = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sONE].DOORBELL_ICRR = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sONE].DOORBELL_ICRR2 = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sTWO].DOORBELL_ICRR = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sTWO].DOORBELL_ICRR2 = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sTHREE].DOORBELL_ICRR = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->DOORBELL_INTR_ROUTE[sTHREE].DOORBELL_ICRR2 = sDISABLE_ALL_FLAGS;

       // Disable all LSU interrupts
       sSRIO_REGS->LSU_ICRR[0] = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->LSU_ICRR[sONE] = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->LSU_ICRR[sTWO] = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->LSU_ICRR[sTHREE] = sDISABLE_ALL_FLAGS;

       // Disable all error interrupts
       sSRIO_REGS->ERR_RST_EVNT_ICRR = sDISABLE_ALL_FLAGS;
       sSRIO_REGS->ERR_RST_EVNT_ICRR2 = sDISABLE_ALL_FLAGS;

  • The reason we wrote our own version of SYS_abort was to leave the SRIO interrupts enabled so the DSP could be reset.  Maybe we did not need to do this, since this effort was part of fixing several things.  So the following two questions remain.

    1.  Why does it appear that our override of SYS_abort is not working?  When we look at the logging from the chip, it is still running and sending messages.  It would not do that if it were in the infinite while loop.

    2.  Why does the BIOS version leave the SRIO interface enabled so the SRIO switch can send the reset symbols to reset the DSP and download to it (through the SRIO interface)?  Even though SRIO interrupts may be disabled, the chip can still be rebooted using the BIOS version of SYS_abort.

  • David Boles said:
    1.  ...  When we look at the logging the chip does, it is still running and sending messages.  It would not do that if it were in the infinite while loop.

    Agreed.  What is the code context of the messages that you see being printed?  If you don't have any tasks running, are they in ISR context then?

    David Boles said:
    1.  Why does it appear that our override of SYS_abort is not working?

    The way that you currently have it set up (with the above abort loop code set to ABORTFXN), SYS_abort() will call your ABORTFXN function, instead of the default behavior of disabling interrupts and trapping (I think this is what you mean by "correctly aborting").

    So, with your configuration, calling SYS_abort() is really just the same as calling your ABORTFXN function directly.  SYS_abort() is just acting as a wrapper function to the abort function you configured in your config.tcf file.  So, if you don't "make" it abort in your ABORTFXN, it won't abort.
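
    The wrapper behaviour described above can be modelled on a host PC.  This is only a sketch in plain C++: ModelState, AbortFxn, defaultAbort, overrideWithoutDisable and modelSysAbort are hypothetical names, not DSP/BIOS symbols, and the "trap" is a flag rather than a real infinite loop so that the model can return.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Host-side model only: none of these names are DSP/BIOS symbols.
struct ModelState {
    bool interruptsEnabled = true;  // stands in for the global interrupt enable
    bool trapped = false;           // set where real code would spin forever
    std::vector<std::string> log;   // messages "printed" by the model
};

// Hypothetical hook type; the real ABORTFXN signature is different.
using AbortFxn = void (*)(ModelState&, const std::string&);

// Models the default path described in the thread:
// print the message, disable interrupts, then trap.
void defaultAbort(ModelState& s, const std::string& msg) {
    s.log.push_back(msg);
    s.interruptsEnabled = false;
    s.trapped = true;  // on the target, UTL_halt() spins forever here
}

// Models an override that never masks interrupts.
void overrideWithoutDisable(ModelState& s, const std::string& msg) {
    s.log.push_back("override: " + msg);
    // interruptsEnabled is left untouched, so ISRs could still run.
}

// Models SYS_abort() as a thin wrapper around whichever hook is configured.
void modelSysAbort(ModelState& s, AbortFxn hook, const std::string& msg) {
    assert(hook != nullptr);
    hook(s, msg);
}
```

    With defaultAbort the model ends with interrupts masked and the trap flag set; with overrideWithoutDisable it ends with interrupts still enabled, which is the situation in which ISRs can keep logging.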

    David Boles said:
    2.  Why does the BIOS version leave the SRIO interface enabled so the SRIO switch can send the reset symbols to reset the DSP and download to it (through the SRIO interface)?  Even though SRIO interrupts may be disabled, the chip can still be rebooted using the BIOS version of SYS_abort.

    I think by this you mean the BIOS version (behavior) of SYS_abort().  Again, the default behavior of SYS_abort() is to disable all hardware interrupts (i.e. HWI_disable()) and then spin forever, trapping the program within UTL_halt.  In your case, you are spinning in the infinite loop in an attempt to trap, but since interrupts are still enabled, you are not really trapped there.

    BIOS does not know anything about SRIO - they are completely separate.  So if SRIO is mapped to a BIOS HWI object, then the standard behavior of SYS_abort() will disable it due to the global disabling of interrupts in UTL_halt.

    Steve

  • Here is a simplified answer to your first question.  We have a set of objects in a list; the "running" code loops through the list, calling each object.  When that object returns, the next object in the list is called, and so on.  It is a sort of simple round-robin scheduler.  So when I say the chip is running, I mean that I see logging from each of the objects, so the main run loop is still executing and not the tight loop in the abort function.
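
    A minimal sketch of the kind of round-robin loop described above, in plain C++.  Runnable, NamedObject and runRoundRobin are hypothetical names and are not taken from the firmware; the pass count exists only so the loop terminates when run on a host.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface: each object does a slice of work and returns.
struct Runnable {
    virtual ~Runnable() = default;
    virtual void run(std::vector<std::string>& log) = 0;
};

// Example object that only records its name when it is called.
struct NamedObject : Runnable {
    explicit NamedObject(std::string n) : name(std::move(n)) {}
    void run(std::vector<std::string>& log) override { log.push_back(name); }
    std::string name;
};

// Calls every object in list order, once per pass.  A firmware main loop
// would run forever; `passes` bounds it here so it can run off-target.
std::vector<std::string> runRoundRobin(const std::vector<Runnable*>& objects,
                                       int passes) {
    assert(passes >= 0);
    std::vector<std::string> log;
    for (int p = 0; p < passes; ++p) {
        for (Runnable* obj : objects) {
            obj->run(log);
        }
    }
    return log;
}
```

    Seeing a log entry from every object on every pass is what shows that the main loop, and not the abort loop, is executing.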

    I guess we do not fully understand the SRIO engine on the 6455.  If interrupts are disabled, can the SRIO engine still reboot the chip?  By reboot, I mean that the Tundra switch is instructed to send 3 reset symbols over the SRIO link layer.  I assume this to be similar to toggling a reset input to the chip.  We then download the firmware using direct IO and send an SRIO doorbell to start the chip.  Once the chip is reset, it may be in a different state than it was in SYS_abort, so all this may not really matter.

  • Can you please attach your config.tcf file? (and any files it may include, such as *.tci files)

    David Boles said:
    We have a set of objects in a list; the "running" code loops through the list, calling each object.  When that object returns, the next object in the list is called, and so on.  It is a sort of simple round-robin scheduler.  So when I say the chip is running, I mean that I see logging from each of the objects, so the main run loop is still executing and not the tight loop in the abort function.

    What's the context of this code?  Are you doing this from main()?

  • 5270.Config1_tcf.txt

    main() sets up some hardware then calls the scheduler object that runs in an infinite while loop.

  • Thanks for attaching the config file, I took a look and am wondering about:

    bios.HWI.instance("HWI_INT8").fxn = prog.extern("cb__22DspSrioDriverTi6455_ClSFv");

    bios.HWI.instance("HWI_INT6").fxn = prog.extern("i8k__18ResourceManager_ClSFv");

    If you put break points in these ISR functions, do you still hit the break points after your ABORTFXN has begun running? Do these functions contain statements that print/log output?

    Steve

  • The first interrupt is the SRIO driver.  The fact that log messages are being sent by the DSP and recorded by the control processor indicates this interrupt is still running.  The second interrupt is the 8K clock and only updates a counter.  When our SYS_abort function is called, the main loop that calls each object keeps running: each object sends a log message back to the NPU, which indicates the driver is still working.  So it is as if the SYS_abort function is never called.  Whereas, if we use the BIOS SYS_abort, the main loop stops, heartbeat messages (from the control CPU) are not answered, and the control CPU instructs the SRIO switch controlling that DSP to reset (reboot) that DSP.

  • David Boles said:
    if we use the BIOS SYS_abort, the main loop stops, heartbeat messages (from the control CPU) are not answered, and the control CPU instructs the SRIO switch controlling that DSP to reset (reboot) that DSP.

    This sounds like the expected behavior.  The default behavior of the BIOS SYS_abort function is to disable interrupts and trap the program.  So you would not see any more output and the program will be forever stuck in UTL_halt.  And you won't see any logging any more since your ISRs are no longer running due to interrupts being disabled.  You can verify this by halting the target at this point.  You should see the PC inside UTL_halt.

    David Boles said:
    When our SYS_abort function is called, the main loop that calls each object keeps running: each object sends a log message back to the NPU, which indicates the driver is still working.  So it is as if the SYS_abort function is never called.

    Is your DspAbort__FPce() function being called in this case?

    If so, then this sounds like the expected behavior as well.  In this case you changed the default behavior of SYS_abort() as follows:

    bios.SYS.ABORTFXN = prog.extern("DspAbort__FPce");

    Which causes SYS_abort to simply call your function - DspAbort__FPce - instead of disabling interrupts and trapping.  Nothing more, nothing less.  It simply calls your function in place of disabling interrupts and trapping.

    It sounds to me that since your function DspAbort__FPce does not disable interrupts, you still see the log statements coming out from the SRIO function cb__22DspSrioDriverTi6455_ClSFv that's mapped to HWI8.  Interrupts are still enabled (because you didn't disable them) and therefore ISRs are still running and logging data.

    ...

    Or maybe I'm misunderstanding.  Maybe what you want/expect is to have  SYS_abort() call your function and also keep its original behavior of disabling interrupts and trapping ...? 

    If that's the case, you would need to call HWI_disable() at the beginning of DspAbort__FPce().  In this case, I would expect a call to SYS_abort to do the following:

    1. call DspAbort__FPce()
    2. DspAbort__FPce() will immediately disable interrupts due to HWI_disable() call
    3. After that point, there should be no more logging coming out
    4. The program will then do register configurations you posted above
    5. Then the program will trap within the infinite while (true == true) loop
    6. At this point, you could verify this by halting the target.  The PC should be stuck/trapped inside your DspAbort__FPce() function
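
    The ordering in the steps above can be sketched as a host-side simulation.  SimCpu, simHwiDisable and abortHandler are hypothetical stand-ins (simHwiDisable plays the role of HWI_disable()), and the trap loop is bounded by a spins parameter so it can run off-target; the real handler would loop forever.

```cpp
#include <cassert>
#include <functional>

// Host-side stand-in for the CPU's global interrupt enable; hypothetical.
struct SimCpu {
    bool globalInterruptEnable = true;
    int isrRuns = 0;

    // Runs the ISR only while interrupts are enabled.
    void raiseInterrupt(const std::function<void()>& isr) {
        if (globalInterruptEnable) {
            isr();
            ++isrRuns;
        }
    }
};

// Stand-in for HWI_disable(): masks all interrupts in the model.
void simHwiDisable(SimCpu& cpu) {
    cpu.globalInterruptEnable = false;
}

// Models the abort handler: optionally disable interrupts first, then spin.
// Returns how many times the logging ISR ran while "trapped".
int abortHandler(SimCpu& cpu, bool disableFirst, int spins) {
    assert(spins >= 0);
    int logged = 0;
    auto loggingIsr = [&logged] { ++logged; };

    if (disableFirst) {
        simHwiDisable(cpu);  // step 2: first thing the handler does
    }
    // Step 4: the SRIO register configuration would go here.

    for (int i = 0; i < spins; ++i) {    // step 5: the trap loop
        cpu.raiseInterrupt(loggingIsr);  // an interrupt arrives while spinning
    }
    return logged;
}
```

    On a fresh SimCpu, abortHandler(cpu, true, 5) returns 0 and abortHandler(cpu, false, 5) returns 5.  Note that the loop posted earlier in the thread also executes CSR |= sCSR_REGISTER, which sets GIE again, so whether interrupts stay masked on the target also depends on that line.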

    Steve

  • I think we are getting off the issue a bit.  Our DspAbort tested OK in several previous firmware releases.  The DSP would go into the while loop and "ping" messages from the control CPU would never get a reply.  This would cause the control CPU to send a reset via the SRIO switch to that DSP and reset it.  The DSP may actually get the ping messages, but since the main loop was not running, a reply was not generated.  Something happened when I upgraded to CCS 5.2 (from CCS 5.1) and also upgraded the BIOS version.  Now, the DSP keeps running when a (test) message is sent that forces a call to SYS_abort (our DspAbort).

    It may not have been clear in this thread, but the goal was that when the DSP "crashed", it would get restarted.  This is the issue of utmost concern.  It is not restarting with the current build.

    In the process of fixing this issue I removed our DspAbort override of SYS_abort.  I cannot say for sure (I did not do the initial development), but it appears that our DspAbort is unnecessary.  The only reason we did this was to leave the SRIO interface enabled so the DSP could be restarted via the SRIO interface. 

  • Hi David,

    Perhaps you are right.  Maybe at this point it would be best to try to figure out which variable during your move from CCSv5.1 -> v5.2 caused the issue.

    Can you try rebuilding your app (in CCSv5.2) with the same components that you used in your CCSv5.1 setup?  Then see if you can get your app back to the behavior you expect?

    I believe CCS v5.1 ships with:

    1. BIOS 5.41.11.38
    2. XDC tools 3.23.01.43
    3. CG tools 7.3.1 (coff and elf)

    If you are able to get your program working as it did before, then you could try to change one of the above variables at a time to see which one may be the culprit.

    You can find BIOS and XDC tools releases here:

    http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/index.html

    Steve