Possible race condition between SWIs

Hi folks,

I am debugging a very hard-to-reproduce issue, and after a long analysis I think it is tied to a SWI that is not being executed on time. Please see the description below:

  • DSP/BIOS version 5.33.04
  • DSP 64P (on OMAP3430)
  • mySWI (dynamically created) with priority of 8 and initial mailbox value of 0x1 is posted from a HWI as follows:

               SWI_andn(mySWI, 0x1);

  • sleep_SWI (dynamically created) with priority of 8 is posted from the idle task (TSK object of priority 0) as follows:

              SWI_or(sleep_SWI,sleepCode);

sleep_SWI is intended to put the DSP to sleep via a call to PWRM_sleepDSP(sleepCode, sleepArgs, 0); we return only when an OMAP mailbox interrupt pulls us out of the sleep state.

When the application hangs, I connect JTAG and CCS and can see that the running SWI is sleep_SWI. I guess the DSP is sleeping, waiting for the OMAP mailbox interrupt; CCS also shows the program counter near the __PWR_doIdle assembly symbol.

My theory is that sleep_SWI is the first SWI to be posted (I mean put in the scheduler). Then the HWI occurs, mySWI is also scheduled, and the HWI ends. The DSP/BIOS scheduler then takes control and runs sleep_SWI (because it was posted first), which puts the DSP to sleep. mySWI never executes, because sleep_SWI is running and nothing can preempt it (only an OMAP mailbox interrupt can wake the DSP). In this situation the application hangs.

Is the above scenario possible?

If so, and I am able to send the OMAP mailbox interrupt, will the DSP wake up and the scheduler execute the pending SWI, mySWI?

Is it possible to know whether other SWIs are already in the scheduler before putting the DSP to sleep?

I would appreciate your input; any idea can help.

Regards,

Armando.

 

  • You are correct that SWIs of the same priority will execute in FIFO order.  If you want "mySWI" to execute rather than "sleep_SWI" when both are posted, this to me is the very definition of "higher priority".  Why not make "mySWI" a higher priority, or make "sleep_SWI" a lower priority?

  • Thanks Brad,

    Your comment makes sense, but sleep_SWI was created with the same priority to give it some autonomy, since we also want to save power.

    Is it possible to know whether mySWI has already been scheduled while sleep_SWI is running? If so, could I just increase the priority of mySWI so that it preempts sleep_SWI?

    I think the root cause of the issue is that sleep_SWI must not be posted just before mySWI is posted, because if this occurs mySWI will never execute.

    Regards,

    Armando.

     

  • Make sure you are using the BIOS dispatcher when posting the SWI from the HWI.

    armando said:
    Is it possible to know whether mySWI has already been scheduled while sleep_SWI is running? If so, could I just increase the priority of mySWI so that it preempts sleep_SWI?

    Not that I know of.  That looks to be pretty deep down into the kernel.  Maybe someone else will have a tip to get the info.

    armando said:
    I think the root cause of the issue is that sleep_SWI must not be posted just before mySWI is posted, because if this occurs mySWI will never execute.

    Does it never execute, or is it just delayed for a really long time until you wake up due to your interrupt?

  • I am wondering whether, for example, reading the status of the SWIs can give me a clue about which ones have already been put in the scheduler queue. I mean:

    Does the Inactive status mean the SWI is in the scheduler, pending execution? What does each status mean?

    Actually, I am thinking of the following test to make sure that mySWI was in fact put in the scheduler:

    In the HWI that posts the SWI:

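    HWI_handler()
    {
       SWI_andn(mySWI, 0x1);   /* post mySWI */
       post = 1;               /* global debug flag: a post was requested */
    }

    Then in the SWI:
    mySWI()
    {
       post = 0;               /* debug flag: mySWI has started running */
       . . .
    }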

    Thus, when the hang occurs, I can connect with CCS and inspect the global variable post: if it is 0, mySWI has executed; if it is 1, mySWI has been put in the scheduler but has not run yet.

    Is the above assumption correct?

    Another thing I have observed is that when the interrupt is finally sent and wakes up the DSP, mySWI is not executed; instead, a TSK of priority 7 that was in the blocked state runs. Does this mean that for some reason mySWI was never posted?

    Regards,

    Armando.