Ideas to prevent HWI stack overflow?

Hi,

Scenario:

  • I used Hwi_create to trigger my interrupt service routine (RXISR) every time the board receives an Ethernet packet.
  • Hwi params: eventId is the EMAC RX event, maskSetting is Hwi_MaskingOption_ALL (also tried with SELF)
  • This is the only interrupt configured

Problem:

When the device is hit with a packet burst, it stops on an ISR stack overflow. In ROV, hwiStackPeak always reaches hwiStackSize no matter how large I make the program stack.

Setup:

  • C6472
  • BIOS 6.34.02.18
  • NDK 2.22.3.20

Question:

I believe the SELF or ALL mask setting should prevent the ISR from re-entering until the function returns. As a test case, I added a rxcount++ at the beginning of my RXISR, and a rxcount-- at the end. For normal operation, rxcount will be 0 (or 1, if you're lucky enough to hit pause in the ISR). In the case of a packet burst, I can see the rxcount go up to as high as ~100 when it crashes. Is there a way to prevent the interrupt service routine from being executed multiple times?
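
The counter test described above can be sketched roughly as follows. This is a minimal host-side sketch, not the actual driver code: the names rxDepth, rxPeak, rx_isr_enter and rx_isr_exit are hypothetical. Recording the peak as well as the current depth means the worst case is still visible after the burst is over.

```c
#include <stdint.h>

/* Hypothetical instrumentation; these names are illustrative only. */
static volatile uint32_t rxDepth = 0;  /* current nesting depth of the ISR */
static volatile uint32_t rxPeak  = 0;  /* highest depth observed so far    */

/* Call as the first statement of the ISR. Returns the depth after entry;
 * any value above 1 means the ISR was re-entered before it returned. */
static uint32_t rx_isr_enter(void)
{
    uint32_t depth = ++rxDepth;
    if (depth > rxPeak) {
        rxPeak = depth;
    }
    return depth;
}

/* Call as the last statement of the ISR, on every exit path. */
static void rx_isr_exit(void)
{
    --rxDepth;
}
```

With this in place, rxPeak staying at 1 would mean the ISR never nests, while a value near 100 would match the behaviour described here.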

I'm more or less DDoSing my device via hping3 --flood or by opening multiple TCP connections to stream data. I'm open to any ideas.

Thanks,

-Ivan

  • Ivan,

    I'm not sure why you are creating the interrupt every time with Hwi_create().
    Generally speaking, an interrupt handler should be registered only once, during system initialization.
    Did you mean that Hwi_delete() is also called after each interrupt completes?
    If you do create the Hwi every time, it also has to be deleted; otherwise the heap memory used by the Hwi objects will run out.

    If your ISR is being executed multiple times, it may contain some heavy processing. In that case, please consider moving the work to a tasklet (Swi).
    If your ISR is actually very light, you could recheck the interrupt status after the ISR processing completes.
    The pseudo code might look like the following.
    I'm assuming the Ethernet peripheral has a status register whose bits the user clears by writing 1 to them:

    Void ISR(UArg arg)
    {
        /* keep servicing while an RX-completed event is still pending */
        while (*interrupt_status_reg & RX_COMPLETED_MASK) {
            /* clear interrupt status (write 1 to clear) */
            *interrupt_status_reg = RX_COMPLETED_MASK;

            /* do processing */
            .....
        }
    }
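
    To make the recheck loop above concrete, here is a host-side sketch that replaces the real status register with a mock. Everything here is an assumption for illustration: the mock, the bit position of RX_COMPLETED_MASK, and in particular RX_BUDGET, which is not part of the pseudo code above. The budget caps how many events one invocation handles, so a flood cannot keep the ISR running indefinitely.

```c
#include <stdint.h>

#define RX_COMPLETED_MASK 0x1u  /* hypothetical bit position */
#define RX_BUDGET         8u    /* hypothetical cap per ISR invocation */

/* Host-side stand-ins for the hardware. On a target these would be
 * register accesses rather than plain variables. */
static uint32_t mockStatus = 0;  /* simulated interrupt status register */
static uint32_t mockQueued = 0;  /* packets waiting in the simulated EMAC */
static uint32_t processed  = 0;  /* packets handled so far */

static uint32_t status_read(void) { return mockStatus; }

/* Write-1-to-clear semantics: every bit set in 'bits' is cleared. */
static void status_write1_clear(uint32_t bits) { mockStatus &= ~bits; }

/* Simulated arrival of n packets: queue them and raise the flag. */
static void mock_receive(uint32_t n)
{
    mockQueued += n;
    if (mockQueued > 0) {
        mockStatus |= RX_COMPLETED_MASK;
    }
}

/* Handle one packet; the mock re-raises the flag if more are waiting. */
static void process_one_packet(void)
{
    if (mockQueued > 0) {
        mockQueued--;
    }
    processed++;
    if (mockQueued > 0) {
        mockStatus |= RX_COMPLETED_MASK;
    }
}

/* One ISR invocation: clear, process, recheck, up to RX_BUDGET times.
 * Returns the number of events handled. */
static uint32_t rx_isr(void)
{
    uint32_t handled = 0;
    while ((status_read() & RX_COMPLETED_MASK) && handled < RX_BUDGET) {
        status_write1_clear(RX_COMPLETED_MASK);
        process_one_packet();
        handled++;
    }
    return handled;
}
```

    When the budget is exhausted the status bit is left set, so on real hardware the interrupt would simply fire again once the ISR returns.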

    Kawada

  • Kawada,

    I apologize if my wording made it sound confusing. Hwi_create is only done once. I meant to say that the ISR is triggered by seeing a packet on the wire.

    Polling on a status bit is not very effective either (I originally tried it with a semaphore). The problem is that when my board sees a burst of packets, multiple RXISRs are being executed (and multiple contexts are being pushed onto the program stack). Ideally, this shouldn't happen with the SELF or ALL mask option, but this might be a sort of race condition where the interrupt triggers faster than the mask is set.

    I'll play around with the priorities and Swi. Thank you for your input.

    -Ivan

  • Ivan,

    This thread definitely got my attention, especially since we've seen problems with NDK apps on some h/w not being able to recover after a ping flood (in particular the DM648).  I wonder if the same thing could be happening on that h/w.

    Reviewing the Hwi documentation (SYS/BIOS user guide) I see the following:

    All hardware interrupts run to completion. If a Hwi is posted multiple times before its ISR has a chance
    to run, the ISR runs only one time. For this reason, you should minimize the amount of code performed
    by a Hwi function.

    If interrupts are globally enabled—that is, by calling Hwi_enable()—an ISR can be preempted by any
    interrupt that has been enabled.

    It sounds like you are NOT seeing this behavior.  Is that right?

    Steve

  • Steve,

    Correct; it seems like if a Hwi is posted multiple times very quickly, the ISR is pushed to the call stack multiple times too.

    I'll continue investigating to see if there are any more clues.

    -Ivan

  • Did you confirm whether the corresponding IER bit is still enabled inside your ISR?

    If the dispatcherAutoNestingSupport property of the Hwi module is true, the appropriate bits should be masked off (disabled)
    just before the user's ISR callback, and then restored just after the user's ISR completes.

    I think you can check the dispatcherAutoNestingSupport setting in ROV.
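
    The mask-then-restore sequence described above can be modelled on a host as follows. This is only a toy model under stated assumptions, not the SYS/BIOS dispatcher: ier and ifr are plain variables standing in for the enable and pending-flag registers, RX_BIT is an arbitrary bit, and post, service and simulate are hypothetical names. It compares the nesting depth when the interrupt masks its own bit with the depth when nothing is masked.

```c
#include <stdint.h>

#define RX_BIT (1u << 4)       /* hypothetical interrupt bit */

static uint32_t ier;           /* simulated interrupt enable register */
static uint32_t ifr;           /* simulated pending-flag register */
static uint32_t disableMask;   /* bits masked while the handler runs */
static uint32_t burst;         /* posts still to arrive during the handler */
static uint32_t depth, peakDepth, runs;

static void post(uint32_t bit);

/* Simulated user ISR: the rest of the burst arrives while it is running. */
static void rx_handler(void)
{
    runs++;
    while (burst > 0) {
        burst--;
        post(RX_BIT);
    }
}

/* Simulated dispatcher: mask, call the handler, restore, then repeat
 * if the flag was latched again in the meantime. */
static void service(uint32_t bit)
{
    while ((ifr & bit) && (ier & bit)) {
        uint32_t saved = ier;
        ifr &= ~bit;             /* flag is consumed on entry */
        ier &= ~disableMask;     /* mask before the callback */
        if (++depth > peakDepth) {
            peakDepth = depth;
        }
        rx_handler();
        depth--;
        ier = saved;             /* restore after the callback */
    }
}

/* Simulated hardware event: latch the flag; it is taken at once only
 * if the enable bit is currently set. */
static void post(uint32_t bit)
{
    ifr |= bit;
    service(bit);
}

/* Reset the model, fire one interrupt followed by burstSize more posts
 * during the handler, and return the peak nesting depth. */
static uint32_t simulate(uint32_t mask, uint32_t burstSize)
{
    ier = RX_BIT;
    ifr = 0;
    depth = 0;
    peakDepth = 0;
    runs = 0;
    disableMask = mask;
    burst = burstSize;
    post(RX_BIT);
    return peakDepth;
}
```

    In this model, simulate(RX_BIT, 5) never nests and the five extra posts collapse into a single additional run, whereas simulate(0, 5) nests once per post. If the real system behaves like the second case, the mask is probably not being applied as expected.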

    Kawada

  • Hi Ivan, hi Kawada,

    I found this thread and I am facing exactly this issue. I am using NDK version 2.24.02.31. What was your conclusion about this issue, and how did you resolve it?

    I also tried playing around with priorities, stack size, interrupt masking, and the source code of the NDK, EMAC driver, and NIMU layer, but nothing had any effect.

    Thanks and regards,
    Viet
  • Can you please open a new thread? Include versions you are using and the device.

    Todd
  • Thanks Todd.

    We found that the problem was that the interrupt-handling function in the EMAC driver and NIMU Ethernet layer was too long. After we optimized it, it is now working well.
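
    One common way to shorten an interrupt handler, not necessarily what was done here, is to have the ISR only record the packet and leave the heavy work to a deferred context such as a Swi or Task. A minimal host-side sketch of that split, with a hypothetical ring buffer and hypothetical function names:

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 16u   /* hypothetical queue depth */

static volatile uint32_t head = 0;  /* advanced by the ISR side only */
static volatile uint32_t tail = 0;  /* advanced by the deferred side only */
static uint32_t ring[RING_SIZE];
static uint32_t dropped = 0;        /* packets rejected because the ring was full */

/* ISR side: record the packet handle and return immediately.
 * When the ring is full the packet is dropped instead of blocking. */
static bool rx_isr_enqueue(uint32_t pktId)
{
    if (head - tail == RING_SIZE) {
        dropped++;
        return false;
    }
    ring[head % RING_SIZE] = pktId;
    head++;
    return true;
}

/* Deferred side: drain everything queued so far and return the count.
 * The expensive protocol processing would go inside this loop. */
static uint32_t rx_deferred_drain(void)
{
    uint32_t n = 0;
    while (tail != head) {
        uint32_t pktId = ring[tail % RING_SIZE];
        (void)pktId;  /* placeholder for the real per-packet work */
        tail++;
        n++;
    }
    return n;
}

/* Host-side helper: simulate a burst of n packets hitting the ISR.
 * Returns how many were accepted. */
static uint32_t rx_burst(uint32_t n)
{
    uint32_t accepted = 0;
    for (uint32_t i = 0; i < n; i++) {
        if (rx_isr_enqueue(i)) {
            accepted++;
        }
    }
    return accepted;
}
```

    Dropping packets under a flood keeps the time spent in the ISR bounded, so stack usage stays bounded too; the cost is lost packets, which the network stack has to tolerate anyway.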

    Viet