TMS320F280039: Is the stack overflow detection in SPRA820 always effective?

Part Number: TMS320F280039
Other Parts Discussed in Thread: C2000WARE, SYSCONFIG

The application report SPRA820 suggests a method to detect a stack overflow, but I'm doubtful about its real effectiveness.

That application report defines a small window (8 words), monitored by a watchpoint, near the end of the valid stack (lower addresses); see section 3.1 of the document. When code starts writing into the monitored window because the stack pointer has moved that far, an event is triggered to signal it. At first sight this looks like a good enough solution, but I doubt it always holds when calling a function like this:

int foo(int x)
{
    int buffer[100];
    int var = x;
    ...
}

If `buffer` happens to be allocated so that it starts before the monitoring window and ends after it, and it is never completely written, there is no guarantee that the overflow detection will trigger. At the same time, the `var` assignment alone can modify data in an unexpected area (well below the stack area). The function can even return successfully after destroying valuable data, without leaving any trace.
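
The scenario can be modeled on a host PC. The following is only a sketch under stated assumptions: the window address `WATCH_START`, the frame layout, and the helper names `hits_window` and `frame_triggers` are hypothetical and come from neither SPRA820 nor C2000Ware. It treats the watchpoint as "any write to one of 8 consecutive addresses" and treats a stack frame as a list of the offsets the function actually writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watched window: 8 consecutive word addresses, as in SPRA820. */
#define WATCH_START 0x03F0u
#define WATCH_WORDS 8u

/* True when a write to addr lands inside the watched window. */
static bool hits_window(uint32_t addr)
{
    return addr >= WATCH_START && addr < WATCH_START + WATCH_WORDS;
}

/*
 * Models one stack frame located at frame_base. The writes the function
 * performs are given as word offsets from frame_base. Returns true if any
 * of those writes would trip the watchpoint. Space that is reserved but
 * never written does not appear in the list, so it cannot trigger anything.
 */
static bool frame_triggers(uint32_t frame_base,
                           const uint32_t *offsets, size_t count)
{
    assert(offsets != NULL || count == 0u);
    for (size_t i = 0; i < count; i++) {
        if (hits_window(frame_base + offsets[i])) {
            return true;
        }
    }
    return false;
}
```

With a frame at the assumed base 0x03C0 holding a 100-word buffer at offsets 0 to 99 and one more variable at offset 100, writing only offsets 0 and 100 makes `frame_triggers` return false even though the frame spans the whole window. Writing all 101 offsets makes it return true.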

Is this concern as well founded as it looks to me?

  • Hi Luca,

    This app note was created before ERAD support existed. ERAD on F28003x should be easier to use and is better documented in our TRM. We already have an example in our C2000WARE SDK that showcases stack overflow detection using the ERAD module. I would highly recommend referring to that example to get started with stack overflow detection.

    Best,

    Ryan Ma

  • Hi Ryan,
    Both examples, the one in erad_ex3_stack_overflow_detect.c and the SysConfig-based one in erad_ex3_stack_overflow_detect_syscfg.c, behave like the detection method shown in SPRA820, although they are very clear and short and use the driverlib API.

    Adding the two variable declarations from my example to the function recursiveFunction() should again expose the chance of missing the overflow. Note that there must be some access to buffer (say, to its last element) and to var so that they are actually used; otherwise the compiler could optimize them away.
    I'm sorry that I can't test my hypothesis, as I'm in a design analysis phase, without real hardware yet.

    Furthermore, the C2000Ware examples increase the chance of a missed overflow: they use a mask value of zero, so they capture accesses to a single address instead of an 8-word block as in the original. This can be changed, of course, by modifying the mask value.
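
The effect of the mask value can be illustrated with a small host-side model. This is a sketch under an assumption: it supposes that bits set in the mask are ignored when the accessed address is compared with the reference address, which should be checked against the ERAD chapter of the TRM. The function names `watch_match` and `window_words` are hypothetical and are not part of driverlib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed comparator model: an access matches when the address equals the
 * reference in every bit position that is NOT set in the mask.
 */
static bool watch_match(uint32_t addr, uint32_t ref, uint32_t mask)
{
    return ((addr ^ ref) & ~mask) == 0u;
}

/*
 * Number of consecutive addresses covered by a contiguous low-bit mask
 * (0x0, 0x1, 0x3, 0x7, ...). Other mask shapes are rejected.
 */
static uint32_t window_words(uint32_t mask)
{
    assert((mask & (mask + 1u)) == 0u);
    return mask + 1u;
}
```

Under this model a mask of 0x0 watches exactly one address, a mask of 0x7 watches an aligned 8-word block, and a mask of 0xFF watches an aligned 256-word block. The window always starts at the reference address with the masked bits cleared, so the reference must be chosen with that alignment in mind.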

  • Hi Luca,

    Maybe I'm misunderstanding the question. Can you send over your test case that exposes your hypothesis?

    Best,

    Ryan Ma

  • From C2000Ware I imported the example in
      C:\ti\c2000\C2000Ware_5_02_00_00\driverlib\f28003x\examples\erad\application_owned_examples
    named
      erad_ex3_stack_overflow_detect_syscfg

    Then I modified the existing recursiveFunction() as shown in the following code, with comments added for explanation:

    // Dummy, just to justify a global access
    volatile uint32_t dummyValue = 0;

    //
    // recursive function to fill the stack
    //
    void recursiveFunction(uint32_t delay)
    {
        // Reserves space on stack: if not written, it won't trigger
        // a stack overflow
        uint32_t    buffer[100];
        // This assignment alone can already be below the
        // lower stack position
        uint32_t    clobber = delay;
        // When overflowing, it *can* happen to have stack as in:
        //  ;[Stack higher address]
        //   [previous calls, etc.]                     ; +   Allowed stack
        //   [return address, frame pointer, ...]       ; +
        //  FP + buffer + 99           : .space 020h    ; +
        //                               .space ...     ; +
        //  __TI_STACK_END - THRESHOLD : .space ...     ; +   [Watch point]
        //                               .space ...     ; +
        //  __TI_STACK_END             : .space ...     ; +   [Stack ends here!]
        //                               .space ...     ; ///  Forbidden area
        //  FP + buffer + 0            : .space 020h    ; ///
        //  FP + clobber               : .space 020h    ; ///
        //
        //  Here (__TI_STACK_END - THRESHOLD) falls between local buffer start
        //  address and buffer end address, and it's the only address checked
        //  for access by the watchpoint in ERAD.
        //  Any write access to `buffer` lower positions, or to `clobber`
        //  will violate a forbidden area

        functionCallCount++;

        // The following lines are here just to give a meaning to
        // local variables, so that they aren't considered unused;
        // buffer[1] ... buffer[99] are untouched
        buffer[0] = functionCallCount + 1;
        dummyValue = clobber + buffer[0];

        //
        // Recursive function
        //
        recursiveFunction(delay + 1UL);
    }

    The generated assembly listing, which I also saved, seems to confirm my suspicion. Most of the space reserved for `buffer` is never written, so the watchpoint may never trigger while the space below is clobbered.

  • Hi Luca,

    Thank you for providing this. I will take a look into this using the example and consult with a few folks here to see what we can do to update this example if need be.

    Best,

    Ryan Ma

  • Hi Luca,

    I have been able to replicate your findings and the code ends up in the ITRAP ISR. 

    I will consult with another colleague of mine to see what we can do for this case and if there is a solution.

    Best,

    Ryan Ma

  • Hi Luca,

    At best, ERAD can detect writes within a given window, for example 0x800-0x8FF, by using the MASK. On C28x there is no ERAD solution for detecting larger jumps.

    When you allocate an array in your recursive function, it is best to write to each address in that array so that ERAD gets its trigger.

    Best,

    Ryan Ma
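
The recommendation to write to each address of a local array could look roughly like the helper below. It is a minimal sketch, not code from the C2000Ware example: the name `touch_all` is hypothetical, and the volatile qualifier is used on the assumption that the compiler would otherwise be free to drop stores to a buffer that is never read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Writes every element of a buffer through a volatile pointer so that the
 * stores are not optimized away. Returns the number of elements written.
 */
static size_t touch_all(volatile uint32_t *buf, size_t count, uint32_t fill)
{
    assert(buf != NULL || count == 0u);
    for (size_t i = 0; i < count; i++) {
        buf[i] = fill;
    }
    return count;
}
```

In recursiveFunction() this would be called as `touch_all(buffer, 100, 0)` right after the declaration of `buffer`. The cost is one store per element on every call, which may matter for large arrays or deep recursion.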