Stack Overflow when using UART + DMA



Hey guys,

I seem to be getting a Stack Overflow error when using HalUARTWrite(1,"hello",5);

The data writes fine, and I can read the 'hello' on a pc, but I immediately get an IAR popup.

Warning: Possible IDATA stack overflow detected.

The stack IdataStack is filled to 100% (192 of 192 bytes). The warning threshold is set to 90%.

I've checked my project properties to see my allocated sizes:

Project>Properties>General>Stack/Heap

IDATA 0xC0
PDATA 0x00
XDATA 0x280 


I'm assuming this has to do with an overflow caused by the DMA, but I have no way of debugging it.

  • Hi Jonathan,

    These IdataStack warnings can come around for no apparent reason sometimes, especially if you are using a battery.

    A couple of things though.. C-strings actually have a zero byte at the end, so the length is 6 in this case - not that it should matter. Could you also try placing the string in a variable and pass the variable to HalUARTWrite? Again, it shouldn't matter, but crazy things happen with the optimization sometimes.

    When do you get the popup? During a breakpoint that occurs later, or without any breakpoints?

    Best regards,
    Aslak 
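The length point above can be sketched on a PC as follows. This is only an illustration: `uart_write_stub` is a hypothetical stand-in for `HalUARTWrite` (it just returns the length it was given), and `msg`, `send_text_only` and `send_with_terminator` are names made up for this example. It shows the string placed in a variable and the difference between sending the text alone and sending the text plus its terminating zero byte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for HalUARTWrite so the sketch builds on a PC.
   It ignores the port and data and reports the requested length. */
static uint16_t uart_write_stub(uint8_t port, const uint8_t *buf, uint16_t len)
{
    (void)port;
    (void)buf;
    return len;
}

/* The string lives in a variable instead of being passed as a literal. */
static const uint8_t msg[] = "hello";   /* 5 characters plus '\0' */

/* Send only the visible characters: strlen() stops before the '\0'. */
uint16_t send_text_only(void)
{
    return uart_write_stub(1, msg, (uint16_t)strlen((const char *)msg));
}

/* Send the terminator as well: sizeof() counts the '\0' of the array. */
uint16_t send_with_terminator(void)
{
    return uart_write_stub(1, msg, (uint16_t)sizeof(msg));
}
```

Either length is valid for a write call; which one is wanted depends on whether the receiver expects the zero byte.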

  • Hey Aslak,

    So a few things, my device is powered via usb instead of the battery.  

    I also tried multiple messages to transmit.  Specifically:

    unsigned char msg;
    HalUARTWrite(1,&msg,1);

    ...and I still get the popup.

    I do not have any breakpoints set, and the popup appears right after the write function. I know this because if I set a breakpoint on HalUARTWrite and step over it, I'll get the popup there.

    Also, if I set a breakpoint in HalUARTWrite and single step through it, the popup comes right after the return statement.

    Also, I am unable to transmit a message greater than 14 characters... which may or may not be related.

  • Hi Jonathan,

    So, the IDATA is used for call stack as far as I know. But an error may also be falsely triggered by a) reset via brownout, unhandled interrupt or otherwise b) communication loss between debugger and chip.

    Could you perhaps click inside the Disassembly window and step by each instruction near where it fails to find out exactly which instruction causes the error?

    Also, you can click on View -> Memory -> IData. You may have to select something else and then IData again. Bytes marked 0xCD have never been in use.
    Also, View -> Stack will give you an overview of how much is used. Keep an eye on these while debugging.

    Also, if you look in the memory view, the register view and the disassembly view when it fails, a communication error will show up as 0xFF in almost all values in these windows. This may indicate a reset, or other unexpected action.

    Best regards,
    Aslak 
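The "bytes marked 0xCD have never been in use" check can also be done by hand on a memory dump. The sketch below is an assumption-laden illustration, not the debugger's own algorithm: it assumes the stack region was pre-filled with 0xCD and that the stack grows upward from its base, as on 8051 cores. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xCDu   /* fill value for never-used bytes, per the thread */

/* Return the high-water mark in bytes: scan down from the top of the
   region and stop at the first byte that no longer holds the fill value.
   Assumes the stack grows upward from base. */
size_t stack_bytes_used(const uint8_t *base, size_t size)
{
    size_t i = size;
    while (i > 0 && base[i - 1] == STACK_FILL) {
        i--;
    }
    return i;
}

/* Same figure as a whole-number percentage of the region (rounded down). */
unsigned stack_percent_used(const uint8_t *base, size_t size)
{
    if (size == 0) {
        return 0;
    }
    return (unsigned)((stack_bytes_used(base, size) * 100u) / size);
}
```

One limitation: if the topmost bytes actually pushed happen to equal 0xCD, this undercounts slightly, so treat the result as an estimate.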

  • Hi Aslak, Thank you for your very thorough reply. I will try each of those troubleshooting techniques you suggested.

    In the meantime, can you explain a little more about your first statement regarding "unhandled interrupts".  I currently have the following interrupts set, but I'm not sure what interrupts get fired when the hal_uart_dma write gets called.

    My concern is that I have conflicting interrupts here that cause a chain reaction, causing the call stack to go berserk.

    IEN2 =  0x0F = 0b00001111 (RFERRIE & ADCIE & URX0IE & URX1IE)
    IEN1 = 0x25 = 0b00100101 (DMAIE & T2IE & P0IE)
    IEN0 =  0x8C = 0b10001100 (UTX0IE & UTX1IE)
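When checking register values like these, it can help to list the set bit positions first and only then look up each position's name. The helper below is a small hypothetical sketch that does just the first step. It deliberately does not name any bits: the mapping from bit position to interrupt-enable name should be taken from the register tables in the chip's user's guide.

```c
#include <stdint.h>

/* Store the positions of the set bits of an 8-bit register value in
   positions[], most significant bit first, and return how many were set. */
int set_bit_positions(uint8_t value, int positions[8])
{
    int count = 0;
    for (int bit = 7; bit >= 0; bit--) {
        if (value & (1u << bit)) {
            positions[count++] = bit;
        }
    }
    return count;
}
```

For the values quoted above, 0x0F has bits 3, 2, 1 and 0 set, 0x25 has bits 5, 2 and 0 set, and 0x8C has bits 7, 3 and 2 set.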

  • Why are interrupts enabled for both UARTs? All of the sample apps use one UART or the other, never both at the same time. It is possible to do, but it must be done right. Are you trying to run both UARTs?

    Why is Rx ISR enabled at all - the UART by DMA module uses the DMA exclusively to handle Rx and the ISR should not be enabled.

     

  • So in the future, I plan to use both UARTs, but for this test I don't need both. I'll turn that off.

    I will also turn off Rx ISR. I think at this point I was getting desperate and flipping bits on and off to see if the problem would vanish.

    I'll report my findings.  Thanks for your comments. 

  • Hey guys,

    I've got some interesting results here.

    I removed unnecessary interrupts from IEN registers and ran again.  

    I set a breakpoint at the HalUARTWrite() function, and before it's called, the memory is quite normal.  As I single step through it, there are some changes to the memory, but nothing unusual.  I continue single stepping until I get to OSAL.c: osal_run_system(void).  At this point, the UART data has still not been sent to the port. I can single step all the way through this function, and once I get to the end and the debugger exits the OSAL function, I get the stack overflow popup.

    If I halt execution while the popup is up, I see some very interesting things in the memory.

    It looks like some of the profile characteristics have moved in there. Is this normal?

    Another thing I tried was to look at the disassembly window while the stack overflow popup occurs.  This shows the cursor inside the RF_NormalIsr function. Specifically A2 E7 MOV C, A.7 

    Can there be some sort of conflict between UART1 Alt1 and the core bluetooth functionality?

  • Hi Jonathan,

    This is not, in fact, normal. Does the IData memory look like this at the beginning? At what point does it go haywire?

    It's possible that you have somehow filled up the XData memory through some bug or other in the code, as IData overlaps the last 256 bytes of XData. This is made more plausible by the fact that the call stack pointer points to 0x4C, which is a reasonable spot right in the beginning/middle of IData.

    BR,
    Aslak 

  • Hey Aslak,

    To answer your questions above: Things go haywire at the end of  OSAL.c: osal_run_system(void)...once the cursor exits this function.

    So I spent plenty of time this weekend figuring out what is going on here. (And to be honest, I'm still scratching my head) I started from the beginning, using the original keyfob project and original hal_uart driver.  I made the slight modifications to the hal driver that would allow me to set up comms on uart 1, alt 1, and it worked.

    Then I would go and make a minor modification to hal_board_cfg.h (such as modifying LED 3's port to 1.2), and boom.... Stack Overflow.  I'd go back and undo the minor change I just made, and the stack overflow would still be present.  It was as if something got compiled because of the change I made and could not be undone.

    I went through 4-5 iterations of this process, trying to narrow down where this problem was being introduced.  Each time I thought I had narrowed it down, I hadn't. The only way to recover from the stack overflow was to copy the original keyfob project and hal drivers and start again...what a pain!

    Currently, I have uart working, but I don't dare to touch the hal_board_cfg file because I know something terrible is going to happen.

    Are there any known problems where a file that has not been touched is not compiled by IAR and its original compiled version is used instead? And then, once you touch that file, it recompiles, causing a problem like this to appear?

  • Hi Jonathan,

    Well. It sure sounds a bit strange. Something that helps sometimes is Project->Clean, which deletes all the intermediate object files, forcing a complete recompilation. It can also help to hold down the reset button on the CC Debugger, or even take out the battery of the device, to make sure that the RAM is completely reset. Just to eliminate things.

    Best regards,
    Aslak 

  • Thanks Aslak, I'll continue to monitor this situation and post my results as I find them.  

  • Both of you guys are getting wrapped around the axle and losing the forest for the trees ... and getting quite superstitious as well. There are no known problems with changing things in hal_board_cfg.h (as long as they are consistent with the actual target h/w, which I assume is a custom board, not the TI sample keyfob itself, right?) and in general, the sample applications are not 'fragile' in the sense that changing one thing upsets an unstable equilibrium and everything falls apart, etc.

    So, it seems like there could be a Vdd problem on the customer h/w: driving Tx causes a bus voltage sag such that the CC254x either resets outside of the debugger's control or browns out and comes back, and thus the debugger loses sync with the h/w and claims a stack overflow.