Strange behavior of code execution



Hello dear fellow developers,

Let me first describe my situation:

In my code I have several things going on:

1. There is an infinite while loop in my main() function. After hardware and interrupt initialization, execution enters the loop and stays there, outputting some data to the LCD roughly 10 times per second (I use a delay for this).

2. At the same time I have two interrupts running. One is a general-purpose timer (GPT) interrupt, executed every 100 ms, which scans the keyboard and does some other small tasks. The other is int1_isr, which is triggered by the external ECG daughterboard (SPI reads, filtering, etc.).

After a random amount of time (sometimes 20 minutes, sometimes only a couple of seconds) one of the following occurs:

  • The program counter somehow exits the while loop in main(). The two interrupts continue to execute properly: I still have valid data coming from the ECG board, and I still have control over the keyboard from the other interrupt. Or:
  • The GPT timer interrupt gets disabled on its own. The while loop is still executed, but I lose the keyboard, which is scanned in the timer interrupt. As far as I can see, the interrupt enable register IER0 gets overwritten somehow, but since that is a CPU register and not located in memory, how is that possible? Can the code overwrite that register by writing to some memory location?

Could anyone explain what could be causing this strange behavior? I suspected that the stack somehow gets corrupted, so I tried debugging using the methods described here, but couldn't find anything suspicious. The funny thing is that only one part of the code gets corrupted, while the others continue to execute properly.

I'm not using DSP/BIOS.

Please help, this is driving me nuts!

  • DSP_KungFu_Master,

    You don't say which C5000 DSP you are using.  Can you tell us? Different DSPs in the family have different characteristics, which can affect how a problem such as yours could be debugged.

    Generically, it sounds possible that your problem occurs on return from an interrupt.  You don't say whether INT1 ever goes wrong.  If not, I would start looking there.  Make sure that you have complete context save/restore implemented.

    When you find the code has left the while() loop in main(), does it consistently go to the same place in program space?  It seems strange that, if you jump to a random place in the code space, the DSP doesn't eventually get into a bad state.

    Please provide some additional information so we can pursue this further.

    Regards.

  • Sorry, I forgot to mention: I'm using the C5505 DSP.

    I'm not sure what you mean by "complete context save/restore". My function is written in C and is defined as interrupt void int1_isr(). I thought that this way everything would be restored automatically on exit from the function. (Actually, I'm using almost the entire code from your ECG daughterboard example, which comes with the board.)

    Concerning your second question: after the code leaves the while() loop in main(), it jumps to a place in program space where it gets stuck in a loop. I had not noticed this before, but the loop is in the free() function in memory.c from the DSP library, and execution stays there forever (memory.c, line 348):

     

        while (next < sysblock) {
            prev = next;
            next = next->next_free;
        }

     

    In case it is relevant: I allocate memory in an interrupt using malloc() and release it in another part of my code, so maybe that could be the source of the problem?

     

    Thanks,

     

  • Thanks for the clarification on your device and system.

    I was suggesting looking at context save and restore because it is a common problem.  Since you are using the interrupt keyword and not using DSP/BIOS, I agree with you that context save/restore should not be a problem.  Just to verify, are you also using the interrupt keyword with the timer interrupt?

    Did you ever see any problems with the ECG code before you modified it?  If not, then best to concentrate on your modifications. 

    One thing that comes to mind is your use of malloc in an interrupt routine.  Which routine is it?  Malloc is a fairly expensive function on a real-time system, which is why DSP/BIOS includes memory management features that are designed to be very efficient.  If we start by suspecting that something related to malloc() is causing the problem, then a few questions come up:
    1. How big is your heap?  Is there any possibility that you have some type of memory problem where the heap overflows into some other data space?
    2. Do you allow nested interrupts in your code?  This is usually a bad practice for real-time code.
    3. Is it possible that you are overrunning your interrupt timing?  Usually the only consequence is missed data, but since you do not free the memory in the same place where the malloc occurs, some contention may be happening.

    Let me know if this leads anywhere.

    Regards.

  • Hello,

     

    I did some additional debugging, and this is what I found. I added a printout for every malloc() and free() call I make (please take notice of the text in red):

     

     

    12:35:33 Malloc pointer 191c4
    12:35:33 Malloc pointer 191d4
    12:35:33 Free pointer 191c4
    12:35:33 Free pointer 191d4
    12:35:33 Malloc pointer 191c4
    12:35:33 Free pointer 191c4
    12:35:33 Malloc pointer 191c4
    12:35:33 Malloc pointer 191d4
    12:35:33 Free pointer 191c4
    12:35:33 Free pointer 191d4
    12:35:34 Malloc pointer 191c4
    12:35:34 Free pointer 191c4
    ...........
    12:35:40 Malloc pointer 191c4
    12:35:40 Malloc pointer 191c4
    12:35:40 Free pointer 191c4
    12:35:40 Free pointer 191c4

     

     

    It looks like malloc() sometimes returns the same address twice, and when I try to release it for the second time the program crashes (execution is left stuck in the free() function in memory.c).

    What can be the cause of such behavior? How can I resolve this?

    Any ideas will be appreciated...

     

    Thank you,

  • It looks like the mallocs are in two different ISRs. Is this correct?

     

    malloc() may not be thread safe.
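
    To illustrate how a non-reentrant allocator could hand out the same address twice, here is a toy sketch. It is not the TI runtime's malloc(); the fixed two-block pool and all names (toy_init, toy_alloc, and so on) are made up for the example. The allocation is split into two steps so that an "interrupt" can be simulated between them.

```c
#include <stddef.h>

/* Toy fixed-block allocator, NOT the real runtime library. It only shows
   how an unprotected free list can hand out one block twice. */
typedef struct Block { struct Block *next_free; } Block;

static Block  pool[2];
static Block *free_head;

/* Put both blocks on the free list. */
void toy_init(void)
{
    pool[0].next_free = &pool[1];
    pool[1].next_free = NULL;
    free_head = &pool[0];
}

/* Step 1 of an allocation: pick the block at the head of the free list. */
Block *toy_alloc_begin(void)
{
    return free_head;
}

/* Step 2: unlink the chosen block from the free list. */
void toy_alloc_finish(Block *b)
{
    free_head = b->next_free;
}

/* A complete, uninterrupted allocation. */
Block *toy_alloc(void)
{
    Block *b = toy_alloc_begin();
    if (b != NULL) {
        toy_alloc_finish(b);
    }
    return b;
}

/* Simulates an interrupt firing between step 1 and step 2 of another
   allocation. Returns 1 if both callers received the same block. */
int toy_demo_duplicate(void)
{
    Block *interrupted, *from_isr;

    toy_init();
    interrupted = toy_alloc_begin();  /* first caller has read the head... */
    from_isr    = toy_alloc();        /* ...ISR runs a whole allocation    */
    toy_alloc_finish(interrupted);    /* first caller resumes, stale head  */
    return interrupted == from_isr;
}
```

    In this toy, toy_demo_duplicate() reports that both callers got the same block, which resembles the repeated "Malloc pointer 191c4" lines in the log. Whether the real library fails in exactly this way is an assumption.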

     

    Disable interrupts around the calls to malloc() and free().
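
    A minimal sketch of that suggestion follows. The irq_lock() and irq_unlock() hooks are hypothetical placeholders: on the target they would mask and unmask interrupts (for example via the INTM bit), but here they only count nesting so the sketch can be compiled and run on a host. The wrapper names guarded_malloc() and guarded_free() are also made up.

```c
#include <stdlib.h>

/* HYPOTHETICAL hooks: on the target these would mask and unmask
   interrupts. Here they only track nesting depth so the sketch runs
   on a host machine. */
static int irq_lock_depth = 0;

static void irq_lock(void)   { irq_lock_depth++; }
static void irq_unlock(void) { irq_lock_depth--; }

/* Lets a caller check that every lock was matched by an unlock. */
int irq_lock_depth_now(void) { return irq_lock_depth; }

/* malloc() with the heap update protected from interruption. */
void *guarded_malloc(size_t size)
{
    void *p;

    irq_lock();
    p = malloc(size);
    irq_unlock();
    return p;
}

/* free() with the heap update protected from interruption. */
void guarded_free(void *p)
{
    irq_lock();
    free(p);
    irq_unlock();
}
```

    On real hardware the unlock step should probably restore the previous interrupt state rather than enable interrupts unconditionally, since these wrappers may be called from inside an ISR where interrupts are already masked. Keep in mind that malloc() and free() can take a while, so interrupt latency grows accordingly.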

  • Another recommendation is to take malloc and free out of the ISRs and put them in the while loop.  Simply have the ISR set a flag indicating that memory management is required, along with whatever information is needed to malloc or free.
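
    The flag approach could look roughly like the sketch below. All names (isr_request_buffer, service_memory_requests, current_buffer) are hypothetical, and a single outstanding request is assumed; a real design may need a queue and must cope with the ISR not having its buffer until the main loop has run.

```c
#include <stdlib.h>

/* Request posted by the ISR and serviced by the main loop.
   All names here are hypothetical. */
static volatile int    alloc_requested = 0;
static volatile size_t alloc_size      = 0;
static void           *work_buffer     = NULL;

/* Called from the ISR: records what is needed, never touches the heap. */
void isr_request_buffer(size_t size)
{
    alloc_size      = size;
    alloc_requested = 1;
}

/* Called once per pass of the main while loop.
   Returns 1 if a request was serviced, 0 if there was nothing to do. */
int service_memory_requests(void)
{
    if (!alloc_requested) {
        return 0;
    }
    alloc_requested = 0;

    /* All heap traffic happens here, in one context only. */
    free(work_buffer);               /* free(NULL) is a harmless no-op */
    work_buffer = malloc(alloc_size);
    return 1;
}

/* Buffer currently owned by the main loop (NULL until first request). */
void *current_buffer(void)
{
    return work_buffer;
}
```

    In the main loop this would be called as service_memory_requests() on every pass, before the LCD update. Because only the main loop ever calls malloc() and free(), the allocator is never re-entered.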

  • Hi Tommy

    Are malloc and free thread safe?

    Here is some useful code to help you see where execution is at any given time:

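    /* Toggles the XF output pin on every call so the execution path can be
       watched on a scope. BOOL and FALSE are assumed to come from the
       project's own headers. */
    void ToggleXf(void)
    {
        static BOOL bState = FALSE;   /* last level driven onto XF */

        if (bState)
        {
            asm("    BCLR XF");       /* drive XF low */
        }
        else
        {
            asm("    BSET XF");       /* drive XF high */
        }

        bState = !bState;
    }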

    As long as you can observe the XF pin on a scope, this might give you a clue to the run-time path through the code.

    You should not use printf in an ISR, and I think it may also use malloc.
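
    One way to keep the debug printouts without calling printf from an ISR is to have the ISR write values into a small ring buffer and let the main loop print them. This is only a sketch: log_from_isr() and log_drain() are made-up names, and a single ISR writer with a single main-loop reader is assumed.

```c
#include <stdio.h>

#define LOG_SIZE 16u   /* power of two so the index wraps with a mask */

static volatile unsigned long log_buf[LOG_SIZE];
static volatile unsigned int  log_head = 0;  /* advanced by the ISR only  */
static volatile unsigned int  log_tail = 0;  /* advanced by main only     */

/* Safe to call from an ISR: no heap, no stdio.
   Returns 1 if the value was stored, 0 if the buffer was full. */
int log_from_isr(unsigned long value)
{
    unsigned int next = (log_head + 1u) & (LOG_SIZE - 1u);

    if (next == log_tail) {
        return 0;                    /* full: drop the entry */
    }
    log_buf[log_head] = value;
    log_head = next;
    return 1;
}

/* Called from the main loop: prints everything logged so far.
   Returns the number of entries printed. */
int log_drain(void)
{
    int count = 0;

    while (log_tail != log_head) {
        printf("pointer %lx\n", log_buf[log_tail]);
        log_tail = (log_tail + 1u) & (LOG_SIZE - 1u);
        count++;
    }
    return count;
}
```

    The ISR would call something like log_from_isr((unsigned long)ptr) after each allocation, and the main while loop would call log_drain() once per pass. One slot is always left empty so that a full buffer can be told apart from an empty one.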

    Sean