Tool/software: TI C/C++ Compiler
Hi everyone,
We are using your TMS320F28027 microcontroller in a DC/DC converter.
We see very sporadic errors (about once per day).
We were able to capture the error with an oscilloscope. It looks as if our controller acts on a wrong value for a few cycles. We have not yet been able to determine whether it is a wrong current value or a wrong voltage value. Our question is whether there are other effects that can change the value of a variable at one point in time and restore it later. So far we have looked for overflowing variables, pointer errors, runtime problems in the ADC interrupt, and design errors. Having searched for the cause for a while, we hope to get new ideas from you.
Is it possible that a variable overwrites an adjacent variable in memory due to an overflow?
Is it possible that sporadic errors occur due to a heavily loaded ADC interrupt?
Is a high utilization of the flash memory critical in any way? We currently use 89% of the flash.
Regards,
Joachim
Have you taken care of the known ADC issues from the F2802x Errata? Is there any correlation with potential noise from power loading?
Joachim Sinner said: Is it possible that a variable overwrites an adjacent variable in memory due to an overflow?
Buffer overflows are typically software-driven so this would be dependent on the programming practices implemented in the custom application.
Stack overflows are certainly possible and have the potential to overwrite memory beyond the intended stack area. You can prefill the allocated stack memory with a known value to see if the stack is ever utilized fully after running for some time.
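The prefill check described above could look roughly like the following. This is a minimal sketch, not TI-provided code: the function names and the marker value 0xA5A5 are made up for illustration, and it assumes the stack grows toward higher addresses (as on the C28x), so the unused part of the region sits at its high end.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary marker value (an assumption, any unlikely value works). */
#define STACK_FILL_PATTERN 0xA5A5u

/* Fill a stack region with the marker. On a real target this has to run
 * before the region is in use, e.g. very early during startup. */
void stack_prefill(volatile uint16_t *base, size_t words)
{
    size_t i;
    for (i = 0; i < words; ++i) {
        base[i] = STACK_FILL_PATTERN;
    }
}

/* Count the marker words that are still intact at the high end of the
 * region. Assumes the stack grows from low to high addresses. */
size_t stack_unused_words(const volatile uint16_t *base, size_t words)
{
    size_t unused = 0;
    while (unused < words &&
           base[words - 1 - unused] == STACK_FILL_PATTERN) {
        ++unused;
    }
    return unused;
}
```

On the target, `base` and `words` would come from the linker-defined stack section. If the unused count ever reaches zero after a long run, the stack has been filled completely at least once. The check can undercount slightly if a stack value happens to equal the marker.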
Joachim Sinner said: Is it possible that sporadic errors occur due to a heavily loaded ADC interrupt?
Real-time deadlines might be missed if there is heavy concurrent activity that might delay the ADC ISR.
For significant delays, the ADC has an OVERFLOW flag that will be set if additional ADC interrupts are generated while an existing interrupt is waiting to be serviced. It does not sound like your system has such a significant delay because the ADC will stop responding to interrupts until its OVERFLOW flag is serviced.
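Since the problem may only show up in the field, one option is to record overflow events in a counter that can be read out later. The sketch below only illustrates the idea: `mock_adc_int_ovf` is a plain variable standing in for the device's overflow flag register, and all names are hypothetical. On real hardware the flag would be read and cleared through the registers defined in the device header files.

```c
#include <stdint.h>

/* Hypothetical stand-in for the ADC interrupt overflow flag register.
 * Bit 0 represents the overflow flag of the ADC interrupt in use. */
static volatile uint16_t mock_adc_int_ovf;

/* Diagnostic counter that survives until it is read out, e.g. over
 * the serial link. */
static volatile uint32_t adc_overflow_count;

/* Intended to be called at the start of the ADC ISR. */
void adc_isr_check_overflow(void)
{
    if (mock_adc_int_ovf & 0x0001u) {
        ++adc_overflow_count;
        /* On real hardware: write to the overflow clear register. */
        mock_adc_int_ovf &= (uint16_t)~0x0001u;
    }
}

/* Number of overflow events seen so far. */
uint32_t adc_overflow_events(void)
{
    return adc_overflow_count;
}
```

A non-zero count after a day of operation would indicate that the ISR was delayed long enough to miss at least one conversion.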
Joachim Sinner said: Is a high utilization of the flash memory critical in any way? We currently use 89% of the flash.
Is the flash being programmed at run-time? There should not be any negative effects from filling the flash at load-time.
Thank you very much for your answer.
Yes, we have implemented the workarounds for the known errata in a test build. However, the errors still occur.
We will investigate the stack more closely. So far we have examined it with the "Stack Usage" tool and did not find unusually high usage.
Thank you very much for the hint about the OVERFLOW flag. This flag is not cleared anywhere in our code, so we can rule out an overload of the ADC ISR.
The flash is not programmed during runtime.
Unfortunately, we could not reproduce the problem in a debug session. So far it only occurs in the field.
While searching for the cause we tested different approaches. Two completely unrelated code changes each made the problem disappear.
The only cause we know of for this kind of behavior is a pointer problem. However, we use pointers only to output information via serial communication. Pointer operations are never used to write data.
Maybe you have an idea for a mechanism that could lead to unexpected, sporadic writes to address ranges?
Regards,
Joachim
Joachim,
It's difficult for me to speculate as to what might be corrupting your data based on the limited observations. My approach under such circumstances would be to keep an open mind to physical failure mechanisms as well. For example, the effects of noise and temperature or shifts in voltage or frequency.
I agree that errors in pointer manipulation are a common source for software-based corruption. There could also be rare errors that are introduced through the compiler that may not be obvious at a high level. Synchronization issues between modules can also lead to corruption -- for example if the ADC early interrupt is used to trigger data processing and the CPU happens to read the result registers before they are refreshed.
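As one illustration of guarding against a synchronization issue between an ISR and background code, the sketch below takes a consistent copy of two related values by re-checking an update counter. It assumes a single CPU where the ISR runs to completion before the background code resumes. The names (`sample_publish`, `sample_snapshot`, the struct fields) are hypothetical and not taken from the application discussed here.

```c
#include <stdint.h>

/* Two values that must always be read as a matching pair. */
typedef struct {
    int16_t current_raw;
    int16_t voltage_raw;
} sample_pair_t;

static volatile sample_pair_t shared_sample;
static volatile uint16_t update_count;

/* Called from the (hypothetical) ADC ISR after both results are valid. */
void sample_publish(int16_t current_raw, int16_t voltage_raw)
{
    shared_sample.current_raw = current_raw;
    shared_sample.voltage_raw = voltage_raw;
    ++update_count;
}

/* Called from background code. Retries if the ISR updated the pair
 * while it was being copied, so the result is never a mix of two
 * different updates. */
sample_pair_t sample_snapshot(void)
{
    sample_pair_t copy;
    uint16_t before;
    do {
        before = update_count;
        copy.current_raw = shared_sample.current_raw;
        copy.voltage_raw = shared_sample.voltage_raw;
    } while (before != update_count);
    return copy;
}
```

This only protects readers outside the ISR. It does not help if the ISR itself reads the ADC result registers too early, which has to be addressed through the interrupt timing configuration.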
If you know the memory address(es) of the corruption, you can use a Hardware Watchpoint in CCS to halt execution whenever the address is being written by the CPU.
-Tommy
Hi Tommy,
Many thanks for your help. In the end we could not find the cause of our problems.
But we produced a software build that runs without failures. We don't know why this build works, but it does.
We ran out of time, so we have to use this working build.
Thank you for your support.
Best Regards
Joachim