TM4C1294NCPDT: Firmware stops functioning

Part Number: TM4C1294NCPDT


Hello,

Firmware issue:

We have a custom board with sensors connected to it. When a sensor is removed, the device performs a software reset. After the software reset it tries to set up the system from the start and, about once in 400 - 500 attempts at removing the sensor and plugging it back in, it goes into a forever idle loop (the firmware stops functioning).

We suspect one of the following issues: a timing problem between tasks, a semaphore losing its argument, a semaphore losing counts, or a semaphore no longer being called, so that the firmware goes into a forever idle loop.

The following shows the firmware timing diagram. After 40 ads packets the firmware sends them to the PC, which is why the time after the second red dashed line is used up.

- Is there a way to debug these errors, given that they occur only about once in 500 tries?

  • Hi,

    We suspect one of the following issues: a timing problem between tasks, a semaphore losing its argument, a semaphore losing counts, or a semaphore no longer being called, so that the firmware goes into a forever idle loop.

      I'm really not a TI-RTOS expert, but I'm not fully convinced yet that this is a TI-RTOS kernel problem. It seems to me that the semaphore is not getting posted in your application and the tasks are pending, waiting for it. I guess you need to find out why no semaphore is posted. Another thing to look at is whether your tasks are pending with a long timeout. You can configure the PEND with a timeout so it won't get stuck. Below is a recommendation from the TI-RTOS training material.

    Note: We recommend you use timeouts on your PEND calls so that your code will never get “stuck” waiting for a Semaphore that never gets posted. If you make the timeout long enough, you can be assured that it won’t timeout before a reasonable time has expired. When you use timeouts, however, ALWAYS CHECK THE RETURN VALUE OF THE PEND ! If you don’t, you could process data that is not there and cause problems. There are two ways to get to the line of code after the PEND – either you got the Semaphore or you timed out. One means “process the data” and the other means “ooops, there was a problem”. So, handle these accordingly.
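To make the "check the return value" advice concrete, here is a minimal sketch in plain C. It assumes the result of the PEND is passed in as a boolean (on TI-RTOS, `Semaphore_pend()` returns TRUE when the semaphore was obtained and FALSE when it timed out). The names `classify_pend`, `pend_stats_t` and `MAX_CONSECUTIVE_TIMEOUTS` are hypothetical and only serve the illustration; they are not part of TI-RTOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy: escalate after this many timeouts in a row. */
#define MAX_CONSECUTIVE_TIMEOUTS 3u

typedef enum {
    PEND_PROCESS_DATA,     /* semaphore obtained: the data is valid      */
    PEND_SKIP_CYCLE,       /* timed out: do not touch the data buffer    */
    PEND_REQUEST_RECOVERY  /* timed out repeatedly: re-init or reset     */
} pend_action_t;

typedef struct {
    uint32_t consecutive_timeouts; /* cleared on every successful pend   */
    uint32_t total_timeouts;       /* handy to inspect in the debugger   */
} pend_stats_t;

/* Decide what the task should do after a pend call returned.
 * got_semaphore is the return value of the pend (true = obtained). */
static pend_action_t classify_pend(bool got_semaphore, pend_stats_t *stats)
{
    if (got_semaphore) {
        stats->consecutive_timeouts = 0;
        return PEND_PROCESS_DATA;
    }

    stats->consecutive_timeouts++;
    stats->total_timeouts++;

    if (stats->consecutive_timeouts >= MAX_CONSECUTIVE_TIMEOUTS) {
        return PEND_REQUEST_RECOVERY;
    }
    return PEND_SKIP_CYCLE;
}
```

On the target, the task loop would call something like `classify_pend(Semaphore_pend(dataSem, timeoutTicks), &stats)`, where `dataSem` and `timeoutTicks` are placeholders for your own handle and timeout. Keeping a timeout counter also gives you something to look at in the debugger when the 1-in-500 case finally hits.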

    What type of semaphore do you use? Counting semaphore or binary? What if you use binary semaphore? Does that make a difference?

      

  • Hi, Charles

    We are using counting (FIFO) semaphores. Regarding binary semaphores, I'm not so confident, but I will give them a try.

    I tried to change the timing on the semaphores, and when the semaphore reaches the timeout then the firmware just stops functioning.

    I will experiment with the timings and the binary semaphore and come back with the findings.

  • Hi Robert,

    this issue happens once in 400 - 500 tries to remove the sensor and then plug back in.

    Still working this issue, though that seems like excellent odds if the intended use does not actually involve re-plugging the sensor 400 - 500 times. How long after the sensor is unplugged does the SW reset take place? Another thing to consider is the sensor's power source and the debounce of the sensor's circuit.

    Long ago I worked on a game cartridge whose PCB had a shorter +5V edge trace, so the ground trace connected first. The plastic cartridge had locking pinch ears, with a small piece of sticky-back aluminum tape placed on the back side of both ears. The cartridge PCB had two gold pins, one on each side, used to externally reset the CPU when the plastic ears were pinched while plugging the cartridge into the 34-pin edge connector. The game cartridge had a ROM that required address and data bus connections into a CPU with a running kernel. The game cartridge never failed in a mobile setting with ±G-forces shaking the entire system for hours on end. Back then most CPUs had only an NMI pin, so a SW reset was never possible.

  • I tried to change the timing on the semaphores, and when the semaphore reaches the timeout then the firmware just stops functioning.

    So the task in the PEND state does reach the timeout. Doesn't this mean no one is posting the semaphore? This is why I'd like to know what your mechanism for posting the semaphore is and why it stops posting. If the post comes from an interrupt that stops firing after your device is unplugged, that may be the reason. I'm thinking you could install some type of watchdog timer, not necessarily a hardware watchdog but maybe a software version. If a flag is not written within a specified time, as checked by a software timer, then it generates a software reset. Just a thought, but you can be more creative than me.
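The software watchdog idea could look roughly like the sketch below. It is plain C and makes some assumptions: time is a free-running 32-bit tick counter (on TI-RTOS, `Clock_getTicks()` would be one source), and all `swdog_*` names are hypothetical. The unsigned subtraction keeps the elapsed-time check correct when the tick counter wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software watchdog state. */
typedef struct {
    uint32_t last_kick_tick; /* tick value at the most recent kick        */
    uint32_t timeout_ticks;  /* allowed silence before declaring expiry   */
} sw_watchdog_t;

/* Arm the watchdog, starting the timeout window at 'now'. */
static void swdog_init(sw_watchdog_t *wd, uint32_t now, uint32_t timeout_ticks)
{
    wd->last_kick_tick = now;
    wd->timeout_ticks = timeout_ticks;
}

/* Call from the monitored task each time it makes progress,
 * e.g. after every successful semaphore pend. */
static void swdog_kick(sw_watchdog_t *wd, uint32_t now)
{
    wd->last_kick_tick = now;
}

/* Call periodically from a software timer or monitor task.
 * Returns true when no kick was seen for more than timeout_ticks.
 * Unsigned arithmetic makes this safe across tick wrap-around. */
static bool swdog_expired(const sw_watchdog_t *wd, uint32_t now)
{
    uint32_t elapsed = now - wd->last_kick_tick;
    return elapsed > wd->timeout_ticks;
}
```

When `swdog_expired()` returns true, the monitor would log whatever state it can and then force the reset (with TivaWare that would typically be `SysCtlReset()`). The monitor must run from a context that keeps running even when the data tasks are stuck, otherwise it stalls together with them.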

    Also to give a heads-up, I will be out of office until next Tuesday.