I have some boards with the TMX version of the TMS320C6747 DSP that, after several months, seem to be intermittently hanging and/or not booting. The code uses one of the MCASPs and EDMAs for passing audio streams. Most boards will boot up and run fine for a while (anywhere from several minutes to several days), and some will eventually lock up. The boards will lock up even under static input conditions (i.e. the board is running, no inputs are changing, no memory is being allocated, and no new data is being processed). I haven't checked all the boards yet, but when I attach the debugger to a hung board, I believe it is inside the SYS_EXITFXN function, and it appears to be stuck in an endless NOP loop. Also, in some cases the Heartbeat task is evidently not being called either, because the LED is not blinking. This task should always be running! I also checked all the error registers associated with the MCASP and did not see any errors being asserted before or after the lockup.
I realize the TMX devices are suspect, but I need some way to be certain that the problem is due to the TMX device and not our code. I have had 3 failing boards reworked, with the TMX devices replaced by TMS devices, and at least for now the problem is not showing up. My concern is that we initially did not see any issues with the TMX devices either, and only now, after several months, are we starting to see failures. How can I be sure the same will not occur with the TMS devices?
Also, on a couple of the boards that previously worked but then all of a sudden would not boot, I tracked the problem down to the point in the code where the MCASP is switched from an internally generated clock to an external clock. Apparently it could not detect the external clock properly and would hang there waiting for it. If I forced it back to the internal clock, it ran fine.
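To illustrate the clock-switch hang, here is a minimal sketch in C of bounding that wait and falling back to the internal clock instead of spinning forever. This is not our actual code and not a TI API: `clock_ok_fn`, `wait_for_clock`, `choose_clock_source`, and the `fake_clock` stand-in are all hypothetical names, and the real "is the external clock valid" check (whatever MCASP status the code polls) is assumed to sit behind the callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns true once the external clock looks valid.
 * On real hardware this would wrap whatever status check the code polls. */
typedef bool (*clock_ok_fn)(void *ctx);

typedef enum {
    CLOCK_SOURCE_EXTERNAL,
    CLOCK_SOURCE_INTERNAL
} clock_source_t;

/* Poll the callback at most max_polls times instead of waiting forever. */
bool wait_for_clock(clock_ok_fn clock_ok, void *ctx, uint32_t max_polls)
{
    assert(clock_ok != NULL);
    for (uint32_t i = 0; i < max_polls; ++i) {
        if (clock_ok(ctx)) {
            return true;
        }
    }
    return false; /* timed out: external clock never became valid */
}

/* Pick the external clock if it shows up in time, else stay on internal. */
clock_source_t choose_clock_source(clock_ok_fn clock_ok, void *ctx,
                                   uint32_t max_polls)
{
    return wait_for_clock(clock_ok, ctx, max_polls) ? CLOCK_SOURCE_EXTERNAL
                                                    : CLOCK_SOURCE_INTERNAL;
}

/* Stand-in for hardware (assumption, for illustration only): reports a
 * valid clock once it has been polled polls_until_valid times. */
struct fake_clock {
    uint32_t polls_until_valid;
    uint32_t polls_seen;
};

bool fake_clock_ok(void *ctx)
{
    struct fake_clock *fc = (struct fake_clock *)ctx;
    fc->polls_seen++;
    return fc->polls_seen >= fc->polls_until_valid;
}
```

With something like this in place, a board whose external clock is never detected would come up on the internal clock and could report the timeout, rather than hanging silently at boot. That would at least make it easier to tell a clock-detection failure apart from the other lockups.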