We have a product that we have been producing for many years, and it includes the LM3S8933 chip.
Recently we received a customer complaint that our product is having problems (a single case for now).
We replaced the chip on the customer's device with another LM3S8933, and the issues the customer experienced were resolved.
While investigating the problematic chip, we observed the following behavior:
1. With both UDP and TCP, the chip misses packets sent to it (the packets are never received by the chip). With TCP this causes retransmissions, and with UDP it shows up as faulty logic detected in our tests.
2. After a short period, the chip ends up in a hard fault (Fault ISR).
We are using the TI-supplied example firmware, and we have traced the source of the fault to the function:
static struct pbuf * stellarisif_receive(struct netif *netif)
and more specifically to the following section
The stack image shows the cause of the hard fault
As you can see, the ptr variable has run outside its memory section. This is caused by a faulty value returned by the following part at the start of the function:
The temp variable is returned as 0, which contradicts both the supplied note and the logic of the function.
Our problem is as follows: adding protection for this error case prevents the hard fault, but it does not solve the packet loss in the PHY.
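For reference, the kind of protection we mean can be sketched roughly as below. This is only a minimal, hardware-independent sketch and not the exact change in our firmware: the helper name, the constants and the chosen bounds are hypothetical, and it assumes the frame length sits in the low 16 bits of the first word read from the RX FIFO and that this length includes the 2 length bytes and the 4 FCS bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the FIFO length field: it counts the 2 length bytes
 * and the 4 FCS bytes in addition to the Ethernet frame itself. */
#define RX_LEN_FIELD_BYTES   2u
#define RX_FCS_BYTES         4u
#define RX_ETH_HEADER_BYTES  14u
#define RX_MAX_FRAME_BYTES   1518u  /* untagged frame incl. FCS (assumption) */

/* Smallest and largest length values we are willing to accept. */
#define RX_MIN_FIFO_LEN \
    (RX_LEN_FIELD_BYTES + RX_ETH_HEADER_BYTES + RX_FCS_BYTES)
#define RX_MAX_FIFO_LEN \
    (RX_LEN_FIELD_BYTES + RX_MAX_FRAME_BYTES)

/* Hypothetical helper: extracts the length from the first FIFO word and
 * reports whether it is plausible. A value of 0, as seen on the faulty
 * chip, is rejected before it can be used to size a buffer or drive a
 * copy loop. *len_out is written only when the length is accepted. */
static bool rx_fifo_length_is_plausible(uint32_t first_fifo_word,
                                        uint16_t *len_out)
{
    uint16_t len = (uint16_t)(first_fifo_word & 0xFFFFu);

    if (len < RX_MIN_FIFO_LEN || len > RX_MAX_FIFO_LEN) {
        return false;
    }
    *len_out = len;
    return true;
}
```

When the check fails, the receive function would have to discard the frame and return NULL instead of continuing. How the frame is discarded is hardware specific and is not shown here. As stated above, a check like this only avoids the hard fault; the packets are still lost.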
This is a single case in several years. Even though we understand that chips can occasionally have problems, we must be able to detect such problems during our production verification. Unfortunately, we were unable to detect this issue without running communication tests for a few minutes. We cannot add tests of that length to our production line, which is why we need some sort of indicator that there is a problem in the PHY part of the chip.
Please also note that the Cortex part is working fine, the clock and PLL lock correctly, and all other communication is working correctly (for example, we also have UART communication running on this chip). The crystals and trace lines on our PCB were also checked and found to be good. Moving the faulty micro to a new PCB still shows the same problems, and fitting a new micro to the original PCB does not show any issues.
Please advise on a method by which we can detect such issues before we release our products to customers.