We have a custom board with an AM3352BZCZ60 whose USB interface operates as a device. Several thousand of these boards have been built over the past three years and are operating in the field, but on the latest build we started seeing USB transaction errors. The switch to 'B' parts happened six months ago, and several thousand boards have been built with that part since then with no issues until now. Using a hardware USB analyzer, we determined that the transaction errors are caused by data corruption, which in turn produces a CRC error.
We are running a diagnostic test in which the device sends the pattern 1, 2, 3, ... FE, FF, 1, 2, ... (note: 0 is intentionally skipped) in 512-byte packets. The next packet is the same except that it starts at 3 rather than 1. Subsequent packets start at 5, 7, 9, etc. A total of 160 MB is transmitted over the entire test. What I'm observing is:
- Most of the time, the test passes (~5-10 power cycles work without error, the test can be re-run, etc.)
- If the test passes the first time after power up, it seems to always pass. If the test fails the first time after power up, it seems to always fail.
- If the test fails, it is usually in one of the earlier packets (i.e. packet #10-100), but it has also been seen to fail on packets #1500-2000.
- Cold spray and heating the part do not appear to change the failure behavior.
- There have been no recent design changes to the board.
- There usually appears to be something of a pattern to the garbled data. Several times (but not always, or even most of the time) the data captured by the analyzer is shifted by two bits. When it happens, this apparent bit shifting typically persists for stretches of 16 or 32 bytes, then two bytes are wrong, then another stretch of 16 or 32 bytes is again bit shifted.
- One particular run had the following pattern: 48 consecutive bytes were shifted by exactly two bits; somewhat later, 54 consecutive bytes were shifted by four bits, with three sets of two consecutive bytes that were not simple bit shifts; these were followed by 52 bytes that alternated between groups of two that were shifted by six bits and groups of two that were not. Finally, after 160 bytes, the bits were aligned again and the final 22 bytes of the packet matched the pattern but were off by one byte (i.e. where an 'F1' was expected, an 'F0' was received). In this particular packet, 513 bytes were received rather than 512. In other failures, there is an extra byte or two about half the time; the other half of the time, the packet is short by several bytes.
- Using the expected data to compute where USB bit stuffing should occur does not show any correlation with where the apparent bit shifting actually happens.
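In case it helps anyone reproduce the analysis, here is a rough Python sketch of the kind of offline check described above. It regenerates the expected packet, looks for the bit shift that best aligns each captured window with the expected data, and lists where bit stuffing would occur in the payload. All function names are made up for illustration. It assumes the start value wraps within the 1..FF cycle once it passes FF, that bits are compared in USB wire order (LSB of each byte first), and that only the payload is considered (SYNC, PID and CRC16 are ignored, so stuffing runs that begin in the PID are not modeled).

```python
def expected_packet(index, size=512):
    """Packet `index` (0-based) starts at 1 + 2*index; values cycle 1..0xFF, skipping 0.

    Assumption: the start value wraps within the same 255-value cycle.
    """
    start = (2 * index) % 255  # zero-based position within the cycle
    return bytes(((start + i) % 255) + 1 for i in range(size))


def to_bits(data):
    """Bits in USB wire order (least significant bit of each byte first)."""
    return [(b >> i) & 1 for b in data for i in range(8)]


def best_shift(expected_bits, captured_bits, start, length, max_shift=8):
    """Find the shift that best aligns captured[start:start+length] with expected.

    Returns (shift, mismatches), where captured bit i is compared with
    expected bit i + shift, or None if no shift fits inside the buffers.
    """
    if start + length > len(captured_bits):
        return None
    window = captured_bits[start:start + length]
    best = None
    for shift in range(-max_shift, max_shift + 1):
        lo = start + shift
        if lo < 0 or lo + length > len(expected_bits):
            continue
        errors = sum(c != e for c, e in zip(window, expected_bits[lo:lo + length]))
        if best is None or errors < best[1]:
            best = (shift, errors)
    return best


def scan(expected, captured, window_bytes=16):
    """Yield (byte_offset, best_shift result) for each window of the captured packet."""
    eb, cb = to_bits(expected), to_bits(captured)
    step = window_bytes * 8
    for start in range(0, len(cb) - step + 1, step):
        yield start // 8, best_shift(eb, cb, start, step)


def stuff_positions(bits):
    """Indices (in the unstuffed stream) after which a stuffed 0 would be inserted.

    USB inserts a 0 after six consecutive 1s; the run counter then restarts.
    """
    positions, run = [], 0
    for i, b in enumerate(bits):
        run = run + 1 if b else 0
        if run == 6:
            positions.append(i)
            run = 0
    return positions
```

Feeding the analyzer's captured payload and `expected_packet(n)` into `scan` gives a per-window shift and mismatch count, which can then be lined up against `stuff_positions(to_bits(expected_packet(n)))`.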
The upshot of all of that was a suspicion that the AM3352 USB is somehow not generating data at the proper rate, so that the USB host (a commercial PC) and the USB analyzer (CATC model UPAS2500H) are sampling the data while the USB DP/DM signals from the AM3352 are switching. Along those lines, we checked:
- Crystal frequency is correct to within ~25 ppm.
- Supply voltages show no anomaly.
- Lot codes where we've seen this problem so far: 58A5L9W, 58A5N4W and 58A1L8W
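For context on the crystal measurement, here is a back-of-the-envelope estimate of how far a transmit clock with a given frequency error would slip over one packet. This is only a sketch: it assumes the link is running at high speed (the 512-byte packets suggest that, but it is an assumption), it ignores the receiver's clock recovery, which re-synchronizes on signal transitions, and the function name is made up. The USB 2.0 high-speed data rate tolerance is ±500 ppm.

```python
def drift_bits(ppm, payload_bytes=512, overhead_bits=0):
    """Accumulated slip, in bit times, of a clock that is `ppm` off over one packet.

    Assumes a free-running clock with no resynchronization at the receiver.
    `overhead_bits` can account for SYNC, PID, CRC and stuffed bits if desired.
    """
    total_bits = payload_bytes * 8 + overhead_bits
    return total_bits * ppm * 1e-6


measured = drift_bits(25)    # roughly 0.1 bit time over a 512-byte payload
spec_limit = drift_bits(500)  # roughly 2 bit times at the high-speed tolerance limit
print(f"25 ppm: {measured:.3f} bits, 500 ppm: {spec_limit:.3f} bits")
```

By this estimate a 25 ppm error is well inside the allowed tolerance, so if the data rate really is off, the cause would have to be somewhere other than the crystal frequency itself.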
Any suggestions?