Other Parts Discussed in Thread: TMS320F28375D, MSP430F5529
Hello, I'm looking for guidance on how to debug a problem I'm seeing with USB bulk packets.
The setup:
I have a TMS320F28375D processor acting as the USB host, and an MSP430F5528 acting as a USB device. They are on separate custom boards that are otherwise functional. The two USB ports are connected directly to each other, with nothing else on the bus, and run at USB Full Speed. The cable between them is a simple twisted pair for D+/D-, and power and ground are provided from the TMS320 board to the MSP430 board. Because 3.3V is supplied directly, we don't use the 5V VBUS, and from a USB standpoint the MSP430 is "self-powered". We have viewed the D+/D- signals on an oscilloscope and a logic analyzer, and the signals look stable and correct.
The MSP430 provides one bulk IN and one bulk OUT endpoint for all traffic other than enumeration, which completes normally and successfully on Endpoint 0. The TMS320 uses code heavily adapted from the keyboard host example, and the MSP430 uses a customized version of the PHDC driver. The TMS320 is able to send (bulk OUT) and receive (bulk IN) data from the MSP430; the data is received by the MCUs correctly on both ends.
Because this is a relatively high data-throughput use case, the TMS320 often sends a series of maximum-payload-size packets (64 bytes) to the MSP430. Per the USB spec, each packet must be ACKed before the next packet is sent. Currently we see occasional NAKs because the MSP430 is not yet ready for the next packet. This is NOT the problem we are trying to solve; we suspect our MSP430 firmware simply isn't responding fast enough.
The problem:
~10% of the time, a bulk OUT packet will not get acknowledged by the MSP430. This lack of acknowledge can occur with a packet of any size (that we've tried so far anyway). The TMS320 USB module will re-send the packet up to 3 times, and most of the time one of those retries is successful and transmission continues. We are sending enough data that eventually 3 retries fail in a row, and the USB module "gives up" trying to send the packet and raises an error with the MCU.
If the TMS320 reschedules the same packet each time this error is raised, an ACK is eventually seen and the program moves on. While we can "get past" this problem, we are concerned that there is a serious underlying reason for the MSP430's lack of acknowledgement, and the overall bandwidth is degraded by all the resending.
Attached is a screen capture of this problem occurring on a small packet.
We have examined the connection electrically and tried tweaking the filtering, but so far nothing has changed this behavior. The behavior is the same not only with our custom MSP430 board, but also with a TS430 "ZIF Socket" development kit and an MSP430F5529 LaunchPad dev kit.
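As a rough sanity check on the retry numbers above, here is a small C sketch of the arithmetic. It assumes that each transmission goes unacknowledged independently with the same probability, which may well not be true on our hardware. The function names are my own and come from neither TI library. I am also not certain whether "up to 3 times" means 3 or 4 transmissions in total before the USB module gives up, so the attempt count is a parameter.

```c
#include <assert.h>

/* Probability that `attempts` transmissions in a row all go unacknowledged,
 * ASSUMING each one fails independently with probability p_noack. */
double giveup_probability(double p_noack, int attempts)
{
    assert(p_noack >= 0.0 && p_noack <= 1.0 && attempts >= 0);
    double p = 1.0;
    for (int i = 0; i < attempts; i++) {
        p *= p_noack;
    }
    return p;
}

/* Expected number of transmissions per delivered packet when the host keeps
 * rescheduling until an ACK arrives (mean of a geometric distribution). */
double expected_transmissions(double p_noack)
{
    assert(p_noack >= 0.0 && p_noack < 1.0);
    return 1.0 / (1.0 - p_noack);
}
```

With p_noack = 0.10, giveup_probability(0.10, 3) is about 1 in 1,000 packets (1 in 10,000 for 4 attempts), and expected_transmissions(0.10) is about 1.11, i.e. roughly 11% extra bus traffic. If the give-up rate we actually measure is far from that estimate, the failures are probably not independent (for example, they may arrive in bursts), which might itself be a clue.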
The question:
I don't see a way to tell on the MSP430 side why an individual packet was not acknowledged. The User's Guide says "If a CRC or bit stuff error occurs when the data packet is received, then no handshake is returned to the host." The CRC doesn't appear to change between re-transmissions. I'm not sure how to identify a "bit stuff error" on a scope, but the timing of the EOP transition looks identical between acknowledged and unacknowledged packets.
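For anyone who wants to check a capture offline, below is a minimal C sketch of the two checks mentioned in that User's Guide sentence. The names usb_crc16 and find_bit_stuff_error are my own, not from any TI library. The bit-stuff check assumes the logic analyzer capture has been exported as one differential line-level sample (0 or 1) per bit time, covering SYNC through the last bit before EOP. USB uses NRZI encoding, where a "1" is sent as no transition and a stuffed "0" (a transition) is inserted after six consecutive "1"s, so seven bit times in a row without a transition is a violation. At Full Speed (12 Mb/s) that is roughly 583 ns with no edge inside the packet. The CRC routine uses the USB data CRC16 parameters (polynomial x^16 + x^15 + x^2 + 1, initial value all ones, result inverted) over the data payload only, not the PID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* USB data-packet CRC16 over the payload bytes (PID excluded).
 * Bit-reflected form of polynomial 0x8005, init 0xFFFF, final inversion. */
uint16_t usb_crc16(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            } else {
                crc = (uint16_t)(crc >> 1);
            }
        }
    }
    return (uint16_t)~crc;
}

/* levels[i] is the differential line level (0 or 1) sampled once per bit time.
 * Returns the index of the sample that completes a run of seven bit times
 * with no transition (a bit-stuff violation), or -1 if none is found. */
long find_bit_stuff_error(const uint8_t *levels, size_t n)
{
    assert(levels != NULL || n == 0);
    size_t ones = 0; /* consecutive bit times without a transition (NRZI "1"s) */
    for (size_t i = 1; i < n; i++) {
        if (levels[i] == levels[i - 1]) {
            ones++;
            if (ones > 6) {
                return (long)i;
            }
        } else {
            ones = 0; /* a transition decodes as "0" and resets the run */
        }
    }
    return -1;
}
```

The idea would be to run both functions on an acknowledged and an unacknowledged capture of the same packet. How the analyzer displays the CRC field (byte and bit order) varies between tools, so the comparison with usb_crc16 may need a byte swap or bit reversal.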
Is there anything I'm missing or should look into? Thanks for your help!