I recently ran a project that is largely based on one of the CDC examples from the latest MSP430 USB stack. In the vast majority of cases the device works fine, but there is one exception that occurs very rarely:
When a PC starts up with one of our devices attached, the device does not always enumerate properly. Device Manager then shows that there is a USB device, but only as 'unknown device'. In most cases (I guess 99%+) the device is listed as a serial port and works fine.
I tried multiple computers, several of our boards, and both Windows 7 and 8, and the result is the same, although some PCs seem to exhibit the problem a bit more often (it still hardly ever happens).
You can make the problem go away by unplugging our device and plugging it back in, or by restarting the computer. I guess the problem could reappear in either case, but it is so rare that I have never seen it fail twice in a row. Because rebooting the computer and checking whether the device was found is easy to automate with a few scripts, that is how I test it. It still takes hours (sometimes days) of continuous rebooting for the problem to appear on our test devices, but at our customer's site it happens more frequently, and that is a problem for them.
At first I thought the MSP430 might have ended up in 'ST_PHYS_CONNECTED_NOENUM_SUSP', because I could get it into that state by plugging the USB connector only loosely into the port: probably just enough contact for the host to detect the device, but not enough to enumerate it properly. I added a blinking LED to indicate that state, but when the reboot process had stopped after two more days of rebooting (i.e. the failure had occurred), the device was not in that state.
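A minimal sketch of how the LED check could be extended from that single state to every state, so that one glance at the board tells which state the device is stuck in. This is only an illustration: the `DBG_STATE_*` names, `dbg_blink_count` and `dbg_led_level` are hypothetical and not part of the USB stack; on the target, the state value would come from whatever connection-state query the stack provides, and `dbg_led_level` would be called from a periodic timer tick.

```c
#include <stdint.h>

/* Hypothetical state codes for illustration only. On the target these
 * would be replaced by the values the USB stack actually reports. */
typedef enum {
    DBG_STATE_DISCONNECTED = 0,
    DBG_STATE_CONNECTED_NO_ENUM,
    DBG_STATE_ENUM_IN_PROGRESS,
    DBG_STATE_ENUM_ACTIVE,
    DBG_STATE_ENUM_SUSPENDED,
    DBG_STATE_NOENUM_SUSPENDED,  /* stands in for ST_PHYS_CONNECTED_NOENUM_SUSP */
    DBG_STATE_ERROR,
    DBG_STATE_COUNT
} dbg_usb_state_t;

/* Number of "LED off" ticks between two blink groups (assumed value). */
#define DBG_PAUSE_TICKS 4u

/* Each known state gets its own blink count (state + 1);
 * any unexpected value gets one more blink than the last known state. */
uint8_t dbg_blink_count(int state)
{
    if (state < 0 || state >= (int)DBG_STATE_COUNT) {
        return (uint8_t)(DBG_STATE_COUNT + 1);
    }
    return (uint8_t)(state + 1);
}

/* Returns 1 if the LED should be on at the given tick, 0 otherwise.
 * Pattern: N blinks (on, off, on, off, ...) followed by a pause, repeated. */
int dbg_led_level(int state, uint32_t tick)
{
    uint32_t blinks = dbg_blink_count(state);
    uint32_t period = 2u * blinks + DBG_PAUSE_TICKS;
    uint32_t phase  = tick % period;

    return (phase < 2u * blinks) && (phase % 2u == 0u);
}
```

Counting the blinks between pauses then identifies the state without a debugger. If the MSP430 is sitting in a low-power mode with the timer stopped, the LED would simply freeze, which is itself a hint.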
Unfortunately I have never seen it fail with a debugger attached, and connecting the debugger after the failure had occurred reset the MSP430, so I really have no idea where it ends up. The stack is also mostly event driven, so there is a chance that the MSP430 is simply in LPM, which leaves me even more clueless. When the device is detected and enumerated correctly, it seems to work just fine (at least for as long as we have tested this).
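Since attaching the debugger resets the MSP430, one option would be to record the sequence of USB states and events in a small trace buffer that is only cleared on a cold start, and read it out after the failure. The following is a rough sketch under assumptions: `usb_trace_t`, `trace_init_if_needed`, `trace_record` and `trace_read` are made-up names, the recorded codes are whatever you choose to log, and on the target the buffer would have to live in memory that the C startup code does not initialize (that placement is toolchain specific and not shown here).

```c
#include <stdint.h>
#include <stddef.h>

#define TRACE_DEPTH 16u      /* number of entries kept (assumed size) */
#define TRACE_MAGIC 0xA55Au  /* marks the buffer as already initialized */

typedef struct {
    uint16_t magic;                 /* TRACE_MAGIC once initialized */
    uint8_t  head;                  /* index of the next slot to write */
    uint8_t  count;                 /* valid entries, saturates at TRACE_DEPTH */
    uint8_t  entries[TRACE_DEPTH];  /* recorded state/event codes */
} usb_trace_t;

/* Clear the trace only if it has never been initialized (cold start),
 * so that its contents survive a reset caused by attaching the debugger. */
void trace_init_if_needed(usb_trace_t *t)
{
    if (t->magic != TRACE_MAGIC) {
        t->magic = TRACE_MAGIC;
        t->head  = 0;
        t->count = 0;
    }
}

/* Append a code, skipping immediate repeats so that polling the same
 * state over and over does not flush the interesting history. */
void trace_record(usb_trace_t *t, uint8_t code)
{
    if (t->count > 0) {
        uint8_t last = (uint8_t)((t->head + TRACE_DEPTH - 1u) % TRACE_DEPTH);
        if (t->entries[last] == code) {
            return;
        }
    }
    t->entries[t->head] = code;
    t->head = (uint8_t)((t->head + 1u) % TRACE_DEPTH);
    if (t->count < TRACE_DEPTH) {
        t->count++;
    }
}

/* Copy the recorded codes, oldest first, into out; returns how many were copied. */
size_t trace_read(const usb_trace_t *t, uint8_t *out, size_t max)
{
    size_t n = (t->count < max) ? t->count : max;
    size_t start = (t->head + TRACE_DEPTH - t->count) % TRACE_DEPTH;

    for (size_t i = 0; i < n; i++) {
        out[i] = t->entries[(start + i) % TRACE_DEPTH];
    }
    return n;
}
```

Calling `trace_record` from the stack's event handlers and from the main loop's state check would show the last few transitions before the device got stuck, even if the MSP430 was in LPM when the debugger was attached.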
Does anyone have any idea what else to look for? Could it be a hardware error or something in the stack?