Summary: Under some conditions, the system fails to receive isochronous IN transfers, when DMA is used. Only resetting the usb hardware brings it back to life.
Hardware: BeagleBone Black with a UVC camera attached to its USB host port. Tested with two cameras, one Logitech Quickcam Pro 9000, the other is an embedded camera module from a laptop. Both work fine when connected to a PC.
Kernel: 3.14.26 taken from ti-processor-sdk-linux-am335x-evm-01.00.00.00, with cppi41 dma enabled. The issue is not present with DMA disabled.
For USB trace capture, I've disabled the High-Speed USB mode in the kernel (MUSB_POWER_HSENAB), as I don't have access to a High-Speed analyzer. The problem occurs with High-Speed as well as with Full-Speed USB.
Test case:
Note: During testing, I've also experienced soft lockups related to dynamic CPU frequency scaling. In case of lockups disable cpufreq or set the governor to fixed-frequency.
With the camera attached, I run a shell script which repeatedly opens the video device and reads a few frames (using http://git.ideasonboard.org/yavta.git
while sleep 1; do yavta -s320x240 -c4 /dev/video0; done
After a few iterations (occasionally it takes several minutes), yavta freezes waiting for the first frame. After this point, no further video streaming is possible. USB trace (included below) shows that the driver fails to receive a response for the first (and only) isochronous IN token. Dumping the hardware registers shows that the RXPKTRDY bit is set for the appropriate endpoint, which would indicate that the USB hardware has received a response. But the DMA-completed interrupt never arrives.
Restarting yavta or replugging the camera have no effect. The only way to reset to a working state, apart from rebooting is resetting the driver and hardware as follows:
echo musb-hdrc.1.auto > /sys/bus/platform/drivers/musb-hdrc/unbind
echo musb-hdrc.1.auto > /sys/bus/platform/drivers/musb-hdrc/bind
My question:
How to solve this problem, or - at least - how to debug issues between the MUSB and CPPI41?
USB trace snipplet for the failing case(taken at 12Mbit):
=== SOF 241
IN addr 4 endpoint 0
NAK
...
IN addr 4 endpoint 0
NAK
=== SOF 242
IN addr 4 endpoint 0
DATA1 (0 bytes)
ACK
=== SOF 243
IN addr 4 endpoint 1
NAK
=== SOF 244
IN addr 4 endpoint 2 # MUSB sends one IN token on the isochronous endpoint
DATA0 (0 bytes) # receives a response
=== SOF 245
# no further IN tokens are sent for endpoint 2
=== SOF 246
=== SOF 247
=== SOF 248
=== SOF 249
=== SOF 250
=== SOF 251
=== SOF 252
IN addr 4 endpoint 1
NAK
Kernel function trace snipplet:
I've added some trace_printk-s to the code, and filtered the trace to include only musb_* and cppi41_* functions.
135.623111 | 0) 4.208 us | musb_h_disable();
135.623732 | 0) 2.708 us | musb_map_urb_for_dma();
135.623743 | 0) | musb_urb_enqueue() {
135.623749 | 0) | /* urb endpoint: 82 pipe: 410280 */
135.623764 | 0) | musb_start_urb() {
135.623769 | 0) 1.417 us | cppi41_dma_channel_allocate();
135.623777 | 0) | cppi41_dma_channel_program() {
135.623782 | 0) | /* dma de7e6388 status = 2 */
135.623785 | 0) 0.916 us | cppi41_set_dma_mode.isra.3();
135.623791 | 0) 1.917 us | cppi41_dma_prep_slave_sg();
135.623798 | 0) 0.417 us | cppi41_tx_submit();
135.623803 | 0) 1.584 us | cppi41_dma_issue_pending();
135.623809 | 0) + 29.667 us | }
135.623811 | 0) + 44.000 us | }
135.623814 | 0) + 68.291 us | }
135.623819 | 0) 0.625 us | musb_map_urb_for_dma();
135.623824 | 0) | musb_urb_enqueue() {
135.623828 | 0) | /* urb endpoint: 82 pipe: 410280 */
135.623832 | 0) 5.750 us | }
135.623837 | 0) 0.542 us | musb_map_urb_for_dma();
135.623842 | 0) | musb_urb_enqueue() {
135.623844 | 0) | /* urb endpoint: 82 pipe: 410280 */
135.623848 | 0) 4.709 us | }
135.623853 | 0) 0.541 us | musb_map_urb_for_dma();
135.623858 | 0) | musb_urb_enqueue() {
135.623861 | 0) | /* urb endpoint: 82 pipe: 410280 */
135.623865 | 0) 4.625 us | }
135.623870 | 0) 0.542 us | musb_map_urb_for_dma();
135.623875 | 0) | musb_urb_enqueue() {
135.623878 | 0) | /* urb endpoint: 82 pipe: 410280 */
135.623882 | 0) 4.833 us | }
Here yavta freezes in the "failing" case. In the "working" case, the next entry in the trace would be like this:
132.994996 | 0) | cppi41_irq() {
132.995005 | 0) | cppi41_dma_callback() {
132.995010 | 0) 1.458 us | cppi41_dma_tx_status();
132.995019 | 0) | cppi41_trans_done() {
132.995025 | 0) | /* dma de7e6388 status = 1 */
132.995029 | 0) | musb_dma_completion() {
132.995032 | 0) | musb_host_rx() {
132.995038 | 0) | /* <== hw 14 rxcsr 2200, urb actual 0 (+dma 0) */
132.995042 | 0) | cppi41_dma_channel_program() {
132.995045 | 0) | /* dma de7e6388 status = 2 */
132.995047 | 0) 0.750 us | cppi41_set_dma_mode.isra.3();
132.995051 | 0) 0.750 us | cppi41_dma_prep_slave_sg();
132.995056 | 0) 0.458 us | cppi41_tx_submit();
132.995061 | 0) 1.000 us | cppi41_dma_issue_pending();
132.995066 | 0) + 21.916 us | }
132.995069 | 0) + 35.042 us | }
132.995070 | 0) + 39.209 us | }
132.995072 | 0) + 51.166 us | }
132.995075 | 0) + 67.375 us | }
132.995077 | 0) + 76.333 us | }