am335x slow USB FIFO read

I successfully used the Starterware bulk USB example to create my own class-compliant device on USB0. I'm running Bulk-in and Bulk-out endpoints with 512-byte packets against a high-speed host (Windows 7, 64-bit). Writing to the Bulk-in endpoint (device to host) FIFO is surprisingly fast even though the low-level driver code uses PIO: it transfers 512 bytes to the FIFO in about 15 usec. The driver uses similar code to read 512 bytes from the Bulk-out endpoint (host to device) FIFO, yet that takes 9 times longer! That comes out to over 270 nsec/byte for reads, where writes take only about 30 nsec/byte. All endpoints use single buffering. What's going on?

At first I thought this was DDR memory latency (my target board is a BBB), so I wrote some test code that reads the FIFO into a CPU register without writing to a buffer in RAM. That is just as slow, so the root problem appears to be the memory access that reads the FIFO register itself.
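For anyone who wants to repeat that experiment, the register-only test could look roughly like the sketch below. It is not the StarterWare code: `fifo_drain_bytes` is a hypothetical name, and the pointer passed in is assumed to be the endpoint's FIFO data register, whose address has to come from the TRM or the StarterWare headers. The `volatile` qualifier forces one bus read per loop iteration, and nothing is stored to RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical test helper (not a StarterWare API).
 * Reads `len` bytes from a FIFO data register, one byte-wide bus
 * access per byte, and keeps the data in a CPU register only.
 * The XOR of all bytes is returned so the result can be inspected.
 */
static uint8_t fifo_drain_bytes(volatile const uint8_t *fifo, size_t len)
{
    uint8_t acc = 0;

    assert(fifo != NULL);

    while (len--) {
        acc ^= *fifo;   /* volatile read: one FIFO access, no RAM store */
    }
    return acc;
}
```

Timing a call such as `fifo_drain_bytes(fifo, 512)` with whatever timer is available, and comparing it against the normal read-into-buffer path, separates the cost of the FIFO access from the cost of the DDR write.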

  • The read has a turn-around delay. The write does not.

    Use the DMA controller to transfer the data.

  • I ended up reading the FIFO register (using PIO) as a long (32 bits) instead of a byte. Not only does it return the correct data, it's 4 times faster! Both read and write benefit from this approach, but I only tested EP1 and EP2.

    DMA may be the preferred approach, but the complexity it adds is daunting. Even with the Starterware CPPI driver, it's a major effort to understand and integrate, and it seems the driver must know the receive data length in advance. Is this true?
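For reference, the 32-bit PIO read mentioned above can be sketched as follows. This is a minimal illustration, not the StarterWare driver: `fifo_read_words` is a hypothetical helper, the pointer is assumed to be the endpoint's FIFO data register, and handling the last 1 to 3 bytes with byte-wide reads of the same register is an assumption that should be checked against the TRM before relying on it for short packets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a StarterWare API).
 * Copies `len` bytes from a FIFO data register into `dst`, using one
 * 32-bit bus read per 4 bytes instead of one read per byte.
 */
static void fifo_read_words(volatile const uint32_t *fifo,
                            uint8_t *dst, size_t len)
{
    volatile const uint8_t *fifo8 = (volatile const uint8_t *)fifo;

    assert(fifo != NULL && dst != NULL);

    /* Bulk of the packet: a 512-byte packet needs 128 reads, not 512. */
    while (len >= 4) {
        uint32_t word = *fifo;
        memcpy(dst, &word, sizeof word);   /* safe if dst is unaligned */
        dst += 4;
        len -= 4;
    }

    /* Tail of 0..3 bytes: ASSUMED to work with byte-wide accesses. */
    while (len--) {
        *dst++ = *fifo8;
    }
}
```

A full 512-byte bulk packet never reaches the tail loop, so the assumption only matters for short packets.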