My application communicates with a SPI Flash memory chip. My code works flawlessly when it accesses the SSI peripheral directly in a tight, optimized loop. To improve performance, I have tried moving the data transfer from direct SSI access to uDMA. The Flash command byte and address bytes (usually a 4-byte address) are still sent with direct SSI access, so only the longer Flash data transfer is handled by uDMA.
I have the uDMA channel control set to UDMA_SIZE_8 | UDMA_ARB_4 for efficiency, and the channel transfer mode is UDMA_MODE_PING_PONG. The channel attribute UDMA_ATTR_USEBURST is set. I believe I am careful to wait for SSI Busy to be de-asserted and only then program the uDMA transfer, always with a size that is a multiple of 4 bytes.
Reading from SPI Flash works flawlessly right after a System Reset, no matter how many times I repeat the command. Since these reads alternate between direct SSI access (command and address) and uDMA (data), the basic sequence seems to be written correctly. However, if I write to the SPI Flash (using direct access only, no uDMA), the next uDMA read returns an extra word at the start, followed by the expected data shifted one word late. A logic analyzer on the actual pins shows that the SPI Flash chip is sending the right data at the right time, so the extra word is being introduced on the receive side of the SSI/uDMA path. Once the system gets into this state, every subsequent uDMA read has that extra word at the start of each new transfer.
Beyond general help with the issue above, I'm wondering whether there is some way to flush or reset the uDMA peripheral that I'm missing. Since the extra word arrives through uDMA, I initially suspected that my write code left unread data in the SSI receive FIFO, but that doesn't seem possible since I empty the FIFO.
On a related note: is it sufficient to loop until the SSI peripheral is not Busy before starting uDMA? Or would I also need to check that the receive FIFO is empty?
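To make the last question concrete, here is a minimal sketch of the stricter check: keep discarding received words until the same status read shows both "receive FIFO empty" and "not Busy". It is not the actual driver code. The names ssi_port_t, ssi_quiesce_rx, fake_ssi_t, fake_status and fake_data are hypothetical, the status bit positions are an assumption based on the usual PL022-style SSI status register layout, and the fake peripheral exists only so the loop can be exercised on a host machine.

```c
#include <stdint.h>

/* Assumed status bits (PL022-style SSI status register layout). */
#define SR_RNE 0x04u  /* receive FIFO not empty */
#define SR_BSY 0x10u  /* a frame is still being shifted */

/* Hypothetical access hooks; on target these would read the real
 * SSI status and data registers. */
typedef struct {
    uint32_t (*read_status)(void *ctx);
    uint32_t (*read_data)(void *ctx);
    void *ctx;
} ssi_port_t;

/* Discard stale receive data until the FIFO is empty AND the
 * peripheral is idle in the same status read.
 * Returns the number of words discarded. */
static unsigned ssi_quiesce_rx(const ssi_port_t *p)
{
    unsigned discarded = 0;
    for (;;) {
        uint32_t sr = p->read_status(p->ctx);
        if (sr & SR_RNE) {
            (void)p->read_data(p->ctx);  /* pop one stale word */
            discarded++;
        } else if (!(sr & SR_BSY)) {
            break;                       /* empty and idle: safe to arm DMA */
        }
        /* Busy with an empty FIFO: a frame is in flight, keep polling. */
    }
    return discarded;
}

/* Host-side stand-in for the peripheral, for illustration only. */
typedef struct {
    unsigned in_flight;  /* frames still shifting on the bus */
    unsigned rx_count;   /* words waiting in the receive FIFO */
} fake_ssi_t;

static uint32_t fake_status(void *ctx)
{
    fake_ssi_t *s = ctx;
    uint32_t sr = 0;
    /* Each poll lets one in-flight frame finish and land in the FIFO. */
    if (s->in_flight) { s->in_flight--; s->rx_count++; }
    if (s->in_flight) sr |= SR_BSY;
    if (s->rx_count)  sr |= SR_RNE;
    return sr;
}

static uint32_t fake_data(void *ctx)
{
    fake_ssi_t *s = ctx;
    if (s->rx_count) s->rx_count--;
    return 0;
}
```

The point of the sketch is the exit condition: a word clocked in by the final frame of a direct-access write reaches the receive FIFO around the time Busy de-asserts, so a loop that tests only Busy, or that empties the FIFO while frames are still in flight, can leave one word behind for the next uDMA read to pick up first.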