Tool/software: Linux
I am using a custom board based on the TI EVM for AM437x.
I have tested using the linux kernel from ti-processor-sdk-linux-am437x-evm-04.00.00.04, and newer versions (4.12.y, 4.13.y, 4.14.y) from the linux mainline repo. All exhibit the following behaviour.
I need to receive "high speed" (1.5 Mbps to 3 Mbps) data through the UART (ttyS2). I get RX overruns using the default interrupt-driven receiver, so I have enabled DMA for that UART in the device tree. The serial ports have been configured to pass raw data, and all flow control has been turned off. I have captured the serial data on a logic analyzer and the transmitted data is correct.
With DMA enabled, the transfers contain extra NULL bytes. That is, when I send a payload of 8192 bytes, I sometimes receive more than 8192 bytes. The extra bytes are always 0x00's, and the remaining data is correct.
To test, I generated an 8k file that contains the following string "0123456789ABCDEF" repeated 512 times. Here is a hexdump of the source file:
hexdump -C /root/8k
00000000 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46 |0123456789ABCDEF|
*
00002000
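For anyone wanting to reproduce the test, a file with this content could be generated with a short Python sketch like the one below. The function names `make_pattern` and `write_pattern` are made up for illustration; only the 16-byte string and the repeat count of 512 come from the description above.

```python
PATTERN = b"0123456789ABCDEF"  # 16-byte string from the post


def make_pattern(repeats: int = 512) -> bytes:
    """Return the pattern repeated `repeats` times (512 * 16 = 8192 bytes)."""
    return PATTERN * repeats


def write_pattern(path: str, repeats: int = 512) -> int:
    """Write the test payload to `path` and return the number of bytes written."""
    data = make_pattern(repeats)
    with open(path, "wb") as f:
        f.write(data)
    return len(data)
```

Calling `write_pattern("/root/8k")` on the target would then produce the 8192-byte source file used in the test.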
I sent and received the data using the following command:
cat /dev/ttyS2 > /tmp/dump & sleep 1 ; cat /root/8k > /dev/ttyS4 ; sleep 10 ; kill -HUP %1 ; sleep 2 ; wc /tmp/dump ; hexdump -C /tmp/dump
The sleeps are just to make sure the processes are "ready". Here is an example output of the above command when the extra 0x00's are captured:
0 1 8248 /tmp/dump
00000000 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 46 |0123456789ABCDEF|
*
000000c0 00 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |.0123456789ABCDE|
000000d0 46 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |F0123456789ABCDE|
*
00000140 46 00 00 00 00 30 31 32 33 34 35 36 37 38 39 41 |F....0123456789A|
00000150 42 43 44 45 46 30 31 32 33 34 35 36 37 38 39 41 |BCDEF0123456789A|
*
00000720 42 43 44 45 46 30 31 00 00 00 00 00 32 33 34 35 |BCDEF01.....2345|
00000730 36 37 38 39 41 42 43 44 45 46 30 31 32 33 34 35 |6789ABCDEF012345|
*
00001380 36 37 38 39 41 42 43 44 45 46 00 00 00 00 00 00 |6789ABCDEF......|
00001390 00 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |.0123456789ABCDE|
000013a0 46 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |F0123456789ABCDE|
*
000015d0 46 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |F...............|
000015e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000015f0 00 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |.0123456789ABCDE|
00001600 46 30 31 32 33 34 35 36 37 38 39 41 42 43 44 45 |F0123456789ABCDE|
*
00001970 46 00 00 00 00 00 00 00 30 31 32 33 34 35 36 37 |F.......01234567|
00001980 38 39 41 42 43 44 45 46 30 31 32 33 34 35 36 37 |89ABCDEF01234567|
*
00002038
The extra 56 bytes are 0x00's. The other 8192 bytes match the source file. As seen above, the extra 0x00's are inserted into the stream at multiple locations.
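The check described above (extra bytes are all 0x00, everything else matches the source) can be automated instead of read off the hexdump. The following is a minimal Python sketch; `find_nul_runs` and `check_capture` are hypothetical helper names, and the approach only works because the test payload itself contains no 0x00 bytes.

```python
def find_nul_runs(data: bytes):
    """Return a list of (offset, length) for each run of 0x00 bytes in `data`."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            start = i
            while i < n and data[i] == 0:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs


def check_capture(sent: bytes, received: bytes):
    """Locate inserted NUL runs and test whether the rest matches `sent`.

    Assumes `sent` contains no 0x00 bytes, as with the ASCII test pattern.
    """
    runs = find_nul_runs(received)
    stripped = received.replace(b"\x00", b"")
    return runs, stripped == sent
```

Run against the contents of `/root/8k` and `/tmp/dump`, this would list the offset and length of every inserted run. For the capture shown above, the run lengths visible in the hexdump are 1, 4, 5, 7, 32 and 7 bytes, which sum to the 56 extra bytes.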
With the standard kernel module, the Rx DMA size is set to 48 bytes with a 48-byte FIFO threshold and a 48-byte DMA burst size. At the end of one DMA transfer, the driver directly reads any remaining bytes from the UART FIFO, then enables another DMA transfer.
With a customized kernel, I have determined that the extra 0x00's occur during the DMA of data from the UART FIFO to memory, not during the direct reads. Also, the extra 0x00's have not appeared in the first DMA transfer (after opening the serial port). I think the 0x00s are caused by the DMA reading from an empty FIFO.
I have also tested with a larger DMA buffer size of 1k with various FIFO thresholds and matching DMA burst sizes. In these tests, the first 1k of data is always good, and the bytes read directly from the FIFO between the end of one DMA transfer and the start of the next are always good. The extra 0x00's may occur anywhere during the DMA transfer (e.g., starting 252 bytes into a 1k buffer with threshold = 16 and burst = 16).
It appears that the DMA is triggered before the FIFO has enough data. Perhaps the DMA is incorrectly triggered at the start of the subsequent transfers, and it takes a few transfers to "catch up" and overtake the FIFO.
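To illustrate why this hypothesis would fit the symptoms, here is a toy Python model. It is not a model of the actual AM437x UART or EDMA hardware. It assumes, purely for illustration, that a DMA read from an empty FIFO returns 0x00 and that a single burst fires before the FIFO has filled to the burst size. The names `dma_burst`, `receive` and `early_fill` are invented for this sketch.

```python
from collections import deque


def dma_burst(fifo: deque, burst: int) -> bytes:
    """Read `burst` bytes; an empty FIFO yields 0x00 (assumed behaviour)."""
    out = bytearray()
    for _ in range(burst):
        out.append(fifo.popleft() if fifo else 0x00)
    return bytes(out)


def receive(payload: bytes, burst: int, early_fill: int) -> bytes:
    """Model one early-triggered burst followed by correctly sized reads.

    `early_fill` is the number of bytes in the FIFO when the first burst fires.
    """
    fifo = deque(payload[:early_fill])
    received = dma_burst(fifo, burst)       # early trigger: may read past the data
    fifo.extend(payload[early_fill:])       # the rest of the data arrives later
    while fifo:
        received += dma_burst(fifo, min(burst, len(fifo)))
    return received
```

With a 16-byte burst and only 12 bytes in the FIFO, the model returns the 12 data bytes, 4 padding zeros, and then the rest of the payload intact. That matches the observed pattern of extra 0x00's with no lost or corrupted data, though it does not prove that this is what the hardware is doing.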
I haven't found any errata directly related to the above problem, and the TRM is somewhat vague about the Rx Trigger from the UART.
Is this a spurious trigger? Are there work-arounds? Anyone else notice this problem?