Hello,
We have a case where UART0 on the BBB / AM3359 is used in FIFO interrupt mode. The driver is based on the StarterWare code, which has been modified for our purposes.
TX_FIFO_TRIG is set to a value >1 (e.g. 8, 16 etc., using either granularity 1 or 4), FIFO operation is enabled etc.
Whenever the THR interrupt occurs, the SW writes a maximum of Y (e.g. Y = 63-TX_FIFO_TRIG) bytes into the TX FIFO. There is a guard mechanism in place to stop writing when TXFIFO_LVL exceeds e.g. 62. (A TXFIFOFULL check worked as well.)
When there is nothing to send, the THR interrupt is disabled. It is enabled again when we have something to send. (I also had the interrupt disabled for the duration of writing to the FIFO like they had in StarterWare example, but it did not have any impact on this issue or anything else).
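The refill scheme described above can be sketched roughly as follows. This is a host-side model under stated assumptions, not the actual driver and not the StarterWare API: `mock_uart_t`, `tx_buffer_t`, `mock_uart_put` and `uart_thr_isr` are hypothetical names, and the FIFO is simulated by a plain struct so that the burst limit (Y = 63 - TX_FIFO_TRIG) and the TXFIFO_LVL guard can be checked in isolation.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_FIFO_SIZE 64u  /* TX FIFO depth on the AM335x UART */
#define TX_LVL_GUARD 62u  /* stop writing once the level exceeds this */

/* Hypothetical stand-in for the UART hardware (assumption, not a real API). */
typedef struct {
    uint8_t  fifo[TX_FIFO_SIZE];
    unsigned level;           /* models TXFIFO_LVL */
    int      thr_it_enabled;  /* models the THR interrupt enable bit */
} mock_uart_t;

/* Software transmit buffer that the handler drains. */
typedef struct {
    const uint8_t *data;
    size_t         len;
    size_t         pos;
} tx_buffer_t;

/* Models one write to THR; a full FIFO silently drops the byte here. */
static void mock_uart_put(mock_uart_t *u, uint8_t c)
{
    if (u->level < TX_FIFO_SIZE) {
        u->fifo[u->level++] = c;
    }
}

/*
 * Body of the THR interrupt handler: write at most Y = 63 - tx_fifo_trig
 * bytes, stop early if the FIFO level exceeds the guard, and disable the
 * THR interrupt when nothing is left to send.
 * Returns the number of bytes written in this invocation.
 */
static size_t uart_thr_isr(mock_uart_t *u, tx_buffer_t *b, unsigned tx_fifo_trig)
{
    size_t max_burst = (tx_fifo_trig < 63u) ? (63u - tx_fifo_trig) : 0u;
    size_t written = 0;

    while (written < max_burst && b->pos < b->len && u->level <= TX_LVL_GUARD) {
        mock_uart_put(u, b->data[b->pos++]);
        written++;
    }
    if (b->pos == b->len) {
        u->thr_it_enabled = 0;  /* nothing left: stop THR interrupts */
    }
    return written;
}
```

On real hardware the level would be read from the TXFIFO_LVL register on every iteration rather than tracked in software; the model only illustrates the control flow.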
The approach and the driver have been verified as follows:
- it can perform hundreds of thousands of transfers (overnight) of different lengths over UART at 115200 bps, 1 stop bit, no flow control, with no failures. These transfers are either back to back or separated by a short period (of RX over UART0), in two different test cases.
- debug code shows that the approach works, i.e. the FIFO is fed so that it neither underflows nor overflows. (I write a special character into the stream each time the FIFO is written, so I can see how often I am feeding it. This is indirect evidence, but as the pattern makes sense, I would say it has some value in proving that it works. With the FIFO empty at the start, the first write is longer; while feeding the UART, the number of bytes written depends on where I have set the trigger; and the final write can of course be shorter again.)
- by removing the guard mechanism (TXFIFO_LVL) I can make UART0 send out garbage (this is supposed to prove that it does not overflow normally)
However, the problem starts when the driver is used for sending out buffers periodically, with a pause of approx. one second between transmissions. The line is silent for ~1 second and then we start the transmission.
Occasionally, but not always, the first transfer attempt after a 1 second pause results in garbage being sent. The only difference between this write and the rest is that it is the first one in the sequence. (It used to consist of more bytes than the rest because the FIFO was initially empty, but I reduced the length of the first FIFO fill to the level of the typical later fills, so this is also ruled out.) The subsequent FIFO writes again yield fully successful transmission. The frequency of this failure varies, but generally there is a failure every few seconds, e.g. 1 out of 5 buffer writes, which happen at 1 second intervals, fails.
My colleague looked at the UART0 output on the BBB (before it goes to the USB cable) with a scope:
- Timing and voltages look OK.
- There appear to be extra bits between characters at the beginning of a packet.
- The bits are well-defined with regard to timing and voltage, as if the UART is really sending them.
Several things were tried out to rule out possible root causes:
- Writing just one byte at a time using the UARTCharPut (and not UARTFIFOCharPut), it still failed (this time, only the first byte after the 1 second pause was garbage). This was a very surprising result as this approach effectively means that we don't write to the FIFO until it is completely empty.
- different combinations of TX_FIFO_TRIG and TXFIFO_LVL (as well as TXFIFOFULL, and other limitations to number of bytes written) were tried out
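For reference, the single-byte trial in the list above amounts to waiting until the transmitter reports empty and only then writing one byte. The sketch below is an assumption-laden illustration, not the StarterWare UARTCharPut source: `uart_ops_t`, `uart_put_when_idle` and the `fake_*` helpers are hypothetical, and the LSR bit positions are the conventional 16550-style ones (bit 5 for TX FIFO empty, bit 6 for shift register empty), which should be checked against the TRM. The fake device only exists so the logic can be exercised on a host.

```c
#include <stdint.h>

/* Assumed 16550-style LSR bit positions; verify against the TRM. */
#define LSR_TXFIFOE (1u << 5)  /* TX FIFO (hold register) empty */
#define LSR_TXSRE   (1u << 6)  /* TX FIFO and shift register empty */

/* Hypothetical register access hooks, so the logic runs without hardware. */
typedef struct {
    uint32_t (*read_lsr)(void *ctx);
    void     (*write_thr)(void *ctx, uint8_t c);
    void      *ctx;
} uart_ops_t;

/*
 * Poll until both "FIFO empty" and "shift register empty" are set, then
 * write a single byte. Returns 0 on success, -1 if the transmitter did not
 * become idle within max_polls reads of LSR.
 */
static int uart_put_when_idle(const uart_ops_t *ops, uint8_t c, unsigned max_polls)
{
    const uint32_t idle = LSR_TXFIFOE | LSR_TXSRE;

    while (max_polls--) {
        if ((ops->read_lsr(ops->ctx) & idle) == idle) {
            ops->write_thr(ops->ctx, c);
            return 0;
        }
    }
    return -1;
}

/* Fake device for host testing: busy for a fixed number of LSR reads. */
typedef struct {
    unsigned busy_polls;
    unsigned writes;
    uint8_t  last;
} fake_uart_t;

static uint32_t fake_read_lsr(void *ctx)
{
    fake_uart_t *f = (fake_uart_t *)ctx;
    if (f->busy_polls > 0) {
        f->busy_polls--;
        return 0;  /* transmitter still busy */
    }
    return LSR_TXFIFOE | LSR_TXSRE;
}

static void fake_write_thr(void *ctx, uint8_t c)
{
    fake_uart_t *f = (fake_uart_t *)ctx;
    f->last = c;
    f->writes++;
}
```

Bounding the poll count is a design choice of this sketch, so that a transmitter that never reports idle shows up as an error instead of a hang.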
Finally, it turns out that setting TX_FIFO_TRIG to 0 (!) or 1 removes this instability from the system. (A value of 2 also appears to work, but this has not been verified properly.) With 3 and above, it starts failing again.
The only difference to the trial with UARTCharPut (as described above) is the TX_FIFO_TRIG setting, and the fact that in the UARTCharPut case we (potentially) come to polling the FIFO empty bit (also tried the shift register empty condition) a bit earlier.
Based on this, TX_FIFO_TRIG values >1 (or maybe >2) behave in a way that I do not yet understand.
(Or else the system does not tolerate accesses to the UART0 HW while the FIFO is in the process of writing into the transmit shift register and/or out of the chip.)
Would you have a theory what could have caused this behaviour?
There is the advisory 1.0.12 UART: Extra Assertion of FIFO Transmit DMA Request, UARTi_DMA_TX.
Also, some Linux discussions that I found via Google indicate that another issue might be present:
<citation starts>
At least on AM335x the following problem exists: Even if the TX FIFO is
empty and a TX transfer is programmed (and started) the UART does not
trigger the DMA transfer.
After $TRESHOLD number of bytes have been written to the FIFO manually the
UART reevaluates the whole situation and decides that now there is enough
room in the FIFO and so the transfer begins.
This problem has not been seen on DRA7 or beaglebone (OMAP3). I am not
sure if this is UART-IP core specific or DMA engine.
The workaround is to use a threshold of one byte, program the DMA
transfer minus one byte and then to put the first byte into the FIFO to
kick start the transfer.
<citation ends>
These are both related to using DMA, which we do not use. However, the latter case bears some similarity to ours, with the FIFO being empty and having difficulties in making a successful transmission. (In our case, if for some reason the FIFO did not start running correctly when we first wrote to it, we could actually overflow the system with the subsequent writes, and by overflowing it on purpose I can make it behave similarly. However, I emphasise that this is all speculation based on superficial similarities.)
Do you have any further information on possible issues with TX FIFO interrupt generation in the non-DMA use case as well?