AM625: AM625x UART 2Mbps packet loss issue

Part Number: AM625

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1504677/am625-run-multiple-serial-ports-simultaneously-and-toggle-the-wired-nic-to-introduce-interference

The thread linked above covers my earlier report of packet loss on an AM625x UART at 2 Mbps. At that time, with the GPU unused, the UART ran at 2 Mbps without packet loss, and raising the CPU load with stress-ng did not cause packet loss either.
Recently I have noticed that once GPU utilisation exceeds about 12% (a Qt application calling OpenGL for real-time rendering), the UART starts to report overrun errors (OE), as shown by cat /proc/tty/driver/serial:


root@am62xx-evm:~# cat /sys/kernel/debug/pvr/status
Driver Status:   OK
Device ID: 0:128
Firmware Status: OK
Server Errors:   0
HWR Event Count: 0
CRR Event Count: 0
SLR Event Count: 0
WGP Error Count: 0
TRP Error Count: 0
FWF Event Count: 0
APM Event Count: 10
GPU Utilisation: 14%
DM Utilisation:  VM0
           2D:   0%
         GEOM:   0%
           3D:  13%
          CDM:   0%
          
root@am62xx-evm:~# cat /proc/tty/driver/serial
serinfo:1.0 driver revision:
0: uart:unknown port:00000000 irq:0
1: uart:unknown port:00000000 irq:0
2: uart:8250 mmio:0x02800000 irq:240 tx:258355 rx:141 RTS|DTR|DSR
3: uart:8250 mmio:0x02810000 irq:241 tx:0 rx:0 RTS|DTR
4: uart:8250 mmio:0x02820000 irq:242 tx:19305 rx:325769 RTS|DTR|DSR
5: uart:unknown port:00000000 irq:0
6: uart:8250 mmio:0x02840000 irq:243 tx:0 rx:1 brk:1 RTS|DTR|DSR
7: uart:8250 mmio:0x02850000 irq:244 tx:87498 rx:37328920 oe:440 RTS|DTR|DSR
8: uart:8250 mmio:0x02860000 irq:245 tx:18791 rx:195160 RTS|DTR|DSR
9: uart:unknown port:00000000 irq:0
10: uart:unknown port:00000000 irq:0
11: uart:unknown port:00000000 irq:0

ttyS7 (oe:440 above) is losing data very badly.
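
The overrun counter can also be pulled out of that file with a script, which makes it easier to log over time. This is a minimal sketch, assuming the line format shown in the dump above. `oe_count` is a hypothetical helper name, and it takes the file path as an argument so it can be tried on a saved copy.

```shell
# oe_count FILE PORT: print the overrun ("oe:") counter for one port line
# of a /proc/tty/driver/serial style file.
# Prints 0 if the port line has no oe: field; exits non-zero if the port
# line is not found at all.
oe_count() {
    awk -v port="$2" '
        $1 == (port ":") {
            n = 0
            for (i = 2; i <= NF; i++)
                if ($i ~ /^oe:[0-9]+$/) n = substr($i, 4)
            print n
            found = 1
        }
        END { if (!found) exit 1 }
    ' "$1"
}
```

On the target this would be called as `oe_count /proc/tty/driver/serial 7` for ttyS7.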

 

SDK:          ti-processor-sdk-linux-am62xx-evm-10.01.10.04-Linux-x86-Install.bin
Linux kernel: 6.6.58+


The UART driver only starts to report OE errors when GPU utilisation is above 12%.
TI support previously replied that this UART does not support DMA. My guess is that frequent GPU interrupts delay servicing of the UART, which runs in non-DMA (interrupt-driven) mode, and that this causes the packet loss.

  • Hi,

    Sorry I didn't get a chance to look into this today. I will get back to you by this Thursday.

  • Hi Dongliang,

    The oe count in procfs means the UART RX FIFO has overflowed.

    In the previous E2E thread I mentioned IRQ affinity. Are you using it to handle the UART and GPU IRQs on the two A53 cores separately?
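
To put the overflow in numbers: in interrupt-driven mode the CPU has to drain the RX FIFO before its remaining free slots fill up. The sketch below estimates that deadline. It assumes 8N1 framing (10 bits per byte on the wire) and a continuous byte stream, and it treats the FIFO depth and the RX interrupt trigger level as parameters. The values 64 and 48 in the usage note are assumptions for illustration, not figures confirmed in this thread, and `rx_budget_us` is a hypothetical helper name.

```shell
# rx_budget_us BAUD FIFO_DEPTH TRIGGER_LEVEL
# Prints roughly how many microseconds the CPU has, after the RX interrupt
# fires at TRIGGER_LEVEL bytes, before a FIFO of FIFO_DEPTH bytes is full.
# Assumes 10 bits per byte on the wire (8N1) and back-to-back bytes.
rx_budget_us() {
    baud=$1 depth=$2 trigger=$3
    # Time per byte in nanoseconds: 10 bits divided by the baud rate.
    byte_ns=$(( 10 * 1000000000 / baud ))
    echo $(( (depth - trigger) * byte_ns / 1000 ))
}
```

With those assumed values, `rx_budget_us 2000000 64 48` prints 80. Under these assumptions, any stretch of roughly 80 microseconds in which the UART interrupt is not serviced could produce an overrun at 2 Mbps.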

  • Checking the interrupt numbers:
    root@am62xx-evm:~# cat /proc/interrupts
               CPU0       CPU1       
     11:   18315696   20059903     GICv3  30 Level     arch_timer
     14:          0          0     GICv3 130 Level     pinctrl
     15:          0          0     GICv3  23 Level     arm-pmu
     16:      53512          0     GICv3  66 Level     4d000000.mailbox thr_012
     25:          0          0     GICv3 269 Level     2b10000.audio-controller_rx
     26:          0          0     GICv3 270 Level     2b10000.audio-controller_tx
     27:     117305          0     GICv3 193 Level     20000000.i2c
     28:         74          0     GICv3 194 Level     20010000.i2c
     29:      49429          0     GICv3 195 Level     20020000.i2c
     48:     986221          0  MSI-INTA 1714176 Edge      485c0100.dma-controller chan0
     60:      12118          0  MSI-INTA 1714688 Level     485c0100.dma-controller chan0
     78:          0          0  MSI-INTA 1715718 Edge      485c0100.dma-controller chan1
     96:          0          0  MSI-INTA 1716230 Level     485c0100.dma-controller chan1
    118:         30          0  MSI-INTA 1970707 Level     8000000.ethernet-tx0
    126:          0         32  MSI-INTA 1970715 Level     8000000.ethernet-tx1
    134:         26          0  MSI-INTA 1970723 Level     8000000.ethernet-tx2
    142:          0         32  MSI-INTA 1970731 Level     8000000.ethernet-tx3
    150:         45          0  MSI-INTA 1970739 Level     8000000.ethernet-tx4
    158:          0         48  MSI-INTA 1970747 Level     8000000.ethernet-tx5
    166:      15749          0  MSI-INTA 1970755 Level     8000000.ethernet-tx6
    174:          0         29  MSI-INTA 1970763 Level     8000000.ethernet-tx7
    208:     127573          0  MSI-INTA 1971731 Level     8000000.ethernet
    240:     175044          0     GICv3 210 Level     2800000.serial
    241:          0          0     GICv3 211 Level     2810000.serial
    242:     343207          0     GICv3 212 Level     2820000.serial
    243:          0          0     GICv3 214 Level     2840000.serial
    244:   23615107          0     GICv3 215 Level     2850000.serial
    245:     600530          0     GICv3 216 Level     2860000.serial
    246:          0          0     GICv3 134 Level     8000000.ethernet
    256:          0          0     GICv3 220 Level     xhci-hcd:usb1
    257:         34          0     GICv3 258 Level     xhci-hcd:usb2
    258:      59816          0     GICv3 165 Level     mmc0
    259:         52          0     GICv3 114 Level     mmc2
    308:          0          0      GPIO  38 Edge    -davinci_gpio  1-0039
    369:         22          0      GPIO   3 Edge    -davinci_gpio  egalax_i2c
    418:    2575257          0     GICv3 116 Level     tidss
    419:          0          0     GICv3 115 Level     mmc1
    420:    3646414          0     GICv3 118 Level     pvrsrvkm
    IPI0:    296497     716920       Rescheduling interrupts
    IPI1:   7130301   21971156       Function call interrupts
    IPI2:         0          0       CPU stop interrupts
    IPI3:         0          0       CPU stop (for crash dump) interrupts
    IPI4:         0          0       Timer broadcast interrupts
    IPI5:    937695     881413       IRQ work interrupts
    IPI6:         0          0       CPU wake-up interrupts
    Err:          0
    root@am62xx-evm:~#

    UART6 (2 Mbps) is IRQ 244, handled on CPU0:
    244: 23615107 0 GICv3 215 Level 2850000.serial

    The GPU driver is IRQ 420, also handled on CPU0:
    420: 3646414 0 GICv3 118 Level pvrsrvkm

    Bind GPU IRQ 420 to CPU1:
    # echo 2 > /proc/irq/420/smp_affinity
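
`smp_affinity` takes a hexadecimal CPU bitmask, so the `2` in the command above is bit 1, that is CPU1. A small sketch of the conversion follows. `cpu_to_mask` is a hypothetical helper, not part of any existing tool.

```shell
# cpu_to_mask CPU: print the hex bitmask that selects a single CPU index,
# in the form expected by /proc/irq/<n>/smp_affinity.
cpu_to_mask() {
    printf '%x\n' $(( 1 << $1 ))
}
```

For example, `cpu_to_mask 1` prints 2 and `cpu_to_mask 0` prints 1. The result can be written to `/proc/irq/420/smp_affinity` as root.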


    root@am62xx-evm:~# cat /proc/interrupts
               CPU0       CPU1       
    244:   23757254          0     GICv3 215 Level     2850000.serial
    420:    3666378       2380     GICv3 118 Level     pvrsrvkm

    root@am62xx-evm:~# cat /proc/interrupts
               CPU0       CPU1       
    244:   23757716          0     GICv3 215 Level     2850000.serial
    420:    3666378       2454     GICv3 118 Level     pvrsrvkm

    The binding took effect: for IRQ 420 only the CPU1 count is still increasing.
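
The check above compares two snapshots of `/proc/interrupts`. A sketch that automates the comparison is shown below, assuming the two-CPU column layout of this board. `irq_delta` is a hypothetical helper that reads two saved snapshot files.

```shell
# irq_delta BEFORE AFTER IRQ: print the per-CPU increase in the interrupt
# count for IRQ between two saved copies of /proc/interrupts.
# Assumes exactly two CPU columns (CPU0 and CPU1).
irq_delta() {
    awk -v irq="$3" '
        $1 == (irq ":") {
            if (NR == FNR) { c0 = $2; c1 = $3 }   # first file: remember counts
            else printf "CPU0 +%d CPU1 +%d\n", $2 - c0, $3 - c1
        }
    ' "$1" "$2"
}
```

Fed with the two snapshots shown above, IRQ 420 gives `CPU0 +0 CPU1 +74`, while IRQ 244 gives `CPU0 +462 CPU1 +0`.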
    Command to watch the UART error counters:
    watch -n1 -d cat /proc/tty/driver/serial

    Command to watch GPU utilisation:
    watch -n1 -d cat /sys/kernel/debug/pvr/status

    Command to watch the interrupt counts grow:
    watch -n1 -d cat /proc/interrupts



    Result:
    Every 1.0s: cat /sys/kernel/debug/pvr/status    am62xx-evm: Tue Jun 30 23:46:28 2026

    Driver Status:   OK
    Device ID: 0:128
    Firmware Status: OK
    Server Errors:   0
    HWR Event Count: 0
    CRR Event Count: 0
    SLR Event Count: 0
    WGP Error Count: 0
    TRP Error Count: 0
    FWF Event Count: 0
    APM Event Count: 37
    GPU Utilisation: 43%
    DM Utilisation:  VM0
               2D:   0%
             GEOM:   1%
               3D:  42%
              CDM:   0%


    Every 1.0s: cat /proc/tty/driver/serial    am62xx-evm: Tue Jun 30 23:47:09 2026

    serinfo:1.0 driver revision:
    0: uart:unknown port:00000000 irq:0
    1: uart:unknown port:00000000 irq:0
    2: uart:8250 mmio:0x02800000 irq:240 tx:255277 rx:700 RTS|DTR|DSR
    3: uart:8250 mmio:0x02810000 irq:241 tx:0 rx:0 RTS|DTR
    4: uart:8250 mmio:0x02820000 irq:242 tx:5689 rx:218707 RTS|DTR|DSR
    5: uart:unknown port:00000000 irq:0
    6: uart:8250 mmio:0x02840000 irq:243 tx:0 rx:0 DSR
    7: uart:8250 mmio:0x02850000 irq:244 tx:71224 rx:25825006 oe:81 RTS|DTR|DSR
    8: uart:8250 mmio:0x02860000 irq:245 tx:13002 rx:135524 RTS|DTR|DSR
    9: uart:unknown port:00000000 irq:0
    10: uart:unknown port:00000000 irq:0
    11: uart:unknown port:00000000 irq:0


    Experiment summary:
    With the GPU driver IRQ bound to CPU0 and the UART IRQ bound to CPU1 (network interrupts on CPU0):
    At about 15% GPU utilisation, two OE errors occurred in about 8 minutes (compared with one OE error about every 50 seconds before). The problem still persists.
    At about 45% GPU utilisation, the errors are triggered less often than before, but the problem still persists (roughly two OE errors every 20 seconds).

    Binding the IRQs to specific CPU cores improves things, but the problem still persists.
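
Putting the reported figures on a common scale makes them easier to compare. The small sketch below converts "N errors in T seconds" to errors per minute. The inputs are the approximate numbers quoted in the summary above, and `oe_per_min` is a hypothetical helper.

```shell
# oe_per_min ERRORS SECONDS: print the overrun rate in errors per minute.
oe_per_min() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", n * 60 / t }'
}

oe_per_min 1 50     # earlier figure, one error per 50 s:   1.20
oe_per_min 2 480    # ~15% GPU, two errors in 8 minutes:    0.25
oe_per_min 2 20     # ~45% GPU, two errors per 20 s:        6.00
```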

  • Thanks for the test.

    Bind GPU interrupt number 420 to CPU1
    # echo 2 > /proc/irq/420/smp_affinity

    Since CPU0 handles the IRQs for most modules by default, can you please try moving the UART6 IRQ to CPU1 to see if that makes a difference?

    echo 1 > /proc/irq/244/smp_affinity_list
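
Note that `smp_affinity_list` takes CPU numbers rather than a bitmask, so writing `1` there selects CPU1, the same core that mask `2` selects through `smp_affinity`. The sketch below shows the relationship. `mask_to_list` is a hypothetical helper.

```shell
# mask_to_list HEXMASK: print the comma-separated CPU numbers set in a
# smp_affinity style hex mask (for example 3 -> 0,1).
mask_to_list() {
    mask=$(( 0x$1 ))
    cpu=0
    list=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            # Append this CPU number, adding a comma only after the first.
            list="${list:+$list,}$cpu"
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$list"
}
```

Here `mask_to_list 2` prints 1 and `mask_to_list 1` prints 0, so `echo 1 > /proc/irq/244/smp_affinity_list` and `echo 2 > /proc/irq/244/smp_affinity` should both move IRQ 244 to CPU1.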

    By the way, have you tested with the RT kernel to see if the problem is the same?

  • I have tried both arrangements, swapping the IRQs between CPU0 and CPU1, but the problem still persists.
    I haven't tried the RT kernel yet. I will test it on my side in the next couple of days.

  • Thanks for the experiments. I will try to replicate the issue on my EVM later next week. I am currently out of office for the holidays and will be back in the middle of next week.