AM625: How to enable uart dma in RT-LINUX?

Part Number: AM625
Other Parts Discussed in Thread: AM623

Hi 

Do we have any FAE or guide to show how to implement UART DMA in RT-LINUX?

SDK 8.6

Regards

Zekun

  • Subject: UART5 Data Loss & Input Overrun Issue on AM623 Processor

    Problem Description:

    I'm using the AM623 processor and encountering occasional data loss (1-3 bytes missing) during UART5 reception at 115200 baud. This occurs approximately once every few hundred transfers. The kernel logs show repeated input overrun errors:

    [ 3816.583194] ttyS ttyS5: 1 input overrun(s)
    [ 3921.439083] ttyS ttyS5: 1 input overrun(s)
    [ 3968.364367] ttyS ttyS5: 1 input overrun(s)
    [ 4055.473693] ttyS ttyS5: 1 input overrun(s)

    Current UART5 Configuration (DTS):

    main_uart5: serial@2850000 {
        compatible = "ti,am64-uart", "ti,am654-uart";
        reg = <0x00 0x02850000 0x00 0x100>;
        interrupts = <GIC_SPI 183 IRQ_TYPE_LEVEL_HIGH>;
        power-domains = <&k3_pds 156 TI_SCI_PD_EXCLUSIVE>;
        clocks = <&k3_clks 156 0>;
        clock-names = "fclk";
    };

    Request:

    To resolve the overrun issue, I propose adding DMA support to UART5. Please guide me on how to configure DMA for UART5.


  • Hi Huanhuan,

    UART shouldn't overrun at 115200 baud. Even if it does, DMA won't help.

    Can you please test with the SDK 10.1 kernel to see if the issue still happens?

  • Hi LiuBin,

    I reproduced the same problem across three SDK versions: 08_06_00_42, 09_00_00_03, and 09_01_00_08, but the issue does not occur in the 10_01_10_04 SDK version.

    Steps to Reproduce

    1. PC side: Send ~180 bytes of data to the AM62 board every 100 ms or 500 ms.
    2. AM62 side, data reception:
      • Step 1: Configure the serial port with:
        stty -F /dev/ttyS3 ispeed 115200 ospeed 115200 cs8 -echo time 1
      • Step 2: Log received data using:
        cat /dev/ttyS3 > serial_115200_100_atk.log

    Observed Issue

    • Roughly one frame in every 200 loses 1–3 bytes.
    • The kernel logs input overrun errors:
      [ 1895.708432] ttyS ttyS3: 1 input overrun(s)
      [ 2280.827981] ttyS ttyS3: 1 input overrun(s)
      [ 2294.383918] ttyS ttyS3: 1 input overrun(s)
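
To correlate lost frames with the overrun messages above, it can help to tally the overruns per port from the kernel log. The following is only a minimal sketch: the function name `count_overruns` is hypothetical, and the regular expression assumes the log lines look exactly like the ones quoted in this thread.

```python
import re
from collections import Counter

# Matches lines such as "[ 1895.708432] ttyS ttyS3: 1 input overrun(s)".
# The pattern is an assumption based on the log excerpts in this thread.
OVERRUN_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+ttyS\s+(ttyS\d+):\s+(\d+)\s+input overrun\(s\)"
)

def count_overruns(log_text):
    """Return {port: total overruns} summed over all matching log lines."""
    totals = Counter()
    for match in OVERRUN_RE.finditer(log_text):
        _timestamp, port, count = match.groups()
        totals[port] += int(count)
    return dict(totals)
```

For example, the output of `dmesg` can be saved to a file and its contents passed to `count_overruns()`; comparing the totals before and after a test run shows how many overruns that run produced.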

    Request
    Since the newer SDK (10_01_10_04) fixes this issue, please provide a patch that we can apply to the older SDK versions.

    Hello LiuBin, I found the two links below. Do they address the issue I described?
    In actual testing, applying the fix provided in the link made byte loss more frequent, with 5 out of 11 packet groups losing bytes.

    https://lore.kernel.org/lkml/ffbe6439-9696-4abe-976b-07286b37a219@ti.com/T/
    https://www.ti.com/lit/er/sprz536b/sprz536b.pdf?ts=1742317798669



  •     lore.kernel.org/...

    Yes, this is the patch which fixes the data loss problem.

        In actual testing, applying the fix provided in the link made byte loss more frequent, with 5 out of 11 packet groups losing bytes.

    Did you apply v3 or v4 of this patch?

  • Bin:

    I wanted to ask whether the same DMA issue occurred on AM64x Linux-RT prior to version 10.1?

    Thanks

    Jim

  • Jim,

    Yes, it would affect AM64x too. It relates to erratum i2310.

  • Jim,

    For future reference, when you have a question related to an e2e thread, please don't ask it in the same thread. Instead, click the yellow "+ ask a related question" button at the top right of the thread page; it will open a new e2e thread and link it to the relevant thread.

  • Yes, we applied v4 of this patch on 08_06_00_42.

  • I compared the UART driver between SDK 8.6 and 10.1. There are quite a few differences besides the patch you applied, but I am not certain which other changes are needed to fix the data loss problem you are seeing.

  • In SDK 8.6, instead of the v4 patch you applied, can you please try the kernel patch below to see if it fixes the data loss problem? The patch basically removes the "UART_RX_TIMEOUT_QUIRK" flag.

    diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
    index 537bee8d2258..acc017baf325 100644
    --- a/drivers/tty/serial/8250/8250_omap.c
    +++ b/drivers/tty/serial/8250/8250_omap.c
    @@ -1254,8 +1254,7 @@ static struct omap8250_dma_params am33xx_dma = {
     
     static struct omap8250_platdata am654_platdata = {
            .dma_params     = &am654_dma,
    -       .habit          = UART_HAS_EFR2 | UART_HAS_RHR_IT_DIS |
    -                         UART_RX_TIMEOUT_QUIRK,
    +       .habit          = UART_HAS_EFR2 | UART_HAS_RHR_IT_DIS,
     };
     
     static struct omap8250_platdata am33xx_platdata = {

  • After removing the UART_RX_TIMEOUT_QUIRK flag, the kernel still prints the message "ttyS ttyS5: 1 input overrun(s)", and data bytes are still missing.

  • Okay, this means the issue is not related to erratum i2310, but to something else. The kernel differs substantially between SDK 8.6 and 10.1, so it won't be easy to tell which software change in the SDK 10.1 kernel fixes the problem.

  • Hi Bin,

    I noticed that after applying a TI patch, the UART doesn't lose bytes when our service processes are not running. However, when the service programs are running, byte loss occurs frequently. The byte loss always coincides with UART_LSR_OE being set. The chip's FIFO is 64 bytes in size, and the losses repeatedly occur at the 65th byte position. Since we're using RT-Linux with forcibly threaded interrupts, I suspect that when services are running, the threaded interrupts are not handled promptly enough. Therefore, I still hope TI can provide a device tree configuration reference for UART DMA to help me verify this hypothesis.

  • Hi Huanhuan,

    DMA won't help in this case, and it would cause more problems.

    Which AM625 device do you use? How many A53 cores does it have? If more than one A53, have you tried to use irq affinity for this UART port?
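
For readers who want to try the IRQ affinity suggestion, here is a minimal sketch of how the UART interrupt could be pinned to one core through `/proc/irq/<n>/smp_affinity`. The helper names are hypothetical, and the assumption that the UART appears in `/proc/interrupts` under a name such as `2850000.serial` should be checked on the actual board. Writing the affinity file requires root.

```python
def cpu_mask_hex(cpus):
    """Build the hex bitmask expected by smp_affinity, e.g. [3] -> '8'."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def find_irq(interrupts_text, name_fragment):
    """Return the IRQ number whose /proc/interrupts line ends with a field
    containing name_fragment, or None if no such line exists."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue  # header line or malformed line
        irq = fields[0][:-1]
        if irq.isdigit() and name_fragment in fields[-1]:
            return int(irq)
    return None

def pin_irq(irq, cpus):
    """Restrict the given IRQ to the listed CPUs (needs root privileges)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask_hex(cpus))
```

A possible use is to read `/proc/interrupts`, pass its text to `find_irq()` with the UART's device name, and then call `pin_irq(irq, [3])` to bind it to CPU 3. On a PREEMPT_RT kernel the handler itself runs in a kernel thread, so the scheduling priority of that thread relative to the application threads also matters, not only the CPU it runs on.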

  • Hi Bin,

    Thank you for your support.

    Root Cause Analysis:
    Real-time Linux (RT-Linux) forces all interrupt service routines (ISRs) to be threaded, meaning interrupts are handled by kernel threads, which then compete with user processes and other kernel threads for scheduling. TI's patch addresses a bug in TI's UART controller and reduces the probability of byte loss, but it does not fully resolve the issue. When the product line's service processes run, the system not only spawns additional processes but also wakes up many other kernel threads. In this scenario, the UART receive thread cannot be scheduled in time, so the RX FIFO overflows and bytes are lost persistently.

    Resolution and Validation:
    The UART receive interrupt trigger threshold was adjusted from the default 48 bytes (the chip’s RX FIFO size is 64 bytes) to 16 bytes. This adjustment was made because:

    • Many processors’ UART FIFO buffers are 8/16/32 bytes in size.
    • Threaded interrupts introduce scheduling latency.
    • No reference configuration for UART DMA is available, and enabling DMA could introduce other issues.

    Verification Test:
    Two UART channels simultaneously received 180 bytes every 100 ms. After 20,000 data sets across both channels, zero byte loss was observed.
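
A rough back-of-the-envelope sketch of why lowering the trigger level helps: once the RX interrupt fires, the handler thread has to run before the remaining free FIFO slots fill up. The calculation below assumes 10 bits per character (start bit, 8 data bits, one stop bit, no parity; the thread only states `cs8`), back-to-back bytes within a burst, and an interrupt raised exactly at the trigger level. The function names are hypothetical.

```python
def char_time_s(baud, bits_per_char=10):
    """Time on the wire for one character, in seconds."""
    return bits_per_char / baud

def overrun_budget_s(baud, fifo_depth, trigger_level, bits_per_char=10):
    """Approximate time between the RX trigger interrupt and a full FIFO."""
    free_slots = fifo_depth - trigger_level
    return free_slots * char_time_s(baud, bits_per_char)

for trigger in (48, 16):
    budget_ms = overrun_budget_s(115200, 64, trigger) * 1e3
    print(f"trigger={trigger:2d}: about {budget_ms:.2f} ms before the FIFO fills")
# trigger=48: about 1.39 ms before the FIFO fills
# trigger=16: about 4.17 ms before the FIFO fills
```

Under these assumptions, moving the trigger from 48 to 16 bytes roughly triples the scheduling latency the receive thread can tolerate, which is consistent with the result reported above. It does not remove the dependency on scheduling; a stall longer than the budget would still overrun.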

     

  • Hi Huanhuan,

    Resolution and Validation:
    The UART receive interrupt trigger threshold was adjusted from the default 48 bytes (the chip’s RX FIFO size is 64 bytes) to 16 bytes.

    Thanks for sharing the resolution.

  • Hi Bin,

    Our product is currently undergoing an SDK upgrade from version 08_06_00_42 to 10_01_10_04. Regarding the UART DMA issue you mentioned, I would like to confirm whether this problem still persists in the latest 10_01_10_04 version. I'm also very interested in the DMA-related issues. Could you share what specific problems DMA might introduce? Are there any technical documents we can reference for further investigation?

    The root cause of the byte loss is related to interrupt threading, where the UART receive thread may not be scheduled promptly enough to read the receive registers. To improve the robustness of UART reception, we intend to implement DMA in the new SDK. Our concern is that even with a 16-byte interrupt threshold, transient system overloads might still lead to RX FIFO overflow due to scheduling delays. We believe enabling DMA could mitigate this risk by reducing reliance on real-time thread scheduling.

  • Hi Huanhuan,

    There isn't much structural change in the UART drivers from SDK 8.6 to 10.1, so I don't expect a change in behavior.

    The problem is in the kernel scheduling. That is why I mentioned UART IRQ affinity, which would likely resolve the problem.

    With DMA enabled, the kernel still has to service the DMA transfer completion interrupt, which also relies on kernel scheduling.

  • Hi Bin,

    We have already attempted binding the UART interrupt to a dedicated CPU core via IRQ affinity, but the byte loss issue persists.

  • Hi Huanhuan,

        I reproduced the same problem across three SDK versions: 08_06_00_42, 09_00_00_03, and 09_01_00_08, but the issue does not occur in the 10_01_10_04 SDK version.
        Our product is currently undergoing an SDK upgrade from version 08_06_00_42 to 10_01_10_04. Regarding the UART DMA issue you mentioned, I would like to confirm whether this problem still persists in the latest 10_01_10_04 version.

    About two weeks ago you said the issue does not happen with SDK 10.1.10.4, but now you are asking whether the problem is still present in SDK 10.1.10.4. Can you please clarify the discrepancy? Are you talking about different UART use cases or problems?

    Since you are now working on upgrading to SDK 10.1, I'd like to understand the exact UART test case and failure, so that I can reproduce it on my side and look into it.