Linux/AM6548: UART receive errors

Part Number: AM6548

Tool/software: Linux

Hi

I have a problem when using the UART.

The EVM board does not receive the data completely: some data is missing and some is corrupted.

Reproduce:

1. SDK version:

root@am65xx-evm:~# uname -a                                                     
Linux am65xx-evm 4.19.38-g4dae378bbe #14 SMP PREEMPT Mon Aug 5 14:38:57 CST 201x

2. Test steps:

  Set the UART baud rate: 9600

PC program: uart_send.sh

#!/bin/bash

count=1000;

for((i=1;i<=$count;i++));
do
echo  "0;1;3;4;6;7;8;9;a;b;c;d;e;f;10;11;12;13;14;15;16;17;18;19;1a;1b;1c;1d;1e;1f;20;21;22;23;24;25;26;27;28;29;2a;2b;2c;2d;2e;2f;30;31;32;33;34;35;36;37;38;39;3a;3b;3c;3d;3e;3f;40;41;42;43;44;45;46;47;48;49;4a;4b;4c;4d;4e;4f;50;51;52;53;54;55;56;57;58;59;5a;5b;5c;5d;5e;5f;60;61;62;63;64;65;66;67;68;69;6a;6b;6c;6d;6e;6f;70;71;72;73;74;75;76;77;78;79;7a;7b;7c;7d;7e;7f;80;81;82;83;84;85;86;87;88;89;8a;8b;8c;8d;8e;8f;90;91;92;93;94;95;96;97;98;99;9a;#a3" > /dev/ttyUSB3;
done

EVM: uart_recv.sh

#!/bin/bash
while true; do
    cat /dev/ttyS1 > test.txt
done

First, run uart_recv.sh on the EVM, then run uart_send.sh on the PC.

Result: test.tar

The "#" character can be used to count how many items were received.

I sent 1000 items, but received only 991.

In addition, some of the received items are corrupted: they contain stray spaces or wrong data.
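The count described above can be checked mechanically. The following is a minimal sketch, assuming the received data is a text file with one record per line and that every intact record equals the string passed to echo in uart_send.sh. The function names count_records and count_damaged are illustrative and do not come from any tool mentioned in this thread.

```shell
#!/bin/bash
# count_records FILE: number of lines that contain the '#' end marker.
count_records() {
    awk '/#/ { n++ } END { print n + 0 }' "$1"
}

# count_damaged FILE EXPECTED: number of lines that differ from EXPECTED,
# where EXPECTED is the exact record the sender writes (hypothetical helper).
count_damaged() {
    # Appending "" forces a string comparison even for numeric-looking data.
    awk -v want="$2" '($0 "") != (want "") { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_records test.txt` gives the number of received items, and `count_damaged test.txt "$record"` gives the number of lines that do not match the record that was sent.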

  • Hi,

    Is this issue a follow-up to the original thread below?

    https://e2e.ti.com/support/processors/f/791/t/800007

    Which patches have you applied on top of the SDK kernel?

    How exactly do you set the baud rate and termios settings on both the EVM and the PC?

  • Hi

    I haven't applied any patches.

    The baud rate is the default, 9600. I have not used any other tool to set the baud rate.

    Currently, the SDK version is 6.0.

    With both SDK 6.0 and SDK 5.3, data is lost when communicating over the serial port.

    The DMA problem was supposed to be fixed in SDK 6.0, but it seems that the problem remains.

  • Hi,

    Asura said:
    I haven't applied any patches.

    Please apply the kernel patch I provided in the post below on top of the SDK 6.0 kernel; it fixes a DMA driver bug.

    https://e2e.ti.com/support/processors/f/791/p/825280/3061069#3061069

    Asura said:
    The baud rate is the default, 9600. I have not used any other tool to set the baud rate.

    How exactly have you confirmed that the baud rate is 9600 on both the EVM and the PC? The default setting is not always the same on both. The termios settings have to match on both ends of the UART connection.

    Have you tried the tool https://github.com/cbrake/linux-serial-test.git I recommended to you a few weeks ago? It configures the uart ports before transferring data for you.

    By the way, you haven't updated the related thread below for a few weeks. Is the issue still pending, or is it solved?

    https://e2e.ti.com/support/processors/f/791/t/825280
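The point about matching termios settings on both ends can be illustrated with stty. This is a minimal sketch, not the procedure used by anyone in this thread: it assumes raw 8N1 mode without hardware flow control is wanted, and that the stty on each side accepts the -F option. The function name configure_port and the STTY override variable are hypothetical.

```shell
#!/bin/bash
# STTY can be overridden (for example STTY=echo) to preview the command
# without touching a real port.
STTY=${STTY:-stty}

# configure_port DEVICE [BAUD]: put DEVICE into raw 8N1 mode without
# hardware flow control, so that echo/cat pass bytes through unmodified.
configure_port() {
    local dev="$1" baud="${2:-9600}"
    "$STTY" -F "$dev" "$baud" raw -echo cs8 -parenb -cstopb -crtscts
}
```

Running `configure_port /dev/ttyUSB3` on the PC and `configure_port /dev/ttyS1` on the EVM before the echo/cat test would give both ends the same settings. `stty -F <device> -a` prints the resulting configuration for comparison.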

  • Hi

    I have applied the patch 0001-dmaengine-ti-k3-udma-Fix-wrong-of_node_put-usage-on-.patch.txt you provided.

    The test results show that the problem remains; the patch does not solve it.

    I used stty to check that the baud rate is the same on both sides.

    Data can be sent and received, so this is not a problem with the baud rate setting.

    I have tried linux-serial-test, but it cannot count how much data was sent and how much was received, so it does not meet our needs.
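One way to count what was sent and what arrived, without modifying a test tool, is to number every record so the receiver can tell which ones are missing. The following is a minimal sketch under that assumption. The record layout `<seq>;payload#` and the function names send_numbered and find_missing are made up for illustration.

```shell
#!/bin/bash
# send_numbered COUNT: print COUNT records of the form "<seq>;payload#",
# one per line. Redirect the output to the serial device on the sender.
send_numbered() {
    local i=1
    while [ "$i" -le "$1" ]; do
        printf '%d;payload#\n' "$i"
        i=$((i + 1))
    done
}

# find_missing COUNT FILE: print every sequence number in 1..COUNT that has
# no intact record in FILE (either lost or corrupted in transit).
find_missing() {
    awk -v count="$1" '
        /^[0-9]+;payload#$/ { split($0, f, ";"); seen[f[1] + 0] = 1 }
        END { for (i = 1; i <= count; i++) if (!(i in seen)) print i }
    ' "$2"
}
```

On the sender this could be used as `send_numbered 1000 > /dev/ttyUSB3`, and on the receiver as `find_missing 1000 test.txt` once the transfer has finished.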

  • Hi,

    The linux-serial-test tool is an open-source project and the source code is available. It can be modified to send only a certain amount of data instead of running in an infinite loop.

    Alternatively, the filesystem on the SD card already has a tool called serialcheck, which can also be used to validate the UART. The following page has details on how to use it to do a loopback test.

    http://processors.wiki.ti.com/index.php/Linux_Core_UART_User%27s_Guide

    If you also want to test sending from a PC uart, you can compile the tool from source code for PC Linux. The source code is at

    https://github.com/nsekhar/serialcheck

  • Hi

    For now, I am not concerned with how to use the tools for a loopback test.

    The tools are just a way to verify the problem. The problem has already appeared, so I don't see the point of using another method to reproduce it.

    As described in the reproduction steps above, when I send data from the PC to the board, the board does not receive all of the data, and some of the received data is corrupted.

    That is my concern: the data loss and the data errors.

    I have already applied the patch you provided, and the problem remains. When I disable DMA and use interrupts only, everything seems to work.

    So DMA still has a problem in SDK 6.0.

    By the way, I disabled DMA by modifying the dts. Reference link:

  • Hi,

    I mentioned multiple test tools just to ensure that your test procedure is correct: the echo/cat commands don't set the port mode, so the termios settings of the ports have to match before using echo/cat.

    Anyway, I did the same echo/cat test on the AM654x EVM, transmitted 1000 times from the PC, and received 1000 times on the EVM.

    I first applied the following patch on the SDK 6.0 kernel to route UART1 to the "MCU UART1" header. I didn't apply the DMA patch I mentioned earlier; I believe it just fixes a memory leak.

    diff --git a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
    index 2ff3ac0faba1..defc2d25f90f 100644
    --- a/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
    +++ b/arch/arm64/boot/dts/ti/k3-am654-base-board.dts
    @@ -185,6 +185,15 @@
                    >;
            };
    
    +       mcu_uart0_pins_default: mcu_uart0_pins_default {
    +               pinctrl-single,pins = <
    +                       AM65X_WKUP_IOPAD(0x0044, PIN_INPUT, 4) /* (P4) MCU_OSPI1_D1.MCU_UART0_RXD */
    +                       AM65X_WKUP_IOPAD(0x0048, PIN_OUTPUT, 4) /* (P5) MCU_OSPI1_D2.MCU_UART0_TXD */
    +                       AM65X_WKUP_IOPAD(0x004c, PIN_INPUT, 4) /* (P1) MCU_OSPI1_D3.MCU_UART0_CTSn */
    +                       AM65X_WKUP_IOPAD(0x0054, PIN_OUTPUT, 4) /* (N3) MCU_OSPI1_CSn1.MCU_UART0_RTSn */
    +               >;
    +       };
    +
            mcu_fss0_ospi0_pins_default: mcu-fss0-ospi0-pins_default {
                    pinctrl-single,pins = <
                            AM65X_WKUP_IOPAD(0x0000, PIN_OUTPUT, 0) /* (V1) MCU_OSPI0_CLK */
    @@ -329,6 +338,11 @@
            status = "disabled";
     };
    
    +&mcu_uart0 {
    +       pinctrl-names = "default";
    +       pinctrl-0 = <&mcu_uart0_pins_default>;
    +};
    +
     &main_uart0 {
            pinctrl-names = "default";
            pinctrl-0 = <&main_uart0_pins_default>;
    @@ -363,6 +377,12 @@
                    reg = <0x21>;
                    gpio-controller;
                    #gpio-cells = <2>;
    +               p1 {
    +                       gpio-hog;
    +                       gpios = <14 GPIO_ACTIVE_HIGH>;
    +                       output-low;     /* output on header */
    +                       line-name = "uart_sel";
    +               };
            };
     };
    

    On the PC, I used your uart_send.sh script to transmit. On the EVM, I used the following command to receive:

    # while true; do cat /dev/ttyS1 > /dev/shm/test.txt; done

    Note that I write the output data to DDR shared memory instead of the SD card, to rule out any problem caused by slow SD card writes. After the PC finished transmitting:

    root@am65xx-evm:~# grep '#' /dev/shm/test.txt | wc -l
    1000

  • Hi

    Sorry for the late reply. I have used serialcheck, linux-serial-test, serialstats and other tools to verify the UART, and the test results show that the UART seems to work correctly.

    But I think the problem still exists, because when I use gdb/gdbserver to debug over the UART, debugging does not work correctly.

    This is why I tried many methods to reproduce the problem, but I failed to find a simple test case. So I would like to illustrate the problem using gdb/gdbserver.

    I have tried the following:

    1. Disable DMA and use interrupts only: gdb/gdbserver can debug correctly. The data sent and received over the UART matches.

    2. Enable DMA: the board does not receive the data correctly. I used a logic analyzer and printed the data from gdbserver to prove the problem.

       Indeed, the board does not receive the data correctly, so debugging fails.

    3. Enable DMA and just add a printk in __dma_rx_do_complete: debugging proceeds and works correctly.

    diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
    index d3d55a634..4e97b0b18 100644
    --- a/drivers/tty/serial/8250/8250_omap.c
    +++ b/drivers/tty/serial/8250/8250_omap.c
    @@ -806,6 +806,7 @@ static void __dma_rx_do_complete(struct uart_8250_port *p)
             dmaengine_terminate_async(dma->rxchan);
         if (!count)
             goto unlock;
    +    printk("DEBUG DMA buf %.*s\n", count,dma->rx_buf);
         ret = tty_insert_flip_string(tty_port, dma->rx_buf, count);
     
         p->port.icount.rx += ret;
    -- 

     

    Adding the debug output seems to disturb some part of the UART workflow. I hope you can help me find the problem with the UART DMA.

  • Hi,

    Thanks for the details, this really helps. It sounds like adding a delay (caused by printk) in DMA TX complete makes the problem disappear, right? And you can only reproduce the DMA RX issue with gdb over UART, correct?

    By the way, do you use MCU_UART (/dev/ttyS1) on the UART USB connector or MCU UART1 connector on the EVM?

  • Hi 

    I added the printk in the DMA RX complete callback, and it makes the problem disappear.

    The port I used is the USB-UART connector on the EVM board, not the MCU UART header.

  • Hi,

    Thanks for the clarification. I will look into the issue with gdb.

  • Hi,

    I am able to reproduce the issue on the AM65x EVM with a gdb session. I am reporting the issue to the development team and will keep you posted.

    The issue I see is that after running 'target remote /dev/ttyUSB1' on the host PC, the EVM gdbserver reports either

    gdbserver: Reply contains invalid hex digit 36

    or

    Bad checksum, sentsum=<a_random_number>, csum=<a_random_number>, buf=<a_random_string>

    Please confirm whether you see the same symptom.
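For context on the "Bad checksum" message: a GDB remote protocol packet has the form `$<data>#<checksum>`, where the checksum is the sum of the data bytes modulo 256, written as two hex digits. A lost or altered byte on the wire therefore makes the received checksum disagree with the computed one. If the number in the first message is a decimal character code, 36 would be `$`, the packet start character, but that reading is an assumption. The sketch below only illustrates the checksum rule. The function names rsp_checksum and rsp_frame are hypothetical.

```shell
#!/bin/bash
# rsp_checksum DATA: print the two-digit hex checksum of a GDB remote
# protocol packet payload (sum of the payload bytes modulo 256).
rsp_checksum() {
    printf '%s' "$1" | od -An -v -tu1 |
        awk '{ for (i = 1; i <= NF; i++) sum += $i }
             END { printf "%02x\n", sum % 256 }'
}

# rsp_frame DATA: print the complete packet as it appears on the wire.
rsp_frame() {
    printf '$%s#%s\n' "$1" "$(rsp_checksum "$1")"
}
```

For example, `rsp_frame OK` prints `$OK#9a`. Dropping or changing any byte of the data changes the sum, which is what a receiver reports as a checksum mismatch.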

  • Dear liu bin

    I confirm that I see similar output: "invalid hex digit 36", and the bad checksum message also appears.

    BRs

  • Dear liu bin

    I would like to know the status of the problem and what progress has been made.

    I haven't received any solutions yet.

    BRs

  • Hi,

    I was able to reproduce the issue and reported it to our development team, but I haven't heard an estimate from the dev team yet. Typically it takes weeks to fix such an issue. I will keep you posted once I receive the solution from the dev team. Thanks for your patience.

  • Hi,

    Please apply the following two kernel patches, and let me know if it solves the issue.

    Date: Tue, 17 Sep 2019 14:53:27 +0530
    From: Vignesh Raghavendra <vigneshr@ti.com>
    To: Sekhar Nori <nsekhar@ti.com>
    CC: TI Internal Linux Patch Review <linux-patch-review@list.ti.com>,
     b-liu@ti.com
    Subject: [tiL4.19-CON PATCH 1/2] serial: 8250: 8250_omap: Fix DMA teardown
     sequence during RX timeout
    
    Calling dmaengine_terminate_async() does not guarantee all the data that
    is picked up DMA and is in flight to memory is flushed immediately,
    therefore poll for the in flight data to be flushed before pushing
    buffer to tty ldisc.
    Ideal way to solve this without polling is to call
    dmaengine_synchronize() before pushing data to tty layer, but that
    cannot be done in interrupt context and code cannot be moved to bottom
    half as we need to hold rx_dma_lock spinlock.
    Therefore introduce a bounded polling mechanism to know data has been
    flushed. Since this is a flush at DMA hardware level, sequence should be
    quite deterministic and loop upper bound is set to 5 times the observed
    value.
    
    Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
    ---
     drivers/tty/serial/8250/8250_omap.c | 25 ++++++++++++++++++++++---
     1 file changed, 22 insertions(+), 3 deletions(-)
    
    diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
    index d3d55a634313..77c1a18d4f43 100644
    --- a/drivers/tty/serial/8250/8250_omap.c
    +++ b/drivers/tty/serial/8250/8250_omap.c
    @@ -788,6 +788,8 @@ static void __dma_rx_do_complete(struct uart_8250_port *p)
     	struct omap8250_priv	*priv = p->port.private_data;
     	struct uart_8250_dma    *dma = p->dma;
     	struct tty_port         *tty_port = &p->port.state->port;
    +	struct dma_chan		*rxchan = dma->rxchan;
    +	dma_cookie_t		cookie;
     	struct dma_tx_state     state;
     	int                     count;
     	unsigned long		flags;
    @@ -798,12 +800,29 @@ static void __dma_rx_do_complete(struct uart_8250_port *p)
     	if (!dma->rx_running)
     		goto unlock;
     
    +	cookie = dma->rx_cookie;
     	dma->rx_running = 0;
    -	dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
    +	dmaengine_tx_status(rxchan, cookie, &state);
     
     	count = dma->rx_size - state.residue + state.in_flight_bytes;
    -	if (count < dma->rx_size)
    -		dmaengine_terminate_async(dma->rxchan);
    +	if (count < dma->rx_size) {
    +		dmaengine_terminate_async(rxchan);
    +
    +		/*
    +		 * Poll for teardown to complete which guarantees in
    +		 * flight data is drained.
    +		 */
    +		if (state.in_flight_bytes) {
    +			int poll_count = 25;
    +
    +			while (dmaengine_tx_status(rxchan, cookie, NULL) &&
    +			       poll_count--)
    +				cpu_relax();
    +
    +			if (!poll_count)
    +				dev_err(p->port.dev, "teardown incomplete\n");
    +		}
    +	}
     	if (!count)
     		goto unlock;
     	ret = tty_insert_flip_string(tty_port, dma->rx_buf, count);
    -- 
    2.23.0
    

    Date: Tue, 17 Sep 2019 14:53:28 +0530
    From: Vignesh Raghavendra <vigneshr@ti.com>
    To: Sekhar Nori <nsekhar@ti.com>
    CC: TI Internal Linux Patch Review <linux-patch-review@list.ti.com>,
     b-liu@ti.com
    Subject: [tiL4.19-CON PATCH 2/2] serial: 8250: 8250_omap: Remove redundant
     call to omap_8250_rx_dma_flush
    
    omap_8250_rx_dma_flush() is called twice in am654_8250_handle_rx_dma()
    and first call quite redundant in case of second one. Drop the redundant
    call.
    
    Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
    Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
    ---
     drivers/tty/serial/8250/8250_omap.c | 1 -
     1 file changed, 1 deletion(-)
    
    diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
    index 77c1a18d4f43..088232f6aa75 100644
    --- a/drivers/tty/serial/8250/8250_omap.c
    +++ b/drivers/tty/serial/8250/8250_omap.c
    @@ -1105,7 +1105,6 @@ static unsigned char am654_8250_handle_rx_dma(struct uart_8250_port *up,
     		if (!up->dma->rx_running) {
     			omap_8250_rx_dma(up);
     		} else {
    -			omap_8250_rx_dma_flush(up);
     			/*
     			 * Disable RX timeout, read IIR to clear
     			 * current timeout condition, clear EFR2 to
    -- 
    2.23.0
    

  • Hi Liu bin

    The patches work. gdbserver works well now.

    thanks

  • Hi,

    Thanks for the update. I am glad the issue is solved.

    The coming Processor SDK release (v6.1) is currently in the late test cycle so these patches probably won't be included, but they will be in the next Processor SDK release (v6.2).