AM3354: Strange behavior of UART with OMAP 8250 serial driver

Part Number: AM3354

Hello

 

We have two problems with the serial interface.

We are using the Modbus RTU protocol over RS485 through the UART peripheral.

 

Kernel version: Linux ABB-54-4a-16-fb-1f-16 4.19.94-rt39 #3 PREEMPT RT Tue Mar 9 16:16:21 CET 2021 armv7l GNU/Linux

Serial driver: OMAP 8250 with DMA (the default configuration from evm-06.03.00.106)

 

The first strange behavior concerns transmission. Occasionally, roughly once every 1000 frames, the frame sent out by our AM3354 microprocessor is not correct. One unexpected byte is inserted in the first position, followed by the normal frame except for its last byte, which is missing.

The inserted byte is always the first byte of the previously transmitted frame (which was itself transmitted correctly).

 

For example:

 

The following frame has a wrongly prefixed 0xFA:

 

  

The correct frame would have been:

FA 10 00 10 00 0A 03 EA 03 84 00 00 07 E5 00 03 00 05 00 0D 00 0D 00 39 00 83 02 9C

Thus the length is correct, but compared with the intended frame the actual one has one extra byte inserted in the first position and is truncated by one byte at the end.

The previous frame was:

It also starts with 0xFA. If the previous frame had started with 0xYY, the wrong one would start with 0xYY.
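The corruption pattern described above can be checked mechanically on captured data. The following is only a minimal sketch: the function name `is_shifted_frame` and its parameters are hypothetical and not part of libmodbus or the kernel. It tests whether a captured frame equals the intended frame shifted right by one byte, with the first byte of the previous frame in front.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `captured` looks like `expected`
 * shifted right by one byte, i.e. captured[0] is the first byte of the
 * previous frame, captured[1..len-1] repeats expected[0..len-2], and the
 * last byte of `expected` is lost. Both buffers hold `len` bytes.
 */
bool is_shifted_frame(const uint8_t *captured, const uint8_t *expected,
                      size_t len, uint8_t prev_first_byte)
{
    assert(captured != NULL && expected != NULL);

    if (len < 2)
        return false;
    if (captured[0] != prev_first_byte)
        return false;
    /* The remaining len-1 bytes must match the start of the intended frame. */
    return memcmp(captured + 1, expected, len - 1) == 0;
}
```

A capture tool could call this on every frame seen on the bus, passing the frame handed to write() as `expected`, to count how often the shift occurs.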

 

Our application always sends the same set of frames, whose Modbus function code (req[1], the second byte of the frame) can be 0x04, 0x06, 0x10 or 0x14. Moreover, the first byte (the Modbus slave address) can be 0x11, 0x81 or 0xFA. As already described, in case of error the whole frame is shifted by one byte, so the second byte is a slave address (0x11, 0x81 or 0xFA) instead of a function code.

We added logging to the Modbus library we use (libmodbus); it shows that the buffer passed to the write() system call is never the corrupted frame we see on the oscilloscope.

 

static ssize_t _modbus_rtu_send(modbus_t *ctx, const uint8_t *req, int req_length) {
    /* ... */

    /* Normally the second byte of a Tx frame is 0x04, 0x06, 0x10 or 0x14,
       so this condition only matches frames affected by the problem. */
    if (req[1] != 4 && req[1] != 6 && req[1] != 16 && req[1] != 20) {
        time_t t = time(NULL);

        printf("*** Suspect outbound frame before write %ld\n", (long)t);
        for (int i = 0; i < req_length; i++)
            printf("[%.2X]", req[i]);
        printf("\n");
    }

    size = write(ctx->s, req, req_length);
}

 

We never enter the if branch, which means the req buffer always has the expected content.

 

 


The second strange behavior concerns reception.

The data we read from the sensors is retrieved as file chunks, transferred using Modbus function code 20 (read file record).

To read these file chunks we access the serial port file descriptor directly with a custom poll()/read() loop.

The problem is that many function code 20 requests (about 1 in 70) fail because the first read() returns immediately with a block of data filled with a single repeated value.

We analyzed the RS485 line with an oscilloscope and concluded that this data is not present on the bus.

I’ve attached a log file with 6 examples of this problem.

 

We noticed a few things about this problem:

- The size of the "fake" data block is always 48, which we think is the size of a DMA transfer in the UART driver

- The repeated value is always the 65th byte of the previous read chunk. You can see it from the attached examples

- If we set CONFIG_SERIAL_8250_DMA=n in kernel configuration, the problem disappears
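The observations above suggest a simple check that a poll()/read() loop could apply to each block it receives. This is a sketch under stated assumptions: the names `is_fake_rx_block`, `FAKE_BLOCK_LEN` and `STALE_BYTE_INDEX` are hypothetical, and the values 48 and 64 are taken only from the symptoms reported here. A genuine payload of 48 identical bytes would also match, so the result should be treated as a hint, not as proof.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_BLOCK_LEN   48 /* size of the bogus blocks reported in this thread */
#define STALE_BYTE_INDEX 64 /* 65th byte of the previous chunk, zero-based */

/*
 * Hypothetical helper: returns true when `buf` has the reported signature,
 * i.e. exactly 48 bytes that all carry the same value. If the previous
 * chunk is available and long enough, the repeated value must also equal
 * its 65th byte. Pass prev = NULL to skip that second condition.
 */
bool is_fake_rx_block(const uint8_t *buf, size_t len,
                      const uint8_t *prev, size_t prev_len)
{
    assert(buf != NULL);

    if (len != FAKE_BLOCK_LEN)
        return false;
    for (size_t i = 1; i < len; i++) {
        if (buf[i] != buf[0])
            return false;
    }
    if (prev != NULL && prev_len > STALE_BYTE_INDEX)
        return buf[0] == prev[STALE_BYTE_INDEX];
    return true;
}
```

In a read loop, a block flagged by this check could be logged together with the previous chunk before the request is retried.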

Thanks in advance.

BR.

Lorenzo

  • Hi Lorenzo,

    What is the baud rate used?

    Does the issue still happen if you use a different baud rate?

  • Hi Bin,

    the baud rate is 9600bps.

    We have not tried different baud rates because we do not control the firmware of all sensors (only of one), and 9600 bps is a constraint we cannot remove in the long term. However, if you think that test could provide useful clues, we can set it up temporarily using the sensor we can modify.

    Lorenzo.

  • Hi,

    Regarding the added byte, you are probably affected by this bug:
    github.com/.../d96f04d347e4011977abdbb4da5d8f303ebd26f8

  • Hi Rasmus,

    Thanks for bringing up this patch.

    Hi Lorenzo,

    You don't need to test other baud rates right now. 9600 is a common baud rate.

    Can you please try the patch Rasmus mentioned above to see if it solves the issue?

  • Hi all,

    we tried the patch from Rasmus, but we got the same kind of Tx error.

    During the weekend test we used CONFIG_SERIAL_8250_DMA=n, which at least makes the second problem (the dummy 48 bytes in Rx) disappear.

    In addition, we had earlier tested a version with the baud rate raised to 19200 bps, which did not seem to show the Tx problem. However, we are not fully confident that other changes to the setup, such as using only 1 sensor instead of 2, did not affect the test behavior.

    Remark: for the latest tests at the original baud rate we are using kernel 4.14, because with 4.19 the WiFi is not working. We would like to keep using 4.14 for the next tests as well.

    We remain available to run further overnight tests if they can provide any clue to understand the problem.

    BR

    Lorenzo

  • Hi Lorenzo,

    "During the weekend test we used CONFIG_SERIAL_8250_DMA=n, which at least makes the second problem (the dummy 48 bytes in Rx) disappear."

    Do you mean the first issue (unexpected first byte in Tx) still happens with UART DMA disabled?

    "In addition, we had earlier tested a version with the baud rate raised to 19200 bps, which did not seem to show the Tx problem. However, we are not fully confident that other changes to the setup, such as using only 1 sensor instead of 2, did not affect the test behavior."

    Do you mean the number of sensors used in the 9600 and 19200 tests was different?

    Can you please run the UART check tool serialstats in parallel with your test to get the line status? Pay attention to any line errors when the Tx and/or Rx issue happens during the test.

    The following link has details on how to compile and run the tool.

    software-dl.ti.com/.../UART.html

  • Hi Bin,

    yes, I meant that the unexpected byte in Tx still happens with UART DMA disabled.

    The number of sensors differs between the tests at the two baud rates:

    - 2 sensors @ 9600 bps

    - 1 sensor @ 19200 bps

    Before your reply, we had started an overnight test in which we changed the peripheral used for Modbus. In this test, the RS485 bus is accessed through a USB peripheral connected to a CP2105 from Silicon Labs (USB to UART bridge), which we have on the PCB for a different purpose. We think this test is useful to determine whether the problem is really in the serial driver.

    As soon as that test ends we will start the one with the original UART and serialstats, as you suggested.

    BR

    Lorenzo.

  • Hi Bin,

    we ran the patched kernel with serialstats monitoring the UART: the problem occurred several times, but the log from the utility does not show any errors.

    cts: 0 dsr: 0 rng: 0 dcd: 0 rx: 1042568 tx: 135774 frame error 0 overuns 0 parity: 0 break: 0 buffer overrun: 0

    Previously, the test with ttyUSB didn't show any error.

    BR.

    Lorenzo.

  • Hi Lorenzo,

    Thanks for the update. I will discuss this with our dev team and get back to you.

  • Hi again,

    Just a piece of side information.

    We are using a normal GPIO as RTS, since on our PCB the hardware RTS of the UART peripheral is not routed to the DE pin of the RS-485 transceiver. The RTS GPIO is therefore driven by the libmodbus software library.

    In my opinion, though, bad RTS handling should not lead to an unexpected byte in the stream. The oscilloscope screenshots posted earlier refer to the RS485 bus, and the transmit enable always seems well aligned with the data stream (you can see a delay of about 1 ms in holding the bus before and after data transmission, as implemented in libmodbus).

    Lorenzo.

  • Hi Lorenzo,

    This is important information. I am not an RS-485 or Modbus expert and cannot comment on whether improper RTS timing could cause such data corruption.

    I'd expect a few things regarding the RTS control for RS-485:

    - The kernel 8250 UART driver supports RS-485 mode, in which it uses the RTS (or GPIO-RTS) pin to control the RS-485 DE. RS-485 mode is enabled in the port configuration when opening the port. Is this what your software is doing, or do you not configure the UART port in RS-485 mode? You mentioned that the libmodbus library controls the RTS GPIO, which sounds like you don't use the kernel driver's RS-485 mode.

    - Besides the hardware RTS pin, the kernel driver also supports GPIO-based RTS control of the RS-485 DE. Just define the GPIO RTS information in the kernel device tree.

    - The kernel driver's RS-485 mode also supports an RTS control delay before and after Tx; the delay is specified in the kernel device tree too. You might want to try different delay intervals to see if that makes a difference to the data corruption.

  • Hi Bin,

     

    please find below the dts part related to the Uart.

    &uart2 {
        pinctrl-names = "default";
        pinctrl-0 = <&uart2_pins>;
        status = "okay";
        rts-gpio = <&gpio1 12 GPIO_ACTIVE_LOW>;
        rs485-rts-active-high;
        rs485-rts-delay = <0 0>;
        linux,rs485-enabled-at-boot-time;
    };

    Until some weeks ago we had been using the older omap-serial.c driver, which handled the GPIO-RTS pin correctly. The Tx problem was not present, but we saw some strange real-time behavior when calling poll() on the UART file descriptor (the poll timeout was not always respected).

    To try to fix this issue we migrated to the newer 8250_omap driver, which seems to ignore the dts rts-gpio entry. For this reason we introduced manual driving of the pin, which libmodbus also supports, so the pin is driven from user space. The serial port is opened as follows:

    int ret = modbus_rtu_set_serial_mode(this->modbus, MODBUS_RTU_RS485);

    which is internally implemented as:

    rs485conf.flags = SER_RS485_ENABLED;

    ioctl(fd, TIOCSRS485, &rs485conf)

     

    As far as we have verified, the only way to change the delay is to keep delegating the GPIO RTS handling to libmodbus. If needed, we can change it and check the behavior.

     

    Lorenzo.

  • Hi Lorenzo,

    While implementing a new UART 8250 driver early last year, I was able to adjust the rts delay for RS-485. I will look into it and explain how to do it, also look at how rts-gpio should be used in 8250 driver framework.

  • Hi Lorenzo,

    I spent some time reviewing the kernel serial driver code. It does provide a framework to support GPIO-based modem control pins (BTW, the DT property would be rts-gpios, not rts-gpio); it is implemented in drivers/tty/serial/serial_mctrl_gpio.c and documented in the kernel's Documentation/devicetree/bindings/serial/serial.txt.

    Unfortunately only very few serial drivers support it, and the 8250_omap driver is NOT one of them. So your project's use of GPIO-based RTS for RS-485 DE control is not really something the 8250_omap driver supports.

    We could ask our kernel dev team to add gpio based modem control support into 8250_omap, but I don't think the work will be done very soon.

    The only thing you could test, which I can think of, is to add some delay (in milliseconds) before and after tx to see if this changes the behavior.

    If not, do you think we can go back to the omap_serial driver and try to solve the polling timeout issue?

  • Hi Lorenzo,

    I received the response from our dev team: it turns out that GPIO-based RTS support is already in kernel v5.4 with the following two patches. I will backport the patches to kernel v4.19 in AM335x Processor SDK v6.3 and provide them to you today.

    4a96895f74c9 ("tty/serial/8250: use mctrl_gpio helpers")
    fc64f7abbef2 ("serial: 8250_omap: Fix gpio check for auto RTS/CTS ")

    BTW, we have not tested RS485 with DMA (so we are not sure whether RX would work without auto CTS/RTS), hence our dev team suggests keeping it disabled.

  • Hi Lorenzo,

    Attached below are the patches that add GPIO-based modem control support to the 8250_omap driver in kernel v4.19; they are all backported from newer kernels.

    8250-rts-gpio-support-4.19.zip

    In kernel dts, please use "rts-gpios" instead of "rts-gpio" to specify the GPIO RTS pin.

    The DT property "rs485-rts-active-high" is specific to the omap-serial driver; it is not supported in the 8250 core or the 8250-omap driver. So please set it in rs485conf.flags in your application, similar to what the omap-serial driver does:

    if (of_property_read_bool(np, "rs485-rts-active-high")) {
            rs485conf->flags |= SER_RS485_RTS_ON_SEND;
            rs485conf->flags &= ~SER_RS485_RTS_AFTER_SEND;
    } else {
            rs485conf->flags &= ~SER_RS485_RTS_ON_SEND;
            rs485conf->flags |= SER_RS485_RTS_AFTER_SEND;
    }

    I also don't see that the DT property "rs485-rts-delay" is supported in the 8250 core or the 8250-omap driver. So please specify the delay in rs485conf in your application too.

    rs485conf.delay_rts_after_send = ...
    rs485conf.delay_rts_before_send = ...
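Put together, the application-side setup described above might look like the following sketch. The function names and the `rts_active_high` parameter are hypothetical; the flags, the `struct serial_rs485` fields and the TIOCSRS485 ioctl come from `<linux/serial.h>` and the kernel RS-485 interface, where the two delays are expressed in milliseconds. Whether the driver honors them depends on the kernel and patches in use.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/serial.h>

/*
 * Hypothetical helper: fills `conf` the way the omap-serial snippet above
 * does for "rs485-rts-active-high", plus the two RTS delays in milliseconds.
 */
void fill_rs485_conf(struct serial_rs485 *conf, int rts_active_high,
                     unsigned int before_ms, unsigned int after_ms)
{
    assert(conf != NULL);

    memset(conf, 0, sizeof(*conf));
    conf->flags = SER_RS485_ENABLED;
    if (rts_active_high)
        conf->flags |= SER_RS485_RTS_ON_SEND;    /* RTS high while sending */
    else
        conf->flags |= SER_RS485_RTS_AFTER_SEND; /* RTS high when idle */
    conf->delay_rts_before_send = before_ms;
    conf->delay_rts_after_send = after_ms;
}

/*
 * Hypothetical helper: applies the configuration to an open serial port.
 * Returns 0 on success, or -1 with errno set by ioctl().
 */
int apply_rs485_conf(int fd, int rts_active_high,
                     unsigned int before_ms, unsigned int after_ms)
{
    struct serial_rs485 conf;

    fill_rs485_conf(&conf, rts_active_high, before_ms, after_ms);
    return ioctl(fd, TIOCSRS485, &conf);
}
```

For example, `apply_rs485_conf(fd, 1, 1, 1)` would request active-high RTS with a 1 ms delay on each side of the transmission, which is one way to run the delay experiment suggested earlier in the thread.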