AM263P4: UART driver RX stops working while TX still works.

Part Number: AM263P4
Other Parts Discussed in Thread: SYSCONFIG

Currently using SDK 10.2.0.5. The UART transfer mode is set to interrupt.

 

A serial cable is connected from our board to a laptop through an isolator. The bus is a bit noisy, as evidenced by non-ASCII characters being echoed back and by not all characters being received at the MCU. This can potentially cause bus errors.

After several hours of running in a live system, the UART RX on the MCU stops working while TX keeps working: a thread is still able to TX characters to the laptop.

The MCU is then unable to RX any commands, which leaves the product in a very dangerous state since we can't shut it down.

As a first step, I asked AI to help find the issue. I understand AI can hallucinate and gaslight, but its response seems believable and highlighted some potential issues with the driver.

Can TI help confirm if this is indeed an issue with the driver or not?  This would really help focus my search on where this issue could reside.

Claude Opus' analysis:
================================================

Your UART is configured with transferMode = UART_CONFIG_MODE_INTERRUPT (line 698 of ti_drivers_open_close.c). This creates race conditions because:

 

  1. TX and RX share hUart->state — the HLD UART_write() unconditionally sets state = UART_STATE_READY (line 555 of uart_v0.c) before every write, even if a read is in progress. This is a TI SDK workaround to allow concurrent TX/RX but it corrupts state machine invariants.
  2. Semaphore race — the read path uses SemaphoreP_pend() with a 10ms timeout. If a byte arrives between the timeout and UART_readCancelNoCB(), the ISR posts the binary semaphore but the HLD ignores the result. This stale semaphore post causes the next read to return immediately with uninitialized data, and over many occurrences can permanently desynchronize the driver.
  3. RX line errors (overrun/framing from USB-serial adapters) trigger UART_procLineStatusErr() which disables RX interrupts and flushes the FIFO inside the ISR, then skips the normal read-completion path. The semaphore is never posted, the read times out, but repeated errors can make the CLI appear permanently dead.

 

Recommended Fix

  Change the UART transfer mode to UART_CONFIG_MODE_POLLED in SysConfig. This is appropriate because:
  • You're already doing 1-byte reads with a 10ms timeout (polling semantics)
  • Polled mode uses UART_lld_read()/UART_lld_write() which are self-contained, always restore state = READY, and don't involve semaphores or ISRs
  • The HLD's state = READY hack on line 555 of uart_v0.c still allows writes to proceed during reads
  • Eliminates all ISR-related race conditions
  • No code changes needed in bsp_uart.cpp

 

The change is in SysConfig for the UART0 instance: set Transfer Mode from "Interrupt" to "Polled".
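
To make "polling semantics" concrete, here is a minimal sketch of a one-byte polled read with a timeout. It is a self-contained model, not the TI `UART_lld_read()` implementation: the `uart_poll_ops_t` callbacks, the `sim_*` simulated hardware, and the `LSR_DATA_READY` bit position (assumed 16550-style) are all hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define LSR_DATA_READY (1u << 0)   /* assumed 16550-style "RX data ready" bit */

/* Hypothetical hardware access hooks so the loop can run off-target. */
typedef struct {
    uint32_t (*read_lsr)(void *ctx);   /* read line status register */
    uint8_t  (*read_rhr)(void *ctx);   /* read receive holding register */
    uint32_t (*now_ms)(void *ctx);     /* monotonic millisecond clock */
    void *ctx;
} uart_poll_ops_t;

/* Poll for one byte until timeout_ms elapses. No ISR, no semaphore. */
static bool uart_poll_read_byte(const uart_poll_ops_t *ops,
                                uint32_t timeout_ms, uint8_t *out)
{
    uint32_t start = ops->now_ms(ops->ctx);
    do {
        if (ops->read_lsr(ops->ctx) & LSR_DATA_READY) {
            *out = ops->read_rhr(ops->ctx);
            return true;
        }
        /* unsigned subtraction stays correct across clock wrap-around */
    } while ((uint32_t)(ops->now_ms(ops->ctx) - start) < timeout_ms);
    return false;
}

/* Simulated UART: one byte becomes available at ready_at_ms.
   Each call to sim_now() advances the fake clock by 1 ms. */
typedef struct {
    uint32_t t_ms;
    uint32_t ready_at_ms;
    uint8_t  byte;
    bool     consumed;
} sim_uart_t;

static uint32_t sim_now(void *c)
{
    sim_uart_t *s = (sim_uart_t *)c;
    return s->t_ms++;
}

static uint32_t sim_lsr(void *c)
{
    sim_uart_t *s = (sim_uart_t *)c;
    return (!s->consumed && s->t_ms >= s->ready_at_ms) ? LSR_DATA_READY : 0u;
}

static uint8_t sim_rhr(void *c)
{
    sim_uart_t *s = (sim_uart_t *)c;
    s->consumed = true;
    return s->byte;
}
```

Note that a polled read only sees data while the caller is actually polling, so the calling thread still has to run often enough that the RX FIFO does not overflow between polls.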

 

 

Yes, it's a bug (or at minimum a serious design deficiency) in TI's UART HLD/LLD driver stack. Specifically:

The Bug: Line 555 of uart_v0.c

 

// Inside UART_write(), before every write:

hUart->state = UART_STATE_READY;

 

TI unconditionally resets the shared state to READY before every write, even if a read is in progress. This was likely an intentional workaround to allow simultaneous TX and RX on the same UART instance, but it breaks the state machine contract that the LLD relies on.
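
The hazard claimed here can be reproduced in a small stand-alone model. The sketch below is not the TI driver source; every type and function name in it is hypothetical. Variant A mimics a single state word shared by both directions, where the write path forces READY. Variant B tracks RX and TX state separately, which is one possible way to avoid the clobbering.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the behavior described above.
   This is NOT the TI driver source; all names are illustrative. */
typedef enum { ST_READY, ST_BUSY } uart_state_t;

/* Variant A: one state word shared by TX and RX. */
typedef struct { uart_state_t state; } shared_uart_t;

static bool shared_read_start(shared_uart_t *u)
{
    if (u->state != ST_READY) { return false; }
    u->state = ST_BUSY;                 /* read is now in flight */
    return true;
}

static void shared_write(shared_uart_t *u)
{
    u->state = ST_READY;                /* forced reset before writing */
    /* ... transmit bytes ... */
}

/* RX interrupt model: completes the read only if one appears active. */
static bool shared_rx_isr(shared_uart_t *u)
{
    if (u->state != ST_BUSY) { return false; }   /* byte is ignored */
    u->state = ST_READY;
    return true;
}

/* Variant B: independent state per direction. */
typedef struct { uart_state_t rx_state; uart_state_t tx_state; } split_uart_t;

static bool split_read_start(split_uart_t *u)
{
    if (u->rx_state != ST_READY) { return false; }
    u->rx_state = ST_BUSY;
    return true;
}

static bool split_write(split_uart_t *u)
{
    if (u->tx_state != ST_READY) { return false; }
    u->tx_state = ST_BUSY;
    /* ... transmit bytes ... */
    u->tx_state = ST_READY;             /* rx_state is never touched */
    return true;
}

static bool split_rx_isr(split_uart_t *u)
{
    if (u->rx_state != ST_BUSY) { return false; }
    u->rx_state = ST_READY;
    return true;
}
```

In this model, a write issued between `shared_read_start()` and `shared_rx_isr()` makes the ISR ignore the byte, while the same sequence on the split variant completes the read. Whether the real driver behaves this way is exactly what needs confirming against the SDK source.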

Why It's a Bug

  1. The LLD state machine assumes mutual exclusion. UART_lld_readIntr() checks state == UART_STATE_READY as a guard (line 2722), transitions to UART_STATE_BUSY, and expects state to remain BUSY until the ISR completes the transfer. The HLD write path violates this assumption.
  2. The ISR checks state. In interrupt mode, the ISR uses state to decide whether a read is active. If UART_write() resets it to READY mid-read, the ISR may skip processing received data, leaving the read semaphore un-posted and the read hanging until timeout.
  3. Stale semaphore posts. When a read times out and you call UART_readCancelNoCB(), if a byte arrives between the timeout and the cancel, the ISR posts the binary semaphore. This stale post causes the next SemaphoreP_pend() to return immediately with no actual data, corrupting the read flow. Over time this desynchronizes the driver permanently.
  4. Line status errors (overrun/framing) disable RX interrupts in UART_procLineStatusErr() but never re-enable them in the error path — only the normal completion path re-enables them on the next read. If the error coincides with the state corruption above, RX can become permanently disabled.
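
The stale-post scenario in item 3 can be illustrated with a toy binary semaphore. This is a sketch under assumptions, not driver code: `bin_sem_t`, `sem_post()`, `sem_try_take()`, and `read_begin_drained()` are hypothetical names. On the real system the equivalent of the drain step would be a zero-timeout pend on the driver's completion semaphore, which may not be reachable from application code.

```c
#include <stdbool.h>

/* Hypothetical binary semaphore model: the count saturates at 1. */
typedef struct { int count; } bin_sem_t;

/* Called from the (modelled) ISR when a transfer completes. */
static void sem_post(bin_sem_t *s)
{
    s->count = 1;
}

/* Non-blocking take: returns true if a post was pending. */
static bool sem_try_take(bin_sem_t *s)
{
    if (s->count == 0) { return false; }
    s->count = 0;
    return true;
}

/* Begin a new read by discarding any post left over from an earlier,
   already cancelled transfer. Returns true if a stale post was dropped. */
static bool read_begin_drained(bin_sem_t *done)
{
    return sem_try_take(done);
}
```

Without the drain, a post that lands after a timeout but before the cancel is consumed by the next read, which then wakes immediately even though no new byte has arrived for it.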

 

TI's Stance

 

This pattern exists across multiple TI SDK versions (MCU+ SDK 10.x). It's not documented as a known issue. TI's examples typically use UART in a simple request-response pattern (write, then read, never concurrent), which masks the bug. Your use case — continuous background polling for RX while asynchronously writing TX — exposes it.

 

Bottom Line

 

It's a TI driver bug triggered by concurrent TX and RX in interrupt mode. Your options:

  • Switch to polled mode (recommended — matches your polling usage pattern, sidesteps all the bugs)
  • File a bug with TI via E2E forums referencing the state = READY reset in UART_write() and the lack of separate TX/RX state tracking

    Hello Huey,

    Thank you for the detailed problem description and for including Claude's analysis. I understand this is a critical issue for your system safety. Let me address your concerns and provide some guidance.

    Verification of the Analysis

    I've reviewed the AM263P4 Technical Reference Manual and relevant documentation. While I cannot directly verify the specific driver implementation details that Claude referenced (such as the state variable handling in uart_v0.c line 555), I can confirm several important points:

    From the TRM:

    • The UART hardware is designed with separate, prioritized interrupts for RX and TX operations, indicating concurrent operation should be supported
    • Section 13.1.4.4.6.2 contains a critical warning: "In FIFO interrupt mode with flow control, ensure that the trigger level to HALT transmission is greater than or equal to the RX FIFO trigger level... otherwise, FIFO operation stalls."

    Your reported symptoms are consistent with a driver state machine issue:

    • RX stops while TX continues functioning
    • Occurs after extended operation with line errors present
    • Noisy bus environment with visible framing/data errors

    Recommended Diagnostic Steps

    Before changing the transfer mode, I recommend adding error monitoring to help isolate the root cause:

    // Monitor UART line status errors
    uint32_t lsr = UART_readLineStatus(handle);
    if (lsr & (UART_LSR_OE | UART_LSR_FE | UART_LSR_PE | UART_LSR_BI)) {
        // Log which errors occur and their frequency
        // This will help determine if line errors correlate with RX failure
    }


    Track overrun errors (OE), framing errors (FE), parity errors (PE), and break interrupts (BI). This data will be valuable for identifying the failure pattern.
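
One way to turn the raw status word into the frequency data requested above is a small per-error counter. The sketch below is self-contained and makes assumptions: the `LSR_*` bit positions follow the common 16550 layout and should be checked against the TRM or the SDK headers, and `uart_err_stats_t` and `uart_err_tally()` are hypothetical helper names, not SDK APIs.

```c
#include <stdint.h>

/* Assumed 16550-style LSR bit positions; verify against the TRM. */
#define LSR_OE (1u << 1)   /* overrun error */
#define LSR_PE (1u << 2)   /* parity error */
#define LSR_FE (1u << 3)   /* framing error */
#define LSR_BI (1u << 4)   /* break indicator */

/* Running totals, suitable for periodic logging over a long soak test. */
typedef struct {
    uint32_t overrun;
    uint32_t parity;
    uint32_t framing;
    uint32_t brk;
} uart_err_stats_t;

/* Count each error flag that is set in one sampled LSR value. */
static void uart_err_tally(uart_err_stats_t *st, uint32_t lsr)
{
    if (lsr & LSR_OE) { st->overrun++; }
    if (lsr & LSR_PE) { st->parity++; }
    if (lsr & LSR_FE) { st->framing++; }
    if (lsr & LSR_BI) { st->brk++; }
}
```

On 16550-style UARTs, reading the LSR can clear the overrun flag, so sampling it from application code while the driver ISR is also servicing errors may hide events from one side or the other. Logging the counters alongside a timestamp of the last successfully received byte should make it easier to see whether RX stops right after a burst of errors.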

    Recommended Solutions

    Option 1: Switch to Polled Mode (Recommended)

    Given your usage pattern (1-byte reads with 10ms timeout), polled mode is actually the appropriate choice:

    • Change Transfer Mode from "Interrupt" to "Polled" in SysConfig for your UART0 instance
    • No application code changes required
    • Eliminates all ISR and semaphore-related race conditions
    • Matches your polling-based usage pattern

    This is not a workaround—polled mode is a legitimate transfer mode designed for exactly this type of operation.

    Option 2: If Staying in Interrupt Mode

    Verify your FIFO configuration in SysConfig:

    • Ensure RX_FIFO_TRIG_HALT >= RX_FIFO_TRIG_START per TRM requirements
    • Review RX FIFO Trigger Level settings
    • Check flow control configuration

    Option 3: Improve Bus Integrity

    Your noisy bus is a contributing factor:

    • Verify proper termination on the serial cable
    • Check ground connections through the isolator
    • Consider a different USB-to-serial adapter
    • Reduce cable length if possible

    Next Steps

    I recommend you try polled mode first as it should resolve the issue quickly. However, I'd also like to investigate this further with our driver development team.

    Please provide the following information:

    1. Your complete UART SysConfig configuration (FIFO settings, flow control, etc.)
    2. Results from the error monitoring code above
    3. Confirmation of whether switching to polled mode resolves the issue

    I will escalate this internally to have our SDK team review the specific driver implementation concerns you've raised, particularly regarding state variable handling during concurrent TX/RX operations in interrupt mode.

    Regarding Claude's specific claim about line 555 in uart_v0.c:
    I will ask our SDK team to confirm whether UART_write() unconditionally resets the state variable and whether this is safe for concurrent operations. This is a valid question that deserves an official answer.

    Thank you for bringing this to our attention. System safety is our top priority, and we want to ensure the driver behaves correctly under all operating conditions.

    Best Regards,
    Zackary Fleenor