AM2434: Assert on "tcp_free_acked_segments" and "pbuf_free"

Part Number: AM2434
Other Parts Discussed in Thread: SYSCONFIG

System

AM2434 with 1GB external DDR
SDK 11.0.0.15
CCS 20.3.1.5__1.9.1
Compiler: TI CLANG v4.0.4 LTS
sysconfig: 1.25.0

Issue

Randomly we get the following asserts:
- "pbuf_free: p->ref > 0" line 755 of pbuf.c
- "tcp_receive: valid queue length" line 1135 in tcp_in.c 

The two issues are grouped in the same topic because they appear to be related. We describe them as random because any change in the code can alter the behaviour of the bug: adding a few lines of code, declaring a task as static instead of dynamic, or any other kind of modification can make the issue disappear.

Actions taken

  • Allocated all LWIP buffers to a specific memory location and via MPU setting marked as non-cacheable, shareable, cacheable. Result: no improvements
  • Allocated all LWIP buffers in memory location on DDR or MSRAM. For both cases, via MPU settings these locations were marked as non-cacheable, shareable, cacheable. Result: no improvements
  • Disabled all types of cache in the core with this issue. Result: application is running slow but with no blocks
  • Used current traces to track the issue. Result: no findings probably due to the limits of our tracing tool
  • Instrumented the code with additional traces to understand when the problem arises. Result: some of the new traces probably perturb the code execution, and the issue is no longer visible
  • Invalidated the cache after Rx packets were received: added CacheP_inv at line 1236 of lwip2enet.c for every list->bufPtr. Result: the application runs, but this does not explain why marking the memory area as non-cacheable does not solve the issue
  • Increased number of pools in lwippools.h. Result: application works but there's no evidence about a memory issue
  • Checked with an AI agent whether our application locks/unlocks the TCP core properly to prevent concurrent access to the memory. Result: the analysis found that all the guards are implemented properly and none are missing in our code

Question

We believe that the workarounds found so far are not robust enough to prevent this kind of issue. Since we are running out of ideas, we would like to know whether there is a way to better understand the source of the issues and how to fix them.

  • Hi Davide,

    I am expecting the failures seen are not in the examples in the SDK. Can you please let us know more about the usecase that you are observing these assert failures? (TCP server, UDP client, network load, etc will help)

    Our LwIP examples use custom pBufs for Rx datapath, which are allocated outside of the LwIP stack allocations. Tx datapath uses pBufs from lwippools. Can you please confirm if you have changed the memory allocations for the custom pBufs and not the stack's pBufs? Is the issue seen in both Rx and Tx side pBufs?

    Did you change any other LwIP configurations? I will try to reproduce this issue in our test benches, and start from there. I am also assuming these failures are seen in release build. Can you please check if the behaviour is same in both debug and release builds?

    These details will give us more context and a better starting point for our analysis. Please let us know if you have any further specific queries.

    Thanks and regards,
    Teja.

  • Hello Teja, thank you for reaching out.

    Can you please let us know more about the usecase that you are observing these assert failures?

    Our application uses mDNS (UDP multicast) to discover an MQTT broker, and then establishes an MQTT connection over TCP. The board acts as the MQTT client.
    The mDNS service runs continuously in the background while the MQTT connection is active. The assert failures are observed during normal runtime (not during high network load).
    Currently, the board is connected through a Linux virtual machine, so the network traffic is relatively limited/light. We do not have multiple high-traffic devices on the network at the moment.
    A remark: these issues usually appear at the beginning of the connection with the broker, but sometimes they also occur a few minutes after the connection is established.

    Can you please confirm if you have changed the memory allocations for the custom pBufs and not the stack's pBufs?

    The test we made included modifying the mempool counts (lines 10 to 14 of lwippools.h). To be clear, we increased the first argument of the mempool alloc function. Since this was only a test, the modification is stored separately and is not part of the actual code.

    Is the issue seen in both Rx and Tx side pBufs?

    The issue has been seen on the Rx side. The tcp_free_acked_segments issue is also triggered on the Rx side, but the affected array is on the Tx side.

    Did you change any other LwIP configurations?

    We added IPv6. Here is our lwipopts.h:

    /*
     * Copyright (c) 2001-2003 Swedish Institute of Computer Science.
     * All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without modification,
     * are permitted provided that the following conditions are met:
     *
     * 1. Redistributions of source code must retain the above copyright notice,
     *    this list of conditions and the following disclaimer.
     * 2. Redistributions in binary form must reproduce the above copyright notice,
     *    this list of conditions and the following disclaimer in the documentation
     *    and/or other materials provided with the distribution.
     * 3. The name of the author may not be used to endorse or promote products
     *    derived from this software without specific prior written permission.
     *
     * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED
     * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
     * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
     * SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
     * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
     * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
     * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
     * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
     * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY
     * OF SUCH DAMAGE.
     *
     * This file is part of the lwIP TCP/IP stack.
     *
     * Author: Adam Dunkels <adam@sics.se>
     *
     */
    /**
     *  Copyright (c) Texas Instruments Incorporated 2022
     *
     *  Redistribution and use in source and binary forms, with or without
     *  modification, are permitted provided that the following conditions
     *  are met:
     *
     *    Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     *
     *    Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the
     *    distribution.
     *
     *    Neither the name of Texas Instruments Incorporated nor the names of
     *    its contributors may be used to endorse or promote products derived
     *    from this software without specific prior written permission.
     *
     *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
     *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
     *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
     *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
     *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
     *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     */
    #ifndef LWIP_LWIPOPTS_H
    #define LWIP_LWIPOPTS_H
    
    #ifdef __cplusplus
    extern "C"
    {
    #endif
    
    #ifdef LWIP_OPTTEST_FILE
    #include "lwipopts_test.h"
    #else /* LWIP_OPTTEST_FILE */
    
    #include "lwipopts_os.h"
    
    #define MDNS_MAX_SERVICES          2
    
    #define LWIP_DEBUG                 1
    
    #define LWIP_NETIF_HOSTNAME        1
    
    #define LWIP_IPV4                  1
    #define LWIP_IPV6                  1
    
    
    /* Checksum on copy from app buffers to pbufs, boosts performance. It is set to zero as checksum offload is not enabled on both Rx and Tx side.*/
    #define LWIP_CHECKSUM_ON_COPY			    1
    #define LWIP_CHECKSUM_CTRL_PER_NETIF    1
    #define CHECKSUM_GEN_UDP                1
    #define CHECKSUM_GEN_TCP                1
    #define CHECKSUM_GEN_IP                 1
    #define CHECKSUM_GEN_ICMP               1
    #define CHECKSUM_GEN_ICMP6              1
    
    #define CHECKSUM_CHECK_IP               1
    #define CHECKSUM_CHECK_UDP              1
    #define CHECKSUM_CHECK_TCP              1
    #define CHECKSUM_CHECK_ICMP             1
    #define CHECKSUM_CHECK_ICMP6            1
    
    /* IPv6 settings (dual-stack, SLAAC; no DHCPv6) */
    #define LWIP_IPV6_AUTOCONFIG       LWIP_IPV6
    #define LWIP_ICMP6                 LWIP_IPV6
    #define LWIP_IPV6_MLD              LWIP_IPV6
    #define LWIP_IPV6_DHCP6            0
    #define LWIP_IPV6_NUM_ADDRESSES    3
    /* Keep bring-up deterministic for embedded: make link-local address usable immediately. */
    #define LWIP_IPV6_DUP_DETECT_ATTEMPTS 0
    
    #define LWIP_HAVE_MBEDTLS          1
    
    /* ---------- UDP options ---------- */
    #define LWIP_UDP                   1
    #define LWIP_UDPLITE               LWIP_UDP
    #define UDP_TTL                    255
    
    /* ---------- TCP options ---------- */
    #define LWIP_TCP                   1
    #define TCP_TTL                    255
    
    #define LWIP_ALTCP                 (LWIP_TCP)
    #ifdef LWIP_HAVE_MBEDTLS
    #define LWIP_ALTCP_TLS             (LWIP_TCP)
    #define LWIP_ALTCP_TLS_MBEDTLS     (LWIP_TCP)
    #endif
    
    #define DNS_MAX_NAME_LENGTH        256
    #define DNS_TABLE_SIZE             4
    
    #define LWIP_SOCKET                (NO_SYS==0)
    #define LWIP_NETCONN               (NO_SYS==0)
    #define LWIP_NETIF_API             (NO_SYS==0)
    
    #define LWIP_IGMP                  LWIP_IPV4
    #define LWIP_ICMP                  LWIP_IPV4
    
    #define LWIP_DNS                   LWIP_UDP
    #define LWIP_MDNS_RESPONDER        LWIP_UDP
    
    #define LWIP_NUM_NETIF_CLIENT_DATA (LWIP_MDNS_RESPONDER + 2)
    
    #define LWIP_SINGLE_NETIF          1
    #define LWIP_NETIF_LOOPBACK        0
    #define LWIP_HAVE_LOOPIF           (LWIP_NETIF_LOOPBACK && !LWIP_SINGLE_NETIF)
    #define LWIP_LOOPBACK_MAX_PBUFS    10
    
    #define TCP_LISTEN_BACKLOG         1
    
    #define LWIP_COMPAT_SOCKETS        1
    #define LWIP_SO_RCVTIMEO           1
    #define LWIP_SO_RCVBUF             1
    #define LWIP_SO_SNDTIMEO           1
    
    #define LWIP_TCPIP_CORE_LOCKING    1
    
    
    #define LWIP_NETIF_LINK_CALLBACK        1
    #define LWIP_NETIF_STATUS_CALLBACK      1
    #define LWIP_NETIF_EXT_STATUS_CALLBACK  1
    
    #ifdef LWIP_DEBUG
    
    #define LWIP_DBG_MIN_LEVEL         0
    #define PPP_DEBUG                  LWIP_DBG_ON
    #define MEM_DEBUG                  LWIP_DBG_ON
    #define MEMP_DEBUG                 LWIP_DBG_ON
    #define PBUF_DEBUG                 LWIP_DBG_ON
    #define API_LIB_DEBUG              LWIP_DBG_ON
    #define API_MSG_DEBUG              LWIP_DBG_ON
    #define TCPIP_DEBUG                LWIP_DBG_ON
    #define NETIF_DEBUG                LWIP_DBG_ON
    #define SOCKETS_DEBUG              LWIP_DBG_ON
    #define DNS_DEBUG                  LWIP_DBG_ON
    #define AUTOIP_DEBUG               LWIP_DBG_ON
    #define DHCP_DEBUG                 LWIP_DBG_ON
    #define IP_DEBUG                   LWIP_DBG_ON
    #define IP_REASS_DEBUG             LWIP_DBG_ON
    #define ICMP_DEBUG                 LWIP_DBG_ON
    #define IGMP_DEBUG                 LWIP_DBG_ON
    #define UDP_DEBUG                  LWIP_DBG_ON
    #define TCP_DEBUG                  LWIP_DBG_ON
    #define TCP_INPUT_DEBUG            LWIP_DBG_ON
    #define TCP_OUTPUT_DEBUG           LWIP_DBG_ON
    #define TCP_RTO_DEBUG              LWIP_DBG_ON
    #define TCP_CWND_DEBUG             LWIP_DBG_ON
    #define TCP_WND_DEBUG              LWIP_DBG_ON
    #define TCP_FR_DEBUG               LWIP_DBG_ON
    #define TCP_QLEN_DEBUG             LWIP_DBG_ON
    #define TCP_RST_DEBUG              LWIP_DBG_ON
    #define ETHARP_DEBUG               LWIP_DBG_ON
    #endif
    
    #define LWIP_DBG_TYPES_ON         (LWIP_DBG_ON|LWIP_DBG_TRACE|LWIP_DBG_STATE|LWIP_DBG_FRESH|LWIP_DBG_HALT)
    
    /* ---------- Memory options ---------- */
    /* MEM_ALIGNMENT: should be set to the alignment of the CPU for which
       lwIP is compiled. 4 byte alignment -> define MEM_ALIGNMENT to 4, 2
       byte alignment -> define MEM_ALIGNMENT to 2. */
    /* MSVC port: intel processors don't need 4-byte alignment,
       but are faster that way! */
    #define MEM_ALIGNMENT           32U
    
    #define MEM_USE_POOLS           1
    #define MEMP_USE_CUSTOM_POOLS	1
    
    #define LWIP_SUPPORT_CUSTOM_PBUF 1U
    /* Debug checks - will impact throughput if enabled */
    #define MEM_OVERFLOW_CHECK       (0)
    #define MEM_SANITY_CHECK         (0)
    
    /* MEMP_NUM_PBUF: the number of memp struct pbufs. If the application
       sends a lot of data out of ROM (or other static memory), this
       should be set high. */
    #define MEMP_NUM_PBUF           128
    /* MEMP_NUM_RAW_PCB: the number of UDP protocol control blocks. One
       per active RAW "connection". */
    #define MEMP_NUM_RAW_PCB        3
    /* MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
       per active UDP "connection". */
    #define MEMP_NUM_UDP_PCB        6
    /* MEMP_NUM_TCP_PCB: the number of simulatenously active TCP
       connections. */
    #define MEMP_NUM_TCP_PCB        5
    /* MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP
       connections. */
    #define MEMP_NUM_TCP_PCB_LISTEN 8
    /* MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP
       segments. */
    #define MEMP_NUM_TCP_SEG        128
    /* MEMP_NUM_SYS_TIMEOUT: the number of simulateously active
       timeouts. */
    /* IPv6 (ND6/MLD6) increases the number of timeouts; let lwIP size this if possible. */
    #if defined(LWIP_NUM_SYS_TIMEOUT_INTERNAL)
    #define MEMP_NUM_SYS_TIMEOUT    LWIP_NUM_SYS_TIMEOUT_INTERNAL
    #else
    #define MEMP_NUM_SYS_TIMEOUT    30
    #endif
    
    /* The following four are used only with the sequential API and can be
       set to 0 if the application only will use the raw API. */
    /* MEMP_NUM_NETBUF: the number of struct netbufs. */
    #define MEMP_NUM_NETBUF         128
    /* MEMP_NUM_NETCONN: the number of struct netconns. */
    #define MEMP_NUM_NETCONN        10
    /* MEMP_NUM_TCPIP_MSG_*: the number of struct tcpip_msg, which is used
       for sequential API communication and incoming packets. Used in
       src/api/tcpip.c. */
    #define MEMP_NUM_TCPIP_MSG_API   128
    #define MEMP_NUM_TCPIP_MSG_INPKT 128
    
    /* Debug checks - will impact throughput if enabled */
    #define MEMP_OVERFLOW_CHECK      (0)
    #define MEMP_SANITY_CHECK        (0)
    
    /* ---------- Pbuf options ---------- */
    /* PBUF_POOL_SIZE: the number of buffers in the pbuf pool. Setting this to zero as Rx Custom Pbufs are used. */
    #define PBUF_POOL_SIZE          0U
    
    /* PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool. */
    #define PBUF_POOL_BUFSIZE       1536
    
    /** SYS_LIGHTWEIGHT_PROT
     * define SYS_LIGHTWEIGHT_PROT in lwipopts.h if you want inter-task protection
     * for certain critical regions during buffer allocation, deallocation and memory
     * allocation and deallocation.
     */
    #define SYS_LIGHTWEIGHT_PROT    (NO_SYS==0)
    
    /* Length of TCP queue equal to number of listening connections +
       number of segments to receive*/
    #define TCPIP_MBOX_SIZE			(MEMP_NUM_TCP_PCB_LISTEN + MEMP_NUM_TCP_SEG)
    
    /* Number of TCP packets queued for receiving by each TCP connection*/
    #define DEFAULT_TCP_RECVMBOX_SIZE (MEMP_NUM_TCP_SEG)
    
    /* Controls if TCP should queue segments that arrive out of
       order. Define to 0 if your device is low on memory. */
    #define TCP_QUEUE_OOSEQ         1
    
    #define TCP_CALCULATE_EFF_SEND_MSS      1
    
    /* TCP Maximum segment size. */
    #define TCP_MSS                 1460
    
    /* TCP sender buffer space (bytes). */
    #define TCP_SND_BUF             (8 * TCP_MSS)
    
    /* TCP sender buffer space (pbufs). This must be at least = 2 *
       TCP_SND_BUF/TCP_MSS for things to work. */
    #define TCP_SND_QUEUELEN       (8 * TCP_SND_BUF/TCP_MSS)
    
    /* TCP writable space (bytes). This must be less than or equal
       to TCP_SND_BUF. It is the amount of space which must be
       available in the tcp snd_buf for select to return writable */
    #define TCP_SNDLOWAT           (TCP_SND_BUF/2)
    
    /* TCP receive window. */
    #define TCP_WND                 (TCP_SND_BUF)
    
    /* Maximum number of retransmissions of data segments. */
    #define TCP_MAXRTX              12
    
    /* Maximum number of retransmissions of SYN segments. */
    #define TCP_SYNMAXRTX           4
    
    #define DEFAULT_THREAD_STACKSIZE    (5 * 1024)
    #define TCPIP_THREAD_STACKSIZE      (8 * 1024)
    
    //#define LWIP_FREERTOS_THREAD_STACKSIZE_IS_STACKWORDS (1)
    
    /* ---------- ARP options ---------- */
    #define LWIP_ARP                1
    #define ARP_TABLE_SIZE          10
    #define ARP_QUEUEING            1
    
    
    /* ---------- IP options ---------- */
    /* Define IP_FORWARD to 1 if you wish to have the ability to forward
       IP packets across network interfaces. If you are going to run lwIP
       on a device with only one network interface, define this to 0. */
    #define IP_FORWARD              0
    
    /* IP reassembly and segmentation.These are orthogonal even
     * if they both deal with IP fragments */
    #define IP_REASSEMBLY           1
    #define IP_REASS_MAX_PBUFS      (10 * ((1500 + PBUF_POOL_BUFSIZE - 1) / PBUF_POOL_BUFSIZE))
    #define MEMP_NUM_REASSDATA      IP_REASS_MAX_PBUFS
    #define IP_FRAG                 1
    #define IPV6_FRAG_COPYHEADER    1
    
    /* ---------- ICMP options ---------- */
    #define ICMP_TTL                255
    
    
    /* ---------- DHCP options ---------- */
    /* Define LWIP_DHCP to 1 if you want DHCP configuration of
       interfaces. */
    #define LWIP_DHCP               LWIP_UDP
    
    /* 1 if you want to do an ARP check on the offered address
       (recommended). */
    #define DHCP_DOES_ARP_CHECK    (LWIP_DHCP)
    
    
    /* ---------- AUTOIP options ------- */
    #define LWIP_AUTOIP            (LWIP_DHCP)
    #define LWIP_DHCP_AUTOIP_COOP  (LWIP_DHCP && LWIP_AUTOIP)
    
    #define DEFAULT_UDP_RECVMBOX_SIZE 320
    
    /* ---------- RAW options ---------- */
    #define LWIP_RAW                1
    
    #define DEFAULT_RAW_RECVMBOX_SIZE (MEMP_NUM_TCP_SEG)
    
    /* ---------- Statistics options ---------- */
    
    #define LWIP_STATS              1
    #define LWIP_STATS_DISPLAY      1
    
    #if LWIP_STATS
    #define LINK_STATS              1
    #define IP_STATS                1
    #define ICMP_STATS              1
    #define IGMP_STATS              1
    #define IPFRAG_STATS            1
    #define UDP_STATS               1
    #define TCP_STATS               1
    #define MEM_STATS               1
    #define MEMP_STATS              1
    #define PBUF_STATS              1
    #define SYS_STATS               1
    
    #define LWIP_SNMP               LWIP_UDP
    #define MIB2_STATS              LWIP_SNMP
    #ifdef LWIP_HAVE_MBEDTLS
    #define LWIP_SNMP_V3            0
    #endif
    
    #endif /* LWIP_STATS */
    
    /* ---------- NETBIOS options ---------- */
    #define LWIP_NETBIOS_RESPOND_NAME_QUERY 1
    
    /* ---------- PPP options ---------- */
    
    #define PPP_SUPPORT             0      /* Set > 0 for PPP */
    
    #if PPP_SUPPORT
    
    #define NUM_PPP                 1      /* Max PPP sessions. */
    
    
    /* Select modules to enable.  Ideally these would be set in the makefile but
     * we're limited by the command line length so you need to modify the settings
     * in this file.
     */
    #define PPPOE_SUPPORT           1
    #define PPPOS_SUPPORT           1
    
    #define PAP_SUPPORT             1      /* Set > 0 for PAP. */
    #define CHAP_SUPPORT            1      /* Set > 0 for CHAP. */
    #define MSCHAP_SUPPORT          0      /* Set > 0 for MSCHAP */
    #define CBCP_SUPPORT            0      /* Set > 0 for CBCP (NOT FUNCTIONAL!) */
    #define CCP_SUPPORT             0      /* Set > 0 for CCP */
    #define VJ_SUPPORT              1      /* Set > 0 for VJ header compression. */
    #define MD5_SUPPORT             1      /* Set > 0 for MD5 (see also CHAP) */
    
    #endif /* PPP_SUPPORT */
    
    #endif /* LWIP_OPTTEST_FILE */
    
    
    /*-------------------------Misc Options-------------------------*/
    /* Enables ARP*/
    #define LWIP_ARP			  1
    
    /* Enables a routine to be called when netif is deleted. Used to close CPSW*/
    #define LWIP_NETIF_REMOVE_CALLBACK      1
    
    /* TCP thread priority level*/
    #define TCPIP_THREAD_PRIO				7
    
    /* Thread priority of any other thread created using the stack functions */
    #define DEFAULT_THREAD_PRIO				1
    
    /* Prevents pbuf chain from getting created thus disabling scatter-gather*/
    #define LWIP_NETIF_TX_SINGLE_PBUF       0
    
    #define DEFAULT_ACCEPTMBOX_SIZE			(TCPIP_MBOX_SIZE)
    
    
    /*---------------------------------------------------------------*/
    #if defined(__ARM_ARCH) && (defined(__TI_EABI__) || defined(__clang__))
        /*------------------------------------------------------------------------*/
        /* Under EABI, use function to access errno since it likely has TLS in    */
        /* a thread-safe version of the RTS library.                              */
        /*------------------------------------------------------------------------*/
        extern volatile int *__aeabi_errno_addr(void);
        #define errno (* __aeabi_errno_addr())
    #elif defined(__ARM_ARCH) && defined(__GNUC__)
        /*------------------------------------------------------------------------*/
        /* Under EABI, use function to access errno since it likely has TLS in    */
        /* a thread-safe version of the RTS library.                              */
        /*------------------------------------------------------------------------*/
        extern int *__errno(void);
        #define errno (* __errno())
    #elif !defined(__C6X_MIGRATION__) && defined(__TMS320C6X__) && defined(__TI_EABI__)
        /*------------------------------------------------------------------------*/
        /* Under EABI, use function to access errno since it likely has TLS in    */
        /* a thread-safe version of the RTS library.                              */
        /*------------------------------------------------------------------------*/
        extern int *__c6xabi_errno_addr(void);
        __TI_TLS_DATA_DECL(int, __errno);
    
        #define errno (* __c6xabi_errno_addr())
    #else
        extern _DATA_ACCESS int errno;
        _TI_PROPRIETARY_PRAGMA("diag_push")
        /* errno is not allowed under MISRA, anyway */
        _TI_PROPRIETARY_PRAGMA("CHECK_MISRA(\"-5.6\")") /* duplicated name in another scope (errno) */
        _TI_PROPRIETARY_PRAGMA("CHECK_MISRA(\"-19.4\")") /* macro expands to parenthesized */
        #define errno errno
        _TI_PROPRIETARY_PRAGMA("diag_pop")
    #endif
    
    
    #ifdef __cplusplus
    }
    #endif
    
    #endif /* LWIP_LWIPOPTS_H */
    


    Can you please check if the behaviour is same in both debug and release builds?

    So far we have used only the Debug configuration, since we are in a development phase; the Release configuration has never been used. We can run a test, but as stated above, any code change can affect the behaviour of the issue, even in the Debug configuration.

    Let me know if you need further information.
    Thank you,

    Davide

  • Hi Davide,

    Thank you for the details. The origins of the Rx and Tx pBufs are different, and I would like to check whether the correct memory regions are configured as uncached. I will look further into the details and respond within 2 working days regarding this issue.

    Please feel free to ping on this thread if the response is delayed further.

    Thanks and regards,
    Teja.

  • Hi Davide,

    Invalidated cache after Rx packets were received. Added CacheP_inv in line 1236 of lwip2enet.c for all the list->bufPtr. Result: application is running but does not explain why by setting the memory area as non-cacheable is not solving the issue

    Adding CacheP_inv over the list memory acts on the DMA memory buffers, which are then used as the pbuf payload addresses by LwIP. When you make the pbuf memory uncached, the buffer pointers themselves are in an uncached region, but the data is still in cached memory.

    We have not seen such issues in our applications in the past, but that could be the reason why changing the cache settings for the pBufs is still not fixing the errors. I have not been able to reproduce the error so far, but I am curious to know whether you are able to reproduce this issue in any of our OOB examples.
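
One detail that matters for the explanation above: cache maintenance works on whole cache lines, so the range given for an Rx buffer is effectively widened to line boundaries. The following is a minimal sketch of that widening, assuming a 32-byte cache line (the figure mentioned later in this thread). `CACHE_LINE_SIZE`, `cache_range_t` and `cache_line_range` are hypothetical names used for illustration only, not SDK APIs.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumption: 32-byte data cache line. */
#define CACHE_LINE_SIZE 32u

typedef struct {
    uintptr_t start; /* address of the first affected cache line */
    size_t    size;  /* bytes covered, always a multiple of CACHE_LINE_SIZE */
} cache_range_t;

/* Hypothetical helper: widen [addr, addr + len) to whole cache lines. */
cache_range_t cache_line_range(uintptr_t addr, size_t len)
{
    cache_range_t r;
    uintptr_t mask = (uintptr_t)CACHE_LINE_SIZE - 1u;
    uintptr_t end  = (addr + len + mask) & ~mask; /* round the end up */

    r.start = addr & ~mask;                       /* round the start down */
    r.size  = (size_t)(end - r.start);
    return r;
}
```

For example, a 0x20-byte buffer at 0x1010 yields a range starting at 0x1000 with size 0x40, so any unrelated variable located in 0x1000..0x100F or 0x1030..0x103F is touched by the same maintenance operation.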

    Thanks and regards,
    Teja.

  • Hi Teja,

    Thank you for your help. Unfortunately, it is very difficult for us to reproduce the issue even in our own application, so trying to reproduce it in the SDK examples would be very expensive in terms of time. Could you suggest a strategy for debugging these issues? If needed, could you also suggest a list of tools to better understand their origin?

    Thank you,

    Davide

  • Hi Davide,

    There is a call scheduled for Friday to look into this issue. Let's discuss more details there and try to root-cause the issue.

    Please let us know if there are any further queries

    Thanks and regards,
    Teja.

  • Hi Davide and Roberto,

    From the meeting that we had today, we understand the following:

    1. The issue is reproducible even with only one core running.
    2. The issue is fully resolved after moving gUart0UdmaTxRingMem out of the cache range of memp_tab_TCPIP_MSG_INPKT
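
A quick way to check a finding like this against the linker map is to test whether two objects touch a common cache line. The sketch below assumes a 32-byte line; `shares_cache_line` is a hypothetical helper, and the addresses and sizes would have to be taken from your own map file.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Assumption: 32-byte data cache line. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical helper: true if [a, a + a_len) and [b, b + b_len)
 * touch at least one common cache line. */
bool shares_cache_line(uintptr_t a, size_t a_len, uintptr_t b, size_t b_len)
{
    if ((a_len == 0u) || (b_len == 0u)) {
        return false; /* an empty object occupies no line */
    }

    /* Index of the first and last cache line used by each object. */
    uintptr_t a_first = a / CACHE_LINE_SIZE;
    uintptr_t a_last  = (a + a_len - 1u) / CACHE_LINE_SIZE;
    uintptr_t b_first = b / CACHE_LINE_SIZE;
    uintptr_t b_last  = (b + b_len - 1u) / CACHE_LINE_SIZE;

    /* The two line intervals overlap if neither ends before the other starts. */
    return (a_first <= b_last) && (b_first <= a_last);
}
```

Calling it with the address and size of a pointer variable such as memp_tab_TCPIP_MSG_INPKT and those of the UART ring memory shows whether invalidating one can affect the other.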

    The current points that we think are the reasons for such behaviour:

    1. Cache invalidation of the UART ring memory, with that memory being closer than 32 bytes to the LwIP memory pointers. This corrupts the updated LwIP addresses in the cache and brings the stale addresses back from DDR. I will try to reproduce this issue in our setup to validate the possibility.
    2. Stack memory running close to maximum capacity, which could be causing missing instructions. Please check the current stack usage in your task list to understand whether this is happening due to insufficient memory available to the thread. You can check the task stats using the "vTaskGetRunTimeStats()" API
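
The first point can be illustrated with a toy model of a single write-back cache line. This is only a host-side sketch of the suspected mechanism, not SDK code: `toy_line_t` and the `toy_*` functions are hypothetical, and byte 0 of the line stands in for an LwIP pointer that shares a line with the UART ring memory.

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define LINE_SIZE 32u /* assumption: 32-byte cache line */

/* Toy model: one cache line and the external memory behind it. */
typedef struct {
    uint8_t ddr[LINE_SIZE];    /* backing memory, as seen by DMA */
    uint8_t cached[LINE_SIZE]; /* CPU copy of the line */
    bool    valid;             /* line currently present in the cache */
} toy_line_t;

/* Load the line from memory on first access. */
void toy_fill(toy_line_t *l)
{
    if (!l->valid) {
        memcpy(l->cached, l->ddr, LINE_SIZE);
        l->valid = true;
    }
}

/* CPU write: lands in the cache only (write-back behaviour). */
void toy_cpu_write(toy_line_t *l, unsigned off, uint8_t v)
{
    toy_fill(l);
    l->cached[off] = v;
}

uint8_t toy_cpu_read(toy_line_t *l, unsigned off)
{
    toy_fill(l);
    return l->cached[off];
}

/* Write-back: copy the cached line to memory. */
void toy_writeback(toy_line_t *l)
{
    if (l->valid) {
        memcpy(l->ddr, l->cached, LINE_SIZE);
    }
}

/* Invalidate: drop the line WITHOUT writing it back. */
void toy_invalidate(toy_line_t *l)
{
    l->valid = false;
}

/* Returns what the CPU reads for the neighbouring variable after the
 * driver invalidates the shared line. */
uint8_t toy_demo(bool writeback_first)
{
    toy_line_t line = { .valid = false }; /* memory starts zeroed (stale value 0) */

    toy_cpu_write(&line, 0u, 0xAAu);      /* stack updates its pointer, in cache only */
    if (writeback_first) {
        toy_writeback(&line);
    }
    toy_invalidate(&line);                /* driver invalidates the ring memory's line */
    return toy_cpu_read(&line, 0u);       /* value the stack sees afterwards */
}
```

In this model `toy_demo(false)` returns the stale 0 while `toy_demo(true)` returns 0xAA, which is why the problem depends on exact data placement and timing and can disappear when unrelated code changes move variables around.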

    Once we can confirm the behaviour based on the tests, we can identify methods to follow up with the fixes which will prevent the issue from happening with other peripherals as well, apart from UART.

    Thanks and regards,
    Teja.

  • Hi Davide,

    We are able to reproduce the issue in our setup by invalidating the memory address at the start of the cache line, just as in your case, and then stepping through the run instead of free running. When free running with this cache invalidation added, we were not able to reproduce your specific failure, but the application still ends up in a different error case.

    We identified that the UART DMA ring memory buffers are aligned, but they are not placed in a separate section of the memory, so other elements can share the same space. We suggest adding an attribute to the Rx and Tx ring memory for UART DMA, with the following change:

    /* UART UDMA Channel Ring Mem */
    static uint8_t gUart`i`UdmaRxRingMem[UART_UDMA_TEST_RING_MEM_SIZE] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT), section(".bss:UART_DMA_MEM")));
    static uint8_t gUart`i`UdmaTxRingMem[UART_UDMA_TEST_RING_MEM_SIZE] __attribute__((aligned(UDMA_CACHELINE_ALIGNMENT), section(".bss:UART_DMA_MEM")));

    This can be found in the file mcu_plus_sdk/source/sysconfig/drivers/.meta/uart/templates/uart_config_am62x.c.xdt
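
As a complement to the dedicated section, and not part of the suggested SDK change, the buffer size can also be rounded up to whole cache lines so that nothing can follow the buffer inside its last line. This is a minimal sketch assuming a 32-byte line; `RING_MEM_SIZE`, `ROUND_UP_TO_LINE` and `gExampleRingMem` are hypothetical names, and the section attribute from the change above is left out so that the snippet builds on a host compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 32u /* assumption: 32-byte cache line */
#define RING_MEM_SIZE   40u /* hypothetical raw ring size in bytes */

/* Round a byte count up to the next multiple of the cache line size. */
#define ROUND_UP_TO_LINE(n) \
    ((((n) + CACHE_LINE_SIZE - 1u) / CACHE_LINE_SIZE) * CACHE_LINE_SIZE)

/* Aligned start plus padded size: the buffer owns every cache line it touches.
 * On target, the section(".bss:UART_DMA_MEM") attribute would be added as well. */
uint8_t gExampleRingMem[ROUND_UP_TO_LINE(RING_MEM_SIZE)]
    __attribute__((aligned(CACHE_LINE_SIZE)));

/* Compile-time check that no partial cache line is left at the end. */
_Static_assert((sizeof(gExampleRingMem) % CACHE_LINE_SIZE) == 0u,
               "ring memory must cover whole cache lines");
```

With these numbers the 40-byte ring occupies 64 bytes, so an invalidate over the ring cannot discard data belonging to a neighbouring variable.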

    We are reviewing the other components for similar issues, and will update the thread based on the findings.

    Thanks and regards,
    Teja.