
LWIP 1.4.1 TCP stresses memory leaks & (tiva-tm4c129.c) configuration - timers, types of PCB.

Other Parts Discussed in Thread: EK-TM4C1294XL

The nitty-gritty of EK-TM4C1294XL MPU resets during TCP transmit / receive: random halts occur during EMAC transmit when two separate IP struct lists bind into the TCP stack, so that several separate TCP ports exist in the EMAC PCB memory pool space at once. Sometimes an LWIP debug STAT message follows, such as an assert failure on (p != NULL) or on the PBUF reference count (p->ref == 1), but not always. The allocator / de-allocator sanity checks (LWIP debug assert / panic) often report the pools as sane when they are not sane at all.

One issue is an orphaned EMAC transmit interrupt: it is configured but has no handler, and interrupt handling in the EMAC seems unbalanced.

A few minor code changes in PHY interrupt handling and in the (type) of the intermediate PBUF storage for the DMA descriptors in SRAM seem to help with the fickle LWIP memory management and its obscure payload-> [methods]. It seems to make sense for the DMA scatter-gather process to use the intermediate (type) {RAM_POOL}, i.e. PBUF_RAM, leaving LWIP at the network and transport layers to manage the {PBUF_POOL} PBUFs. LWIP allocates / de-allocates PBUFs to the pool via descriptor blocks previously stored in SRAM at the hardware data link and physical layers. A very complicated packet handling scheme indeed. The oddest part is that (tiva-tm4c129.c) snatches incoming receive frames, attaches available PBUFs to previously assigned receive descriptors and passes them up to LWIP.

4.28.2015: A proper include path must be added so that (tiva-tm4c129.c) finds NETIF_DEBUG==1 in (lwipopts.h).

{${SW_ROOT}/third_party/lwip-1.4.1/ports/tiva-tm4c129}

Type PBUF_RAM uses the mem_free() call found inside pbuf_free(), which also appears to have timing issues. With a SysTick period of 2.5us and an LWIP timer interval in the 5-10ms range, random MPU resets often occur, or the Exosite client halts during pbuf_free(). Yet at times the Telnet client remains functional, as if nothing had gone wrong in the TCP stack.
ASSERT_FAIL at line 825 of C:/Software/Tivaware/TivaWare_C_Series-2.1.0.12573/third_party/lwip-1.4.1/src/core/ipv4/ip.c: p->ref == 1
ASSERT_FAIL at line 668 of C:/Software/Tivaware/TivaWare_C_Series-2.1.0.12573/third_party/lwip-1.4.1/src/core/ipv4/ip.c: p->ref == 1
ASSERT_FAIL at line 339 of C:/Software/Tivaware/TivaWare_C_Series-2.1.0.12573/third_party/lwip-1.4.1/src/core/mem.c: mem_free: mem->used
ASSERT_FAIL at line 651 of C:/Software/Tivaware/TivaWare_C_Series-2.1.0.12573/third_party/lwip-1.4.1/src/core/pbuf.c: pbuf_free: p->ref > 0
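
All four assertions above concern buffer ownership: two check that a PBUF handed to IP is not shared (p->ref == 1), one that a heap block is still marked used, and one that a PBUF being freed still has a reference. The following is only a minimal model of reference counting, not LWIP code; `model_pbuf`, `model_ref` and `model_free` are hypothetical names. It shows how a second free of the same buffer lands in the "p->ref > 0" case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pbuf: only the reference count matters. */
struct model_pbuf {
    unsigned ref;   /* number of owners still holding the buffer */
    int freed;      /* set once the storage has been released    */
};

/* Take an extra reference, as a second owner of the buffer would. */
static void model_ref(struct model_pbuf *p)
{
    assert(p != NULL && !p->freed);
    p->ref++;
}

/* Drop one reference.
 * Returns  1 if this call released the buffer,
 *          0 if other owners remain,
 *         -1 where LWIP would assert "pbuf_free: p->ref > 0". */
static int model_free(struct model_pbuf *p)
{
    if (p == NULL || p->ref == 0) {
        return -1;
    }
    if (--p->ref == 0) {
        p->freed = 1;
        return 1;
    }
    return 0;
}
```

If an interrupt handler and the main loop both believe they own the same descriptor PBUF, the second caller hits the -1 case, which would be consistent with the intermittent nature of the asserts.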

* 5.3.2015: The other issue is that type {PBUF_POOL} falls through to mem_free() (RAM_POOL) during pbuf_free() instead of memp_free(), whether {MEMP_MEM_MALLOC} is 1 or 0. That seems to contradict the way pbuf_free() is written, which explicitly invokes memp_free() for the (PBUF_POOL) type macro definition. It seems as if CCS5 may be resolving the wrong macro, which confuses anyone troubleshooting heap memory issues involving (pbuf.c).

LWIP 1.4.1 seemingly has timing issues that affect DMA transfers, and it cannot pass the litmus test when the NETIF is heavily loaded at high speed with multiple TCP struct port list bindings.

Also read about packet drops and LWIP timeout errors:
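
For reference, a small model of which deallocator pbuf_free() is expected to reach for each PBUF type. This is a sketch under two assumptions from my reading of the 1.4.1 sources: pool, ROM and REF types go to memp_free() while PBUF_RAM goes to mem_free(), and memp_free() is redirected to mem_free() when MEMP_MEM_MALLOC is 1. The `model_*` names and `M_`/`VIA_` constants are hypothetical.

```c
#include <assert.h>

/* Hypothetical mirror of the four LWIP 1.4.1 pbuf types. */
typedef enum { M_PBUF_RAM, M_PBUF_ROM, M_PBUF_REF, M_PBUF_POOL } model_type;

/* Which deallocator ends up being called. */
typedef enum { VIA_MEM_FREE, VIA_MEMP_FREE } model_path;

static model_path model_free_path(model_type type, int memp_mem_malloc)
{
    assert(memp_mem_malloc == 0 || memp_mem_malloc == 1);
    if (type == M_PBUF_RAM) {
        return VIA_MEM_FREE;   /* heap pbufs always use mem_free() */
    }
    /* POOL, ROM and REF go through memp_free(), which (assumption) is
     * redirected to mem_free() when MEMP_MEM_MALLOC is 1. */
    return memp_mem_malloc ? VIA_MEM_FREE : VIA_MEMP_FREE;
}
```

Under this model, a PBUF that was allocated as PBUF_RAM reaches mem_free() regardless of MEMP_MEM_MALLOC, so it is worth confirming the type stored in each PBUF before suspecting the linker.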
https://e2e.ti.com/support/microcontrollers/tiva_arm/f/908/t/412465

(tiva-tm4c129.c)

  for(ui32Loop = 0; ui32Loop < NUM_RX_DESCRIPTORS; ui32Loop++)
  {
      g_pRxDescriptors[ui32Loop].pBuf = pbuf_alloc(PBUF_TRANSPORT,
                                                   PBUF_POOL_BUFSIZE,
                                                   PBUF_RAM); //was PBUF_POOL

  ~~~~~~~~~~~~

      /* This buffer is outside the DMA-able memory space so we need
       * to copy the pbuf.
       */
      pBuf = pbuf_alloc(PBUF_TRANSPORT, p->tot_len, PBUF_RAM);

  ~~~~~~~~~~~~

      /* Allocate a new buffer for this descriptor */
      pDescList->pDescriptors[pDescList->ui32Read].pBuf = pbuf_alloc(PBUF_TRANSPORT,
                                                                     PBUF_POOL_BUFSIZE,
                                                                     PBUF_RAM); //was PBUF_POOL

4.22.2015
Found a hidden macro that cloaks the PBUF_POOL (type). It is defined in (memp_std.h) with the description "PBUF_POOL".

 
/*
 * A list of pools of pbuf's used by LWIP.
 *
 * LWIP_PBUF_MEMPOOL(pool_name, number_elements, pbuf_payload_size, pool_description)
 *     creates a pool name MEMP_pool_name. description is used in stats.c
 *     This allocates enough space for the pbuf struct and a payload.
 *     (Example: pbuf_payload_size=0 allocates only size for the struct)
 */
LWIP_PBUF_MEMPOOL(PBUF,      MEMP_NUM_PBUF,            0,                             "PBUF_REF/ROM")
LWIP_PBUF_MEMPOOL(PBUF_POOL, PBUF_POOL_SIZE,       PBUF_POOL_BUFSIZE,          "PBUF_POOL")
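
The "hidden" macro is an X-macro list: the same entries are expanded several times with different definitions to build the pool enum, the description strings and the sizes. The sketch below imitates that with a single list macro instead of a repeatedly included header. The `MODEL_POOLS`, `AS_*` and `model_*` names are hypothetical, and the counts and sizes are illustrative rather than TivaWare values.

```c
#include <assert.h>

/* Stand-in for memp_std.h: one entry per pool (name, count, payload, desc).
 * The counts and sizes are illustrative only. */
#define MODEL_POOLS(X)                      \
    X(PBUF,      16,   0, "PBUF_REF/ROM")   \
    X(PBUF_POOL, 20, 908, "PBUF_POOL")

/* Expansion 1: the enum that code such as pbuf.c refers to. */
#define AS_ENUM(name, num, size, desc) MEMP_##name,
typedef enum { MODEL_POOLS(AS_ENUM) MEMP_MAX } model_memp_t;

/* Expansion 2: the description strings reported by the stats code. */
#define AS_DESC(name, num, size, desc) desc,
static const char *const model_memp_desc[MEMP_MAX] = { MODEL_POOLS(AS_DESC) };

/* Expansion 3: the per-element payload size of each pool. */
#define AS_SIZE(name, num, size, desc) size,
static const unsigned model_memp_size[MEMP_MAX] = { MODEL_POOLS(AS_SIZE) };

/* Look up the description string for a pool index. */
static const char *model_pool_name(model_memp_t t)
{
    assert(t < MEMP_MAX);
    return model_memp_desc[t];
}
```

This is why a text search for MEMP_PBUF_POOL finds no plain definition: the identifier only exists after token pasting.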

(pbuf.c)

struct pbuf *
pbuf_alloc(pbuf_layer layer, u16_t length, pbuf_type type)
{
  struct pbuf *p, *q, *r;
  u16_t offset;
  s32_t rem_len; /* remaining length */
  LWIP_DEBUGF(PBUF_DEBUG | LWIP_DBG_TRACE, ("pbuf_alloc(length=%"U16_F")\n", length));

  /* determine header offset */
  switch (layer) {
  case PBUF_TRANSPORT:
    /* add 56 bytes room for transport (often TCP) layer header */
    offset = PBUF_LINK_HLEN + PBUF_IP_HLEN + PBUF_TRANSPORT_HLEN;
    break;
  case PBUF_IP:
    /* add room for IP layer header */
    offset = PBUF_LINK_HLEN + PBUF_IP_HLEN;
    break;
  case PBUF_LINK:
    /* add room for link layer header */
    offset = PBUF_LINK_HLEN;
    break;
  case PBUF_RAW:
    offset = 0;
    break;
  default:
    LWIP_ASSERT("pbuf_alloc: bad pbuf layer", 0);
    return NULL;
  }
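
The 56 bytes in the comment can be checked by hand. A minimal sketch, assuming PBUF_IP_HLEN and PBUF_TRANSPORT_HLEN are both 20 as in the 1.4.1 (pbuf.h) and taking PBUF_LINK_HLEN as 16 from the (lwipopts.h) in this post; the `MODEL_*` and `L_*` names are hypothetical.

```c
#include <assert.h>

/* Link value taken from the lwipopts.h in this post; IP and transport
 * values are assumed from LWIP 1.4.1 pbuf.h. */
#define MODEL_LINK_HLEN       16u
#define MODEL_IP_HLEN         20u
#define MODEL_TRANSPORT_HLEN  20u

typedef enum { L_TRANSPORT, L_IP, L_LINK, L_RAW } model_layer;

/* Bytes reserved in front of the payload for a given layer. */
static unsigned model_header_offset(model_layer layer)
{
    switch (layer) {
    case L_TRANSPORT:
        return MODEL_LINK_HLEN + MODEL_IP_HLEN + MODEL_TRANSPORT_HLEN;
    case L_IP:
        return MODEL_LINK_HLEN + MODEL_IP_HLEN;
    case L_LINK:
        return MODEL_LINK_HLEN;
    case L_RAW:
        return 0u;
    }
    assert(0 && "bad layer");
    return 0u;
}
```

So a PBUF_TRANSPORT allocation reserves 56 bytes more than a PBUF_RAW allocation of the same requested length.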

 

(tiva-tm4c129.c)
The interrupt source {EMAC_INT_TX_STOPPED} was enabled but orphaned: the netif interrupt handler had no branch for it. It is now added to the transmit case below.
Called from tivaif_init()
static void
tivaif_hwinit(struct netif *psNetif)
{ 


  /* Enable the Ethernet RX and TX interrupt source. */
  EMACIntEnable(EMAC0_BASE, (EMAC_INT_RECEIVE | EMAC_INT_TRANSMIT |
                EMAC_INT_TX_STOPPED | EMAC_INT_RX_NO_BUFFER |
                EMAC_INT_RX_STOPPED | EMAC_INT_PHY));



void
tivaif_interrupt(struct netif *psNetif, uint32_t ui32Status)
{
  tStellarisIF *tivaif;
  /* Process the transmit DMA list, freeing any buffers that have been
   * transmitted since our last interrupt. We also call this function in cases
   * where the transmitter has stalled due to missing buffers, to avoid
   * hanging interrupts for descriptors that have no pbufs attached.
   */
  if(ui32Status & (EMAC_INT_TRANSMIT | EMAC_INT_RX_NO_BUFFER |
                   EMAC_INT_TX_STOPPED))
  {
    tivaif_process_transmit(tivaif);
  }

  /**
   * Process the receive DMA list and pass all successfully received packets
   * up the stack. We also call this function in cases where the receiver has
   * stalled due to missing buffers since the receive function will attempt to
   * allocate new pbufs for descriptor entries which have none.
   */
  if(ui32Status & (EMAC_INT_RECEIVE | EMAC_INT_RX_NO_BUFFER |
                   EMAC_INT_RX_STOPPED))
  {
    tivaif_receive(psNetif);
  }
}
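
The dispatch logic can be exercised off-target with a small model. This is a sketch only: the `M_INT_*` bit positions are hypothetical placeholders (the real EMAC_INT_* values come from driverlib), and `model_dispatch` simply reports which of the two service routines the handler above would call for a given status word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the real EMAC_INT_* values are defined by
 * driverlib and are not reproduced here. */
#define M_INT_TRANSMIT      (1u << 0)
#define M_INT_TX_STOPPED    (1u << 1)
#define M_INT_RX_NO_BUFFER  (1u << 2)
#define M_INT_RECEIVE       (1u << 3)
#define M_INT_RX_STOPPED    (1u << 4)
#define M_INT_ALL           0x1Fu

#define M_RAN_TX  (1u << 0)   /* transmit list was processed */
#define M_RAN_RX  (1u << 1)   /* receive list was processed  */

/* Report which service routines run for a given interrupt status. */
static uint32_t model_dispatch(uint32_t status)
{
    uint32_t ran = 0;

    assert((status & ~M_INT_ALL) == 0);   /* only modelled sources allowed */
    if (status & (M_INT_TRANSMIT | M_INT_RX_NO_BUFFER | M_INT_TX_STOPPED)) {
        ran |= M_RAN_TX;
    }
    if (status & (M_INT_RECEIVE | M_INT_RX_NO_BUFFER | M_INT_RX_STOPPED)) {
        ran |= M_RAN_RX;
    }
    return ran;
}
```

With the masks as posted, a TX-stopped event is now serviced, and an RX-no-buffer event triggers both transmit and receive processing.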

  • There are hidden LWIP options, LWIP_RAW == 1 and MEMP_NUM_RAW_PCB == 4, defined as defaults in (opt.h).
     
    The Exosite program makes no (raw_pcb) calls into (raw.c), so LWIP_RAW need not be enabled.

  • Memory pool overflow and sanity checks were not being enabled (== 1) in (lwipopts.h) or (opt.h).



    (memp.c) Remove the (!) to allow sanity checks and to add PCB overflow constraints.

    #if (!)MEMP_MEM_MALLOC /* don't build if not using pools configured for use in lwipopts.h */
    struct memp {
      struct memp *next;
    #if MEMP_OVERFLOW_CHECK
      const char *file;
      int line;
    #endif /* MEMP_OVERFLOW_CHECK */
    };

     

  • BTW:

    It seems {PBUF_POOL_SIZE} is not being given the value expected in (lwipopts.h) or (opt.h).

    Oddly the LWIP suggested PBUF size is not being adhered to: the 4-byte memory alignment multiple is missing from {PBUF_POOL_BUFSIZE}.

    http://lwip.wikia.com/wiki/Tuning_TCP

    //*****************************************************************************
    //
    // ---------- Pbuf options ----------
    //
    //*****************************************************************************
    #define PBUF_LINK_HLEN                  16   // TivaTm4c129 Must be 16 default is 14
    #define PBUF_POOL_SIZE                  20   // #Elements: PBUF_POOL_SIZE > IP_REASS_MAX_PBUFS so stack still able receive
                                                 // packets even if maximum amount of fragments is enqueued for reassembly!
    #define PBUF_POOL_BUFSIZE       LWIP_MEM_ALIGN_SIZE(TCP_MSS+40+PBUF_LINK_HLEN) // 908 bytes if MEM_ALIGNMENT == 4 (850+40+16 = 906, rounded up), not 3624
                                                 /* The size of each pbuf in the pbuf pool.
                                                  * Accommodate a single full size TCP frame in one pbuf,
                                                  * including TCP_MSS, IP header, and link header.
                                                  * default:LWIP_MEM_ALIGN_SIZE(TCP_MSS+40+PBUF_LINK_HLEN)*/
    
    #define ETH_PAD_SIZE                    0    // default is 0

    A TCP_WND of 4096 and an MSS of 1500 are both quite large and likely cause transmitted fragments. LWIP frag STATS often show 1000's when enabled.

    //*****************************************************************************
    //
    // ---------- TCP options ----------
    //
    //*****************************************************************************
    #define LWIP_TCP                        1
    //#define TCP_TTL                         (IP_DEFAULT_TTL)
    #define TCP_WND                         (4 * TCP_MSS) // 3400 default 2048, 2x TCP_MSS min.
    #define TCP_MAXRTX                      12   //default 12
    #define TCP_SYNMAXRTX                   6    //default 6
    #define TCP_QUEUE_OOSEQ                 1    // Queue segments that arrive out of order. (p!=NULL)
                                                 // define 0 if low on memory.
    #define TCP_OOSEQ_MAX_PBUFS             12   // The maximum number of pbufs queued on ooseq per pcb.
    #define TCP_OOSEQ_MAX_BYTES             80   // The maximum number of bytes queued on ooseq per pcb, 960 bytes.
    #define TCP_MSS                         850  // default 536, Sets the upper limit advertised, to transmit
                                                 // back to the remote host. We are the remote host.
    #define TCP_CALCULATE_EFF_SEND_MSS      1
    #define TCP_SNDQUEUELOWAT               LWIP_MAX(((TCP_SND_QUEUELEN)/2), 5)
    #define TCP_SND_BUF                     (4 * TCP_MSS) //default 1700 2x, now 3400
    #define TCP_SND_QUEUELEN                (4 * (TCP_SND_BUF/TCP_MSS)) //default 16
    #define TCP_LISTEN_BACKLOG              1    // Listen backlog explicitly defined 0xff
    #define TCP_DEFAULT_LISTEN_BACKLOG      0xff // 255 pending connections (a count, not bytes)
    #define TCP_OVERSIZE                    TCP_MSS // default TCP_MSS
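
    The derived values in these two option blocks are easy to get wrong in comments, so here is a small off-target check. It is a sketch that copies the numbers from the excerpts above and assumes MEM_ALIGNMENT is 4 (not shown in this post); all `M_*` and `m_*` names are hypothetical. The two relations asserted in `m_sanity()` are my assumption of the kind of consistency LWIP expects between these options.

```c
#include <assert.h>

/* Values copied from the lwipopts.h excerpts in this post. */
#define M_TCP_MSS         850u
#define M_PBUF_LINK_HLEN  16u
#define M_PBUF_POOL_SIZE  20u
/* Assumption: MEM_ALIGNMENT is 4 (not shown in the post). */
#define M_MEM_ALIGNMENT   4u

/* Round n up to the next multiple of the alignment. */
#define M_ALIGN_SIZE(n) (((n) + M_MEM_ALIGNMENT - 1u) & ~(M_MEM_ALIGNMENT - 1u))
#define M_MAX(a, b)     (((a) > (b)) ? (a) : (b))

static unsigned m_pool_bufsize(void)
{
    return M_ALIGN_SIZE(M_TCP_MSS + 40u + M_PBUF_LINK_HLEN);
}

/* Payload bytes held by the whole pool (pbuf headers not included). */
static unsigned m_pool_payload_total(void)
{
    return M_PBUF_POOL_SIZE * m_pool_bufsize();
}

static unsigned m_tcp_wnd(void)          { return 4u * M_TCP_MSS; }
static unsigned m_tcp_snd_buf(void)      { return 4u * M_TCP_MSS; }
static unsigned m_tcp_snd_queuelen(void) { return 4u * (m_tcp_snd_buf() / M_TCP_MSS); }
static unsigned m_tcp_sndqueuelowat(void)
{
    return M_MAX(m_tcp_snd_queuelen() / 2u, 5u);
}

/* Assumed consistency relations between the options. */
static void m_sanity(void)
{
    assert(m_tcp_wnd() >= 2u * M_TCP_MSS);
    assert(m_tcp_snd_queuelen() >= 2u * (m_tcp_snd_buf() / M_TCP_MSS));
}
```

    With these inputs each pool buffer is 908 bytes, the pool holds 18160 payload bytes, the window and send buffer are 3400 bytes each, and the send queue length is 16 with a low-water mark of 8.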

  • The reasons why (PBUF_TRANSPORT) && (PBUF_RAM) are better suited to the task lie mainly in the vendor DMA RX descriptor struct that (tiva-tm4c129.c) uses to allocate ample space, so that received packets fit into each PBUF stored in SRAM.

    Seemingly PBUF_POOL was intended for LWIP internal use (tcp_out.c / tcp_in.c). Perhaps the vendor layer should refrain from using PBUF_POOL for temporary packet storage prior to filtering data up into the application layer. Likewise it would seem that the vendor's use of the PBUF_RAW (layer) class does not allocate enough PBUF space in SRAM to hold the entire RX descriptor scatter DMA data that reaches (tcp_in.c) at the transport layer?

    Seemingly the layer class PBUF_TRANSPORT is a better choice for allocating enough SRAM PBUF space for incoming received packets including {TCP_MSS, IP header, link header}; otherwise packet drops may occur at the PHY device layer.

    CSMA/CD collision handling in the MLINK adds to the time spent at the device layer receiving incoming packets, so packet drops count.

    Hence packet frame handling and configuration mismatches at the ascending TCP layers can directly impact the device layer. Setting TCP_MSS too high (1500) may also add to the number of dropped receive packets from the host if or when the vendor transport layer is struggling.

    LWIP (tcp_out.c) 3 Phases :

        /*
         * Phase 1: Copy data directly into an oversized pbuf.
         * The number of bytes copied is recorded in the oversize_used
         * variable. The actual copying is done at the bottom of the
         * function.
         */
    #if TCP_OVERSIZE
    #if TCP_OVERSIZE_DBGCHECK
        /* check that pcb->unsent_oversize matches last_unsent->unsent_oversize */
        LWIP_ASSERT("unsent_oversize mismatch (pcb vs. last_unsent)",
                    pcb->unsent_oversize == last_unsent->oversize_left);
    #endif /* TCP_OVERSIZE_DBGCHECK */

    ~~~~~~~~~~~~

    /* Phase 2: Create a pbuf with a copy or reference to seglen bytes. We
           * can use PBUF_RAW here since the data appears in the middle of
           * a segment. A header will never be prepended. */
          if (apiflags & TCP_WRITE_FLAG_COPY) {
            /* Data is copied */
            if ((concat_p = tcp_pbuf_prealloc(PBUF_RAW, seglen, space, &oversize, pcb, apiflags, 1)) == NULL) {
              LWIP_DEBUGF(TCP_OUTPUT_DEBUG | 2,
                          ("tcp_write : could not allocate memory for pbuf copy size %"U16_F"\n",
                           seglen));
              goto memerr;
            }
    
    ~~~~~~~~~~~
     /*
       * Phase 3: Create new segments.
       * The new segments are chained together in the local 'queue'
       * variable, ready to be appended to pcb->unsent.
       */
      while (pos < len) {
        struct pbuf *p;
        u16_t left = len - pos;
        u16_t max_len = mss_local - optlen;
        u16_t seglen = left > max_len ? max_len : left;
    #if TCP_CHECKSUM_ON_COPY
        u16_t chksum = 0;
        u8_t chksum_swapped = 0;
    #endif /* TCP_CHECKSUM_ON_COPY */
    
        if (apiflags & TCP_WRITE_FLAG_COPY) {
          /* If copy is set, memory should be allocated and data copied
           * into pbuf */
          if ((p = tcp_pbuf_prealloc(PBUF_TRANSPORT, seglen + optlen, mss_local, &oversize, pcb, apiflags, queue == NULL)) == NULL) {
            LWIP_DEBUGF(TCP_OUTPUT_DEBUG | 2, ("tcp_write : could not allocate memory for pbuf copy size %"U16_F"\n", seglen));
            goto memerr;
          }
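
    To see how Phase 3 interacts with TCP_SND_QUEUELEN, the loop arithmetic can be modelled on its own. A minimal sketch that only counts segments and allocates nothing; `model_count_segments` is a hypothetical name, and the variable names follow the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Count the segments Phase 3 would create for len bytes, given the
 * local MSS and the bytes of TCP options carried in each segment. */
static unsigned model_count_segments(uint16_t len, uint16_t mss_local,
                                     uint16_t optlen)
{
    unsigned segments = 0;
    uint16_t pos = 0;

    assert(mss_local > optlen);   /* otherwise no payload fits at all */
    while (pos < len) {
        uint16_t left = (uint16_t)(len - pos);
        uint16_t max_len = (uint16_t)(mss_local - optlen);
        uint16_t seglen = left > max_len ? max_len : left;

        pos = (uint16_t)(pos + seglen);
        segments++;
    }
    return segments;
}
```

    With TCP_MSS at 850 and no options, a full TCP_SND_BUF of 3400 bytes written in one call becomes 4 segments, while many small writes can each consume a queue entry of their own.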

     

  • The function mem_free() de-allocates chained PBUFs. The application loop should periodically call {sys_check_timeouts()} (timers.c) to harvest OOSEQ PBUFs when the PBUF_POOL runs dry, given the (lwipopts.h) settings below.

    I am not exactly sure what the default LWIP timeout value is for harvesting one OOSEQ PBUF when the PBUF_POOL runs dry and (NO_SYS==0). The default LWIP settings appear contrary to the readme text by Adam Dunkels (Swedish Institute of Computer Science) surrounding the function below.

    It would seem the LWIP module timeout function was believed to be triggered by the defaults in the LWIP configuration.

    Defaults for NO_SYS==1: TCP_QUEUE_OOSEQ == 1, MEMP_NUM_SYS_TIMEOUT == 7 {total of LWIP modules}.

    (pbuf.h)

    #if LWIP_TCP && TCP_QUEUE_OOSEQ
    /** Define this to 0 to prevent freeing ooseq pbufs when the PBUF_POOL is empty */
    #ifndef PBUF_POOL_FREE_OOSEQ
    #define PBUF_POOL_FREE_OOSEQ 1
    #endif /* PBUF_POOL_FREE_OOSEQ */
    #if NO_SYS && PBUF_POOL_FREE_OOSEQ
    extern volatile u8_t pbuf_free_ooseq_pending;
    void pbuf_free_ooseq();
    /** When not using sys_check_timeouts(), call PBUF_CHECK_FREE_OOSEQ()
        at regular intervals from main level to check if ooseq pbufs need to be
        freed! */
    #define PBUF_CHECK_FREE_OOSEQ() do { if(pbuf_free_ooseq_pending) { \
      /* pbuf_alloc() reported PBUF_POOL to be empty -> try to free some \
         ooseq queued pbufs now */ \
      pbuf_free_ooseq(); }}while(0)
    #endif /* NO_SYS && PBUF_POOL_FREE_OOSEQ*/
    #endif /* LWIP_TCP && TCP_QUEUE_OOSEQ */

    (timers.c)

    #if NO_SYS
    
    /** Handle timeouts for NO_SYS==1 (i.e. without using
     * tcpip_thread/sys_timeouts_mbox_fetch(). Uses sys_now() to call timeout
     * handler functions when timeouts expire.
     *
     * Must be called periodically from your main loop.
     */
    void
    sys_check_timeouts(void)
    {
      if (next_timeout) {
        struct sys_timeo *tmptimeout;
        u32_t diff;
        sys_timeout_handler handler;
        void *arg;
        u8_t had_one;
        u32_t now;
    
        now = sys_now();
        /* this cares for wraparounds */
        diff = now - timeouts_last_time;
        do
        {
    #if PBUF_POOL_FREE_OOSEQ
          PBUF_CHECK_FREE_OOSEQ();
    #endif /* PBUF_POOL_FREE_OOSEQ */
          had_one = 0;
          tmptimeout = next_timeout;
          if (tmptimeout && (tmptimeout->time <= diff)) {
            /* timeout has expired */
            had_one = 1;
            timeouts_last_time = now;
            diff -= tmptimeout->time;
            next_timeout = tmptimeout->next;
            handler = tmptimeout->h;
            arg = tmptimeout->arg;
    #if LWIP_DEBUG_TIMERNAMES
            if (handler != NULL) {
              LWIP_DEBUGF(TIMERS_DEBUG, ("sct calling h=%s arg=%p\n",
                tmptimeout->handler_name, arg));
            }
    #endif /* LWIP_DEBUG_TIMERNAMES */
            memp_free(MEMP_SYS_TIMEOUT, tmptimeout);
            if (handler != NULL) {
              handler(arg);
            }
          }
        /* repeat until all expired timers have been called */
        }while(had_one);
      }
    }
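
    Two details of sys_check_timeouts() are easy to miss: the elapsed time is an unsigned subtraction, which stays correct when sys_now() wraps, and each timeout stores its time relative to the entry before it. The following is a simplified model of both, using an array instead of the linked list; `model_elapsed` and `model_expire` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single counter wrap. */
static uint32_t model_elapsed(uint32_t now, uint32_t last)
{
    return now - last;
}

/* Walk a delta-encoded timeout list (each entry is relative to the one
 * before it) and return how many entries have expired after diff ticks. */
static unsigned model_expire(const uint32_t *delta, unsigned count,
                             uint32_t diff)
{
    unsigned fired = 0;

    assert(delta != NULL || count == 0);
    while (fired < count && delta[fired] <= diff) {
        diff -= delta[fired];   /* consume the time used by this entry */
        fired++;
    }
    return fired;
}
```

    Because handlers only run when this function is called, a main loop that never calls it leaves every timer pending, including the OOSEQ check at the top of the loop.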
    

     

  • It appears we must add the missing LWIP module call: sys_check_timeouts() is not found in the TivaWare main application loop (tiva-tm4c129.c).

    This function checks the LWIP module timeouts and frees queued out-of-sequence (OOSEQ) PBUFs back into the PBUF_POOL when it is empty.
    We have added it to the LWIP timer calls at the bottom of the function in (lwiplib.c).

    //*****************************************************************************
    //
    // The local time when the LWIP modules timeout detect timer was last serviced.
    //
    //*****************************************************************************
    #if NO_SYS && (LWIP_TCP && TCP_QUEUE_OOSEQ)
    static uint32_t g_ui32LwipTimeoutTimer = 0;
    #endif


    static void
    LwipServiceTimers(void)
    {
    ~~~~~~~~~~~~~~

       /* Periodically check LWIP timeouts. Attempt to reclaim memory space
        * from queued out-of-sequence TCP segments.
        * Put an element back into its pool. Free OOSEQ queued PBUFs when
        * the PBUF_POOL is empty, PBUF_POOL_FREE_OOSEQ == 1 (pbuf.h)
        */
    #if LWIP_TIMERS
        if((g_ui32LocalTimer - g_ui32LwipTimeoutTimer) >= MEMP_NUM_SYS_TIMEOUT)
        {
        	g_ui32LwipTimeoutTimer = g_ui32LocalTimer;
        	sys_check_timeouts();
        }