Beware: the static LWIP 1.4.1 PBUF_POOL_BUFSIZE setting leads to random heap memory alignment faults.

Back tracking issue:

https://e2e.ti.com/support/microcontrollers/tiva_arm/f/908/p/417592/1490210#1490210

https://e2e.ti.com/support/microcontrollers/tiva_arm/f/908/t/409303

The Stellaris group's override of the (opt.h) default PBUF_POOL_BUFSIZE in (lwipopts.h), which migrated into Tivaware, has compatibility issues with DMA descriptor PBUFs.

Likewise, when (tcp_in.c) queues out-of-sequence frames in SRAM PBUFs, they randomly cause heap memory to slowly collapse with the (opt.h) defaults TCP_OOSEQ_MAX_PBUFS == 0, TCP_OOSEQ_MAX_BYTES == 0. More on that below.

Random peripheral bus faults are very difficult to track down, both what triggers them and why they occur, and they can seemingly stem from more than one origin.

This is especially true given the complexity of Adam Dunkels' LWIP heap memory management techniques. Tivaware includes the third-party LWIP to take advantage of the advanced TM4C1294 EMAC, which is designed with state-of-the-art DMA RX/TX descriptors.

The lesson learned: do not accept vendor prepackaged configurations as foolproof. Configure them yourself and test the C code perpetually.

http://lwip.wikia.com/wiki/Tuning_TCP

 

 

  • mem.h:
    
    /** Calculate memory size for an aligned buffer - returns the next highest
     * multiple of MEM_ALIGNMENT (e.g. LWIP_MEM_ALIGN_SIZE(3) and
     * LWIP_MEM_ALIGN_SIZE(4) will both yield 4 for MEM_ALIGNMENT == 4).
     */
    #ifndef LWIP_MEM_ALIGN_SIZE
    #define LWIP_MEM_ALIGN_SIZE(size) (((size) + MEM_ALIGNMENT - 1) & ~(MEM_ALIGNMENT-1))
    #endif

  • mem.c:

    /** All allocated blocks will be MIN_SIZE bytes big, at least!
     * MIN_SIZE can be overridden to suit your needs. Smaller values save space,
     * larger values could prevent too small blocks to fragment the RAM too much.
     */
    #ifndef MIN_SIZE
    #define MIN_SIZE             12
    #endif /* MIN_SIZE */

    /* some alignment macros: we define them here for better source code layout */
    #define MIN_SIZE_ALIGNED     LWIP_MEM_ALIGN_SIZE(MIN_SIZE)
    #define SIZEOF_STRUCT_MEM    LWIP_MEM_ALIGN_SIZE(sizeof(struct mem))
    #define MEM_SIZE_ALIGNED     LWIP_MEM_ALIGN_SIZE(MEM_SIZE)

    /** the heap. we need one struct mem at the end and some room for alignment */
    static u8_t ram_heap[MEM_SIZE_ALIGNED + (2*SIZEOF_STRUCT_MEM) + MEM_ALIGNMENT];

    /** pointer to the heap (ram_heap): for alignment, ram is now a pointer instead of an array */
    static u8_t *ram;

    /** the last entry, always unused! */
    static struct mem *ram_end;

    /** pointer to the lowest free block, this is used for faster search */
    static struct mem *lfree;
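To make the rounding in the mem.h macro above concrete, here is a minimal standalone sketch. It assumes MEM_ALIGNMENT is 4 (check your own lwipopts.h), and the two helper functions are hypothetical names introduced only for illustration; they are not part of LWIP.

```c
#include <stddef.h>

/* Assumption: 4-byte alignment. The real value comes from lwipopts.h. */
#define MEM_ALIGNMENT ((size_t)4)

/* Same rounding expression as the mem.h macro quoted above. */
#define LWIP_MEM_ALIGN_SIZE(size) (((size) + MEM_ALIGNMENT - 1) & ~(MEM_ALIGNMENT - 1))

/* Hypothetical helper: returns 1 when a configured size is already a
 * multiple of MEM_ALIGNMENT, 0 when the macro would round it up. */
static int size_is_aligned(size_t size)
{
    return LWIP_MEM_ALIGN_SIZE(size) == size;
}

/* Hypothetical helper: number of padding bytes the macro adds to a size. */
static size_t align_padding(size_t size)
{
    return LWIP_MEM_ALIGN_SIZE(size) - size;
}
```

A statically configured buffer size for which size_is_aligned() returns 0 is one that the stack would have to round up, so it may be worth checking any overridden size this way before suspecting the hardware.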

  • I discovered that the switches below are where receiving OOSEQ frames gets messy:

    LWIP (opt.h) incorrectly states that the values below have (no limit) and are only valid when TCP_QUEUE_OOSEQ == 0. They are corrected below so that they make sense in relation to the actual function. Adam Dunkels suggests disabling the feature (TCP_QUEUE_OOSEQ = 0) when memory is low, but the TM4C1294 with 256KB SRAM is not low on memory.

    Otherwise TCP_OOSEQ_MAX_PBUFS == 0, TCP_OOSEQ_MAX_BYTES == 0 apparently disables the function that queues RX OOSEQ frames.

    The values above (pbufs/bytes) are only valid when TCP_QUEUE_OOSEQ == 1, not when it is 0. Even after tweaking these two values to keep OOSEQ memory use under 8320 bytes (32/260), the only effect is to extend the time before a random bus fault occurs.

    The client-side RX MSS host TCP_WND is handled by the RX OOSEQ function. The Exosite server network operating system's CTCP might be a crippling factor in that TCP_WND. Hence (16/128) = 4160 bytes of OOSEQ PBUF memory is not enough: that PBUF space quickly fills SRAM with excessive broken DMA RX descriptor chains, and the PBUF fragmentation rapidly trips a bus fault. It also seems plausible that frequent occurrences (C+ break out) of RX FRAME ERRORs at the (tiva-tm4c129.c) abstraction layer lead to PBUF pool fragmentation in SRAM.

    One likely reason the Stellaris group disabled TCP_QUEUE_OOSEQ was the small SRAM on LM3S processors; they were also not using the internet on any RDK that I am aware of.

    http://lwip.wikia.com/wiki/Tuning_TCP

    With all good intent, the (opt.h) section should read:

    /**
     * TCP_OOSEQ_MAX_BYTES: The maximum number of bytes queued on ooseq per pcb.
     * Default is 0 bytes (limit). Only valid for TCP_QUEUE_OOSEQ==1.
     */
    #ifndef TCP_OOSEQ_MAX_BYTES
    #define TCP_OOSEQ_MAX_BYTES           12
    #endif
    
    /**
     * TCP_OOSEQ_MAX_PBUFS: The maximum number of pbufs queued on ooseq per pcb.
     * Default is 0 pbufs (limit). Only valid for TCP_QUEUE_OOSEQ==1.
     */
    #ifndef TCP_OOSEQ_MAX_PBUFS
    #define TCP_OOSEQ_MAX_PBUFS           80
    #endif
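The memory figures quoted above can be reproduced with a small budget calculation. This is only a sketch under the assumption that a pair such as (32/260) means 32 queued pbufs of 260 bytes each; the function names and the SRAM constant are hypothetical and introduced for illustration.

```c
#include <stdint.h>

/* Assumption: TM4C1294 on-chip SRAM size in bytes (256KB as stated above). */
#define SRAM_BYTES (256u * 1024u)

/* Hypothetical helper: worst-case bytes held on the OOSEQ queue of one pcb
 * when it is full of pbufs of a given size. */
static uint32_t ooseq_worst_case_bytes(uint32_t max_pbufs, uint32_t bytes_per_pbuf)
{
    return max_pbufs * bytes_per_pbuf;
}

/* Hypothetical helper: the same budget expressed in tenths of a percent
 * of total SRAM (integer arithmetic, rounded down). */
static uint32_t ooseq_budget_permille(uint32_t max_pbufs, uint32_t bytes_per_pbuf)
{
    return (ooseq_worst_case_bytes(max_pbufs, bytes_per_pbuf) * 1000u) / SRAM_BYTES;
}
```

Under that reading, 32 x 260 gives the 8320 bytes mentioned above. Note that 16 x 260 gives 4160, so the 4160 figure appears to be based on 260-byte pbufs as well. Either budget is a few percent of SRAM per connection.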

  • BTW: I updated LWIPServiceTimers(), found in (lwiplib.c), adding a call that checks LWIP timeouts and frees OOSEQ segments.

    LWIP must periodically return (defrag) elements to the PBUF_POOL, especially as the heap becomes low on RX PBUFs in SRAM.

    It seems tiva-tm4c129.c cannot allocate any new PBUF for a DMA RX descriptor when the heap is heavily fragmented with broken chains.

    TM4C SRAM fragmentation seems plausible if OOSEQ element housekeeping is not being performed with (proper) OOSEQ PBUF settings.

    What say the group?

       /* Periodically check LWIP timeouts. Attempt to reclaim memory space
        * from queued out-of-sequence TCP segments.
        * Put an element back into its pool. Free OOSEQ queued PBUFs when
        * the PBUF_POOL is empty, explicit PBUF_CHECK_FREE_OOSEQ == 1 (pbuf.h) */ 
    #if LWIP_TIMERS 
        if((g_ui32LocalTimer - g_ui32LwipTimeoutTimer) >= MEMP_NUM_SYS_TIMEOUT)
        {
        	g_ui32LwipTimeoutTimer = g_ui32LocalTimer;
        	sys_check_timeouts();
        }
    #endif
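The elapsed-time test in the snippet above relies on unsigned subtraction, which keeps working when the 32-bit tick counter wraps. The sketch below isolates that pattern so it can be tested off-target. All names here are hypothetical stand-ins, and the counter increment marks the spot where the real code calls sys_check_timeouts().

```c
#include <stdint.h>

/* Hypothetical stand-ins for g_ui32LwipTimeoutTimer and a call counter. */
static uint32_t last_service_ticks;
static unsigned service_calls;

/* Returns 1 when at least interval ticks have elapsed since last_ticks.
 * Unsigned subtraction gives the right answer across counter wraparound. */
static int timeout_service_due(uint32_t now_ticks, uint32_t last_ticks, uint32_t interval)
{
    return (uint32_t)(now_ticks - last_ticks) >= interval;
}

/* Mirrors the structure of the block above: record the time and run the
 * periodic service only when the interval has elapsed. */
static void service_timers(uint32_t now_ticks, uint32_t interval)
{
    if (timeout_service_due(now_ticks, last_service_ticks, interval)) {
        last_service_ticks = now_ticks;
        service_calls++;  /* real code: sys_check_timeouts(); */
    }
}
```

Whatever value is used as the interval, it should be in the same units as the local timer; the sketch takes it as a parameter so that choice stays explicit.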