lwIP Driver losing pbufs when receiving lots of packets

Other Parts Discussed in Thread: EK-TM4C1294XL

Hello,

I have two Tiva-C development boards:

* EK-TM4C1294XL Rev C

* TM4C129X Development Board

I have tested every lwIP implementation across both boards and they all have the same issue. When inspecting lwIP's MEMP_STATS_DISPLAY output, the number of used PBUF_POOL objects eventually equals the number available, at which point all network activity stops.

When the application starts, the default 8 PBUF_POOL objects are in use, prepared for DMA receive. If I perform some HTTP GETs, there are still only 8 PBUF_POOL objects in use, indicating that things are working and pbufs are being recycled as expected.
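The check described above (used count compared against the pool size and against the idle baseline of 8) can be sketched as a pair of small helpers. This is only an illustration: `pool_snapshot`, `pool_exhausted` and `pool_excess` are hypothetical names, and the assumption is that on target the two counters would be copied from lwIP's statistics for PBUF_POOL when MEMP_STATS is enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of one pool's counters (names are illustrative,
 * not part of lwIP). */
struct pool_snapshot {
    uint32_t avail;  /* total elements in the pool */
    uint32_t used;   /* elements currently allocated */
};

/* True when every element is allocated, so no pbuf is left for receive. */
static inline bool pool_exhausted(const struct pool_snapshot *s)
{
    return s->used >= s->avail;
}

/* Pbufs in use beyond the idle baseline (8 in the post above).
 * A value that keeps growing while the link is idle suggests a leak. */
static inline uint32_t pool_excess(const struct pool_snapshot *s,
                                   uint32_t idle_baseline)
{
    return (s->used > idle_baseline) ? (s->used - idle_baseline) : 0u;
}
```

Sampling these values while the network is quiet, before and after a transfer, separates "pbufs temporarily busy" from "pbufs never returned".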

Now, I pull in a small test app that listens on a TCP port, accepts the connection and receives bytes. It currently just discards the bytes, but it has been used on multiple lwIP implementations without issue, so I am confident that the problem isn't in my code.

The receive handler, to show that it is not the cause:

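static err_t disc_received(void *arg, struct tcp_pcb *pcb, struct pbuf *p, err_t err)
{
	/* p == NULL means the remote side closed the connection; nothing to free. */
	if (p != NULL)
	{
		/* Acknowledge the received bytes so the TCP receive window reopens. */
		tcp_recved(pcb, p->tot_len);
		/* Discard the data and release the whole pbuf chain. */
		pbuf_free(p);
	}
	return ERR_OK;
}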

If I send a small 300 KB file, eventually all pbufs are used, and inspecting PBUF_POOL shows that the used quantity equals the available quantity.

To transmit this 300 KB file, 600 512-byte pbufs are used, and I have 64 in total. The transfer occasionally succeeds, but it typically stalls at about 95%. The failure is very repeatable, though it always stops at a different percentage complete. I'm sure there's a race condition somewhere but have not found it yet.
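The arithmetic behind those numbers can be written out as a short sketch. The helper names `pbufs_needed` and `pool_turnovers` are hypothetical, and the calculation assumes each pbuf carries a full 512 bytes of payload, which is a simplification.

```c
#include <stdint.h>

/* Pbufs needed to hold `bytes` of payload when each pbuf carries
 * `pbuf_payload` bytes (ceiling division). Assumes pbuf_payload > 0. */
static inline uint32_t pbufs_needed(uint32_t bytes, uint32_t pbuf_payload)
{
    return (bytes + pbuf_payload - 1u) / pbuf_payload;
}

/* How many times the whole pool must be recycled to carry that many
 * pbufs (ceiling division). Assumes pool_size > 0. */
static inline uint32_t pool_turnovers(uint32_t pbufs, uint32_t pool_size)
{
    return (pbufs + pool_size - 1u) / pool_size;
}
```

With 300 KB taken as 300 * 1024 bytes, `pbufs_needed(300u * 1024u, 512u)` gives 600 and `pool_turnovers(600u, 64u)` gives 10. The pool therefore has to be fully recycled about ten times, so a pbuf that is occasionally not returned can exhaust it before the transfer completes.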

Has anyone else run into this?

-Ken

  • Hello Ken,

    Thanks for putting in the effort to investigate this issue. We did hear about it.

    There is a bug in the lwIP driver for Tiva TM4C129x devices in the file "<TivaWare_Directory>/third_party/lwip-1.4.1/ports/tiva-tm4c129/netif/tiva-tm4c129.c" that causes a pbuf memory leak. Because of this leak, the TM4C129x device eventually stops responding to Ethernet traffic. The issue is difficult to notice in low-traffic environments but much easier to notice in high-traffic environments, as you have demonstrated.

    We have made some modifications to the "tiva-tm4c129.c" file, and in our tests so far we have not been able to reproduce this issue. We haven't concluded all our tests, but I believe the modifications will fix it.

    I am attaching the file "tiva-tm4c129.c" with the modifications. Please download it and save it in the folder "<TivaWare_Directory>/third_party/lwip-1.4.1/ports/tiva-tm4c129/netif/" (after backing up the original file).

    Thanks again for bringing this to our notice!

    Regards,
    Sai
  • Success!  Thank you!  This did indeed fix my issue.

    -Ken

  • I am glad it works.

    On a related note, there is an issue with the ROM version of the EMACInit() API on silicon revision A0 (of TM4C129x devices) that results in the MMC interrupts being enabled. These interrupts fire only after the device has been running for a long time, which could be days in some cases. If these interrupts are not cleared, the main loop won't get executed. This issue has been fixed in the ROM version of the API on silicon revision A1 and in the flash version of the API.

    If you are using silicon revision A0, please make sure that the flash version of the EMACInit() API is used. To use the flash version, call EMACInit() instead of MAP_EMACInit() or ROM_EMACInit() everywhere in the code. If you are using lwIP, please change MAP_EMACInit() to EMACInit() in the function "lwIPInit()" in the file "<TivaWare_Directory>/utils/lwiplib.c".

    In the next TivaWare release, MAP_EMACInit() will be updated to map to the flash version of the API (EMACInit()) on silicon revision A0 and to the ROM version of the API (ROM_EMACInit()) on silicon revision A1. To determine the silicon revision, read the DID0 register at 0x400FE000. If bits 15:0 read 0x0000, the device is revision A0.
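The revision check described above can be sketched as follows. The macro and function names (`DID0_ADDR`, `DID0_REV_MASK`, `did0_is_rev_a0`, `read_did0`) are hypothetical and chosen for illustration; only the address and the bit range come from the post. The decision is kept in a pure function so it can be exercised off-target, while the raw register read is meaningful only on the device itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define DID0_ADDR      0x400FE000u  /* DID0 address, as given in the post */
#define DID0_REV_MASK  0x0000FFFFu  /* bits 15:0 */

/* Pure helper: true when bits 15:0 of a DID0 value read 0x0000 (rev A0). */
static inline bool did0_is_rev_a0(uint32_t did0)
{
    return (did0 & DID0_REV_MASK) == 0u;
}

/* On target only: read the register directly. Do not call this on a
 * host PC, where the address is not mapped. */
static inline uint32_t read_did0(void)
{
    return *(volatile uint32_t *)(uintptr_t)DID0_ADDR;
}
```

On the device, `did0_is_rev_a0(read_did0())` returning true would indicate that the flash version, EMACInit(), should be called rather than MAP_EMACInit() or ROM_EMACInit().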

    Thanks,
    Sai
  • When it rains it pours:
    e2e.ti.com/.../1457249