EK-TM4C1294XL: LWIP Raw API - tcp issues

Luis Afonso

Guru 20670 points

Part Number: EK-TM4C1294XL

Hello everyone,

So I have been trying to get TCP working on the TM4C with the lwIP provided in the utilities libraries.
It's to be added to an existing project, so using TI-RTOS would be a pain (or maybe less pain?)

Anyway, I have seen the echo example, and after fumbling a bit with it and having some issues due to never having used something like this, I got to this point.
The idea is basically to use it to transfer data to and from a Python application. Nothing fancy, it's basically a serial port. The test I am making generates messages at 2 kHz, all 48 bytes in size, which will eventually have to reach the Python app.

I made a test program for this. It's not ideal but by itself it should work.
I have run into some issues, one of which seems related to this "already fixed" bug. Let's call this test1.
In test1 I use PuTTY to read the TCP data. It is basically a reader that discards everything. I am watching variables that indicate if "anyone" dropped bytes (before the solution I have now, bytes were dropped). No problems for 30 min.
But it seems that the pcb->snd_queuelen base value keeps increasing ever so slowly. I have tried disabling the message generator in the middle of the test and snd_queuelen did not decrease!

Other issue. Let's call it test2.
Pretty much the same, the difference is that I am using a Python application. Not Python test code, it's the real application, with the twist that it simply discards everything as soon as it takes it from the socket buffer, just to remove a possible bottleneck. Well, it works fine until the TM4C simply stops sending data. It didn't freeze: the timer interrupt keeps getting called, the message generator is working, the connection is established, there are no faults, and the lwIP TCP queues don't seem full. It simply stopped sending stuff. There might be something in the pcb or lwip_stats that I am missing (most likely), but it seems weird.
Wireshark shows "TCP out-of-order"

I still haven't checked everything I want for both tests, but since at least the first one seems reminiscent of the older bug, I decided to post here. The second test is probably something obvious in the flags that I don't know about, so help from experience would be awesome.

With that, some details on what I've got implemented:

Nagle enabled
Some choices are there to ensure that the raw API calls are all made in the same context. Not quite, since there's one tcp_write called in main, but it's done with IntMasterDisable so I am hoping there is no issue.

- I have a function "serial_lwIP_Send". It's called from the main code. It does one of the following:

It calls tcp_write with IntMasterDisable and sets a flag (sentEvent) to 0 when this happens. It only does this if sentEvent == 1 or the circular buffer is empty.
If sentEvent == 0, it loads the circular buffer instead.
If tcp_write returned ERR_MEM, it loads the circular buffer.

- Made a callback for "tcp_sent". I call "tcp_sent" before each and every tcp_write

The callback is in charge of setting sentEvent = 1 and calling tcp_write() if there is data in the circular buffer. It only gives tcp_write() a maximum of tcp_sndbuf() bytes.
It really only does this.

- Made a periodic timer ISR with 5ms period

This ISR is (or should be) higher priority than the Ethernet ISR
calls tcp_output every 10th interrupt (so 50 ms period)
calls lwIPTimer
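
The send path and the sent callback described in the list above can be sketched on a host PC as follows. This is only a minimal model under stated assumptions, not the code from the project: `fake_pcb_t` and `fake_write` are hypothetical stand-ins for the lwIP pcb, tcp_sndbuf() and tcp_write(), the `link_*` names and the buffer sizes are made up for illustration, and the sentEvent flag is simplified to "write directly only when the circular buffer is empty" so that byte order is preserved. On the real target, `link_send` would run with interrupts masked as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 1024u             /* hypothetical capacity, power of two */

typedef struct {
    uint8_t  data[RING_SIZE];
    uint32_t head;                  /* total bytes ever stored  */
    uint32_t tail;                  /* total bytes ever removed */
} ring_t;

static size_t ring_used(const ring_t *r) { return r->head - r->tail; }
static size_t ring_free(const ring_t *r) { return RING_SIZE - ring_used(r); }

/* Append up to len bytes; returns the number actually stored. */
static size_t ring_put(ring_t *r, const uint8_t *src, size_t len)
{
    size_t n = len < ring_free(r) ? len : ring_free(r);
    for (size_t i = 0; i < n; i++)
        r->data[(r->head + i) % RING_SIZE] = src[i];
    r->head += (uint32_t)n;
    return n;
}

/* Remove up to len bytes in FIFO order; returns the number copied out. */
static size_t ring_get(ring_t *r, uint8_t *dst, size_t len)
{
    size_t n = len < ring_used(r) ? len : ring_used(r);
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(r->tail + i) % RING_SIZE];
    r->tail += (uint32_t)n;
    return n;
}

/* Stand-in for the TCP stack. This is NOT lwIP. */
typedef struct {
    size_t sndbuf;                  /* free send space, role of tcp_sndbuf() */
    size_t accepted;                /* total bytes handed to the "stack"     */
} fake_pcb_t;

/* Role of tcp_write(): refuses the write when there is not enough space. */
static int fake_write(fake_pcb_t *pcb, const uint8_t *buf, size_t len)
{
    (void)buf;
    if (len > pcb->sndbuf)
        return -1;                  /* plays the role of ERR_MEM */
    pcb->sndbuf   -= len;
    pcb->accepted += len;
    return 0;
}

typedef struct {
    ring_t     ring;
    fake_pcb_t pcb;
    size_t     dropped;             /* bytes lost because the ring was full */
} link_t;

/* Main-context send: direct write if nothing is queued, else queue it. */
static void link_send(link_t *l, const uint8_t *msg, size_t len)
{
    if (ring_used(&l->ring) == 0 && fake_write(&l->pcb, msg, len) == 0)
        return;
    if (ring_free(&l->ring) < len) {
        l->dropped += len;          /* drop whole messages, never fragments */
        return;
    }
    ring_put(&l->ring, msg, len);
}

/* Sent callback: acked bytes free send space, then the ring is drained,
   never offering the "stack" more than its free send space. */
static void link_on_sent(link_t *l, size_t acked)
{
    uint8_t chunk[256];
    l->pcb.sndbuf += acked;
    while (ring_used(&l->ring) > 0 && l->pcb.sndbuf > 0) {
        size_t want = l->pcb.sndbuf < sizeof chunk ? l->pcb.sndbuf : sizeof chunk;
        size_t n    = ring_get(&l->ring, chunk, want);
        int rc      = fake_write(&l->pcb, chunk, n);
        assert(rc == 0);            /* cannot fail: n <= free send space */
        (void)rc;
    }
}
```

The `dropped` counter plays the same role as the "did anyone drop bytes" variables watched in test1. The model has a single thread, so it says nothing about what happens when a higher-priority ISR interrupts one of these functions halfway through.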

Any help would be appreciated.
I can provide lwipopts.h. I don't believe the memory is the issue, I probably got way too much for what I need. PBufs related might not be the best though.
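
For a rough sense of how much data piles up between tcp_output calls, the figures from this post (2 kHz message rate, 48-byte messages, 5 ms timer, tcp_output every 50 ms) can be worked through with a small helper. The function names are hypothetical and the result is payload only, ignoring TCP/IP header overhead.

```c
#include <stdint.h>

/* Payload bytes generated during one interval of interval_ms milliseconds. */
static uint32_t bytes_per_interval(uint32_t msg_rate_hz, uint32_t msg_size,
                                   uint32_t interval_ms)
{
    /* 64-bit intermediate so large rates cannot overflow the product. */
    return (uint32_t)(((uint64_t)msg_rate_hz * msg_size * interval_ms) / 1000u);
}

/* Payload throughput in bits per second. */
static uint32_t payload_bps(uint32_t msg_rate_hz, uint32_t msg_size)
{
    return msg_rate_hz * msg_size * 8u;
}
```

With these numbers, `bytes_per_interval(2000, 48, 50)` gives 4800 bytes queued between two tcp_output calls (480 bytes per 5 ms tick), and `payload_bps(2000, 48)` gives 768000 bit/s. The 4800-byte figure is the one to compare against the send-buffer and pbuf settings in lwipopts.h.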

over 6 years ago

Charles Tsai over 6 years ago

TI Guru 191366 points

Hi Luis,

Sorry, I'm not sure what the problem is in test2. Hopefully, there are more experienced lwIP users in the forum who can shed some light. When you said there is no more data getting sent out, is the application still calling tcp_write() during this time? What is the difference between test1 and test2 from the MCU side? Are they running the same application? If you disable Nagle, does it make a difference?

Luis Afonso over 6 years ago in reply to Charles Tsai

Guru 20670 points

Hello Charles,

I ended up fixing it, at least for the test code, still haven't had time to try it in the real code.

"When you said there is no more data getting sent out, is the application continuing calling tcp_write() during this time? "
Yes. Actually sometimes it got stuck in tcp_write when going through pbuf but not in the cases I reported here.

"What is the difference between test1 and test2 from the MCU side? "
test1 vs test2. The client is putty vs Python application.
The fact that there's a difference is probably just random.

"Are they running the same application? "
The microcontroller runs the exact same code in both tests.

" If you disable Nagle, does it make a difference?"
Nagle was acting weird: the segment count would go straight to the maximum and freeze the whole thing. (Either a fault, or no fault but the rest of the code froze because nothing was going out after the queue filled up, even with tcp_write being called.)

Okay, but what was the issue? Reentrancy.
"Threads": (I really don't like how the port comes with almost everything in Ethernet ISR, btw)
- Ethernet ISR (tcp_sent callback is here basically, among everything else)
- Timer1 ISR - lwIPTimer caller and tcp_output.

Well, the issue is that Timer1 is calling tcp_output and is higher priority than the Ethernet ISR, so it can preempt the lwIP code running there!
I removed tcp_output from Timer1_ISR and created Timer2_ISR, with the same priority as the Ethernet ISR, to call tcp_output. Just to test. This way the code works "just fine" so far.

The reason for the extra timer, instead of just changing the Timer1_ISR priority, is that since Timer1_ISR calls lwIPTimer, it (apparently) needs to be higher priority than the Ethernet ISR.
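
Why equal priority removes the reentrancy: on a Cortex-M core, a pending interrupt only preempts a running handler if its priority is strictly more urgent (a numerically lower value). The TM4C129x implements only the top 3 bits of each 8-bit priority field, so two ISRs whose values share those bits can never interrupt each other. A minimal host-side model of that rule follows, assuming the default priority grouping where all implemented bits are preemption bits. The helper names and the example priority values (0x80 for Timer1, 0xC0 for Ethernet and Timer2) are hypothetical, since the thread does not state the actual settings.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRIO_BITS 3u   /* TM4C129x implements the top 3 bits of the priority field */

/* Reduce an 8-bit priority value to its implemented level (0 = most urgent). */
static uint8_t prio_level(uint8_t reg)
{
    return (uint8_t)(reg >> (8u - PRIO_BITS));
}

/* True if an interrupt at priority `pending` may preempt a handler
   currently running at priority `active`. */
static bool can_preempt(uint8_t pending, uint8_t active)
{
    return prio_level(pending) < prio_level(active);
}
```

With the hypothetical values, `can_preempt(0x80, 0xC0)` is true, which is the original Timer1 setup where tcp_output could cut into lwIP code in the Ethernet ISR. `can_preempt(0xC0, 0xC0)` is false in both directions, which is the Timer2 arrangement: the two handlers run one after the other.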

Genatco over 6 years ago in reply to Luis Afonso

Guru 55913 points

Hi Luis,

What is the period of Timer1, perhaps >5ms? The lwIP software-triggered interrupt (lwiplib.c) has its limits on the interval timer (lwipopts.h), and the interval should be a multiple (n) of the Timer1 period. The pbuf pool crashes, the TCP router fails and the device layer TM4_129.c HAL gets cranky if it's pushed too hard with <75ms ticks. For example, the IOT project loaded on each LaunchPad runs fine with n=70 or 350ms ticks. I currently run telnet at 75ms (n=3). The only thing that occurs, and rarely, is telnet client disconnect timeouts.

BTW there is an example project for telnet.

Charles Tsai over 6 years ago in reply to Luis Afonso

TI Guru 191366 points

Hi Luis,

Thanks for the updates.

Luis Afonso said:
Okay, but what was the issue? Reentrancy.
"Threads": (I really don't like how the port comes with almost everything in Ethernet ISR, btw)
- Ethernet ISR (tcp_sent callback is here basically, among everything else)
- Timer1 ISR - lwIPTimer caller and tcp_output.

There is only one vector in the interrupt vector table for Ethernet. This is how the Ethernet/MCU was architected. We can't change anything about it at this moment.

Luis Afonso said:
Well the issue is that Timer1 is calling tcp_output and is higher priority than Ethernet ISR!
Removed tcp_output from Timer1_ISR and created Timer2_ISR with same priority as Ethernet ISR to call tcp_output. Just to test. This way the code works "just fine" so far

I was also going to suggest that you change the Timer1 priority, but you seem to have found a better solution. Please report back later on and share with the community whether using Timer2 is a more robust solution. I'm sure people in the community will benefit from your solution.

I will close the thread for now. If you have more updates with your solution please do share with us.
