TM4C1294NCPDT: tcp server does not respond to ping from client after many hours of working

Krishna Kumar A V

Part Number: TM4C1294NCPDT

CCS : 8.3.1

NO_SYS ( Not using RTOS)

We have developed a custom board with TM4C1294NCPDT with Ethernet and have implemented a TCP server using LWIP 1.4.1 by modifying an echo server example.

The Ethernet interrupt is not used and all Ethernet operations are done by calling lwIPEthernetIntHandler(); in the main loop of the program.

The connection to client PC is going through a switch.

The firmware responds to Modbus requests from the client PC. More than one such board is connected to the switch. The client is a Windows 10 PC running a ModbusPol program.

The speed is forced to 10 Mbps in the lwipopt.c.

It works OK for many hours but after a while stops responding to the client. Even ping fails. The only way out of this situation is to reset the microcontroller.

The working scenario is as follows:

The PC running ModbusPol is connected to the board to check the operation and then disconnected (by pulling out the cable while the program is still running).

After a few hours the cable is connected back and ModbusPol is run again. Most of the time it works OK. But once in a while the microcontroller does not respond even to ping from the PC.

The firmware on the TM4C1294NCPDT continues to run and does the rest of its jobs (like writing to the SD card, reading inputs, etc.).

We have tried a work around as below:

1. Check if the Cable is inserted.

2. If connected, check if there is any data being sent out in the echo_send function.

3. If there is no data being sent in the echo_send function for 30 seconds, do a software reset using SysCtlReset();
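The three steps above can be sketched as a small state machine. This is only an illustration that runs on a host: the type and function names (net_watchdog_t, net_watchdog_note_tx, net_watchdog_poll) are hypothetical and are not part of lwIP or TivaWare. Restarting the idle timer when the cable is plugged back in is an added assumption, so that a long unplugged period does not trigger recovery immediately.

```c
#include <stdbool.h>
#include <stdint.h>

#define TX_IDLE_LIMIT_MS 30000u  /* 30 seconds, as in step 3 */

/* Hypothetical bookkeeping; not an lwIP or TivaWare type. */
typedef struct {
    uint32_t last_tx_ms;   /* time of the last send seen in echo_send() */
    bool     link_was_up;  /* link state seen on the previous poll */
} net_watchdog_t;

/* Call from echo_send() whenever data is actually sent. */
void net_watchdog_note_tx(net_watchdog_t *wd, uint32_t now_ms)
{
    wd->last_tx_ms = now_ms;
}

/* Call periodically from the main loop.
 * Returns true when the link is up and nothing was sent for 30 s. */
bool net_watchdog_poll(net_watchdog_t *wd, bool link_up, uint32_t now_ms)
{
    if (!link_up) {
        wd->link_was_up = false;   /* step 1: cable not inserted */
        return false;
    }
    if (!wd->link_was_up) {
        /* Cable was just plugged in: restart the idle timer (assumption). */
        wd->link_was_up = true;
        wd->last_tx_ms = now_ms;
        return false;
    }
    /* Unsigned subtraction stays correct across counter wrap-around. */
    return (uint32_t)(now_ms - wd->last_tx_ms) >= TX_IDLE_LIMIT_MS;
}
```

When net_watchdog_poll() returns true, the caller decides what recovery to attempt; in the workaround described here that is SysCtlReset().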

This works OK but it is not a desirable solution, as it disturbs the other features of the program.

I have the following questions:

1. Is it OK to pull out the cable while the program is running?

2. Can we have a work around to restart the server without SysCtlReset() ?

Like calling netif_set_down and netif_set_up functions to recover from the situation?

3. Can we call LwIPInit(... ) function more than once in the same program?

Can someone suggest how to go about debugging the program?

I am new to Wireshark but am trying to figure out what is going on on the PC side.

Thank you for any help.

over 3 years ago

0 Ralph Jacobi over 3 years ago

TI__Guru*** 134965 points

Hello Krishna,

Our Ethernet expert is currently out of office and will return Wednesday. I will need to defer to his expertise on this topic and will follow up with him about this issue when he is back.

Best Regards,

Ralph Jacobi

0 Charles Tsai over 3 years ago

TI__Guru**** 183136 points

Hi,

I'm not familiar with Modbus, so I can't really help you if the problem is somehow related to the Modbus stack you have. Do you see any problem when you run the echo example and do the same things you did with your Modbus application, like disconnecting the cable and reconnecting it? Can you run it for hours/days without seeing an issue (e.g. unable to ping)?

Krishna Kumar A V said:
1. Is it OK to pull out the cable while the program is running?

I think so.

Krishna Kumar A V said:
2. Can we have a work around to restart the server without SysCtlReset() ?

If you just want to reset the EMAC and PHY, you can use SysCtlPeripheralReset() to reset only those two peripherals.

SysCtlPeripheralReset(SYSCTL_PERIPH_EMAC);
SysCtlPeripheralReset(SYSCTL_PERIPH_EPHY);

Krishna Kumar A V said:
3. Can we call LwIPInit(... ) function more than once in the same program?

I never tried that and I don't think you want to do that. I don't know the effect of initializing the network stack multiple times.

0 Krishna Kumar A V over 3 years ago in reply to Charles Tsai

Prodigy 80 points

Hi Charles,

Charles Tsai said:
If you just want to reset the EMAC and PHY, you can use SysCtlPeripheralReset() to reset only those two peripherals.

SysCtlPeripheralReset(SYSCTL_PERIPH_EMAC);
SysCtlPeripheralReset(SYSCTL_PERIPH_EPHY);

I tried the above but it gave exactly the same problem we were trying to solve, i.e. not responding to ping. On further study I found that the reset had set the MAC address to all FFs, which explains why ping was not working.
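A check for this condition can be sketched as below. It is a host-side illustration only: mac_is_all_ff is a hypothetical helper, not a TivaWare or lwIP function. On the target, the address would presumably be read back from the EMAC (TivaWare provides EMACAddrGet and EMACAddrSet for reading and reprogramming it), but those calls are left out here so the sketch stays self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LEN 6u

/* Returns true if every byte of the address is 0xFF, i.e. the state
 * described above after the EMAC peripheral reset. */
bool mac_is_all_ff(const uint8_t mac[MAC_LEN])
{
    for (size_t i = 0; i < MAC_LEN; i++) {
        if (mac[i] != 0xFFu) {
            return false;
        }
    }
    return true;
}
```

If the check returns true after a peripheral reset, the MAC address has to be programmed again before the interface can answer ARP or ping.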

Next, I tried to set the IP address back to the existing one using the lwIPNetworkConfigChange function.

This brings down the network interface and adds it back. This seems to be a reasonable workaround as it does not disturb other parts of the application. I also added a patch to store the lwIP status (IP address, MAC address, etc.) on the SD card when the hang event occurs, so that we can get an idea of what is causing the problem.

We will have to wait for a couple of days for the problem to show up and I will update the progress.

Thank you.

0 Krishna Kumar A V over 3 years ago in reply to Krishna Kumar A V

Prodigy 80 points

By the way, a similar issue was discussed in these forums a few years back at

https://e2e.ti.com/support/microcontrollers/arm-based-microcontrollers-group/arm-based-microcontrollers/f/arm-based-microcontrollers-forum/575366/tm4c1292ncpdt-lwip-1-4-1-stack-stops-responding-after-multiple-hours-of-tcp-communications

At the end of the discussion, I could not find any solution to the problem. Am I missing something?

0 Charles Tsai over 3 years ago in reply to Krishna Kumar A V

TI__Guru**** 183136 points

Hi Krishna,

Krishna Kumar A V said:
We will have to wait for a couple of days for the problem to show up and I will update the progress

Please let us know if the workaround works.

Krishna Kumar A V said:
At the end of the discussion, I could not find any solution to the problem. Am I missing something?

The last post was by Amit, suggesting a MEM_SIZE of 64K instead of the 16K in the OP's lwIP configuration.
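As a config fragment, that suggestion would look roughly like the line below in lwipopts.h. This is a sketch only: 64K is the value proposed in the linked thread, and whether it fits the RAM budget of this particular application is not known from the discussion here.

```c
/* lwipopts.h fragment (sketch): size of the lwIP heap in bytes.
 * The linked thread's original configuration used 16K. */
#define MEM_SIZE (64 * 1024)
```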

0 Krishna Kumar A V over 3 years ago in reply to Krishna Kumar A V

Prodigy 80 points

Krishna Kumar A V said:
Next, I tried to set the IP address back to the existing one using the lwIPNetworkConfigChange function.

This brings down the network interface and adds it back. This seems to be a reasonable workaround as it does not disturb other parts of the application. I also added a patch to store the lwIP status (IP address, MAC address, etc.) on the SD card when the hang event occurs, so that we can get an idea of what is causing the problem.

0 Krishna Kumar A V over 3 years ago in reply to Krishna Kumar A V

Prodigy 80 points

Using the lwIPNetworkConfigChange function as a workaround to restart lwIP when the hang occurs did not help.

The MAC address is not disturbed at the time of the hang.

At the time of the hang, the number of active tcp_pcbs is 2.

To debug further, we have now added logging of the state of each active tcp_pcb at the time of the hang.

Will update when we get the next hanging incident.
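One way to format such a log record is sketched below. It is a host-side illustration with stand-in types: pcb_snapshot_t and the snap_* names are hypothetical, and the state order is assumed to match lwIP's enum tcp_state (CLOSED = 0 through TIME_WAIT = 10). On the target, the fields would be copied from each entry of lwIP's active pcb list before the line is written to the SD card.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local copy of the state order assumed to match lwIP's enum tcp_state. */
enum snap_state {
    S_CLOSED, S_LISTEN, S_SYN_SENT, S_SYN_RCVD, S_ESTABLISHED,
    S_FIN_WAIT_1, S_FIN_WAIT_2, S_CLOSE_WAIT, S_CLOSING,
    S_LAST_ACK, S_TIME_WAIT, S_COUNT
};

/* Hypothetical snapshot of the fields worth logging for one pcb. */
typedef struct {
    enum snap_state state;
    uint16_t local_port;
    uint16_t remote_port;
    uint16_t snd_buf;       /* free space left in the send buffer */
    uint16_t snd_queuelen;  /* pbufs currently queued for sending */
} pcb_snapshot_t;

const char *snap_state_name(enum snap_state s)
{
    static const char *const names[S_COUNT] = {
        "CLOSED", "LISTEN", "SYN_SENT", "SYN_RCVD", "ESTABLISHED",
        "FIN_WAIT_1", "FIN_WAIT_2", "CLOSE_WAIT", "CLOSING",
        "LAST_ACK", "TIME_WAIT"
    };
    return ((unsigned)s < (unsigned)S_COUNT) ? names[s] : "UNKNOWN";
}

/* Formats one log line; returns the snprintf() result. */
int snap_format(const pcb_snapshot_t *p, char *out, size_t out_len)
{
    return snprintf(out, out_len,
                    "state=%s lport=%u rport=%u snd_buf=%u qlen=%u\n",
                    snap_state_name(p->state),
                    (unsigned)p->local_port, (unsigned)p->remote_port,
                    (unsigned)p->snd_buf, (unsigned)p->snd_queuelen);
}
```

Logging the send-buffer fields alongside the state may help tell a connection that is merely idle from one whose transmit queue has stopped draining.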

0 Genatco over 3 years ago in reply to Krishna Kumar A V

Guru 55553 points

Hello Krishna,

It might help to enable LWIP debug for TCPIP and print out status debug messages on a virtual COM port monitor. The LWIP 1.4.1 internal TCP clock speeds seem to have some effect on PBUF faulting as reported via EMAC0 DMARIS register.
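For context on the timing remark, lwIP's periodic work is meant to run at fixed intervals (by default 250 ms for the TCP timer and 5000 ms for the ARP timer), so the time base fed to the stack has to match real elapsed time. The sketch below shows one way to gate periodic calls in a polled main loop. It is a host-side illustration: periodic_t and periodic_due are hypothetical names, and in TivaWare's lwiplib this bookkeeping is normally driven through lwIPTimer() instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_PERIOD_MS 250u   /* lwIP's default TCP timer interval */
#define ARP_PERIOD_MS 5000u  /* lwIP's default ARP timer interval */

/* Hypothetical helper: tracks when a periodic job last ran. */
typedef struct {
    uint32_t period_ms;
    uint32_t last_ms;
} periodic_t;

/* Returns true at most once per period; the caller would then invoke
 * the matching lwIP timer function (for example tcp_tmr()). */
bool periodic_due(periodic_t *t, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct across counter wrap-around. */
    if ((uint32_t)(now_ms - t->last_ms) < t->period_ms) {
        return false;
    }
    t->last_ms = now_ms;
    return true;
}
```

If the millisecond count passed in runs faster than real time, the TCP timers fire too often and retransmission timeouts shrink accordingly, which is one plausible reading of the clocking concern raised here.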

Regards,

0 Krishna Kumar A V over 3 years ago in reply to Genatco

Prodigy 80 points

Hi GI

Genatco said:
The LWIP 1.4.1 internal TCP clock speeds seem to have some effect on PBUF faulting as reported via EMAC0 DMARIS register.

Can you please explain the above a bit in detail ?

Connecting a virtual COM port to the board is difficult as the board is inside an enclosure. What I am doing is writing the various TCP details to a file on the SD card when the hang occurs. Any suggestions on the values I should log in this file? Presently the 2 active tcp_pcbs are in ESTABLISHED state after the hang.

Thank you.

0 Genatco over 3 years ago in reply to Krishna Kumar A V

Guru 55553 points

Krishna Kumar A V said:
What I am doing is writing the various TCP details to a file on the SD card when the hang occurs.

It seems best to plug an XDS debugger into the JTAG port and set a watch on the EMAC0 DMARIS register via CCS debug. You can easily tweak several lwipopts.h TCP settings to control packet messaging. The settings below were used for IoT packet transfers to an internet host and a local Telnet client with different pcb names. There is an LWIP clocking module where one can set the number of clock ticks for the TCP protocol; search the LWIP files. If the GPTM that calls the LWIP stack is clocking too fast for the LWIP internal timers, there will be all sorts of debug error messages.

//*****************************************************************************
//
// ---------- IP options ----------
//
//*****************************************************************************
#define IP_FORWARD                      0
#define IP_OPTIONS_ALLOWED              1    // default 1
#define IP_REASSEMBLY                   1    // default 1 Incoming
#define IP_FRAG                         1    // default 1 Outgoing any exceeding MTU
#define IP_REASS_MAXAGE                 3    // default 3 wait for seconds
#define IP_REASS_MAX_PBUFS             10    // default 10 Max PBUFS waiting to be reassembled
#define IP_FRAG_USES_STATIC_BUF         0    // May not be used for DMA-enabled MACs!
#define IP_FRAG_MAX_MTU                1500
#define IP_DEFAULT_TTL                  255
//#define IP_SOF_BROADCAST                 1 // Broadcast filter per pcb on udp send operations
//#define IP_SOF_BROADCAST_RECV            1 // Broadcast filter on recv operations


//*****************************************************************************
//
// ---------- TCP options ----------
//
//*****************************************************************************
#define LWIP_TCP                        1
#define TCP_TTL                         (IP_DEFAULT_TTL)
#define TCP_WND                         (4 * TCP_MSS) // 3400 with TCP_MSS 850; default 2048, 2x TCP_MSS min.
#define TCP_MAXRTX                      12   //default 12
#define TCP_SYNMAXRTX                   6    //default 6
#define TCP_QUEUE_OOSEQ                 1    // Queue segments that arrive out of order. (p!=NULL)
                                             // define 0 if low on memory.
#define TCP_OOSEQ_MAX_PBUFS             12   // The maximum number of pbufs queued on ooseq per pcb.
#define TCP_OOSEQ_MAX_BYTES             80   // The maximum number of bytes queued on ooseq per pcb, 960 bytes.
#define TCP_MSS                        850  // default 536, Sets the upper limit advertised, to transmit
                                             // back to the remote host. We are the remote host.
#define TCP_CALCULATE_EFF_SEND_MSS      1
#define TCP_SNDQUEUELOWAT               LWIP_MAX(((TCP_SND_QUEUELEN)/2), 5)
#define TCP_SND_BUF                     (4 * TCP_MSS) // default 2x, now 3400 with TCP_MSS 850
#define TCP_SND_QUEUELEN                (3 * (TCP_SND_BUF/TCP_MSS)) //12 default 16
#define TCP_LISTEN_BACKLOG              1    // Listen backlog explicitly defined 0xff
#define TCP_DEFAULT_LISTEN_BACKLOG      0xff // 255 pending connections (a count, not bytes)
#define TCP_OVERSIZE                    TCP_MSS // default TCP_MSS 850 bytes
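The derived values in the TCP section can be checked at compile time. The sketch below repeats only the four macros involved and evaluates them with C11 _Static_assert. The two constraints are the "2x TCP_MSS min." rule noted in the comments above and the relation lwIP's own sanity checks expect between TCP_SND_QUEUELEN and TCP_SND_BUF; treat it as an illustration, not as part of the posted configuration.

```c
/* Values copied from the settings above. */
#define TCP_MSS           850
#define TCP_WND           (4 * TCP_MSS)                  /* 3400 bytes */
#define TCP_SND_BUF       (4 * TCP_MSS)                  /* 3400 bytes */
#define TCP_SND_QUEUELEN  (3 * (TCP_SND_BUF / TCP_MSS))  /* 12 pbufs   */

/* Receive window should be at least two segments. */
_Static_assert(TCP_WND >= 2 * TCP_MSS,
               "TCP_WND must be at least 2 * TCP_MSS");

/* Send queue must hold at least two pbufs per full-size segment. */
_Static_assert(TCP_SND_QUEUELEN >= 2 * (TCP_SND_BUF / TCP_MSS),
               "TCP_SND_QUEUELEN must be >= 2 * TCP_SND_BUF / TCP_MSS");
```

With TCP_MSS at 850, both TCP_WND and TCP_SND_BUF come to 3400 bytes and the send queue holds 12 entries, so both constraints pass.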

0 Krishna Kumar A V over 3 years ago in reply to Genatco

Prodigy 80 points

This is the update on the problem:

We have tried a board at two places. At one place it runs without hanging for many days. At the other place, the same board hangs after a few hours.

So we suspect that the problem could be the Ethernet switch used at the second place. We are swapping the switches to confirm. Will update shortly.

Can a switch cause such a hanging problem?

0 Ralph Jacobi over 3 years ago in reply to Krishna Kumar A V

TI__Guru*** 134965 points

Hello Krishna,

Charles is our expert for this topic and he is out of office until next week - please expect a reply when he returns.

Best Regards,

Ralph Jacobi
