RM46 LWIP UDP stuck in hdkif_rx_inthandler when multiple UDP packets received

Iceberg

Other Parts Discussed in Thread: HALCOGEN

Hi,

I originally posted this as a reply to a question in "https://e2e.ti.com/support/microcontrollers/hercules/f/312/t/445955"

I am having a problem with the code locking up in hdkif_rx_inthandler, typically within the while(hdkif_swizzle_data(curr_bd->flags_pktlen) & EMAC_BUF_DESC_SOP) loop, on the RM46L852PGET package.

In HALCoGen 4.05 the only drivers / pinmux options enabled are MIBSPI3, MII, and SCI, with the MIBSPI3NCS_1 conflict cleared and MIBSPI3NENA cleared (as opposed to CS_5). MDIO and MDCLK are checked. The VIM RAM is set as expected, and the clocks are configured the same as in the LWIP Ethernet Demo. The Ethernet Demo is used, but modified for UDP instead of HTTP.

I am using UDP mode on LWIP. If the host computer sends one UDP packet per cycle (i.e. each packet gets handled in a timely fashion), there are no issues and everything works. If two UDP packets are sent back to back before the first has been processed by the ISR, the code gets stuck in the ISR, typically at the while(hdkif_swizzle_data(curr_bd->flags_pktlen) & EMAC_BUF_DESC_SOP) check.

flags_pktlen is a big number (3942645758).
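
For reference, 3942645758 is 0xEAFFFFFE in hex. Below is a minimal sketch of decoding such a word. It assumes the usual TI EMAC descriptor layout (SOP in bit 31, EOP in bit 30, OWNER in bit 29, packet length in the low 16 bits). The macro values and the helper name decode_flags_pktlen are assumptions for illustration and should be checked against the driver's emac.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of flags_pktlen; verify against the driver's emac.h. */
#define DESC_SOP       0x80000000u  /* start of packet */
#define DESC_EOP       0x40000000u  /* end of packet */
#define DESC_OWNER     0x20000000u  /* descriptor still owned by the EMAC */
#define DESC_LEN_MASK  0x0000FFFFu  /* packet length in bytes */
#define MAX_FRAME_LEN  1518u        /* largest untagged Ethernet frame */

typedef struct {
    bool     sop;
    bool     eop;
    bool     owner;
    uint32_t pkt_len;
    bool     len_plausible;  /* false when the length cannot be a real frame */
} desc_info_t;

/* Hypothetical helper: split a (already swizzled) flags_pktlen word into fields. */
static desc_info_t decode_flags_pktlen(uint32_t word)
{
    desc_info_t d;
    d.sop           = (word & DESC_SOP) != 0u;
    d.eop           = (word & DESC_EOP) != 0u;
    d.owner         = (word & DESC_OWNER) != 0u;
    d.pkt_len       = word & DESC_LEN_MASK;
    d.len_plausible = (d.pkt_len > 0u) && (d.pkt_len <= MAX_FRAME_LEN);
    return d;
}
```

Under these assumptions the value above has the SOP bit set, so the while condition stays true, and its length field is 0xFFFE (65534 bytes), which is larger than any Ethernet frame. That suggests the word may not have been read from a valid receive descriptor.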

Any thoughts? It's frustrating not being able to handle the receipt of multiple UDP packets without locking up the microcontroller.

Josh Karch

over 10 years ago

0 Anthony F. Seely over 10 years ago

TI__Guru 68830 points

Hi Joshua,

Is there any information in the MAC Network Statistics Registers that would help us figure out what's going on?

These registers should begin at 0xFCF7_8200.

0 Iceberg over 10 years ago in reply to Anthony F. Seely

Intellectual 565 points

Anthony, here is some information from the memory browser. Also, FYI, the problem exists on the HDK with the ZWT part too. This capture is from the HDK, so we have known hardware showing this issue:

EMAC_RXGOODFRAMES        0x00000CD0
EMAC_RXBCASTFRAMES       0x0000095A
EMAC_RXMCASTFRAMES       0x00000000
EMAC_RXPAUSEFRAMES       0x00000000
EMAC_RXCRCERRORS         0x00000000
EMAC_RXALIGNCODEERRORS   0x00000000
EMAC_RXOVERSIZED         0x00000000
EMAC_RXJABBER            0x00000000
EMAC_RXUNDERSIZED        0x00000000
EMAC_RXFRAGMENTS         0x00000000
EMAC_RXFILTERED          0x000006E9
EMAC_RXQOSFILTERED       0x00000000
EMAC_RXOCTETS            0x00047A98
EMAC_TXGOODFRAMES        0x00000004
EMAC_TXBCASTFRAMES       0x00000003
EMAC_TXMCASTFRAMES       0x00000000
EMAC_TXPAUSEFRAMES       0x00000000
EMAC_TXDEFERRED          0x00000000
EMAC_TXCOLLISION         0x00000000
EMAC_TXSINGLECOLL        0x00000000
EMAC_TXMULTICOLL         0x00000000
EMAC_TXEXCESSIVECOLL     0x00000000
EMAC_TXLATECOLL          0x00000000
EMAC_TXUNDERRUN          0x00000000
EMAC_TXCARRIERSENSE      0x00000000
EMAC_TXOCTETS            0x00000344
EMAC_FRAME64             0x00000F14
EMAC_FRAME65T127         0x0000090E
EMAC_FRAME128T255        0x00000319
EMAC_FRAME256T511        0x00000093
EMAC_FRAME512T1023       0x00000000
EMAC_FRAME1024TUP        0x00000000
EMAC_NETOCTETS           0x0019C3A7
EMAC_RXSOFOVERRUNS       0x00001B16
EMAC_RXMOFOVERRUNS       0x00000000
EMAC_RXDMAOVERRUNS
0x0000190E 0x0016ACC4 0x0000190E 0x00000000 0x0000190E 0x0016ACC4 0x0000190E 0x00000000
0x0000190E 0x0016ACC4 0x0000190E 0x00000000 0x0000190E 0x00000000 0x00000000 0x00000000
0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000344 0x00000DF0
0x00000877 0x000002DC 0x00000083 0x00000000 0x00000000 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5
0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5 0x5BBC0DE5

0 Iceberg over 10 years ago in reply to Iceberg

Intellectual 565 points

Thank you by the way for your quick response!
-Josh

0 Anthony F. Seely over 10 years ago in reply to Iceberg

TI__Guru 68830 points

Hi Josh,

I'm not really an expert on the EMAC but from these statistics, it looks like there are quite a few RX Overruns being recorded. (See EMAC_RXSOFOVERRUNS, EMAC_RXDMAOVERRUNS counts).

So I would think the options are either making the receive address matching more selective, or increasing the number of DMA buffer resources (which means using more RAM), or possibly looking at latency issues.

Maybe the first thing to do is to check that you don't have an unreasonably small number of receive buffers for the task at hand. I'd think this number would be written into the RXnFREEBUFFER register somewhere early in the initialization code if you can't find it elsewhere.

Then it might be good to check the filtering - there are some sections in the EMAC TRM chapter that explain what filtering options there are - see Receive Channel Enabling and Receive Address Matching.

Also, you might want to take a quick look at 31.2.13 regarding latency. If your descriptors are pointing to buffers in off-chip RAM, especially if it's slow async SRAM, you may have a problem with latency. Not only would it be slower for the DMA to write the buffer, but it would also take longer for the CPU to process the buffers and return them to the queue.

Let me know if this gives you some things to check or if you've already checked these things and think something else may be going on.

Best Regards,
Anthony
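
A minimal sketch of how the overrun counters could be watched from code rather than from the memory browser. It assumes the statistics block is a contiguous array of 32-bit words starting at the base address Anthony gives (0xFCF78200), with the counters in the same order as Josh's dump, which would put RXSOFOVERRUNS, RXMOFOVERRUNS and RXDMAOVERRUNS at word indices 33 to 35. The indices and function names are assumptions to verify against the TRM.

```c
#include <stdint.h>

/* Assumed word indices within the statistics block (RXGOODFRAMES = word 0). */
enum {
    STAT_RXSOFOVERRUNS = 33,
    STAT_RXMOFOVERRUNS = 34,
    STAT_RXDMAOVERRUNS = 35,
    STAT_COUNT         = 36
};

/* Sum of the three receive overrun counters.
 * On target, 'stats' would be (const volatile uint32_t *)0xFCF78200u. */
static uint32_t rx_overrun_total(const volatile uint32_t *stats)
{
    return stats[STAT_RXSOFOVERRUNS]
         + stats[STAT_RXMOFOVERRUNS]
         + stats[STAT_RXDMAOVERRUNS];
}

/* New overruns since the previous snapshot; unsigned subtraction
 * gives the right answer even if the total has wrapped. */
static uint32_t rx_overrun_delta(uint32_t prev_total, uint32_t now_total)
{
    return now_total - prev_total;
}
```

Calling rx_overrun_total periodically and logging a non-zero rx_overrun_delta would show whether overruns coincide with the back-to-back packets.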

0 Iceberg over 10 years ago in reply to Anthony F. Seely

Intellectual 565 points

Anthony, in HALCoGen there are ten 1500-byte buffers allocated automatically. They should be using on-chip RAM because my RM46L852PGE has no off-chip memory, so I don't think that's the issue. It just seems that two separate (<100 byte) messages sent back to back are killing the task. The second message has a payload of four bytes.

So the question is, what is causing RX Overruns? Is there someone at TI who can run a straight test on the RM46 example and replace the HTTP server with the UDP mode and send two back to back messages? That's basically where I'm at right now.

Regards,

Josh

0 Anthony F. Seely over 10 years ago in reply to Iceberg

TI__Guru 68830 points

Hi Josh,

Some other community members have tried lwIP/UDP with Hercules. Seems that there's some success mentioned in this post: e2e.ti.com/.../1468982

Maybe there's something in that post.

If you post your code here, someone might be able to test it out. I'd like to myself but won't be able to until the end of next week.

0 Iceberg over 10 years ago in reply to Anthony F. Seely

Intellectual 565 points

Anthony,

So it appears that, in the end, the issue is that multiple messages come in and overrun the buffers before pbuf_free has the opportunity to be called. I refactored the code so that the callback function quickly identifies the received message, copies the buffer into the right location with memcpy, and then frees the buffer. I think we're almost there. The only other issue I seem to have from time to time is getting a clean Ethernet connection on power-up; I sometimes have to press the reset button to reset the PHY and MAC.
Thank you, I think we are good for now. That buffer overrun lead was great and led me to rethink how we were handling the interrupt!

Cheers,

Josh
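
A minimal sketch of the copy-then-free pattern Josh describes, written as plain C so it runs without lwIP. All names here (msg_ring_t, ring_push, ring_pop, the slot count and size) are hypothetical, and it assumes a single producer (the receive callback) and a single consumer (the main loop). In an lwIP receive callback, the idea would be to copy the payload out with ring_push and call pbuf_free on the pbuf right away, leaving the slower processing to the main loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSG_SLOTS 8u    /* hypothetical queue depth (holds MSG_SLOTS - 1 messages) */
#define MSG_MAX   128u  /* messages in this thread are under 100 bytes */

typedef struct {
    uint16_t len;
    uint8_t  data[MSG_MAX];
} msg_t;

typedef struct {
    msg_t             slots[MSG_SLOTS];
    volatile uint32_t head;     /* advanced only by the receive callback */
    volatile uint32_t tail;     /* advanced only by the main loop */
    uint32_t          dropped;  /* messages discarded because the ring was full */
} msg_ring_t;

/* Called from the receive callback: copy the payload so the driver's
 * buffer can be freed immediately. Returns 1 on success, 0 if dropped. */
static int ring_push(msg_ring_t *r, const uint8_t *payload, uint16_t len)
{
    uint32_t next = (r->head + 1u) % MSG_SLOTS;

    if (next == r->tail || len > MSG_MAX) {
        r->dropped++;
        return 0;
    }
    memcpy(r->slots[r->head].data, payload, len);
    r->slots[r->head].len = len;
    r->head = next;  /* publish the slot only after it is fully written */
    return 1;
}

/* Called from the main loop: fetch the oldest message, if any. */
static int ring_pop(msg_ring_t *r, msg_t *out)
{
    if (r->tail == r->head) {
        return 0;  /* ring is empty */
    }
    *out = r->slots[r->tail];
    r->tail = (r->tail + 1u) % MSG_SLOTS;
    return 1;
}
```

The trade-off is one extra copy per message in exchange for returning the receive buffer to the driver as early as possible. Dropping when the ring is full keeps the callback's run time bounded.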

0 Christian Fuchs1 over 9 years ago in reply to Iceberg

Expert 1250 points

Hello Josh,

I have the same problem now.

If possible, could you please post your reworked code here? It would help me a lot.

Thank you in advance

Best Regards

Christian

0 Anthony F. Seely over 9 years ago in reply to Christian Fuchs1

TI__Guru 68830 points

Christian,

I think it's best if Josh can post his solution, but in the meantime you can try increasing the number of receive pbufs through HALCoGen. There is a control on the EMAC Global screen. It defaults to 10, which is low. You can increase this.

0 Iceberg over 9 years ago in reply to Anthony F. Seely

Intellectual 565 points

Christian, the issue we had was not processing and releasing packets quickly enough. Once a packet has been received, the interrupt handler needs to release its buffer as soon as possible to prevent this lockup from occurring.

Best Regards,

Josh Karch

0 Anthony F. Seely over 9 years ago in reply to Iceberg

TI__Guru 68830 points

Hi Josh,

To fix the issue though - were you able to simply increase the number of pBUF on the HalCoGen EMAC tab so that it would take longer to run out of buffers?

0 Iceberg over 9 years ago in reply to Anthony F. Seely

Intellectual 565 points

Anthony,

I actually kept pBUF the same in my application; I just handled packets faster, with fewer delays.

Best Regards,

Josh

0 Christian Fuchs1 over 9 years ago in reply to Iceberg

Expert 1250 points

Hi,

thanks for your answers.

Increasing the pBuf count is not possible because RAM is limited. In one test case the PC sends 500 ARP requests in one second, so the buffer pool cannot be made big enough to handle this.

Faster handling is also difficult because the EMAC rx ISR only posts the pointer to the buffer into an RTOS queue. A task then makes the call into the Ethernet stack.

My idea now is to stop the EMAC from receiving when there are no free buffers and to enable it again once the task has freed them. So far, though, my changed software doesn't work as expected. Is this a feasible way to handle this issue?

Best regards

Christian

0 Anthony F. Seely over 9 years ago in reply to Christian Fuchs1

TI__Guru 68830 points

Christian,

The EMAC has flow control capability - see 31.2.10.1.3 of the TRM.

Can you make use of this - I think it essentially does what you are asking but in hardware by itself.
