66AK2E05: Ethernet SW Transmission Hang

Part Number: 66AK2E05

Hi TI folks,

I have a problem that occurs rarely. Descriptors pushed by the cores start to remain in the Ethernet switch hardware queue. It looks like the TX DMA stops working.

I've checked all the descriptors for irregular bits or bytes. They all looked fine.

During debugging, when I tried to access the memory region of the Ethernet statistics block, the emulator disconnected and could not reconnect.

I came across "KeyStoneII.BTS_errata_advisory.39 CPSW Stall if Duplex Changes During Transmission" in the TCI6638 Silicon Errata document.

Advisory 39 does not appear in the 66AK2E05 errata document.

Does Advisory 39 apply to the 66AK2E05?

Burhan. 

  • Hi, Burhan,

    TCI6638 is a 66AK2H/K variant, and one of the variants of 66AK2E/L is TCI6630K2L. They are two different board designs and SoCs, so no, Advisory 39 in the errata for 66AK2H/K doesn't apply to 66AK2E/L.

    Rex

  • Hi Rex, 

    Thank you for your answer. 

    So, do you have any idea what I should check for the transmission hang issue on the 66AK2E05?

  • Hi, Burhan,

    Could you describe your environment in more detail? Which SDK (Linux or RTOS) are you using, and which version? What is the system doing when the issue happens? Which core do you connect to in order to check memory? Are you using CCS? Was the connection OK before you typed the address into the memory browser? So far you have given only a problem statement without details.

    Rex

  • Hi Rex,

    • It is a custom-designed board.
    • Using pdk_k2e_4_0_5 for the PA APIs.
    • Running TI RTOS.
    • Almost no traffic. I am just pinging an IP address continuously and pushing a dummy descriptor every 1 ms to detect the problem. (The dummy descriptors return to a specific queue that the software pops. If the software cannot pop one within a set time, it flags the issue.)
    • I connect to ARM3 with CCS to check memory.
    • It was OK for a long time (really long, maybe 3-4 hours) while I popped and inspected descriptors from the Ethernet queue.
    • I log each descriptor that is pushed to the Ethernet queue, so I can debug previous descriptors from the logs (in my experience, the Packet Accelerator and Security Accelerator are prone to errors when a descriptor is not prepared carefully). I checked about 10 prior descriptors and found nothing. The last dummy descriptor was still in the Ethernet queue, which shows that transmission hangs.
    • After inspecting the descriptors, I wanted to check the Ethernet statistics, and the debugger connection was lost.
    • Normally, I can connect to the core and check the Ethernet statistics via the debugger.
    • Next time the issue happens, I will collect and share the memory of the last 10 descriptors for examination.
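
The dummy-descriptor detection scheme described in the list above could be sketched roughly as follows. This is a minimal sketch and not the production code: the type `tx_watchdog_t`, the function names, and the millisecond tick source are all hypothetical, and the actual queue push/pop calls are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical TX hang detector. All names here are illustrative
 * assumptions; they do not come from the PDK. */
typedef struct {
    uint32_t last_return_ms;  /* tick when a dummy descriptor last came back */
    uint32_t timeout_ms;      /* allowed gap before a hang is declared */
} tx_watchdog_t;

static void tx_watchdog_init(tx_watchdog_t *wd, uint32_t now_ms,
                             uint32_t timeout_ms)
{
    wd->last_return_ms = now_ms;
    wd->timeout_ms = timeout_ms;
}

/* Call each time a dummy descriptor is popped from its return queue. */
static void tx_watchdog_on_return(tx_watchdog_t *wd, uint32_t now_ms)
{
    wd->last_return_ms = now_ms;
}

/* Unsigned subtraction keeps the comparison valid when the tick
 * counter wraps around. */
static bool tx_watchdog_is_hung(const tx_watchdog_t *wd, uint32_t now_ms)
{
    return (uint32_t)(now_ms - wd->last_return_ms) > wd->timeout_ms;
}
```

A periodic task would push one dummy descriptor per tick, call `tx_watchdog_on_return()` whenever one is popped from the return queue, and treat `tx_watchdog_is_hung()` returning true as the signal to capture the descriptor log.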

    Thank you Rex for your support

    Burhan.

  • Hi, Burhan,

    I'll have an RTOS expert help you.

    Rex

  • Hi,

    Is this test code based on the TI PA example or the NIMU example? Is it just a ping test from a PC? The NIMU doesn't use PA. The PA test is a loopback test for 10 or 20 packets. If you run it for tens of thousands of packets, does it fail?

    Regards, Eric

  • Any update?

    -Eric

  • Hi Eric,

    This is production code on a production system. We did follow the TI PA examples closely when coding the lower layers of the protocol stacks.

    Yes, it is just a ping test.

    In another branch, I've just disabled the Ethernet switch external control. Normally in this setup the problem occurs within 2-3 days. However, with external control disabled we could not reproduce the problem.

    Regards,

    Burhan.

  • Hi,

    Thanks for the information. Regarding "Ethernet Switch external control": which register exactly is this? What were the values before and after? Can your finding be used as a workaround, or does it need further investigation?

    Regards, Eric

  • Hi Eric,

    It is bit 18 of the Ethernet Port n MAC Control Register (Pn_MAC_CTL). It is described in document SPRUHZ3A.

    EXT_EN: External Control Enable. Enables the fullduplex and gigabit mode to be selected from the FULLDUPLEX_IN and GIG_IN input signals and not from the FULLDUPLEX and GIG bits in this register. The FULLDUPLEX_MODE bit reflects the actual fullduplex mode selected.

    The value of this bit was 1 and I set it to 0. 

    Changing this bit from 1 to 0 seems to have solved the problem. 
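
As a rough illustration of the change described above, clearing EXT_EN could look like the sketch below. It assumes "18th bit" means bit index 18, and the GIG and FULLDUPLEX bit positions are assumptions that should be checked against SPRUHZ3A. The helper names are hypothetical, and the register address is deliberately not shown. Because the register's own FULLDUPLEX and GIG bits select the mode once EXT_EN is 0, the sketch also sets them explicitly; the thread does not say whether the original fix did this.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAC_CTL_EXT_EN      (1u << 18)  /* bit position as given in this thread */
#define MAC_CTL_GIG         (1u << 7)   /* assumed position; verify in SPRUHZ3A */
#define MAC_CTL_FULLDUPLEX  (1u << 0)   /* assumed position; verify in SPRUHZ3A */

/* Return a Pn_MAC_CTL value with external control disabled and the
 * speed/duplex mode taken from the register bits instead. All other
 * bits are left unchanged. */
static uint32_t mac_ctl_force_mode(uint32_t mac_ctl, bool full_duplex,
                                   bool gigabit)
{
    mac_ctl &= ~(MAC_CTL_EXT_EN | MAC_CTL_GIG | MAC_CTL_FULLDUPLEX);
    if (full_duplex) {
        mac_ctl |= MAC_CTL_FULLDUPLEX;
    }
    if (gigabit) {
        mac_ctl |= MAC_CTL_GIG;
    }
    return mac_ctl;
}

/* Read-modify-write on the register itself; the caller supplies the
 * mapped address of Pn_MAC_CTL for the port in question. */
static void mac_ctl_disable_ext_control(volatile uint32_t *pn_mac_ctl,
                                        bool full_duplex, bool gigabit)
{
    *pn_mac_ctl = mac_ctl_force_mode(*pn_mac_ctl, full_duplex, gigabit);
}
```

With external control off, software has to update these bits itself whenever the PHY reports a new link speed or duplex setting.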

    I wanted to confirm whether errata Advisory 39 also applies to this chip, as it does to the TCI6638.

    No detailed review required at this time.

    Thanks

    Regards,

    Burhan.

  • Hi,

    I didn't see that in the K2E errata. I am checking with the design team for clarification and will update here.

    Regards, Eric

  • Hi, 

    Yes, it is not in the K2E errata. However, the K2K workaround seems to solve the problem.

    That would be very good. I opened this thread to clarify exactly this.

    Thank you for your support Eric.

    Regards,

    Burhan. 

  • Burhan,

    Thanks for your patience. I got confirmation from the design team that Advisory 39 for K2H/K also applies to K2E.

    Regards, Eric

  • Hi Eric,

    This confirms that the change was the right solution.

    Thank you very much for your support.

    Regards,

    Burhan.