CCS/PROCESSOR-SDK-AM335X: ALE age out now problem

Part Number: PROCESSOR-SDK-AM335X

Tool/software: Code Composer Studio

In our system we have trouble with Ethernet connectivity.

We noticed that after 7-8 minutes of inactivity, we can no longer reach our device. Another ARP sequence is needed before the device can be reached again.

For example, a ping to a directly connected device fails for the first two tries. The ping then triggers another ARP, and the following pings succeed.

When the device is connected indirectly through a router, it can never be reached again, since no automatic ARP comes from the router.

 

We noticed that the problem appears when the driver cleans old entries from the ALE table in EMAC_poll_v4. When the aleTimer expires, it calls EMAC_cpswALEAgeOutNow.

When we reduce the timeout value, we see the behavior much earlier, so this is evidently the cause of our problem.
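
To make the mechanism concrete, here is a minimal host-side model of an age-out pass. It is not the PDK driver code: the struct and function names are hypothetical, and it assumes the behavior we understand from the TRM, namely that an age-out pass frees every ageable entry whose touch flag is clear and clears the flag on the entries that survive.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one ALE table entry (not the PDK data structure). */
typedef struct {
    bool in_use;   /* entry currently holds an address */
    bool ageable;  /* false = static entry that is never aged out */
    bool touched;  /* set when the address was seen since the last pass */
} ale_model_entry;

/* Model of one age-out pass (assumed behavior, see the text above):
 * free ageable entries that were not touched, and clear the touched
 * flag on the ones that survive. Returns the number of entries removed. */
static size_t ale_model_age_out(ale_model_entry *table, size_t n)
{
    size_t removed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!table[i].in_use || !table[i].ageable)
            continue;                    /* free or static: leave alone */
        if (table[i].touched) {
            table[i].touched = false;    /* survives one more period */
        } else {
            table[i].in_use = false;     /* silent for a full period */
            removed++;
        }
    }
    return removed;
}
```

In this model, an entry that sees no traffic survives the first pass (its flag is cleared) and is removed by the second one, which would fit a connection that only drops after several minutes of inactivity.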

 

Has anyone noticed the same? Any suggestions how to deal with this?

This issue could also be connected with this post:

http://e2e.ti.com/support/processors/f/791/p/837445/3097018#3097018

 

Processor SDK 5.00.00.15

(PDK 1.0.11; NDK 2.26.00.08)

  • Hi Chris,

    Please check the post https://e2e.ti.com/support/processors/f/791/t/840450#pi320966=5 .

    Please try the following:

    Change "_RtNoTimer = 0" to "_RtNoTimer = 1" in C:\ti_am3_600\ndk_3_60_00_13\packages\ti\ndk\stack\route\rtable.c and rebuild the NDK with ndk.mak. After rebuilding the NIMU_BasicExample_evmAM335x_armExampleproject, I loaded and ran the newly generated OUT file. It worked as expected. I pinged the BBB from the PC after it had been idle for 30 minutes, 2 hours and 10 hours. All of the pings worked.

    Please let us know your test result.

    Ming

  • Hi,

    At the moment, updating to ndk_3_60_00_13 is not an option for us (the TI Arm compiler is no longer supported).

    What does _RtNoTimer do, and how does it affect the ALE?

    If the ALE is bypassed, the ping works, but this cannot be the solution.

  • Hi Chris,

    _RtNoTimer is used in RtTimeoutCheck() in ndk_3_60_00_13\packages\ti\ndk\stack\route\rtable.c

    When the ALE is used with a switch, ping stops working after a few minutes of idle time because the ALE age-out happens. Neither the switch nor the AM335x renews the ARP, and the ARP renewal from the PC connected to the switch cannot reach the EVM either. Therefore the connection is lost for good.

    When the AM335x and the PC are connected directly, the PC sends out the ARP renewal periodically, so the ALE is refreshed periodically. That is why it always works.

    There are two ways you can keep the connection going:

    1. Turn off the ALE age-out timer by commenting out the following:

    EMAC_LOCAL_DEVICE->Config.aleTimerActive = 1; in pdk_am335x_1_0_15\packages\ti\drv\emac\src\v4\emac_cpsw.c. The side effect of this solution is that the 1024-entry ALE table may eventually fill up, but if you know that your number of ALE entries stays below 1024, it is a quick and easy solution.

    2. Add an active component to your network topology (like a PC or a DHCP server) that periodically renews the ARP sequence.

    Ming

  •  

    Hi Ming,

     

    Thank you for your response. What you describe is exactly what we also identified as the reason why the connection gets lost after some minutes.

     

    To your suggestions:

    1) Turn off the ALE ageout timer:

    As you say, there is a risk that the ALE table might fill up after some time, so we prefer not to do this. Instead, we reconfigured the Ethernet subsystem's CPSW_ALE CONTROL register to enable port 0 unicast flooding, that is, EN_P0_UNI_FLOOD = 1. This appears to keep the connection going while keeping the switching functionality alive (which we lost when we enabled ALE BYPASS in the ALE CONTROL register). We think this could result in additional CPU load.
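
    The register change can be sketched as a pure function on the register value. This is only an illustration: the macro and function names are hypothetical, and the bit positions (BYPASS at bit 4, EN_P0_UNI_FLOOD at bit 8) are our reading of the ALE CONTROL register description and should be verified against the TRM for the device in use.

    ```c
    #include <stdint.h>

    /* Bit positions are assumptions taken from our reading of the
     * ALE CONTROL register description; verify against the TRM. */
    #define ALE_CTRL_BYPASS           (1u << 4)
    #define ALE_CTRL_EN_P0_UNI_FLOOD  (1u << 8)

    /* Hypothetical helper: given the current ALE CONTROL value, return the
     * value with bypass disabled and port 0 unicast flooding enabled.
     * All other bits are left unchanged. */
    static uint32_t ale_ctrl_enable_p0_flood(uint32_t ctrl)
    {
        ctrl &= ~ALE_CTRL_BYPASS;          /* keep address lookup active */
        ctrl |= ALE_CTRL_EN_P0_UNI_FLOOD;  /* unknown unicast also goes to the host port */
        return ctrl;
    }
    ```

    The caller would read the register, pass the value through this helper and write the result back. The register access itself is left out because it depends on the driver or CSL layer in use.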

     

    2) Add an active component to the network topology:

    We added periodic pings (every 5 minutes and after restart) to the gateway (which might not actually exist in the network), with the result that the connection stays open.
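
    The scheduling part of such a keepalive can be sketched independently of the network stack. The type and function names below are hypothetical, the millisecond tick source is an assumption, and the actual ping call is omitted because it depends on the stack. The interval should be chosen shorter than the ALE age-out period.

    ```c
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical keepalive bookkeeping (not part of the NDK or PDK). */
    typedef struct {
        uint32_t interval_ms;   /* time between keepalive pings */
        uint32_t last_sent_ms;  /* tick value of the last ping */
        bool     sent_once;     /* false until the first ping after restart */
    } keepalive_state;

    /* Call periodically with a free-running millisecond tick.
     * Returns true when a keepalive should be sent now and records the time.
     * The unsigned subtraction keeps the comparison correct across tick
     * counter wraparound. */
    static bool keepalive_poll(keepalive_state *ka, uint32_t now_ms)
    {
        if (!ka->sent_once ||
            (uint32_t)(now_ms - ka->last_sent_ms) >= ka->interval_ms) {
            ka->sent_once = true;
            ka->last_sent_ms = now_ms;
            return true;
        }
        return false;
    }
    ```

    The first call after a restart always returns true, which matches sending one ping right after startup and then one per interval.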

     

    What would be your preferred solution?

  • Hi Chris,

    I would definitely prefer the second solution.

    Would you mind marking this thread as "Resolved"? Thanks!

    Ming

  • Hi Ming,

    Before we can resolve this thread, would you please elaborate on why you prefer the second solution?

    At the moment, we do not fully understand why (or when) the ping from the host touches the ALE entries. Is it the outgoing ICMP message or the incoming one? What happens if the ping is not answered from outside, for example if the gateway does not exist or goes down before the ALE age-out process runs?

    Regards

  • Hi Chris,

    The reason I prefer the second solution is that the first solution basically removes the restriction on incoming packets. It will definitely increase your CPU overhead, because the CPU has to process unnecessary packets.

    According to the AM572x TRM, the ALE uses received packets to populate its entries, so you do need the unicast response from the pinged device to refresh the ALE. A lost ping response will therefore prevent the ALE refresh. To avoid that, you may need to add some other form of periodic packet traffic, such as status checks from a higher level.

    One more possible solution: if you know the MAC addresses of the target devices ahead of time, you can add them to the ALE and prevent them from being aged out.

    Ming 

    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    The address lookup engine (ALE) processes all received packets to determine which port(s), if any, the packet should be forwarded to. The ALE uses the incoming packet received port number, destination address, source address, length/type, and VLAN information to determine how the packet should be forwarded. The ALE outputs the port mask to the switch fabric that indicates the port(s) the packet should be forwarded to.

    -------------------------------------------------------------------------------------------------------------------------------

     

  • Hello Ming Wei,

    We have implemented and are currently testing a solution that adds the host MAC to the ALE table with the "no-age-out" flag (0) when the device is initialized. We think this is what you meant by preventing the age-out of a specific MAC without having to ping or flood the port, which also avoids the extra CPU overhead.

    Do you see any issues regarding this solution or have comments / hints about it?

    Can there be cases where this solution has problems? We don't see any yet.
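
    For reference, the kind of entry we add can be sketched as follows. This is not our production code: the function names are hypothetical, and the field positions (MAC address in bits 47:0, entry type at bit 60, unicast type at bit 62, port number at bit 66) are an assumption based on the layout used by the Linux cpsw_ale driver for this ALE, so they should be checked against the TRM. Writing the three words to the table through the driver is not shown.

    ```c
    #include <stdint.h>

    #define ALE_ENTRY_TYPE_ADDR    1u  /* assumed: 1 = address entry */
    #define ALE_UCAST_NOT_AGEABLE  0u  /* the "no-age-out" type (0) */

    /* Set a field of at most 31 bits inside a three-word table entry.
     * w[0] holds bits 31:0, w[1] bits 63:32, w[2] bits 67:64.
     * None of the fields used below crosses a word boundary. */
    static void ale_set_field(uint32_t w[3], unsigned start, unsigned bits,
                              uint32_t value)
    {
        uint32_t mask = (1u << bits) - 1u;
        w[start / 32] &= ~(mask << (start % 32));
        w[start / 32] |= (value & mask) << (start % 32);
    }

    /* Build a static (not ageable) unicast entry for `mac` on `port`.
     * Field positions are assumptions, see the text above. */
    static void ale_build_static_ucast(const uint8_t mac[6], unsigned port,
                                       uint32_t w[3])
    {
        w[0] = w[1] = w[2] = 0;
        for (unsigned i = 0; i < 6; i++)
            ale_set_field(w, 40 - 8 * i, 8, mac[i]);  /* mac[0] in bits 47:40 */
        ale_set_field(w, 60, 2, ALE_ENTRY_TYPE_ADDR);
        ale_set_field(w, 62, 2, ALE_UCAST_NOT_AGEABLE);
        ale_set_field(w, 66, 2, port);                /* host port is 0 */
    }
    ```

    Because the unicast type marks the entry as not ageable, an age-out pass should leave it in place, so the host MAC stays in the table without any keepalive traffic.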

    Regards

  • Hi Chris,

    I think the solution you have implemented is the best I can think of at this point.

    If it works for you, please mark this thread as "Resolved".

    Ming