DP83TD510E: More robust link in scenario with millisecond connection losses on physical connection

Part Number: DP83TD510E
Other Parts Discussed in Thread: TMUX7211

Hi,

this is a continuation of this thread: https://e2e.ti.com/support/interface-group/interface/f/interface-forum/1169986/dp83td510e-more-robust-link-in-scenario-with-millisecond-connection-losses-on-physical-connection/4430923#4430923

We tested the DP83TD510E by quickly disconnecting and reconnecting the communication lines using a solid state switch. We found that a disruption shorter than 500us does not lead to a link failure, and sometimes even a 1ms disruption does not. Furthermore, we found that interruptions of up to 10ms lead to a link retrain with a wait time of 200ms, whereas with interruptions from 10ms to 200ms the PHY can recover the link and does not need a retrain procedure. Unfortunately, your suggestion to change register 0x502 (error count before link drop, disable descrambler unlock) does not seem to affect the behavior after disrupting the link (except that the link status register stays in the link established state if “disable descrambler unlock” is set, even if the link is broken).
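
The measured ranges above can be summarized as a small lookup. This is only a sketch of the observations reported in this post: the function name and the outcome labels are illustrative, the boundaries are approximate, and disruptions longer than 200ms were not part of these tests.

```python
def observed_outcome(disruption_s: float) -> str:
    """Map a disruption length in seconds to the behavior reported in this post."""
    if disruption_s < 0:
        raise ValueError("disruption length must be non-negative")
    if disruption_s < 500e-6:
        # Short glitches did not cause a link failure.
        return "no link failure"
    if disruption_s <= 1e-3:
        # Around 1 ms the link sometimes survived and sometimes retrained.
        return "sometimes no link failure, otherwise retrain"
    if disruption_s < 10e-3:
        # The problematic window: full retrain after a 200 ms wait.
        return "link retrain"
    if disruption_s <= 200e-3:
        # Longer disruptions were recovered without a retrain.
        return "link recovered without retrain"
    return "not covered by these tests"
```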

When looking at the IEEE 802.3cg standard, this does not seem to be the intended behavior. If the scrambler lock is lost, the PHY should return to the SEND IDLE state. The scrambler should then be able to regain the lock. A full retrain should only occur if the maxwait_timer (200ms ±2ms) runs out and the lock is not regained during that time. The standard stresses this:

„NOTE—After a disturbance on the link segment, e.g., when the current consumption on a powered link segment is quickly changed, the maxwait_timer allows the PHYs to stay in the SEND IDLE state before going to the SILENT state. This allows the PHYs to attempt to recover the link before a full retrain.“ (“IEEE Standard for Ethernet - Amendment 5: Physical Layer Specifications and Management Parameters for 10 Mb/s Operation and Associated Power Delivery over a Single Balanced Pair of Conductors”, p. 144)
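
For comparison, the behavior we would expect from the description above can be sketched as follows. This is a simplified model of the text in this post, not the full PHY Control state diagram from the standard; the function name and the return labels are assumptions made for illustration.

```python
from typing import Optional

MAXWAIT_TIMER_S = 0.200  # nominal maxwait_timer of 200 ms (±2 ms) as quoted above


def expected_outcome(lock_regained_after_s: Optional[float]) -> str:
    """Expected result after the scrambler lock is lost.

    lock_regained_after_s is the time from losing the lock until it is
    regained, or None if it is never regained.
    """
    # On loss of lock the PHY returns to SEND IDLE and maxwait_timer runs.
    if lock_regained_after_s is not None and lock_regained_after_s < MAXWAIT_TIMER_S:
        # Lock regained while still in SEND IDLE: link recovers.
        return "recover"
    # Timer expired without lock: PHY goes to SILENT and retrains fully.
    return "retrain"
```

Under this model, a disturbance of a few milliseconds should end in "recover", which is not what we measured in the 500us to 10ms range.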

Best Regards

  • Hello,

    Thank you for your reply. Please allow me until Wednesday 2/15 to get back to you.

    Sincerely,

    Gerome

  • Hello,

    I have relayed this information to our internal team for comment. I hope to get a response back next week. Please note TI is on holiday 2/20/23 in observance of President's day.

    Sincerely,

    Gerome

  • Hello,

    Some questions from internal team:

    1. How are the disruptions created? Shorting MDI lines or something?
    2. The above observations are counterintuitive, i.e. a disruption that is present for a shorter time gives a worse result?
    3. How did you conclude whether the PHY did a retraining or not? There is no link drop in either case, correct?

    Sincerely,

    Gerome

  • Hello Gerome,

    Thank you for your reply. To the questions from the team:
    1. The disruptions were created by quickly disconnecting and reconnecting both MDI lines with a solid state switch (TMUX7211).
    3. We observed the MDI lines before and after the solid state switch with a differential probe on an oscilloscope (blue and red trace). We also monitored the Ethernet packets through the PHY with the trigger output of a capturing device (yellow trace) and the state of the solid state switch (green trace). The following image shows a disruption of 1ms.
     

    [https://imgur.com/a/SfOZkv7, 2562x1396]
     

    A retrain is clearly visible because both PHYs first go silent, then send link pulses, and then send training sequences at a lower amplitude before sending at full amplitude.
     

    The following image shows a disruption of 50ms:


    [https://imgur.com/a/DO4A7Um, 2562x1396]


    The PHY recovers the link without a full retrain.
     

    2. We also think that the observations are counterintuitive, but the following could be happening: Short disruptions (<500us) do not lead to a scrambler unlock and therefore not to a link drop. During long disruptions (>10ms) the PHY correctly transitions to the SEND_IDLE state and is able to regain synchronization to recover the link. During disruptions between 500us and 10ms, some register or part of the logic has not transitioned to the correct state, and a link recovery is therefore not possible.

    Sincerely,

    Kristian

  • Hello Kristian,

    20th Feb is a US public holiday. I'll let Gerome respond as soon as he's back to office.

    --
    Regards,
    Gokul.

  • Hi Kristian,

    Please allow me time to bring this information to the team. At worst, I expect an update by EoW.

    Sincerely,

    Gerome

  • Hi Kristian,

    I apologize for the delay. I will ping our internal team regarding your query and get back to you next week.

    Sincerely,

    Gerome

  • Hi Kristian,

    Can you confirm the concern is that for disruptions between 500us and 10ms you see the link drop and a retraining, whereas for longer disruptions it recovers without complete retraining?

    In no case does the PHY get completely stuck, correct?

    Sincerely,

    Gerome

  • Hi Gerome,

    that is absolutely correct, the PHY doesn't get stuck.

    Sincerely,

    Kristian

  • Hi Kristian,

    I have relayed this to our internal team. Please note that I am out of office until next Tuesday, and I hope to reply with an update upon my return from the team.

    Sincerely,

    Gerome

  • Hi Kristian,

    To understand the context of your query: you only want to understand why, for disruptions of 500us to 10ms, the link drops and retraining is needed, whereas for longer disruptions it recovers without retraining?

    Sincerely,

    Gerome

  • Hi Gerome,

    are there any updates from the team? Have they been able to replicate the issue?

    For more context:
    During field tests we had issues with link drops due to vibrations at the connectors. In measurements we determined that the channel is only ever disturbed for a few milliseconds. This had us puzzled, because according to the 802.3cg standard the PHY should be able to recover the link with disruptions up to 200ms. Therefore we did some further investigation and lab tests, the results of which we posted earlier.

    For our application it is important that the connection is not interrupted for more than about 100ms. A retrain takes much longer than that (up to 3s) and will lead to a full stop of the system, since the communication is critical. We therefore want to know whether there is a way for the PHY to recover the link with the 500us to 10ms disruptions as well. Since the PHY is able to recover from longer disruptions, it should be possible from our perspective.

    Single Pair Ethernet is a great technology and solves a lot of problems for us, but with this issue it is currently unusable for us.

    Sincerely,

    Kristian

  • Hi Kristian,

    Thank you for the context. I have relayed this to the team, and I do think this will help clarify. Please expect a response by mid-week.

    Sincerely,

    Gerome

  • Hi Kristian,

    After discussing with the team, we understand the concern regarding retraining and your system. We wanted to understand a few things:

    - What is your timeline for a solution?

    - Can you confirm that you are not using any register configuration and are running default settings? If not, please share your configuration.

    - How repeatable are your observations regarding retraining for 500us-10ms disruptions? Is this consistent over many ICs?

    Sincerely,

    Gerome

  • Hi Gerome,

    • We need a solution within a maximum of 2 months (the sooner, the better, of course).

    • We are running default settings, but we enabled the packet generator in some tests by writing 0x64 in register 0x11A and 0x557 in register 0x119. The behavior did not change with the packet generator enabled or disabled. We reset the PHY using the reset pin at the start of the test and introduced a fault of defined length every ten seconds. The PHY was not reset during the tests. We triggered the oscilloscope at the start of the fault and observed the behavior of the PHY.

      We also tried changing the settings in register 0x502 and manually configuring master and slave using 0x1834 and 0x202, which also did not change the behavior.
    • The tests were carried out with four devices. The behavior did not change in any combination of devices, so we are somewhat sure that it is not a one-off.
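
The test setup described above can be sketched roughly as follows. The register addresses and values are the ones quoted in this reply; everything else is an assumption for illustration: `write_reg` is a placeholder for whatever register access the host provides (it is not a TI API), and the schedule assumes the first fault starts at time zero after reset.

```python
from typing import Callable, Iterator, Tuple

# (register, value) pairs quoted in the post for enabling the packet generator.
PACKET_GENERATOR_WRITES = [(0x11A, 0x64), (0x119, 0x557)]


def enable_packet_generator(write_reg: Callable[[int, int], None]) -> None:
    """Apply the quoted writes through a caller-supplied register-write function."""
    for reg, value in PACKET_GENERATOR_WRITES:
        write_reg(reg, value)


def fault_schedule(
    fault_len_s: float, count: int, period_s: float = 10.0
) -> Iterator[Tuple[float, float]]:
    """Yield (switch_open_time, switch_close_time) in seconds, one fault per period."""
    for i in range(count):
        start = i * period_s  # one fault every ten seconds by default
        yield (start, start + fault_len_s)
```

Passing a function that only records its arguments as `write_reg` is an easy way to check the write sequence before touching real hardware.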

    Sincerely,

    Kristian