DP83867IR: Ethernet PHY information

Part Number: DP83867IR
Other Parts Discussed in Thread: TMDX654IDKEVM, TMDS64EVM

Hi,

I'm having trouble with the link connection of the DP83867IR PHY, an issue that I believe is already documented in the latest Application Note Troubleshooting Guide (Rev. C).

My case seems to be the one described in paragraph 3.5, third bullet point ("Read register 0x0005[15] and If 0x0005 bit[15] = 1"). I say "seems" because I have not yet managed to read register 0x0005 as suggested, but the behavior looks the same. I have two different boards using this PHY, and both have problems connecting over the network managed by it.
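For context, the check quoted above reduces to testing one bit of the raw MDIO read. A minimal sketch (the helper name is mine; the meaning of bit 15 of register 0x0005 is taken from the guide as quoted):

```c
#include <stdint.h>
#include <stdbool.h>

/* Per the troubleshooting guide quoted above: if register 0x0005 bit[15]
 * reads 1, the PRN-deadlock condition described in section 3.5 is
 * suspected. The raw register value would come from the RTOS MDIO driver.
 */
static bool dp83867_prn_deadlock_suspected(uint16_t reg_0x0005)
{
    return (reg_0x0005 & (1u << 15)) != 0;
}
```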

I'm working on implementing the proposed solution in the driver used by my RTOS, but I would like more details on the reported information. The Application Note says that "the PRN is not exactly random and if both DP83867 start auto-negotiation at the same time there is a possibility both DP83867 send out the exact same random seed (PRN) and result in dead lock." This looks very similar to my setup, since the problem occurs more often when the boards are powered on simultaneously. However, it is not very clear what "at the same time" means: is there any reference for this time window? How long a delay is necessary to avoid this issue?

Thanks.

Andrea

  • Hi Andrea,

    We do not have an exact reference for this time window.

    Please try the suggested workaround noted in section 3.5 to avoid adding power sequence constraints between PHYs.

    If this workaround resolves the link when applied manually, it can be implemented in the RTOS driver for each PHY for proper operation on start-up.
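    For reference, forcing the 1000Base-T master/slave role comes down to a read-modify-write of the standard IEEE 1000BASE-T Control register (address 0x0009). A minimal sketch of the value computation, assuming the driver already provides MDIO access (the helper name `dp83867_ms_ctrl_value` is hypothetical):

```c
#include <stdint.h>
#include <stdbool.h>

/* IEEE 802.3 1000BASE-T Control register (address 0x0009):
 *   bit 12: manual master/slave configuration enable
 *   bit 11: configuration value (1 = master, 0 = slave)
 */
#define MII_CTRL1000       0x0009u
#define CTL1000_MANUAL_MS  (1u << 12)
#define CTL1000_AS_MASTER  (1u << 11)

/* Compute the new register value from the current one. */
static uint16_t dp83867_ms_ctrl_value(uint16_t ctrl1000, bool master)
{
    ctrl1000 |= CTL1000_MANUAL_MS;
    if (master)
        ctrl1000 |= CTL1000_AS_MASTER;
    else
        ctrl1000 &= (uint16_t)~CTL1000_AS_MASTER;
    return ctrl1000;
}
```

    In the driver, the result would be written back over MDIO, followed by an auto-negotiation restart so the new setting takes effect.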

    Thank you,

    Evan

  • Hi Evan,

    Thanks for your answer.

    As mentioned in my first post, I'm working on implementing the solution in the driver. Fingers crossed.

    I asked about the time window because I would like to understand the root cause of this issue. In the DTS from a commercial distributor of an AM65xx-based baseboard, I found this:

    &icssg0_mdio {
          /* ti,force-master is required due to the high jitter of our clock
           * generator. This is a workaround and prevents connections to other
           * devices that also require to use master mode
           */
          pruss0_eth0_phy: ethernet-phy@1 {
                reg = <1>;
                ti,rx-internal-delay = <DP83867_RGMIIDCTL_2_00_NS>;
                ti,fifo-depth = <DP83867_PHYCR_FIFO_DEPTH_4_B_NIB>;
                ti,force-master;
          };
    };

    According to the comment, they identified the root cause of this issue as "high jitter" on the PHY clock. However, I checked the jitter requirement of the PHY and compared it to mine, and it seems fine (required: 11 ps RMS, provided: 200 fs RMS). So I don't think "high jitter" is the real issue.

    Could you please provide me with more info about it?

    Thanks.

    Andrea

  • Hi Andrea,

    I believe this DTS comment is referring to jitter on the AM65-generated clock (the GTX_CLK input to the PHY).

    The cause of this issue is independent of clock jitter. Regardless of clock jitter, if the PHYs start up at the same time, their internal state machines will continuously attempt to link with the same PRN seed sequence. In this case, they will be unable to link because the seeds are identical.

    The workaround is to create a delay between PHY power-ups, or to apply the proposed solution to remove the dependency on the PRN seed for link.

    Thank you,

    Evan

  • Hi Evan

    Thanks again for your precise support. This seems to make more sense to me, although I still have some questions.

    Why did this not happen with TI evaluation boards using the same processor and PHYs? I implemented the same setup using the TMDX654IDKEVM communicating with the TMDS64EVM, but I never encountered this issue before. What's the difference in that case?

    I need to understand this because I'm designing a custom carrier board for an AM65 processor. In my project, two of these boards will communicate via the PRUSS-based ETH port, so I need to be sure they link reliably and know how to fix this error.

    Thanks.

    Andrea

  • I forgot to mention another aspect of the bug I encountered: sometimes, regardless of whether the boards started simultaneously, they stop communicating after a while. Other times, they do not start communicating at boot, but only after a varying amount of time (a few minutes).

    If the issue is due to the simultaneous start (which in turn makes the boards generate the same PRN), why do I see the problem even with a non-simultaneous start?

    Thanks.

    Andrea

  • Hi Andrea,

    I implemented the same setup using the TMDX654IDKEVM communicating with the TMDS64EVM, but I never encountered this issue before.

    In the case of two different boards communicating, there is likely some hardware/layout difference in the power scheme causing the PHY power sequences to complete with some time difference, regardless of simultaneous power input.

    I forgot to mention another aspect of the bug I encountered: sometimes, regardless of whether the boards started simultaneously, they stop communicating after a while. Other times, they do not start communicating at boot, but only after a varying amount of time (a few minutes).

    This seems to be a separate issue from the PRN deadlock. How often does this bug occur? Are the PHY/MCU power rails and input clocks stable when it occurs?

    Thank you,

    Evan

  • Hi Evan,

    Regarding the setup, I agree with you that a few differences between the two boards may prevent the bug from appearing. Actually, in some tests I also connected two TMDX654IDKEVM boards and didn't experience this issue. Unfortunately, I ran very few tests with this setup, a long time ago, so I can't be very sure about the results. I'll try to run more, longer tests.

    For the second point, the bug occurs very often, in some test sessions even at each restart of the system, after a few minutes. I suspected it was related to the same issue, hypothesizing that the PHYs would perform a re-negotiation after a while. If so, the deadlock scenario that didn't occur at start-up could appear later. Is this possible? Regarding the stability of the PHY/MCU power rails and input clocks, I don't have an answer, since I didn't measure them during the tests. Do you have any idea what the problem could be in this case?
    A bit more information on this point: I experienced this bug both with two identical boards and with two different ones. In the latter setup, the communicating boards have the same SOM module (AM65xx-based) but different carrier boards; they use the same PHY. Additionally, in this setup we also experienced the "deadlock-like" issue many times.

    Sorry if I'm providing all this info in such a chaotic way, but I didn't expect the discussion to go this far. Please let me know if there are other tests I could run and what kind of data would help.

    Thanks.

    Andrea

  • Hi Andrea,

    It's possible for the deadlock to occur after power-up, if both PHYs restart the auto-negotiation process at the same time. I am unclear whether this is the case for the issue seen.

    The simplest way to confirm whether the issue is PRN-related is to apply the workaround and see if the same behavior occurs.
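    One way to keep two boards from restarting auto-negotiation in lockstep is to insert a small, board-specific delay before issuing the restart. This is a sketch, not something from the Application Note: the 0..255 ms range and the MAC-address seeding are my assumptions.

```c
#include <stdint.h>

/* Derive a small per-board backoff (in ms) from the port's MAC address,
 * so two boards powered at the same instant do not restart
 * auto-negotiation at exactly the same time.
 * The 0..255 ms range is an arbitrary assumption.
 */
static uint32_t autoneg_backoff_ms(const uint8_t mac[6])
{
    uint32_t h = 0;
    for (int i = 0; i < 6; i++)
        h = h * 31u + mac[i];   /* simple deterministic hash of the MAC */
    return h % 256u;
}
```

    The driver would wait this long before setting the auto-negotiation restart bit (bit 9 of the Basic Mode Control register, address 0x0000).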

    Regarding the stability of the PHY/MCU power rails and input clocks, I don't have an answer, since I didn't measure them during the tests. Do you have any idea what the problem could be in this case?

    Link failure after some time could be a symptom of instability on the PHY or MCU power rails or clock inputs. During the time window when the link is up, does communication work as expected, without errors?

    Thank you,

    Evan

  • Hi Evan,

    I'll let you know whether both bugs persist after patching the driver for the master/slave configuration.

    However, I have a few more doubts:

    • the baseboards used during the tests have 7 Ethernet ports (6 PRUSS-based, 1 MCU-based), all managed by the same PHY model. During my tests, only one PRUSS-based port and the MCU one were connected, on two different sub-nets (192.168.10.x for PRUSS, 192.168.0.x for MCU). How should all of these be configured? One master and all the others slaves? One master per sub-net? What if both networks pass through an ETH switch?
    • I just read the Application Note mentioned in my first post again: it says that the errata I referred to applies only to the old revision of the DP83867 (register 0x0003 = 0xA0F1). I read that register on my PHY and it returns 0xA231. Does this mean the bug should not be present in my version? If so, what is the bug I experienced?

    Thanks.

    Andrea

  • Hi Andrea,

    Sounds good, I will wait for the results.

    Does each of the Ethernet ports have its own MPU/MCU? Please share a block diagram if possible to help me understand the topology.

    This bug should not be present on the newer silicon revision, but I still recommend implementing the workaround to confirm any differences seen.
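    The revision check discussed here reduces to comparing the register value read earlier in the thread against the one named in the Application Note. A minimal sketch (the helper name is mine):

```c
#include <stdint.h>
#include <stdbool.h>

/* Per the Application Note as quoted earlier in the thread, register
 * 0x0003 (PHY ID register 2) reading 0xA0F1 identifies the old DP83867
 * revision affected by the PRN errata; the boards discussed here read
 * 0xA231 instead.
 */
static bool dp83867_has_prn_errata(uint16_t phyidr2)
{
    return phyidr2 == 0xA0F1u;
}
```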

    Thank you,

    Evan

  • Hi Evan,

    The ETH ports are all connected to the same MPU/MCU. For the block diagram, you can refer to the TMDX654IDKEVM; it is very similar to my board (same processor, same PHY, same PRUSS).

    Thanks.

    Andrea

  • Hi Andrea,

    In this case, the PHYs should all be set to the master/slave mode opposing the MPU.

    E.g., MPU set as master, all PHYs set as slave.

    I'm unclear whether the MPU will need distinct IP addresses for each of these ports (is this a switching application, or do all ports need to be active simultaneously?).

    For further information on this, I recommend opening a new thread to the Sitara team.

    I can help address further PHY-related queries here.

    Thank you,

    Evan

  • Hi Evan,

    I think maybe I was not clear in my last message.

    For my architecture, I have an AM65xx processor connected to 6 PHYs driving 6 ETH ports. My doubt concerns all these PHYs: should I set them all as masters? What do you mean by "opposing the MPU"? Should I also set the MPU as master/slave? I haven't read about any master/slave configuration on the CPU side; do you have any reference for this?

    In this architecture, these ports will either be bridged or set up independently (each with a dedicated IP); this is not yet defined.

    Thanks.

    Andrea

  • Hi Andrea,

    I will confirm with team and get back to you tomorrow.

    Thank you,

    Evan

  • Hi Andrea,

    Manual master/slave configuration on the PHY side is not required for RGMII; this is resolved through auto-negotiation.

    I'm not clear on the appropriate IP/subnet assignments for these ports; can you create a new thread targeted at the AM65xx team to address this?

    I can help address further PHY-related queries here.

    Thank you,

    Evan

  • Hi Evan,

    I'm a bit confused: why should I open a new thread?

    My question is still related to the PHY topic; I'm just not sure how to configure all the PHYs connected to the same board. What do you mean by "manual master/slave configuration on the PHY side is not required for RGMII"? Does this change the scenario of a DP83867-to-DP83867 connection?

    From previous messages, I understood that for an ETH connection involving this specific PHY on both sides, the manual configuration is mandatory to avoid the deadlock scenario. Does this change if the connection is RGMII?

    Thanks

    Andrea

  • Hi Andrea,

    The new thread is intended to address your question on IP address / subnet assignment with the processor. I will continue addressing PHY-related queries here.

    I understood that for an ETH connection involving this specific PHY on both sides, the manual configuration is mandatory to avoid the deadlock scenario. Does this change if the connection is RGMII?

    This is correct, and does not change for RGMII connection. Sorry for the confusion here.

    There are two separate master/slave configurations:

    1) RMII master/slave (decides clocking connections, will be set between MAC/PHY)

    2) 1000Base-T master/slave (decides auto-negotiation priority, will be set between PHY/PHY)

    Considering only (2), our goal is to confirm that each pair of link partners is configured as a master-slave pair.
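    To verify (2) on a running link, both relevant fields live in the standard 1000BASE-T Status register (address 0x000A): bit 15 reports a master/slave configuration fault and bit 14 reports the resolved role (1 = master). A minimal sketch (the helper name is mine) that checks whether a pair of link partners resolved to complementary roles:

```c
#include <stdint.h>
#include <stdbool.h>

/* IEEE 802.3 1000BASE-T Status register (address 0x000A):
 *   bit 15: master/slave configuration fault
 *   bit 14: resolution (1 = this PHY resolved as master, 0 = slave)
 * Takes the register value read from each end of the same cable.
 */
static bool link_is_master_slave_pair(uint16_t stat1000_a, uint16_t stat1000_b)
{
    bool fault = ((stat1000_a | stat1000_b) & (1u << 15)) != 0;
    bool a_master = (stat1000_a & (1u << 14)) != 0;
    bool b_master = (stat1000_b & (1u << 14)) != 0;
    return !fault && (a_master != b_master);   /* exactly one master */
}
```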

    Thank you,

    Evan

  • Hi Evan,

    Thanks. Now it is clearer to me.

    I also think I wasn't very clear regarding the processor. My question is not about IP assignment but about PHY configuration. Given that I have many ETH ports on the same board (each with a dedicated PHY connected to the same CPU), does that change anything in the master/slave config? Is the rule always master-to-slave following the cable connecting the two ports, or do the PHYs interact internally in some way that requires a specific configuration?

    I hope this is clearer now, for any doubt just let me know.

    Thanks.

    Andrea

  • Hi Andrea,

    Having many ETH ports on the same CPU should not change the 1000Base-T master/slave configuration required.

    Is the rule always master-to-slave following the cable connecting the two ports

    This is correct; this setting is independent of the CPU side.

    Hope this helps. Does configuring according to this rule cause any issues for link?

    Thank you,

    Evan

  • Hi Evan,

    Thanks for clarifying this aspect.

    The master/slave solution certainly makes the architecture more robust, but I'm still facing some issues.

    I need to conduct a deeper investigation to determine whether the problem is still related to the PHY or something else.

    Thanks.

    Andrea

  • Hi Andrea,

    Sounds good, please share further details on the failure rate and possible root causes when able.

    Thank you,

    Evan