DS110DF111: Ports Not Coming Up Every 16 Reboot Cycles

Part Number: DS110DF111
Other Parts Discussed in Thread: TMUXHS4212

Hi,

We have a design using two DS110DF111 retimers connected to a Marvell processor on one side and SFP+ modules on the other.  We are seeing a strange issue: on every 16th software reboot of our system, the SFP port does not operate correctly. Looking deeper, it seems that the link between the Cavium processor and the DS110DF111 has not completed negotiation. After another soft reboot, the link comes up fine. This repeats every 16 soft reboots. A power cycle resets the cycle. The DS110DF111 has no reset pin, and while it is in the bad state, setting any of the RESET bits in the registers seems to have no effect. We've also tried a CDR reset and the link still refuses to come up. Is this 16-cycle behaviour something that has been seen before? Is there anything in the DS110DF111 that could change after 16 cycles of link up / down?

Thanks for your help

Mark
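
One way to confirm that the failures really are periodic is to log the link result of each soft reboot and check the spacing between failures. The following is a minimal sketch, assuming a hypothetical log with one boolean per reboot since power-up (True meaning the link came up); the function name and the sample log are illustrative and not taken from the thread.

```python
def failure_period(results):
    """Return the constant spacing between failures, or None if irregular.

    results: list of booleans, one per soft reboot since power-up
             (True = link came up, False = link failed).
    """
    failures = [i for i, ok in enumerate(results) if not ok]
    if len(failures) < 2:
        return None  # not enough failures to measure a spacing
    gaps = {b - a for a, b in zip(failures, failures[1:])}
    return gaps.pop() if len(gaps) == 1 else None


# Hypothetical log: 48 soft reboots in which every 16th one fails.
log = [(n % 16) != 15 for n in range(48)]
print(failure_period(log))
```

A constant spacing points toward something that counts events, whereas an irregular spacing would point more toward a marginal signal.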

  • Hi Mark,

    This seems like unusual behavior and is not something I have specifically encountered before.  Could you provide estimates of the insertion loss between the retimer and the SFP+ modules, and between the retimer and the processor?

    Also, would it be possible for you to provide a register dump of both regular operation and the abnormal state?

    Thanks,

    Drew

  • Thanks Drew,

    We think this is probably a problem with the processor end of the link; however, below is a list of register differences between the working and bad states. I also have a question about auto-negotiation. Our processor interfaces are set up to auto-negotiate the link speed at startup using 1000BaseX. I can see no auto-negotiation settings in the retimer. Presumably none are required, as it's just a retimer 'pipe'. Does this mean that the processor is auto-negotiating with whatever is connected to the other side of the retimer, in our case an SFP module?

    Thanks Again

    Mark

    Register  Values                 Meaning
    0x2       0x00 vs 0x9c or 0xdc   Missing Lock and CDR Lock, plus a couple of Reserved bits
    0x24      0x40 vs 0x00           DFE Error - No Lock set
    0x27      0x00 vs 0x3d or 0x3e   HEO Value
    0x28      0x00 vs 0x93           VEO Value
    0x29      0x00 vs 0x40           Eye Opening Monitor Voltage Range Setting
    0x3b      0x2f vs 0x31           PPM Count MSB
    0x3c      0x48 vs 0xfd or 0xfe   PPM Count LSB
    0x52      0x95 vs 0x0            CTLE Boost setting readback register
    0x5a      0xfd vs 0x69           Undocumented
    0x92      0xf9 vs 0x0            Undocumented, might not actually exist!
    0x9a      0xf9 vs 0x69           Undocumented, might not actually exist!
    0xb2      0x80 vs 0x0            Undocumented, might not actually exist!
    0xba      0x79 vs 0x69           Undocumented, might not actually exist!
    0xf2      0xb0 vs 0x0            Undocumented, might not actually exist!
    0xfa      0xfd vs 0x69           Undocumented, might not actually exist!
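
A difference list like the one above can be produced by comparing two full register dumps. The following is a minimal sketch, assuming each dump has already been read into a dictionary mapping register address to value; the function names are hypothetical, three entries reuse values from the table, and the 0x0a entry is an invented placeholder for a register that does not change.

```python
def diff_dumps(first, second):
    """Return {addr: (first_value, second_value)} for registers that differ."""
    addrs = sorted(set(first) | set(second))
    return {
        a: (first.get(a), second.get(a))
        for a in addrs
        if first.get(a) != second.get(a)
    }


def fmt(value):
    """Format a register value, or '--' if it is missing from a dump."""
    return "--" if value is None else f"0x{value:02x}"


# First and second values from the table, in the order they are listed there.
# 0x0a is a hypothetical unchanged register, included only for illustration.
first_dump = {0x02: 0x00, 0x0A: 0x10, 0x24: 0x40, 0x52: 0x95}
second_dump = {0x02: 0x9C, 0x0A: 0x10, 0x24: 0x00, 0x52: 0x00}

for addr, (a, b) in diff_dumps(first_dump, second_dump).items():
    print(f"0x{addr:02x}  {fmt(a)} vs {fmt(b)}")
```

Status registers such as the eye monitor and PPM count change between reads, so some differences would be expected even between two dumps of a working link.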

  • Hi Mark,

    To answer your question about how the retimer behaves during auto-negotiation: we would expect it to lock to the 1.25 Gbps auto-negotiation signal during negotiation, and then lock to the 10.3125 Gbps signal after negotiation (assuming you are using the default data rates; please let me know if this is not the case).  While locked to the auto-negotiation signal, it should simply pass the auto-negotiation data through.

    It seems that for some reason, every 16th reboot the retimer cannot achieve CDR lock.  We can see that the CTLE boost setting has changed, indicating that the retimer did try to adapt CTLE.

    One puzzling point is that the link doesn't come up after a CDR reset.  Can you check whether performing a CDR reset enables the device to achieve CDR lock?  If auto-negotiation fails, does the processor continue to send data?  The retimer needs valid data to lock to for the reset to be effective.

    Another idea: we could rule out any adaptation issues on the retimer by changing from adapt mode 2 (the default) to adapt mode 0 in ch_reg_0x31[6:5].  Adapt mode 0 performs no adaptation, so you would need to manually set whatever CTLE and DFE values you find to be optimal.  The reason I bring this up is that the CTLE value seems rather high in the failed set of registers, assuming CTLE index 0 works fine normally.

    Thanks,

    Drew
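
The adapt-mode change suggested above is a read-modify-write of bits [6:5] of channel register 0x31. The following is a minimal sketch, assuming caller-supplied `read_reg` and `write_reg` accessors (hypothetical; they stand in for whatever SMBus/I2C access the system provides). It does not cover channel selection or the manual CTLE and DFE settings that adapt mode 0 would then require.

```python
ADAPT_MODE_REG = 0x31                        # ch_reg_0x31, per the reply above
ADAPT_MODE_SHIFT = 5
ADAPT_MODE_MASK = 0b11 << ADAPT_MODE_SHIFT   # bits [6:5]


def with_adapt_mode(reg_value, mode):
    """Return reg_value with bits [6:5] replaced by mode, other bits unchanged."""
    if not 0 <= mode <= 3:
        raise ValueError("adapt mode must fit in two bits (0..3)")
    cleared = reg_value & ~ADAPT_MODE_MASK & 0xFF
    return cleared | (mode << ADAPT_MODE_SHIFT)


def set_adapt_mode(read_reg, write_reg, mode):
    """Read-modify-write the adapt mode field; returns (old, new) register values.

    read_reg(addr) -> int and write_reg(addr, value) are hypothetical
    accessors supplied by the caller.
    """
    old = read_reg(ADAPT_MODE_REG)
    new = with_adapt_mode(old, mode)
    write_reg(ADAPT_MODE_REG, new)
    return old, new


# Demo with a dict standing in for the device. The starting value 0x40 is an
# assumption chosen so that bits [6:5] read as mode 2.
regs = {ADAPT_MODE_REG: 0x40}
old, new = set_adapt_mode(regs.__getitem__, regs.__setitem__, 0)
print(f"0x{old:02x} -> 0x{new:02x}")
```

Preserving the other bits of the register matters here because only the two-bit field is meant to change.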

  • Hi Drew,

    We have narrowed this down to a TI switch. Our Marvell processor uses two XCVRs to connect to the SFP (via the retimer). One is fixed at 10G and the other is fixed at 1G (for various reasons in the processor configuration). These two XCVRs connect to a bidirectional switch, the TMUXHS4212IRKSR, which selects one of the two XCVRs' data to pass through to the retimer, and likewise in the other direction. We had not previously considered this switch as part of the problem and had focused on the processor and retimer. However, it seems that the switch is getting into some kind of latch-up state and not allowing data to pass in either direction. As described previously, this does not happen every time but seems to occur once in every 16 cycles of the ports coming up. Any ideas?

    Thanks again for your ongoing help.

    Regards

    Mark

  • Hi Mark,

    Glad to hear that you were able to narrow down the issue!  I will follow up with a team member who supports the TMUXHS4212 and get back to you on this.

    Thanks,

    Drew

  • Hi Mark,

    What is the voltage swing on your 1G signal?

    Thanks,

    Drew