DS110DF111: Ports Not Coming Up Every 16 Reboot Cycles

Mark Stephenson

Part Number: DS110DF111
Other Parts Discussed in Thread: TMUXHS4212

Hi,

We have a design using two DS110DF111 devices, connected to a Marvell processor on one side and SFP+ modules on the other. We are seeing a strange issue: on every 16th software reboot of our system, the SFP port does not operate correctly. Looking deeper, it seems that the link between the Cavium and the DS110DF111 has not completed negotiation. After another soft reboot, the link comes up fine. This repeats every 16 soft reboots, and a power cycle resets the count. The DS110DF111 has no reset pin, and while it is in the bad state, setting any of the RESET bits in the registers seems to have no effect. We've also tried a CDR reset, and the link still refuses to come up. Is this 16-cycle behaviour something that has been seen before? Is there anything in the DS110DF111 that could change after 16 cycles of link up/down?

Thanks for your help

Mark

over 3 years ago

0 Drew Miller1 over 3 years ago

TI__Mastermind 37483 points

Hi Mark,

This seems like unusual behavior and is not something I have encountered before. Could you provide estimates of the insertion loss between the retimer and the SFP+ modules, and between the retimer and the processor?

Also, would it be possible for you to provide a register dump of both regular operation and the abnormal state?

Thanks,

Drew

0 Mark Stephenson over 3 years ago in reply to Drew Miller1

Prodigy 10 points

Thanks Drew,

We think this is probably a problem with the processor end of the link; however, here is a list of register differences between the working and bad states. I also have a question about auto negotiation. Our processor interfaces are set up to auto negotiate the link speed at startup using 1000BaseX. I can see no auto negotiation settings in the retimer. Presumably this is not required, as it's just a retimer 'pipe'. Does this mean that the processor is auto negotiating with whatever is connected to the other side of the retimer, in our case an SFP module?

Thanks Again

Mark

Register	Values (bad state vs working state)	Meaning
0x2	0x00 vs 0x9c or 0xdc	Missing Lock and CDR Lock, plus a couple of Reserved bits
0x24	0x40 vs 0x00	DFE Error - No Lock set
0x27	0x00 vs 0x3d or 0x3e	HEO Value
0x28	0x00 vs 0x93	VEO Value
0x29	0x00 vs 0x40	Eye Opening Monitor Voltage Range Setting
0x3b	0x2f vs 0x31	PPM Count MSB
0x3c	0x48 vs 0xfd or 0xfe	PPM Count LSB
0x52	0x95 vs 0x0	CTLE Boost setting readback register
0x5a	0xfd vs 0x69	Undocumented
0x92	0xf9 vs 0x0	Undocumented, might not actually exist!
0x9a	0xf9 vs 0x69	Undocumented, might not actually exist!
0xb2	0x80 vs 0x0	Undocumented, might not actually exist!
0xba	0x79 vs 0x69	Undocumented, might not actually exist!
0xf2	0xb0 vs 0x0	Undocumented, might not actually exist!
0xfa	0xfd vs 0x69	Undocumented, might not actually exist!
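A comparison like the table above can be produced by diffing two register dumps. The following is a minimal Python sketch, not a TI tool: `diff_registers` is a hypothetical helper, the dumps are plain dictionaries of address to value, and the sample values are a few rows from the table, assuming the first value in each row belongs to the bad state and the second to the working state.

```python
def diff_registers(bad, good):
    """Return {address: (bad_value, good_value)} for registers that differ.

    A register present in only one dump is reported with None on the other side.
    """
    addresses = sorted(set(bad) | set(good))
    return {
        addr: (bad.get(addr), good.get(addr))
        for addr in addresses
        if bad.get(addr) != good.get(addr)
    }


def fmt(value):
    """Format a register value as hex, or '--' when it is missing from a dump."""
    return "--" if value is None else f"0x{value:02x}"


# A few rows from the table (assumed order: bad state first, working state second).
bad_dump = {0x24: 0x40, 0x28: 0x00, 0x29: 0x00, 0x52: 0x95}
good_dump = {0x24: 0x00, 0x28: 0x93, 0x29: 0x40, 0x52: 0x00}

for addr, (bad_value, good_value) in diff_registers(bad_dump, good_dump).items():
    print(f"0x{addr:02x}\t{fmt(bad_value)} vs {fmt(good_value)}")
```

Status and measurement registers (lock flags, eye monitor results, PPM counts) are expected to differ between a locked and an unlocked channel, so configuration registers are the more interesting rows in such a diff.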

0 Drew Miller1 over 3 years ago in reply to Mark Stephenson

TI__Mastermind 37483 points

Hi Mark,

To answer your question regarding how the retimer behaves during auto negotiation, we would expect it to lock to the 1.25 Gbps auto negotiation signal during negotiation, and then lock to the 10.3125 Gbps signal after negotiation (assuming you are using default data rates; please let me know if this is not the case). While locked to the auto negotiation signal, it should just pass through the auto negotiation data.

It seems that for some reason, every 16th reboot the retimer cannot achieve CDR lock. We can see that the CTLE boost setting has changed, indicating that the retimer did try to adapt CTLE.

Something puzzling is that the link doesn't come up after a CDR reset. Can you check whether performing a CDR reset enables the device to achieve CDR lock? If auto negotiation fails, does the processor continue to send data? The retimer would need valid data to lock to in order for this to be effective.

Another idea: we could rule out any adaptation issues on the retimer by changing ch_reg_0x31[6:5] from adapt mode 2 (the default) to adapt mode 0. Adapt mode 0 performs no adaptation, so you would need to manually set whatever CTLE and DFE values you find to be optimal. The reason I bring this up is that the CTLE value seems rather high in the failed set of registers, assuming CTLE index 0 works fine normally.
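Changing only bits [6:5] of channel register 0x31 calls for a read-modify-write so that the other bits in the register are preserved. The sketch below shows the bit arithmetic in Python. The `read_reg` and `write_reg` callables are hypothetical stand-ins for whatever SMBus/I2C access the system provides, and any channel-select step the device needs before channel registers are accessed is left out.

```python
ADAPT_MODE_REG = 0x31  # channel register named in the reply above
ADAPT_MODE_MSB = 6     # adapt mode field occupies bits [6:5]
ADAPT_MODE_LSB = 5


def get_field(value, msb, lsb):
    """Extract bits [msb:lsb] from an 8-bit register value."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def set_field(value, msb, lsb, field):
    """Return value with bits [msb:lsb] replaced by field, other bits unchanged."""
    width = msb - lsb + 1
    if not 0 <= field < (1 << width):
        raise ValueError(f"field {field} does not fit in {width} bits")
    mask = ((1 << width) - 1) << lsb
    return (value & ~mask & 0xFF) | (field << lsb)


def set_adapt_mode(read_reg, write_reg, mode):
    """Read-modify-write the adapt mode field; returns the value written.

    read_reg(addr) and write_reg(addr, value) are hypothetical bus accessors.
    """
    current = read_reg(ADAPT_MODE_REG)
    updated = set_field(current, ADAPT_MODE_MSB, ADAPT_MODE_LSB, mode)
    write_reg(ADAPT_MODE_REG, updated)
    return updated
```

With a dictionary standing in for the device, `set_adapt_mode(regs.__getitem__, regs.__setitem__, 0)` clears only the two adapt mode bits of `regs[0x31]`. On real hardware the manual CTLE and DFE settings mentioned above would still need to be written separately.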

Thanks,

Drew

0 Mark Stephenson over 3 years ago

Prodigy 10 points

Hi Drew,

We have narrowed this down to a TI switch. Our Marvell processor uses two XCVRs to connect to the SFP (via the retimer). One is fixed at 10G and the other is fixed at 1G (this is for various reasons in the config on the processor). These two XCVRs connect to a bidirectional switch, TMUXHS4212IRKSR, which selects one of the two XCVRs' data to pass through to the retimer, and likewise in the other direction. We had not previously considered this switch as part of the problem and had focused on the processor and retimer. However, it seems that this switch is getting into some kind of latch-up state and not allowing data to pass in either direction. As described previously, this does not happen every time but seems to occur once in every 16 cycles of the ports coming up. Any ideas?

Thanks again for your ongoing help.

Regards

Mark

0 Drew Miller1 over 3 years ago in reply to Mark Stephenson

TI__Mastermind 37483 points

Hi Mark,

Glad to hear that you were able to narrow down the issue! I will follow up with a team member who supports the TMUXHS4212 and get back to you on this.

Thanks,

Drew

0 Drew Miller1 over 3 years ago in reply to Drew Miller1

TI__Mastermind 37483 points

Hi Mark,

What is the voltage swing on your 1G signal?

Thanks,

Drew
