FPC402: Port state machine gets stuck when host-side SCL is interrupted or delayed for a long time during remote downstream port access.

Liming Zhou

Part Number: FPC402

We suspect the internal port state machine gets stuck during remote downstream port access.

When the issue happens, the host always receives a NACK after sending the logical device I2C address, and there is no I2C activity at all on the downstream interface.

We captured the host-side I2C waveform and found that SCL is not driven regularly: it is sometimes delayed for a long time (held at “1” or “0”) during the transmission of a byte.

We also checked all the FPC402 registers and found nothing abnormal.

Can you confirm whether an interrupted SCL can cause the port state machine to get stuck?

Is there any register that indicates the stuck status? We cannot find such a register in the programmer's guide.

Is there a recommended workaround to avoid this stuck state?

5 months ago

0 Drew Miller1 5 months ago

TI__Mastermind 38363 points

Hi Liming,

Is it possible to share your scope captures?
Do you know why SCL is not regularly being driven? Typically SCL is periodic unless clock stretching is occurring.
I'll look to see if we have any status indicator for this. Will have feedback next week.
Have you tried a port reset (0x00[3:0])?

Thanks,

Drew

0 Liming Zhou 5 months ago

Prodigy 30 points

Hi Drew,

I have attached the scope captures.

We implement the host-side I2C in bit-banged mode on CPU GPIOs.

When the CPU load increases, the host-side I2C output is interrupted or delayed for a long time, which causes the irregular SCL output.

Yes, we tried a port reset (0x00[3:0]), and it does recover the port from the stuck state.
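
To quantify how long the bit-banged SCL stalls under CPU load, the edge timestamps exported from a scope or logic-analyzer capture can be scanned for long gaps. The following is only a minimal sketch: the function name, the list-of-timestamps input format, and the choice of threshold are assumptions for illustration, and the thread does not establish which gap length actually triggers the stuck state.

```python
from typing import List, Tuple


def long_scl_gaps(
    edge_times_ms: List[float], threshold_ms: float
) -> List[Tuple[float, float]]:
    """Return (gap_start_ms, gap_length_ms) for every gap between
    consecutive SCL edges that is longer than threshold_ms.

    edge_times_ms is assumed to be a sorted list of SCL edge timestamps
    in milliseconds, taken from a capture (hypothetical input format).
    """
    gaps = []
    for prev, curr in zip(edge_times_ms, edge_times_ms[1:]):
        gap = curr - prev
        if gap > threshold_ms:
            gaps.append((prev, gap))
    return gaps
```

One possible starting point is to set the threshold to the Master Watchdog Timer value in use and then compare the reported gaps against the Protocol Timeout Register value.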

0 Drew Miller1 5 months ago in reply to Liming Zhou

TI__Mastermind 38363 points

Hi Liming,

Thanks for sharing more details. We have seen some rare cases where irregular I2C timing can cause a "stuck" state on FPC devices. I have a few recommendations/suggestions:

You can continue to use port reset to help resolve this issue.
Try increasing the Master Watchdog Timer Register to see if this impacts the behavior
Try adjusting the Protocol Timeout Register to see if this impacts the behavior
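
The first suggestion (recovering with a port reset) could be wrapped in host software roughly as sketched below. This is an illustration only: `read_downstream` and `reset_port` are hypothetical callables supplied by the caller (for example, `reset_port` would drive the port reset bits at 0x00[3:0] through whatever I2C driver the host uses), and `NackError` is a made-up exception name, not part of any TI API.

```python
from typing import Callable


class NackError(Exception):
    """Hypothetical exception: host saw a NACK on the logical device address."""


def read_with_port_reset(
    read_downstream: Callable[[], bytes],
    reset_port: Callable[[], None],
    max_resets: int = 1,
) -> bytes:
    """Try a downstream read; on NACK, reset the port and retry.

    Both callables are assumptions provided by the caller; nothing in
    this function talks to real hardware.
    """
    if max_resets < 0:
        raise ValueError("max_resets must be >= 0")
    attempt = 0
    while True:
        try:
            return read_downstream()
        except NackError:
            if attempt == max_resets:
                raise  # still NACKing after the allowed number of resets
            reset_port()
            attempt += 1
```

Limiting the number of resets keeps a genuinely absent or faulty module from turning into an endless reset loop.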

Thanks,

Drew

0 Liming Zhou 5 months ago in reply to Drew Miller1

Prodigy 30 points

Hi Drew,

Thank you for the valuable suggestions.

We actually tried decreasing the Protocol Timeout Register value from the default 10ms to 1ms/2ms/5ms, which mitigates the issue.

We also tried decreasing the Master Watchdog Timer Register, which seems to have no impact; we are not sure what increasing it would do.

One more question:

Is there any reserved or hidden register that indicates this kind of abnormal status?

We checked all the related status registers and found no fault information.

0 Drew Miller1 5 months ago in reply to Liming Zhou

TI__Mastermind 38363 points

Hi Liming,

Glad you were able to mitigate the issue.

Unfortunately, there are not any additional hidden registers that could help indicate this status.

You could check the "Port SCL Stuck Interrupt Register" and "Port SDA Stuck Interrupt Register", but I'm not sure they will report this since the issue appears to be related to host-side timing.

Thanks,

Drew

0 Liming Zhou 5 months ago in reply to Drew Miller1

Prodigy 30 points

Hi Drew,

Yes, we checked the "Port SCL Stuck Interrupt Register" and the "Port SDA Stuck Interrupt Register".

Both registers read 0x0 when the FPC402 is stuck.

Some more questions:

The current default value for Protocol Timeout Register is 10ms, and Master Watchdog Timer Register is 3ms.

We tried to adjust the Master Watchdog Timer Register and the Protocol Timeout Register.

Either decreasing the Protocol Timeout Register to 3ms/5ms or increasing the Master Watchdog Timer Register to 5ms/10ms avoids the FPC stuck issue when dumping SFP module information, although there is still an occasional read-byte error with no value received.

1. Do you have any suggestions for better values for both registers, or an optimal combination of the two?

2. We still do not fully understand how the two registers affect the behavior. Is there any document that explains it in detail?

3. How about the "I2C Slave Watchdog Timer Register" (offset = 0x04)?

We tried adjusting it, with no impact.

Thanks!

0 Drew Miller1 5 months ago in reply to Liming Zhou

TI__Mastermind 38363 points

Hi Liming,

I have observed a similar situation in the past in which reducing Protocol Timeout Register or increasing Master Watchdog Timer register resolved the situation. I'm not sure if your I2C host is behaving exactly the same as the test condition I experimented with, but I think the issue is similar.

The issue I observed occurred when trying to perform a downstream read. Key things that were required to reproduce this "port stuck" issue:

In this case, the I2C host controller had a long delay (~6ms) between the NACK and STOP bit during the downstream read.
Between the write and read, a "repeated start" condition was required. Separate "STOP" and "START" conditions would not cause the issue.

Key observations:

If delay between NACK and STOP was reduced to ~4ms, the "port stuck" issue did not occur
If delay between NACK and STOP was increased to ~12ms, the "port stuck" issue did not occur
If Protocol Timeout Register was reduced below the length of the delay between NACK and STOP (e.g. NACK-STOP delay of 9ms, Protocol Timeout Register of 7ms), issue was resolved.
If Master Watchdog Timer was increased to 6ms, with 6ms NACK-STOP delay, issue did not occur.
- There seemed to be ~1-2ms delta between Master Watchdog Timer register value and which NACK-STOP delay values caused "port stuck" issue. For example, default Master Watchdog Timer is 3ms, but reducing NACK-STOP delay from 6ms to 4ms resolved issue. In other words, as it pertains to this issue, it seemed like Master Watchdog Timer value acted more like register_value + 2.

Summary:

We observed that in this unusual situation with long NACK-STOP delay, if the NACK-STOP delay time fell between the Master Watchdog Timer and Protocol Timeout Register values, a port stuck situation would occur.
In our test bench, we observed issue could be mitigated by increasing Master Watchdog Timer to 5ms with Protocol Timeout Register at default (10ms).
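
The window described in this summary can be written down as a small check. This is a rough sketch of the empirical observation only, not a documented device behavior: the function name is made up, the +2ms offset is the approximate "register_value + 2" effect noted above (the observed delta was ~1-2ms), and which edge of the window is inclusive is an assumption.

```python
def in_stuck_window(
    nack_stop_delay_ms: float,
    master_watchdog_ms: float = 3.0,
    protocol_timeout_ms: float = 10.0,
    watchdog_offset_ms: float = 2.0,
) -> bool:
    """Return True if a NACK-STOP delay falls in the range where the
    test bench in this thread saw the port get stuck.

    Defaults are the register defaults quoted in the thread (3ms, 10ms).
    watchdog_offset_ms is an approximation; boundaries are assumed.
    """
    lower_ms = master_watchdog_ms + watchdog_offset_ms
    return lower_ms <= nack_stop_delay_ms < protocol_timeout_ms
```

With the defaults, a 6ms delay lands inside the window while 4ms and 12ms do not; raising the Master Watchdog Timer to 6ms, or lowering the Protocol Timeout Register to 7ms against a 9ms delay, moves the delay outside the window, matching the observations listed above.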

>> Do you have any suggestions for better values for both registers, or an optimal combination of the two?

We observed issue could be mitigated by increasing Master Watchdog Timer to 5ms and leaving Protocol Timeout Register at default. I'd consider trying this.

>> We still do not fully understand how the two registers affect the behavior. Is there any document that explains it in detail?

Unfortunately, I don't have much documentation on this beyond the programming guide and data sheet.
The Master Watchdog Timer monitors the port side to ensure that a transaction occurs within a certain time period.
The Protocol Timeout Register monitors the host side to ensure that a transaction occurs within a certain time period.

>> How about the "I2C Slave Watchdog Timer Register" (offset = 0x04)?

We didn't observe this having an impact on this "port stuck" issue.

Thanks,

Drew

0 Liming Zhou 5 months ago in reply to Drew Miller1

Prodigy 30 points

Hi Drew,

Thank you for sharing the detailed information.

In our case, a different condition may be triggering the "port stuck" issue, since we did not observe a long delay (~6ms) between the NACK and the STOP bit.

Still, what the two cases have in common is a long delay somewhere in the host's transmission.

We will take all your suggestions into account.

BTW,

Would you recommend simultaneously increasing the Master Watchdog Timer and reducing the Protocol Timeout Register, for example setting both to 5ms?

Thanks!

0 Drew Miller1 5 months ago in reply to Liming Zhou

TI__Mastermind 38363 points

Hi Liming,

I believe we performed more testing with just increasing Master Watchdog Timer, so I'm more confident in this. With that said, I'm not aware of any issues with setting both to 5ms. Feel free to try this.

Thanks,

Drew
