initializing network I/O with multiple c66x devices connected

Jeff Brower73

All-

When using four (4) c6678 devices with their SGMII ports chained together, we've had trouble reliably initializing network I/O (NetCP and PA) on all four devices. The chaining occurs on the DSPC8681 PCIe card and looks like this:

 __________     ________     ________     ________     ________
| BroadCom |   |  DSP0  |   |  DSP1  |   |  DSP2  |   |  DSP3  |
| PHY      |   |        |   |        |   |        |   |        |
|     SGMII|---|SGMII0  |   |  SGMII1|---|SGMII0  |   |        |
|      MDIO|---|  SGMII1|---|SGMII0  |   |  SGMII1|---|SGMII0  |
|__________|   |________|   |________|   |________|   |________|

The issue manifests as an intermittent "No Tx free descriptor" error, as documented here:

e2e.ti.com/.../1495849

As noted in the above thread, (i) the error state occurs due to packets coming in during init before all network I/O subsystems are ready, and (ii) we solved this issue for DSP0 by simulating a "cable disconnect" during network I/O initialization (i.e. we set the BroadCom PHY RJ-45 interface into high-impedance mode). But adding "upstream CPUs" into the equation causes the issue to resurface, even with the cable disconnected and the DSPs initialized in sequence from 0 to 3.
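A minimal sketch of the sequence described above (isolate the PHY, initialize each DSP's network I/O in order, then reconnect), offered only as an illustration. Every name here (`net_init_hooks`, `init_chain`, the `sim_*` stubs) is hypothetical and not a TI or Broadcom API. The stubs stand in for real PHY and NetCP/PA access so the sketch can run on a host PC, and leaving the PHY isolated when a device fails is an assumption, not something established in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DSPS 4

/* Hypothetical hooks; none of these are TI or Broadcom API names. */
typedef struct {
    void (*phy_set_isolated)(bool isolated); /* true = simulate cable disconnect */
    bool (*dsp_init_net_io)(int dsp);        /* NetCP/PA init for one device */
} net_init_hooks;

/* Simulated hardware state (assumption: replaces real register access). */
static bool sim_phy_isolated = false;
static int  sim_fail_dsp = -1;               /* DSP index whose init fails; -1 = none */

static void sim_phy_set_isolated(bool isolated)
{
    sim_phy_isolated = isolated;
}

static bool sim_dsp_init_net_io(int dsp)
{
    /* Mirrors point (i): init fails if traffic can still arrive. */
    if (!sim_phy_isolated || dsp == sim_fail_dsp)
        return false;
    return true;
}

/* Initializes DSP0..DSP3 in order with the PHY isolated.
 * Returns the number of devices initialized (NUM_DSPS on full success). */
int init_chain(const net_init_hooks *hooks)
{
    assert(hooks != NULL && hooks->phy_set_isolated && hooks->dsp_init_net_io);
    int done = 0;

    hooks->phy_set_isolated(true);           /* block incoming packets */
    for (int dsp = 0; dsp < NUM_DSPS; dsp++) {
        if (!hooks->dsp_init_net_io(dsp))
            break;                           /* stop at the first failure */
        done++;
    }
    if (done == NUM_DSPS)
        hooks->phy_set_isolated(false);      /* reconnect only when all are ready */
    return done;
}
```

On real hardware the two hooks would be replaced with the actual PHY register write and per-device NetCP/PA bring-up; the return value identifies which device in the chain stopped the sequence.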

We've looked through a number of e2e threads that seem to touch on this issue **. There doesn't seem to be a reliable way to fully reset the NetCP and PA after an error condition, and because we're using a PCIe card, we cannot issue a hard reset.

So I think the first question is: should we focus on a "soft reset and retry" method *after* the error state occurs? Or should we focus on preventing the problem from occurring in the first place, through some sequence of initialization and coordination between CPUs? Thanks.
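To make the first option concrete, a "soft reset and retry" approach could be structured roughly as below. This is a sketch under a big assumption: that a soft reset which actually clears the error state exists, which the threads listed below suggest is not guaranteed. `init_with_retry`, `sim_init`, and `sim_soft_reset` are hypothetical names, and the stubs only simulate a device that fails a fixed number of times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types; not TI API names. */
typedef bool (*init_fn)(int dsp);        /* returns true when network I/O init succeeds */
typedef void (*soft_reset_fn)(int dsp);  /* assumed to clear the error state */

/* Simulated device (assumption): init fails until sim_failures_left reaches 0. */
static int sim_failures_left = 0;
static int sim_resets = 0;

static bool sim_init(int dsp)
{
    (void)dsp;
    if (sim_failures_left > 0) {
        sim_failures_left--;
        return false;
    }
    return true;
}

static void sim_soft_reset(int dsp)
{
    (void)dsp;
    sim_resets++;
}

/* Tries init up to max_attempts times, soft-resetting after each failure.
 * Returns the attempt number that succeeded, or 0 if all attempts failed. */
int init_with_retry(int dsp, int max_attempts, init_fn init, soft_reset_fn reset)
{
    assert(max_attempts > 0 && init != NULL && reset != NULL);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (init(dsp))
            return attempt;
        reset(dsp);                      /* try to recover before the next attempt */
    }
    return 0;
}
```

Bounding the number of attempts keeps a device that never recovers from stalling the rest of the chain; the caller can then decide whether to continue with the remaining devices.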

-Jeff
Signalogic

** List of relevant threads:

1) OP has a custom switch device that occasionally resets all ports, i.e. link down then up. As a result, the PHY restarts the SGMII link to the c6678 (puts it in a not-linked state and starts the auto-negotiation process). This causes outgoing packets to fail.

e2e.ti.com/.../287294

2) OP describes the problem as "TX dosn't start. 6657's EMAC Transmitter stalls once SGMII Link Status is down, so SGMII Link must be established before EMAC SOFT RESET."

e2e.ti.com/.../819287

e2e.ti.com/.../233326.aspx

3) OP does an Ethernet boot, then re-purposes NetCP and PA for normal use:

e2e.ti.com/.../1159216

4) Problems in disabling and re-enabling NetCP power domain:

e2e.ti.com/.../713082

e2e.ti.com/.../1159216

over 10 years ago

Raja over 10 years ago

TI__Guru* 81335 points

Hi Jeff,
Thank you for the post. We are looking into this issue and will get back to you with possible solutions. If possible, can you please reproduce the issue on the TI C6678 EVM?

Jeff Brower73 over 10 years ago in reply to Raja

Genius 3420 points

Raja-

The problem has to be reproduced with 6678 devices chained together, as shown in my ASCII diagram. As mentioned in my post, with just one 6678 we worked around the problem by programming the PHY to simulate a cable disconnect.

Can you confirm that you have a card available? We can provide a modified TI example project that runs on the card and shows the problem.

Thanks.

-Jeff

Chris Johnson1 over 10 years ago

Expert 1715 points

A short update on this issue:

We've tried a hard reset combined with restoring the PCIe configuration, but this doesn't seem to reset the NetCP as the chip doesn't come out of the error state. Are there any additional register bits that we might set prior to doing the hard reset to cause a full network I/O reset?
