Linux/66AK2H14: 10GbE initialization fail in rev 3.1 SoC

Part Number: 66AK2H14

Tool/software: Linux

Hello TI,

 

Recently we received a batch of the new 66AK2H14 revision 3.1 (JTAGID register: 0xbb98102f). We’ve noticed a major problem with 10GbE initialization: it fails approx. 25% of the time.

The 10GbE initialization is handled by the kernel. Our kernel is based on MCSDK 3.01.04.07 (kernel version is 3.10.72). Can you please confirm the SerDes driver in this kernel works with the new silicon revision?

We would like to stick to our current kernel, as porting our software to “processor-sdk” is a significant effort.

 

Kind regards,

Krzysztof Olejarczyk

 

  • Krzysztof,

    In the XGE driver under /drivers/net/ethernet/ti, cpswx_serdes_init is called multiple times:

    1. cpswx_probe->cpswx_serdes_init

    2. cpswx_attach->cpswx_attach_serdes->cpswx_serdes_init, per port

    We found that with the second call to cpswx_serdes_init() removed, it works on both PG 3.1 and PG 2.0 chips. Can you try this and see whether it works for you?
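
    For illustration only, here is a minimal user-space sketch of the idea: the SerDes is programmed once on the probe path, and any later call from the attach path becomes a no-op. This is not the actual driver code. The struct, the function names (serdes_hw_init, serdes_init_once, simulate_probe_and_attach) and the port count are hypothetical stand-ins, and the guard is an alternative to simply deleting the second call, which is what was actually tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XGE_NUM_PORTS 2 /* assumption: two 10GbE ports */

/* Hypothetical stand-in for the driver's private SerDes state. */
struct xge_serdes_state {
    bool initialized; /* set once the SerDes has been configured */
    int init_count;   /* how often the hardware was really programmed */
};

/* Hypothetical stand-in for the real hardware init; it only counts calls. */
static int serdes_hw_init(struct xge_serdes_state *st)
{
    st->init_count++;
    return 0;
}

/* Idempotent wrapper: programs the SerDes on the first call only. */
int serdes_init_once(struct xge_serdes_state *st)
{
    int ret;

    assert(st != NULL);
    if (st->initialized)
        return 0; /* already configured, do not touch the lanes again */

    ret = serdes_hw_init(st);
    if (ret == 0)
        st->initialized = true;
    return ret;
}

/* Mimics the two call sites: once from probe, then once per port from attach. */
int simulate_probe_and_attach(struct xge_serdes_state *st)
{
    int ret = serdes_init_once(st); /* probe path */

    for (int port = 0; ret == 0 && port < XGE_NUM_PORTS; port++)
        ret = serdes_init_once(st); /* attach path, per port */
    return ret;
}
```

    With a zero-initialized state, simulate_probe_and_attach() leaves init_count at 1, which corresponds to the behavior after the second call is removed.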

    Also, we want to understand more of the failure:

    What are the # of units tested and # of units failed?

    What is the exact nature of the failure? (How do they know it has failed?)

    Are these failures on new boards?

    Have they tried replacing a known failing device with a known good device and verifying results?

    Have they tried replacing a known good device with a failing device and verifying results?

    What is the silicon revision of working chip? PG 2.0?

    Do you still use XFI mode?

    Do you still hard-code the Tx and Rx parameters, or does the Rx side use adaptation?

    Regards, Eric

     

  • Hello Eric,

     

    We also performed our own investigation and reached the same conclusion: removing the second SerDes init appears to improve 10GbE initialization. We still have to prove the stability on a larger scale, with more devices and more reboot cycles.

     

    • What are the # of units tested and # of units failed?

    The failure rate is approx. 10-15% per Hawking.

     

    • What is the exact nature of the failure? (How do they know it has failed?)

    No communication is possible over Ethernet. The interface (eth0) is created but does not send/receive any data.
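
    A check like this could be automated in a reboot-cycle test by sampling the standard Linux sysfs packet counters (for example /sys/class/net/eth0/statistics/rx_packets and tx_packets) before and after sending test traffic. The sketch below is an assumption about how such a check might look, not something we ran; the helper names read_counter and counters_advanced are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Read one decimal counter from a sysfs-style file.
 * Returns -1 if the file cannot be opened or parsed. */
long long read_counter(const char *path)
{
    FILE *f = fopen(path, "r");
    long long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%lld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* True only if all samples are valid and both RX and TX counters grew.
 * An interface that exists but passes no data fails this check. */
bool counters_advanced(long long rx_before, long long rx_after,
                       long long tx_before, long long tx_after)
{
    if (rx_before < 0 || rx_after < 0 || tx_before < 0 || tx_after < 0)
        return false;
    return rx_after > rx_before && tx_after > tx_before;
}
```

    A test script would call read_counter() on both counters, generate traffic (for example a ping to a known peer), read them again, and log a failure for that boot when counters_advanced() returns false.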

     

    • Are these failures on new boards?

    Yes, the boards are brand new.

     

    • Have they tried replacing a known failing device with a known good device and verifying results?
    • Have they tried replacing a known good device with a failing device and verifying results?

    No, we haven’t tested it this way.

     

    • What is the silicon revision of working chip? PG 2.0?

    Yes, it was 2.0

     

    • Do you still use XFI mode?

    Yes, we use XFI.

    • Do you still hard-code the Tx and Rx parameters, or does the Rx side use adaptation?

    We store the SerDes parameters in the DTS file

     

    Regards, Krzysztof

  • Krzysztof,

    Thanks! Please let us know the regression test results!

    Regards, Eric
  • Krzysztof,

    I hope the regression test went OK. Please open a new thread if the problem is unresolved.

    Regards, Eric
  • Hello Eric,

    The suggested fix works for us. Thank you for the assist.

    Regards, Krzysztof