[C6678] Srio error recovery

Hi

My customer has been using multiple C6455 devices and is now considering adding a C6678 to the existing C6455 SRIO network.

They are asking how to recover from SRIO errors on the C6678.

Let me share some background:
They have an SRIO network between multiple C6455 devices on the same PCB, and they sometimes saw errors in the SP(n)_ERR_STAT register after the port was established (port_ok). To recover from these error states, they applied software-assisted error recovery.

Please take a look at "Appendix B Software-Assisted Error Recovery"  in the following manual.

They are considering the same thing on C6678.
I checked C:\ti\pdk_C6678_1_1_2_6\packages\ti\drv\srio\test\tput_benchmarking\srio_device_tput.c, but I could not find the related code for error recovery.

Could you please let me know how they can recover from errors reported in SP(n)_ERR_STAT (other than port_ok) on the C6678?

Also, in their use case, a local reset (hardware reset) can be applied to an individual DSP during SRIO transactions.
Please assume a C6455 and a C6678 are connected via SRIO, and a local reset is applied to the C6455 but not to the C6678. In this case, at the C6678 end, the ports have to be de-initialized and initialized again to work with the C6455. What is the recommended de-initialization sequence?

Best Regards,
Naoki

  • The C6678 and C6654 contain the exact same SRIO IP. The recovery techniques and the clearing of error status bits is handled in the exact same way, so if the customer wrote code to handle this on the C6654, it will work on the C6678.

    Below are some threads that talk about the reset issue:

    e2e.ti.com/.../264450
    e2e.ti.com/.../124376
    e2e.ti.com/.../97140
    e2e.ti.com/.../11780

    Can you control the packets in and out of the C6678 before you reset the C6654? This could be essential so that you don't hang the C6678 with packets in flight in the internal datapath architecture. Basically, if you can make sure all the outbound packets in the C6678 have been sent, and all the received packets into the C6678 have been received and responded to, then you can simply reset the SRIO in the C6678 with the RIO_GBL_EN. Before you do that, you will want to turn off the reset isolation in the C6678. You may need to reset the SERDES registers too (they are outside SRIO space). After the reset, you will need to reinitialize the SRIO peripheral registers in the C6678, just like it was done the very first time SRIO was used.

    Regards,
    Travis
  • Hi Travis,

    Thanks for your reply.

    tscheck said:
    The C6678 and C6654 contain the exact same SRIO IP. The recovery techniques and the clearing of error status bits is handled in the exact same way, so if the customer wrote code to handle this on the C6654, it will work on the C6678.

    Please note that the C6455 (a very old device!) is the link partner for the C6678, not the C6654. So these two devices probably do not contain the same SRIO IP.

    tscheck said:
    Can you control the packets in and out of the C6678 before you reset the C6654? This could be essential so that you don't hang the C6678 with packets in flight in the internal datapath architecture. Basically, if you can make sure all the outbound packets in the C6678 have been sent, and all the received packets into the C6678 have been received and responded to, then you can simply reset the SRIO in the C6678 with the RIO_GBL_EN. Before you do that, you will want to turn off the reset isolation in the C6678. You may need to reset the SERDES registers too (they are outside SRIO space). After the reset, you will need to reinitialize the SRIO peripheral registers in the C6678, just like it was done the very first time SRIO was used.

    You may have assumed that the C6654 is the link partner for the C6678, so let me ask again whether your answer also applies to the customer's use case. Remember that the C6678 will be connected to a C6455 via SRIO.

    Best Regards,
    Naoki

  • Oh, sorry, I read the part number too quickly. The error recovery procedure for the C6678 is slightly different. It should be in the appendix of the SRIO user's guide, but we noticed just last week that it is missing. We are in the process of getting it updated. Until then, the very first link that I sent in the last message takes you to an e2e thread that contains another link, which points to the software error recovery document for C66x. I've put it here for simplicity:

    http://e2e.ti.com/cfs-file.ashx/__key/communityserver-discussions-components-files/639/6557.Keystone-Software-Assisted-Error-recovery_5F00_addendum.pdf

    Everything else I mentioned above is still relevant. You can reset the SRIO via RIO_GBL_EN, or you could use the process in the error recovery doc to re-align the ackIDs.

    Regards,

    Travis

  • Hi Travis,

    Thank you so much for sharing the documentation. It is very helpful.
    I will close the thread now.

    Thanks,
    Naoki
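
The reset sequence Travis outlines for the C6678 (drain traffic, turn off reset isolation, reset via RIO_GBL_EN, reset the SERDES if needed, then reinitialize) can be sketched roughly as below. This is only an ordering sketch under stated assumptions: `srio_model_t`, its fields, and `srio_reset_and_reinit` are hypothetical names that model the hardware state in plain memory. They are not PDK or CSL APIs and do not use real register addresses. On a real device each step would be replaced by the register accesses given in the SRIO user's guide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of the state the sequence touches.
 * Field names are illustrative; they are not real register names,
 * except that gbl_en stands in for the RIO_GBL_EN enable. */
typedef struct {
    uint32_t gbl_en;            /* 1 = SRIO enabled, 0 = held in reset  */
    bool     reset_isolation;   /* true = reset isolation turned on     */
    uint32_t tx_pending;        /* outbound packets not yet sent        */
    uint32_t rx_pending;        /* inbound packets not yet responded to */
    bool     serdes_configured; /* SERDES setup done (outside SRIO)     */
    bool     srio_configured;   /* SRIO peripheral registers programmed */
    uint32_t reset_count;       /* number of resets performed (model)   */
} srio_model_t;

typedef enum {
    SRIO_REINIT_OK   = 0,
    SRIO_REINIT_BUSY = -1       /* packets still in flight; try later   */
} srio_status_t;

/* True when no packets are in flight in either direction. */
static bool srio_is_idle(const srio_model_t *dev)
{
    return dev->tx_pending == 0 && dev->rx_pending == 0;
}

/* Walks the steps in the order described in the thread. Returns
 * SRIO_REINIT_BUSY without changing anything if traffic is pending. */
srio_status_t srio_reset_and_reinit(srio_model_t *dev)
{
    assert(dev != NULL);

    /* 1. Make sure all outbound packets are sent and all inbound
     *    packets are received and responded to. */
    if (!srio_is_idle(dev)) {
        return SRIO_REINIT_BUSY;
    }

    /* 2. Turn off reset isolation before resetting. */
    dev->reset_isolation = false;

    /* 3. Reset the SRIO peripheral through the global enable. */
    dev->gbl_en = 0;
    dev->srio_configured = false;
    dev->reset_count++;

    /* 4. Reset and reconfigure the SERDES (may be needed; these
     *    registers live outside SRIO space). */
    dev->serdes_configured = false;
    dev->serdes_configured = true;

    /* 5. Re-enable and reinitialize the SRIO registers, as on the
     *    very first bring-up. */
    dev->gbl_en = 1;
    dev->srio_configured = true;

    return SRIO_REINIT_OK;
}
```

A caller would retry while the function returns `SRIO_REINIT_BUSY`, ideally with a timeout, since the point of step 1 is to avoid resetting with packets still in the internal datapath.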