AM5706: PCIE Endpoint runtime failure.

Part Number: AM5706

Hi 

Our issue is not yet resolved. We ran some more experiments and have additional information, listed below.

Previous Link : https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1395724/am5706-pcie-endpoint-runtime-failure

1. In our system we have one Root Complex and four Endpoints: two are TI AM5706 devices and the other two are FPGAs. In regression testing we observed no link-down events on the FPGA Endpoints, so we conclude that the Root Complex is functioning correctly.

2. From the register dump we found that the configuration register values are reverting to their defaults, so we suspect the controller is being reset, which leads to the link going down.

  i) To debug this issue further, we need to know whether there is any reset sequence available for the PCIe controller.

 ii) Is there any method to identify whether a controller reset happened? For example, a count register or something similar from which we can conclude that a reset occurred.
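
As an illustration of the check described in item 2, the sketch below compares a dump taken after link down against the values that were programmed and against the reset defaults. The register names and values are hypothetical placeholders, not taken from our dump or from the TRM.

```python
def find_reverted_registers(configured, current, defaults):
    """Return names of registers that were programmed away from their
    reset default but now read back as that default again."""
    reverted = []
    for name, programmed in configured.items():
        default = defaults[name]
        # Registers left at their default cannot tell us anything.
        if programmed != default and current.get(name) == default:
            reverted.append(name)
    return sorted(reverted)


# Hypothetical register names and values, for illustration only.
defaults = {
    "BAR0": 0x00000000,
    "STATUS_COMMAND": 0x00100000,
    "DEVICE_CONTROL": 0x00002810,
}
configured = {
    "BAR0": 0x20000000,
    "STATUS_COMMAND": 0x00100006,
    "DEVICE_CONTROL": 0x00002810,
}
dump_after_link_down = {
    "BAR0": 0x00000000,
    "STATUS_COMMAND": 0x00100000,
    "DEVICE_CONTROL": 0x00002810,
}

print(find_reverted_registers(configured, dump_after_link_down, defaults))
```

If every programmed register shows up in the result at the same time, that points toward a controller reset rather than an individual register being overwritten.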

3. We collected the following:

       Signal integrity waveforms for Tx and Rx

       Clocking scheme and clock waveforms

       Register details for Clock settings

All details are in an Excel file, which I am attaching here.

AM5706 PCIe.xlsx

  • Hi Jigyansu,

      i) To debug this issue further, we need to know whether there is any reset sequence available for the PCIe controller.

    Please refer to "24.9.4.4 PCIe Controller Reset Management" section of the Technical Reference Manual.

     ii) Is there any method to identify whether a controller reset happened? For example, a count register or something similar from which we can conclude that a reset occurred.

    Reading the previous thread, we identified that the STATUS_COMMAND_REGISTER had an error bit set, indicating that RCVD_MASTERABORT was received.
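
For reference, a minimal sketch of how the error bits in that register could be decoded from a dump, assuming the standard PCI layout where the Status register is the upper 16 bits of the dword at configuration offset 0x04. The bit names below are the generic PCI ones (the TRM field names may differ), and the example value is made up.

```python
# Error bit positions in the 16-bit PCI Status register
# (upper half of the 32-bit dword at configuration offset 0x04).
STATUS_ERROR_BITS = {
    8: "MASTER_DATA_PARITY_ERROR",
    11: "SIGNALED_TARGET_ABORT",
    12: "RECEIVED_TARGET_ABORT",
    13: "RECEIVED_MASTER_ABORT",
    14: "SIGNALED_SYSTEM_ERROR",
    15: "DETECTED_PARITY_ERROR",
}


def decode_status_errors(status_command):
    """Return the names of the error bits set in a Status/Command dword."""
    status = (status_command >> 16) & 0xFFFF
    return [
        name
        for bit, name in sorted(STATUS_ERROR_BITS.items())
        if status & (1 << bit)
    ]


# Made-up example value: Status = 0x2010, Command = 0x0006.
example = 0x20100006
print(decode_status_errors(example))
```

In standard PCI these status bits stay set until software writes 1 to clear them, so clearing them between experiments makes it easier to tell which run produced the error.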

    In terms of software, can you log the line number that fails in each of your experiments, to see whether there is a pattern, and share this data?

    In terms of hardware, looking through the PCI Express Card Electromechanical Specification from PCI-SIG, I see that the Vtxa_d minimum is 380mV (0.38V) for PCIe 2.0 speeds.

    Additionally, I see that eye height from Intel to TI is 0.369844V in the spreadsheet. I'm not too savvy on the hardware-side, so I can pass this on to our hardware engineer, but it does look a bit concerning to me.
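
The comparison behind that concern is simple arithmetic, sketched here with the two values quoted above. The 380mV figure is the PCIe 2.0 number cited, so the result is only meaningful if the link actually runs at that speed; the function name is just for illustration.

```python
def eye_margin_mv(measured_v, minimum_v):
    """Return the eye-height margin in millivolts (negative means below minimum)."""
    return (measured_v - minimum_v) * 1000.0


V_MIN_GEN2 = 0.380        # Vtxa_d minimum quoted above for PCIe 2.0
MEASURED_EYE = 0.369844   # Intel-to-TI eye height from the spreadsheet

margin_mv = eye_margin_mv(MEASURED_EYE, V_MIN_GEN2)
print(f"margin: {margin_mv:.3f} mV")
```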

    For clarification, is the system running at PCIe 2.0 speeds?

    Regards,

    Takuma

  • Hi Takuma

    Going through the "PCIe Controller Reset Management" section, I have a few observations, as follows.

    We suspect that the "PCIe soft reset condition" is being satisfied, which leads to a reset as per the reference manual.

    In the PCIECTRL_EP_DBICS_PM_CSR register, we set the NSR bit to 1 to enable "No Soft Reset (CS)".

    With this configuration, the PCIe link failure has not been observed across 3 sets of tests so far, and testing is continuing.

    Can you please confirm this and share some information about the "PM_STATE" transition? Specifically, why is it being triggered?

    That would let us debug further.

    Note: We could not capture any PM_STATE change in the register dump. It is always in the D0 state.
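
A minimal sketch of how the PM_CSR value can be decoded from a dump and how the NSR bit is set, assuming the register follows the standard PCI Power Management layout (PowerState in bits 1:0, No_Soft_Reset in bit 3). Please check the bit positions against the TRM; the helper names here are only for illustration.

```python
PM_STATE_MASK = 0x3          # bits 1:0, PowerState (assumed standard layout)
NO_SOFT_RESET_BIT = 1 << 3   # bit 3, No_Soft_Reset (assumed standard layout)

D_STATE_NAMES = {0: "D0", 1: "D1", 2: "D2", 3: "D3hot"}


def decode_pm_csr(value):
    """Return (power state name, No_Soft_Reset flag) for a PM_CSR value."""
    state = D_STATE_NAMES[value & PM_STATE_MASK]
    nsr = bool(value & NO_SOFT_RESET_BIT)
    return state, nsr


def set_no_soft_reset(value):
    """Return the PM_CSR value with the No_Soft_Reset bit set."""
    return value | NO_SOFT_RESET_BIT


before = 0x00000000
after = set_no_soft_reset(before)
print(decode_pm_csr(before))
print(hex(after), decode_pm_csr(after))
```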

    Regards

    Jigyansu Jena

  • Hi Jigyansu,

    The PM_STATEs are defined in the PCI Express Base Specification from PCI-SIG, which is an industry standard. You can search the PCI-SIG website for the documentation. In summary, D3hot is a low-power state, while D0 is the fully-on state.

    In general, the transition from D0 to D3 happens when the RC requests that the EP go to low power. You will lose some of the power savings of low-power mode, but if it is working with "No Soft Reset (CS)", then functionally I do not see any concern with disabling soft reset.
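
To make the relationship concrete, here is a small model of the soft reset condition as defined in the PCI power management rules: with No_Soft_Reset cleared, a function returning from D3hot to D0 performs an internal reset and loses its configuration, while with the bit set it keeps its context. The sampled state sequence is hypothetical.

```python
def soft_reset_events(pm_state_samples, no_soft_reset):
    """Return the sample indices where a D3hot -> D0 transition would
    trigger an internal reset of the function."""
    if no_soft_reset:
        # Configuration context is preserved across D3hot -> D0.
        return []
    events = []
    for i in range(1, len(pm_state_samples)):
        if pm_state_samples[i - 1] == "D3hot" and pm_state_samples[i] == "D0":
            events.append(i)
    return events


# Hypothetical sequence of PM_STATE samples over time.
samples = ["D0", "D0", "D3hot", "D0", "D0"]
print(soft_reset_events(samples, no_soft_reset=False))
print(soft_reset_events(samples, no_soft_reset=True))
```

Note that a polled register dump can miss a short D3hot excursion between two samples, so a dump that always shows D0 does not by itself rule out such a transition.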

    Regards,

    Takuma

  • No, the system is defined to run at PCIe 1.0 speed.

  • Hi Alan,

    Please correct me if I am wrong in trying to understand the statement.

    I assume there is some confusion between the D0/D3 power states and a transition to PCIe 3.0. The D3 state is unrelated to PCIe 3.0; it is used by all PCIe generations, including PCIe 1.0, to signify low-power mode. In D3cold, everything except the wake pin is powered off, while D3hot is a software off state (that is, the card can still be detected by the PC as connected, but it is turned off software-wise).

    Regards,

    Takuma

  • Hi Takuma,
    Apologies for the confusion with the reply. As per our requirements, the PCIe speed is set to Gen 1, which you asked about in one of the previous replies.

    Regards, 
    Alan

  • Hi Alan,

    Ah, understood. Thanks for clearing up my misunderstanding. Since this behavior should apply regardless of gen 1 or gen 3 speeds, can I assume the thread can be closed?

    Regards,

    Takuma