TMS320DM8168: PCIe does not Link up (LTSSM is not in L0 state) on power up

Part Number: TMS320DM8168

TMS320DM8168 PCI Express link-up failure.

Hi,

We have designed a custom board with two DM8168 devices and two FPGAs, all connected to an IDT 32NT24B switch. The IDT upstream port is connected to another IDT switch (located on a different board).

Both DM8168's are configured as EP, Gen2.0, 4 lanes.

Occasionally (about 1 in 10 power-ups), while polling the Debug0 register (per the suggested EP init sequence), we see the LTSSM state stay at 0x3 (Polling.Compliance) instead of moving to 0x11 (L0).

The registers below were examined and verified: the PCIESS is not in reset, the PLL is locked, the values are as expected, and the LTSSM is enabled.

All the EPs (DM8168s and FPGAs) are fed from the same clock generator.

The issue was observed on both DM8168s, although not at the same power-up.

PCIE_CFG: 0x48140640: 01C90300

RM_DEFAULT_RSTCTRL: 0x48180b10: 00000003

CM_DEFAULT_PCI_CLKSTCTRL : 0x48180510: 00000102

CM_DEFAULT_PCI_CLKCTRL : 0x48180578: 00000002

Debug 0: 0x51001728: 00AED703

Debug 1: 0x5100172c: 08200000
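
As a sanity check on the dump above, the LTSSM state can be extracted from the Debug 0 value. This is only a minimal sketch: it assumes the state field occupies the low five bits of DEBUG0 (consistent with the 0x3 and 0x11 values quoted in this post, but the field width should be confirmed in the TRM). The function and table names are illustrative and do not come from any TI software.

```python
# Assumption: LTSSM_STATE is DEBUG0[4:0]; confirm against the DM816x TRM.
LTSSM_STATE_MASK = 0x1F

# Only the two states named in this thread are listed here.
LTSSM_NAMES = {
    0x03: "Polling.Compliance",
    0x11: "L0",
}


def ltssm_state(debug0: int) -> int:
    """Return the raw LTSSM state code from a DEBUG0 register value."""
    return debug0 & LTSSM_STATE_MASK


def describe_ltssm(debug0: int) -> str:
    """Format the LTSSM state as hex plus a name, if the name is known."""
    state = ltssm_state(debug0)
    name = LTSSM_NAMES.get(state, "other (see TRM)")
    return f"0x{state:02X} {name}"


if __name__ == "__main__":
    # Debug 0 value captured during the failure.
    print(describe_ltssm(0x00AED703))
```

With the value captured during the failure, the low bits decode to 0x03, which matches the stuck state described above.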

I have also looked at this post:

https://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/717/t/490939

I have not yet tried forcing Gen1.

The device does not hang and keeps running, but the link never comes up until the next power-up.

Voltage, clock and reset were measured with a scope to look for any issue; they all appear to be as required.

The issue happens at room temperature.

DDR3 testing is performed on every power-up and has not failed.

Can you suggest what to look at to find a clue for this behavior?

  • Hi Oded,

    Oded Asulin said:
    Both DM8168's are configured as EP, Gen2.0, 4 lanes.

    DM816x PCIe supports 1 lane or 2 lanes, but not 4. It has one port, x2 lanes.

    Do you use DM816x EZSDK 5.05.02.00? Can you try to reproduce the issue on the DM816x TI EVM?

    Can you also monitor the bits below and see what they report:

    LINK_STAT_CTRL[27] LINK_TRAINING

    DEBUG1[29] LINK_IN_TRAINING
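
A minimal sketch of checking these two bits from raw register reads. The bit positions are the ones given above; the helper names are hypothetical, and the LINK_STAT_CTRL value in the example is a placeholder, since no raw value for that register appears in this thread.

```python
LINK_TRAINING_BIT = 27      # LINK_STAT_CTRL[27] LINK_TRAINING
LINK_IN_TRAINING_BIT = 29   # DEBUG1[29] LINK_IN_TRAINING


def bit(value: int, n: int) -> int:
    """Return bit n of a register value as 0 or 1."""
    return (value >> n) & 1


def training_flags(link_stat_ctrl: int, debug1: int) -> dict:
    """Extract the two link-training indicators from raw register values."""
    return {
        "LINK_TRAINING": bit(link_stat_ctrl, LINK_TRAINING_BIT),
        "LINK_IN_TRAINING": bit(debug1, LINK_IN_TRAINING_BIT),
    }


if __name__ == "__main__":
    # DEBUG1 is the value posted in the original question; the
    # LINK_STAT_CTRL value is a placeholder, not a captured reading.
    print(training_flags(link_stat_ctrl=0x00000000, debug1=0x08200000))
```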

    See also whether the pointer below helps:

    DM816x TRM, section 17.2.9.3.1.1 Host Reset Request Interrupt Reception in EP Mode

    Regards,
    Pavel

  • Hi Pavel,
    Thanks for your help.
    You are correct regarding the link characteristics. The link is x2, Gen1.0 (2.5G), EP.
    I cannot reproduce the issue on the EVM because we use a custom board that is part of a whole system.
    We use the DVR RDK 4.01.

    I read the register fields you asked about during the failure:
    LINK_STAT_CTRL[27] LINK_TRAINING = '0'
    DEBUG1[29] LINK_IN_TRAINING = '0'
    Thus I assume the LTSSM is not performing link training.



    I have noticed something else. I am not sure whether it is related or whether it is an additional issue.
    This happens very rarely, but sometimes on power-up the PCIe link status is up while the software continuously polls the BAR3 register, waiting for a non-zero value, and never gets one. Reading from the TI console, however, returns the expected BAR2 values (these are initialized by an external RC, which has completed this task successfully). During this failure I also noticed that DIVCLK is not running.

    PCIE_STSPLL = 0x1, as opposed to 0x3 in normal operation.

    This means DIVCLK = '0': DIVCLK is not running.

    PCIE_CFGPLL = 0x01C9
    PCIE_CFGPLL[ENPLL, #16] = '1'
    PCIE_CFGPLL[ENDIVCLK, #24] = '1'
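
The PLL fields discussed in this thread can be decoded from the 32-bit PCIE_CFG value (0x48140640) with a short script. This is a sketch under stated assumptions: the bit positions are the ones quoted in this thread (LOCK = bit 8, DIVCLK = bit 9, ENPLL = bit 16, ENDIVCLK = bit 24), and the failing value 0x01C90100 is reconstructed from "PCIE_STSPLL = 0x1" rather than taken from a captured dump. The names are illustrative.

```python
# Bit positions as quoted in this thread; confirm against the TRM.
PCIE_CFG_BITS = {
    "LOCK": 8,        # status: PLL locked
    "DIVCLK": 9,      # status: DIVCLK running
    "ENPLL": 16,      # config: PLL enable
    "ENDIVCLK": 24,   # config: DIVCLK enable
}


def decode_pcie_cfg(value: int) -> dict:
    """Return the named PLL bits of a raw PCIE_CFG value as 0/1."""
    return {name: (value >> pos) & 1 for name, pos in PCIE_CFG_BITS.items()}


if __name__ == "__main__":
    normal = 0x01C90300    # value from the original post
    failing = 0x01C90100   # assumed: status field reads 0x1 instead of 0x3
    print("normal: ", decode_pcie_cfg(normal))
    print("failing:", decode_pcie_cfg(failing))
```

In the assumed failing value, DIVCLK is enabled (ENDIVCLK = 1) and the PLL reports lock, yet the DIVCLK status bit reads 0.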

    Can you please explain the benefit of this clock? Where is it routed to? If it is not running, what are the implications? What can affect it not to run?

    Could DIVCLK = '0' result in the software being unable to read the BAR2 register? And if so, how did the read succeed from the console?

  • Oded Asulin said:
    You are correct regarding the link characteristics. The link is x2, Gen1.0 (2.5G), EP.
    I cannot reproduce the issue on the EVM because we use a custom board that is part of a whole system.
    We use the DVR RDK 4.01.

    I read the register fields you asked about during the failure:
    LINK_STAT_CTRL[27] LINK_TRAINING = '0'
    DEBUG1[29] LINK_IN_TRAINING = '0'
    Thus I assume the LTSSM is not performing link training.

    Make sure you are using the latest version of the RDK Linux kernel.

    You stated that this issue (link training failure) is observed rarely, correct? This might be a HW malfunction of your custom board. Please double-check your design against the DM816x datasheet and the DM816x TI EVM.

    There is also a HW diagnostic test for PCIe on the DM814x TI EVM. You can check whether this test can be applied to your DM816x custom board.

    Software -> Diagnostic Software -> Base Board -> Rev D -> src -> CCS_Test_code -> Base_Board -> pcie

    I will check your other issue and come back to you.

    Regards,
    Pavel

  • Oded Asulin said:

    I have noticed something else. I am not sure whether it is related or whether it is an additional issue.
    This happens very rarely, but sometimes on power-up the PCIe link status is up while the software continuously polls the BAR3 register, waiting for a non-zero value, and never gets one. Reading from the TI console, however, returns the expected BAR2 values (these are initialized by an external RC, which has completed this task successfully). During this failure I also noticed that DIVCLK is not running.

    PCIE_STSPLL = 0x1, as opposed to 0x3 in normal operation.

    This means DIVCLK = '0': DIVCLK is not running.

    PCIE_CFGPLL = 0x01C9
    PCIE_CFGPLL[ENPLL, #16] = '1'
    PCIE_CFGPLL[ENDIVCLK, #24] = '1'

    Can you please explain the benefit of this clock? Where is it routed to? If it is not running, what are the implications? What can affect it not to run?

    Could DIVCLK = '0' result in the software being unable to read the BAR2 register? And if so, how did the read succeed from the console?

    Yes, bit [9] DIVCLK should also be 1 for PCIe to work properly. See the DM816x TRM (sections below) for more details:

    17.1.4.4 Clock, Reset, Power Control Logic
    17.2.1 Clock Control


    DIVCLK is the output clock of the PCIe SerDes PLL and the input clock (functional clock) of the PCIe module.

    Check that the 100 MHz refclk (the input clock of the PCIe SerDes PLL) is correct and stable. Also check whether any of the PRCM PCIe-related registers differ between the working and non-working cases (CM_DEFAULT_PCI_CLKSTCTRL, CM_DEFAULT_PCI_CLKCTRL, RM_DEFAULT_RSTCTRL[7] PCI_LRST, RM_DEFAULT_RSTST[7] PCI_LRST).
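
One way to make the working versus non-working comparison systematic is to capture a snapshot of the relevant registers in each case and diff them. The sketch below is illustrative only: the helper name is hypothetical, the PRCM values are the ones reported in this thread, PCIE_CFG is added alongside them for convenience, and the failing PCIE_CFG value is an assumption reconstructed from the reported STSPLL reading.

```python
def diff_registers(working: dict, failing: dict) -> dict:
    """Return {name: (working_value, failing_value)} for registers that differ."""
    diffs = {}
    for name in sorted(set(working) | set(failing)):
        w, f = working.get(name), failing.get(name)
        if w != f:
            diffs[name] = (w, f)
    return diffs


# Snapshot values as reported in this thread.
WORKING = {
    "CM_DEFAULT_PCI_CLKSTCTRL": 0x00000102,
    "CM_DEFAULT_PCI_CLKCTRL": 0x00000002,
    "RM_DEFAULT_RSTCTRL": 0x00000003,
    "PCIE_CFG": 0x01C90300,
}
# Assumption: only the DIVCLK status bit differs in the failing case.
FAILING = dict(WORKING, PCIE_CFG=0x01C90100)

if __name__ == "__main__":
    for name, (w, f) in diff_registers(WORKING, FAILING).items():
        print(f"{name}: working=0x{w:08X} failing=0x{f:08X}")
```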

    Regards,
    Pavel

  • Dear Pavel,
    Please see the values read from the registers below:

    CM_DEFAULT_PCI_CLKSTCTRL, 0x48180510 = 0x102 (same as normal operation)
    CM_DEFAULT_PCI_CLKCTRL, 0x48180578 = 0x00000002 (same as normal operation)
    RM_DEFAULT_RSTCTRL[7] PCI_LRST = '0', 0x48180b10 = 0x00000003 (same as normal operation)
    RM_DEFAULT_RSTST[7] PCI_LRST = '1'*, 0x48180b14 = 0xFC (same as normal operation). Note that this status is not cleared during power-up. I will clear it from the console to verify that LRST is not constantly asserted (probably not, since the link is in L0 during the failure).

    I also read the PCIE_CFG register during several normal power-ups. In many cases DIVCLK is not running while the PLL is locked (LOCK, bit[8] = '1'), the PCIe link is in L0, and there is no issue in the PCI domain.

    Can you tell me where I would see degradation due to the lack of DIVCLK? (You mentioned that DIVCLK is the input (functional) clock of the PCIe module.)

    I did not see any explanation of this "div by 5" PLL clock (DIVCLK) in the pointers you sent, only a general explanation. Is this clock the 250 MHz clock mentioned there?
    I will try to measure/change the REFCLK input clock to make sure DIVCLK is always running.

    Appreciate your help,
    Regards,
    Oded
  • Dear Pavel,
    Can you please share some information regarding this DIVCLK status? The documented information is not clear enough.
    Thanks,
    Oded
  • Dear Pavel,
    We wanted to verify the effect of DIVCLK on the PCIe functionality.
    We disabled DIVCLK (ENDIVCLK = 0, bit 24 in the PCIE_CFG register).
    We do not notice any degradation on the PCIe link, so we are uncertain whether we still want to investigate our issue in this direction.
    Can you confirm our assumption regarding DIVCLK?
    Can you please suggest any other direction in which to investigate our issue?

    Thanks,
    Oded
  • Oded,

    Refer to the e2e post below:
    e2e.ti.com/.../426680

    Regards,
    Pavel
  • The discussion continues in the thread below:

    e2e.ti.com/.../632753