TCI6638K2K: Performing ATT and BOOST Calibration for PCIe Interface

Part Number: TCI6638K2K

The workaround given in errata KeyStoneII.BTS_errata_usagenote.28 is to perform ATT/BOOST calibration.

We did this successfully for the SRIO interface, but for the PCIe interface I am getting really strange results.

Note that the PCIe interface does not fail all the time; in fact, it fails only at high temperature. We are configured for x1 lane PCIe at 5.0 Gbps.

While PCIe is fully active and functional, I read the following register values:

COMLANE_1F8 : 0x04010000

LN1_OK : 0x0
LN0_OK : 0x0
LN1_SIG_LEVEL_VALID : 0x0
LN0_SIG_LEVEL_VALID : 0x0
CMU_OK : 0x1

PLL_CTRL : 0x10000002

PLL_ENABLE_VAL : 0x0
PLL_OK : 0x1
LN1_OK_STATE : 0x0
LN0_OK_STATE : 0x0
LN1_SD_STATE : 0x1
LN0_SD_STATE : 0x0


Since we are using lane 0 of the PCIe PHY, and it is fully up and running, I expected the LN0 fields to be set in the above registers.
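
To make the raw values above easier to check by hand, here is a minimal C sketch that extracts individual status bits from COMLANE_1F8 and PLL_CTRL. The bit positions are assumptions inferred from the values and field dumps posted in this thread (CMU_OK at bit 16 and LNx_SIG_LEVEL_VALID at bits 24-27 of COMLANE_1F8; PLL_OK at bit 28, LNx_OK_STATE at bits 8-11 and LNx_SD_STATE at bits 0-3 of PLL_CTRL). They are not confirmed against the Serdes user guide, and the helper names are hypothetical.

```c
#include <stdint.h>

/* ASSUMED bit positions, inferred from the register dumps in this thread.
 * Verify against the Serdes user guide before relying on them. */
#define COMLANE_1F8_CMU_OK_BIT          16u
#define COMLANE_1F8_SIG_VALID_SHIFT     24u  /* lanes 0..3 -> bits 24..27 */
#define PLL_CTRL_PLL_OK_BIT             28u
#define PLL_CTRL_LN_OK_STATE_SHIFT       8u  /* lanes 0..3 -> bits 8..11 */
#define PLL_CTRL_LN_SD_STATE_SHIFT       0u  /* lanes 0..3 -> bits 0..3  */

/* Return the single bit at position pos of reg (0 or 1). */
static uint32_t reg_bit(uint32_t reg, uint32_t pos)
{
    return (reg >> pos) & 1u;
}

static uint32_t comlane_cmu_ok(uint32_t comlane_1f8)
{
    return reg_bit(comlane_1f8, COMLANE_1F8_CMU_OK_BIT);
}

static uint32_t comlane_sig_level_valid(uint32_t comlane_1f8, uint32_t lane)
{
    return reg_bit(comlane_1f8, COMLANE_1F8_SIG_VALID_SHIFT + lane);
}

static uint32_t pllctrl_pll_ok(uint32_t pll_ctrl)
{
    return reg_bit(pll_ctrl, PLL_CTRL_PLL_OK_BIT);
}

static uint32_t pllctrl_ln_ok_state(uint32_t pll_ctrl, uint32_t lane)
{
    return reg_bit(pll_ctrl, PLL_CTRL_LN_OK_STATE_SHIFT + lane);
}

static uint32_t pllctrl_ln_sd_state(uint32_t pll_ctrl, uint32_t lane)
{
    return reg_bit(pll_ctrl, PLL_CTRL_LN_SD_STATE_SHIFT + lane);
}
```

Under these assumed positions, 0x04010000 has only bits 16 and 26 set, and 0x10000002 has only bits 28 and 1 set, so the one lane-level bit set in PLL_CTRL is signal detect on Serdes lane 1, not lane 0.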

Furthermore, when I try to read the ATT and BOOST values set in the Serdes, I am not able to get RX valid for lane 0. The API is in csl_serdes2.h, and I am copying it here:

static inline void CSL_SerdesWaitForRXValidPerLane(uint32_t base_addr,
                                                   uint8_t lane_num)
{
    uint32_t stat;
    uint32_t timeout = 100000000;
    stat = (CSL_SerdesReadSelectedTbus(base_addr, lane_num+1, 0x2) & 0x020)>>5;
    while ((stat != 1) && (timeout != 0))
    {
        stat = (CSL_SerdesReadSelectedTbus(base_addr, lane_num+1, 0x2) & 0x020)>>5;
        timeout--;
    }
}

So it seems that there is something special about the PCIe mode of the Serdes that is not covered by the documents. Please help.

Ziad A.

  • Hi,

    The PCIE Serdes is a PHY-A 2 lane (instead of PHY-A 4 lane) sub-system. The register map you need to look at is Table 16-1, Memory Mapping for PHY-A 2 Lane Sub-Systems, in the Serdes user guide.

    However, some registers are the same as those defined in Table 16-2, Memory Mapping for PHY-A 4 Lane Sub-Systems. For the PCIE Serdes, PCIE lane 0 (0-based) is lane 1 (0-based) in the Serdes, and PCIE lane 1 (0-based) is lane 2 (0-based) in the Serdes.

    That is why you saw LN_1 active / set when you use PCIE Serdes 0.

    Older CSL code may have bugs in this lane index conversion, and there were PCIE diagnostics (BER, ATT/BOOST test) failures. This was fixed around 2018 Q2. Please use the latest PRSDK 5.3 release for K2H for Serdes work.

    Regards, Eric
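
The lane index conversion described in the reply above can be written down as a one-line helper. This is only a sketch of the stated mapping (PCIE lane N corresponds to Serdes lane N+1, both 0-based); the function and macro names are hypothetical and are not part of the CSL.

```c
#include <stdint.h>

/* Per the reply above: on the PCIE Serdes, PCIE lane 0 is Serdes lane 1 and
 * PCIE lane 1 is Serdes lane 2 (all 0-based). Names here are hypothetical. */
#define PCIE_TO_SERDES_LANE_OFFSET 1u

static uint8_t pcie_lane_to_serdes_lane(uint8_t pcie_lane)
{
    return (uint8_t)(pcie_lane + PCIE_TO_SERDES_LANE_OFFSET);
}
```

With this mapping, a x1 link on PCIE lane 0 would be expected to show activity in the lane 1 status fields, which is consistent with LN1_SD_STATE being the bit that is set in the original post.
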
  • Can you please send a link to download PRSDK 5.3?
  • We don't support part number 6638K2K, but it is the same as 66AK2H: software-dl.ti.com/.../index_FDS.html

    Regards, Eric
  • I downloaded the latest CSL and diag. I do not see the changes. Although I see a V1 directory, the diagnostic tools still refer to files in V0 (the older version), not the new one. I am still confused about how the function Serdes_Diag_Att_Boost_Calibration in serdes_diag.h refers to the new library. It still calls CSL_SerdesWaitForRXValid, which is defined the same way as before.

    Can you also explain why I am not getting lane 1 OK set in COMLANE_1F8 and PLL_CTRL even though the PCIe link is up and running?
  • Hi,

    I created a PCIE x2 lane test setup between a TI K2H EVM and another device. I know the PCIE link is operating properly with GEN2 x2 lanes. I looked at:
    - the CMU (0x2320BF8): 0x0001_0000
    - PLL_CTRL (0x2321FF4): 0x1000_0000

    Those values are even different from your case (GEN2 x1). I agree that the LNx_OK, LNx_SIG_LEVEL_VALID, LNx_SD_STATE and LNx_OK_STATE bits don't give any clear indication.

    I also tried another Serdes, PHY-4 (Hyperlink), and looked at the above registers; there, those fields are meaningful.

    The Serdes is provided by a third party, with limited information in the documentation. We may not get further explanation, so we suggest using the Serdes code inside CSL (called by the Processor SDK RTOS PCIE examples) as it is. If there are any issues, we can debug.

    Regards, Eric
  • Hi,

    CSL_SerdesWaitForRXValid() has not changed, and it is common to PCIE and the other interfaces. What is your question? By "I am not able to RXValid for lane0", do you mean your code gets stuck when configured as x1 lane? But I thought x1 is working for you.

    For me, x2 lanes worked and PCIE is functioning: judged by the pcie_debug0 register (0x2180_1728), it is in the L0 state.

    Regards, Eric
  • Attached to the TI device I have an FPGA which is driving a PRBS31 pattern into the Keystone PCIe interface, lane 0.

    For ATT/BOOST calibration, the first step is to get RX valid by running CSL_SerdesWaitForRxValid(). I used lane number 1 and lane number 0, as well as 2 and 3 for good measure. None of them returns valid.

    I did the same thing with the FPGA connection in PCIe mode, i.e., there is a PCIe device on the other side and I can run lspci -v and verify that the link is up. Again, CSL_SerdesWaitForRxValid() never returns with the lane valid signal.

    Ziad A.

  • Hi,

    When you run the PCIE test (the one with lspci and link up), is it correct that the FPGA is set into PCIE mode? And on the K2HK side, did you use the PCIE driver example under pdk_k2hk_4_0_xx\packages\ti\drv\pcie\example\sample? This RTOS PCIe driver also calls CSL_SerdesWaitForRxValid(), with serdes_lane_enable_params.operating_mode = CSL_SERDES_FUNCTIONAL_MODE. The code passed, correct? And which lane (0, 1, 2, 3) has signal detection valid? Is it lane 1?

    When you run the PRBS31 test, you have some configuration that puts the FPGA into PRBS31 transmission mode, and the K2H side is the RTOS diagnostics pdk_k2hk_4_0_xx\packages\ti\diag\serdes_diag with CSL_SERDES_DIAGNOSTIC_MODE? We have tested this with a K2H-to-K2H connection and didn't have any issue. Is it possible that there is a problem with the FPGA PRBS31 generation?

    Regards, Eric
  • I'll double check the first issue with the driver and get back to you. I assume you are right but have not looked into the driver code much since it has been working fine.

    As for what I am doing: I run CSL_SerdesWaitForRxValid() with PCIe up, and I tested all lane numbers. None read back valid.

    I also did the same experiment with the FPGA driving PRBS31, and I see no difference in the results from CSL_SerdesWaitForRxValid().

    Ziad

  • Hi Eric,

    I talked to our software team about the driver code, and they mentioned that since our system runs Linux, we are not using the drivers you mentioned. Furthermore, they indicated that CSL_SerdesWaitForRxValid() does not seem to be used in the driver we use.

    Does this help?

    Ziad 

  • Hi,

    PCIE is a relatively low-speed Serdes (5 Gbps) compared to Hyperlink and 10GbE. We had to do some TX side tuning (CM, C1, C2) and RX side tuning (RX_ATT and BOOST) when running those at 10 Gbps for stability. For PCIE, we don't have many customers tuning the Serdes.

    I understand that you have instability issues at high temperature, where the PCIE link is lost. If you restart the link training, is that recoverable? As you use Linux to configure the PCIE Serdes, do you force RX_ATT and BOOST to certain values, or use adaptation? If the former, what are the forced values and how were they derived? Or are you working on finding good values and want to force them?

    There are 3 Serdes diagnostics that can run on the DSP core (not ARM) between 2 TI Keystone devices. Among them:
    - The BER test is a Tx sweep test: it changes C1, C2, CM (for PCIE, tx_swing and tx_deemph) and checks the BER at the receiver side to find the best Tx parameters.
    - The ATT_BOOST calibration test uses the known Tx parameters from the remote end and lets the local side adapt to find the best RX_ATT and BOOST.

    The tests assume a symmetric connection between two TI Keystone devices and are driven by JavaScript/CCS, so running them between a K2 and an FPGA is difficult.

    For your operating PCIE link, are you able to read out the RX_ATT and BOOST values using the CSL API? When the link fails, do they drift to 0? I know you are trying to tune them; can you elaborate on what SW you used? Is it based on pdk_k2hk_4_0_13\packages\ti\diag\serdes_diag?

    I checked CSL_SerdesWaitForRxValid(): it is not called in our RTOS PCIE driver code, nor in the Serdes_diag functions. How did you arrive at a sequence that calls this API? If you look at Serdes_Example_PRBSTest(), will this help?

    Regards, Eric
  • Hi Eric, Thanks for the detailed response. I will try to address all the matters you raised.

    1. Yes, we do have several boards on which PCIe erratically fails to come up between the FPGA and the Keystone device. We have seen this issue on only a few boards, at high temperature. The symptom is that the PCIe interface works fine as the temperature goes up, but once we restart the system (reboot Keystone II), PCIe fails to train. My first inclination is that this is related to TI errata KeyStoneII.BTS_errata_usagenote.28, Literature Number: SPRZ401F, June 2013–Revised May 2017, page 95.

    However, even if this is not related to the errata, my goal is to set up a PRBS test between the FPGA and Keystone II. This will help us characterize whether we have a board issue.

    I have done this successfully for SRIO on the same board, and according to the documentation and the diagnostic source code, PCIe and SRIO are similar in the way the transceiver works.

    2. I have not done a TX sweep test and do not plan to at this time, because I do not believe we need it.

    3. I have been trying to do the ATT_BOOST calibration test, following the same procedure I followed with SRIO.

    4. We understood the issue you outlined about running your diagnostic code on our target between the FPGA and Keystone II, so I created scripts that mimic your functions. With the help of your source code and documentation, I successfully executed this for SRIO, diagnosed our link issue, and fixed our SRIO drivers. Now we have a big issue with the PCIe interface.

    5. Here are the details of what I am doing for the PCIe ATT and BOOST calibration test. The procedure is the same one I followed for SRIO, where it worked very well, and it is based on a combination of documentation and source code. It involves two steps:

    a. Check the register status of the transceiver in normal operation, before any changes to the registers. In this mode, PCIE is operating normally and Keystone is actually performing reads/writes to the FPGA. My issue is that even in this mode, where I expect the COMLANE_1F8 register to report lane 0 as OK and valid, it does no such thing. Here are the results we see from COMLANE_1F8:

    COMLANE_1F8 : 0x04010000

    LN3_OK                             : 0x0
    LN2_OK                             : 0x0
    LN1_OK                             : 0x0
    LN0_OK                             : 0x0
    LN3_SIG_LEVEL_VALID                : 0x0
    LN2_SIG_LEVEL_VALID                : 0x1
    LN1_SIG_LEVEL_VALID                : 0x0
    LN0_SIG_LEVEL_VALID                : 0x0
    CMU_OK                             : 0x1

    As you can see, none of the lanes are indicated as running. PLL_CTRL also looks inaccurate. Here is what I am getting there:

    PLL_CTRL : 0x10000002

    PLL_ENABLE_VAL                     : 0x0
    PLL_OK                             : 0x1
    LN3_OK_STATE                       : 0x0
    LN2_OK_STATE                       : 0x0
    LN1_OK_STATE                       : 0x0
    LN0_OK_STATE                       : 0x0
    LN3_SD_STATE                       : 0x0
    LN2_SD_STATE                       : 0x0
    LN1_SD_STATE                       : 0x1
    LN0_SD_STATE                       : 0x0

    As you can see, none of the lanes are listed as OK. For signal detect, LN2_SIG_LEVEL_VALID is asserted in COMLANE_1F8, but LN1_SD_STATE is the bit set in PLL_CTRL.

    This indicates that PCIe behaves differently from SRIO, either because the register offsets in the document are wrong or because there is something special about the PCIe transceiver.

    b. When I run the ATT_BOOST calibration, I use the function Serdes_Diag_Att_Boost_Calibration in serdes_diag.h; this function is used by the example you indicated (Serdes_Example_PRBSTest()). Fundamentally, I believe the same issue I highlighted in (a) above is preventing this step from succeeding: the transceivers are not behaving as documented.

    Hope this clarifies our issue.

    6. I want to add a couple of points based on what you raised.

    The TI errata I cited clearly states that the ATT and BOOST values must be fixed and not left in auto mode. Do you believe this is not the case? You mentioned that your customers are not doing this. How are they working around the errata, then? If we need to fix ATT and BOOST, we must be able to run the calibration sequence and figure out the best values for our system.

    I also checked the state of our system at run time. Here are the values of the registers that affect our ATT and BOOST. The values I am listing are for lane 0, but I actually get the same values for all lanes:

    LANE 0 State

    LANE0_030 : 0x00000000
    RXEQ_ATT_GAIN_OVR                  : 0x0
    RXEQ_ATT_GAIN_AUTOCAL_DIS          : 0x0

    LANE0_084 : 0x2d0f0385
    REQ_RATE2_BOOST_START_O_3_0        : 0x2
    REQ_RATE2_ATT_START_O_3_0          : 0xd
    REQ_RATE1_BOOST_START_O_3_0        : 0x0
    REQ_RATE1_ATT_START_O_3_0          : 0xf

    LANE0_08C : 0x20404820
    REQ_RATE3_BOOST_START_O_3_0        : 0x4
    REQ_RATE3_ATT_START_O_3_0          : 0x8

    LANE0_0A0 : 0xffee3048
    TXCTRL_C1_IN_OVR_EN_O              : 0x1

    LANE0_0A8 : 0x50000000
    TXCTRL_MASTER_OVR_EN_N_O           : 0x0

    COMLANE_000 : 0x0400019f
    L3_MASTER_CDN_O                    : 0x1
    L2_MASTER_CDN_O                    : 0x1
    L1_MASTER_CDN_O                    : 0x1
    L0_MASTER_CDN_O                    : 0x1
    LC_MASTER_CDN_O                    : 0x1

    COMLANE_084 : 0xf0000301
    RXEQ_RATE3_CAL_EN_O                : 0x0
    RXEQ_RATE2_CAL_EN_O                : 0x1
    RXEQ_RATE1_CAL_EN_O                : 0x1
    RXEQ_INIT_CAL_O_1                  : 0x0
    RXEQ_INIT_CAL_O_0                  : 0x1

    COMLANE_08C : 0x8103232f
    RXEQ_RATE2_INIT_CAL_EN_O_0-ATT     : 0x1
    RXEQ_RECAL_O_1                     : 0x0
    RXEQ_RECAL_O_0                     : 0x1

    COMLANE_090 : 0x55506001
    RXEQ_RATE2_RECAL_O_1-BOOST         : 0x1
    RXEQ_RATE2_RECAL_O_0-ATT           : 0x1
    RXEQ_RATE2_INIT_CAL_EN_O_1-BOOST   : 0x1

    This shows that, by default, our ATT and BOOST are in auto calibration. Our starting values come from the default drivers provided by TI.

    7. Finally, to rule this issue out, I tried the procedure outlined in Literature Number: SPRUHO3A, page 151, to put the transceiver into internal loopback, and it also did not work correctly. I expected to get the link sync mode detected per the procedure outlined in step 10, but that did not happen.

    In summary:

    1. We see some boards failing to bring PCIe up at high temperature. Our first suspicion is the TI errata outlined above.

    2. Accesses to the transceiver registers do not match the documentation.

    3. To perform ATT and BOOST calibration we need accurate readings, and we are not able to get them because the transceiver is not behaving as expected.

    4. I am not able to run internal loopback on the transceiver.

    5. All of these were executed the same way on SRIO, and there they worked correctly.

    Thanks for your attention.

    Ziad A.
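
For readers checking the LANE0_084 and LANE0_08C dumps in the reply above, here is a small C sketch that extracts the ATT and BOOST start values from the raw words. The field positions are assumptions inferred from the posted values (each field taken as one 4-bit nibble, with ATT directly below BOOST), not confirmed against the Serdes user guide, and all names are hypothetical.

```c
#include <stdint.h>

/* One ATT/BOOST start pair; each is a 4-bit value. */
typedef struct {
    uint8_t att;
    uint8_t boost;
} rxeq_start_t;

/* Extract the 4-bit field whose least significant bit is at position lsb. */
static uint8_t field4(uint32_t reg, unsigned lsb)
{
    return (uint8_t)((reg >> lsb) & 0xFu);
}

/* ASSUMED layout of LANEx_084: rate1 ATT = bits 19:16, rate1 BOOST = 23:20,
 * rate2 ATT = bits 27:24, rate2 BOOST = 31:28. */
static rxeq_start_t rate1_start(uint32_t lane_084)
{
    rxeq_start_t s = { field4(lane_084, 16), field4(lane_084, 20) };
    return s;
}

static rxeq_start_t rate2_start(uint32_t lane_084)
{
    rxeq_start_t s = { field4(lane_084, 24), field4(lane_084, 28) };
    return s;
}

/* ASSUMED layout of LANEx_08C: rate3 ATT = bits 11:8, rate3 BOOST = 15:12. */
static rxeq_start_t rate3_start(uint32_t lane_08c)
{
    rxeq_start_t s = { field4(lane_08c, 8), field4(lane_08c, 12) };
    return s;
}
```

Applied to 0x2d0f0385 and 0x20404820, these positions reproduce the field values listed in the dump (rate 1: ATT 0xf, BOOST 0x0; rate 2: ATT 0xd, BOOST 0x2; rate 3: ATT 0x8, BOOST 0x4), which is the only evidence for the layout used here.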

  • Ziad,

    Sorry for the late reply. I checked with the Serdes code development team. In the diagnostics, pdk_k2hk_4_0_13\packages\ti\diag\serdes_diag\serdes_diag.h:

    static inline SERDES_DIAG_STAT Serdes_Diag_BERTest(const SERDES_DIAG_BER_INIT_T *ber_params_init,
                                                       SERDES_DIAG_BER_VAL_T *pBERval,
                                                       CSL_SERDES_TAP_OFFSETS_T *pTapOffsets)
    {
        .....
        if(ber_params_init->phy_type != SERDES_PCIe) {
            .....
            stat = (CSL_SerdesReadSelectedTbus(ber_params_init->base_addr, lane_num+1, 0x2) & 0x020)>>5;
        ...}
        else {
            ....
            stat = (CSL_SerdesReadSelectedTbus(ber_params_init->base_addr, lane_num+1, 0x2) & 0x008)>>3;
        ...}


    This is the correct implementation: the PCIe case checks bit 3 (mask 0x008) instead of bit 5 (mask 0x020).

    However, in the CSL code, pdk_k2hk_4_0_13\packages\ti\csl\src\ip\serdes_sb\V0\csl_serdes2.h,

    CSL_SerdesWaitForRXValidPerLane() {
        stat = (CSL_SerdesReadSelectedTbus(base_addr, lane_num+1, 0x2) & 0x020)>>5;
    }

    This code doesn't check the Serdes interface type and is wrong for PCIe.

    So please refer to pdk_k2hk_4_0_13\packages\ti\diag\serdes_diag\serdes_diag.h for this. Sorry for the issue.

    Regards, Eric
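
For anyone scripting their own check, as was done earlier in this thread, the difference described in the last reply can be sketched as a wait routine that picks the TBUS mask from the PHY type and reports whether RX valid was actually seen. This is a minimal sketch, not the CSL implementation: the enum, the function names and the function-pointer indirection are hypothetical, and the stub stands in for CSL_SerdesReadSelectedTbus() so the logic can be exercised off-target. Only the masks (0x020 for non-PCIe, 0x008 for PCIe), the lane_num+1 select and the 0x2 offset come from the code quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PHY selector; the diag code quoted above tests phy_type
 * against SERDES_PCIe instead. */
typedef enum { PHY_NON_PCIE = 0, PHY_PCIE = 1 } phy_kind_t;

/* Stand-in for the TBUS read so the sketch does not depend on the CSL.
 * The parameter types are an assumption. */
typedef uint32_t (*tbus_read_fn)(uint32_t base_addr, int32_t select, int32_t offset);

/* Masks taken from the serdes_diag.h excerpt: bit 5 normally, bit 3 for PCIe. */
static uint32_t rx_valid_mask(phy_kind_t phy)
{
    return (phy == PHY_PCIE) ? 0x008u : 0x020u;
}

/* Poll until the RX valid bit is set or the budget runs out.
 * Polls at most timeout + 1 times; returns true only if the bit was seen. */
static bool wait_for_rx_valid(tbus_read_fn read_tbus, uint32_t base_addr,
                              uint8_t lane_num, phy_kind_t phy, uint32_t timeout)
{
    const uint32_t mask = rx_valid_mask(phy);

    do {
        if ((read_tbus(base_addr, lane_num + 1, 0x2) & mask) != 0u) {
            return true;
        }
    } while (timeout-- != 0u);

    return false;
}

/* Demo stub: a lane that only ever reports bit 3, as a PCIe lane would
 * according to the excerpt above. */
static uint32_t fake_tbus_bit3_only(uint32_t base_addr, int32_t select, int32_t offset)
{
    (void)base_addr;
    (void)select;
    (void)offset;
    return 0x008u;
}
```

With the stub, polling with the PCIe mask succeeds on the first read, while polling with the 0x020 mask times out, which matches the symptom reported earlier in the thread. Returning a status also lets the caller distinguish a timeout from success, which the void CSL function quoted in the original post does not.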