Linux/AM3352: DDR3 software leveling issue

Part Number: AM3352

Tool/software: Linux

DDR3 Part: H5TQ4G63EFR-RDC (256M x 16)

Trace Lengths (inches):

DDR_CK trace - Byte 0 : 0.635335  Byte 1 : 0.635335

DDR_DQSx trace - Byte 0 : 1.100645  Byte 1 : 1.09342

Seed values received from the RatioSeed spreadsheet (hex):

RD_DQS : 0x40

FIFO_WE : 0xF4

WR_DQS : 0x77

With the seed values above, DDR3 software leveling does not work: every optimum value comes back as 0x0.

With the following different seed values, however, we do get optimum values.

Seed values: RD_DQS - 0x20, FIFO_WE - 0xE0, WR_DQS - 0xAF

Can anyone explain why the seed values calculated from the RatioSeed spreadsheet are not working?

  • Yashavadan, the RatioSeed spreadsheet should be choosing INVERT_CLKOUT=1. Be sure to use this value in the software leveling algorithm. Also, with this setting, PHY_CTRL_SLAVE_RATIO=0x100. Ensure this value is set in the GEL file you are using when running the software leveling algorithm.

    Regards,
    James
  • SK_Initialization.txt
    CortxA8: Output: **** AM3358_SK Initialization is in progress ..........
    CortxA8: Output: **** AM335x ALL PLL Config for OPP == OPP100 is in progress .........
    CortxA8: Output: Input Clock Read from SYSBOOT[15:14]: 25MHz
    CortxA8: Output: **** Going to Bypass...
    CortxA8: Output: **** Bypassed, changing values...
    CortxA8: Output: **** Locking ARM PLL
    CortxA8: Output: **** Core Bypassed
    CortxA8: Output: **** Now locking Core...
    CortxA8: Output: **** Core locked
    CortxA8: Output: **** DDR DPLL Bypassed
    CortxA8: Output: **** DDR DPLL Locked
    CortxA8: Output: **** Setting DDR PLL to 400MHz .........
    CortxA8: Output: **** PER DPLL Bypassed
    CortxA8: Output: **** PER DPLL Locked
    CortxA8: Output: **** DISP PLL Config is in progress ..........
    CortxA8: Output: **** DISP PLL Config is DONE ..........
    CortxA8: Output: **** AM335x ALL ADPLL Config for OPP == OPP100, 25MHz input is Done .........
    CortxA8: Output: **** AM335x DDR3 EMIF and PHY configuration is in progress...
    CortxA8: Output: EMIF PRCM is in progress .......
    CortxA8: Output: EMIF PRCM Done
    CortxA8: Output: DDR PHY Configuration in progress
    CortxA8: Output: Waiting for VTP Ready .......
    CortxA8: Output: VTP is Ready!
    CortxA8: Output: DDR PHY CMD0 Register configuration is in progress .......
    CortxA8: Output: DDR PHY CMD1 Register configuration is in progress .......
    CortxA8: Output: DDR PHY CMD2 Register configuration is in progress .......
    CortxA8: Output: DDR PHY DATA0 Register configuration is in progress .......
    CortxA8: Output: DDR PHY DATA1 Register configuration is in progress .......
    CortxA8: Output: Setting IO control registers.......
    CortxA8: Output: EMIF Timing register configuration is in progress .......
    CortxA8: Output: PHY is READY!!
    CortxA8: Output: DDR PHY Configuration done
    CortxA8: Output: EMIF Timing register configuration is done .......
    CortxA8: Output: CMD_PHY_CTRL_SLAVE_RATIO: 0x00000100
    CortxA8: Output: CMD_PHY_INVERT_CLKOUT: 0x00000001
    CortxA8: Output: DDR_IOCTRL_VALUE: 0x0000018B
    CortxA8: Output: ALLOPP_DDR3_READ_LATENCY: 0x00000007
    CortxA8: Output: ALLOPP_DDR3_SDRAM_TIMING1: 0x0AAAD4EB
    CortxA8: Output: ALLOPP_DDR3_SDRAM_TIMING2: 0x26657FDA
    CortxA8: Output: ALLOPP_DDR3_SDRAM_TIMING3: 0x501F861F
    CortxA8: Output: ALLOPP_DDR3_REF_CTRL: 0x00000C30
    CortxA8: Output: ALLOPP_DDR3_ZQ_CONFIG: 0x50074BE4
    CortxA8: Output: ALLOPP_DDR3_SDRAM_CONFIG: 0x61C05332
    CortxA8: Output: **** AM3358_SK Initialization is Done ******************
    Result.txt
    [CortxA8]
    Enter the PHY_INVERT_CLKOUT value (0 or 1) from the spreadsheet
    1
    Enter the Seed RD_DQS_SLAVE_RATIO Value in Hex to search the RD DQS Ratio Window
    40
    Enter the Seed FIFO_WE_SLAVE_RATIO Value in Hex to search the RD DQS Gate Window
    f4
    Enter the Seed WR_DQS_SLAVE_RATIO Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    77
    ***************************************************************
    The Slave Ratio Search Program Values are...
    ***************************************************************
    PARAMETER MAX | MIN | OPTIMUM | RANGE
    ***************************************************************
    DATA_PHY_RD_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_FIFO_WE_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_WR_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_WR_DATA_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    ***************************************************************
    rd_dqs_range = 0
    fifo_we_range = 0
    wr_dqs_range = 0
    wr_data_range = 0
    Optimal values have been found!!
    ***************************************************************
    The Slave Ratio Search Program Values are...
    ***************************************************************
    PARAMETER MAX | MIN | OPTIMUM | RANGE
    ***************************************************************
    DATA_PHY_RD_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_FIFO_WE_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_WR_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    DATA_PHY_WR_DATA_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
    ***************************************************************
    ===== END OF TEST =====
    We have set INVERT_CLKOUT=1 and PHY_CTRL_SLAVE_RATIO=0x100 in the GEL file as well as in the software leveling program. I have attached the log files here for your reference.
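
The result log above prints "Optimal values have been found!!" even though every RANGE is 0x000, so the status message alone cannot be trusted. The following is a minimal Python sketch for flagging that case in a saved log. The function names are hypothetical, and treating RANGE == 0 as a failed search is an assumption drawn from this log, not documented behavior of the leveling program.

```python
import re

# Matches result rows such as:
#   DATA_PHY_RD_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
ROW = re.compile(
    r"^\s*(DATA_PHY_\w+)\s+(0x[0-9A-Fa-f]+)\s*\|\s*(0x[0-9A-Fa-f]+)"
    r"\s*\|\s*(0x[0-9A-Fa-f]+)\s*\|\s*(0x[0-9A-Fa-f]+)\s*$"
)


def parse_leveling_log(text):
    """Return {parameter: (max, min, optimum, range)} from the result table."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            values = tuple(int(v, 16) for v in match.groups()[1:])
            results[match.group(1)] = values
    return results


def failed_parameters(results):
    """List parameters whose search window is empty (assumed: RANGE == 0)."""
    return [name for name, (_, _, _, rng) in results.items() if rng == 0]


# Rows copied from Result.txt in this thread.
sample = """\
DATA_PHY_RD_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
DATA_PHY_FIFO_WE_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
DATA_PHY_WR_DQS_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
DATA_PHY_WR_DATA_SLAVE_RATIO 0x000 | 0x000 | 0x000 | 0x000
"""

for name in failed_parameters(parse_leveling_log(sample)):
    print("no valid window for", name)
```

For the log in this thread, all four parameters are reported as having no valid window.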

  • Can you snapshot the whole CCS window after the algorithm produces zeros? What version of CCS are you using?

    Regards,
    James
  • Hi,

    On the CPU PCB, our required impedance is 45 to 55 ohm,
    but the PCB manufacturer held 42 ohm on the inner routing layers (layers 3 and 6).
    On layers 3 and 6 we routed the following signals:
    Address/control lines: A0, A3, A8, A10, A11, A12, A13, A14, CKE, CS0, ODT, BA0, BA2, RAS, WE.
    Data lines: D0, D1, D3, D5, D7, D11, D12, D13, D14, DQM.
    Could this cause a problem with DDR leveling?
  • Yashavadan, I'm not sure if that is the issue. Since you have a point-to-point solution, it shouldn't be necessary to run the leveling algorithm; the seed values should suffice. Can you try using just the seed values?

    Regards,
    James
  • Hi,

    We have tried simply using the seed values derived from the spreadsheet. The DDR does not work if we use those seed values directly.

  • Yashavadan, looking back at your routing, I'm a little concerned that you have spread the routing within a byte across multiple layers. Typically, you would want to keep the routing within a byte on the same layers to maintain flight time. What does your stackup look like? What are your reference planes?

    Regards,
    James
  • Yashavadan, I think you will have issues since your signal routing and reference planes are not consistent. Note 4 on Table 7-60 says, "Reference planes are to be directly adjacent to the signal plane to minimize the size of the return current loop." You have some signals routed on two planes, while others are on only one. It is best to keep the number of vias consistent within a byte. Also, it is unclear what the reference plane is for layers 3 and 6. They both have a power and a ground plane adjacent, and the dielectric thickness seems to be the same, so either could be acting as the reference plane. In your stackup, the best routing layers would have been top or bottom (the only possible reference plane is ground).

    With that said, we can still try to get your design to work. You will have to perform your own stress tests to ensure stable operation. As stated before, you should be able to use the values straight out of the RatioSeed spreadsheet with INVERT_CLKOUT=1. If that is not working, we need to double check the timing spreadsheet values. Can you send that spreadsheet?

    Regards,
    James
  • Hi JJD,

    We have set INVERT_CLKOUT=1 in the spreadsheet and derived the seed values from it. I am attaching it here for your reference.

    We have already shared the CCS log in this thread. From it you can see which values were set in the DDR controller registers and which values were entered while running the software leveling utility.

    I am also sharing the DDR controller register calculation spreadsheet. We have used the H5TQ4G63EFR-RDC (256M x 16) DDR.

    RatioSeed_AM335x_boards_400.xls

    4743.AM335x_DDR_register_calc_tool.xls

  • Not sure if I was clear. I'm asking you to try the RatioSeed values straight in the GEL (do not run the software leveling algorithm), and see if the DDR initializes and is stable. You may also need to increase the READ_LATENCY value by 1. I took a quick look at the timing registers, and they look OK.

    Regards,
    James
  • Hi JJD,

    We put the RatioSeed values directly in U-Boot and it does not work. I assume putting the seed values directly in U-Boot is equivalent to putting them in the GEL file.

    READ_LATENCY = (CL+2) - 1 = (6+2) - 1 = 7

    We derived READ_LATENCY from the datasheet, and it should be 7. But if we increase the read latency by 1 (READ_LATENCY=8), software leveling starts working.

    If we increase the read latency, by what percentage will DDR throughput be affected?

  • The read latency is dependent on the CAS latency of the memory, so I'm not sure comparing DDR throughput at different read latencies makes sense, since the device will not work with lower read latencies. You have to add 1 to the read latency because the INVERT_CLKOUT setting basically shifts the clock by half a cycle. This allows for more timing margin on the DDR interface.

    Regards,
    James
  • Hi,

    1. Do we have to route D0 to D7 (a byte) within the same layer, or with the same adjacent reference layer?
    2. You suggested a common reference layer for a byte. Should DQS0 and DQS1 also have the same reference layer as their corresponding byte?

  • 1. Ideally, yes, bytes should be routed on the same layer to maintain the same reference. If you must route on different layers and multiple reference planes exist, you must place stitching vias close to the area where the signals change layers. Also, the number of vias should be the same. Basically, the data bits within a byte should take the same route between processor and memory.
    2. Yes, DQSx are in the same net class as DQx and DQMx. These are the highest speed signals, and care should be taken when routing them.

    Please refer to section 7.7.2.3 of the datasheet for specific design guidelines for DDR3. Also refer to our high speed layout guidelines app note: www.ti.com/.../spraar7

    Regards,
    James
  • Dear James,

    Thanks for your valuable reply. I have one more query regarding byte routing.

    1. If I route D0 to D7 on layer 1 and layer 3 with the same reference plane, layer 2 (ground), is that acceptable or not?

    Regards,

    Himanshu Bhoi

    CAD Engineer

    Matrix comsec  

  • Yes, that should be fine

    Regards,
    James
  • Dear James,
    Sorry for the inconvenience.
    Let me explain the above query in more detail.
    1. I have routed D0 to D3 on layer 1 and D4 to D7 on layer 3, but all of D0 to D7 have the same reference, layer 2. Is that acceptable or not?
    2. One more query regarding routing of the DQS signal: for the data byte above, I have routed the DQS signal on layer 3, with the same layer 2 reference. Is that acceptable or not?

    Regards,
    Himanshu Bhoi
    CAD Engineer
    Matrix comsec
  • Himanshu, keeping the same reference for DQ0-7 and the corresponding DQS is correct.
    However, you should try to maintain the same number of vias across the byte. The fact that you have half the byte on 2 different layers implies that the trace is routed on layer 1, then transitions to layer 3, and then back to layer 1 to connect to the memory. That means half the byte has 2 layer transitions (2 vias per trace), and the other half has no layer transitions. This will result in flight time differences between the nibbles, which will reduce your timing margin. It is best to keep the routing consistent within a byte.

    Regards,
    James
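
The via-count argument above can be checked mechanically once the layer path of each net is known. The following is a rough Python sketch under stated assumptions: the dictionary layout, the function names, and the layer paths are hypothetical, chosen to mirror the routing described in this thread (D0 to D3 on layer 1 only, D4 to D7 and DQS0 going from layer 1 to layer 3 and back to layer 1).

```python
def layer_transitions(path):
    """Count layer changes (vias) along a trace given its layer sequence."""
    return sum(1 for a, b in zip(path, path[1:]) if a != b)


def via_counts(byte_lane):
    """Map each net in a byte lane to its number of layer transitions."""
    return {net: layer_transitions(path) for net, path in byte_lane.items()}


def is_consistent(byte_lane):
    """True when every net in the byte lane uses the same number of vias."""
    return len(set(via_counts(byte_lane).values())) <= 1


# Hypothetical layer paths for byte 0, mirroring the description above.
byte0 = {
    "D0": [1], "D1": [1], "D2": [1], "D3": [1],
    "D4": [1, 3, 1], "D5": [1, 3, 1], "D6": [1, 3, 1], "D7": [1, 3, 1],
    "DQS0": [1, 3, 1],
}

print(via_counts(byte0))
print("consistent:", is_consistent(byte0))
```

With these assumed paths, one nibble has no vias and the other has two per trace, which is the flight-time mismatch described above.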
  • Dear James

    I have a query regarding the DDR3 clock and DQS lengths.

    The clock (CK+/-) length is 635 mils and the DQS (DQS+/-) length is 1100 mils. That means the address group is shorter than the data group,
    and the length difference is around 470 mils.
    Does this create any issue for DDR3 functionality (or for DDR3 leveling)?
  • This will not create an issue. There is no length matching requirement between Clock and DQS signals.

    Regards,
    James
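
For reference, the clock-to-DQS length difference can be worked out from the trace lengths given in the original post. This is a small Python sketch of that arithmetic only; the variable and function names are illustrative, and the inch-to-mil conversion (1 inch = 1000 mils) is the only fact added here.

```python
MILS_PER_INCH = 1000.0

# Trace lengths in inches, as listed in the original post.
ck_length_in = {"byte0": 0.635335, "byte1": 0.635335}
dqs_length_in = {"byte0": 1.100645, "byte1": 1.09342}


def dqs_minus_ck_mils(ck_in, dqs_in):
    """Return the DQS-minus-CK length difference in mils for each byte lane."""
    return {
        lane: (dqs_in[lane] - ck_in[lane]) * MILS_PER_INCH
        for lane in ck_in
    }


for lane, delta in dqs_minus_ck_mils(ck_length_in, dqs_length_in).items():
    print(f"{lane}: DQS is longer than CK by {delta:.1f} mils")
```

The differences come out to roughly 465 mils for byte 0 and 458 mils for byte 1, in line with the "around 470 mils" figure quoted above.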
  • Hi,

    As per the technical reference manual, the read latency to be configured in the register is (CL + 2) - 1. In our case we are running DDR3 at 400MHz, so it will be (6 + 2) - 1 = 7.

    But the link below states that if "PHY_INVERT_CLK_OUT"=1 then reg_read_latency should be ((CL + 2) - 1) + 1 = ((6 + 2) - 1) + 1 = 8, and if "PHY_INVERT_CLK_OUT"=0 then reg_read_latency should be (CL + 2) - 1 = (6 + 2) - 1 = 7.

    http://processors.wiki.ti.com/index.php/AM335x_EMIF_Configuration_tips#DDR_PHY_Registers_for_DDR3

    However, the technical reference manual contains no information saying that "PHY_INVERT_CLK_OUT" affects reg_read_latency. Please confirm the above, because on one of our boards DDR3 software leveling fails with reg_read_latency=7 and succeeds with reg_read_latency=8. Our DQS signal traces are longer than the clock traces.

  • Hi Yashavadan, yes, I can confirm that you need to add 1 to reg_read_latency when operating with invert_clkout=1. I will get that clarified in the TRM in the next revision.

    Regards,
    James
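
The read latency rule confirmed above can be written down as a small helper. This is a minimal Python sketch, not TI code: the function name and parameters are hypothetical, and it only encodes the formula quoted in this thread, (CL + 2) - 1, plus 1 when INVERT_CLKOUT=1.

```python
def reg_read_latency(cas_latency, invert_clkout):
    """Compute reg_read_latency from CAS latency and the INVERT_CLKOUT setting.

    Base value is (CL + 2) - 1; one extra cycle is added when
    INVERT_CLKOUT=1, as confirmed in this thread.
    """
    base = (cas_latency + 2) - 1
    return base + 1 if invert_clkout else base


# DDR3 at 400MHz with CL = 6, as used on the board discussed here.
print(reg_read_latency(6, invert_clkout=False))  # 7
print(reg_read_latency(6, invert_clkout=True))   # 8
```

With CL = 6 this gives 7 for INVERT_CLKOUT=0 and 8 for INVERT_CLKOUT=1, matching the value at which software leveling started working.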