DDR3 Leveling with x16 RAM ICs

Hi all,

The latest revision of our DM8169 design uses four x16 memory ICs (MT41K128M16JT-125).  Because of the nature of my layout, there may be too much variance in the byte lane lengths for the Code Composer leveling program to find values that suit all lanes.  (Note: All net classes are routed to within 1 mil, and the guidelines in the datasheet were followed in the design.)

EMIF 0 has the following lengths:

Trace Length (inches)
            Byte 0   Byte 1   Byte 2   Byte 3
CLK trace   2.803    2.803    2.332    2.332
DQS trace   1.537    1.32     1.348    1.142

EMIF 1 has the following lengths:

Trace Length (inches)
            Byte 0   Byte 1   Byte 2   Byte 3
CLK trace   2.887    2.887    2.42     2.42
DQS trace   1.675    1.46     1.493    1.35

As you can see, in both cases lanes 1 and 3 are quite a bit shorter than lanes 0 and 2.  I think the Excel spreadsheet's averaging of the per-lane values is producing seed values that land between the higher and lower ranges and therefore don't converge, particularly on EMIF 0 where the variance is most extreme.  I have found some values that work at 796.5 MHz, but the windows appear tight and I'm worried that the board will be unstable across the temperature range.

Do you have any suggestions on how I might get better seed values?  I've tried to get just one lane to pass at a time, and while that works somewhat, it appears to confuse the tool a bit.

Thanks,

Tate

  • Tate,
    Are you using Word Leveling or Byte Leveling? You need to be using Byte Leveling. Word Leveling is only applicable for older silicon versions and slow interfaces.
    Tom
  • Tom,

    I am using Byte Leveling.

    Thanks,

    Tate
  • Please post the results from the leveling program.
    Tom
  • Tom,

    Using the spreadsheet values (below), I get the following results for EMIF 0. Based on the spreadsheet, I'd expect the values for lanes 0 and 2 (especially for GATE) to be significantly higher than those for lanes 1 and 3, but I see similar ranges, and as a result I often get output like the run below. I've been tweaking the drive strengths up and down to see if that has any major effect, but it's difficult to correlate the results.

    See below. For EMIF 0, I've also had luck using 0x19D, 0x20, 0x57 for the seed values, and 0x120, 0x1B, 0x70 for EMIF 1.

    ===========================================================================
    FOR EMIF 0:

    Parameters
    DDR3 clock frequency 796.5 MHz
    Invert Clkout 1

    Trace Length (inches)
    Byte 0 Byte 1 Byte 2 Byte 3
    CLK trace 2.803 2.803 2.332 2.332
    DQS trace 1.537 1.32 1.348 1.142


    Seed values (per byte lane)
    WR DQS B5 BE A9 B2
    RD DQS 40 40 40 40
    RD DQS GATE 1B9 1A6 195 183

    Seed Values to input to program
    WR DQS B3
    RD DQS 40
    RD DQS GATE 19D


    Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    0
    DDR START ADDR=0x80000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x19D

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xB3
    RD DQS GATE RATIO MINIMUM VALUE DIDN'T CONVERGE
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 3f 3f 3e 3d
    Read DQS MIN 39 8 39 34
    Read DQS OPT 3c 23 3b 38
    *********************************************************
    Read DQS GATE MAX 174 1ab 175 1af
    Read DQS GATE MIN 15b 14f 16b 159
    Read DQS GATE OPT 167 17d 170 184
    *********************************************************
    Write DQS MAX 89 92 90 a2
    Write DQS MIN 83 85 aa 82
    Write DQS OPT 86 8b 8000009d 92

    ===== END OF TEST =====

    ===========================================================================
    For EMIF 1:

    Parameters
    DDR3 clock frequency 796.5 MHz
    Invert Clkout 1

    Trace Length (inches)
    Byte 0 Byte 1 Byte 2 Byte 3
    CLK trace 2.887 2.887 2.42 2.42
    DQS trace 1.675 1.46 1.493 1.35


    Seed values (per byte lane)
    WR DQS B3 BC A7 AD
    RD DQS 40 40 40 40
    RD DQS GATE 1C8 1B6 1A5 199

    Seed Values to input to program
    WR DQS B0
    RD DQS 40
    RD DQS GATE 1AF



    Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    1
    DDR START ADDR=0xc0000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x1AF

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xB0
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 72 6c 43 42
    Read DQS MIN c 7 37 26
    Read DQS OPT 3f 39 3d 34
    *********************************************************
    Read DQS GATE MAX 1f6 1ff 188 189
    Read DQS GATE MIN a5 b2 17f 17f
    Read DQS GATE OPT 14d 158 183 184
    *********************************************************
    Write DQS MAX 119 11b b2 b3
    Write DQS MIN 37 43 ab 9f
    Write DQS OPT a8 af ae a9

    ===== END OF TEST =====
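
The "Seed Values to input to program" quoted above are consistent with a truncated mean of the four per-lane seeds. The spreadsheet's actual formula is not shown in this thread, so the sketch below is only an assumption that happens to reproduce the quoted numbers; the function names are made up for illustration. It also prints the per-lane spread, which shows how far a single averaged seed sits from the extreme lanes.

```python
# Per-lane seed values copied from the spreadsheet output quoted in the post above.
PER_LANE_SEEDS = {
    "EMIF 0": {
        "WR DQS": [0xB5, 0xBE, 0xA9, 0xB2],
        "RD DQS": [0x40, 0x40, 0x40, 0x40],
        "RD DQS GATE": [0x1B9, 0x1A6, 0x195, 0x183],
    },
    "EMIF 1": {
        "WR DQS": [0xB3, 0xBC, 0xA7, 0xAD],
        "RD DQS": [0x40, 0x40, 0x40, 0x40],
        "RD DQS GATE": [0x1C8, 0x1B6, 0x1A5, 0x199],
    },
}


def averaged_seed(per_lane):
    """Truncated mean of the per-lane seeds (assumed, not the spreadsheet's formula)."""
    return sum(per_lane) // len(per_lane)


def lane_spread(per_lane):
    """Difference between the largest and smallest per-lane seed."""
    return max(per_lane) - min(per_lane)


for emif, seeds in PER_LANE_SEEDS.items():
    for name, lanes in seeds.items():
        print(f"{emif} {name}: seed 0x{averaged_seed(lanes):X}, "
              f"lane spread 0x{lane_spread(lanes):X}")
```

For EMIF 0 the RD DQS GATE estimates span 0x36 ratio units from lane 0 to lane 3, so the averaged seed is roughly 0x1B away from either extreme.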
  • Please also provide your report showing that the length-matching rules have been met, for example using the spreadsheet tool at this link:
    http://processors.wiki.ti.com/index.php/File:DDR3_PCB_ConformanceV8.zip
    Please also post your completed ratio seed spreadsheet from this link:
    http://processors.wiki.ti.com/index.php/File:RatioSeed.zip
    Tom
  • Drive strength settings are not related to leveling seed values. Please keep the strength set to the nominal value.
    Tom
  • The RD DQS GATE value is proportional to the round-trip delay (CLK + DQS) for the byte lane. Byte lanes 0 and 1 have the longest round-trip length.
    Since you are not getting convergence with the RD DQS GATE seed for EMIF 0, try using a smaller seed value such as 0x150.
    Tom
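
The round-trip relationship described in the reply above can be checked against the posted trace lengths. This is a minimal sketch: the lengths are copied from the original post, while the 180 ps/in propagation delay and the function names are assumptions for illustration only. It does not reproduce the RatioSeed spreadsheet; it only ranks the lanes by CLK + DQS length.

```python
# Trace lengths in inches, copied from the tables in the original post.
TRACE_LENGTHS_IN = {
    "EMIF 0": {"CLK": [2.803, 2.803, 2.332, 2.332],
               "DQS": [1.537, 1.32, 1.348, 1.142]},
    "EMIF 1": {"CLK": [2.887, 2.887, 2.42, 2.42],
               "DQS": [1.675, 1.46, 1.493, 1.35]},
}

# Assumed propagation delay in ps per inch; substitute the figure for your stackup.
PS_PER_INCH = 180.0


def round_trip_inches(clk, dqs):
    """Return the CLK + DQS length per byte lane, in inches."""
    return [c + d for c, d in zip(clk, dqs)]


def lanes_longest_first(lengths):
    """Return byte-lane indices ordered from longest to shortest round trip."""
    return sorted(range(len(lengths)), key=lambda lane: lengths[lane], reverse=True)


for emif, traces in TRACE_LENGTHS_IN.items():
    totals = round_trip_inches(traces["CLK"], traces["DQS"])
    for lane in lanes_longest_first(totals):
        delay_ps = totals[lane] * PS_PER_INCH
        print(f"{emif} byte {lane}: {totals[lane]:.3f} in, about {delay_ps:.0f} ps")
```

On both EMIFs the ordering comes out as lanes 0, 1, 2, 3 from longest to shortest, which matches the ordering of the per-lane RD DQS GATE estimates in the spreadsheet output.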
  • Thanks Tom. That turned up something interesting.

    I've spent some time trying seed values. I'll get a pass, then try the same values again and get a non-convergence. I've tried decreasing the values more and more, and there's quite a wide range (even RD DQS GATE = ~0x80) where I'll see a match. My flight times (per net class, from ball to ball) are matched right on the money, and our HyperLynx simulation shows nice wide eyes and clean waveforms. Once I get passing values, I would expect those to work every time, right? Something peculiar is that two back-to-back runs with identical seeds produce very different, but passing, results. Any thoughts on this?


    Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    0
    DDR START ADDR=0x80000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x167

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xA4
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 39 37 38 39
    Read DQS MIN 33 6 30 2c
    Read DQS OPT 36 1e 34 32
    *********************************************************
    Read DQS GATE MAX 16f 16b 173 16c
    Read DQS GATE MIN 154 168 163 164
    Read DQS GATE OPT 161 169 16b 168
    *********************************************************
    Write DQS MAX d4 a7 c3 a8
    Write DQS MIN 9f 92 9c 98
    Write DQS OPT b9 9c af a0

    ===== END OF TEST =====
    Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    0
    DDR START ADDR=0x80000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x167

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xA4
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 42 46 44 43
    Read DQS MIN 39 31 3e 39
    Read DQS OPT 3d 3b 41 3e
    *********************************************************
    Read DQS GATE MAX 111 10d 10f 10b
    Read DQS GATE MIN 107 102 103 102
    Read DQS GATE OPT 10c 107 109 106
    *********************************************************
    Write DQS MAX 96 90 b0 a2
    Write DQS MIN 8d 85 89 78
    Write DQS OPT 91 8a 9c 8d

    ===== END OF TEST =====
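
One way to compare the two runs above is by window width rather than by the OPT value alone. In the quoted output, each OPT value is consistent with the truncated midpoint of MIN and MAX; that is an observation from the numbers, not something taken from the leveling program's source. The sketch below, with hypothetical names, computes width and midpoint for the RD DQS GATE rows of both runs.

```python
# RD DQS GATE MAX/MIN rows copied from the two runs above (both seeded with 0x167).
GATE_WINDOWS = {
    "run 1": {"max": [0x16F, 0x16B, 0x173, 0x16C],
              "min": [0x154, 0x168, 0x163, 0x164]},
    "run 2": {"max": [0x111, 0x10D, 0x10F, 0x10B],
              "min": [0x107, 0x102, 0x103, 0x102]},
}


def window(lo, hi):
    """Return (width, truncated midpoint) of a passing window in ratio units."""
    return hi - lo, (lo + hi) // 2


for run, rows in GATE_WINDOWS.items():
    for lane, (lo, hi) in enumerate(zip(rows["min"], rows["max"])):
        width, mid = window(lo, hi)
        print(f"{run} byte {lane}: width 0x{width:X}, midpoint 0x{mid:X}")
```

In run 1, byte 1 has a window only 3 units wide (0x168 to 0x16B), and none of the windows in run 2 exceeds 0xC units, which illustrates why the results look tight.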
  • Tate,
    The leveling algorithm searches for the edges of the sample windows that provide correct results. Jitter on the signals due to clock jitter, crosstalk, ISI, SSO and power supply noise will cause the window edges to move from run to run. Occasionally the search algorithm fails and produces bad results. The results shown above are invalid. The READ DQS value should always be between 0x30 and 0x40. The run above yielded a result of 0x1E. This result should be discarded.
    Tom
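
The sanity rule in the reply above can be applied mechanically to each run. This is a minimal sketch with hypothetical names: the 0x30 and 0x40 bounds come from the reply, while treating them as inclusive is an assumption.

```python
# Bounds for a plausible READ DQS result, as stated in the reply above.
READ_DQS_LOW = 0x30
READ_DQS_HIGH = 0x40  # treated as inclusive here, which is an assumption


def out_of_range_lanes(opt_values, low=READ_DQS_LOW, high=READ_DQS_HIGH):
    """Return the byte-lane indices whose READ DQS OPT value falls outside the bounds."""
    return [lane for lane, value in enumerate(opt_values)
            if not low <= value <= high]


# Read DQS OPT row from the first of the two runs seeded with 0x167.
first_run_opt = [0x36, 0x1E, 0x34, 0x32]

bad = out_of_range_lanes(first_run_opt)
if bad:
    print("Discard this run; READ DQS out of range on byte lanes:", bad)
else:
    print("READ DQS values are within the expected range.")
```

Byte lane 1 (0x1E) is the only lane flagged, which is the value called out in the reply.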
  • Tate,
    I also see that the 2nd run produced RD DQS GATE values that are too low. It is also not right. That is why I pointed you back to the RatioSeed spreadsheet. It estimates the seed values based on routed lengths. If the software results are vastly different from the estimates on rows 12-14, then the results are questionable. Please attach your completed RatioSeed sheets.
    Tom
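
The comparison against the spreadsheet estimates could be scripted along these lines. The reply does not define "vastly different", so the 0x60 tolerance below is purely illustrative and the names are hypothetical; the estimates and results are copied from earlier posts in this thread.

```python
# Per-lane RD DQS GATE estimates for EMIF 0 from the spreadsheet output quoted earlier.
GATE_ESTIMATES = [0x1B9, 0x1A6, 0x195, 0x183]

# RD DQS GATE OPT rows from the two back-to-back runs seeded with 0x167.
GATE_RESULTS = {
    "run 1": [0x161, 0x169, 0x16B, 0x168],
    "run 2": [0x10C, 0x107, 0x109, 0x106],
}

# Illustrative tolerance in ratio units; not a value given anywhere in this thread.
MAX_DEVIATION = 0x60


def questionable_lanes(estimates, results, tolerance=MAX_DEVIATION):
    """Return lanes whose result differs from the estimate by more than the tolerance."""
    return [lane for lane, (est, res) in enumerate(zip(estimates, results))
            if abs(est - res) > tolerance]


for run, results in GATE_RESULTS.items():
    lanes = questionable_lanes(GATE_ESTIMATES, results)
    print(f"{run}: questionable lanes {lanes}")
```

With this tolerance, run 1 passes the GATE comparison on every lane and run 2 is flagged on all four, which is in line with the observation that the second run's RD DQS GATE values are too low.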
  • Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    0
    DDR START ADDR=0x80000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x17F

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xAC
    RD DQS GATE RATIO MINIMUM VALUE DIDN'T CONVERGE
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 48 4b 46 4b
    Read DQS MIN 3f 3d 37 3f
    Read DQS OPT 43 44 3e 45
    *********************************************************
    Read DQS GATE MAX f6 f2 f4 185
    Read DQS GATE MIN eb e7 e8 e4
    Read DQS GATE OPT f0 ec ee 134
    *********************************************************
    Write DQS MAX fa b0 b4 bd
    Write DQS MIN ae aa a0 ab
    Write DQS OPT d4 ad aa b4

    ===== END OF TEST =====
    Enter 0 for DDR Controller 0 & 1 for DDR Controller 1
    0
    DDR START ADDR=0x80000000

    Enter the Seed Read DQS Gate Ratio Value in Hex to search the RD DQS Gate Window
    0x17F

    Enter the Seed Read DQS Ratio Value in Hex to search the RD DQS Ratio Window
    0x40

    Enter the Seed Write DQS Ratio Value in Hex to search the Write DQS Ratio Window
    0xAC
    RD DQS RATIO MINIMUM VALUE DIDN'T CONVERGE
    RD DQS RATIO MINIMUM VALUE DIDN'T CONVERGE
    RD DQS RATIO MINIMUM VALUE DIDN'T CONVERGE
    RD DQS RATIO MINIMUM VALUE DIDN'T CONVERGE
    RD DQS GATE RATIO MINIMUM VALUE DIDN'T CONVERGE
    *********************************************************
    Byte level Slave Ratio Search Program Values
    *********************************************************
    BYTE0 BYTE1 BYTE2 BYTE3
    *********************************************************
    Read DQS MAX 20000050 20000036 20000038 2000003a
    Read DQS MIN 3e 2b 3f 35
    Read DQS OPT 10000047 10000030 1000003b 10000037
    *********************************************************
    Read DQS GATE MAX 17f 181 180 180
    Read DQS GATE MIN 179 169 17b 169
    Read DQS GATE OPT 17c 175 17d 174
    *********************************************************
    Write DQS MAX b3 b3 b3 b7
    Write DQS MIN a7 9f a3 97
    Write DQS OPT ad a9 ab a7
  • Tate,
    I got the RatioSeed sheets. You might want to check the 180ps/in value for your PCB substrate. A value like 165 may be more accurate. This is calculated from the dielectric constant of the PCB. However, this is minor and not required.
    As you have in your RatioSeed sheets, the READ DQS and WRITE DQS values from the SW leveling program should be similar. If not, then the algorithm did not converge. In general, the READ DQS GATE estimates are large and the result should be somewhat smaller.
    Tom
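
The link between dielectric constant and the ps/in figure mentioned above can be sketched with the standard relation for a trace fully surrounded by dielectric (stripline), where the delay per unit length is sqrt(er) divided by the speed of light. The function names are hypothetical, and an outer-layer microstrip trace sees a lower effective dielectric constant than the bulk material, so treat the output as an estimate.

```python
import math

# Speed of light in inches per picosecond (299,792,458 m/s, 0.0254 m per inch).
C_INCHES_PER_PS = 299_792_458 / 0.0254 * 1e-12


def delay_ps_per_inch(er_eff):
    """Propagation delay in ps/in for an effective dielectric constant er_eff."""
    return math.sqrt(er_eff) / C_INCHES_PER_PS


def er_eff_for_delay(ps_per_inch):
    """Effective dielectric constant implied by a given delay in ps/in."""
    return (ps_per_inch * C_INCHES_PER_PS) ** 2


print(f"er = 4.5 gives {delay_ps_per_inch(4.5):.1f} ps/in")
print(f"165 ps/in implies er of about {er_eff_for_delay(165):.2f}")
print(f"180 ps/in implies er of about {er_eff_for_delay(180):.2f}")
```

Under this relation, 180 ps/in corresponds to an effective dielectric constant of roughly 4.5 and 165 ps/in to roughly 3.8.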
  • Tom,

    Thanks again. Still working on this.

    It's been suggested that my data/DQS traces might be too short in comparison to CLK/control. For example, data lane 1 of EMIF 0 is 47% of the length of CLK, although nothing in the layout guidelines suggested that was an issue. Can you see any reason why that might pose a problem? I would assume it would just require a little more delay. I'm trying to rule out everything I can.

    I can get passing values, but I'd like to see them pass consistently rather than hit or miss.

    Tate
  • Tate,
    That is not valid. You do not match the length between the fly-by routes (Clock, Address, Command and Control) and the Data Group nets. In general, the fly-by routes to each SDRAM will always be longer than the data group nets.
    Tom
  • Tom,

    After eliminating several other factors and getting some boards that we are confident are properly manufactured (or in some cases have been properly reworked to overcome manufacturing issues) I have found the following and I would like your thoughts...

    We originally were using Micron MT41K128M16JT-125 (D9PTK marking) and having mixed results. On one board (that worked very poorly with this DDR3) we replaced all ICs with Samsung K4B2G1646E-BCK0. This board works spectacularly at all frequencies and at a wide range of temperatures. The leveling data comes back with very wide windows with the READ_DQS windows right around 1/4 cycle (0x3a-0x3f). In fact, all of the optimal results returned by the search program look almost exactly like the spreadsheet calculated values.

    I'm wondering if you (or anyone else) might have some insight on what I'm seeing. (We also tried the prior die Micron MT41J128M16JT-125 (D9PSL) with similar mixed results). I'm looking through the datasheets, but not seeing much that jumps out.

    www.samsung.com/.../ds_k4b2g1646e_rev121-0.pdf
    www.micron.com/.../2gb_1_35v_ddr3l.pdf

    Hoping someone who's seen something like this will have some useful ideas.

    Any thoughts?

    Thanks in advance,

    Tate
  • This may end up being a duplicate post, but after submission the previous post didn't appear to have been successful... there was a dialog for about 30 milliseconds that seemed to suggest it may require moderators approval... didn't get a very good look at it. My apologies if this is the case. (I'm guessing the links to the datasheets may have disqualified it... so I'll leave those out.) (Will copy the post from now on before I attempt to submit it!)

    Tom,

    After eliminating some assembly issues and doing some rework I discovered that Samsung K4B2G1646E-BCK0 seems to work at all frequencies and a wide range of temperatures. We originally assembled with Micron MT41K128M16JT-125 with mixed results. Some lanes work well, some don't. (Different lanes on every board...) We also used the previous J-die Micron (MT41J128M16JT-125), without great results.

    To note, the board with Samsung originally had Micron (K-die) on it and would not work even at 400 MHz on two or so lanes. With Samsung, the search program returns values almost exactly in line with the Excel spreadsheet. The windows are really wide and the READ_DQS values are right around the 0x40 area (1/4 cycle) where they should be. (Like I said, I think we are confident we have eliminated assembly issues and are just down to the DDR3 IC model.)

    I've compared the datasheets, but nothing is jumping out. My question is does this suggest anything to you? Any ideas why the Samsung might work so spectacularly and the Micron so poorly?

    Hoping someone who's seen something like this might have some insight on what I'm seeing.

    Thanks in advance,

    Tate
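
Since 0x40 is described above as a quarter cycle, the ratio values in this thread can be converted to time by treating 0x100 as one full DDR clock period. That scaling is inferred from the post rather than taken from TI documentation, and the function name is hypothetical.

```python
# Ratio units per DDR clock cycle, inferred from "0x40 = 1/4 cycle" in the post above.
RATIO_UNITS_PER_CYCLE = 0x100


def ratio_to_ps(ratio, clock_mhz):
    """Convert a slave ratio value to picoseconds at the given DDR clock frequency."""
    period_ps = 1e6 / clock_mhz
    return ratio / RATIO_UNITS_PER_CYCLE * period_ps


CLOCK_MHZ = 796.5  # DDR3 clock frequency used in the leveling runs in this thread

for label, ratio in [("quarter cycle", 0x40),
                     ("READ_DQS 0x3A", 0x3A),
                     ("READ_DQS 0x3F", 0x3F)]:
    print(f"{label}: 0x{ratio:X} is about {ratio_to_ps(ratio, CLOCK_MHZ):.1f} ps")
```

At 796.5 MHz one ratio unit is a little under 5 ps under this assumption, so a window a few units wide corresponds to only tens of picoseconds of margin.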
  • Tate,

    This thread was dealing with instability in the results from the software leveling application.  You are now talking about functional behavior of different memories.  Are you indicating that you get completely different results from the software leveling program when you use the different memories?  The intent is to obtain a set of optimized values which you then program into U-Boot.  You mentioned getting values with one memory type that matched expectations.  Did you try that set of values for all memory types?

    Tom