66AK2H06 DDR3 Read Leveling

Hi,

I have questions about the 66AK2H06 DDR3 memory controller.

The problem is that the value of the DX6LCDLR2 register is not constant.
Sometimes the value is 0x57, but sometimes it is 0x25.

I have the following questions:
Q1.
Is the value of the DX6LCDLR2 register the result of read DQS gate training?

Q2.
If the answer to Q1 is yes, I want to check the signals during the training.
Is it possible to start only the read DQS gate training by setting the QSGATE bit of the PHY Initialization Register?

Q3.
As I mentioned above, the value of the DX6LCDLR2 register is fluctuating.
Does this mean that the time between issuing a read command and receiving the response data is fluctuating?

best regards,
g.f.

  • Hi g.f.,

    I've forwarded this to the DDR3 experts. Their feedback should be posted here.

    BR
    Tsvetolin Shulev
  • gf,

    The delay line registers are not deterministic.  They may be different after every initialization.  Variations are due to temperature, voltage and instantaneous phase offsets due to jitter on the CLK and the DQS pairs.  They will also differ from chip to chip.  That being said, the values read should be similar between initialization trials on the same board.
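To judge whether readings are "similar between initialization trials", it can help to log the register after each initialization and look at the spread. The following is a minimal sketch, not a TI tool: `summarize_delay_codes` is a hypothetical helper, and it treats the two values reported in the question as plain integers without decoding any register fields (the field layout should be checked against the PHY documentation).

```python
def summarize_delay_codes(codes):
    """Return (min, max, spread) for values logged after repeated initializations."""
    if not codes:
        raise ValueError("need at least one reading")
    lo, hi = min(codes), max(codes)
    return lo, hi, hi - lo


# The two DX6LCDLR2 readings reported in the question, treated as plain integers.
readings = [0x57, 0x25]
lo, hi, spread = summarize_delay_codes(readings)
print(f"min=0x{lo:02X} max=0x{hi:02X} spread={spread}")
```

With more trials logged per board, the same summary can be compared across boards and temperatures.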

    You should not be focused on the value of the delay line registers.  The focus needs to be on verifying that the layout guidelines are met and that the initialization sequence is compliant.  Please verify that the initialization sequence matches the Keystone II DDR3 Initialization Application Report (SPRABX7) available at www.ti.com/lit/pdf/sprabx7.  Also, please provide a completed REG_CALC spreadsheet from http://www.ti.com/lit/zip/sprabx7.  The layout must be compliant to the DDR3 Design Requirements for KeyStone Devices Application Report (SPRABI1) available at www.ti.com/.../sprabi1b.

    Please provide details about the failures observed:

    1. How often do they fail?
    2. How many boards have been manufactured?
    3. Do they all fail at the same rate?
    4. Is there anything different between the passing and failing boards?
    5. What are the symptoms of the failure?
    6. Can the failure be made better or worse by changing the board temperature?

    Tom

  • The last link is broken and should be www.ti.com/lit/sprabi1.
    Tom
  • gf,

    Please see the following link for a length matching rules template.  e2e.ti.com/.../452232.  A completed template needs to be provided.  A good discussion of the length matching requirements is available at e2e.ti.com/.../394547.

    Tom

  • Hi Tom,

    Thank you for the reply, and sorry for the delay.

    I'm asking my customer to provide their REG_CALC spreadsheet
    and the length of each line (ADDR/CMD/CLK/DQ/DQS), so please wait for a while.

    Here are answers to the questions I can answer so far.

    >2.How many boards have been manufactured?
    >3.Do they all fail at the same rate?
    They checked 3 boards and all of them fail, but at different rates.
    There is no difference between these 3 boards.

    >5.What are the symptoms of the failure?
    A bit error occurs.
    On a read access to DDR3, a specific bit at a specific address
    should be 1 (correct value) but is read as 0 (mistaken value).
    Ex)
    Address: 0x80005A04
    DQ     : 0xE2833001 (Correct Value)
    DQ     : 0xE2033001 (Mistaken Value)
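
The failing bit and its byte lane can be worked out from these two values with an XOR. This is a minimal sketch under stated assumptions: `flipped_bits` and `byte_lane` are hypothetical helpers, and the lane mapping assumes a 64-bit data bus with a linear little-endian byte-to-lane mapping, neither of which is confirmed in this thread.

```python
def flipped_bits(expected, actual, width=32):
    """Return the bit positions that differ between two words."""
    diff = (expected ^ actual) & ((1 << width) - 1)
    return [bit for bit in range(width) if (diff >> bit) & 1]


def byte_lane(address, bit, bus_bytes=8):
    """Map a bit of the word at `address` to a byte lane.

    Assumption: little-endian, linear mapping on a bus of `bus_bytes` bytes.
    """
    return (address % bus_bytes) + bit // 8


address = 0x80005A04
bits = flipped_bits(0xE2833001, 0xE2033001)
lanes = [byte_lane(address, b) for b in bits]
print(bits, lanes)  # prints: [23] [6]
```

Under these assumptions the error is in bit 23 of the word, which maps to byte lane 6. That would be the same lane as the DX6LCDLR2 register in the original question, but the mapping should be verified against the actual board before drawing conclusions.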

    The bit error occurs only on the memory device farthest away in the fly-by topology.
    Please see the attached file.

    DSP_DDR3_bit_error.pdf

    By the way, I can't open the E2E posts that you linked.
    When I try to open them, the following error occurs:
    "Server Error in '/' Application.
     Runtime Error
    "
    best regards,
    g.f.

  • g.f.

    I apologize about the links.  For some reason the editor is including the "." at the end in the link.  The links are good if this is removed.  I have pasted the links again below.  (I can no longer edit my previous post to fix the links.)

    Please see the following link for a length matching rules template: https://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/452232

    A good discussion of the length matching requirements is available at: https://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/394547

    Tom

  • g.f.

    The single bit error on a single SDRAM at a single location when running an application is usually indicative of a data-dependent error.  The problem may vanish or move to a different memory location if the code is recompiled causing the memory allocation to change.  It may also get more or less frequent if the chip is heated or cooled.  I would not normally associate this with a leveling issue.  Instead, this appears to be due to signal degradation such as from crosstalk or reflections.  Please validate that the trace length rules are met and that the trace spacing rules are met and that the traces have proper reference planes without splits or gaps.

    Tom

  • Hi Tom,

    Thank you for the reply.

    So, do you mean that discussing the leveling values is not meaningful here?

    My customer is asking about heating and cooling the chip.
    Q1.
    What should be heated and cooled? Is it the board or the 66AK2H06 chip?

    If it is the 66AK2H06 chip, they have a heat sink mounted on it,
    so is it okay to heat and cool the heat sink?

    Q2.
    If the bit errors become more or less frequent with temperature changes,
    what would be considered the cause?

    By the way, I got the DDR3 calculation spreadsheet from my customer.
    I have attached it to this post, so could you please take a look?
    They are using "MT41K256M16 125 (1600)" DDR3 memory.

    K2 DDR3 Register Calc v1p60.xlsx

    best regards,
    g.f.

  • g.f.,

    Q1: The heating or cooling refers to the 66AK2H06 chip.  This can be accomplished by either adding or removing some airflow.  The device has significant intrinsic heating, so no additional heating is needed.  This will cause the chip to run at a higher or lower case and junction temperature.  Note that the device must not be allowed to heat up beyond the maximum rated operating temperature or damage may result.

    Q2: The problem being described is a marginality.  It only occurs with some combinations of signal states on the DDR interface.  DDR signal rise time is a function of buffer drive strength.  The gain and impedance of transistors in the IO cells will change as the buffer is operated at higher or lower temperatures.  This will normally increase or decrease the rate of occurrence of marginality failures.

    Thanks for providing the REG_CALC spreadsheet.  I will review it.  Do the register writes in the initialization code for the DDR Controller and PHY match the values in the spreadsheet?

    I also need to see a report showing that the length matching rules have been met.

    Tom

  • Hi Tom,

    I have received the DDR3 length file from my customer.
    I have attached it to this post, so please take a look.

    66AK2H06_DDR3_LengthList.xlsx

    The DDR3 length file lists the CLK, CMD, ADDR, DQ, and DQS lengths for both DDR3A and DDR3B.
    Each length is given in [mm], and the length difference from the "Target" (shown in green) is
    given in [mm] and [mil].
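
For reference, the mm-to-mil conversion used in such a length list can be sketched as below (1 mil = 0.001 inch = 0.0254 mm). The net names and lengths are hypothetical illustration values, not taken from the attached file, and `length_deltas` is a hypothetical helper.

```python
MM_PER_MIL = 0.0254  # 1 mil = 0.001 inch = 0.0254 mm


def mm_to_mil(mm):
    """Convert a length in millimetres to mils."""
    return mm / MM_PER_MIL


def length_deltas(lengths_mm, target_net):
    """Return {net: (delta_mm, delta_mil)} relative to the target net."""
    target = lengths_mm[target_net]
    return {
        net: (length - target, mm_to_mil(length - target))
        for net, length in lengths_mm.items()
        if net != target_net
    }


# Hypothetical trace lengths in mm, for illustration only.
example = {"DQS0": 25.400, "DQ0": 25.654, "DQ1": 24.892}
for net, (d_mm, d_mil) in length_deltas(example, "DQS0").items():
    print(f"{net}: {d_mm:+.3f} mm = {d_mil:+.1f} mil")
```

Each delta can then be compared against the matching tolerance given in the layout guidelines for that signal group.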

    As you can see from the DDR3 length file, I think some of the lengths don't meet the layout guidelines.
    My customer also understands that their DDR3 layout does not meet the guidelines,
    but before modifying the lengths, the customer is eager to know
    whether the length mismatches against the guidelines are causing the data bit failure.
    In other words, the customer wants to be confident that correcting the lengths will resolve the problem.

    By the way, the data bit failures are occurring on DDR3A.

    best regards,
    g.f.

  • g.f.,

    I do recommend that they tighten their routing in a PCB revision.  This will add robustness in production and operation where there will be variation in silicon process strength across the population of chips and also variation in operating voltage and temperature.  However, I do not believe these length violations are the cause of the bit errors.  I recommend that they look at other routing rules.  Do all of the traces have proper spacing?  Specifically, are the Clock and DQS pairs routed differentially with controlled impedance?  Are the Data, DQS and Clock traces routed adjacent to a solid ground reference plane at 50 ohms?  Do any of these cross splits or voids in the ground plane?  Are the Address, Command and Control traces routed at 50 ohms adjacent to either a solid ground plane or a solid VDDQ plane without voids or splits?  Are there other aggressor circuits routed close to the DDR circuits?  This includes POL power supplies that have large AC ripple and transient voltages and currents - including in the ground planes.  Is there proper decoupling both at the SDRAMs and at the K2H06 to provide low impedance return paths for these high bandwidth signals?

    Another perspective for troubleshooting the layout is to adjust the SDRAM_DRIVE, DDR_TERM, PHY_DATA_ZO and PHY_ACCC_ZO impedances as shown in the REG_CALC spreadsheet.  The application of this is discussed in Section 4.5.2 Routing Impedances – KeyStone II Devices of the DDR3 Design Requirements for KeyStone Devices Application Report (SPRABI1B).  You might also try running the DDR3 interface at 1333MT/s to see if this solves the problem.  If termination changes solve the problem or reducing the clock rate solves the problem, this is additional evidence of a signal integrity problem.
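
To put the suggested rate reduction in numbers, the sketch below computes the clock frequency, clock period and per-bit time for a given DDR data rate (two transfers per clock cycle). `ddr_timing` is a hypothetical helper; it covers only ideal timing and says nothing about the actual margin on a given board.

```python
def ddr_timing(data_rate_mts):
    """Return (clock_mhz, tck_ps, bit_time_ps) for a DDR data rate in MT/s."""
    clock_mhz = data_rate_mts / 2.0   # two transfers per clock cycle
    tck_ps = 1e6 / clock_mhz          # clock period in picoseconds
    bit_time_ps = tck_ps / 2.0        # one data bit per half cycle
    return clock_mhz, tck_ps, bit_time_ps


for rate in (1600, 1333):
    clock_mhz, tck_ps, bit_time_ps = ddr_timing(rate)
    print(f"{rate} MT/s: clock {clock_mhz:.1f} MHz, "
          f"tCK {tck_ps:.0f} ps, bit time {bit_time_ps:.0f} ps")
```

At 1600 MT/s each bit lasts 625 ps, and at 1333 MT/s about 750 ps, so the lower rate gives each data bit roughly 125 ps more time to settle.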

    Tom