TMS320C6678: DDR3 Read access problem

Part Number: TMS320C6678

Hi,

My customer, who is using the C6678, is experiencing DDR3 read access errors during pre-shipment testing.
The test includes patterns such as a "walking test" and a "marching test".
A single test run performs hundreds of millions of write/read accesses and takes about 20 hours to complete.
The problem occurs mostly in the "walking test". When it occurs,
a "1" is written but a "0" is read back; when the same address is then read again, "1" is read correctly,
so we think it is a memory read issue.
The problem occurs once at room temperature, and more than 300 times
when the temperature is raised to about 50°C.
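
For reference, the check described above (write a walking "1", read it back, and re-read on a mismatch) could be sketched roughly as below. This is only an illustration, not the customer's actual test code: the names `classify` and `walking_ones` are made up, and it assumes the tested region is mapped non-cached so that every read really reaches the DDR3.

```c
#include <stddef.h>
#include <stdint.h>

/* Outcome of checking one location (illustrative names). */
typedef enum { CHECK_OK, CHECK_READ_ERROR, CHECK_STORED_ERROR } check_result;

/* If the first read is wrong but an immediate re-read is right, the stored
 * data was correct and only the first read failed. */
static check_result classify(uint64_t expected, uint64_t first, uint64_t second)
{
    if (first == expected)  return CHECK_OK;
    if (second == expected) return CHECK_READ_ERROR;
    return CHECK_STORED_ERROR;
}

/* Walk a single 1 through all 64 bit positions of every word in the region
 * and count the mismatches that disappear on re-read.
 * Assumption: mem points to a non-cached DDR3 region. */
static unsigned walking_ones(volatile uint64_t *mem, size_t words)
{
    unsigned read_errors = 0;
    for (size_t i = 0; i < words; ++i) {
        for (unsigned bit = 0; bit < 64; ++bit) {
            uint64_t pattern = (uint64_t)1 << bit;
            mem[i] = pattern;
            uint64_t first  = mem[i];   /* first read-back */
            uint64_t second = mem[i];   /* re-read of the same address */
            if (classify(pattern, first, second) == CHECK_READ_ERROR)
                ++read_errors;
        }
    }
    return read_errors;
}
```

A mismatch classified as `CHECK_READ_ERROR` corresponds to the behavior reported here: the data in memory is correct, but one read returned a wrong bit.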

Two C6678 devices (DSP#1, DSP#2) are mounted on the customer's board, and each DSP is connected to four 16-bit DDR3 devices (64 bits total).
The issue occurred only on 1 bit of byte lane #6 or #7 of DSP#1.
This board has been mass-produced for several years, and the DDR3 peripheral layout was done according to
TI's design guide. Since this problem has not occurred until now, we believe there is no problem with the layout design.

For now, they are using partial leveling for DDR3.
Because of the issue above, the customer is thinking of trying full leveling.
With that in mind, they have the following questions.

Q1. The customer wants to know the details of each leveling register parameter
defined in the spreadsheet "DDR3 PHY Calc v11", such as DATAx_WRLVL_INIT_RATIO, DATAx_GTLVL_INIT_RATIO,
RD_DQS_SLAVE_RATIO, WR_DQS_SLAVE_RATIO, WR_DATA_SLAVE_RATIO, FIFO_WE_SLAVE_RATIO.
They want to know what processing is performed using these parameters.

Q2. The customer is trying to adjust the above parameters after calculating them with the spreadsheet.
They want to know which parameters to adjust for a problem like this.

Q3. Is there a register that allows us to check how the initial value of each register calculated
in the spreadsheet has changed after leveling?

Q4. We would like to know the configuration flow for full leveling.
Is the configuration flow for full leveling the same as for partial leveling?
In full leveling, is the only difference that the fixed value (0x200) recommended for partial leveling
is not written to DDR3_CONFIG_REG23?

Q5. To use full leveling, I guess incremental leveling is needed after full leveling because
of errata advisory 9, workaround 3.
KeyStone I DDR3 Initialization (sprabl2e), page 16, says the following:
********************************************************************************
Example 25. Incremental Leveling After Full Automatic Leveling
RDWR_LVL_RMP_WIN = 0x00000502;
RDWR_LVL_RMP_CTRL = 0x80030300;
RDWR_LVL_CTRL = 0x7F090900;
********************************************************************************
Does this setting correspond to workaround 3 of advisory 9?
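
To make the constants in Example 25 easier to read, here is a small decoding sketch. The field layout is an assumption on my side (bit 31 as an enable/start flag, bits 30:24 as a prescaler, and bits 23:16, 15:8, and 7:0 as the read data eye, read DQS gate, and write leveling intervals) and should be checked against the DDR3 memory controller user guide. The struct and function names are made up.

```c
#include <stdint.h>

/* Assumed common layout of RDWR_LVL_RMP_CTRL and RDWR_LVL_CTRL;
 * field names are illustrative, not taken from the TI headers. */
typedef struct {
    unsigned msb;       /* bit 31: enable / full-leveling start flag */
    unsigned prescale;  /* bits 30:24: incremental leveling prescaler */
    unsigned rd_int;    /* bits 23:16: read data eye training interval */
    unsigned gate_int;  /* bits 15:8:  read DQS gate training interval */
    unsigned wr_int;    /* bits 7:0:   write leveling interval */
} lvl_fields;

/* Split a 32-bit register value into the assumed fields. */
static lvl_fields decode_lvl(uint32_t v)
{
    lvl_fields f;
    f.msb      = (unsigned)((v >> 31) & 0x1u);
    f.prescale = (unsigned)((v >> 24) & 0x7Fu);
    f.rd_int   = (unsigned)((v >> 16) & 0xFFu);
    f.gate_int = (unsigned)((v >> 8)  & 0xFFu);
    f.wr_int   = (unsigned)(v & 0xFFu);
    return f;
}
```

Under that assumed layout, 0x80030300 has bit 31 set with intervals 0x03/0x03/0x00, and 0x7F090900 has bit 31 clear, a prescaler of 0x7F, and intervals 0x09/0x09/0x00.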

best regards,
g.f.

  • Hi g.f,

    >we are using the recommended value, but do you mean that it is too low?

    As shown in the experiments, there is a clear dependency on voltage. The customer may be configuring the voltage to the recommended value; however, there could be an IR drop in the system such that the voltage in the DDR PHY is too low.

    >but we have seen that only the read of a specific memory (MEM4) is abnormal.

    I do not understand this statement. What is "MEM4"? My previous understanding was that there were DDR read errors which followed a particular C6678 unit. 

    Regards,
    Kevin

  • Hi Kevin,

    Thank you for the reply.

    >As shown in the experiments, there is a clear dependency on voltage.
    >The customer may be configuring the voltage to the recommended value;
    >however, there could be an IR drop in the system such that the voltage in the DDR PHY is too low.

    I will pass this on to the customer.

    >I do not understand this statement. What is "MEM4"?
    >My previous understanding was that there were DDR read errors which followed a particular C6678 unit.

    I'm sorry.
    Yes, you're right. My understanding is the same as yours.
    I think they are trying to say that DDR read errors occur on a specific byte lane of a specific C6678 unit.
    According to the customer's schematics, DDR3 (MEM4) is connected to DQS6 and DQS7 of the C6678, and previously they said that
    these errors occur only on byte lane 6 or byte lane 7 of a specific C6678 unit.
    Anyway, let me ask my customer what they mean by "specific memory (MEM4)".
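
    To show how a failing data bit relates to a byte lane and a memory device, here is a rough sketch. The lane-to-device mapping is an assumption (x16 device k, counted from 0, serving byte lanes 2k and 2k+1); it is consistent with MEM4 sitting on DQS6/DQS7 only if the devices are numbered from 1 in lane order. The function names are made up.

```c
#include <stdint.h>

/* Byte lane (0..7) that carries data bit 0..63 of the 64-bit bus. */
static unsigned byte_lane_of_bit(unsigned bit) { return bit / 8u; }

/* Assumed wiring: x16 device k (0-based) serves byte lanes 2k and 2k+1. */
static unsigned device_of_lane(unsigned lane) { return lane / 2u; }

/* Index of the single differing bit between expected and actual,
 * or -1 if they differ in zero bits or in more than one bit. */
static int single_failing_bit(uint64_t expected, uint64_t actual)
{
    uint64_t diff = expected ^ actual;
    if (diff == 0 || (diff & (diff - 1)) != 0)
        return -1;
    int bit = 0;
    while ((diff >> bit) != 1u)
        ++bit;
    return bit;
}
```

    For example, a failure on data bit 50 falls in byte lane 6, which under the assumed wiring is the fourth x16 device (index 3).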

    best regards,
    g.f.

  • Hi Kevin,

    Thank you for all your support.

    I received the following question from the customer in response to your recent reply.
    >As shown in the experiments, there is a clear dependency on voltage.
    >The customer may be configuring the voltage to the recommended value;
    >however, there could be an IR drop in the system such that the voltage in the DDR PHY is too low.

    <Question>
    -------------------------------------------------------
    When you say in the system, do you mean inside the DSP?
    Why does IR drop occur if the voltage is too low?
    Isn't IR drop something that always occurs?
    -------------------------------------------------------

    >I do not understand this statement. What is "MEM4"?
    >My previous understanding was that there were DDR read errors which followed a particular C6678 unit.

    I asked the customer and, as I told you previously, by "MEM4" they mean that the anomaly occurs not in a specific memory,
    but only on specific byte lanes (#6, #7) of a specific device (C6678).

    best regards,
    g.f.

  • Hi g.f,

    >When you say in the system, do you mean inside the DSP?

    No, I am referring to the entire board (from the voltage source to the DDR PHY).

    I did not state that there is a voltage drop. I stated there could be a voltage drop. Since we do not even know if there is a voltage drop, we can't specify where the voltage drop occurs.

    >Why does IR drop occur if the voltage is too low?

    I never stated this. I stated that a voltage drop in the system could result in the DDR PHY observing too low of a voltage. As an example, if the recommended voltage for a given part is 900 mV and there is a 50 mV drop on the PCB, then the DDR PHY will only see 850 mV if the voltage source is programmed to 900 mV.
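
    As a rough illustration of the arithmetic, the drop follows Ohm's law (V = I x R), so it exists whenever current flows and it grows with the load current. In the sketch below the 10 A load and the 5 mOhm path resistance are made-up numbers, chosen only so that the result matches the 50 mV example; the function names are also hypothetical.

```c
/* IR drop in millivolts for a current in mA through a resistance in mOhm.
 * mA * mOhm gives microvolts, so divide by 1000 to get millivolts. */
static int ir_drop_mv(int current_ma, int resistance_mohm)
{
    return current_ma * resistance_mohm / 1000;
}

/* Voltage seen at the load (here, the DDR PHY) in millivolts. */
static int load_voltage_mv(int source_mv, int current_ma, int resistance_mohm)
{
    return source_mv - ir_drop_mv(current_ma, resistance_mohm);
}
```

    With the hypothetical values, `load_voltage_mv(900, 10000, 5)` gives 850, i.e. the PHY sees 850 mV although the source is programmed to 900 mV.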

    Regards,
    Kevin

  • Hi Kevin,

    I'm very sorry. I misunderstood the point about the voltage drop.
    I will reply to the customer with the corrected answer.

    best regards,
    g.f.

  • Hi Kevin,

    Our customer is discussing this issue internally,
    and they have the following questions:

    1. It seems that this phenomenon is improved by changing the SlaveRatio and CVDD,
    but what is happening inside the DSP that makes it improve?

    2. If the C6678 is used with SmartReflex disabled, is it still covered by the manufacturer's warranty?
    Is there any risk in disabling SmartReflex?

    3. Is it OK to increase CVDD if the voltage stays within device specifications?
    Is there any risk in raising CVDD?

    best regards,
    g.f.

  • Hi g.f., 

    I'll comment on 2 & 3 for now (I don't have a good answer for #1; Kevin, any theories?).

    2. They should not completely disable smart reflex.

    3. One idea would be to increment by ~100 mV, but limit the max programmed value to 1.15V.

    I.e., SRV_used = min(SRV_fuse+100mV, 1.15V).   

    This will constrain the maximum voltage to be lower than the value that is in the datasheet for 1400 MHz operation.

    For example, if SRV is set to 0.9 V, you would program 1.0 V.

    But if SRV is set to 1.1 V, you would program 1.15 V.
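
    The rule above can be written as a small helper. This is just a sketch of the formula SRV_used = min(SRV_fuse+100mV, 1.15V) in millivolts; the function name is made up, and reading the fused SRV value and programming the regulator are not shown.

```c
/* Voltage to program, in millivolts: the fused SmartReflex voltage
 * plus a 100 mV margin, capped at 1150 mV. */
static int srv_used_mv(int srv_fuse_mv)
{
    const int margin_mv = 100;
    const int cap_mv = 1150;
    int raised = srv_fuse_mv + margin_mv;
    return raised < cap_mv ? raised : cap_mv;
}
```

    `srv_used_mv(900)` returns 1000 and `srv_used_mv(1100)` returns 1150, matching the two examples.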

    Regards,

    Kyle

  • Hi Kyle,

    Thank you for the reply, and I'm sorry for the late response; I was out of the office last week.

    Is there any progress on Question #1?

    >2. They should not completely disable smart reflex.

    So, if they disable SmartReflex, there is no warranty and it is a risk; is that correct?
    If yes, what are the risks of disabling SmartReflex?

    >3. One idea would be to increment by ~100 mV, but limit the max programmed value to 1.15V.
    >I.e., SRV_used = min(SRV_fuse+100mV, 1.15V).
    >This will constrain the maximum voltage to be lower than the value that is in the datasheet for 1400 MHz operation.

    The customer's device is a 1 GHz speed grade product, so there is no risk in increasing CVDD as long as it stays below the maximum 1 GHz operating value listed in the datasheet; is that correct?

    best regards,
    g.f.

  • 1. No update.

    2. We do not recommend disabling smart reflex.  There is a potential reliability impact, such as a reduction in power-on hours (POH).

    3. Correct.  As long as they don't exceed the MAX nominal voltage allowed for the 1.4 GHz device (which is the same silicon), which is 1.15V.

    Regards,

    Kyle

  • Hi Kyle,

    Thank you for the reply.

    >3. Correct. As long as they don't exceed the MAX nominal voltage allowed for the 1.4 GHz device (which is the same silicon), which is 1.15V.
    I thought CVDD was allowed only up to the maximum voltage for the 1 GHz device.
    Since the 1 GHz and 1.4 GHz products are the same silicon, I now understand that they can raise the voltage
    as long as it does not exceed the 1.4 GHz maximum voltage. In other words, they are allowed to raise CVDD
    up to 1.15V instead of 1.1V. Am I correct?

    Do you have any update on Question #1? The customer is waiting for any information.

    best regards,
    g.f.

  • Hello g.f., 

    We are not able to further diagnose the problem.  Our recommendations are based on the empirical data that is observed by the customer.

    Regards,
    Kyle

  • Hi Kyle,

    Thank you for all your support.
    After I answered my customer, they had an additional question about changing the SlaveRatio value.

    They want to know how to change the SlaveRatio value after DDR initialization.
    They would like to start with the default value of 0x34, and then change it
    to an appropriate SlaveRatio value (e.g. 0x2C) during operation to be able to access the memory.

    We are assuming the following procedure, but we would like to know if there is a formal method for this case.
    1) Write the SlaveRatio value to the lower 9 bits of DDR3 Configuration 23 Register (DDR3_CONFIG_23).
    2) Write 1 to the most significant bit of Read-Write Leveling Control Register (RDWR_LVL_CTRL)
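
    In code, the two steps we are assuming would look roughly like the sketch below. This only illustrates the procedure as we described it and is not a confirmed method. The function names are made up, the register pointers must be supplied by the caller (no addresses are assumed here), and any write protection on the configuration registers is assumed to have been unlocked already.

```c
#include <stdint.h>

#define SLAVE_RATIO_MASK 0x1FFu        /* lower 9 bits of DDR3_CONFIG_23 */
#define LVL_CTRL_MSB     0x80000000u   /* most significant bit of RDWR_LVL_CTRL */

/* Return reg with its lower 9 bits replaced by ratio; other bits unchanged. */
static uint32_t with_slave_ratio(uint32_t reg, uint32_t ratio)
{
    return (reg & ~SLAVE_RATIO_MASK) | (ratio & SLAVE_RATIO_MASK);
}

/* Step 1: write the new SlaveRatio to DDR3_CONFIG_23.
 * Step 2: set the MSB of RDWR_LVL_CTRL.
 * Both pointers are provided by the caller (hypothetical interface). */
static void update_slave_ratio(volatile uint32_t *ddr3_config_23,
                               volatile uint32_t *rdwr_lvl_ctrl,
                               uint32_t ratio)
{
    *ddr3_config_23 = with_slave_ratio(*ddr3_config_23, ratio);
    *rdwr_lvl_ctrl |= LVL_CTRL_MSB;
}
```

    The read-modify-write in step 1 keeps the upper bits of DDR3_CONFIG_23 as they were, so only the 9-bit SlaveRatio field changes from 0x34 to, for example, 0x2C.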

    Is there a formal method other than the above for this case?

    best regards,
    g.f.

  • g.f.,

    We don't have any other formal description of how to change the SlaveRatio after initialization.

    Regards,
    Kyle

  • Hi Kyle,

    Thank you for the reply.
    So the procedure described above is OK; is that correct?

    best regards,
    g.f.