Hello,
We have an urgent need for design guidance (see below).
Background…
We are experiencing DDR3 read-back issues on our first-spin PCB design, which has a C6655 DSP and two Micron MT41K128M16JT-125:K 2Gb DDR3 (16M x 16 x 8) memories operating at 1033MHz. We have been seeing single-bit errors on some of the boards (the number of errors varies from a few thousand to only a few when repeatedly reading back 1MByte of previously written data). The bit location of the error(s) changes somewhat, but certain bits have a much higher probability of error. Other boards consistently show no read-back errors. So far there is no indication of problems writing to memory reliably. We tried slowing the DDR clock from 1033 to 825 and still had issues, although the number of errors was somewhat reduced.
Up until yesterday we had been using the “Partial Automatic Leveling” mode, which is the same method used on the EVM for the 6657 and seems to be the most common method discussed on TI E2E. Yesterday we changed the leveling method to “Full Automatic & Incremental” and followed the details exactly as described in sprugv8c.pdf example 22. With this change, all of the PCBs are working 100% of the time (no read-back errors). Excerpts from the GEL for the DDR initialization for both leveling cases are included in the attachment.
I believe we have followed all of the schematic guidelines, and all of the registers are set using the two TI Excel files, DDR3 Register Calc v4.xlsx and DDR3 PHY Calc v10.xls, to obtain the proper register settings (see attached). We have followed the DDR initialization in the same manner as described in sprugv8c.pdf and the EVM GEL.
We believed we were following all of the TI PCB layout guidelines, but we now see that some aspects of the net class routing rules specified in section 4 of sprabi1a.pdf are being exceeded, namely the following:
The attached file Signal Routing Lengths.xls lists the routing length of each net as well as the simulated propagation delays.
We are correcting the net class routing to meet the class limits in the production design, but we want to be confident that the layout changes will remedy the read-back problem and that there is no other cause. Ideally we would like to use the more common “Partial Automatic Leveling” method.
OK, now for some questions…
1) Is there anything you can see we are doing incorrectly or should be checking into?
2) To the degree we are exceeding the net class routing rule limits, would that likely explain why we have read-back errors when using “Partial Automatic Leveling” but none with “Full Automatic & Incremental”?
3) Should we be using routing rules for net classes based on absolute length as described above, or use propagation time from modeling? The results are likely to be mutually exclusive.
4) There is very little information given on the Full Automatic & Incremental leveling method, and there are several caveats listed in sprugv8c.pdf and the silicon errata.
DDR_RDWR_LVL_RMP_CTRL = 0x80000000; //enable full leveling
DDR_RDWR_LVL_CTRL = 0x80000000; //Trigger full leveling
Delay_milli_seconds(10); // wait 10 ms
// Turn on Full Automatic & Incremental Leveling per SPRABL2A example 20
DDR_RDWR_LVL_RMP_WIN = 0x00000502;
DDR_RDWR_LVL_RMP_CTRL = 0x80030300;
DDR_RDWR_LVL_CTRL = 0xFF090900;
Delay_milli_seconds(500); // How long to wait before disabling ?
// turn off Full Automatic & Incremental Leveling
//turn off inc. DQS gate and read eye training , don't retrigger.
DDR_RDWR_LVL_CTRL = 0x00000000;
//turn off inc DQS gate and read eye training
DDR_RDWR_LVL_RMP_CTRL = 0x80000000;
DDR_RDWR_LVL_RMP_WIN = 0x00000000;
Thanks so much for your assistance in the urgent matter.
Larry
Larry,
We will review the data provided and provide a response. The 'partial automatic leveling' and 'full automatic leveling' procedures should work equally well. Both are fully qualified for long-term use.
Tom
Larry,
We will have an initial response by Friday after a thorough review of the provided data.
Tom
Tom,
Thank you in advance for reviewing our design documentation. I also have a few questions which should have nothing to do with our specific design details and are simply questions about the C6655 DDR interface. I was hoping to get a quick reply on these matters.
Re Leveling:
I am glad to hear the “full automatic” leveling mode is fully qualified on the C6655. In our case, to get robust error-free read back on all PCBs we also need to turn on “incremental leveling after full leveling” (Examples 21 and 22 in section 3.3 of SPRABL2A). Using just the auto-leveling (example 21 only), the read back can still be erratic on some PCBs. Based on the information available, it sounds like the read-eye sample point may not converge, but when we read the DDR Status register it always gives a successful result, 0x40000004.
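For readers who want to check the status word field by field rather than compare it against a known-good constant, here is a minimal sketch in C. The bit positions below are assumptions (ready flag at bit 2, the three leveling timeout flags at bits 4 to 6) and should be verified against the SDRAM Status Register description in the DDR3 Memory Controller User Guide before use. The function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit positions for the DDR3 status register. Verify them against
 * the status register description in SPRUGV8 before relying on them. */
#define DDR_STATUS_IFRDY        (1u << 2)  /* PHY/interface ready */
#define DDR_STATUS_WRLVLTO      (1u << 4)  /* write leveling timeout */
#define DDR_STATUS_RDLVLTO      (1u << 5)  /* read data eye training timeout */
#define DDR_STATUS_RDLVLGATETO  (1u << 6)  /* read DQS gate training timeout */

typedef struct {
    bool phy_ready;
    bool write_level_timeout;
    bool read_eye_timeout;
    bool read_gate_timeout;
} ddr_status_t;

/* Split a raw status word into the flags of interest. */
static ddr_status_t ddr_decode_status(uint32_t status)
{
    ddr_status_t s;
    s.phy_ready           = (status & DDR_STATUS_IFRDY) != 0;
    s.write_level_timeout = (status & DDR_STATUS_WRLVLTO) != 0;
    s.read_eye_timeout    = (status & DDR_STATUS_RDLVLTO) != 0;
    s.read_gate_timeout   = (status & DDR_STATUS_RDLVLGATETO) != 0;
    return s;
}

/* True when the interface reports ready and no leveling timeout is flagged.
 * This only reflects what the controller reports; it does not prove that
 * the trained sample point has margin. */
static bool ddr_leveling_reported_ok(uint32_t status)
{
    ddr_status_t s = ddr_decode_status(status);
    return s.phy_ready && !s.write_level_timeout &&
           !s.read_eye_timeout && !s.read_gate_timeout;
}
```

Under these assumed bit positions, 0x40000004 decodes as ready with no timeout flags set, which is consistent with the value being described as a successful result.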
Following the DDR initialization, we are using the leveling procedure as described in section 3.3 of SPRABL2A:
// Removed for Auto leveling DDR3_CONFIG_REG_23 |= 0x00000200;
// Example 21 - auto leveling
RDWR_LVL_RMP_CTRL = 0x80000000; // Enable leveling
RDWR_LVL_CTRL = 0x80000000; // Trigger full leveling
for(i=0;i<1000;i++); //Wait 3ms for leveling to complete
//Example 22 – incremental leveling
RDWR_LVL_RMP_WIN = 0x00000502; // Set incremental ramp window to 10ms ?
RDWR_LVL_RMP_CTRL = 0x80030300; // Set Read Data Eye And Read Gate interval during ramp window
RDWR_LVL_CTRL = 0xFF090900; // Set Read Data Eye And Read Gate training interval
// Delay (TBD ms): how long will 64 iterations take?
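To make the hex constants above easier to reason about, the following C sketch splits a leveling control word into fields. The field layout is an assumption (bit 31 as the enable or trigger bit, a 7-bit prescaler, then three 8-bit interval fields for read data eye, read DQS gate, and write leveling) and must be checked against the register descriptions in SPRUGV8. The type and function names are hypothetical.

```c
#include <stdint.h>

/* ASSUMED field layout shared by RDWR_LVL_CTRL and RDWR_LVL_RMP_CTRL.
 * Check the register descriptions in SPRUGV8 for the actual device. */
typedef struct {
    unsigned bit31;          /* enable (RMP_CTRL) or full-leveling trigger (CTRL) */
    unsigned inc_prescaler;  /* bits 30:24, incremental leveling prescaler */
    unsigned read_eye_int;   /* bits 23:16, read data eye training interval */
    unsigned read_gate_int;  /* bits 15:8,  read DQS gate training interval */
    unsigned write_lvl_int;  /* bits 7:0,   write leveling interval */
} lvl_ctrl_fields_t;

/* Unpack a 32-bit leveling control word into the assumed fields. */
static lvl_ctrl_fields_t decode_rdwr_lvl_ctrl(uint32_t reg)
{
    lvl_ctrl_fields_t f;
    f.bit31         = (reg >> 31) & 0x1u;
    f.inc_prescaler = (reg >> 24) & 0x7Fu;
    f.read_eye_int  = (reg >> 16) & 0xFFu;
    f.read_gate_int = (reg >> 8)  & 0xFFu;
    f.write_lvl_int = reg & 0xFFu;
    return f;
}
```

With this layout, 0xFF090900 would decode as bit 31 set, prescaler 0x7F, read eye and read gate intervals of 9, and a write leveling interval of 0, while 0x80030300 would decode as bit 31 set, prescaler 0, and read eye and read gate intervals of 3. If the layout holds for this device, it is worth noting that bit 31 is set in the 0xFF090900 write.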
If we leave the leveling mode registers unchanged at this point, then according to the documentation incremental leveling adjustments will be made on a 10ms basis to compensate for temperature and voltage changes. I don’t know whether the incremental leveling would interfere with normal memory operation (refresh, memory R/W operations) without halting operation. Is this a valid concern?
Is there a procedure to turn off the incremental leveling while retaining the read-gate and read data eye leveling settings (understanding that it will no longer be tracking voltage and temperature changes)? I added the lines of code below after example 22 and it seems to work (robust read back with all PCBs), but I am uncertain whether the incremental leveling operations have ceased.
DDR_RDWR_LVL_CTRL = 0x00000000; //turn off incr DQS gate & read eye training , don't retrigger.
DDR_RDWR_LVL_RMP_CTRL = 0x80000000; //turn off DQS gate and read eye training
DDR_RDWR_LVL_RMP_WIN = 0x00000000;
Please provide guidance on the usage of incremental leveling after automatic leveling.
Does the fact that we need incremental leveling after automatic leveling to make the read-back operation robust indicate that another design issue is present?
Re Layout Design Rules:
We are reexamining PCB layout constraints and want to know whether routing length guidelines (i.e., total length from DSP to each SDRAM for all respective DQ and DM signals within a byte lane should be skew matched to the DQS line +/- 10.00 mils) take precedence over ICX propagation timing simulation results.
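As an illustration of the length-based rule quoted above, here is a minimal C sketch that checks the DQ/DM nets of one byte lane against the DQS length with a +/- tolerance in mils. The function name and the lengths used in the usage note are hypothetical; real numbers would come from the layout tool's length report.

```c
#include <assert.h>
#include <stddef.h>

/* Count nets in one byte lane whose routed length differs from the DQS
 * length by more than tol_mils. If worst_skew_mils is not NULL, it receives
 * the signed skew (net minus DQS) with the largest magnitude. */
static size_t count_skew_violations(const double *len_mils, size_t n,
                                    double dqs_len_mils, double tol_mils,
                                    double *worst_skew_mils)
{
    assert(len_mils != NULL && tol_mils >= 0.0);
    size_t violations = 0;
    double worst = 0.0;
    double worst_mag = 0.0;
    for (size_t i = 0; i < n; i++) {
        double skew = len_mils[i] - dqs_len_mils;
        double mag = skew < 0.0 ? -skew : skew;
        if (mag > tol_mils)
            violations++;
        if (mag > worst_mag) {
            worst_mag = mag;
            worst = skew;
        }
    }
    if (worst_skew_mils != NULL)
        *worst_skew_mils = worst;
    return violations;
}
```

For example, with a DQS length of 1500 mils, a tolerance of 10 mils, and made-up net lengths of 1500, 1508, 1511, 1489.5 and 1510 mils, the function reports two violations (1511 and 1489.5) and a worst skew of +11 mils. A check like this only covers length; it says nothing about delay differences caused by layer changes or vias.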
Thank you for your consideration in this matter.
Larry
Hi Larry and TI experts:
Glad to see Larry's DDR works fine. I have the same DDR, MT41K128M16JT-125IT, on my own board, which contains a C66x DSP. After downloading the files Larry posted here, I tried to write my own initialization code, but it failed.
I have some questions that I hope you can help with:
1. Does DDR initialization depend on the DSP type (does 6614 vs. 6657 make a difference)? We have the same DDR; does that mean we will get the same register values from those XLS files?
2. Larry, what is your configuration for PLL2: what are the values of your PLLM_DDR and PLLD_DDR? Why did you choose 1033MHz as the output DDR data rate? As far as I know, our DDR should support a 1600 data rate.
3. Do you have any suggestions for debugging the DDR registers?
Thank you all in advance!
Striker Qian
Striker,
You cannot use the XLS from another design. Some of the inputs are from the board layout.
KeyStone-I devices are currently limited to 1333MT/s. Please refer to the Data Manuals and Errata docs.
Tom
Striker,
I agree with Tom's comments (i.e., 1333MHz is the max bus speed of this DSP).
BTW, I am using 1033 because the original design used 1066 memory (now EOL) and because of the clock input frequency (51.6 MHz) I had available for the DDR PLL clock input.
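The arithmetic behind a rate like 1033 from a 51.6 MHz reference can be sketched as a small search in C. This treats the rate simply as reference x multiplier / divider. It does not model the actual PLLM_DDR/PLLD_DDR register encodings or any fixed dividers in the device, which should be taken from the data manual. The function name and search limits are hypothetical.

```c
#include <assert.h>

typedef struct {
    unsigned mult;  /* integer multiplier */
    unsigned div;   /* integer divider */
    double rate;    /* ref * mult / div, same units as the reference */
} clk_ratio_t;

static double abs_d(double x) { return x < 0.0 ? -x : x; }

/* Exhaustively search mult/div pairs for the rate closest to the target.
 * Ties keep the first pair found, i.e. the smallest divider. */
static clk_ratio_t closest_ratio(double ref, double target,
                                 unsigned max_mult, unsigned max_div)
{
    assert(ref > 0.0 && max_mult >= 1 && max_div >= 1);
    clk_ratio_t best = {1, 1, ref};
    double best_err = abs_d(ref - target);
    for (unsigned d = 1; d <= max_div; d++) {
        for (unsigned m = 1; m <= max_mult; m++) {
            double rate = ref * (double)m / (double)d;
            double err = abs_d(rate - target);
            /* Small margin so floating-point noise does not replace an
             * equivalent ratio such as 20/1 with 40/2. */
            if (err < best_err - 1e-9) {
                best.mult = m;
                best.div = d;
                best.rate = rate;
                best_err = err;
            }
        }
    }
    return best;
}
```

With a 51.6 MHz reference, a target of 1033 resolves to 20/1 (1032.0) and a target of 825 resolves to 16/1 (825.6), which is in line with the figures mentioned in this thread.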
Larry
Hi Tom,
Can you specify which part is related to my DDR type and which part comes from my board layout?
In my opinion, the values calculated from DDR3 Register Calc v4.xls, which contains:
DDR_SDTIM1
DDR_SDTIM2
DDR_SDTIM3
DDR_SDCFG
DDR_SDRFC (normal)
DDR_SDRFC (ext temp)
DDR_SDRFC (initialization)
will come from the MT41K128M16JT-125I datasheet.
And the values in DDR3 PHY Calc v10.xls will be determined by my board layout. It contains:
DATA0_WRLVL_INIT_RATIO
DATA1_WRLVL_INIT_RATIO
DATA2_WRLVL_INIT_RATIO
DATA3_WRLVL_INIT_RATIO
DATA4_WRLVL_INIT_RATIO
DATA5_WRLVL_INIT_RATIO
DATA6_WRLVL_INIT_RATIO
DATA7_WRLVL_INIT_RATIO
DATA8_WRLVL_INIT_RATIO
DATA0_GTLVL_INIT_RATIO
DATA1_GTLVL_INIT_RATIO
DATA2_GTLVL_INIT_RATIO
DATA3_GTLVL_INIT_RATIO
DATA4_GTLVL_INIT_RATIO
DATA5_GTLVL_INIT_RATIO
DATA6_GTLVL_INIT_RATIO
DATA7_GTLVL_INIT_RATIO
DATA8_GTLVL_INIT_RATIO
DDR3_CONFIG_REG_12 OR Mask
Is that the right way to determine all the registers we need? I asked our PCB designer, and he told me that the board's DDR layout is the same as the 6614 EVM board. Should I refer to the PHY configuration of the EVM board?
Larry,
Many thanks for your kind help. I just want to know which documents offer enough information to determine all those register values using the XLS files. I have downloaded the MT41K128M16JT-125IT data sheet, 2GB_1_35V_DDR3L.pdf, but the information I got from it is very limited. Can you list the docs that can help specify all the values?
Striker, yes, I know the data sheet for the MT41K128M16JT-125IT has limited timing info. As you know, it is an LV part (1.35V). An FAE from Micron indicated that, where not specified otherwise for the MT41K128M16JT-125IT, I can use the MT41J128M16 spec sheet (1.5V part). That spec sheet had all of the timing specs I needed, and it is what I entered into the TI Excel file to determine the register settings. Here is a link to a Micron technical note regarding 1.35V memory: http://www.micron.com/~/media/Documents/Products/Technical%20Note/DRAM/tn4114_ddr3_1_35v_1_5v_compat.pdf
"Micron 1.35V DDR3L and DDR3L-RS devices use the same die as 1.5V DDR3 devices,
but have been separated during the test screen and marking process. The Micron 1.35V
test screen incorporates testing to ensure backward compatibility to 1.5V operation.
Therefore, all parts marked as DDR3L or DDR3L-RS are backward compatible to parts
marked as DDR3, and meet the JEDEC 1.5V voltage level operation specifications. This
is in compliance with both Micron and JEDEC specifications."
Note: I am using a 1.5V supply with the MT41K128M16JT-125IT. I don't know whether there would be voltage incompatibility issues with the DSP if you were to operate the DDR3 at 1.35V. This is something you would need to discuss with TI.
Larry
Hi Larry,
You’ve described your problem as a read error issue. I just want to be sure about the specific test that you are performing. Based on your notes, the memory test writes a block of data and then reads it back multiple times, comparing the value read with the value expected. Is that correct? When you see the errors, do the bad values change from read to read or remain consistent? Do the errors appear more frequently in one bit than another, or do they appear randomly across the byte lanes? Are any of the byte lanes stable? What memory test are you using? Is this a test provided by TI or one that you have developed?
As you commented, reading back the same error multiple times from the same memory location would indicate a write error, but seeing different incorrect values over multiple reads would indicate read errors. Customers often use a memory test that writes to memory and then reads once, repeated multiple times; that doesn’t narrow the problem to either a write error or a read error. Based on your comments, you’re reading multiple times with different incorrect results, which indicates a read error, but I wanted to be sure.
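The distinction described above (write once, read several times, then see whether the wrong values are stable or change) can be sketched in C as follows. This is an illustrative outline, not a TI-provided test: the read hook, the test pattern, and the software fault model are all hypothetical, and on real hardware the reads would need to bypass or invalidate the cache so that each read actually reaches the DDR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read hook so the logic can run on real (uncached) DDR or
 * against a software fault model. */
typedef uint32_t (*read_word_fn)(void *ctx, size_t index);

typedef struct {
    unsigned long bit_errors[32]; /* mismatches per data bit, all reads */
    size_t stable_bad_words;      /* same wrong value on every read */
    size_t varying_bad_words;     /* value changed between reads */
} readback_report_t;

/* Hypothetical test pattern; any reproducible pattern works. */
static uint32_t expected_word(size_t index)
{
    return (uint32_t)index * 0x9E3779B9u;
}

static void tally_bits(readback_report_t *rep, uint32_t diff)
{
    for (unsigned b = 0; b < 32; b++)
        if ((diff >> b) & 1u)
            rep->bit_errors[b]++;
}

/* Read every word `passes` times and classify each failing word. */
static void readback_classify(read_word_fn read_word, void *ctx,
                              size_t n_words, unsigned passes,
                              readback_report_t *rep)
{
    assert(read_word != NULL && rep != NULL && passes >= 1);
    memset(rep, 0, sizeof *rep);
    for (size_t i = 0; i < n_words; i++) {
        uint32_t expect = expected_word(i);
        uint32_t first = read_word(ctx, i);
        int any_bad = (first != expect);
        int varies = 0;
        tally_bits(rep, first ^ expect);
        for (unsigned p = 1; p < passes; p++) {
            uint32_t v = read_word(ctx, i);
            tally_bits(rep, v ^ expect);
            if (v != expect) any_bad = 1;
            if (v != first)  varies = 1;
        }
        if (any_bad) {
            if (varies) rep->varying_bad_words++;  /* looks like a read error */
            else        rep->stable_bad_words++;   /* looks like a write error */
        }
    }
}

/* Software fault model for trying the classifier without hardware. */
typedef struct {
    size_t stuck_index;    /* always returns bit 0 inverted */
    size_t flaky_index;    /* bit 5 inverted on every second read */
    unsigned flaky_reads;  /* reads seen so far at flaky_index */
} fake_mem_t;

static uint32_t fake_read(void *ctx, size_t index)
{
    fake_mem_t *m = (fake_mem_t *)ctx;
    uint32_t v = expected_word(index);
    if (index == m->stuck_index)
        return v ^ 0x1u;
    if (index == m->flaky_index && (m->flaky_reads++ & 1u))
        return v ^ (1u << 5);
    return v;
}
```

Run against the fault model with one stuck word and one flaky word, the report separates the two cases, and the per-bit counters show which data bits fail most often. That is the same information requested above: whether the errors cluster on particular bits or byte lanes.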
We've reviewed the data that you sent and we have some questions and concerns. We want to be sure that any changes to your PCB are complete and successful.
Length spreadsheet
It's clear from the length matching spreadsheet that you have routed the two memories in fly by mode but we have some questions. You asked about the length requirements compared to simulated timing requirements. Ideally we would present requirements as a timing relationship, but we have found that generating these requirements in terms of length is more accepted by PCB designers. On most designs the length and the delay correlate pretty closely since groups of signals are routed on the same layer.
1) The length of the data lanes is listed as a single number with no reference to layers. In the PHY calc spreadsheet this length is reflected entirely as stripline. What portion of each trace is on the top or bottom layers?
2) In a number of cases your delay and your length do not correlate as expected. For example, for your DQSN/P_0 and your EDQ0 signals, the DQS traces have shorter lengths than the EDQ0 but the delays for DQS are longer. Since these signals should be routed on the same layers, we’re trying to understand what caused this difference. Were they routed on the same layer? Was the top or bottom layer used to route a significant portion of these traces? Were the same number of vias used for each of the traces in each byte lane?
3) How was the length of each of the address and command traces calculated? The length should have been measured from the ball on the C665x to the ball of the memory device. How many vias were used on each trace? Were the terminating resistors placed at the end of the trace?
4) One of the concerns that we have that is not captured well in the routing guidelines is the length of the stub between the address trace and the ball of the memory. Since the memory trace continues from the C665x to the terminating resistor there is always a stub from the fly-by trace to the ball of the memory. This stub should be kept as short as possible.
5) Can your PCB layout tool provide a report that specifies the length of the address, command, clock and data traces? This report should show the portion of the length between any vias that are part of the layout and the layers where each portion resides. For example a data lane routed between the C665x on the top layer to a memory on the top layer would have a length on top between the C665x and a via, a length of some other layer between the via and a second via and, finally, a length between the second via and the ball of the memory on the top layer.
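To illustrate why a per-layer breakdown matters, here is a minimal C sketch that sums length and delay over the segments of a net. The two propagation constants are assumed, typical-order values for FR-4 outer-layer (microstrip) and inner-layer (stripline) routing, not values from this design; real numbers should come from the stackup or a field solver. Via delay is ignored, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LAYER_MICROSTRIP, LAYER_STRIPLINE } layer_kind_t;

typedef struct {
    layer_kind_t kind;   /* outer layer (microstrip) or inner layer (stripline) */
    double length_mils;  /* routed length of this segment */
} segment_t;

/* ASSUMED propagation delays; replace with values for the actual stackup. */
#define MICROSTRIP_PS_PER_INCH 150.0
#define STRIPLINE_PS_PER_INCH  180.0

/* Total routed length of a net in mils. */
static double net_length_mils(const segment_t *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += segs[i].length_mils;
    return total;
}

/* Total propagation delay of a net in picoseconds (vias not modeled). */
static double net_delay_ps(const segment_t *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        double ps_per_inch = (segs[i].kind == LAYER_STRIPLINE)
                                 ? STRIPLINE_PS_PER_INCH
                                 : MICROSTRIP_PS_PER_INCH;
        total += (segs[i].length_mils / 1000.0) * ps_per_inch;
    }
    return total;
}
```

With these assumed constants, a 1500 mil net routed as 100 mils of microstrip, 1300 mils of stripline and 100 mils of microstrip comes out at about 264 ps, while a 1550 mil net routed as 400, 750 and 400 mils comes out at about 255 ps. The shorter net is the slower one, which is one way length and delay can disagree when the layer mix differs between nets.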
Schematic
1) Would you be willing to provide the schematic pages in a searchable PDF for our review?
PCB layout
In addition to lengths there are a number of other layout issues that could cause memory errors. The error we see most commonly is the lack of a solid ground plane adjacent to the data and clock traces and the lack of a solid ground or solid DVDD15 adjacent to the address and command traces.
1) Can you provide the PCB stackup? Can you specify which layers were used for routing DDR signals?
2) Were the signals for each byte lane routed on a common layer? To be clear each byte lane consists of eight data signals and the associated DQM and DQSP/N.
3) Have the routing requirements associated with ground planes and DVDD15 planes been followed?
4) Would you be willing to provide a PCB database that we could check? We have the ability to view Cadence and PADS databases.
REG Calc spreadsheet
Based on the 1.5V version of the datasheet on the Micron website for the memory device that you are using, the timing information needed for the REG calc spreadsheet appears to be correct.
PHY calc spreadsheet
This spreadsheet looks fine based on the numbers you entered but, as mentioned previously, the lengths are not broken down into stripline and microstrip. Are the lengths accurately entered?
Regards, Bill
Hi All,
We are also facing the same sort of issue. Was the issue solved on your side?
Regards,
Avinash N
Hi Avinash, sorry, I did not see your reply until today, but I did see the post "2Gb DDR3L" you initiated on 11/5 and was wondering if you were having the same issue. We still have not determined why partial automatic leveling does not work (erratic readback) with the MT41K128M16JT-125 DDR3L part (the same part number you are using). We switched to full automatic leveling along with incremental leveling (turned on for 100ms and then off) after DDR initialization completes, and that is our plan for production. This leveling method has worked 100% for us, even with the higher DDR speed planned for the product (1315 vs. 1033). We did have a few boards made with two different 1.5V DDR3 part numbers. The 1.5V DDR parts all seem to work fine using partial automatic leveling (same GEL settings), but we have not done extensive testing. This seems to point to a difference in the DDR3L part (note we are still operating the DDR3 with a VDD of 1.5V), but to date we have not found an explanation.
Can you provide an update from your end? Is the problem you are having erratic readback when using partial automatic leveling (i.e., typically a single bit in a word, varying between 100 and 10,000 errors when writing and then reading a 3MB file, with errors that are data dependent and bit locations that vary from board to board)? Does the memory work fine with a DDR3 (1.5V) part using partial automatic leveling? Have you tried adding full leveling with incremental leveling on the DDR3L part to see if the problem is fixed? Thanks
Larry
Larry,
The C6654/55/57 DDR3 initialization sequence was found to be deficient. This has been addressed in other e2e posts. The corrected initialization process is now fully documented in the on-line documentation.
The revised KeyStone I DDR3 Initialization Application Report SPRABL2D can be downloaded from: http://www.ti.com/litv/pdf/sprabl2d. Please also download the latest support spreadsheets from: http://www.ti.com/litv/zip/sprabl2d.
The revised KeyStone Architecture DDR3 Memory Controller User Guide SPRUGV8E can be downloaded from: http://www.ti.com/lit/pdf/sprugv8.
The latest C6654/55/57 GEL file change is in Keystone1 Emupack release 1.0.9.1 available at: http://software-dl.ti.com/sdoemb/sdoemb_public_sw/ti_emupack_keystone1/latest/index_FDS.html.
Tom