
TMS320C6678 DDR3 interface

Other Parts Discussed in Thread: TMS320C6678

Hi,

I have some questions about the DDR3-1333 MT/s interface implementation for the TMS320C6678.

We want to implement a 64-bit interface based on x16 DDR3 devices, therefore requiring four chips. Since there is very little space available on our board, the DDR3 devices must be placed on both the top and bottom sides of the PCB. To help reduce stub effects, micro-vias can be used for this board. I am considering the following routing topologies (either diagram 1 or 2 will be chosen for the address routing):

Here are my questions:

1) Since the traces from the DSP to the first DDR3 device will be short (< 5 cm), we will have to use the INVERT_CLOCK feature.

==> Is this feature fully operational?

It seems strange that SPRABI1A contains many calculations that assume "Invert_clock" is enabled, yet ends (page 38) with a note stating:

"all topologies should be designed for a positive
skew between the command delay and data delay to avoid this situation"

If the feature is not operational, is it possible instead to simply swap the routing of clock_out_P and clock_out_N so as to produce a physical inversion of the clock?

2) With the above routing topologies for ADR/CMD/CTRL/clock, there will be very little skew between chips U1/U2 and U3/U4 (topology 1), or U2/U3 (topology 2).

Is it a problem for the leveling process if two different DDR devices have the same DSP-to-DDR delay?

( ==> Is there a minimum skew required between two different DDR devices?)

3) Often, the recommendations for DDR3 data routing are to implement 40 or 45 ohm single-ended traces (DQ) and 80 ohm differential traces (DQS), in order to match the 34 ohm or 40 ohm output driver impedance (JEDEC JESD79).

Also, the common recommendations for ADR/CMD/CTRL are to implement 40 ohm "lead-in" traces and 60 ohm "inter-DRAM" traces. The purpose is to account for the input capacitance of the four DDR packages, which lowers the effective impedance of the transmission line.

==> As far as I can tell, TI simply recommends 50 ohm traces. Is this a mistake? Or is it because the DSP driver output impedance is not calibrated to 34 or 40 ohms? Or because simulations show that signal integrity is acceptable anyway? Or some other reason?

Thank you for your help,

With best regards,

Bruno

  • Bruno,

    Routing the DDR3 memory devices on both top and bottom is an acceptable implementation.  I know of multiple designs in production that have this layout.  This can best be accomplished by using buried or blind via stack-ups including those with micro-vias.  The fly-by routing topology must still be met.  All fly-by nets must follow the same basic route and connect to the SDRAMs in the same order and with similar stub lengths.  I believe your second figure will be easier to implement.  DDR3 BGA SDRAMs are not available in mirrored packages like older flat-pack memories were.  The criss-crossing  fly-by routes created by the first diagram will cause difficulties.

    1.  The INVERT_CLOCK comment in the note on page 38 of the DDR3 Layout Guidelines SPRABI1A is misleading.  Logically, the clock delay must always be longer than the data strobe delay.  The INVERT_CLOCK feature adds an apparent half clock period of delay to make sure this is always met.  In the physical layout, you should keep the clock length longer than, or only slightly shorter than, the shortest data strobe length.  INVERT_CLKOUT is still needed even when the data strobe and clock nets are the same length.  Do not swap the P and N of the DDRCLKOUT signal; the software has no way to account for a physically inverted clock polarity.
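The timing relationship described above can be sketched numerically. This is an illustrative calculation only, not TI's method or the PHY_CALC equations: the 6.7 ps/mm stripline propagation delay, the function name, and the example lengths are all assumed values.

```python
# Illustrative sketch (not TI's tool): shows how INVERT_CLKOUT adds an
# apparent half clock period to the clock delay so that it exceeds the
# data strobe delay. The 6.7 ps/mm stripline delay is an assumed value.

PS_PER_MM = 6.7  # assumed stripline propagation delay, ps/mm

def clock_margin_ps(clock_len_mm, strobe_len_mm, data_rate_mts, invert_clkout):
    """Return (clock delay - strobe delay) in ps; INVERT_CLKOUT contributes
    an apparent half clock period (1e6 / data_rate ps at a rate in MT/s)."""
    half_period_ps = 1e6 / data_rate_mts
    clock_delay = clock_len_mm * PS_PER_MM
    strobe_delay = strobe_len_mm * PS_PER_MM
    if invert_clkout:
        clock_delay += half_period_ps
    return clock_delay - strobe_delay

# A 45 mm clock route against a 50 mm strobe route at DDR3-1333:
print(clock_margin_ps(45, 50, 1333, invert_clkout=False))  # negative skew
print(clock_margin_ps(45, 50, 1333, invert_clkout=True))   # positive skew
```

Without INVERT_CLKOUT the clock arrives earlier than the strobe (negative skew, not allowed); the apparent half period restores positive skew even for a slightly shorter clock route.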

    2.  There is no minimum skew between SDRAMs and their fly-by routing.  Each byte lane is independently leveled simultaneously.  However all fly-by nets must be length-matched from the controller to each SDRAM.  Routing all of these nets using only micro-vias will be difficult.  I expect that you will need 3 to 4 routing layers plus reference planes both top and bottom to make this work.  I have seen this done elegantly on a blind-via stack-up where there are separate 8-layer stacks top and bottom. 

    3.  These impedance decisions depend on the preferences of the customer.  We provide the impedances that we used in our validation.  Whichever you choose, you need to verify it through IBIS simulation.  Techniques like the ones you mention are discussed in the JEDEC UDIMM spec, which also covers different impedance strategies for various topologies.
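The capacitive-loading effect behind the 40/60 ohm strategy mentioned in the question can be estimated with the standard loaded-transmission-line approximation Z' = Z0 / sqrt(1 + Cload/Cline). This is a hedged illustration; the function name, the 6.7 ps/mm delay, and the 1.5 pF input capacitance are assumed example values, not TI or JEDEC data.

```python
import math

# Illustrative sketch: effective impedance of a trace segment lowered by a
# lumped SDRAM input-capacitance load, Z' = Z0 / sqrt(1 + Cload/Cline).
# All numeric values below are assumed examples, not vendor data.

def loaded_impedance(z0_ohm, t_pd_ps_per_mm, seg_len_mm, c_load_pf):
    """Effective impedance of a segment with intrinsic impedance z0_ohm,
    loaded by one device input capacitance c_load_pf (pF)."""
    c_line_pf = t_pd_ps_per_mm * seg_len_mm / z0_ohm  # intrinsic capacitance
    return z0_ohm / math.sqrt(1.0 + c_load_pf / c_line_pf)

# A 60 ohm, 10 mm inter-DRAM segment loaded by ~1.5 pF drops to ~39 ohms:
print(round(loaded_impedance(60, 6.7, 10, 1.5), 1))
```

This is why a nominally higher-impedance inter-DRAM segment can present an effective impedance near 40 ohms once loaded, which is the rationale behind the 40/60 ohm recommendation in the question.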

    Tom

  • Hi Tom,

    Thank you for your detailed answer. Some clarification would be helpful:

    - You explain that "In the physical layout, you should keep the clock length longer or only slightly shorter than the shortest data strobe length".

    1) Do you agree that "only slightly shorter" can extend up to 4 to 5 cm, as shown in SPRABI1A Table 17, meaning that the clock and ADR/CMD bus would be shorter than the DQ/DQS bus?

    Is this correct?

    2) In your statement above, "clock length" means the clock length from the DSP to one DDR chip, and "strobe length" means the strobe length from the DSP to the same DDR chip; this applies to each of the four chips individually.

    Is this correct?

    With best regards,

    Bruno

  • Bruno,

    #1  I recommend that the clock not be 4 to 5 cm shorter than the data strobe, although this is functionally valid at the lower operating rates.  It will limit the maximum data rate the design can support.  The PHY_CALC spreadsheet distributed with the KeyStone DDR3 Initialization Application Report SPRABL2A, available at http://www.ti.com/litv/pdf/sprabl2a, repeats at the bottom of the Instructions tab the equations that are enforced when calculating the PHY initialization register values.  All of these limits vary with clock frequency.
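A rough way to see the frequency dependence mentioned above: the half clock period contributed by INVERT_CLKOUT shrinks as the data rate rises, while the delay of a fixed routing deficit stays constant. The numbers below (6.7 ps/mm, a 45 mm deficit) are assumptions for illustration, not values from the PHY_CALC spreadsheet.

```python
# Illustrative only: fraction of the INVERT_CLKOUT half-period budget
# consumed when the clock is routed 45 mm shorter than the strobe.
# 6.7 ps/mm is an assumed stripline propagation delay.

PS_PER_MM = 6.7
DEFICIT_MM = 45  # clock routed 4.5 cm shorter than the strobe

def budget_consumed(data_rate_mts, deficit_mm=DEFICIT_MM):
    """Fraction of the apparent half clock period eaten by the deficit."""
    half_period_ps = 1e6 / data_rate_mts  # half period in ps at an MT/s rate
    return deficit_mm * PS_PER_MM / half_period_ps

for rate in (800, 1066, 1333):
    print(rate, round(100 * budget_consumed(rate)), "% consumed")
```

The fixed deficit consumes a growing share of the shrinking half-period budget as the rate increases, consistent with the point that a large clock deficit limits the maximum supported data rate.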

    #2  Your statement is correct.  You can see that the equations in SPRABI1A apply to each byte lane individually.

    Tom

     

  • Hi Tom,

    In my project, I have run into the same question. Because of PCB layout area limitations, our PCB layout engineer put the four DDR3 chips on both the top and bottom and routed the fly-by nets as shown below.

    My questions are:

    1. For fabrication cost considerations, we used only through-hole vias (no buried or blind vias), as we normally did before when placing all four DDR3 chips on the top layer.

    Could these via stubs degrade signal integrity enough to hurt data transfer stability when running at 1066 MT/s or 1333 MT/s? We have 18 layers and the board thickness is around 2.3 mm.

    Previously, when all four DDR3 chips were placed on the top layer with no buried or blind vias, the DDR3 ran fine at 1333 MT/s.

    2. For ease of layout, our PCB engineer changed the chip order as shown in the figure above. The clock nets are first routed to layer 14 and connected to U3 through a via, then routed to U4, then U2, and finally U1.

    So when using fly-by routing, is it critical that the DDR3 chip order follow the ascending byte sequence?

    3. In short, is the fly-by topology shown in the attached figure OK?

     

    Regards,

    Feng

  • Feng,

    1.  Yes, this should be acceptable, but we acknowledge that the long via stubs will degrade signal integrity.  Also, you only show the fly-by group routes and vias; the data group routes and vias are even more problematic.  When all SDRAMs and the SoC are on top, the data group routes can be placed on layers close to the bottom to minimize via stub length.  With SDRAMs placed top and bottom, this is not an option, and some data group via barrels will be very long.  These may need back-drilling.
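A first-order way to sanity-check the through-hole via stubs discussed above is the quarter-wave stub resonance estimate. This is a hedged back-of-the-envelope sketch; the dielectric constant (4.2) and the worst-case 2 mm stub length are assumed values for a 2.3 mm board, not measured parameters.

```python
import math

# Illustrative quarter-wave estimate of via-stub resonance. Signal content
# near the resonance frequency is strongly attenuated by the stub.
# Er = 4.2 is an assumed FR-4 value, not a measured board parameter.

C_MM_PER_PS = 0.2998  # speed of light in vacuum, mm/ps

def stub_resonance_ghz(stub_len_mm, er=4.2):
    """Quarter-wave resonance frequency (GHz) of a via stub: f = v / (4L)."""
    v = C_MM_PER_PS / math.sqrt(er)       # propagation velocity, mm/ps
    return 1000.0 * v / (4.0 * stub_len_mm)

# Worst case on a 2.3 mm board: ~2 mm of unused via barrel:
print(round(stub_resonance_ghz(2.0), 1))  # well above DDR3-1333 harmonics
```

With the estimated resonance far above the 0.67 GHz Nyquist frequency of DDR3-1333 and its first few harmonics, the stubs degrade margins without necessarily being fatal, which matches the assessment above.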

    2.  The fly-by order is not significant.  Leveling is fully independent for each byte lane on the K2H device.

    3.  This layout is possible but we prefer the single layer topology.  We have seen customers have much better success when all SDRAMs are on a single layer.

    Tom

  • Hi Tom,

    Thanks for your detailed replies!

    Regards,
    Feng