AM3352 crashes after running for a long time

We use the AM3352BZCZD60 on our custom boards. The software is SDK 6.0.

We have built the board in three batches. In the first batch of 10 boards, all worked. In the second batch of 10 boards, all worked. In the third batch of 60 boards, only 6 work. Of the rest, some cannot boot (MLO runs, but U-Boot crashes), and some boot but crash after running for a few minutes.

The PCB layout was not changed between the three batches. The PCBs were newly fabricated for each batch.

 

Since on some boards MLO runs but U-Boot does not, we think the problem may be in the DDR.

 

The board’s DDR is DDR2, part number: SAMSUNG 407 K4TIG64QF-BCF7

For various reasons, we designed the PCB with only 4 layers, and DQS is single-ended.

 

So we rechecked the DDR settings and found that some of them have a huge effect:

DDR_PHY_CTRL_1 (@0x4C0000E4) field REG_PHY_RD_LOCAL_ODT

Until now we set this field to “ODT off”, and many of the EVM examples also use ODT off. But this time it doesn’t work.

When we change the REG_PHY_RD_LOCAL_ODT field of DDR_PHY_CTRL_1 (@0x4C0000E4) from “ODT off” to “Full Thevenin load” or “Half Thevenin load” in the U-Boot file ddr_defs.h, all the boards boot into Linux, and the application then runs for longer (from a few minutes up to several hours or three days). But many boards still crash after running for a long time.

 

Questions:

1. Which one should I choose, “Full Thevenin load” or “Half Thevenin load”?

2. When can I disable the AM335x-side termination during reads, i.e. choose “ODT off”?

3. What is the function of REG_PHY_RD_LOCAL_ODT? What does this field depend on: the DDR settings or the hardware board design?

 

Another test:

I wanted to slow down the DDR clock, so I chose a 150 MHz DDR clock and copied the parameters that passed the test in CCS into the ddr_defs.h file used by MLO and U-Boot. I rebuilt and downloaded it to the board.

MLO runs, and the console connected to the board prints some information such as:

U-Boot SPL 2013.01.01 (Nov 14 2015 - 16:12:12)

test header info:magic=0xee3355aa, name=A3352EVM1.5, version=1.5, serial=, config=SKU#01

A335XEVM pass.

but U-Boot doesn’t run.

Then I connected the board to an XDS100v2 and CCS6, with no GEL file.

Then I loaded the DDR test program in CCS and ran it, and the test passed.

I do not know the reason for this.

 

To simplify the test we use CCS v6 and an XDS100v2.

The DDR test program is based on the DM81xx program.

1.

DDR clock = 266 MHz, REG_PHY_RD_LOCAL_ODT = “ODT off”.

The DDR test PASSED.

DDR Settings in Gel file:

#define DDR2_SDRAM_TIMING1 0x0666B3D1

#define DDR2_SDRAM_TIMING2 0x342431CA

#define DDR2_SDRAM_TIMING3 0x0000021F

 

#define CMD_PHY_CTRL_SLAVE_RATIO       0x80

#define CMD_PHY_INVERT_CLKOUT           0x0

 

#define DATA_PHY_RD_DQS_SLAVE_RATIO     0x40

#define DATA_PHY_FIFO_WE_SLAVE_RATIO   0x7F

#define DATA_PHY_WR_DQS_SLAVE_RATIO     0x1

#define DATA_PHY_WR_DATA_SLAVE_RATIO   0x40

#define DDR2_SDRAM_CONFIG   0x42005232

#define DDR2_REF_CTRL       0x0000081A   //266*7.8us = 2074.8 = 0x81A

#define DDR2_READ_LATENCY   0x00000009
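The DDR2_REF_CTRL value above follows from multiplying the DDR clock by the 7.8 us refresh interval and dropping the fraction. As a minimal sketch of that arithmetic (the helper name ddr_refresh_count and the 16-bit range check are assumptions for illustration, not something taken from the SDK or the GEL file):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Refresh interval expressed in DDR clock cycles:
 *   cycles = f_ddr [MHz] * tREFI [us], rounded down.
 * tREFI is passed in nanoseconds so the calculation stays in integer math.
 * Hypothetical helper; the 16-bit limit on the result is an assumption.
 */
static uint32_t ddr_refresh_count(uint32_t ddr_clk_mhz, uint32_t trefi_ns)
{
    uint32_t cycles = (ddr_clk_mhz * trefi_ns) / 1000u;

    assert(cycles <= 0xFFFFu); /* assumed width of the refresh-rate field */
    return cycles;
}
```

With tREFI = 7800 ns, ddr_refresh_count(266, 7800) gives 2074 (0x81A) and ddr_refresh_count(150, 7800) gives 1170 (0x492), which matches the two DDR2_REF_CTRL values used in this post.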

 

2. Power down the board and edit the DDR2 defines in the GEL file.

 

DDR clock = 150MHz REG_PHY_RD_LOCAL_ODT = “Full thevenin load” .

DDR Settings in Gel file:

#define DDR2_SDRAM_TIMING1 0x04446209 //150M

#define DDR2_SDRAM_TIMING2 0x341431CA //150M

#define DDR2_SDRAM_TIMING3 0x0000013F //150M

 

#define CMD_PHY_CTRL_SLAVE_RATIO       0x80

#define CMD_PHY_INVERT_CLKOUT           0x0

 

#define DATA_PHY_RD_DQS_SLAVE_RATIO     0x40

#define DATA_PHY_FIFO_WE_SLAVE_RATIO   0x63

#define DATA_PHY_WR_DQS_SLAVE_RATIO     0x0

#define DATA_PHY_WR_DATA_SLAVE_RATIO   0x40

#define DDR2_SDRAM_CONFIG   0x42005232

#define DDR2_REF_CTRL       0x00000492 //150*7.8us = 1170 = 0x492

#define DDR2_READ_LATENCY   0x00000209  

 

Set the DDR PLL to 150 MHz.

Result of loading and running the DDR test program:

 ALL Tests Passed

 

3. Power down the board.

Edit only the line below:

#define DDR2_READ_LATENCY   0x00000209

to:

#define DDR2_READ_LATENCY   0x00000009

Connect to the Cortex-A8 with the XDS100v2.

Then load the DDR test program and run it.

The result is:

 Testing DDR Memory ...

Error at 80200004

Error at b0010000

FAIL ... error code 00000050.. quiting
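
The two DDR2_READ_LATENCY values in steps 2 and 3 (0x209 and 0x009) differ only in bit 9. Assuming this define is the value written to DDR_PHY_CTRL_1, and assuming REG_PHY_RD_LOCAL_ODT sits in bits 9:8 with the read latency in bits 4:0, the change in step 3 switches the AM335x read ODT from “Full thevenin load” back to “ODT off” and leaves the read latency at 9. The sketch below decodes and modifies the field under those assumptions. The field positions and encodings are inferred from the values in this post and should be checked against the TRM, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of DDR_PHY_CTRL_1 (verify against the TRM). */
#define RD_LOCAL_ODT_SHIFT 8u
#define RD_LOCAL_ODT_MASK  (0x3u << RD_LOCAL_ODT_SHIFT)
#define READ_LATENCY_MASK  0x1Fu

/* Assumed encodings; 0 and 2 match the values used in this post. */
enum rd_local_odt {
    RD_ODT_OFF           = 0,
    RD_ODT_OFF_ALT       = 1, /* assumption: also "ODT off" */
    RD_ODT_FULL_THEVENIN = 2,
    RD_ODT_HALF_THEVENIN = 3  /* assumption */
};

/* Extract the read ODT field from a DDR_PHY_CTRL_1 value. */
static unsigned get_rd_local_odt(uint32_t phy_ctrl_1)
{
    return (phy_ctrl_1 & RD_LOCAL_ODT_MASK) >> RD_LOCAL_ODT_SHIFT;
}

/* Extract the read latency field from a DDR_PHY_CTRL_1 value. */
static unsigned get_read_latency(uint32_t phy_ctrl_1)
{
    return phy_ctrl_1 & READ_LATENCY_MASK;
}

/* Return the register value with only the read ODT field replaced. */
static uint32_t set_rd_local_odt(uint32_t phy_ctrl_1, unsigned odt)
{
    assert(odt <= 3u);
    return (phy_ctrl_1 & ~RD_LOCAL_ODT_MASK) | ((uint32_t)odt << RD_LOCAL_ODT_SHIFT);
}
```

Under these assumptions, get_rd_local_odt(0x209) returns RD_ODT_FULL_THEVENIN and get_rd_local_odt(0x009) returns RD_ODT_OFF, so step 3 is effectively the 150 MHz configuration with read ODT disabled.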

  • jingtao wang said:

    So we rechecked the DDR settings and found that some of them have a huge effect:

    DDR_PHY_CTRL_1 (@0x4C0000E4) field REG_PHY_RD_LOCAL_ODT

    Until now we set this field to “ODT off”, and many of the EVM examples also use ODT off. But this time it doesn’t work.

    When we change the REG_PHY_RD_LOCAL_ODT field of DDR_PHY_CTRL_1 (@0x4C0000E4) from “ODT off” to “Full Thevenin load” or “Half Thevenin load” in the U-Boot file ddr_defs.h, all the boards boot into Linux, and the application then runs for longer (from a few minutes up to several hours or three days). But many boards still crash after running for a long time.

    ODT consumes significant power.  The EVM design set out to prove that it's not specifically required to use ODT to have a reliable design.  This was important for certain portable applications that are battery powered and therefore sensitive to power consumption.  However, in my experience most applications should be using ODT, particularly if you are not battery powered.  ODT will substantially improve signal integrity on reads.  (It's performed in the external memory for writes.)

    jingtao wang said:
    1. Which one should I choose, “Full Thevenin load” or “Half Thevenin load”?

    There's no definitive answer here or else we would have just hard-wired the chip.  The recommended path would be to perform IBIS simulations if you want a specific answer.

    jingtao wang said:
    2. When can I disable the AM335x-side termination during reads, i.e. choose “ODT off”?

    You would be expected to do extensive modeling and validation in order to be certain that your signal integrity is sufficiently good (across temperature) to support operation without ODT.  Generally ODT is recommended.

    jingtao wang said:
    3. What is the function of REG_PHY_RD_LOCAL_ODT? What does this field depend on: the DDR settings or the hardware board design?

    This field is what controls ODT on the AM335x side of things.  ODT on the DRAM is controlled in the SDRAM_CONFIG register.  These details and many more are discussed in this article:

    http://processors.wiki.ti.com/index.php/How_to_use_the_AM335x_IBIS_Models

  • Thanks. That is what we wanted to know, and we will give it a try. Thanks very much.
    I found out why the board could not run at the 150 MHz DDR clock. For debugging I had added a while(1) to the SPL and forgot to remove it later. After removing it, it works.
    But at the 150 MHz DDR clock, some boards still crash.
    We are now doing more tests to find out why they crash after running for a long time. We also want to change the PCB from 4 layers to 6 layers. Do you think this would be better?
  • It's hard to say if your proposed changes will make an improvement without having a better understanding of the underlying failure. For example, many failures are related to power (e.g. wrong voltages, insufficient decoupling caps, noise, etc.). If your instability is due to power issues then no amount of DDR improvement is going to help! On the other hand, DDR issues are another common source of instability. It would be best if you could actually "see" the instability through some kind of memory test. For example, if you were consistently able to find bit errors on a specific lane or a specific bit, then you could focus your changes on fixing those issues. I think you need to spend more time diagnosing the underlying issues.

    A couple thoughts are:

    1. Does raising vdd_mpu and/or vdd_core a bit (e.g. 25-50mV) make the instability go away? If so, I think you need to look more closely into power.
    2. Can you actually detect bit errors using a memory test?
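
For the second point, a memory test is most helpful when it reports which data bits fail rather than only pass or fail. The following is a rough sketch of that idea, assuming a 16-bit DDR2 data bus with two byte lanes. The structure and function names are hypothetical, and the alternating pattern is deliberately simple.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_BITS 16u /* assumption: 16-bit DDR2 data bus (2 byte lanes) */

struct bit_error_stats {
    uint32_t per_bit[DATA_BITS]; /* error count for each data bit */
    uint32_t failed_words;       /* words with at least one wrong bit */
};

/* Compare one word and record which bits differ. */
static void record_errors(struct bit_error_stats *stats,
                          uint16_t expected, uint16_t actual)
{
    uint16_t diff = (uint16_t)(expected ^ actual);

    if (diff == 0)
        return;
    stats->failed_words++;
    for (unsigned bit = 0; bit < DATA_BITS; bit++)
        if (diff & (1u << bit))
            stats->per_bit[bit]++;
}

/* Write pattern / inverted pattern alternately, then read back and compare. */
static void mem_test_pattern(volatile uint16_t *base, size_t words,
                             uint16_t pattern, struct bit_error_stats *stats)
{
    assert(base != NULL && stats != NULL);
    for (size_t i = 0; i < words; i++)
        base[i] = (i & 1u) ? (uint16_t)~pattern : pattern;
    for (size_t i = 0; i < words; i++) {
        uint16_t expected = (i & 1u) ? (uint16_t)~pattern : pattern;
        record_errors(stats, expected, base[i]);
    }
}

/* Sum the errors of one byte lane (lane 0 = bits 7:0, lane 1 = bits 15:8). */
static uint32_t lane_errors(const struct bit_error_stats *stats, unsigned lane)
{
    uint32_t sum = 0;

    assert(lane < DATA_BITS / 8u);
    for (unsigned bit = lane * 8u; bit < lane * 8u + 8u; bit++)
        sum += stats->per_bit[bit];
    return sum;
}
```

On the target, base would point at a DDR region that the test program itself does not occupy, and the data cache would need to be off or flushed so that the accesses really reach the DRAM. If the counts in per_bit or lane_errors keep clustering on the same bits or the same lane across runs and boards, that points at specific traces rather than at a general power problem.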
  • Thank you Brad. Over the past few days we did more tests.

    We turned on the heartbeat LED for the tests.

    1. @150 MHz DDR2 clock: 18 boards ran continuously for 4 days and none crashed.
    We ran 180 power on/off cycles and no board died during boot.

    2. @266 MHz DDR2 clock: on 2 boards we raised vdd_mpu and vdd_core by 50 mV, and one of them crashed after 2 hours. On 2 other boards we left vdd_mpu and vdd_core at the datasheet value of 1.1 V, and one of them crashed after 3 hours. The heartbeat LED on the crashed boards also stopped.
    Power on/off tests: sometimes a board dies during boot.

    So we want to redesign the board. The PCB will go from 4 layers to 6 layers.

  • jingtao wang said:
    So we want to redesign the board. The PCB will go from 4 layers to 6 layers.

    I think re-doing the DDR2 layout should help.  Make sure the adjacent reference layers don't have splits or any other discontinuities that will impact the impedance of the DDR2 transmission lines.  And of course follow the skew requirements listed in the data manual.

  • PS. I checked your timings quickly as a sanity check, and everything looked good. So I agree the issue is the DDR interface and not power nor EMIF timings.