am3517 DDR2, kernel crashes

Hello TI community!

We have a custom AM3517 board.

On some boards we have been experiencing random kernel crashes (Linux 2.6.37 from the Arago git).

Further investigation has shown that this might be due to DDR2 issues. Running memtester from Linux on faulty boards confirms that some memory operations fail (especially the bit-flip test). We are using 2x128MB Winbond W971GG6JB chips, where one chip uses the lower 16 bits of the 32-bit address/data and the other chip the high 16 bits (one CS used).
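
For readers unfamiliar with this kind of test, the idea can be sketched roughly as follows. This is a simplified illustration and not memtester's actual implementation: the function name `bitflip_check` and the alternating-word layout are assumptions made for the example. It writes a single set bit (and then its complement) across a buffer and counts the words that read back wrong.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified bit-flip style check (NOT memtester's real code).
 * For each of the 32 bit positions, fill the buffer with a word that has
 * only that bit set, alternating with its complement on odd indices,
 * then read everything back. A second pass swaps the two patterns.
 * Returns the number of words that did not read back as written.
 */
static size_t bitflip_check(volatile uint32_t *buf, size_t words)
{
    size_t errors = 0;

    for (unsigned bit = 0; bit < 32; bit++) {
        uint32_t pattern = (uint32_t)1 << bit;

        for (int pass = 0; pass < 2; pass++) {
            /* Write phase: even words get pattern, odd words its complement. */
            for (size_t i = 0; i < words; i++)
                buf[i] = (i & 1) ? ~pattern : pattern;

            /* Read-back phase: count every mismatch. */
            for (size_t i = 0; i < words; i++) {
                uint32_t expected = (i & 1) ? ~pattern : pattern;
                if (buf[i] != expected)
                    errors++;
            }

            pattern = ~pattern; /* swap roles for the second pass */
        }
    }
    return errors;
}
```

On healthy memory the function returns 0. A small buffer like the one in a quick test mostly exercises the cache, so a meaningful run needs a buffer much larger than the cache, which is what memtester itself takes care of.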

We have found that setting EMIF4_CFG_DDR2_DDQS=0 (single-ended data strobe) makes these boards pass all memory tests.

It is also impossible to start U-Boot when full drive strength (EMIF4_CFG_SDR_DRV=0 in x-loader) is used.

We are pretty sure that the timing registers EMIF4_TIM(1-3) are OK. We have checked these with the spreadsheet tool; the values from the datasheet give even narrower timings than what we are currently using in x-loader.

One discrepancy on our board is that we are using a 3.3V input to the 26 MHz core clock (SYS XTALIN), and this should be 1.8V according to updated datasheets.

ANY thoughts on why we are getting these symptoms would be greatly appreciated!

- Why does single ended DQS work when differential should be better?

- Could this be a timing problem nonetheless?

- What could cause 100% drive-strength NOT to work when 60% works? Feels like it should be the other way around with DDR2?

- Could this be a problem with supply voltage to RAM?

- Could this be a symptom of 3.3V core clock?

We are hesitant to run our DDR2 with single-ended DQS and reduced drive strength, as we are unsure how this will affect long-term reliability.

Best regards,

Anton Olofsson

  • Hi Anton,

    I'm not an expert on DDR configuration by any means, and I can't provide a "why" explanation (I hope someone else can because I'm curious about this too), but I just thought I'd mention that I ran into a very similar issue on the AM3517 Craneboard, which I believe is configured identically (two 128 MB chips, one of them on D0-D15, the other on D16-D31).

    The Craneboard uses the Micron MT47H64M16HR-25E (revision H) rather than Winbond chips, but what I was seeing is very similar to what you described. It was failing memtester's bit flip test (always on bit 6 [0x40] for some reason), and I was seeing weird behavior like 'a' characters turning into '!' in displayed messages and random kernel crashes.
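
    The 'a' to '!' corruption is consistent with the bit 6 failure: in ASCII the two characters differ in exactly that bit. A tiny sketch to check this (the helper name `flipped_bits` is made up for the example):

```c
#include <stdint.h>

/* Return a mask of the bits that differ between the expected and observed byte. */
static uint8_t flipped_bits(uint8_t expected, uint8_t observed)
{
    return (uint8_t)(expected ^ observed);
}
```

    With 'a' (0x61) as the expected byte and '!' (0x21) as the observed one, the result is 0x40, i.e. only bit 6 differs.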

    I ended up finding a fix from another company's (pironex) board that's based on the Craneboard, where they discovered that a change between Micron's revision G and revision H chips seemed to cause different behaviors. Looking through their github commit history, they started out by fixing it with half drive strength which seemed to fix the problem, but their eventual solution was to use full drive strength and enable memory termination on read (DDR_CONFIG_TERMON bits in the CONTROL_DEVCONF3 register).

    I don't know if this fix will work for you, but it might be worth a try--after we made the change to the Craneboard x-loader, we are passing all of the memory tests now too. We also added the dynamic VTP compensation that was added to their x-loader as well, but the termination is really what fixed our problem.

    Hope this helps,

    Doug

  • Doug!

    You are our hero! Can't thank you enough!

    I haven't been able to check all "faulty" boards yet, but your post really seems to have fixed our problems! I've been running memtester all morning on a card with memory problems without any symptoms at all.

    Setting DDR_CONFIG_TERMON in the DEVCONF3 register first thing in x-loader's config_emif4_ddr() function was the ticket. With this fix it is now possible to use full drive strength and differential DQS signals again, just as you mentioned (reduced drive strength also seems to work with termination enabled).
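
    For anyone wanting to try the same thing, the change amounts to a read-modify-write of one bit field. The sketch below is only an illustration of that pattern and is not the actual x-loader patch: the field position and width (`DDR_CONFIG_TERMON_SHIFT`, `DDR_CONFIG_TERMON_MASK`) are placeholders, the helper name `set_termon` is made up, and the register address and the termination value to write must be taken from the AM3517 TRM.

```c
#include <stdint.h>

/*
 * PLACEHOLDER field layout: the shift and width below are assumptions for
 * illustration only. Look up the real DDR_CONFIG_TERMON bits of
 * CONTROL_DEVCONF3 in the AM3517 TRM before using this on hardware.
 */
#define DDR_CONFIG_TERMON_SHIFT 2u
#define DDR_CONFIG_TERMON_MASK  (0x3u << DDR_CONFIG_TERMON_SHIFT)

/*
 * Read-modify-write: update only the termination field and leave every
 * other bit of the register untouched.
 */
static void set_termon(volatile uint32_t *devconf3, uint32_t value)
{
    uint32_t reg = *devconf3;

    reg &= ~DDR_CONFIG_TERMON_MASK;                                      /* clear the field */
    reg |= (value << DDR_CONFIG_TERMON_SHIFT) & DDR_CONFIG_TERMON_MASK;  /* set new value   */

    *devconf3 = reg;
}
```

    In x-loader this would be called at the start of config_emif4_ddr() with a pointer to the CONTROL_DEVCONF3 register. Taking the register as a pointer argument also lets the helper be exercised against an ordinary variable on a host machine.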

    To be honest, we had tried different termination settings, but only on the DDR chip side (EMIF4_CFG_DDR_TERM). I'm guessing DDR_CONFIG_TERMON is the AM3517-side termination.

    As a side note, the power increase for one of our boards in "idle" Linux seems to be roughly 0.2W to 0.3W with the above termination.

    There are now at least three different designs, although with very similar memory configurations, all with the same problem and solution. I'm thinking that this is some quirk of the AM3517 itself... Sadly, I can't do much more than observe when it comes to questions about DDR2, though.

    Best regards,

    Anton

  • Awesome, glad it seems to have worked for you too! You're right, it appears to be enabling termination on the AM3517 side.

    As for the power increase -- I should have mentioned that. They talk about those bits in the TI wiki and mention that values other than 0 are "not typically necessary and will result in increased power consumption".

    Also, TI's ODT support presentation says "please use this with caution since this is not exhaustively tested."

    It seems to work for us! :-)

  • Doug,

    I would like to echo Anton's well wishes. I was having the same issues, and your research has helped us solve our DDR2 timing issues.

    Thanks for your thoughtfulness in posting issues that you have solved which could benefit someone else. This is what engineering is all about.