We have been using the AM3352 processor with a single 512MB Micron DDR3 RAM chip in our products for several years. During manufacturing, a small number of boards fail testing (around 5 of 2000) because they do not boot after a warm reset (induced by the reboot command of the Linux OS). In such cases the MLO is loaded, initializes the EMIF, and then crashes after a certain number of accesses to the DDR3. After a cold reset, all of these boards run stably and produce no errors when running a deep RAM test.
The bootloader (barebox 2022.08, but it's the same for u-boot) always initializes the EMIF in the same way (full initialization). From investigating the signals DDR_RESET and DDR_CKE with an oscilloscope, I can tell that even with the same init code the hardware behaves differently in both reset cases:
On cold reset there is a proper delay (~500us) between the rising edge of DDR_RESET and the rising edge of DDR_CKE, which is not the case for a warm reset (only ~30us delay in between the rising edges).
From what I have read in a related thread (comment by TI employee JJD), in case of a warm reset, a different initialization sequence should be used.
What is the correct EMIF initialization sequence (which registers have to be written) for a warm reset? Unfortunately there is no guidance in the reference manual about this topic, so I kindly ask for support.
On a warm reset on AM335x, the processor should put the DDR into a self-refresh state (indicated by CKE low, while RESET remains high throughout). Is this not happening? Coming out of a warm reset, the initial bootloader should detect that the reset reason is a warm reset (in PRM_RSTST) and skip over the DDR initialization.
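A minimal sketch of the reset-reason check described above. The bit positions are assumptions taken from memory of the AM335x PRM_RSTST layout and should be verified against the TRM; the function name is hypothetical. It also assumes the status bits are sticky, so the bootloader has to clear them after reading, otherwise a cold-reset bit left over from power-up would mask a later warm reset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PRM_RSTST bit positions; verify against the AM335x TRM. */
#define RSTST_GLOBAL_COLD_RST     (1u << 0)
#define RSTST_GLOBAL_WARM_SW_RST  (1u << 1)
#define RSTST_WDT1_RST            (1u << 4)
#define RSTST_EXTERNAL_WARM_RST   (1u << 5)

/*
 * Hypothetical helper: returns true when the latched reset status shows
 * a warm reset source and no cold reset. The caller passes in the value
 * read from PRM_RSTST and is expected to clear the register afterwards.
 */
static bool is_warm_reset(uint32_t prm_rstst)
{
    const uint32_t warm_sources = RSTST_GLOBAL_WARM_SW_RST |
                                  RSTST_WDT1_RST |
                                  RSTST_EXTERNAL_WARM_RST;

    /* A cold reset always requires the full DDR initialization. */
    if (prm_rstst & RSTST_GLOBAL_COLD_RST)
        return false;

    return (prm_rstst & warm_sources) != 0;
}
```

The MLO could branch on this result to skip or run the DDR initialization path.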
Now, if you intend to initialize the DDR after every warm reset, the timing between CKE and RESET is incorrect, as you pointed out. You should be able to fix that by setting REF_CTRL.reg_refresh_rate to 0x3100 before the EMIF initialization is started. The value 0x3100 should produce the required 500us delay. After initialization, it can be set back to the optimal value for the refresh rate of the device.
Regards,
James
DDR is put into self-refresh state, but after warm reset RESET is pulled low for a short period of time during the VTP initialization. I don't understand why the chip does this, because from my understanding VTP has nothing to do with RESET.
The barebox bootloader for some reason only configures 0x2800 as reg_refresh_rate, which actually results in a slightly shorter delay than 500us, but still >400us. But this is only the case for the cold reset! Even with this configuration, the delay between the RESET and CKE rising edges is not ensured on warm reset.
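The delay figures quoted above can be reproduced with a small calculation. This is a sketch under two assumptions that are not stated in the thread: the EMIF waits 16 × reg_refresh_rate DDR clock cycles between the RESET and CKE rising edges during initialization, and the DDR clock runs at 400 MHz. Under those assumptions the results match the values mentioned here (about 500us for 0x3100, a little over 400us for 0x2800). The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Delay in nanoseconds between the DDR_RESET and DDR_CKE rising edges,
 * assuming the EMIF waits 16 * reg_refresh_rate DDR clock cycles.
 */
static uint64_t reset_to_cke_delay_ns(uint32_t reg_refresh_rate,
                                      uint32_t ddr_clk_hz)
{
    uint64_t cycles = 16ull * reg_refresh_rate;

    return cycles * 1000000000ull / ddr_clk_hz;
}

/*
 * Refresh interval in nanoseconds during normal operation, assuming
 * reg_refresh_rate is expressed directly in DDR clock cycles.
 */
static uint64_t refresh_interval_ns(uint32_t reg_refresh_rate,
                                    uint32_t ddr_clk_hz)
{
    return (uint64_t)reg_refresh_rate * 1000000000ull / ddr_clk_hz;
}
```

With a 400 MHz clock, `reset_to_cke_delay_ns(0x3100, 400000000)` gives 501760 ns and `reset_to_cke_delay_ns(0x2800, 400000000)` gives 409600 ns, while `refresh_interval_ns(0xC30, 400000000)` gives 7800 ns, the usual 7.8us DDR3 refresh interval.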
For me it looks like the EMIF is not intended to be reconfigured after warm reset. But from what I have observed with the debugger, after warm reset the EMIF clock is disabled and CKE is low. So at least the clock has to be enabled, CKE has to be brought high somehow and maybe other registers have to be written. The TRM does not explain how exactly the EMIF behaves after a soft reset. How does the EMIF know that the DDR has been put into self-refresh during reset and has to exit self-refresh? And in reaction to which register access is the self-refresh exit sequence performed? Can you please tell me exactly which registers have to be written after warm reset and in which order?
I also found that boards which fail during warm reset boot will also crash on cold boot when the auto self-refresh feature is enabled (PWR_MGMT_CTRL set to 0x2A0, as suggested by the EMIF configuration tool). On other boards, which do not have the warm reset issue, this configuration runs fine. We do not want to use this feature, so this register normally has its default value of 0.
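To make the 0x2A0 value easier to read, here is a small field decoder. The field layout is an assumption based on the EMIF4D register definitions as used in U-Boot's EMIF headers (reg_cs_tim in bits 3:0, reg_sr_tim in bits 7:4, reg_lp_mode in bits 10:8, reg_dpd_en in bit 11, reg_pd_tim in bits 15:12) and should be checked against the AM335x TRM. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded PWR_MGMT_CTRL fields (layout assumed, see the note above). */
struct pwr_mgmt_fields {
    uint32_t cs_tim;   /* bits 3:0   */
    uint32_t sr_tim;   /* bits 7:4   self-refresh timer */
    uint32_t lp_mode;  /* bits 10:8  low-power mode select */
    uint32_t dpd_en;   /* bit 11     */
    uint32_t pd_tim;   /* bits 15:12 */
};

/* Split a raw PWR_MGMT_CTRL value into its individual bit fields. */
static struct pwr_mgmt_fields decode_pwr_mgmt_ctrl(uint32_t value)
{
    struct pwr_mgmt_fields f;

    f.cs_tim  = (value >> 0)  & 0xFu;
    f.sr_tim  = (value >> 4)  & 0xFu;
    f.lp_mode = (value >> 8)  & 0x7u;
    f.dpd_en  = (value >> 11) & 0x1u;
    f.pd_tim  = (value >> 12) & 0xFu;

    return f;
}
```

Under this layout, 0x2A0 decodes to lp_mode = 2 and sr_tim = 0xA with all other fields zero, which would correspond to self-refresh mode selected with a non-zero self-refresh timer.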
To me it looks like all of this boils down to an issue with self-refresh exit on some processor/DDR combinations, which could e.g. result in a DDR command timing that is out of specification, but unfortunately I do not have the measurement equipment to prove that.
In chapter 7.3.3.12.2 of the TRM it is mentioned that after self-refresh exit the EMIF performs incremental leveling/training. As far as I know from the errata DDR leveling is not supported. Might this be an explanation for failure after self-refresh exit (also after warm reset init)?
The VTP macro also needs to be checked coming out of warm reset. If it is enabled and a warm reset occurred, the VTP initialization should be skipped. Re-initializing VTP will affect all signals including RESET. This is the reason why you are seeing RESET low for a short period of time.
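The check described above could look roughly like this. It is a sketch only: the position of the VTP enable bit (bit 6 of VTP_CTRL) is an assumption to verify against the TRM, and the helper name is hypothetical. The caller would pass in the reset reason and the value read from the VTP control register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the enable bit in VTP_CTRL; verify in the TRM. */
#define VTP_CTRL_ENABLE (1u << 6)

/*
 * Hypothetical helper: decides whether the VTP macro has to be
 * initialized. The init is skipped only when a warm reset occurred
 * and the macro is still enabled from the previous boot.
 */
static bool vtp_init_needed(bool warm_reset, uint32_t vtp_ctrl)
{
    if (warm_reset && (vtp_ctrl & VTP_CTRL_ENABLE))
        return false;

    return true;
}
```

In the barebox init sequence, the call to am33xx_config_vtp() would then be made conditional on this result.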
Once you fix that, I think the self-refresh initiated by warm reset should work. CKE should go low, indicating the memory is in self-refresh, and come back high, indicating normal operation. RESET should remain high throughout.
The EMIF clock most likely is also disabled briefly because of the VTP re-initialization.
The self-refresh entry and exit is triggered by hardware. Self refresh entry and exit sequences are taken care of by the EMIF (there are a couple of timing register bit fields which are associated with self-refresh entry/exit). You shouldn't have to write anything to the EMIF registers, their values should be maintained across a warm reset.
You should be able to run fine with PWR_MGMT_CTRL=0. This just means you won't go into self-refresh at any time during normal operation, only during warm reset.
The EMIF on AM335x does not support incremental leveling/training. If you believe you would need to retrain after a warm reset (e.g., if the warm reset duration is long and environmental factors such as temperature would change a lot during warm reset), then I would recommend going through a full initialization after every warm reset. But if the warm reset is brief, the self-refresh should work.
Try removing the VTP init from the warm reset code path, check the signaling to ensure reset stays high, and see if you get proper functionality. Let me know the results.
Regards,
James
Hi James, hope you had a good start into the new year.
I tried your suggestion of skipping VTP re-initialization after warm reset. I can confirm that the short pulse on DDR_RESET no longer occurs, but unfortunately the bootloader still crashes after warm reset.
Since almost all boards run fine, even with the short pulse on DDR_RESET, I think the initialization code is not the root cause.
Do you have any other idea?
My only possible explanation is that there is some kind of timing incompatibility in certain combinations of AM335x and the Micron DDR when leaving self-refresh. To confirm this, we would need to swap the processor or RAM on these boards.
What is the code path (with respect to the EMIF) upon bootup after warm reset? Are any registers modified in the EMIF during the boot after warm reset? There is some self-refresh exit timing that is configured in the EMIF, but this should be set correctly using the EMIF configuration tool. Was this used for your board? If so, can you send the spreadsheet and the Micron part number? I can check the settings.
Regards,
James
I tried both full EMIF register configuration (all registers written) and minimum required configuration (clock enable and ddr_cke_ctrl), the only difference was the missing reset pulse.
The Micron RAM we are using is MT41K256M16HA-125 IT or alternatively MT41K256M16TW-107 IT
Please find the register configuration tool (old version, since the project is 6 years old) in the attachment.
AM335x_DDR_register_calc_tool_SUB3.xls
These are the EMIF register values we are using:
DDR_PHY_CTRL_1: 0x100007
SDRAM_TIM_1: 0x0AAAD4DB
SDRAM_TIM_2: 0x266B7FDA
SDRAM_TIM_3: 0x501F867F
ZQ_CONFIG: 0x50074BE4
SDRAM_CONFIG: 0x61C05332
SDRAM_CONFIG2: 0x0
SDRAM_REF_CTRL: 0xC30
Can you explain more about "minimum required configuration (clock enable and ddr_cke_ctrl)". Which registers are you writing to?
I believe that the DDR should come out of self-refresh automatically after warm reset deasserts. You can check this by monitoring the CKE signal after warm reset goes high. You shouldn't have to write to any regs in the EMIF. This will ensure proper self-refresh exit sequence. Is this not happening?
Regards,
james
The init code of barebox is as follows:
    am33xx_enable_ddr_clocks();
    am33xx_config_vtp();
    am33xx_ddr_phydata_cmd_macro(cmd_ctrl);
    am33xx_config_ddr_data(ddr_data, 0);
    am33xx_config_ddr_data(ddr_data, 1);
    am33xx_config_io_ctrl(ioctrl);

    val = readl(AM33XX_DDR_IO_CTRL);
    val &= 0xefffffff;
    writel(val, AM33XX_DDR_IO_CTRL);

    val = readl(AM33XX_DDR_CKE_CTRL);
    val |= 0x00000001;
    writel(val, AM33XX_DDR_CKE_CTRL);

    am33xx_config_sdram(emif_regs);
If no EMIF registers are written after warm reset, CKE will remain low and no clock is seen. To get the DDR interface working at least the clock has to be enabled and the ddr_cke_ctrl bit has to be set (to disable CKE gating). Only when doing this CKE goes high and clock is enabled.
    am33xx_enable_ddr_clocks();

    val = readl(AM33XX_DDR_CKE_CTRL);
    val |= 0x00000001;
    writel(val, AM33XX_DDR_CKE_CTRL);
Have you been able to check my DDR settings? I assume they are fine, but I would nevertheless be very interested in an independent check.
Ok, this is fine for DDR_CKE_CTRL. This is actually a register in the control module, not the EMIF, and acts as an override for the CKE signal. Once you set it to 1, it disables the override, which is why you see the clock.
So I think you do need to set PWR_MGMT_CTRL to 0x2A0 before the warm reset. I think what is happening is that you have disabled self-refresh with PWR_MGMT_CTRL = 0, thus a self-refresh exit command is not happening.
So coming out of a warm reset, enable the EMIF clocks and set DDR_CKE_CTRL = 1. With PWR_MGMT_CTRL set to 0x2A0, a proper self-refresh exit command will occur when the EMIF is not idle.
If your warm resets are deterministic, you could set PWR_MGMT_CTRL back to 0 for normal operation, then set it to 0x2A0 before initiating the warm reset. If you don't know when warm resets will be issued, you need to keep PWR_MGMT_CTRL=0x2A0 so the proper self refresh entry/exit commands will be sent to the memory.
Regards,
James
With PWR_MGMT_CTRL=0x2A0, even cold boot now fails after a certain number of DDR accesses on boards which previously failed during warm reset. Something is wrong with the self-refresh entry or exit on these boards.
Do you have any other suggestions? Please let me know the results of checking my DDR settings, maybe some timing is at the limit.
I'm not sure what is going wrong with self-refresh entry/exit. I looked at your timings and they seem to be fine relative to the DDR datasheet.
Since you've had the product in production for some time, you may want to go back to what you had originally, which is reinitializing the DDR on every warm reset. This seems like it would give you the most stable operation, since cold reset hasn't given you any problems. I think originally you had some minor issues with the RESET and CKE timing, which were contributing to the failures on just a handful of boards. Once you make that timing more robust, the warm reset exit should become more robust as well.
So my suggestion would be to return to the original code and then
-keep PWR_MGMT_CTRL=0 all the time
-skip VTP re-initialization after warm reset
-ensure that the write to REF_CTRL.reg_refresh_rate = 0x3100 occurs as the first EMIF register write. Then write all the other registers as you would in a cold boot, and then ensure the last register write is to SDRAM_CONFIG register (this will kick off the hardware init sequence).
-the final write to the EMIF will be REF_CTRL.reg_refresh_rate = 0xC30, which will provide the proper refresh rate to the memory.
When observing CLK, RESET and CKE on the scope, after a warm reset, you should see CLK reenabled first, RESET should be low for at least 200us. Once RESET goes high, CKE should go high after at least a 500us delay. If this isn't happening, then the initialization after warm reset is not correct.
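The write ordering suggested in the list above can be sketched as follows. This only illustrates the order of the EMIF writes, using the register values posted earlier in this thread. The register IDs, the callback type and the function names are hypothetical, and a real implementation would also handle shadow registers, the PHY/IO configuration and any required waits, which are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Symbolic register IDs (hypothetical); a real port maps these to EMIF offsets. */
enum emif_reg {
    EMIF_SDRAM_CONFIG,
    EMIF_SDRAM_CONFIG2,
    EMIF_SDRAM_REF_CTRL,
    EMIF_SDRAM_TIM_1,
    EMIF_SDRAM_TIM_2,
    EMIF_SDRAM_TIM_3,
    EMIF_ZQ_CONFIG,
    EMIF_DDR_PHY_CTRL_1
};

/* Hypothetical register write callback supplied by the platform code. */
typedef void (*emif_write_fn)(enum emif_reg reg, uint32_t val);

/* Full EMIF init in the suggested order, for cold and warm reset alike. */
static void emif_full_init(emif_write_fn wr)
{
    /* First write: long refresh value so RESET -> CKE gets ~500us. */
    wr(EMIF_SDRAM_REF_CTRL, 0x3100);

    wr(EMIF_DDR_PHY_CTRL_1, 0x100007);
    wr(EMIF_SDRAM_TIM_1, 0x0AAAD4DB);
    wr(EMIF_SDRAM_TIM_2, 0x266B7FDA);
    wr(EMIF_SDRAM_TIM_3, 0x501F867F);
    wr(EMIF_ZQ_CONFIG, 0x50074BE4);
    wr(EMIF_SDRAM_CONFIG2, 0x0);

    /* Last configuration write: kicks off the hardware init sequence. */
    wr(EMIF_SDRAM_CONFIG, 0x61C05332);

    /* Final write: restore the normal refresh rate. */
    wr(EMIF_SDRAM_REF_CTRL, 0xC30);
}

/* Recording writer, useful for checking the order on a host machine. */
struct emif_write {
    enum emif_reg reg;
    uint32_t val;
};

static struct emif_write write_log[16];
static size_t write_count;

static void record_write(enum emif_reg reg, uint32_t val)
{
    if (write_count < sizeof(write_log) / sizeof(write_log[0])) {
        write_log[write_count].reg = reg;
        write_log[write_count].val = val;
        write_count++;
    }
}
```

Calling `emif_full_init(record_write)` on a host fills `write_log` with nine entries, which makes it easy to confirm that REF_CTRL is written first and last and that SDRAM_CONFIG is the last configuration write before the refresh rate is restored.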
Regards,
James