AM3358 Reset Issue

Other Parts Discussed in Thread: AM3358

Hello,

I have encountered an issue in which the AM3358 appears to reset continuously as it attempts to boot. With my current design, I have four boards that work properly and one that shows this problem, which leads me to believe that a faulty or incorrect component is installed on this particular board. Normally we boot to u-boot and flash the empty EEPROM from there, but on this board we cannot even reach u-boot. Instead, we replaced the EEPROM with one that had already been flashed. That removed the incorrect magic number and board ID error, but the following message now repeats continuously:

U-Boot SPL 2014.04-00015-gb4422bd (April 22 2014 - 13:24:29)

U-Boot SPL 2014.04-00015-gb4422bd (April 22 2014 - 13:24:29)

U-Boot SPL 2014.04-00015-gb4422bd (April 22 2014 - 13:24:29)

.... (constantly repeats)

I have probed the PMIC pins that interface with the ARM and everything here seems fine:

PMIC_PWR_EN = 1.816 V

Wakeup = 1.805 V

SYS_RESETn = 3.259 V

LDO_PGOOD = 1.805 V

PMIC_PGOOD = 1.8 V

The only thing that doesn't seem quite right is the PMIC_INTn, which is at 0.015 V, but it reads the same on the working boards, so this does not seem to be the cause. I'm wondering if this could be an issue with the DDR3 or a different piece of hardware, or if perhaps the issue lies with the ARM itself. Is there somewhere else that I should be probing to determine the cause?

  • Tony Kunkel said:
    PMIC_INTn, which is at 0.015 V

    It's normal for PMIC_INTn to be asserted (low) at boot. The pending irq is indicating the power-up reason.

    Tony Kunkel said:
    I'm wondering if this could be an issue with the DDR3

    Apparently the ROM bootloader is able to load the SPL, and the SPL is able to print a message to the serial console, so quite a few things must already be working. All of this runs from internal SRAM, however, so a problem with the DDR3 RAM is a plausible cause of the behaviour you're describing.

    I suggest verifying via JTAG whether RAM is working properly. If it is a problem with the RAM connection then simply "playing" with the RAM in the debugger's memory view (after EMIF initialization) will probably already reveal the problem.

     

    (P.S. If this is a custom board, don't forget to also determine correct leveling values for your board and configure those in u-boot to maximize reliability.)

  • To add to the link Matthijs has posted, here is the AM335X EMIF tuning link: http://processors.wiki.ti.com/index.php/AM335x_EMIF_Configuration_tips

    In addition to this, Matthijs has already given you the link for software leveling: http://processors.wiki.ti.com/index.php/AM335x_DDR_PHY_register_configuration_for_DDR3_using_Software_Leveling

    And here is a link to a step-by-step tutorial on how to do this: http://processors.wiki.ti.com/index.php/Sitara_Linux_Training:_Tuning_the_DDR3_Timings_on_BeagleBoneBlack

  • I had assumed that the issue was with the RAM, so I swapped that out for a different module. However, the exact same output appears. Could you please explain how to verify via JTAG whether or not RAM is working? I have performed some troubleshooting via JTAG and included that here, but from what I can tell, there is an issue with the initial boot process, which could point toward MMC0 with the SD card.

    WITHOUT PUSH BUTTON PRESSED: NAND FLASH BOOT, THEN SD CARD, OUTPUT SHOWN BELOW

    CONTROL: device_id = 0x2b94402e
      * AM335x family
      * Silicon Revision 2.1

    PRM_DEVICE: PRM_RSTST = 0x00000001
      * Bit 0 : GLOBAL_COLD_RST

    CONTROL: control_status = 0x0040033c
      * SYSBOOT[15:14] = 01b (24 MHz)
      * SYSBOOT[11:10] = 00b No GPMC CS0 addr/data muxing
      * Device Type = General Purpose (GP)
      * SYSBOOT[7:6] = 00b MII (EMAC boot modes only)
      * SYSBOOT[5] = 1 CLKOUT1 enabled
      * Boot Sequence : MMC1 -> MMC0 -> UART0 -> USB0

    ROM: Current tracing vector, word 1 = 0x0000907f
      * Bit 0  : [General] Passed the public reset vector
      * Bit 1  : [General] Entered main function
      * Bit 2  : [General] Running after the cold reset
      * Bit 3  : [Boot] Main booting routine entered
      * Bit 4  : [Memory Boot] Memory booting started
      * Bit 5  : [Peripheral Boot] Peripheral booting started
      * Bit 6  : [Boot] Booting loop reached last device
      * Bit 12 : [Peripheral Boot] Device initialized
      * Bit 15 : [Peripheral Boot] Peripheral booting failed

    ROM: Current tracing vector, word 1 = 0x0000f000
      * Bit 12 : [Memory Boot] Memory booting trial 0
      * Bit 13 : [Memory Boot] Memory booting trial 1
      * Bit 14 : [Memory Boot] Memory booting trial 2
      * Bit 15 : [Memory Boot] Memory booting trial 3

    ROM: Current tracing vector, word 1 = 0x00111000
      * Bit 12 : Memory booting device SPI
      * Bit 16 : Peripheral booting device UART0
      * Bit 20 : [Peripheral Boot] Peripheral booting device USB

    ROM: Current copy of PRM_RSTST = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000001
      * Bit 0  : [Memory Boot] Memory booting device NULL

    Cortex A8 Program Counter = 0x000233be

    ROM Exception Vectors
      * 0x4030CE04 Undefined
      * 0x4030CE08 SWI
      * 0x4030CE0C Pre-fetch abort
      * 0x4030CE10 Data abort
      * 0x4030CE14 Unused
      * 0x4030CE18 IRQ
      * 0x4030CE1C FIQ

    ROM Dead Loops
      * 0x00020080 Undefined exception default handler
      * 0x00020084 SWI exception default handler
      * 0x00020088 Pre-fetch abort exception default handler
      * 0x0002008C Data exception default handler
      * 0x00020090 Unused exception default handler
      * 0x00020094 IRQ exception default handler
      * 0x00020098 FIQ exception default handler
      * 0x0002009C Validation test PASS
      * 0x000200A0 Validation test FAIL
      * 0x000200A4 Reserved
      * 0x000200A8 Image not executed or returned
      * 0x000200AC Reserved
      * 0x000200B0 Reserved
      * 0x000200B4 Reserved
      * 0x000200B8 Reserved
      * 0x000200BC Reserved

    For this output, the booting sequence is correct given the hardware configuration. However, I don't understand the tracing vector stating that bit 12: memory booting device SPI, before peripheral booting devices UART0 and USB. It seems to me that this should be MMC0 instead of SPI, but I'm sure I'm just missing something.
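For anyone cross-checking the dump by hand, the bit lists under each tracing vector word can be reproduced from the raw values. The following is only a minimal sketch: `set_bits` is a hypothetical helper name, and the meaning of each bit still has to be looked up in the tracing vector tables of the TRM.

```python
def set_bits(word):
    """Return the positions of all bits that are set in a 32-bit word."""
    return [bit for bit in range(32) if word & (1 << bit)]


# Raw tracing vector values copied from the dump above.
for value in (0x0000907F, 0x0000F000, 0x00111000):
    print(f"0x{value:08x} -> bits {set_bits(value)}")
```

The bit positions it prints match the ones listed by the script, so the open question is only how the documentation labels bit 12 of the last word.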

    WITH PUSH BUTTON PRESSED FOR BYPASSING FLASH AND BOOTING FROM SD CARD, OUTPUT SHOWN BELOW

    CONTROL: device_id = 0x2b94402e
      * AM335x family
      * Silicon Revision 2.1

    PRM_DEVICE: PRM_RSTST = 0x00000001
      * Bit 0 : GLOBAL_COLD_RST

    CONTROL: control_status = 0x00400338
      * SYSBOOT[15:14] = 01b (24 MHz)
      * SYSBOOT[11:10] = 00b No GPMC CS0 addr/data muxing
      * Device Type = General Purpose (GP)
      * SYSBOOT[7:6] = 00b MII (EMAC boot modes only)
      * SYSBOOT[5] = 1 CLKOUT1 enabled
      * Boot Sequence : SPI0 -> MMC0 -> USB0 -> UART0

    ROM: Current tracing vector, word 1 = 0x0010009f
      * Bit 0  : [General] Passed the public reset vector
      * Bit 1  : [General] Entered main function
      * Bit 2  : [General] Running after the cold reset
      * Bit 3  : [Boot] Main booting routine entered
      * Bit 4  : [Memory Boot] Memory booting started
      * Bit 7  : [Boot] GP header found
      * Bit 20 : [Configuration Header] CHSETTINGS found

    ROM: Current tracing vector, word 1 = 0x0001d000
      * Bit 12 : [Memory Boot] Memory booting trial 0
      * Bit 14 : [Memory Boot] Memory booting trial 2
      * Bit 15 : [Memory Boot] Memory booting trial 3
      * Bit 16 : [Memory Boot] Execute GP image

    ROM: Current tracing vector, word 1 = 0x00001000
      * Bit 12 : Memory booting device SPI

    ROM: Current copy of PRM_RSTST = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000001
      * Bit 0  : [Memory Boot] Memory booting device NULL

    Cortex A8 Program Counter = 0x402f0440

    ROM Exception Vectors
      * 0x4030CE04 Undefined
      * 0x4030CE08 SWI
      * 0x4030CE0C Pre-fetch abort
      * 0x4030CE10 Data abort
      * 0x4030CE14 Unused
      * 0x4030CE18 IRQ
      * 0x4030CE1C FIQ

    ROM Dead Loops
      * 0x00020080 Undefined exception default handler
      * 0x00020084 SWI exception default handler
      * 0x00020088 Pre-fetch abort exception default handler
      * 0x0002008C Data exception default handler
      * 0x00020090 Unused exception default handler
      * 0x00020094 IRQ exception default handler
      * 0x00020098 FIQ exception default handler
      * 0x0002009C Validation test PASS
      * 0x000200A0 Validation test FAIL
      * 0x000200A4 Reserved
      * 0x000200A8 Image not executed or returned
      * 0x000200AC Reserved
      * 0x000200B0 Reserved
      * 0x000200B4 Reserved
      * 0x000200B8 Reserved
      * 0x000200BC Reserved

    For this output, the boot sequence updates to the expected sequence and the tracing vectors change. The confusing part of this message is in tracing vector 2, where it skips booting trial 1, which would be the SD card on MMC0 and goes to trial 2. I'd appreciate any help I can get on making some sense of these outputs. Thanks.
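The change in boot sequence between the two dumps can be traced back to the control_status value itself. The sketch below assumes a field layout (SYSBOOT[15:14] in bits 23:22, device type in bits 10:8, SYSBOOT[7:0] in bits 7:0) that agrees with both dumps but should be checked against the AM335x TRM. The function name is hypothetical, and the lookup table holds only the two boot sequences that appear in this thread.

```python
# Assumed CONTROL_STATUS layout; verify against the AM335x TRM before relying on it.
# Only the two SYSBOOT[4:0] codes seen in the dumps above are listed.
BOOT_SEQUENCES = {
    0b11100: ["MMC1", "MMC0", "UART0", "USB0"],
    0b11000: ["SPI0", "MMC0", "USB0", "UART0"],
}


def decode_control_status(value):
    """Split a control_status value into the fields shown in the dump."""
    sysboot_low = value & 0xFF
    boot_mode = sysboot_low & 0x1F
    return {
        "sysboot_15_14": (value >> 22) & 0x3,
        "device_type": (value >> 8) & 0x7,
        "sysboot_7_6": (sysboot_low >> 6) & 0x3,
        "sysboot_5": (sysboot_low >> 5) & 0x1,
        "boot_mode": boot_mode,
        "boot_sequence": BOOT_SEQUENCES.get(boot_mode, "not listed in this sketch"),
    }


without_button = decode_control_status(0x0040033C)
with_button = decode_control_status(0x00400338)
print(without_button["boot_sequence"])
print(with_button["boot_sequence"])
# The two register values differ in a single bit.
print(hex(0x0040033C ^ 0x00400338))
```

Under that assumed layout, the two values differ only in bit 2, so the push button appears to toggle SYSBOOT[2], which is what switches the first boot device from MMC1 to SPI0.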

  • Tony Kunkel said:
    For this output, the booting sequence is correct given the hardware configuration. However, I don't understand the tracing vector stating that bit 12: memory booting device SPI, before peripheral booting devices UART0 and USB. It seems to me that this should be MMC0 instead of SPI, but I'm sure I'm just missing something.

    Most likely the docs are wrong. Almost nobody consults those tracing vectors so it's easy for mistakes to go unnoticed.

    You showed earlier that SPL sent output to UART0; this means that ROM had successfully loaded SPL and transferred control to it. I would therefore not expect to find any enlightenment in examining the ROM tracing vectors.

    I would suggest loading the symbols for SPL into CCS (if I remember correctly, the menu item is Debug -> Program -> Load symbols...; then open the u-boot-spl file produced in the spl subdirectory when building u-boot) and setting a breakpoint at, for example, spl_relocate_stack_gd. That point is after early board initialization (including EMIF) and right before SPL accesses external RAM for the first time (to relocate itself there).

    Open a memory view at 0x80000000, write some data there, zero-fill a chunk of memory elsewhere (say 64 KB at 0x80010000), and check that the data at 0x80000000 is still intact.

    If you never even reach spl_relocate_stack_gd, set an earlier breakpoint, if necessary all the way at __start (0x402f0400), step through SPL and see where it goes wrong.

    Or, open the "ARM advanced debug features" thing in CCS and enable catching of data aborts, prefetch aborts, and undefined instructions, since it seems likely one of those will occur if something goes wrong in SPL.

    NOTE: when setting breakpoints, be sure to use hardware breakpoints, not software breakpoints.
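To illustrate why the write, zero-fill and re-check sequence is informative, here is a small host-side simulation. It does not talk to the target or to CCS; `FaultyRam` and `pattern_survives` are hypothetical names, and the stuck-low address bit is an invented fault used purely for demonstration.

```python
class FaultyRam:
    """Toy RAM model in which the given address bits are stuck low."""

    def __init__(self, stuck_low_bits=()):
        self.cells = {}
        self.mask = 0
        for bit in stuck_low_bits:
            self.mask |= 1 << bit

    def _physical(self, addr):
        # A stuck-low address line makes two addresses select the same cell.
        return addr & ~self.mask

    def write(self, addr, value):
        self.cells[self._physical(addr)] = value

    def read(self, addr):
        return self.cells.get(self._physical(addr), 0)


def pattern_survives(ram, base=0x0, fill_offset=0x10000, fill_len=0x10000):
    """Write a pattern at base, zero-fill another region, re-read the pattern."""
    pattern = [0xDE, 0xAD, 0xBE, 0xEF]
    for i, byte in enumerate(pattern):
        ram.write(base + i, byte)
    start = base + fill_offset
    for addr in range(start, start + fill_len):
        ram.write(addr, 0)
    return [ram.read(base + i) for i in range(len(pattern))] == pattern


print(pattern_survives(FaultyRam()))                     # healthy model
print(pattern_survives(FaultyRam(stuck_low_bits=[16])))  # address bit 16 stuck low
```

In this model the pattern is overwritten only when the fill region aliases onto the base address, so one fill offset exercises one group of address lines. Repeating the check with different offsets would be needed to cover the others.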

  • BTW, the relevant u-boot source files for understanding the program flow during early initialization are:

    arch/arm/cpu/armv7/start.S

    contains reset vector, calls lowlevel_init and then _main

    arch/arm/cpu/armv7/lowlevel_init.S

    contains lowlevel_init, calls s_init

    arch/arm/cpu/armv7/am33xx/board.c

    contains s_init and board_init_f

    arch/arm/lib/crt0.S

    contains _main, calls board_init_f (which calls sdram_init) and then spl_relocate_stack_gd

  • Hello again,

    Following successful testing with our initial board design, we made a few modifications and spun another PCB. Very little was changed by way of DDR3, with mostly minor tuning adjustments due to differences in trace length measurements from Altium 09 to Altium 15. After attempting to bring this board up, we are encountering issues once again getting the system to boot. Initially, we get the following error:

    U-Boot SPL 2015.07-rc3-00001-g2c9c20a (Jun 30 2015 - 09:02:07)

    ** Partition 1 not valid on device 0 **

    spl_register_fat_device: fat register err - -1

    spl_load_image_fat: error reading image u-boot.img, err - -1

    spl: no partition table found

    spl: mmc: no boot mode left to try

    ### ERROR ### Please RESET the board ###

    Following this, I followed the advice listed here and modified the .gel file for software leveling the DDR3. Having done this, we were able to run the AM335x System Initialization as well as converge upon optimal values for the DDR PHY slave ratio register values. After opening up the memory window in DDR space, our problem becomes clear. In trying to bulk write 0x1234 across a multitude of memory locations, we arrive at the following output:

    0x80000000  0x0000

    0x80000002  0x1234

    0x80000004  0x1234

    0x80000006  0x1234

    0x80000008  0x1234

    0x8000000A  0x1234

    0x8000000C  0x1234

    0x8000000E  0x1234

    0x80000010  0x0000 (the pattern repeats: every address whose least significant nibble is 0 reads back either 0x0000 or 0x5555)

    This is confirmed using the DDR_DataTransferCheck script, which produces the following results:

    CortxA8: GEL output: No of failed locations are :: 0x00000001

    CortxA8: GEL output: No of failed locations are :: 0x00000001

    CortxA8: GEL output: No of failed locations are :: 0x00000001

    CortxA8: GEL output: No of failed locations are :: 0x00000001

    CortxA8: GEL output: No of failed locations are :: 0x00000002 (increments every 4th attempt)

    ...

    CortxA8: GEL output: No of failed locations are :: 0x0000003F (final count)

    From what I can gather, it appears that we are having an issue with address bit 0 or address bit 1 on the DDR3. Am I correct in assuming this? If so, is there any way to narrow down whether this is a layout issue (very little changed from previous design) or if it is a component or workmanship issue? Would the values obtained from the DDR3_slave_ratio_search_auto.out file converge if there was a layout issue or is this tool only concerned with the clock and dqs lines?

    Thanks,

    Tony

  • A problem with one or more address lines would more likely cause at least half the memory locations to misbehave.

    Instead of filling memory, try writing some individual bytes and then refresh the memory view to see what reads back.

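    DDR3 transfers data in bursts of eight (BL8). On the AM335x's 16-bit data bus one burst covers 16 bytes, which matches the spacing of your failures, so you might be seeing the first word of every burst corrupted (either on read or on write). If that is the case, it probably indicates a serious issue with the timing parameters.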
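The burst argument can be checked against the addresses reported in the memory view. This is a rough sketch that assumes a 16-bit data bus, a burst length of eight, and a simple linear mapping from byte address to position within a burst; the real controller mapping may differ, and `beat_in_burst` is a hypothetical helper.

```python
BUS_WIDTH_BYTES = 2  # assumption: 16-bit DDR3 data bus
BURST_LENGTH = 8     # assumption: DDR3 burst length of eight (BL8)


def beat_in_burst(addr):
    """Position (0..7) of a byte address within its burst, assuming linear mapping."""
    return (addr // BUS_WIDTH_BYTES) % BURST_LENGTH


# Addresses taken from the memory view in the post above.
failing = [0x80000000, 0x80000010]
passing = [0x80000002, 0x80000004, 0x80000006, 0x80000008,
           0x8000000A, 0x8000000C, 0x8000000E]

print([beat_in_burst(a) for a in failing])
print([beat_in_burst(a) for a in passing])
```

Under these assumptions every failing address falls on the first position of a burst and every passing address on one of the other seven. That points to burst timing rather than to a broken address line, which would be expected to affect about half of all locations.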

  • The issue we were previously experiencing with the DDR test errors has been fixed by using a .gel file with the updated DDR3 configuration values. We have run the DDR test script and the EDMA test script, and have written data in the memory view to confirm proper operation. Therefore, it does not seem that our issue is with the DDR3. However, we are still experiencing the same board reset error as previously posted.


    In our troubleshooting, we have found that u-boot does not seem to reach the location in board.c where the DDR3 memory is configured. Would there be any way to move this initialization earlier in the boot process, or would this be inadvisable? Through JTAG troubleshooting, we have found that the boot process stalls in the spl_start_uboot() section of board.c and seems to have an issue with #ifdef CONFIG_SPL_ENV_SUPPORT, which means that env_init() and env_relocate_spec() are never called. We have traced the error message we are receiving to hang.c and have seen messages indicating that there are no more boot devices to attempt. This occurs at address 0x402F18B2 in the spl_load_image() function, which is after the breakpoint set at spl_relocate_stack_gd.

    We have tried booting multiple Linux images (Debian and Angstrom) from an SD card, all of which worked with our previous design, and have tried booting one image directly over JTAG, which did not yield any better results.

    Also, since we are using Ubuntu Linux, we have been following this guide for obtaining and building the U-Boot source:

    https://eewiki.net/display/linuxonarm/BeagleBone+Black

    I just wanted to make sure that the U-Boot target is correct:

    make ARCH=arm CROSS_COMPILE=${CC} am335x_evm_defconfig

    Given this information, could you offer any additional advice on potential sources of this problem, as we have exhausted most of our hunches?

  • Update: I have been able to track down the issue. It turned out that a resistor of an incorrect value (22 ohms instead of 10k) was installed on the DAT1 line of the SD card. This explains why it worked to a certain point (while it was operating in single bit mode), but failed once it switched to 4-bit mode to try to load the OS. I appreciate all of your assistance on this matter.