AM3358 uSD not Booting

Hello,

We are trying to bring up a new custom board based on the BeagleBone Black. I have a functional TI Arago-based uSD card that works on the BBB as well as on a previous version of our board.

However, for some reason the uSD does not boot on our new system. We see "CCCCCCCCC" over UART0 at boot without a uSD card installed since the eMMC flash is currently erased. If we turn on the system with a uSD card installed, we get no UART output at all. We can tell the ARM is still alive as it is drawing a normal amount of current for operation.

We put the old board and the new board next to each other, each with a uSD card installed, then powered both up simultaneously with oscilloscope probes on the same lines so we could see differences between the MMC lines. What we see is that the "Command" signal on the old board is always logic high, while on the new board it looks like a data bus and is totally different.


Also, we noticed that the clock on the old board runs longer than on the new one; the new one seems to quit after about 300 to 400 ms.

We have verified the boot strap resistors are installed properly. Also all the MMC signal lines appear to be connected to the appropriate pads on the ARM.

We are at a total loss as to what's going on!

Are there any suggestions you might make to troubleshoot this issue?

  • Hi,

    What are your SYSBOOT settings? Can you post the schematic fragments with the MMC0 connections?
  • We have hit the same obstacle this week, and even though I have no solution yet, I will share what I got so far:

    In our case when a known working Debian image is installed, we get no console output. Without the SD card we get the CCCC. We looked at the SD pins with a logic analyzer and were able to see that the ROM is apparently downloading the MLO file (137 sectors in our case, which exactly matches the file size). This operation takes about 300ms. After that nothing happens.

    The startup hints page in this wiki seems to suggest that when the 2nd stage bootloader is started, we should see something on the console (uart0), and only then we should go further and troubleshoot the DDR. I believe this is not the case. Today I loaded the "starterware" image (this is what you can use to program the ID eeprom), and it would actually give me the following:

    StarterWare AM335x Boot Loader

    Copying application image from MMC/SD card to RAM
    Jumping to StarterWare Application...

    That suggests that the 2nd stage bootloader actually loaded the application but failed to go further, so maybe DDR *is* the issue in my case?!

    Not sure if this helps you any, but in your case at this point I would try to see if the MLO file is loaded and run...

  • It is best to connect with JTAG to see what is happening. Is it stuck polling somewhere, hit an exception, etc?
  • I agree that SPL is probably loading with the TI images and that we cannot normally see its output unless a special image is used. I tried a test with an old Robert C. Nelson based image, which provided the following output over UART0:

    RCN Uboot SPL display on avocado board.txt
    U-Boot SPL 2015.01-00001-gb2412df (Jan 29 2015 - 15:01:06)
    Incorrect magic number (0xffffffff) in EEPROM
    Could not get board ID.
    Incorrect magic number (0xffffffff) in EEPROM
    Could not get board ID.
    Unknown board, cannot configure pinmux.### ERROR ### Please RESET the board ###
    

    So, although I have this output with the RCN kernel, I am unable to get into the UBOOT console to modify the I2C EEPROM values which I suspect is what's holding up the works when checking for a Beaglebone or other board....

  • Additionally, I did two JTAG captures with the TI kernel uSD (which still works on a different system).
    I do see ROM Exception Vectors and ROM Dead Loops, although I do not know how to interpret or use them...

    ****JTAG Capture 1, no uSD installed****
    Board Default Configuration (uSD not installed, emmc not flashed) 9-21-16

    CONTROL: device_id = 0x2b94402e
    * AM335x family
    * Silicon Revision 2.1

    PRM_DEVICE: PRM_RSTST = 0x00000001
    * Bit 0 : GLOBAL_COLD_RST

    CONTROL: control_status = 0x0040033c
    * SYSBOOT[15:14] = 01b (24 MHz)
    * SYSBOOT[11:10] = 00b No GPMC CS0 addr/data muxing
    * Device Type = General Purpose (GP)
    * SYSBOOT[7:6] = 00b MII (EMAC boot modes only)
    * SYSBOOT[5] = 1 CLKOUT1 enabled
    * Boot Sequence : MMC1 -> MMC0 -> UART0 -> USB0

    ROM: Current tracing vector, word 1 = 0x0000907f
    * Bit 0 : [General] Passed the public reset vector
    * Bit 1 : [General] Entered main function
    * Bit 2 : [General] Running after the cold reset
    * Bit 3 : [Boot] Main booting routine entered
    * Bit 4 : [Memory Boot] Memory booting started
    * Bit 5 : [Peripheral Boot] Peripheral booting started
    * Bit 6 : [Boot] Booting loop reached last device
    * Bit 12 : [Peripheral Boot] Device initialized
    * Bit 15 : [Peripheral Boot] Peripheral booting failed

    ROM: Current tracing vector, word 1 = 0x0000f000
    * Bit 12 : [Memory Boot] Memory booting trial 0
    * Bit 13 : [Memory Boot] Memory booting trial 1
    * Bit 14 : [Memory Boot] Memory booting trial 2
    * Bit 15 : [Memory Boot] Memory booting trial 3

    ROM: Current tracing vector, word 1 = 0x00111000
    * Bit 12 : Memory booting device SPI
    * Bit 16 : Peripheral booting device UART0
    * Bit 20 : [Peripheral Boot] Peripheral booting device USB

    ROM: Current copy of PRM_RSTST = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000001
    * Bit 0 : [Memory Boot] Memory booting device NULL

    Cortex A8 Program Counter = 0x000233c0

    ROM Exception Vectors
    * 0x4030CE04 Undefined
    * 0x4030CE08 SWI
    * 0x4030CE0C Pre-fetch abort
    * 0x4030CE10 Data abort
    * 0x4030CE14 Unused
    * 0x4030CE18 IRQ
    * 0x4030CE1C FIQ

    ROM Dead Loops
    * 0x00020080 Undefined exception default handler
    * 0x00020084 SWI exception default handler
    * 0x00020088 Pre-fetch abort exception default handler
    * 0x0002008C Data exception default handler
    * 0x00020090 Unused exception default handler
    * 0x00020094 IRQ exception default handler
    * 0x00020098 FIQ exception default handler
    * 0x0002009C Validation test PASS
    * 0x000200A0 Validation test FAIL
    * 0x000200A4 Reserved
    * 0x000200A8 Image not executed or returned
    * 0x000200AC Reserved
    * 0x000200B0 Reserved
    * 0x000200B4 Reserved
    * 0x000200B8 Reserved
    * 0x000200BC Reserved



    ****JTAG Capture 2, uSD installed****
    CONTROL: device_id = 0x2b94402e
    * AM335x family
    * Silicon Revision 2.1

    PRM_DEVICE: PRM_RSTST = 0x00000001
    * Bit 0 : GLOBAL_COLD_RST

    CONTROL: control_status = 0x0040033c
    * SYSBOOT[15:14] = 01b (24 MHz)
    * SYSBOOT[11:10] = 00b No GPMC CS0 addr/data muxing
    * Device Type = General Purpose (GP)
    * SYSBOOT[7:6] = 00b MII (EMAC boot modes only)
    * SYSBOOT[5] = 1 CLKOUT1 enabled
    * Boot Sequence : MMC1 -> MMC0 -> UART0 -> USB0

    ROM: Current tracing vector, word 1 = 0x0010009f
    * Bit 0 : [General] Passed the public reset vector
    * Bit 1 : [General] Entered main function
    * Bit 2 : [General] Running after the cold reset
    * Bit 3 : [Boot] Main booting routine entered
    * Bit 4 : [Memory Boot] Memory booting started
    * Bit 7 : [Boot] GP header found
    * Bit 20 : [Configuration Header] CHSETTINGS found

    ROM: Current tracing vector, word 1 = 0x0001f000
    * Bit 12 : [Memory Boot] Memory booting trial 0
    * Bit 13 : [Memory Boot] Memory booting trial 1
    * Bit 14 : [Memory Boot] Memory booting trial 2
    * Bit 15 : [Memory Boot] Memory booting trial 3
    * Bit 16 : [Memory Boot] Execute GP image

    ROM: Current tracing vector, word 1 = 0x00001000
    * Bit 12 : Memory booting device SPI

    ROM: Current copy of PRM_RSTST = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000001
    * Bit 0 : [Memory Boot] Memory booting device NULL

    Cortex A8 Program Counter = 0x402f7448

    ROM Exception Vectors
    * 0x4030CE04 Undefined
    * 0x4030CE08 SWI
    * 0x4030CE0C Pre-fetch abort
    * 0x4030CE10 Data abort
    * 0x4030CE14 Unused
    * 0x4030CE18 IRQ
    * 0x4030CE1C FIQ

    ROM Dead Loops
    * 0x00020080 Undefined exception default handler
    * 0x00020084 SWI exception default handler
    * 0x00020088 Pre-fetch abort exception default handler
    * 0x0002008C Data exception default handler
    * 0x00020090 Unused exception default handler
    * 0x00020094 IRQ exception default handler
    * 0x00020098 FIQ exception default handler
    * 0x0002009C Validation test PASS
    * 0x000200A0 Validation test FAIL
    * 0x000200A4 Reserved
    * 0x000200A8 Image not executed or returned
    * 0x000200AC Reserved
    * 0x000200B0 Reserved
    * 0x000200B4 Reserved
    * 0x000200B8 Reserved
    * 0x000200BC Reserved
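For anyone who wants to cross-check the SYSBOOT lines in these captures by hand, here is a minimal sketch that pulls the same fields out of the raw control_status value. The bit positions are my reading of the AM335x CONTROL_STATUS layout (an assumption to verify against the TRM), and the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of CONTROL_STATUS as reported in the captures above.
 * Bit positions are an assumption based on the AM335x TRM. */
struct control_status_fields {
    unsigned xtal_sel;   /* SYSBOOT[15:14], register bits 23:22 */
    unsigned admux;      /* SYSBOOT[11:10], register bits 19:18 */
    unsigned devtype;    /* register bits 10:8, 3 = General Purpose */
    unsigned clkout1_en; /* SYSBOOT[5], register bit 5 */
    unsigned boot_sel;   /* SYSBOOT[4:0], selects the boot sequence */
};

struct control_status_fields decode_control_status(uint32_t reg)
{
    struct control_status_fields f;

    f.xtal_sel   = (reg >> 22) & 0x3;
    f.admux      = (reg >> 18) & 0x3;
    f.devtype    = (reg >> 8) & 0x7;
    f.clkout1_en = (reg >> 5) & 0x1;
    f.boot_sel   = reg & 0x1f;
    return f;
}

/* Crystal frequency in kHz for each SYSBOOT[15:14] encoding. */
unsigned xtal_khz(unsigned xtal_sel)
{
    static const unsigned table[4] = { 19200, 24000, 25000, 26000 };

    return table[xtal_sel & 0x3];
}
```

Feeding in 0x0040033c gives xtal_sel = 1 (24 MHz), devtype = 3 (GP), clkout1_en = 1 and boot_sel = 0x1c, which matches the decoded lines in both captures.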
  • Mark Miliano said:
    * SYSBOOT[15:14] = 01b (24 MHz)

    Can you confirm that your hardware is using a 24 MHz clock?  If for example you switched to 25 MHz or 26 MHz, you would need to correspondingly update these boot pins.

    Mark Miliano said:
    ****JTAG Capture 2, uSD installed****
    CONTROL: device_id = 0x2b94402e
    * AM335x family
    * Silicon Revision 2.1

    PRM_DEVICE: PRM_RSTST = 0x00000001
    * Bit 0 : GLOBAL_COLD_RST

    CONTROL: control_status = 0x0040033c
    * SYSBOOT[15:14] = 01b (24 MHz)
    * SYSBOOT[11:10] = 00b No GPMC CS0 addr/data muxing
    * Device Type = General Purpose (GP)
    * SYSBOOT[7:6] = 00b MII (EMAC boot modes only)
    * SYSBOOT[5] = 1 CLKOUT1 enabled
    * Boot Sequence : MMC1 -> MMC0 -> UART0 -> USB0

    ROM: Current tracing vector, word 1 = 0x0010009f
    * Bit 0 : [General] Passed the public reset vector
    * Bit 1 : [General] Entered main function
    * Bit 2 : [General] Running after the cold reset
    * Bit 3 : [Boot] Main booting routine entered
    * Bit 4 : [Memory Boot] Memory booting started
    * Bit 7 : [Boot] GP header found
    * Bit 20 : [Configuration Header] CHSETTINGS found

    ROM: Current tracing vector, word 1 = 0x0001f000
    * Bit 12 : [Memory Boot] Memory booting trial 0
    * Bit 13 : [Memory Boot] Memory booting trial 1
    * Bit 14 : [Memory Boot] Memory booting trial 2
    * Bit 15 : [Memory Boot] Memory booting trial 3
    * Bit 16 : [Memory Boot] Execute GP image

    ROM: Current tracing vector, word 1 = 0x00001000
    * Bit 12 : Memory booting device SPI

    ROM: Current copy of PRM_RSTST = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000000

    ROM: Cold reset tracing vector, word 1 = 0x00000001
    * Bit 0 : [Memory Boot] Memory booting device NULL

    Cortex A8 Program Counter = 0x402f7448

    This looks like a normal/proper boot sequence for SD card.  The program counter shows that you are executing the downloaded image.  So the issue is with the MLO itself.  While you are in that state, you should connect to the Cortex A8 (no gel files!!!) and go to Run -> Load -> Load Symbols.  You can then load symbols from your u-boot-spl file (i.e. that's the version of MLO that would have ELF debug info for CCS).  You should then be able to see precisely what function you're in and what's happening.  Alternatively, you might simply try correlating that address with the map file from SPL.  You would have much better visibility (e.g. call stack info) using CCS though.

    Does your board have an EEPROM on it like some of the TI boards?  It's been a while since I looked at that part of the code, but I'm wondering if it's getting stuck somewhere early such as trying to decide whether it is an "EVM" or "BBB" or "SK".
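As a rough illustration of how the program counter tells you where the A8 is, here is a small sketch that sorts a PC value into "ROM dead loop", "ROM code" or "downloaded image". The dead-loop range comes straight from the capture; the public ROM range and the on-chip RAM range used for the downloaded image are assumptions taken from the AM335x TRM memory map and should be checked against your TRM revision. The enum and function names are hypothetical.

```c
#include <stdint.h>

enum pc_region {
    PC_ROM_DEAD_LOOP,    /* 0x00020080..0x000200BC, per the capture */
    PC_ROM_CODE,         /* elsewhere in the public boot ROM (assumed range) */
    PC_DOWNLOADED_IMAGE, /* on-chip RAM holding MLO (assumed range) */
    PC_OTHER
};

enum pc_region classify_pc(uint32_t pc)
{
    if (pc >= 0x00020080u && pc <= 0x000200BCu)
        return PC_ROM_DEAD_LOOP;
    if (pc >= 0x00020000u && pc <= 0x0002BFFFu)
        return PC_ROM_CODE;
    if (pc >= 0x402F0400u && pc < 0x4030B800u)
        return PC_DOWNLOADED_IMAGE;
    return PC_OTHER;
}
```

With these ranges, the PC from capture 1 (0x000233c0) lands in ROM code, while the PC from capture 2 (0x402f7448) lands in the downloaded image, which is why the second capture points at the MLO rather than at the ROM.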

  • I think my issue is exactly the same as Mark's. In my JTAG capture the dump looks almost identical. When I use an older image (debian 7.5 from Nelson), I also get this output:

    U-Boot SPL 2014.04-00014-g47880f5 (Apr 22 2014 - 13:23:54)

    Incorrect magic number (0xffffffff) in EEPROM
    Could not get board ID.
    Incorrect magic number (0xffffffff) in EEPROM
    Could not get board ID.
    Unknown board, cannot configure pinmux.### ERROR ### Please RESET the board ###

    The PC is at 0x402f9d4c in my case, so also in the SPL file. I do have an ID EEPROM on my board, and it's wired to be on the same address as on the EVM and the BeagleBone. The StarterWare image is supposed to be able to program that EEPROM, but when I load it, my PC gets stuck in the 0x0002008C dead loop, indicating a "Data exception".

    (not trying to hijack the thread, but I think we might have run into the same problem)

  • Take a look at <u-boot>/board/ti/am335x/board.c. I recommend adding some instrumentation into the am33xx_spl_board_init() function. I imagine you're getting stuck there. Or try as I suggested in the previous post to connect with jtag, load symbols, and see what is happening.
  • Yes my clock is 24MHz... Also...

    I have figured out my problem. The magic number for the board was not set up in the EEPROM, and the MLO/SPL detected this and quit without an error before the U-Boot console was loaded. As Brad mentioned, since there’s no instrumentation in the MLO/SPL file, there’s no indication to the user that the system has failed to boot. The RCN kernel I had previously used has all kinds of instrumentation for this very issue.

    I fixed this issue three ways:
    1) Hardware solution: hard-wired a working board to the I2C lines of the non-working board and dropped into the U-Boot console on the working board, where I then entered the I2C magic numbers with the same process as outlined below.

    2) Software/firmware Fix:
    a. I added some code in board_is_bone_lt() within board.h, replacing the return value with 1 so that I could at least get into the U-Boot console to program the EEPROM over I2C.
    b. Physically ground through-hole “TP2” to disable write protection.
    c. Enter the following commands to program the EEPROM registers:
    i2c mw 0x50 00.2 0xAA
    i2c mw 0x50 01.2 0x55
    i2c mw 0x50 02.2 0x33
    i2c mw 0x50 03.2 0xee
    i2c mw 0x50 04.2 0x41
    i2c mw 0x50 05.2 0x33
    i2c mw 0x50 06.2 0x33
    i2c mw 0x50 07.2 0x35
    i2c mw 0x50 08.2 0x42
    i2c mw 0x50 09.2 0x4e
    i2c mw 0x50 0a.2 0x4c
    i2c mw 0x50 0b.2 0x54
    i2c mw 0x50 0c.2 0x30
    i2c mw 0x50 0d.2 0x30
    i2c mw 0x50 0e.2 0x30
    i2c mw 0x50 0f.2 0x43
    i2c mw 0x50 10.2 0x32
    i2c mw 0x50 11.2 0x33
    i2c mw 0x50 12.2 0x31
    i2c mw 0x50 13.2 0x34
    i2c mw 0x50 14.2 0x42
    i2c mw 0x50 15.2 0x42
    i2c mw 0x50 16.2 0x42
    i2c mw 0x50 17.2 0x4b
    i2c mw 0x50 18.2 0x30
    i2c mw 0x50 19.2 0x32
    i2c mw 0x50 1a.2 0x31
    i2c mw 0x50 1b.2 0x32
    Check the memory to see if your programming took:
    i2c md 0x50 00.2 0x20
    0000: aa 55 33 ee 41 33 33 35 42 4e 4c 54 30 30 30 43 .U3.A335BNLT000C
    0010: 32 33 31 34 42 42 42 4b 30 32 31 32 ff ff ff ff 2314BBBK0212....


    3) For ease of future development, I modified the code within read_eeprom() in board.c to detect whether the EEPROM was previously programmed with the magic numbers seen above. If the correct magic numbers are not detected, it uses i2c_write() with the above values (ex: i2c_write(0x50, 0, 2, 0xAA, 2)) in a for loop to write all the correct magic numbers out to the EEPROM automatically.
    Here’s my modified read_eeprom() code:

    /* Read header information from EEPROM into global structure. */
    static int read_eeprom(struct am335x_baseboard_id *header)
    {
        /* Check if baseboard eeprom is available */
        if (i2c_probe(CONFIG_SYS_I2C_EEPROM_ADDR)) {
            puts("Could not probe the EEPROM; something fundamentally "
                 "wrong on the I2C bus.\n");
            return -ENODEV;
        }

        /* read the eeprom using i2c */
        if (i2c_read(CONFIG_SYS_I2C_EEPROM_ADDR, 0, 2, (uchar *)header,
                     sizeof(struct am335x_baseboard_id))) {
            puts("Could not read the EEPROM; something fundamentally"
                 " wrong on the I2C bus.\n");
            return -EIO;
        }

        /* If the EEPROM magic ID is not detected... */
        if (header->magic != 0xEE3355AA) {
            puts("Board Magic ID Not detected\n");
            puts("Attempting to program the I2C EEPROM\n");
            puts("BE SURE TO GROUND WRITE PROTECT TO ENABLE EEPROM WRITING...\n");

            /*
             * Note: this compares against the ASCII string "FFFFFFFF",
             * not raw 0xFF bytes, so it is also true for a blank EEPROM.
             */
            if (strncmp(header->name, "FFFFFFFF", HDR_NAME_LEN)) {
                /* EEPROM ID data: the board will be detected as "A335BNLT" */
                uint8_t data[] = {0xAA, 0x55, 0x33, 0xee, 0x41, 0x33, 0x33, 0x35,
                                  0x42, 0x4e, 0x4c, 0x54, 0x30, 0x30, 0x30, 0x43,
                                  0x32, 0x33, 0x31, 0x34, 0x42, 0x42, 0x42, 0x4b,
                                  0x30, 0x32, 0x31, 0x32};
                uint8_t count;

                /*
                 * Write one byte per transfer. A length of 2 here would
                 * read past the end of data[] on the last iteration.
                 */
                for (count = 0; count < sizeof(data); count++)
                    i2c_write(0x50, count, 2, &data[count], 1);
            }

            /*
             * read the eeprom using i2c again,
             * but use only a 1 byte address
             */
            if (i2c_read(CONFIG_SYS_I2C_EEPROM_ADDR, 0, 1, (uchar *)header,
                         sizeof(struct am335x_baseboard_id))) {
                puts("Could not read the EEPROM; something "
                     "fundamentally wrong on the I2C bus.\n");
                return -EIO;
            }

            if (header->magic != 0xEE3355AA) {
                printf("Incorrect magic number (0x%x) in EEPROM\n", header->magic);
                return -EINVAL;
            }
        }

        return 0;
    }
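To make the 28 bytes above less magic, here is a minimal host-side sketch that assembles them from their parts: the 32-bit magic stored least significant byte first, then an 8-character board name, a 4-character version and a 12-character serial. The field split follows my understanding of struct am335x_baseboard_id in U-Boot (treat it as an assumption), and build_eeprom_header() is a hypothetical helper, not something that exists in U-Boot.

```c
#include <stdint.h>
#include <string.h>

/* 4-byte magic + 8-byte name + 4-byte version + 12-byte serial */
#define EEPROM_HDR_LEN 28

/*
 * Fill buf with the header bytes. The strings must be at least as long
 * as their fields (8, 4 and 12 characters); no terminator is stored.
 */
void build_eeprom_header(uint8_t buf[EEPROM_HDR_LEN], uint32_t magic,
                         const char *name, const char *version,
                         const char *serial)
{
    int i;

    /* Store the magic least significant byte first: AA 55 33 EE */
    for (i = 0; i < 4; i++)
        buf[i] = (uint8_t)(magic >> (8 * i));

    memcpy(&buf[4], name, 8);
    memcpy(&buf[12], version, 4);
    memcpy(&buf[16], serial, 12);
}
```

Calling build_eeprom_header(buf, 0xEE3355AA, "A335BNLT", "000C", "2314BBBK0212") yields the same byte sequence as the i2c mw commands and the data[] array above, so a custom board could swap in its own name or serial the same way.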
  • In my case, stepping through the uboot SPL code, I can see why it crashes:

    If the eeprom is not programmed,  the board_ti_get_config() function in board_detect.c returns NULL.

    board_is_idk() in board.h calls that function and uses its return for a strncmp without NULL checking first.

    Thus, board_is_idk() doesn't return 0 but access violates memory and ends up in the "hang and never return" loop...

  • Micha,

    Yes, you are correct. Basically, the MLO/SPL file tries to detect which board the system is running on by reading the ID from the I2C EEPROM. This ID is also referred to as the magic number or magic ID. If the EEPROM is not programmed (it will likely read as all 0xFF bytes), the MLO file will not be able to proceed, since it doesn't know which board it has to set up the interfaces and CPU speed for.

    So you are having the same problem that I had- any of the above three methods would work to fix your issue...

    Mark

  • MichaG said:

    In my case, stepping through the uboot SPL code, I can see why it crashes:

    If the eeprom is not programmed,  the board_ti_get_config() function in board_detect.c returns NULL.

    board_is_idk() in board.h calls that function and uses its return for a strncmp without NULL checking first.

    Thus, board_is_idk() doesn't return 0 but access violates memory and ends up in the "hang and never return" loop...

    I filed a bug on this behavior, as I believe it should print out a message in the case that no board is found.  And it shouldn't crash!

    If you haven't yet programmed your EEPROM, this would be a convenient time to test out a quick patch to that function.  For example, is this sufficient to avoid the crash:

    diff --git a/board/ti/am335x/board.h b/board/ti/am335x/board.h
    index 9776df7..d29158c 100644
    --- a/board/ti/am335x/board.h
    +++ b/board/ti/am335x/board.h
    @@ -33,7 +33,10 @@ static inline int board_is_evm_sk(void)

     static inline int board_is_idk(void)
     {
    -       return !strncmp(board_ti_get_config(), "SKU#02", 6);
    +       if ( board_ti_get_config() )
    +               return !strncmp(board_ti_get_config(), "SKU#02", 6);
    +       else
    +               return 0;
     }

    Or are there more of these?  The board_is_bbg1() function above it looked like it might also be problematic.  If we can quickly confirm a patch to this issue, that will make it easier to have this quickly/permanently fixed for everyone.

  • Your patch works fine. I think in board_is_bbg1() and also board_is_icev2() it is not a problem. It's true that board_ti_get_rev() can return NULL, which is then plugged into the strncmp function. But it only returns NULL if the header is TI_DEAD_EEPROM_MAGIC.
    And if THAT is the case, the first part of the AND expression has already returned false, so the board_ti_get_rev() function is never called (board_ti_is() is checking for the dead EEPROM magic number itself).
    Yeah, it's potentially a minefield, but it works ;-)
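The short-circuit argument can be checked in isolation with a small sketch like the one below. Everything prefixed fake_ is a hypothetical stand-in written for this example, not the real U-Boot function; the only thing being demonstrated is that C never evaluates the right-hand side of && when the left-hand side is false.

```c
#include <stddef.h>
#include <string.h>

static int rev_calls;    /* counts calls to the hypothetical revision getter */
static int eeprom_valid; /* 0 models an unprogrammed EEPROM */

/* Stand-in for board_ti_is(): false whenever the EEPROM is invalid. */
int fake_board_is(const char *name)
{
    return eeprom_valid && strcmp(name, "A335BNLT") == 0;
}

/* Stand-in for board_ti_get_rev(): NULL whenever the EEPROM is invalid. */
const char *fake_board_get_rev(void)
{
    rev_calls++;
    return eeprom_valid ? "BBG1" : NULL;
}

int fake_board_is_bbg1(void)
{
    /* The right operand runs only if the left operand is nonzero. */
    return fake_board_is("A335BNLT") &&
           !strncmp(fake_board_get_rev(), "BBG1", 4);
}
```

With eeprom_valid left at 0, fake_board_is_bbg1() returns 0 and rev_calls stays at 0, so the NULL never reaches strncmp. The safety depends entirely on the order of the two operands, which is the "minefield" part.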
  • Micha,

    Thanks a lot for the quick reply. Glad to hear it works. So what do you end up getting as a message with that patch? Hopefully something that is clear that the EEPROM is not programmed!

    I'll see about getting that patch integrated into u-boot...

    Brad
  • In mux.c it then falls through all the board checks and ends up at the default puts("Unknown board, cannot configure pinmux."), which makes sense.
    By the way, am I supposed to see that message (or any SPL message) on my UART0 console? Because I don't. I'm not sure at what point (if at all) UART0 is set up for console output. The U-Boot config file does say CONFIG_CONS_INDEX=1. I do see the "CCCCC" when it tries to boot from UART0, but that's all I ever get.
  • Well, it would certainly be more useful if it was printed to the console! I'll make a note of it in my bug report. Perhaps that's something additional we can consider...
  • The console port can be initialized earlier. If you modify arch/.../board.c and add to the board_early_init_f() function the following:

    set_uart_mux_conf();
    preloader_console_init();

    (must be after prcm_init)

    ...then we get the uboot banner and trace messages during mux and memory setup. Not that I need it anymore now, but that would have been helpful. Of course there may also be a reason why the console is initialized later...
  • FYI, for AM335x we cannot initialize the console that early because the console port is board-dependent. The ICEv2 board uses UART3 while the others use UART0. In the case of missing eeprom data we don't know which to use.

    It's of course perfectly fine to make that change when customizing u-boot for your own hardware. I imagine the older MLOs floating around didn't have support for ICEv2 and so they were hard-coded to dump messages to UART0.

    Our Linux/u-boot team took a slightly different approach to solving this issue. Rather than changing board_is_idk() as I did, they opted to make some of the higher level functions return an empty string rather than a NULL pointer. That way, strncmp can still function normally.
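For readers who want to see why the empty-string approach is enough, here is a minimal sketch of the idea. The names get_config_or_empty() and is_idk() and the stored_config variable are hypothetical and only model the behavior described above; this is not the actual U-Boot change.

```c
#include <stddef.h>
#include <string.h>

static const char *stored_config; /* NULL models an unprogrammed EEPROM */

/* Hypothetical getter that hands back "" instead of NULL. */
const char *get_config_or_empty(void)
{
    return stored_config ? stored_config : "";
}

int is_idk(void)
{
    /* strncmp stops at the terminator of "", so no NULL is dereferenced. */
    return !strncmp(get_config_or_empty(), "SKU#02", 6);
}
```

With stored_config left as NULL, is_idk() simply returns 0, and every other caller that passes the getter's result to strncmp is covered without needing its own NULL check.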