Redundant copies of MLO and u-boot in NAND

Hello,

We're using a custom board based on the BeagleBone Black. We have the "ti-sdk-am335x-evm-07.00.00.00" SDK. We're getting ready to move from having all code on the SD card, to having everything in NAND. I've noticed the default MTD partitions specify 4 copies of MLO, but only 1 copy of u-boot and kernel images.


Questions:

1. I'm guessing only 1 copy each of the u-boot and kernel images is required because the MLO and u-boot code use Error Correction Code (BCH-8) when reading the u-boot and kernel images (respectively) from NAND. (In contrast, the ROM code does not use ECC to read MLO.) Is this correct?

2. If this is not correct, could you explain why the MTD partitions (copies) are arranged this way?

3. Also, if this is not correct, is there a way to have redundant copies of u-boot and kernel images, and have MLO and u-boot detect if the images are OK, and if not, automatically use the redundant copies?

Thank you,

Everett

  • Hi Everett,

    The ROM code does use ECC when reading the boot image (MLO) from NAND; see Section 26.1.7.4.1, Features, of the AM335x TRM.

    There are four copies of the MLO in the NAND partitions because the ROM code is designed to look for up to four valid images in NAND for reliability: if one copy is located in a bad NAND block, the ROM code skips it and goes on to the next copy.

    Check this article: processors.wiki.ti.com/.../RBL_UBL_and_host_program

    "The RBL does not do any form of bad block checking/management. It will simply rely on using the HW ECCs generated during the page reads to verify that a page read was correct. This means that the HW ECC values generated during the reads of each 512 bytes of data will be compared against the ECC values that are stored in the spare bytes of the NAND page. If the RBL sees an ECC mismatch occur during a page read, it will abort the operation from that block and try the next block. It will do this up to block 5 of the NAND device, starting from block 1 (skipping over block 0). If there is an ECC mismatch, the RBL will NOT use the ECC values to do bit error correction, even though the mismatched values could be used to find and correct a single-bit error.
    If the RBL can't get a successful read out of the first five blocks, the boot will fail and will default to attempting a UART boot"

    This is also very well explained in Section 26.1.7.4.2, Initialization and Detection, and Section 26.1.7.4.2.1, NAND Read Sector Procedure, in the device TRM.
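
    To make the quoted behaviour concrete, here is a rough Python model of the block scan it describes: try blocks 1 through 5, reject a block on any ECC mismatch without attempting correction, and give up if none reads cleanly. It is only an illustration of the quoted text, not ROM code. The names `toy_ecc`, `block_reads_cleanly`, `find_boot_block` and `make_block` are hypothetical, and `toy_ecc` is a simple checksum standing in for the real hardware ECC.

```python
from typing import List, Optional, Tuple

Page = Tuple[bytes, int]   # (page data, ECC value stored in the spare bytes)
Block = List[Page]


def toy_ecc(data: bytes) -> int:
    # Placeholder checksum standing in for the hardware ECC (an assumption).
    return sum(data) & 0xFFFF


def block_reads_cleanly(block: Block) -> bool:
    # Detect only: any mismatch rejects the whole block, no bit correction.
    return all(toy_ecc(data) == stored for data, stored in block)


def find_boot_block(nand: List[Block], first: int = 1, last: int = 5) -> Optional[int]:
    # Block 0 is skipped; blocks first..last are tried in order.
    for index in range(first, min(last, len(nand) - 1) + 1):
        if block_reads_cleanly(nand[index]):
            return index
    return None  # per the quote, the boot would then fall back to UART


def make_block(payload: bytes, corrupt: bool = False) -> Block:
    # Build a one-page block; optionally flip one bit after the ECC is stored.
    stored = toy_ecc(payload)
    data = bytes([payload[0] ^ 0x01]) + payload[1:] if corrupt else payload
    return [(data, stored)]


if __name__ == "__main__":
    nand = [
        make_block(b"unused"),             # block 0, never examined
        make_block(b"MLO", corrupt=True),  # block 1, single-bit error
        make_block(b"MLO"),                # block 2, clean copy
    ]
    print(find_boot_block(nand))
```

    In this model a single flipped bit in block 1 is enough to make the scan move on to block 2, which is why several copies of the first-stage image are useful when errors are detected but not corrected.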

    Hope this helps.

    Best Regards,
    Yordan
  • Hello,

    Thank you for the information. I see that the RBL uses ECC to detect errors but not to correct them. That explains why there are 4 copies of the MLO. I'm looking at the article you provided, and the TRM. In the meantime, could you answer my other questions?
    Questions:
    1. Do the MLO and u-boot use ECC to BOTH detect and correct errors?
    2. And is this the reason there is only 1 copy of u-boot and the kernel?
    Thank you,
    Everett