How to implement redundant U-Boot partitions

Other Parts Discussed in Thread: AM3352

Hello,

I'm trying to create a redundant boot system for increased robustness, and I'm interested in duplicating U-Boot and SPL data in NAND.

From what I've read, the processor I'm using (AM3352) can look at the first four blocks of NAND and load the SPL from the first of those blocks that contains an uncorrupted SPL image.

However, how do I go about implementing such a redundant system for U-Boot? Do I need to make any changes to the SPL code to verify if the U-Boot image is correct, and try to load from another partition if it is not?

Regards,

Guilherme

  • Guilherme,

    On a second read, I see you are actually looking for a U-Boot backup solution, not an SPL/MLO backup (which is implemented by default). For U-Boot backup, refer to the pointers below:

    processors.wiki.ti.com/.../Sitara_Linux_Training:_Boot_Time_Reduction_Update
    e2e.ti.com/.../334778
    e2e.ti.com/.../1034782

    Regards,
    Pavel
  • Pavel,

    I checked the documentation you sent and created the same partitions. I also enabled a debug message in SPL that should print the boot device (the line 'debug("boot device - %d\n", boot_device)' in common/spl/spl.c).

    However, it always prints '5', even after I erase some SPL partitions. I expected it to print the number of the partition SPL was loaded from.

    Do you know why this happens?

    Regards,
    Guilherme
  • Guilherme,

    Are you using AM335x TI SDK? If yes, what version?

    Which MTD partitions exactly have you created in the u-boot code base? Can you provide a console log?

    Regards,
    Pavel
  • The latest AM335x TI SDK 02.00.01.07 comes with the following default u-boot NAND partitioning:

    SDK/board-support/u-boot-2015.07/include/configs/am335x_evm.h

    #define MTDIDS_DEFAULT "nand0=nand.0"

    #define MTDPARTS_DEFAULT "mtdparts=nand.0:" \
    "128k(NAND.SPL)," \
    "128k(NAND.SPL.backup1)," \
    "128k(NAND.SPL.backup2)," \
    "128k(NAND.SPL.backup3)," \
    "256k(NAND.u-boot-spl-os)," \
    "1m(NAND.u-boot)," \
    "128k(NAND.u-boot-env)," \
    "128k(NAND.u-boot-env.backup1)," \
    "8m(NAND.kernel)," \
    "-(NAND.file-system)"

    U-Boot is mapped to mtd5 (1m(NAND.u-boot)), which should be the reason it prints 5.

     

    Regards,
    Pavel 
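
As a quick cross-check of the layout above, the start offset of each partition is simply the sum of the sizes listed before it. The following is a minimal host-side sketch in C, not U-Boot code; the `parts` table and the `part_offset` helper are hypothetical names introduced here for illustration, with the sizes copied from the MTDPARTS_DEFAULT string.

```c
#include <stddef.h>
#include <stdint.h>

/* Sizes copied from the MTDPARTS_DEFAULT string in the post above.
 * The final "-" entry (rest of the device) has no fixed size and is omitted. */
struct part {
    const char *name;
    uint32_t size;
};

static const struct part parts[] = {
    { "NAND.SPL",                128u * 1024 },
    { "NAND.SPL.backup1",        128u * 1024 },
    { "NAND.SPL.backup2",        128u * 1024 },
    { "NAND.SPL.backup3",        128u * 1024 },
    { "NAND.u-boot-spl-os",      256u * 1024 },
    { "NAND.u-boot",            1024u * 1024 },
    { "NAND.u-boot-env",         128u * 1024 },
    { "NAND.u-boot-env.backup1", 128u * 1024 },
    { "NAND.kernel",        8u * 1024 * 1024 },
};

#define NUM_PARTS (sizeof(parts) / sizeof(parts[0]))

/* Hypothetical helper: start offset of partition `index`,
 * computed as the sum of the sizes of all partitions before it. */
uint32_t part_offset(size_t index)
{
    uint32_t offs = 0;

    for (size_t i = 0; i < index && i < NUM_PARTS; i++)
        offs += parts[i].size;
    return offs;
}
```

With this table, `part_offset(5)` evaluates to 0xC0000 for NAND.u-boot, a value worth comparing against CONFIG_SYS_NAND_U_BOOT_OFFS in the same config header.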

  • Pavel,

    This is our partitioning scheme:

    device nand0 <omap2-nand.0>, # parts = 12
     #: name                size            offset          mask_flags
     0: SPL                 0x00020000      0x00000000      0
     1: SPL.backup1         0x00020000      0x00020000      0
     2: SPL.backup2         0x00020000      0x00040000      0
     3: SPL.backup3         0x00020000      0x00060000      0
     4: u-boot              0x00080000      0x00080000      0
     5: kernel              0x00260000      0x00100000      0
     6: fs                  0x01400000      0x00360000      0
     7: drivers             0x00500000      0x01760000      0
     8: app                 0x00a00000      0x01c60000      0
     9: upgd                0x00200000      0x02660000      0
    10: dmin                0x00a00000      0x02860000      0
    11: dgen                0x04da0000      0x03260000      0

    We are using TI SDK 6.0.00 for the AM335x. What is the rationale behind the u-boot-env and u-boot-env.backup1?

    Regards,

    Guilherme

  • Guilherme,

    Guilherme Costa said:
    What is the rationale behind the u-boot-env and u-boot-env.backup1?

    I am not sure I understand your question, but I will try to answer. I can see neither u-boot-env nor u-boot-env.backup1 in your partitioning scheme. u-boot-env stores the U-Boot environment (such as the boot arguments passed to the kernel); u-boot-env.backup1 is a backup copy of u-boot-env.

    Regards,
    Pavel

  • Pavel,

    Thanks for your reply. I see, so this way I can keep my environment setup safe. However, what about U-Boot itself? Why is there no backup partition for it?

    Regards,
    Guilherme
  • Guilherme,

    Guilherme Costa said:
    However, what about U-Boot itself? Why is there no backup partition for it?

    Because you did not create it.

    You could probably modify the .h file and add a u-boot backup after u-boot and shift the u-boot environment/kernel/rootfs higher in NAND.

    Regards,
    Pavel
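
The suggestion above could look roughly like the fragment below, based on the SDK 02.00.01.07 default layout quoted earlier in the thread. This is only a sketch: the partition name `NAND.u-boot.backup1` and its 1m size are assumptions made for illustration and are not part of the stock SDK header.

```c
/* Hypothetical layout: NAND.u-boot.backup1 is an assumed name and size,
 * inserted right after NAND.u-boot; everything after it moves up by 1 MiB. */
#define MTDPARTS_DEFAULT "mtdparts=nand.0:" \
    "128k(NAND.SPL)," \
    "128k(NAND.SPL.backup1)," \
    "128k(NAND.SPL.backup2)," \
    "128k(NAND.SPL.backup3)," \
    "256k(NAND.u-boot-spl-os)," \
    "1m(NAND.u-boot)," \
    "1m(NAND.u-boot.backup1)," \
    "128k(NAND.u-boot-env)," \
    "128k(NAND.u-boot-env.backup1)," \
    "8m(NAND.kernel)," \
    "-(NAND.file-system)"
```

Because the environment, kernel and file system all shift higher in NAND, any other setting that hard-codes their offsets (for example the environment offset in the same header, or the partition table the kernel uses) would presumably need the same adjustment.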

  • Pavel,

    Is there a config macro for enabling this u-boot backup? All I've found is a macro for redundant environments, which is not useful in our case, as our environment will be static.

    I've found this in the spl.c code:

    void spl_nand_load_image(void)
    {
       //...
       nand_spl_load_image(CONFIG_SYS_NAND_U_BOOT_OFFS,
                CONFIG_SYS_NAND_PAGE_SIZE, (void *)header);
       spl_parse_image_header(header);
       //...
    }
    

    I checked the return value of nand_spl_load_image, and it always returns 0, discarding any error found during the loading process. What I want is a way to detect a failed load and fall back to another image. I tried to create my own solution for that, but so far without success.

    Regards,

    Guilherme

  • Guilherme,

    Guilherme Costa said:
    Is there a config macro for enabling this u-boot backup?

    I searched but could not find such a config macro in the u-boot code base. What I did find is that something similar already exists for a related Sitara device (AM43xx) in some of the latest Sitara SDKs. Have a look at:

    ti-processor-sdk-linux-am335x-evm-02.00.01.07/board-support/u-boot-2015.07+gitAUTOINC+5922e09363/include/configs/am43xx_evm.h

    "u-boot.backup raw 0x080000 0x080000;

    512k(QSPI.u-boot.backup)

    Regards,
    Pavel

  • I managed to make it work! In case anyone is interested, here is a patch:

    diff --git a/U-Boot-2013.01.01-amsdk-06.00.00/common/spl/spl_nand.c b/U-Boot-2013.01.01-amsdk-06.00.00/common/spl/spl_nand.c
    index 61de5a4..75f1fda 100644
    --- a/U-Boot-2013.01.01-amsdk-06.00.00/common/spl/spl_nand.c
    +++ b/U-Boot-2013.01.01-amsdk-06.00.00/common/spl/spl_nand.c
    @@ -26,11 +26,17 @@
     #include <asm/io.h>
     #include <nand.h>
     
    +#ifndef CONFIG_SYS_U_BOOT_PARTITIONS
    +   #define CONFIG_SYS_U_BOOT_PARTITIONS   1
    +#endif
    +
     void spl_nand_load_image(void)
     {
    -  struct image_header *header;
    +   struct image_header *header;
       int *src __attribute__((unused));
       int *dst __attribute__((unused));
    +   int result;
    +   int i = 0;
     
       debug("spl: nand - using hw ecc\n");
       nand_init();
    @@ -90,11 +96,30 @@ void spl_nand_load_image(void)
          (void *)spl_image.load_addr);
     #endif
     #endif
    -  /* Load u-boot */
    -  nand_spl_load_image(CONFIG_SYS_NAND_U_BOOT_OFFS,
    -     CONFIG_SYS_NAND_PAGE_SIZE, (void *)header);
    -  spl_parse_image_header(header);
    -  nand_spl_load_image(CONFIG_SYS_NAND_U_BOOT_OFFS,
    -     spl_image.size, (void *)spl_image.load_addr);
    -  nand_deselect();
    +
    +   /* Load u-boot */
    +   for (i = 0; i < CONFIG_SYS_U_BOOT_PARTITIONS; i++)
    +   {
    +      int current_offset = i * CONFIG_SYS_NAND_U_BOOT_PARTITION_SIZE;
    +
    +      printf("Trying U-Boot image no. %d...\n", i + 1);
    +
    +      result = nand_spl_load_image(CONFIG_SYS_NAND_U_BOOT_OFFS + current_offset,
    +                                      CONFIG_SYS_NAND_PAGE_SIZE, (void *)header);
    +      if (result < 0)
    +         continue;
    +
    +      spl_parse_image_header(header);
    +
    +      result = nand_spl_load_image(CONFIG_SYS_NAND_U_BOOT_OFFS + current_offset,
    +                                   spl_image.size, (void *)spl_image.load_addr);
    +
    +      if (result == 0)
    +      {
    +         nand_deselect();
    +         return;
    +      }
    +   }
    +
    +   printf("All U-Boot images corrupted. Please re-flash your board\n");
     }
    diff --git a/U-Boot-2013.01.01-amsdk-06.00.00/drivers/mtd/nand/am335x_spl_bch.c b/U-Boot-2013.01.01-amsdk-06.00.00/drivers/mtd/nand/am335x_spl_bch.c
    index 9941e0e..2a3e0d1 100644
    --- a/U-Boot-2013.01.01-amsdk-06.00.00/drivers/mtd/nand/am335x_spl_bch.c
    +++ b/U-Boot-2013.01.01-amsdk-06.00.00/drivers/mtd/nand/am335x_spl_bch.c
    @@ -129,6 +129,7 @@ static int nand_read_page(int block, int page, void *dst)
       int eccsize = CONFIG_SYS_NAND_ECCSIZE;
       int eccbytes = CONFIG_SYS_NAND_ECCBYTES;
       int eccsteps = ECCSTEPS;
    +   int result;
       uint8_t *p = dst;
       uint32_t data_pos = 0;
       uint8_t *oob = &oob_data[0] + nand_ecc_pos[0];
    @@ -160,12 +161,11 @@ static int nand_read_page(int block, int page, void *dst)
       p = dst;
     
       for (i = 0 ; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
    -     /* No chance to do something with the possible error message
    -      * from correct_data(). We just hope that all possible errors
    -      * are corrected by this routine.
    -      */
    -     this->ecc.correct(&nand_info[0], p, &ecc_code[i], &ecc_calc[i]);
    -  }
    +      result = this->ecc.correct(&nand_info[0], p, &ecc_code[i], &ecc_calc[i]);
    +
    +      if (result < 0)
    +         return -1;
    +   }
     
       return 0;
     }
    @@ -174,8 +174,9 @@ int nand_spl_load_image(uint32_t offs, unsigned int size, void *dst)
     {
       unsigned int block, lastblock;
       unsigned int page;
    +   int result;
     
    -  /*
    +   /*
        * offs has to be aligned to a page address!
        */
       block = offs / CONFIG_SYS_NAND_BLOCK_SIZE;
    @@ -188,8 +189,12 @@ int nand_spl_load_image(uint32_t offs, unsigned int size, void *dst)
              * Skip bad blocks
              */
             while (page < CONFIG_SYS_NAND_PAGE_COUNT) {
    -           nand_read_page(block, page, dst);
    -           dst += CONFIG_SYS_NAND_PAGE_SIZE;
    +           result = nand_read_page(block, page, dst);
    +
    +            if (result < 0)
    +               return -1;
    +
    +            dst += CONFIG_SYS_NAND_PAGE_SIZE;
                page++;
             }
    

    
    

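    With this, SPL tries the next partition when the current one has too many bitflips for ECC to correct, but it does not do so for bad blocks or when the partition is empty.

    Remember, the U-Boot partitions must be contiguous and equally sized for this to work! Copy i is read from CONFIG_SYS_NAND_U_BOOT_OFFS + i * CONFIG_SYS_NAND_U_BOOT_PARTITION_SIZE, so both CONFIG_SYS_U_BOOT_PARTITIONS (which the patch defaults to 1) and CONFIG_SYS_NAND_U_BOOT_PARTITION_SIZE have to be defined accordingly.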

    Regards,

    Guilherme
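
The empty-partition case mentioned above could in principle be caught by checking the legacy image magic (IH_MAGIC, 0x27051956, stored big-endian at the start of the header) before accepting a copy, since erased NAND reads back as 0xFF. The following is a host-side sketch of the same fallback loop with that extra check, not SPL code: `fake_nand`, `copy_ecc_fails`, `mock_load`, `header_has_magic` and `load_first_good_copy` are hypothetical names, and the page size, copy size and number of copies are assumed values standing in for the CONFIG_ macros used in the patch.

```c
#include <stdint.h>
#include <string.h>

#define IH_MAGIC       0x27051956u           /* legacy uImage magic number */
#define NAND_PAGE_SIZE 2048u                 /* assumed page size */
#define COPY_SIZE      (4u * NAND_PAGE_SIZE) /* stand-in for the partition size */
#define NUM_COPIES     2u                    /* stand-in for the number of copies */

/* Hypothetical stand-in for the NAND array; erased flash reads as 0xFF. */
static uint8_t fake_nand[NUM_COPIES * COPY_SIZE];
/* Set an entry to 1 to mimic an uncorrectable ECC error in that copy. */
static int copy_ecc_fails[NUM_COPIES];

/* Mock of a load routine that reports errors: 0 on success, -1 on failure. */
static int mock_load(uint32_t offs, uint32_t size, void *dst)
{
    if (offs + size > sizeof(fake_nand) || copy_ecc_fails[offs / COPY_SIZE])
        return -1;
    memcpy(dst, fake_nand + offs, size);
    return 0;
}

/* The magic occupies the first four header bytes, most significant first. */
static int header_has_magic(const uint8_t *hdr)
{
    uint32_t magic = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                     ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];

    return magic == IH_MAGIC;
}

/* Returns the index of the first copy that loads cleanly, or -1 if none does. */
int load_first_good_copy(uint8_t *dst)
{
    uint8_t hdr[NAND_PAGE_SIZE];

    for (uint32_t i = 0; i < NUM_COPIES; i++) {
        uint32_t offs = i * COPY_SIZE;

        if (mock_load(offs, NAND_PAGE_SIZE, hdr) < 0)
            continue;   /* read error in the header page */
        if (!header_has_magic(hdr))
            continue;   /* erased or overwritten copy */
        if (mock_load(offs, COPY_SIZE, dst) == 0)
            return (int)i;
    }
    return -1;
}
```

In a real SPL the magic check would have to sit before spl_parse_image_header(); whether that function already rejects a header without the magic depends on the U-Boot version, so this should be verified against the source tree in use. A magic match alone also says nothing about the rest of the image, which is why the ECC result from the patch is still needed.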