** Read error Verifying Checksum ... Bad Data CRC ERROR: can't get kernel image!

Hi all,

I frequently get a bad data CRC error some time after writing the kernel uImage to NAND.

If I leave my DM355 EVM board switched off for 10 to 15 days, the board runs into this issue.

Information about hardware:

-------------------------------------------

Processor: DM355
Filesystem: YAFFS2
NAND Flash: Micron 29F4G08ABADA
Kernel: 2.6.18
SDK: PSP2.10
U-Boot Version: 1.3.4
UBL Version: 1.3
Is this a Micron NAND issue, a problem with the type of ECC (HW or SW) used in the kernel, or the result of a new bad block appearing in the kernel area?
What could cause the checksum to fail while reading the image?
 
The causes I can think of are: (a) wrong bad block management in NAND (an incorrect choice of HW ECC, SW ECC or HW_SYNDROME in the board file), (b) a new bad block developing in the kernel area where the uImage is already stored, or (c) the Micron NAND itself.
 
Which type of ECC is used in the RBL, UBL, U-Boot and kernel, and which ECC type is better?
 
Please suggest solutions (for the NAND or the kernel code) to avoid this issue.
 
Kindly respond at the earliest.

=====

LOG

=====

Chip initialization passed!
TI UBL Version: 1.30
Booting Catalog Boot Loader
BootMode = NAND
Starting NAND Copy...
Valid MagicNum found.
 
 ENTRY POINT = 0x81080000
 NUM PAGES = 0x00000050
 BLOCK = 0x00000019
 PAGE = 0x00000001
 LOAD ADDRESS = 0x81080000   DONE
Jumping to entry point at 0x81080000.
 
 LSP 2.10 BETA RELEASE  
 VERSION: 2.10.008.02  
 
U-Boot 1.3.4 (Jul 26 2010 - 11:34:57)
 
I2C:   ready
DRAM:  128 MB
NAND:  NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron NAND 512MiB 3,3V 8-bit)
Bad block table found at page 262080, version 0x01
Bad block table found at page 262016, version 0x01
No NAND device found!!!
512 MiB
In:    serial
Out:   serial
Err:   serial
ARM Clock :- 216MHz
DDR Clock :- 171MHz
Hit any key to stop autoboot:  4  3  2  1  0  
 
Loading from NAND 512MiB 3,3V 8-bit, offset 0x400000
   Image Name:   Linux-2.6.18-lsp2.10_hw2.1_withE
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1700372 Bytes =  1.6 MB
   Load Address: 80008000
   Entry Point:  80008000
** Read error
## Booting kernel from Legacy Image at 80700000 ...
   Image Name:   Linux-2.6.18-lsp2.10_hw2.1_withE
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    1700372 Bytes =  1.6 MB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... Bad Data CRC
ERROR: can't get kernel image!
DM355 EVM :>

Thanks & Regards,

S.Titus.

  • Stalin,

    Titus Stalin said:
    NAND:  NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron NAND 512MiB 3,3V 8-bit)
    Bad block table found at page 262080, version 0x01
    Bad block table found at page 262016, version 0x01
    No NAND device found!!!

    The log says "No NAND device found!!!". Can you perform a simple NAND write and read in U-Boot and check whether you can read back what you wrote, even after a reboot?

  • Hi Renjith,

    The "No NAND device found!!!" message appears because I set the maximum number of NAND devices to two in the code, so U-Boot also searches for a second NAND device.

    If I rewrite the kernel at the same location, I am able to boot the device.

  • Titus,

    Does this mean that your issue is solved and you are able to boot the kernel from NAND successfully?

  • Hi,

    It is solved temporarily, but the same issue will come up again in the future.

    It is a reliability issue and we can't reproduce it.

    Regards,

    Titus.

  • Hi all,

    This issue (kernel corruption) is due to ECC misbehavior in the U-Boot code.

    Sometimes we also got U-Boot corruption, which is due to ECC misbehavior in the UBL code.

    Kernel corruption:

    U-boot source code changes:


    u-boot_source/cpu/arm926ejs/davinci/nand.c
    The file location may differ in other U-Boot source trees,

    e.g.

     u_boot_source/drivers/mtd/nand/davinci_nand.c

    1. "u-boot_source/cpu/arm926ejs/davinci/nand.c" is the file responsible for ECC error detection and correction on the NAND flash.

    2. The condition check for ECC correction is not handled properly in the "nand_davinci_4bit_compare_ecc" function of this file.

    3. To correct bitflips in the NAND flash, use the condition check "if (iserror == ECC_STATE_NO_ERR)" instead of "if (iserror == ECC_STATE_NO_ERR || iserror == 5)". With the original check, the function returns zero (leaving without doing any ECC correction) before reaching the ECC correction code, even when bitflips are present in the NAND flash.

    4. See the NANDFSR register description in the sprued1b document. The ECC_STATE condition check in our nand.c file is wrong: if iserror (ECC_STATE) equals five, the processor is still calculating the ECC errors in the NAND flash, so the function should not exit at that point.

    U-boot corruption:

    UBL source code changes:

    1. "ubl_source/DM35x/Common/src/device_nand.c" is the file responsible for ECC error detection and correction on the NAND flash.

    2. The condition check for ECC correction is not handled properly in the "Uint32 DEVICE_NAND_ECC_correct" function of this file.

    3. To correct bitflips in the NAND flash, use the condition check "if ((corrState == 1))" instead of "if ((corrState == 1) || (corrState > 3))". With the original check, the function returns (leaving without doing any ECC correction) before reaching the ECC correction code, even when bitflips are present in the NAND flash.

    4. See the NANDFSR register description in the sprued1b document. The ECC_STATE condition check in our device_nand.c file is wrong.

  • Hi Titus,

    Which U-Boot code base are you referencing above? I grepped the current head version from Wolfgang Denk's DENX site and found neither the file nor the symbol. I also searched the older 1.3.4 U-Boot code base delivered with the PSP v5.1 IPNC. I've been assigned a project on an already deployed product, about 30k-50k devices based on the DM368/IPNC reference design. I need to add support to U-Boot for redundant storage of the U-Boot parameters. In this case the customer, my employer, doesn't have the source code for what was deployed. The existing working U-Boot is based on the 1.3.4 code base and works well; however, I don't have the source code for it... :).

    I'd like to apply the above described fix to the u-boot in the TI PSP v5.1 drop (v1.3.4).  The code in the drop doesn't boot our kernel... it emits the same error described in the head of this thread.

    Regards,

    Edwin Bland

  • OK... after a bit of googling... the source file appears in the flash utils delivered with the TI PSP ... as part of what's used to build the UBL.  I've previously built the UBL for a boot hang problem... which was resolved with the new UBL... Does the above comment suggest that the UBL is causing the uboot load to load nand incorrectly? 

    Our UBL was working fine with our previous UBOOT load...

    Trying to understand how the above code change fixes the uboot kernel load problem.... which didn't exist with the previous uboot... and the same UBL....

    Tx in advance!

    Regards,

    Edwin

  • Hi Edwin,

    Sorry for the late reply.

    I ran into this problem with TI U-Boot version 1.3.4.

    You asked: "Trying to understand how the above code change fixes the uboot kernel load problem.... which didn't exist with the previous uboot... and the same UBL...."

    We have to use the condition check if (iserror == ECC_STATE_NO_ERR) instead of if (iserror == ECC_STATE_NO_ERR || iserror == 5).

    (i) if (iserror == ECC_STATE_NO_ERR || iserror == 5) works when there is no error or bitflip in the NAND flash while reading the kernel image.

    (ii) if (iserror == ECC_STATE_NO_ERR || iserror == 5) does not work when an error or bitflip is present in the NAND flash while reading the kernel image: 'iserror' is then not equal to ECC_STATE_NO_ERR (i.e. one or more bitflips are present in the NAND), yet the function returns 0 while the errors are still being calculated,

    i.e. iserror == 5 means ECC_STATE = 5, which indicates that the ECC calculation is still in progress.

    The problem is in the condition check that decides whether to call the function that corrects bitflips during a read.

    That call was not actually happening, so we changed the check to if (iserror == ECC_STATE_NO_ERR).

    Now the function returns early only if there are no errors or bitflips in the NAND flash.

    Please read the EMIF section of your device TRM.

  • Hi Titus,
    
    I appreciate the suggestion.  Unfortunately, we already have this patch/mod as follows:
    /usb_webcam/ti-davinci/drivers/mtd/nand/davinci-nand.c: ~765
    
    
     if (iserror == ECC_STATE_NO_ERR)// || iserror == 5)
      return 0;
     else if (iserror == ECC_STATE_TOO_MANY_ERRS) {
      printk(KERN_ERR "%s Too many errors to be corrected!\n"
        , __func__);
      return -1;
     }
    
    
    That doesn't yet explain the CRC failure...  which is only present with the new UBOOT from the v5.1 drop.
    The failure is a result of a UBOOT change... nothing else.  
    i.e. I change UBOOT, leaving UBL & the kernel/application the same... 
    this single change causes the kernel CRC unpack error... 
    I believe we're looking for a code change in UBOOT. 
    It's also conceivable we could be looking for a code change in the kernel unpack...if perhaps it had
    an unrecognized failure in the existing code... this seems unlikely, however.
    
    Regards,
    
    Edwin
  • Hi Titus,
    
    I meant to note that we also have the code in the UBOOT davinci nand driver as follows: (UBOOT from IPNC v5.1)
    
    /UBOOT/cpu/arm926ejs/davinci/nand.c: ~422
     if (iserror == ECC_STATE_NO_ERR)
      return 0;
     else if (iserror == ECC_STATE_TOO_MANY_ERRS)
      return -1;
    
    i.e. neither of the above suggestions explains the observed CRC failure.  Any other ideas?  Has the UBOOT released
    with the v5.1 IPNC drop been verified as functional with the DM368?
    
    Regards,
    
    
    Edwin
    
  • Hi Edwin.

    Check the patch below, which is meant to avoid ECC read failures.

    The patch is actually against "u-boot/drivers/mtd/nand/davinci_nand.c", but it has to be applied to the old U-Boot source location "u-boot/cpu/arm926ejs/davinci/nand.c".

    http://arago-project.org/git/projects/?p=u-boot-davinci.git;a=commitdiff;h=1075b07e2c67c1f504d9f3a6f1b9aaa8f81393b2

  • Hi Titus,

    That appears to be in the right general area.  ... but alas, the TI v5.1 drop already has it.  Granted the code base is different, but it's clear from the following:

     /*
      * Set the addr_calc_st bit(bit no 13) in the NAND Flash Control
      * register to 1.
      */
     emif_addr->NANDFCR |= 1 << 13;

     /*
      * Wait for the corr_state field (bits 8 to 11)in the
      * NAND Flash Status register to be equal to 0x0, 0x1, 0x2, or 0x3.
      */
     i = NAND_TIMEOUT;

    ...Still searching.

    Regards,

    Edwin