TMS320C6745 bootloader bit error booting from NAND

Other Parts Discussed in Thread: TMS320C6745, OMAPL138

Hi,

We have a board design using a TMS320C6745 DSP that boots from NAND flash.

We wrote a secondary bootloader that is written once to flash. We recently had problems with a couple of boards that stopped working after clients had used them for several months. After investigation, it seems that one bit in the flash became corrupted over time. Some boards were stuck in our secondary bootloader waiting for new code, because the corrupted bit was in the main code; the others did not even reach our bootloader, because the bad bit was in the bootloader section itself. From what I understand from the bootloader datasheet, ECC detection is used, but is any correction done?

Have any of you had a similar issue with NAND memory where one bit changes over time?

  • Simon,

    What kind of NAND is used on the board? How many bits of ECC does your NAND support? ECC-based correction is implemented in the boot loader, and it supports up to 4-bit correction. Have you tried running the debug GEL file to see if you can obtain any information from it:

    http://processors.wiki.ti.com/index.php/OMAP-L1x_Debug_Gel_Files

    Regards,

    Rahul

  • Hi Rahul.

    I am using a Micron MT29F2G08ABAEA 2 Gb flash. According to the datasheet, it supports 4-bit ECC.

    I did not know such a GEL file existed! I ran it on a good board and on one that has the memory problem.

    Using a working board I have:
    ROM ID: d800k003 
    Silicon Revision 2.0
    Boot pins: 62335
    Boot Mode: NAND 8 (0x0000F37F)
    ROM Status Code: 0x00000000 
    Description: No error
    Program Counter (PC) = 0x1181F6E8
    Status Code: 0x00000000 
    Description: No error
    Program Counter (PC) = 0x1181F6E8

    Faulty board:
    ROM ID: d800k003 
    Silicon Revision 2.0
    Boot pins: 62335
    Boot Mode: NAND 8 (0x0000F37F)
    ROM Status Code: 0x0000001A 
    Description: NAND read page failed
    Program Counter (PC) = 0x00712144

    The PLL configs are also different, but I assume that is because it did not succeed in loading the PLL configuration from the AIS.

    Is there somewhere I can find the meaning of the ROM status codes? Other than that, it pretty much tells me what I already knew: the NAND read has failed. I am able to run the device in debug mode and download the content of the flash, and I know exactly where it has failed. There is one bit that went high, so I have 0x22 where it was supposed to be 0x20. ECC correction should be able to solve this problem, right?

    Is there something I need to do to enable ECC correction, or is it already on by default?

    Thanks for your help,
    Simon
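
The single-bit difference described above (0x22 read back where 0x20 was programmed) can be pinned down by XOR-ing the two values. This is only a minimal host-side sketch; the helper name `flipped_bits` is illustrative and is not part of any TI tool.

```python
def flipped_bits(expected: int, actual: int) -> list:
    """Return (bit_index, direction) for every bit that differs in one byte."""
    diff = (expected ^ actual) & 0xFF
    result = []
    for bit in range(8):
        if diff & (1 << bit):
            # Direction is judged from the value that was actually read back.
            direction = "0->1" if actual & (1 << bit) else "1->0"
            result.append((bit, direction))
    return result


print(flipped_bits(0x20, 0x22))  # [(1, '0->1')]
```

For the values in the post this reports one changed bit, bit 1, which went from 0 to 1.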

  • Simon,

    The description below the ROM Status is the interpretation of the status code. ECC calculation is done in hardware by the EMIF, and correction is set up by default in the bootloader. If you are seeing just one error on that NAND page, then I would have expected the bootloader to correct it. Another thing you may want to try is enabling CRC when creating the AIS boot image, to check for errors while the code is loaded.

    It may also be useful to follow the tip Daniel mentions in this post:

    http://e2e.ti.com/support/dsp/omap_applications_processors/f/42/t/118800.aspx

    Regards,

    Rahul

    Helpful wikis:

    http://processors.wiki.ti.com/index.php/Raw_NAND_ECC

    http://processors.wiki.ti.com/index.php/Davinci/Sitara/Integra_Nand_Boot_FAQ#What_is_the_ECC_support_on_DM.2FOMAPL13x.2FC678x.3F

  • Thanks Rahul, I will check that. I also downloaded TI's nand_writer, which I will try to run in debug.

    The thing I don't understand is that the device worked for several months and then stopped working. Also, CRC would detect an error, but it would only try to reload three times and then stop, so the device would still be in error.
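
The concern about CRC can be illustrated on the host: a CRC flags a changed bit but carries no information for repairing it, so re-reading the same corrupted data fails the same way on every attempt. This is a rough sketch that uses Python's `zlib.crc32` as a stand-in. The CRC algorithm actually used by the AIS format is not assumed to be the same, the three-attempt limit is taken from the post above, and `load_with_crc` is a hypothetical helper.

```python
import zlib


def load_with_crc(read_section, expected_crc: int, max_attempts: int = 3):
    """Read a section up to max_attempts times; return the data or None."""
    for _ in range(max_attempts):
        data = read_section()
        if zlib.crc32(data) == expected_crc:
            return data
    return None  # every attempt failed the check


good = bytes([0x20] * 64)
expected = zlib.crc32(good)

corrupted = bytearray(good)
corrupted[10] ^= 0x02  # one stored bit has changed: 0x20 becomes 0x22
corrupted = bytes(corrupted)

print(load_with_crc(lambda: good, expected) is not None)  # True
print(load_with_crc(lambda: corrupted, expected) is None)  # True
```

Retrying only helps with transient read errors. A bit that has changed in the flash cell itself needs ECC correction or a rewrite of the page.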

  • Hello again,

    So I programmed my user bootloader using the NAND writer program from TI. I tried to simulate a bit error by doing the following.

    I read and backed up all the spare bytes used for ECC, modified one bit of the code, and wrote it back to memory with the original spare bytes. I booted up, and the device doesn't boot... Does that mean ECC correction does not work? Is there something I can do to simulate a bit changing over time?

  • Simon,

    I doubt whether the way you've simulated the ECC corruption is correct. But can you confirm the ECC algorithm used here? Is it 1-bit or 4-bit? And how many bits of corruption are you seeing per 512 bytes of data? You might also have to check whether the correction logic is enabled in your NAND driver.
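
One way to answer the "bits of corruption per 512 bytes" question is to compare a flash dump against the image that was originally programmed and count differing bits per 512-byte chunk. This is a minimal sketch, assuming both are available as equal-length byte strings covering the main data area only; the function name and the demo data are made up for illustration.

```python
def bit_errors_per_sector(reference: bytes, dump: bytes, sector_size: int = 512) -> dict:
    """Map sector index -> number of differing bits (clean sectors omitted)."""
    if len(reference) != len(dump):
        raise ValueError("reference and dump must have the same length")
    errors = {}
    for offset in range(0, len(reference), sector_size):
        ref = reference[offset:offset + sector_size]
        got = dump[offset:offset + sector_size]
        # XOR leaves a 1 wherever the two bytes disagree; count those bits.
        count = sum(bin(a ^ b).count("1") for a, b in zip(ref, got))
        if count:
            errors[offset // sector_size] = count
    return errors


reference = bytes([0x20] * 2048)  # stand-in for one page of the original image
dump = bytearray(reference)
dump[700] ^= 0x02                 # simulate the 0x20 -> 0x22 change
print(bit_errors_per_sector(reference, bytes(dump)))  # {1: 1}
```

A chunk that shows more differing bits than the ECC scheme can correct would explain a failed page read.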

  • Hi Renjith,

    The corruption problem I have occurs in the TI boot loader stage. If I understand correctly, there is supposed to be 4-bit ECC correction by default. Is there a way to enable or disable the correction logic in the bootloader stage? When I run the faulty board in debug mode and read the flash, there is one wrong bit in the whole block. I thought that maybe 5 bits are corrupted and only 4 are corrected, but I don't know how I can validate or test this...

    Thanks for your support.

    Simon

  • Simon,

    I have not worked on C6000 bootloaders. If you can share the code with me or point to the link where I can download the same version as yours I can check the support for ECC algorithm and correction logic implemented. If you can share the code please send it to renjith.thomas@pathpartnertech.com. 

  • Renjith,

    Pretty much all I used can be found here: http://processors.wiki.ti.com/index.php/Boot_Images_for_OMAP-L137

    I used the flash utility to program my original flash in debug mode, and files are generated using AISgen.  But there is no available source code for the TI bootloader.

  • I am not sure how the ECC algorithm is selected in the case of AISGen. Can you please send the contents of a NAND page (2 KB + 64 bytes), including the spare area bytes? From that we can figure out the algorithm used.

    Also, have you explored the option of using UBL instead of AISGen? UBL source code is available. I'm not sure about AISGen source.
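
If the flash is dumped raw, each 2 KB page can be separated from its 64 spare bytes before sharing it. This is a sketch that assumes the dump stores each page's spare bytes immediately after its 2048 data bytes. That layout depends on the dump tool, and where the ECC bytes sit inside the spare area is not assumed here.

```python
PAGE_DATA = 2048   # main data bytes per page
PAGE_SPARE = 64    # spare (out-of-band) bytes per page


def split_pages(raw: bytes, data_size: int = PAGE_DATA, spare_size: int = PAGE_SPARE):
    """Yield (data, spare) for each page of a raw dump that includes spare bytes."""
    page_size = data_size + spare_size
    if len(raw) % page_size:
        raise ValueError("dump length is not a whole number of pages")
    for offset in range(0, len(raw), page_size):
        page = raw[offset:offset + page_size]
        yield page[:data_size], page[data_size:]


# Two fake pages: data filled with 0x20, spare filled with 0xFF.
raw = (bytes([0x20] * PAGE_DATA) + bytes([0xFF] * PAGE_SPARE)) * 2
for index, (data, spare) in enumerate(split_pages(raw)):
    print(index, len(data), len(spare), spare[:8].hex(" "))
```

Printing the spare bytes in hex next to the page index makes it easier to see which of them change when the data changes.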

  • Renjith, Simon,

    ECC bit generation on OMAPL138 is done in hardware by the EMIFA port. Basically, the software (ROM boot loader) correction logic follows the NAND write cycle that is described on page 733 of the C6745 Technical Reference Manual.

    Simon,

    I am not sure you can simulate the NAND bit error that occurs over time using software, because when you artificially create an error and write it back, the EMIF hardware will by default calculate the ECC bits to match the changed data.

    Regards,

    Rahul
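
The difference between a bit that decays in place and modified data that is rewritten with freshly computed ECC can be shown with a toy model. The code below is a deliberately simple single-bit-correcting scheme (XOR of the positions of all set bits). It is not the EMIFA 1-bit or 4-bit algorithm, and all names are hypothetical.

```python
def toy_ecc(data: bytes) -> int:
    """Toy code: XOR of the 1-based positions of all set bits."""
    code = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if byte & (1 << bit):
                code ^= byte_index * 8 + bit + 1
    return code


def read_with_correction(data: bytes, stored_ecc: int) -> bytes:
    """Correct at most one flipped bit using the stored toy ECC."""
    syndrome = stored_ecc ^ toy_ecc(data)
    if syndrome == 0:
        return data  # data and ECC agree, nothing to correct
    position = syndrome - 1
    if position >= len(data) * 8:
        raise ValueError("uncorrectable in this toy model")
    fixed = bytearray(data)
    fixed[position // 8] ^= 1 << (position % 8)
    return bytes(fixed)


original = bytes([0x20] * 16)
stored_ecc = toy_ecc(original)

flipped = bytearray(original)
flipped[5] ^= 0x02  # 0x20 becomes 0x22
flipped = bytes(flipped)

# Case 1: the bit changes in place, so the stale stored ECC exposes it.
print(read_with_correction(flipped, stored_ecc) == original)       # True

# Case 2: the modified data is rewritten and the ECC is recomputed for it.
print(read_with_correction(flipped, toy_ecc(flipped)) == flipped)  # True
```

In the second case the stored ECC matches the modified data, so the reader sees a clean page and the wrong byte is passed through unchanged. This is why a software-injected error only tests the correction path if the original ECC bytes really end up in the spare area.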

  • Thanks both of you,

    What I will do for now is write my secondary bootloader using the nandwriter from TI, to make sure ECC is used correctly at write time. Other than that, I will have to wait and see whether that solves the problem, since it is pretty hard to reproduce.

    Regards,

    Simon