4bit ECC support in Linux NAND driver for 16 bit NAND in AM1808

Other Parts Discussed in Thread: AM1808, OMAPL138

We have developed a custom board using the AM1808 processor and Micron’s 16-bit NAND flash MT29F2G16ABAEAWP-IT. We are currently using Linux-Davinci 2.6.37 (taken from the arago-project git repository), source revision ‘V2.6.37_DAVINCIPSP_03.21.00.04’. The bootloader is u-boot-omapl1, also taken from the arago-project git repository.
We are currently facing an issue with the Davinci-NAND driver in Linux. 4-bit ECC works fine in u-boot, but enabling 4-bit ECC in the Linux Davinci-NAND driver results in errors and mounting issues. We need to mount the root filesystem from NAND flash as jffs2. While analyzing the kernel source code, we found a comment saying that 4-bit ECC may not work with 16-bit NAND flash:
###########################################################
davinci_nand.c -> nand_davinci_probe()
        case NAND_ECC_HW:
                if (pdata->ecc_bits == 4) {
                        /* No sanity checks:  CPUs must support this,
                         * and the chips may not use NAND_BUSWIDTH_16.
                         */
###########################################################

Switching to 1-bit ECC in both Linux and u-boot solved the filesystem mounting issue. But we would like to have 4-bit ECC support in Linux, so that the u-boot image in NAND can be updated from Linux. The u-boot image must be written to NAND with its 4-bit ECC data in the OOB area.
Switching from 4-bit to 1-bit ECC also reduces the error detection and correction capability.

•    Is it possible to fix the above-mentioned issue in the davinci-nand driver for 16-bit NAND flash devices?
•    It also seems that the latest kernel version available for Linux-Davinci is 2.6.37. Are there any newer kernel versions (3.0 and above) available for the davinci family, especially am1808/omapl138?

  • Sreejith,

    We fixed almost the same issue a long time ago, so it is possible to fix this. I'm not sure about support for the latest kernel on the L138.

  • Hello Renjith,

    Can you please give a few more details on this?

    I would like to know, for example: what was the problem in the original kernel that led to this support being disabled? Which portion of the code did you modify to fix it?

    It would be nice if you could share any patch files for this fix, if available.

    Thanks,

    Sreejith

  • Sreejith,

    I did this more than a year ago for a customer. I don't have any of the code with me, as the work was done at the customer's location. They were having a lot of issues with UBIFS and we fixed all of them. I'm not really sure about your current kernel version and can't track all these changes, as I mostly work on the OMAP4 and DM81xx series.

    Please contact me at renjith.thomas@pathpartnertech.com. We can have this discussion offline, if you are okay with it.

  • Hello, I ran into the same problem as you. Have you been able to resolve the problem? If yes, may I ask you to tell me how?

    Thank you

  • Hi Patrick,

    Can you share more details about this? Which platform, what logs, etc.?

  • Hi all,

    Sorry for the late reply. We actually found the issue and implemented a workaround for it in the NAND driver. The details of the problem are given below:

    ########

    When a NAND block is erased, both the data area and the OOB area are filled with 0xffff.

    Now we try to write a jffs2 filesystem image to the NAND partition.

    While writing the jffs2 image, some of the NAND pages have to be filled with the 0xffff pattern.

    When 0xffff is written to NAND pages using the u-boot/Linux NAND driver, the corresponding 4-bit ECC is written to the OOB area. The ECC value for the 0xffff pattern is 0x3f2756f529d861d99d14 (whereas for 1-bit ECC the value is 0xff itself).

    The next time we write data other than 0xffff to such a page (which does not need to be erased, as its data is already 0xffff), the driver will also try to write the corresponding ECC to the OOB area.

    But since the OOB area is no longer in the erased state (it contains the value 3f2756f529d861d99d14), the new ECC value written to it will not be stored correctly.

    ########

    I have made a workaround to this issue:

    While writing the 0xffff pattern to a page, the ECC is forcibly written as the 0xff pattern as well (instead of 3f2756f529d861d99d14). After adding this fix to the driver, mounting from NAND flash works without the ECC errors.
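
The failure mode above follows from how NAND programming behaves: a program operation can only clear bits (1 to 0), and only a block erase sets them back to 1. The following is a minimal host-side sketch in C, not driver code, that models this and the idea behind the workaround. The names `nand_program`, `is_all_ff` and `select_ecc` are hypothetical, and `ECC_BYTES` is assumed to be 10 only because that matches the length of the ECC value quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ECC_BYTES 10 /* assumption: length of the ECC value quoted in the post */

/* Model of a NAND program operation: bits can only go from 1 to 0,
 * so the stored result is the bitwise AND of old and new contents. */
static void nand_program(uint8_t *cell, const uint8_t *data, size_t len)
{
	for (size_t i = 0; i < len; i++)
		cell[i] &= data[i];
}

/* Returns 1 if every byte of buf is 0xff, i.e. the chunk looks erased. */
static int is_all_ff(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return 0;
	return 1;
}

/* Workaround idea: for an all-0xff data chunk, store 0xff as the ECC
 * instead of the calculated value, so the OOB bytes stay in the erased
 * state and can still take a real ECC on a later write. */
static void select_ecc(const uint8_t *data, size_t len,
		       const uint8_t *calc_ecc, uint8_t *ecc_out)
{
	if (is_all_ff(data, len))
		memset(ecc_out, 0xff, ECC_BYTES);
	else
		memcpy(ecc_out, calc_ecc, ECC_BYTES);
}
```

In this model, programming the quoted ECC value into erased OOB bytes and then programming a different ECC over it leaves the AND of the two values, which is why the second ECC is not stored correctly. With `select_ecc`, an all-0xff chunk leaves the OOB bytes at 0xff, so the later ECC write lands on erased cells.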

    Regards,

    Sreejith

  • Hi Sreejith,

    My driver works well with 4-bit ECC when the NAND device has an 8-bit bus width. The problem arises when I try to use 4-bit ECC on our proprietary board, where we have the same NAND device but with a 16-bit bus width. On the other hand, 1-bit ECC works on our board. Is your fix related to the problem of using a 16-bit bus width with 4-bit ECC?

    Cheers

    Patrick

  • Hi Patrick,

    I am not sure whether the issue is specific to 16-bit NAND. I have not tested the original driver with 8-bit NAND.

    Can you please let me know what error you got while accessing the NAND? What data did you use to fill the "davinci_nand_pdata" structure to configure the driver?

    Regards,

    Sreejith

  • Hi Sreejith,

    I am sorry for not getting back to you for so long ... I had to work on something else but now I am back on my NAND problem.

    I ran a test this morning in which I wrote my rootfs as ubi.img to the NAND and then mounted it when starting Linux. The Linux version I am using is 2.6.37. If I do this with the following davinci_nand_pdata settings, everything goes fine:

    static struct davinci_nand_pdata omapl138nagra_nandflash_data = {
            .parts = omapl138nagra_nandflash_partition,
            .nr_parts = ARRAY_SIZE(omapl138nagra_nandflash_partition),
            .ecc_mode = NAND_ECC_HW, 
            .ecc_bits = 1,
            .options = NAND_USE_FLASH_BBT | NAND_BUSWIDTH_16,
            .timing = &omapl138nagra_nandflash_timing,
    };

    static struct davinci_aemif_timing omapl138nagra_nandflash_timing = {
            .wsetup         = 24,
            .wstrobe        = 21,
            .whold          = 14,
            .rsetup         = 19,
            .rstrobe        = 50,
            .rhold          = 0,
            .ta             = 20,
    };

    But if I set "ecc_bits" to 4, the rootfs cannot be mounted due to ECC errors. Here are the details:

    1. I write my ubi.img to the NAND as follows (to be able to do that, I temporarily boot the rootfs from my host):

      ubiformat /dev/mtd3 -f rootfsubi.img -s 512 -O 2048
    -> this reports no error

    2. Then, I boot Linux and try to mount the rootfs from the NAND which results in the following log messages:

    NAND device: Manufacturer ID: 0x2c, Chip ID: 0xbc (Micron )
    Creating 4 MTD partitions on "davinci_nand.0":
    0x000000000000-0x000000400000 : "Kernel-nand"
    0x000000400000-0x000000800000 : "Splash-nand"
    0x000000800000-0x000000900000 : "Fpga-nand"
    0x000000900000-0x000008900000 : "Rootfs-nand"
    davinci_nand davinci_nand.0: controller rev. 2.5
    UBI: attaching mtd3 to ubi0
    UBI: physical eraseblock size:   131072 bytes (128 KiB)
    UBI: logical eraseblock size:    126976 bytes
    UBI: smallest flash I/O unit:    2048
    UBI: sub-page size:              512
    UBI: VID header offset:          2048 (aligned 2048)
    UBI: data offset:                4096
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 0:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 1:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 1:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 2:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 2:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 3:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 3:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 4:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 4:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 5:0, read 64 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 5:2048, read 512 bytes
    UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 6:0, read 64 bytes
    etc. ...

    Another simple test I made is using a NAND test tool:

    nandtest -o 0 -l 0x8000000 -k /dev/mtd3

    with the following result:

    ECC corrections: 0
    ECC failures   : 264
    Bad blocks     : 0
    BBT blocks     : 0
    00000000: reading...
    ECC failed at 00000000
    00000000: checking...
    compare failed. seed 1804289383
    Byte 0x5c3c is 4a should be 50
    Byte 0x5d03 is 53 should be e3
    Byte 0x5d5f is 4c should be c0
    Byte 0x5de9 is dd should be 97

    Do you think your patch could help?

    Could my NAND timings be too short, so that the ECC engine does not have enough cycles for a correct computation?

    I will try the latest driver from Linux.org.

    Thanks for your help

    Cheers,

    Patrick

  • Hi again,

    I compared my davinci_nand.c with the one in Linux v3.8 and found no real difference ...

    Then I tried increasing the EMIFA NAND timings and used the following (the important one is rhold, which was 0):

    static struct davinci_aemif_timing omapl138nagra_nandflash_timing = {
            .wsetup         = 48,
            .wstrobe        = 42,
            .whold          = 28,
            .rsetup         = 38,
            .rstrobe        = 80,
            .rhold          = 28,
            .ta             = 40,
    };

    With these settings I could mount my UBI rootfs, but every 2nd or 3rd time I still got ECC errors, generally on only one PEB (not always the same one). This improved the situation dramatically, since before not a single PEB could be read without an ECC error, but it still does not solve the problem. It seems that the DaVinci ECC peripheral now has more time to digest the bytes coming from the NAND when reading ...
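
To get a feel for what these numbers mean on the bus, it can help to convert them to clock cycles. The sketch below assumes the `davinci_aemif_timing` fields are given in nanoseconds and that the AEMIF clock is 100 MHz; both are assumptions to check against the platform header and clock setup of the board in question. `ns_to_cycles` is a hypothetical helper, not a kernel function, and the value the driver actually programs into the timing register may be encoded differently.

```c
#include <stdint.h>

/* Round a duration in nanoseconds up to whole clock cycles.
 * clk_khz is the bus clock in kHz (assumed 100000 for 100 MHz below). */
static unsigned int ns_to_cycles(unsigned int ns, unsigned long clk_khz)
{
	/* cycles = ceil(ns * f); with f in kHz the product is divided by 1e6 */
	return (unsigned int)(((uint64_t)ns * clk_khz + 999999u) / 1000000u);
}
```

Under these assumptions (10 ns per cycle), raising rstrobe from 50 to 80 corresponds to going from 5 to 8 cycles, and raising rhold from 0 to 28 corresponds to going from 0 to 3 cycles.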

    Any thoughts are welcome ... I am running out of ideas ...

    Cheers

    Patrick



  • Hi Patrick,

    If you are getting ECC errors on the second mount attempt, the issue seems similar to the one I faced. You can try the following test.

    Use the nanddump utility to dump the content of the NAND partition to a file: nanddump -c -f file /dev/mtdN

    Now check the content and verify the following:

    If any NAND page is fully erased (data is 0xffff), then the corresponding ECC (check the OOB area) should also be 0xffff.

    Otherwise it will show ECC errors when mounting the second time.
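
The check described above can be automated instead of being done by eye. The following is a minimal sketch in C; it assumes a raw binary dump (not the `-c` text format) in which each page's data is immediately followed by its OOB bytes, and it assumes the caller supplies the OOB offsets of the ECC bytes, which depend on the driver's ECC layout. `count_suspect_pages` and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if all len bytes of buf are 0xff. */
static int all_ff(const uint8_t *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0xff)
			return 0;
	return 1;
}

/* Count pages whose data area is fully erased (all 0xff) but whose
 * ECC bytes in the OOB area are not all 0xff.
 * eccpos[] holds the OOB offsets of the ECC bytes (layout-dependent). */
static size_t count_suspect_pages(const uint8_t *dump, size_t dump_len,
				  size_t page_size, size_t oob_size,
				  const unsigned int *eccpos, size_t ecc_total)
{
	size_t stride = page_size + oob_size;
	size_t suspect = 0;

	if (stride == 0)
		return 0;

	for (size_t off = 0; off + stride <= dump_len; off += stride) {
		const uint8_t *data = dump + off;
		const uint8_t *oob = data + page_size;

		if (!all_ff(data, page_size))
			continue; /* page holds real data, ECC is expected */

		for (size_t i = 0; i < ecc_total; i++) {
			if (eccpos[i] < oob_size && oob[eccpos[i]] != 0xff) {
				suspect++;
				break;
			}
		}
	}
	return suspect;
}
```

A non-zero result means at least one page looks erased in its data area but already carries an ECC in OOB, which is the condition described above.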

    ~Sreejith

  • Hi again,

    Given below is the workaround that I have done to fix my issue.

    diff -Naur git_orig/drivers/mtd/nand/nand_base.c git/drivers/mtd/nand/nand_base.c
    --- git_orig/drivers/mtd/nand/nand_base.c       2012-11-21 11:08:54.044509860 +0530
    +++ git/drivers/mtd/nand/nand_base.c    2012-11-21 10:55:17.492550908 +0530
    @@ -1984,19 +1984,31 @@
     static void nand_write_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
                                      const uint8_t *buf)
     {
    -       int i, eccsize = chip->ecc.size;
    +       int i, j, eccsize = chip->ecc.size;
            int eccbytes = chip->ecc.bytes;
            int eccsteps = chip->ecc.steps;
    -       uint8_t *ecc_calc = chip->buffers->ecccalc;
    +       //uint8_t *ecc_calc = chip->buffers->ecccalc;
    +       uint8_t ecccalc[NAND_MAX_OOBSIZE];
    +       uint8_t *ecc_calc = ecccalc;
            const uint8_t *p = buf;
            uint32_t *eccpos = chip->ecc.layout->eccpos;

    +       for (i = 0; i < chip->ecc.total; i++)
    +               chip->buffers->ecccalc[i] = 0xff;
            for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
    +               ecc_calc = ecccalc;
                    chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
                    chip->write_buf(mtd, p, eccsize);
    +               for (j = 0; j < eccsize; j++){
    +                       if (p[j] != 0xff){
    +                               ecc_calc = chip->buffers->ecccalc;
    +                               break;
    +                       }
    +               }
                    chip->ecc.calculate(mtd, p, &ecc_calc[i]);
            }

    +       ecc_calc = chip->buffers->ecccalc;
            for (i = 0; i < chip->ecc.total; i++)
                    chip->oob_poi[eccpos[i]] = ecc_calc[i];

    ~Sreejith

  • Hello Sreejith,

    I added your patch to my u-boot driver, and it allows me to burn the UBI image from u-boot and mount it from Linux (which was not possible without it). Thanks a lot!

    For newer kernels (3.0 and higher) there seems to be a solution for the above problem; see http://www.linux-mtd.infradead.org/faq/ubifs.html#L_free_space_fixup.

    Regarding the spurious ECC errors, I found the bug. It was caused by overlapping DMA transfers on the same bus that the NAND is attached to.

    So I am quite happy now after a week of very hard debug ...

    Thanks again

    Patrick