Booting from NAND causes UBI error in EZSDK 5.04.00.11 using BCH8

Maynard Cabiente

Hello,

We are investigating the usage of NAND and NOR flash in DM814x and DM816x boards for prototyping our custom board. Right now, we are focusing on the NAND flash. We are using EZSDK 5.04.00.11 and have built u-boot (minimum and 2nd stage for DM814x), linux kernel, and filesystem (using UBI/UBIFS). We have successfully programmed all the images into the NAND flash of the DM814x with the flash memory map provided in

http://processors.wiki.ti.com/index.php/TI81XX_PSP_U-Boot#NAND_Layout

When trying to boot from NAND, we made sure that the SPI flash is disabled. The bootargs and bootcmd arguments that we provide for u-boot are the following:

setenv bootargs 'console=ttyO0,115200n8 noinitrd ip=off mem=256M rootwait=1 rw ubi.mtd=4,2048 rootfstype=ubifs root=ubi0:rootfs init=/init notifyk.vpssm3_sva=0xBF900000 earlyprintk'
setenv bootcmd 'nand read 0x81000000 0x00280000 0x300000;bootm 0x81000000'
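
As a quick sanity check on the `nand read` arguments above, the following sketch converts the offset and length into erase blocks. It assumes 128 KiB erase blocks (the `-p 128KiB` value used with ubinize elsewhere in this thread); the helper name is only illustrative.

```python
# Sanity-check the `nand read` window used in bootcmd.
# Assumption: 128 KiB erase blocks (the -p 128KiB value used with ubinize).
ERASE_BLOCK = 128 * 1024


def describe_read(offset, length, block=ERASE_BLOCK):
    """Return (first_block, block_count, end_offset) for a NAND read."""
    if offset % block or length % block:
        raise ValueError("read window is not erase-block aligned")
    return offset // block, length // block, offset + length


first, count, end = describe_read(0x00280000, 0x300000)
print(f"starts at block {first}, spans {count} blocks, ends at {end:#x}")
```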

On the first boot-up of the DM814x board after programming all the images into the NAND flash, the linux kernel already complained about the NAND flash having ECC errors as seen below.

UBIFS: start fixing up free space

UBIFS: free space fixup complete
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root: 0 bytes (0 KiB)
VFS: Mounted root (ubifs filesystem) on device 0:14.
devtmpfs: error mounting -2
Freeing init memory: 208K
Failed to execute /init. Attempting defaults...
INIT: version 2.84 booting
UBI: run torture test for PEB 1600
UBI error: ubi_io_read: error -74 (ECC error) while reading 131072 bytes from PEB 1600:0, read 131072 bytes
UBI error: torture_peb: read problems on freshly erased PEB 1600, must be bad
UBI error: erase_worker: failed to erase PEB 1600, error -5
UBI: mark PEB 1600 as bad
UBI: 10 PEBs left in the reserve
UBI: run torture test for PEB 985
UBI error: ubi_io_read: error -74 (ECC error) while reading 131072 bytes from PEB 985:0, read 131072 bytes
UBI error: torture_peb: read problems on freshly erased PEB 985, must be bad
UBI error: erase_worker: failed to erase PEB 985, error -5
UBI: mark PEB 985 as bad
UBI: 9 PEBs left in the reserve

After a reboot, the ECC error gets worse as seen below:

UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 278:4096, read 126976 bytes

UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 278
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 534:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 1001:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 1001
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 535:4096, read 126976 bytes
UBIFS: recovery needed
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 566:49152, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 274:45056 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 536:8192, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 8:4096 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 1:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 92160 bytes from PEB 565:38912, read 92160 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 118784 bytes from PEB 0:12288, read 118784 bytes
UBIFS: recovery completed
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: zlib
UBIFS: reserved for root: 0 bytes (0 KiB)
VFS: Mounted root (ubifs filesystem) on device 0:14.
devtmpfs: mounted
Freeing init memory: 208K
Failed to execute /init. Attempting defaults...
INIT: version 2.84 booting
UBI: run torture test for PEB 279
UBI error: ubi_io_read: error -74 (ECC error) while reading 131072 bytes from PEB 279:0, read 131072 bytes
UBI error: torture_peb: read problems on freshly erased PEB 279, must be bad
UBI error: erase_worker: failed to erase PEB 279, error -5
UBI: mark PEB 279 as bad
UBI: 8 PEBs left in the reserve
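
For readers decoding the logs: the negative numbers are Linux errno values. A minimal lookup sketch follows; the table is hard-coded with the Linux values rather than taken from the platform's errno module, and the descriptions are my own summaries.

```python
# Linux errno values that appear in the UBI messages above.
# Hard-coded so the result does not depend on the platform running this script.
LINUX_ERRNO = {
    5: ("EIO", "I/O error"),
    74: ("EBADMSG", "bad message; MTD returns it for uncorrectable ECC errors"),
}


def decode(code):
    """Turn a kernel-style negative error code into a readable string."""
    name, meaning = LINUX_ERRNO[abs(code)]
    return f"{code}: {name} ({meaning})"


for code in (-74, -5):
    print(decode(code))
```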

Can this issue be fixed in the latest EZSDK 5.04.00.11? Is the computation in the linux kernel for BCH8 correct in this release? Is there a way to configure the kernel and u-boot to use Hamming instead of BCH8 if BCH8 has some issues?

The same issue happens in our DM816x board as well. I'm assuming that this is not a hardware issue but more likely a software bug in the kernel.

Any help is appreciated to determine if NAND flash is a viable option for us.

Regards,
Maynard

over 13 years ago

0 Pavel Botev over 13 years ago

TI__Guru**** 170625 points

Hi Maynard,

I successfully boot the DM816X from NAND with UBIFS, using BCH8 (which is the recommended ECC).

I followed these documents:

http://processors.wiki.ti.com/index.php/UBIFS_Support

http://processors.wiki.ti.com/index.php/TI81XX_PSP_U-Boot

http://processors.wiki.ti.com/index.php/TI81XX_PSP_User_Guide

The bootargs and bootcmd that work for me are:

TI8168_EVM# setenv bootargs 'console=ttyO2,115200n8 noinitrd ip=off mem=256M rw ubi.mtd=7,2048 rootfstype=ubifs root=ubi0:rootfs init=/init'

TI8168_EVM# setenv bootcmd 'nand read.i 0x81000000 0x00280000 0x500000;bootm 0x81000000'

Can you try with these?

BR,
Pavel

0 Maynard Cabiente over 13 years ago in reply to Pavel Botev

Intellectual 460 points

Hi Pavel,

Thanks for the suggestion. We did follow the instructions in the documents you mentioned. But it just gives us errors from the start.

We created the UBI image using these commands:
> mkfs.ubifs -m 2048 -e 124KiB -c 1580 -F -x zlib -r 'work/flashdisk_tmp' -o ubifs.img
> ubinize -o 'work/flashdisk.ubifs' -m 2048 -p 128KiB -s 2048 -O 2048 ubinize.cfg

Our ubinize.cfg file has these contents:

[ubifs]
mode=ubi
image=ubifs.img
vol_id=0
vol_size=192MiB
vol_type=dynamic
vol_name=rootfs
vol_flags=autoresize

If there is something wrong with these commands to create the UBI image, please let me know. We are using zlib compression instead of the default LZO as it compresses better.

According to the UBIFS Support documentation, there are supposed to be no sub-page writes. Why, then, does the same documentation use the ubinize command with a sub-page size (-s 512)? Is there a sub-page for this NAND flash or not?

In our case, we tried both -s 512 and -s 2048 for the sub-page. But, it did not matter.
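
One way to see why the sub-page value may not change the image geometry here is to work out the LEB size by hand. The sketch below is not taken from the EZSDK sources. It assumes 64-byte EC and VID headers, that the default VID header offset is the EC header size rounded up to the sub-page size, and that data starts at the next min-I/O boundary after the VID header.

```python
# Sketch of how the LEB size follows from the ubinize geometry options.
# Assumptions: 64-byte EC and VID headers and the alignment rules
# described in the lead-in; not copied from the EZSDK sources.
EC_HDR_SIZE = 64
VID_HDR_SIZE = 64


def align_up(value, unit):
    return (value + unit - 1) // unit * unit


def leb_size(peb_size, min_io, sub_page, vid_hdr_offset=None):
    """Return (data_offset, leb_size) for a given geometry."""
    if vid_hdr_offset is None:
        # Default: the VID header follows the (aligned) EC header.
        vid_hdr_offset = align_up(EC_HDR_SIZE, sub_page)
    data_offset = align_up(vid_hdr_offset + VID_HDR_SIZE, min_io)
    return data_offset, peb_size - data_offset


PEB = 128 * 1024
for label, sub_page, vid in [
    ("-s 2048 -O 2048", 2048, 2048),
    ("-s 512  -O 2048", 512, 2048),
    ("-s 512  (no -O)", 512, None),
]:
    offset, leb = leb_size(PEB, 2048, sub_page, vid)
    print(f"{label}: data offset {offset}, LEB {leb} bytes ({leb / 1024:g} KiB)")
```

Under these assumptions, forcing `-O 2048` yields the same LEB size for both sub-page values, which matches the `-e 124KiB` passed to mkfs.ubifs. This only shows that the image layout is unchanged; it says nothing about whether the NAND driver itself handles sub-page writes correctly.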

I'm guessing that you or any other TI employees have encountered this issue? However, I did see 2 or 3 posts (different people) who have the same issue in the TI forum.

I just want to be sure that the issue is not because of the UBI image creation. Can you please verify the commands that we used?

Also, if your images work, is there some way I can get your u-boot, linux kernel, and UBI image so I can try it myself for my own verification?

Thanks,
Maynard

0 Maynard Cabiente over 13 years ago in reply to Maynard Cabiente

Intellectual 460 points

Correction:

I'm guessing that you or any other TI employees have "NOT" encountered this issue? However, I did see 2 or 3 posts (different people) who have the same issue in the TI forum.

-Maynard

0 Pavel Botev over 13 years ago in reply to Maynard Cabiente

TI__Guru**** 170625 points

Hi Maynard,

These are my files:

1. u-boot.noxip.bin 3683.u-boot.noxip.bin

2. linux kernel 3377.uImage

3. ubi image 1385.ubi.img

For creating the ubi.img I am using exactly the commands from the http://processors.wiki.ti.com/index.php/UBIFS_Support:

mtd-utils# mkfs.ubifs/mkfs.ubifs -r filesystem/ -F -o ubifs.img -m 2048 -e 126976 -c 1580

mtd-utils# ubi-utils/ubinize -o ubi.img -m 2048 -p 128KiB -s 512 -O 2048 ubinize.cfg

For the file system I am using the contents of arago-base-tisdk-image-dm816x-evm.tar.gz, which comes with the EZSDK. The Linux kernel is the default one, supplied as a pre-built image.

With these, I boot successfully from NAND with UBIFS, without any UBI/UBIFS warnings or errors. But when I reboot (boot for the second time), I get some UBI/UBIFS error reports. These error messages do not block the boot process and I end up with a successful boot, but the system may be unstable!
I am checking this with our PSP team.

Best Regards,
Pavel

0 Chris P over 13 years ago in reply to Pavel Botev

Intellectual 505 points

Pavel,

Thank you for finally looking into this. We've been suffering from terrible file system stability since the Appro 2.0 release (based on the same PSP). The "fix" move to UBIFS made things worse and the first attempt at a patch in Appro 3.0 didn't help at all.

I have reverted to using JFFS2 with 1-bit hamming ECC, but that's only slightly more stable.

Please, please, please get the file system stable.

If you search around the forums you will find many other people trying to figure out how to make this stuff work reliably.

Thanks,
Chris

0 Maynard Cabiente over 13 years ago in reply to Pavel Botev

Intellectual 460 points

Hi Pavel,

Thanks for the images. I tried them on our DM816x board and encountered the same issue we're having. As you mentioned, the first boot did not have an issue with your images. But on the second and all subsequent boots, the issue is always present. Here are snippets of the UBI error messages I get from each boot.

2nd BOOT

--------

UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 152
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 185:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 153:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 153
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 186:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 271:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 271
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 2:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 219:104448, read 26624 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 102400 bytes from PEB 0:28672, read 102400 bytes
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: lzo
UBIFS: reserved for root: 0 bytes (0 KiB)

3rd BOOT
--------
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 152
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 185:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 153:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 153
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 186:4096, read 126976 bytes
UBIFS: recovery needed
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 220:34816, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 149:30720 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 187:10240, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 8:6144 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 271:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 271
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 2:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 219:104448, read 26624 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 102400 bytes from PEB 0:28672, read 102400 bytes
UBIFS: recovery completed
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: lzo
UBIFS: reserved for root: 0 bytes (0 KiB)

4th BOOT
--------
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 152
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 185:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 153:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 153
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 186:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 271:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 271
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 7:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 4096 bytes from PEB 662:126976, read 4096 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 88064 bytes from PEB 663:43008, read 88064 bytes
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: lzo
UBIFS: reserved for root: 0 bytes (0 KiB)
VFS: Mounted root (ubifs filesystem) on device 0:15.

5th BOOT
--------
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 152
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 185:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 153:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 153
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 186:4096, read 126976 bytes
UBIFS: recovery needed
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 660:47104, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 149:43008 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 2048 bytes from PEB 661:22528, read 2048 bytes
UBIFS error (pid 1): ubifs_leb_read: reading 2048 bytes from LEB 8:18432 failed, error -74
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 271:4096, read 126976 bytes
UBI warning: ubi_eba_copy_leb: error -74 while reading data from PEB 271
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 9:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 124928 bytes from PEB 8:6144, read 124928 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 83968 bytes from PEB 663:47104, read 83968 bytes
UBIFS: recovery completed
UBIFS: mounted UBI device 0, volume 0, name "rootfs"
UBIFS: file system size: 199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
UBIFS: journal size: 9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
UBIFS: media format: w4/r0 (latest is w4/r0)
UBIFS: default compressor: lzo
UBIFS: reserved for root: 0 bytes (0 KiB)
VFS: Mounted root (ubifs filesystem) on device 0:15.

The PEB locations have some difference in each boot sequence. In our case with our own images, sooner or later, the DM816x board will stop booting linux from NAND flash and will get stuck in a kernel crash.
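
To compare which PEBs fail from boot to boot, a small script along these lines could be run over the captured console logs. This is only a sketch: the regular expression is written against the message format shown above, and the two sample strings are trimmed excerpts of the 2nd and 5th boot logs.

```python
import re

# Matches lines such as:
# UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, ...
PEB_RE = re.compile(r"ubi_io_read: error -74 .* from PEB (\d+):(\d+)")


def failing_pebs(log_text):
    """Return the set of PEB numbers that reported ECC read errors."""
    return {int(m.group(1)) for m in PEB_RE.finditer(log_text)}


# Excerpts from the 2nd and 5th boot logs (trimmed for brevity).
boot2 = """
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 2:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 26624 bytes from PEB 219:104448, read 26624 bytes
"""
boot5 = """
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 152:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 9:4096, read 126976 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 83968 bytes from PEB 663:47104, read 83968 bytes
"""

a, b = failing_pebs(boot2), failing_pebs(boot5)
print("in both boots:", sorted(a & b))
print("only in one boot:", sorted(a ^ b))
```

PEBs that fail on every boot and PEBs that move around may point to different causes, so it can help to keep the two groups apart when reporting.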

Maybe you could give the same images that you posted here to the PSP team so they realize that there is an actual problem. Please also tell them to keep rebooting the DM816x or DM814x board in order to reproduce the problem. Don't stop at the first boot. In our case, even the first boot shows the issue.

Oh, and by the way, Renjith Thomas, a TI community member, was more than helpful in trying to fix this issue. He gave me his own kernel patches, which actually made the errors go away on both the DM816x and DM814x boards. He said that he has already sent the patch to TI but was not sure whether they will do anything with it.

I am attaching the patch to see if your PSP team can verify Renjith's possible fix in this matter.

6283.renjith_thomas_ubi_bch8_patch.txt

Index: arch/arm/mach-omap2/gpmc.c
===================================================================
--- arch/arm/mach-omap2/gpmc.c	(revision 254089)
+++ arch/arm/mach-omap2/gpmc.c	(working copy)
@@ -854,7 +854,7 @@
 			bch_mod = 0;
 			bch_wrapmode = 0x09;
 		} else if (ecc_type == OMAP_ECC_BCH8_CODE_HW) {
-			eccsize1 = 0x2; eccsize0 = 0x1A;
+			eccsize1 = 0x10; eccsize0 = 0x0;
 			bch_mod = 1;
 			bch_wrapmode = 0x01;
 		} else
@@ -870,7 +870,7 @@
 			bch_mod = 0;
 			bch_wrapmode = 0x06;
 		} else if (ecc_type == OMAP_ECC_BCH8_CODE_HW) {
-			eccsize1 = 0x1c; eccsize0 = 0x00;
+			eccsize1 = 0x10; eccsize0 = 0x0;
 			bch_mod = 1;
 			bch_wrapmode = 0x01;
 		} else
Index: drivers/mtd/nand/omap2.c
===================================================================
--- drivers/mtd/nand/omap2.c	(revision 254089)
+++ drivers/mtd/nand/omap2.c	(working copy)
@@ -958,6 +958,9 @@
 		eccsize = BCH8_ECC_OOB_BYTES;
 
 		for (i = 0; i < blockCnt; i++) {
+			if (memcmp(read_ecc, calc_ecc, 13) == 0) {
+				continue;
+			}
 			eccflag = 0;
 			/* check if area is flashed */
 			for (j = 0; (j < eccsize) && (eccflag == 0); j++)
@@ -1237,7 +1240,6 @@
 		} else if (pdata->ecc_opt == OMAP_ECC_BCH8_CODE_HW) {
 			info->nand.ecc.bytes     = 14;
 			info->nand.ecc.size      = 512;
-			info->nand.ecc.read_page = omap_read_page_bch;
 		} else {
 			info->nand.ecc.bytes    = 3;
 			info->nand.ecc.size     = 512;
Index: drivers/mtd/nand/nand_base.c
===================================================================
--- drivers/mtd/nand/nand_base.c	(revision 254089)
+++ drivers/mtd/nand/nand_base.c	(working copy)
@@ -3410,6 +3410,7 @@
 			break;
 		}
 	}
+	mtd->subpage_sft = 0;
 	chip->subpagesize = mtd->writesize >> mtd->subpage_sft;
 
 	/* Initialize state */
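
A note on the numbers in the patch: it compares 13 ECC bytes per 512-byte step even though `ecc.bytes` is set to 14. The rough sketch below shows where 13 could come from, assuming the BCH8 engine works over GF(2^13). That is an assumption based on standard BCH parameters for 512-byte sectors, not something stated in this thread, and the 2048-byte page size is the `-m 2048` value used with mkfs.ubifs.

```python
# A BCH code over GF(2^m) that corrects t bit errors needs at most
# m * t parity bits. Assumptions: m = 13 and a 2048-byte page.

def bch_parity_bytes(m, t):
    """Parity bytes for a t-error-correcting BCH code over GF(2^m)."""
    return (m * t + 7) // 8


SECTOR = 512     # info->nand.ecc.size in the patch
ECC_BYTES = 14   # info->nand.ecc.bytes in the patch
PAGE = 2048

parity = bch_parity_bytes(13, 8)
steps = PAGE // SECTOR

# The code word (data bits + parity bits) must fit in 2^m - 1 bits.
assert SECTOR * 8 + 13 * 8 <= 2**13 - 1

print(f"BCH8 parity per sector: {parity} bytes")
print(f"OOB bytes used for ECC per page: {steps * ECC_BYTES}")
```

If that assumption holds, the 14th byte would be padding or a reserved byte; the thread does not say which.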

Can you please try this patch out on your end and send it to your PSP team for verification?

Maynard

0 Chris P over 13 years ago in reply to Maynard Cabiente

Intellectual 505 points

I too received valuable help from Renjith Thomas. And I'm not speaking negatively of his contribution in any way, but I still had issues with UBIFS/BCH8 on my board. JFFS2/1-Bit Hamming still needs to be fixed.

The entire system needs to be reviewed, fixed, and verified.

Chris

0 jiangao peng101018 over 13 years ago in reply to Pavel Botev

Intellectual 310 points

I am using BCH8 and UBIFS, and it works well now. I wanted to use Hamming-code ECC for better NAND speed, but that failed!

1: you must use linux-2.6.37-psp04.04.x

2: you must use bch8

3: mkfs.ubifs -U -F -r ${src} -m 2048 -e 129024 -c ${BLOCK_NUM} -o ubifs.img

ubinize -o dest.data -m 2048 -p ${blocksize} -s 512 UBI_CFG

0 Pavel Botev over 13 years ago in reply to Chris P

TI__Guru**** 170625 points

Hello,

I applied this patch, plus one more: 3122.0003-nand-ubifs.patch

This patch (nand-ubifs.patch) also came from Renjith Thomas, who has great expertise in this field.

Now I am booting successfully without any UBI/UBIFS errors being reported, across several boots (I tested it 5 times).

This is the new kernel image, after applying the two patches: 4846.uImage

I tested with the DM8168 TI EVM.

Could you try this on your side?

Best Regards,

Pavel

0 Pavel Botev over 13 years ago in reply to Pavel Botev

TI__Guru**** 170625 points

Hi Jiangao,

I tried the patched kernel, and it works with HW ECC Hamming code also.

What is your current NAND speed in u-boot and in kernel? What NAND speed do you expect to achieve?

Best Regards,

Pavel

0 Maynard Cabiente over 13 years ago in reply to Pavel Botev

Intellectual 460 points

Hi Pavel,

I'm not sure if I will be able to try your new kernel today as I am in the middle of something. If not today, then I will definitely test it on Monday. I will also include the new patch in our codebase.

Can TI PSP team respond if this is the correct fix for these UBI errors? Are these patches already included in the current TI PSP kernel? Would you also know when will there be a new EZSDK/PSP for the DM81xx products?

Regards,
Maynard

0 jiangao peng101018 over 13 years ago in reply to Pavel Botev

Intellectual 310 points

Hi Pavel:

My NAND speed in u-boot is 2M/s and in the kernel is 5M/s.

I have tested the HW ECC Hamming code; it works at 10M/s, but it cannot work with UBIFS.

I am using 8-bit nand.

Best regards.

0 Pavel Botev over 12 years ago in reply to jiangao peng101018

TI__Guru**** 170625 points

Hi Maynard,

"Can TI PSP team respond if this is the correct fix for these UBI errors?" I reported these fixes to them, so they should examine and consider them.

"Are these patches already included in the current TI PSP kernel?" No, these patches are not included in the current TI PSP kernel. I reported these fixes to them, so they should be considered for the next release. There are 3 more official patches that will be included in the next release (unless they decide to revert any of these 3):

http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=adc46d691d745604da1197d154fe712e10ec468d

http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=439196c951fbec3cca596fc45389de4512958595

http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=209d8a31904f072b4d6d3d7511bcbd58e016f430

"Would you also know when will there be a new EZSDK/PSP for the DM81xx products?" I think these are on a 6-month basis, so the next one should be in October. I will check this with them.

Hi Jiangao,

With the new kernel and patch files (attached in my post from 14-Sept-2012, 12:22 PM), the HW ECC Hamming code works fine with UBIFS.

Best Regards,

Pavel

0 Renjith Thomas over 12 years ago in reply to jiangao peng101018

Guru 31670 points

Peng,

The NAND throughput can be improved to at least 6MB/sec in u-boot and the kernel with the BCH8 algorithm itself. If you are worried about throughput when using BCH8, I don't think this is a big issue at all.
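
To put these figures in perspective, here is a rough estimate of the time spent loading the kernel in u-boot at the 2 MB/s reported earlier in this thread versus 6 MB/s. It assumes sustained throughput, 1 MB = 10^6 bytes, and the 0x500000-byte `nand read.i` length from the bootcmd posted above.

```python
# Rough load-time estimate for the kernel read in bootcmd.
# Assumptions: throughput is sustained and 1 MB/s = 10**6 bytes/s.
KERNEL_READ_BYTES = 0x500000  # length used in `nand read.i ... 0x500000`


def load_seconds(nbytes, mb_per_s):
    """Seconds needed to read nbytes at the given MB/s rate."""
    return nbytes / (mb_per_s * 1_000_000)


for rate in (2, 6):
    print(f"{rate} MB/s: {load_seconds(KERNEL_READ_BYTES, rate):.2f} s")
```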

0 jiangao peng101018 over 12 years ago in reply to Renjith Thomas

Intellectual 310 points

Hi Renjith Thomas

We are using 8-bit NAND. We changed the GPMC values and get only 2MB/s in u-boot and 5MB/s in the kernel.

Do you have any ideas for improving the NAND speed?

Best regards.

0 Renjith Thomas over 12 years ago in reply to jiangao peng101018

Guru 31670 points

Hi Peng,

We have to optimize the NAND driver read code, tune the NAND timings, etc. We also need to look at the exact NAND part and whether it has internal ECC support. Depending on all these parameters, we can improve the throughput to a good extent in u-boot, even without using DMA. Writing some code in assembly has also helped a lot.

0 Maynard Cabiente over 12 years ago in reply to Pavel Botev

Intellectual 460 points

Hi Pavel,

Thanks for posting those patches. We will either try them or wait for the next release in Oct.

I did test your new kernel image. At first, the UBI errors disappeared completely. And most of the time, the errors are non-existent.

However, I did encounter the same error during some quick testing by creating files and directories and then rebooting the system. I tried several times again to reproduce the problem by creating/removing files/directories but was unable to do so again. But, the issue I have is that I encountered it twice on boot-up. Here are the errors:

random boot-up 1:

UBI: attaching mtd7 to ubi0
UBI: physical eraseblock size: 131072 bytes (128 KiB)
UBI: logical eraseblock size: 126976 bytes
UBI: smallest flash I/O unit: 2048
UBI: VID header offset: 2048 (aligned 2048)
UBI: data offset: 4096
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 1031:0, read 64 bytes
UBI: max. sequence number: 968
UBI warning: print_rsvd_warning: cannot reserve enough PEBs for bad PEB handling, reserved 11, need 16
UBI: attached mtd7 to ubi0
UBI: MTD device name: "File System"
UBI: MTD device size: 200 MiB
UBI: number of good PEBs: 1601
UBI: number of bad PEBs: 0
UBI: number of corrupted PEBs: 0
UBI: max. allowed volumes: 128
UBI: wear-leveling threshold: 4096
UBI: number of internal volumes: 1
UBI: number of user volumes: 1
UBI: available PEBs: 0
UBI: total number of reserved PEBs: 1601
UBI: number of PEBs reserved for bad PEB handling: 11
UBI: max/mean erase counter: 29/1
UBI: image sequence number: 458701251
UBI: background thread "ubi_bgt0d" started, PID 43

random boot-up 2:

UBI: attaching mtd7 to ubi0
UBI: physical eraseblock size: 131072 bytes (128 KiB)
UBI: logical eraseblock size: 126976 bytes
UBI: smallest flash I/O unit: 2048
UBI: VID header offset: 2048 (aligned 2048)
UBI: data offset: 4096
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 1513:0, read 64 bytes
UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 1533:0, read 64 bytes
UBI: max. sequence number: 2014
UBI warning: print_rsvd_warning: cannot reserve enough PEBs for bad PEB handling, reserved 11, need 16
UBI: attached mtd7 to ubi0
UBI: MTD device name: "File System"
UBI: MTD device size: 200 MiB
UBI: number of good PEBs: 1601
UBI: number of bad PEBs: 0
UBI: number of corrupted PEBs: 0
UBI: max. allowed volumes: 128
UBI: wear-leveling threshold: 4096
UBI: number of internal volumes: 1
UBI: number of user volumes: 1
UBI: available PEBs: 0
UBI: total number of reserved PEBs: 1601
UBI: number of PEBs reserved for bad PEB handling: 11
UBI: max/mean erase counter: 30/1
UBI: image sequence number: 458701251
UBI: background thread "ubi_bgt0d" started, PID 43
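
The "reserved 11, need 16" warning in both logs is consistent with UBI wanting about 1% of the good PEBs for bad-block handling. The sketch below assumes the kernel of this era computes the target as `good_pebs / 100 * CONFIG_MTD_UBI_BEB_RESERVE` with integer arithmetic and that the option is left at its default of 1; neither is confirmed in this thread.

```python
# Where "reserved 11, need 16" may come from.
# Assumption: the kernel reserves CONFIG_MTD_UBI_BEB_RESERVE percent
# (default 1) of the good PEBs, using integer arithmetic.
BEB_RESERVE_PERCENT = 1  # assumed default


def beb_reserve_level(good_pebs, percent=BEB_RESERVE_PERCENT):
    """Number of PEBs UBI would like to keep for bad-block handling."""
    return good_pebs // 100 * percent


good_pebs = 1601  # "number of good PEBs" in the log
reserved = 11     # "number of PEBs reserved for bad PEB handling"
needed = beb_reserve_level(good_pebs)
print(f"needed {needed}, reserved {reserved}, short by {needed - reserved}")
```

If this reading is right, the warning means the volume has taken nearly all of the partition, leaving fewer spare PEBs than UBI prefers. It would not be a sign of new corruption by itself.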

At least, there are no bad or corrupted PEBs even after these UBI errors. However, I am concerned that the errors are still present.

Does the kernel image that you provided include all the patches that are relevant to BCH8 and UBI?

Regards,
Maynard

0 Pavel Botev over 12 years ago in reply to Maynard Cabiente

TI__Guru**** 170625 points

Hi Maynard,

These 3 patches (from the arago-project) were not included in the kernel image that I last provided.

Could you please try with a pure kernel image (without any patches) plus only this patch:

http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=adc46d691d745604da1197d154fe712e10ec468d

Please let me know if these UBI errors are still present.

Best Regards,

Pavel

0 Maynard Cabiente over 12 years ago in reply to Pavel Botev

Intellectual 460 points

Hi Pavel,

I did what you asked. I reverted back the patches that Renjith gave me and just included the 1 patch you recommended.

The good news is that the original issue [UBI error: ubi_io_read: error -74 (ECC error) while reading 131072 bytes from PEB 1600:0, read 131072 bytes] has not surfaced (yet). I created and removed files and directories and then rebooted afterwards without encountering the original issue.

The bad news is that I did get a kernel crash one time. Below is the kernel crash I received while trying to remove certain files.

root@dm816x-evm:~/test# rm hugefile.*
UBIFS error (pid 1437): ubifs_check_node: bad CRC: calculated 0x442235c3, read 0x379e79ed
UBIFS error (pid 1437): ubifs_check_node: bad node at LEB 1217:116160
UBIFS error (pid 1437): ubifs_read_node: expected node type 9
UBIFS warning (pid 1437): ubifs_ro_mode: switched to read-only mode, error -117
Backtrace:
[<c004a2bc>] (dump_backtrace+0x0/0x110) from [<c03b6800>] (dump_stack+0x18/0x1c)
r6:ffffff8b r5:cc56c480 r4:ccbf0000 r3:c0526684
[<c03b67e8>] (dump_stack+0x0/0x1c) from [<c017baa0>] (ubifs_ro_mode+0x6c/0x78)
[<c017ba34>] (ubifs_ro_mode+0x0/0x78) from [<c0172ed8>] (ubifs_jnl_delete_inode+0x9c/0xbc)
[<c0172e3c>] (ubifs_jnl_delete_inode+0x0/0xbc) from [<c0177bf0>] (ubifs_evict_inode+0x7c/0xf0)
r7:00000000 r6:cc56c480 r5:ccbf0000 r4:cc56c480
[<c0177b74>] (ubifs_evict_inode+0x0/0xf0) from [<c00dc2d8>] (evict+0x28/0x9c)
r5:00000000 r4:cc56c480
[<c00dc2b0>] (evict+0x0/0x9c) from [<c00dc850>] (iput+0x1f8/0x240)
r4:ccbcc800 r3:cc56c498
[<c00dc658>] (iput+0x0/0x240) from [<c00d4570>] (do_unlinkat+0xec/0x144)
r6:cc56aa80 r5:00000000 r4:cc56c480 r3:00000000
[<c00d4484>] (do_unlinkat+0x0/0x144) from [<c00d5720>] (sys_unlink+0x18/0x1c)
r7:0000000a r6:bef58c40 r5:bef58dad r4:00000000
[<c00d5708>] (sys_unlink+0x0/0x1c) from [<c00466a0>] (ret_fast_syscall+0x0/0x30)
UBIFS error (pid 1437): ubifs_evict_inode: can't delete inode 201941, error -117
rm: remove 'hugefile.12'? UBIFS error (pid 272): make_reservation: cannot reserve 160 bytes in jhead 1, error -30
UBIFS error (pid 272): ubifs_write_inode: can't write inode 4896, error -30

Once I hit this issue, there is no way to remove the corrupted inode. Just to make sure that any other errors are not due to this kernel crash, I re-programmed the root filesystem partition again.

I tried reproducing the issue again afterwards but I couldn't anymore. I will try testing again next week.

So, for now, does the TI PSP team recommend to just use that one patch and discard Renjith's patches to fix the original UBI error that we encountered?

http://arago-project.org/git/projects/?p=linux-omap3.git;a=commit;h=adc46d691d745604da1197d154fe712e10ec468d

Regards,
Maynard

0 Leon Pollak over 10 years ago in reply to Pavel Botev

Intellectual 960 points

Hello, Pavel and all.

We can't overcome the NAND-BCH8 problems.

We took the latest Arago git kernel (from Oct 2013), which is supposed to contain all known patches for all known issues, and we still get ECC errors on simple nand_write / nand_dump operations. Although the dumped content is exactly the same as what was written, we see ECC error messages on the console.

We also noted that the patches from Thomas discussed here do not seem to appear in the Arago git repository.

Can somebody shed some light on the situation? How can a working kernel with some file system (UBIFS or JFFS2 with an external clean marker) be obtained?

Thanks!

0 Pavel Botev over 10 years ago in reply to Leon Pollak

TI__Guru**** 170625 points

Leon,

Leon Pollak said:
we still get ECC errors on simple nand_write / nand_dump operations. Although the dumped content is exactly the same as what was written, we see ECC error messages on the console.

Can you provide the console log? Can you try with another NAND chip?

In the thread below you will find information on flashing UBIFS to NAND:

ONET8501V: long term availability - Interface forum - Interface - TI E2E support forums


Regards,
Pavel

0 Leon Pollak over 10 years ago in reply to Pavel Botev

Intellectual 960 points

Hello, Pavel.

Our problem with the ECC errors was in pure write/read operations: we booted via NFS and ran nand_write and nand_dump. nand_dump reported ECC errors while reading back the same data, although the generated file passed comparison with the original.

IMHO, this is the major problem. Still, we tried to follow the instructions from the link you provided today.
Please see the log of your sequence below. Note that ubiattach produces several errors, of which the incorrect LEB size seems to be the most important.

1. -----------------
./mkfs.ubifs -v -r ./ubifs -m 2048 -e 126976 -c 1601 -o ./ubifs.img
mkfs.ubifs
root: ./ubifs/
min_io_size: 2048
leb_size: 126976
max_leb_cnt: 1601
output: ./ubifs.img
jrn_size: 8388608
reserved: 0
compr: lzo
keyhash: r5
fanout: 8
orph_lebs: 1
space_fixup: 0
super lebs: 1
master lebs: 2
log_lebs: 5
lpt_lebs: 2
orph_lebs: 1
main_lebs: 494
gc lebs: 1
index lebs: 13
leb_cnt: 505
UUID: 7588F2F4-D951-4EE8-ACB3-5D0C25AC6BFE
Success!

2. ---------------
cat ./ubinize.cfg
[ubifs]
mode=ubi
image=./ubifs.img
vol_id=0
vol_size=160MiB
vol_type=dynamic
vol_name=ubi_rootfs
vol_flags=autoresize

3. --------------------
./ubinize -v -o ./ubi.img -m 2048 -p 128KiB -s 2048 ./ubinize.cfg
ubinize: LEB size: 126976
ubinize: PEB size: 131072
ubinize: min. I/O size: 2048
ubinize: sub-page size: 2048
ubinize: VID offset: 2048
ubinize: data offset: 4096
ubinize: UBI image sequence number: 197615993
ubinize: loaded the ini-file "./ubinize.cfg"
ubinize: count of sections: 1

ubinize: parsing section "ubifs"
ubinize: mode=ubi, keep parsing
ubinize: volume type: dynamic
ubinize: volume ID: 0
ubinize: volume size: 167772160 bytes
ubinize: volume name: ubi_rootfs
ubinize: volume alignment: 1
ubinize: autoresize flags found
ubinize: adding volume 0
ubinize: writing volume 0
ubinize: image file: ./ubifs.img

ubinize: writing layout volume
ubinize: done

---------------------------------------------------------------------------------
In the board (embedded):
---------------------------------------------------------------------------------
4.
root@dm814x-evm:~/UBI# flash_erase /dev/mtd4 0 0
Erasing 128 Kibyte @ c7a0000 -- 99 % complete flash_erase: Skipping bad block at 0c7c0000
Erasing 128 Kibyte @ c800000 -- 100 % complete

5.
root@dm814x-evm:~/UBI# ./ubiformat /dev/mtd4 -f ./ubi.img -s 512 -O 2048
ubiformat: mtd4 (nand), size 209846272 bytes (200.1 MiB), 1601 eraseblocks of 131072 bytes (128.0 KiB), min. I/O size 2048 bytes
libscan: scanning eraseblock 1600 -- 100 % complete
ubiformat: 1600 eraseblocks are supposedly empty
ubiformat: 1 bad eraseblocks found, numbers: 1598
ubiformat: flashing eraseblock 506 -- 100 % complete
ubiformat: formatting eraseblock 1600 -- 100 % complete

6.
root@dm814x-evm:~/UBI# sync

7.
root@dm814x-evm:~/UBI# ./ubiattach /dev/ubi_ctrl -m 4
ubiattach: error!: cannot attach mtd4
error 22 (Invalid argument)

UBI: attaching mtd4 to ubi0
UBI: physical eraseblock size: 131072 bytes (128 KiB)
UBI: logical eraseblock size: 129024 bytes !!!!!
UBI: smallest flash I/O unit: 2048
UBI: sub-page size: 512
UBI: VID header offset: 512 (aligned 512) !!!!!
UBI: data offset: 2048 !!!!!
UBI error: validate_ec_hdr: bad VID header offset 2048, expected 512
UBI error: validate_ec_hdr: bad EC header
UBI error: ubi_io_read_ec_hdr: validation failed for PEB 0

8.
root@dm814x-evm:~/UBI# mtdinfo /dev/mtd4
mtd4
Name: File System
Type: nand
Eraseblock size: 131072 bytes, 128.0 KiB
Amount of eraseblocks: 1601 (209846272 bytes, 200.1 MiB)
Minimum input/output unit size: 2048 bytes
Sub-page size: 512 bytes
OOB size: 64 bytes
Character device major/minor: 90:8
Bad blocks are allowed: true
Device is writable: true
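
The header mismatch in step 7 can be reproduced with a little arithmetic. The sketch below assumes UBI's usual per-PEB layout: a 64-byte EC header at offset 0, the VID header at the next sub-page boundary, and data at the next min. I/O boundary after that. The function names are illustrative and not part of mtd-utils.

```shell
#!/bin/sh
# Sketch of where UBI places its headers inside each PEB.
# Assumption: both the EC header and the VID header are 64 bytes.
UBI_HDR_SIZE=64

# Round $1 up to the next multiple of $2.
align_up() {
    echo $(( ( $1 + $2 - 1 ) / $2 * $2 ))
}

# VID header offset: first sub-page boundary after the EC header.
# $1 = sub-page size
vid_hdr_offset() {
    align_up "$UBI_HDR_SIZE" "$1"
}

# Data offset: first min. I/O boundary after the VID header.
# $1 = sub-page size, $2 = min. I/O size
data_offset() {
    vid=$(vid_hdr_offset "$1")
    align_up $(( vid + UBI_HDR_SIZE )) "$2"
}

# Compare the sub-page size given to ubinize (2048) with the one
# the kernel reported for mtd4 (512).
for subpage in 2048 512; do
    echo "sub-page $subpage: VID offset $(vid_hdr_offset "$subpage"), data offset $(data_offset "$subpage" 2048)"
done
```

With -s 2048 (as passed to ubinize) the VID header lands at 2048, while the 512-byte sub-page size that the kernel reports for mtd4 puts it at 512. These are the two numbers in the validate_ec_hdr message. The bootargs at the top of the thread pass the offset explicitly (ubi.mtd=4,2048), and ubiattach has a corresponding -O option, although whether that alone resolves the attach failure here is an assumption.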

0 Pavel Botev over 10 years ago in reply to Leon Pollak

TI__Guru**** 170625 points

Leon,

Leon Pollak said:
Our problem with the ECC errors was in pure write/read operations: we booted via NFS and ran nand_write and nand_dump. nand_dump reported ECC errors while reading back the same data, although the generated file passed comparison with the original.

Could you please provide the exact steps? I will try this on my side (DM814x TI EVM with the latest Linux kernel from Arago). Can you also try the "nand scrub" command?

Leon Pollak said:
Still, we tried to follow the instructions from the link you provided today.
Please see the log of your sequence below. Note that ubiattach produces several errors, of which the incorrect LEB size seems to be the most important.

Leon Pollak said:
./mkfs.ubifs -v -r ./ubifs -m 2048 -e 126976 -c 1601 -o ./ubifs.img

What is the target filesystem (-r ./ubifs) that you are using here? Is it the EZSDK filesystem, the Arago filesystem, or something else?

For LEB size calculations, see the wiki pages below:

processors.wiki.ti.com/.../TI811x_UBIFS_Support

processors.wiki.ti.com/.../UBIFS_Support
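
A rough illustration of the LEB size arithmetic, using the numbers from the logs quoted in this thread rather than anything taken from the wiki pages: the LEB size is the PEB size minus the data offset. The two extra PEBs for the UBI layout volume are an assumption about ubinize, and the function names are made up for this sketch.

```shell
#!/bin/sh
# LEB size = PEB size - data offset.
# $1 = PEB size, $2 = data offset
leb_size() {
    echo $(( $1 - $2 ))
}

# PEBs occupied by a UBI image: one per UBIFS LEB plus (assumed) two
# for the UBI layout volume.
# $1 = leb_cnt reported by mkfs.ubifs
image_pebs() {
    echo $(( $1 + 2 ))
}

PEB=131072   # 128 KiB eraseblock, from the mtdinfo output

echo "data offset 4096: LEB size $(leb_size "$PEB" 4096)"
echo "data offset 2048: LEB size $(leb_size "$PEB" 2048)"

pebs=$(image_pebs 505)
echo "image footprint: $pebs PEBs, $(( pebs * PEB )) bytes"
```

126976 is the value passed to mkfs.ubifs -e, and 129024 is what the kernel computed when it assumed a 512-byte sub-page. The 507 PEBs are consistent with ubiformat stopping at "flashing eraseblock 506", counting from 0.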

BR
Pavel

0 Leon Pollak over 10 years ago in reply to Pavel Botev

Intellectual 960 points

Dear Pavel.
Thank you very much for the UBIFS instructions. It has started working. It was a really great help!
Still, we are worried about the errors we get in non-file-system I/O.
Please see below the log of the erroneous behavior of direct I/O.
-------------------------------------------------------------------
root@dm814x-evm:~/UBI# nandtest -p 1 /dev/mtd4
ECC corrections: 0
ECC failures : 0
Bad blocks : 1
BBT blocks : 0
0c800000: checking...
Finished pass 1 successfully

root@dm814x-evm:~/UBI# nandtest -p 1 /dev/mtd4
ECC corrections: 0
ECC failures : 0
Bad blocks : 1
BBT blocks : 0
0c800000: checking...
Finished pass 1 successfully

root@dm814x-evm:~/UBI# /usr/sbin/flash_erase /dev/mtd4 0 0
Erasing 128 Kibyte @ c7a0000 -- 99 % complete flash_erase: Skipping bad block at 0c7c0000
Erasing 128 Kibyte @ c800000 -- 100 % complete

root@dm814x-evm:~/UBI# nandtest -p 1 /dev/mtd4
ECC corrections: 0
ECC failures : 0
Bad blocks : 1
BBT blocks : 0
00000000: reading...
ECC failed at 00000000
00020000: reading...
ECC failed at 00020000
00040000: reading...
ECC failed at 00040000
00060000: reading...
ECC failed at 00060000
00080000: reading...
ECC failed at 00080000
000a0000: reading...
ECC failed at 000a0000
000c0000: reading...
ECC failed at 000c0000
000e0000: reading...
ECC failed at 000e0000
00100000: reading...
ECC failed at 00100000
00120000: reading...
ECC failed at 00120000
00140000: reading...
ECC failed at 00140000
00160000: reading...
ECC failed at 00160000
00180000: reading...
ECC failed at 00180000
001a0000: reading...
ECC failed at 001a0000
001c0000: reading...
ECC failed at 001c0000
001e0000: writing...^C

root@dm814x-evm:~/UBI# nandtest -p 1 /dev/mtd4
ECC corrections: 0
ECC failures : 1
Bad blocks : 1
BBT blocks : 0
Bad block at 0x0c7c0000
0c800000: checking...
Finished pass 1 successfully
---------------------------------
These may be the result of buggy utilities.
But the following, IMHO, can't be explained by utility problems and seems to be a real NAND driver issue:
---------------------------------
root@dm814x-evm:~/UBI# /usr/sbin/flash_erase /dev/mtd4 0 0
Erasing 128 Kibyte @ c7a0000 -- 99 % complete flash_erase: Skipping bad block at 0c7c0000
Erasing 128 Kibyte @ c800000 -- 100 % complete

root@dm814x-evm:~/UBI# /usr/sbin/nandwrite -p /dev/mtd4 ./big_test_file
Writing data to block 0 at offset 0x0

root@dm814x-evm:~/UBI# /usr/sbin/nanddump -o -f ./dump_big_test_file -s 0x0 -l 0x2000 /dev/mtd4
ECC failed: 56105
ECC corrected: 0
Number of bad blocks: 1
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00002000...
ECC: 1 uncorrectable bitflip(s) at offset 0x00000000

root@dm814x-evm:~/UBI# nandtest -p 1 /dev/mtd4
ECC corrections: 0
ECC failures : 56106
Bad blocks : 1
BBT blocks : 0
Bad block at 0x0c7c0000
0c800000: checking...
Finished pass 1 successfully
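
One detail that is easy to misread in these logs: the "ECC failed" / "ECC failures" figures printed at the start of nanddump and nandtest appear to be cumulative counters for the MTD device, not results of the run that prints them. Below is a small sketch that prints how much the failure counter grew between successive readings. It assumes the console log is fed on stdin, and the function name ecc_deltas is made up.

```shell
#!/bin/sh
# Print the growth of the ECC failure counter between successive
# "ECC failures : N" (nandtest) or "ECC failed: N" (nanddump) lines.
# Lines such as "ECC failed at 00000000" are ignored.
ecc_deltas() {
    awk -F': *' '
        /^ECC (failures|failed) *:/ {
            cur = $2 + 0
            if (seen) print cur - prev
            prev = cur
            seen = 1
        }'
}

# Example with the last two counter readings from the log above.
printf 'ECC failed: 56105\nECC failures : 56106\n' | ecc_deltas
```

Fed the six counter lines from this post (0, 0, 0, 1, 56105, 56106), it prints 0, 0, 1, 56104, 1. The final step of 1 matches the single uncorrectable read that nanddump reported at offset 0x00000000.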

Thank you for your help again!

0 Leon Pollak over 10 years ago in reply to Leon Pollak

Intellectual 960 points

Pavel, I forgot to mention that
cmp dump_big_test_file big_test_file
does not return any error.

0 Pavel Botev over 10 years ago in reply to Leon Pollak

TI__Guru**** 170625 points

Leon,

See if the pointers below help to fix the ECC errors:

e2e.ti.com/.../937980
e2e.ti.com/.../360912
e2e.ti.com/.../330690
e2e.ti.com/.../291288
e2e.ti.com/.../233022
e2e.ti.com/.../391494

BR
Pavel
