We have a product that uses a DM355 and two different types of NAND. The UBL, U-Boot, the kernel, and the YAFFS file system live on a Micron MT29F4G08AACWC-ET; a separate data-storage NAND is a Samsung K9WBG08U1M. We are running an old MontaVista kernel (2.6.18). I wanted to do a raw dump of the Samsung NAND, but nanddump (from mtd-utils) didn't work, so I grabbed the source and modified it to support the large block and OOB sizes of the Samsung NAND.
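For anyone attempting the same kind of nanddump change, here is a minimal sketch of the buffer and offset arithmetic involved when page and OOB sizes are treated as runtime values instead of fixed constants. The `nand_geometry` struct and the helper names are hypothetical and are not taken from mtd-utils. In a real tool the sizes would come from the MTD device (for example via the `MEMGETINFO` ioctl) rather than being typed in by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record (not an mtd-utils type). In a real tool
 * these values would be queried from the MTD device, not hard-coded. */
struct nand_geometry {
    uint64_t chip_size;  /* data bytes on the device, excluding OOB */
    uint32_t page_size;  /* data bytes per page */
    uint32_t oob_size;   /* spare (OOB) bytes per page */
};

/* Number of pages on the device. */
static uint64_t nand_page_count(const struct nand_geometry *g)
{
    assert(g->page_size != 0);
    return g->chip_size / g->page_size;
}

/* Bytes taken by one record (page data followed by its OOB) in a raw dump.
 * This is also the minimum size for a combined read buffer. */
static size_t nand_record_size(const struct nand_geometry *g)
{
    return (size_t)g->page_size + g->oob_size;
}

/* Byte offset of a given page's record inside the raw dump file. */
static uint64_t nand_dump_offset(const struct nand_geometry *g, uint64_t page)
{
    return page * (uint64_t)nand_record_size(g);
}
```

With illustrative values of a 4096-byte page and 128 bytes of OOB (assumed here, check the datasheet for the actual part), each record in the dump is 4224 bytes, so any buffer sized for 2048+64 would be too small.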
When I started dumping the NAND, I noticed I was getting a lot of uncorrectable bit-flips. I started digging into this, and it looks like the kernel is reporting a large number of uncorrectable ECC errors on both the Micron and Samsung NANDs. Yet we aren't seeing any corruption of the UBL, U-Boot, the kernel, the file system, or the data NAND. I looked at the DM355 errata, and there are several items listed with regard to the RBL and NAND ECC, but I don't see any problems listed for the EMIF controller.
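One way to check whether the reported ECC errors correspond to real data changes is to compare two raw reads of the same page (or a raw read against a known-good image) and count the differing bits. The helper below is only a sketch of that comparison; `count_bit_diffs` is a hypothetical name, and how the two buffers are obtained is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the bits that differ between two buffers of equal length,
 * e.g. two raw reads of the same NAND page. */
static size_t count_bit_diffs(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t flips = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t x = a[i] ^ b[i];  /* set bits mark differing positions */

        while (x) {
            x &= (uint8_t)(x - 1);  /* clear the lowest set bit */
            flips++;
        }
    }
    return flips;
}
```

If the count stays at zero across repeated reads while the kernel still reports uncorrectable errors, that would point toward the ECC calculation or comparison rather than the stored data.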
Has anyone else seen problems like this? I pulled in the ECC check code from the latest Arago kernel, and it behaves the same way. I'm starting to think this is a problem with the EMIF silicon. Can TI help with this or verify it?
Thank you.