NAND corruption with UBIFS - corrupt empty space!

Hi,

I'm working with a DM8148 running the Linux 2.6.37 kernel from a NAND flash partition with UBIFS + BCH8 ECC.

We have an issue where a single corrupted bit in empty space causes the file system to be switched to read-only mode on startup:

[ 36.980000] UBIFS error (pid 258): ubifs_scan: corrupt empty space at LEB 424:19741

[ 36.980000] UBIFS error (pid 258): ubifs_scanned_corruption: corruption at LEB 424:19741

[ 36.990000] 00000000: fffffffe ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ................................

[ 37.000000] UBIFS error (pid 258): ubifs_scan: LEB 424 scanning failed

[ 37.010000] UBIFS warning (pid 258): ubifs_ro_mode: switched to read-only mode, error -117

I see on the forum that this issue has been raised before. My understanding is that the omap2 NAND driver does not perform ECC detection/correction on empty pages, so when UBIFS checks the empty space and reads anything other than all 0xFF, it fails and switches to read-only mode. I didn't find any good solution, only a workaround that removes the UBIFS check.

I have applied all the latest NAND and UBIFS patches from the Arago repository, as suggested in the posts below:
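
To illustrate why one flipped bit is enough, here is a minimal userspace sketch of that kind of empty-space check. The helper names mirror the is_empty() and first_non_ff() calls in fs/ubifs/recovery.c, but the bodies are my own simplified model, not a copy of the kernel source. It also assumes the dump above was printed as little-endian 32-bit words, so that fffffffe at offset 0 means the first byte read back as 0xFE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified model: returns 1 if every byte is 0xFF, otherwise 0. */
static int is_empty(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0xff)
            return 0;
    return 1;
}

/* Simplified model: offset of the first non-0xFF byte, or -1 if none. */
static int first_non_ff(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0xff)
            return (int)i;
    return -1;
}
```

With a buffer of 0xFF bytes whose first byte is 0xFE, is_empty() returns 0 and first_non_ff() returns 0, so a single uncorrected bit flip in erased flash is reported as corruption.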

http://e2e.ti.com/support/embedded/linux/f/354/t/171839.aspx

http://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/p/269235/945842.aspx#945842

Any help here would be appreciated - is there any fix from TI for this?

  • Hi folks,

    We have experienced a similar problem and would be very interested in a solution.

    Regards,

    Terry

  • I'm not sure if it's the same issue, but I've seen UBIFS fail to mount the root partition at all, complaining about corrupt free space. I've had some luck making UBIFS run an LEB recovery after it finds corrupt empty space.

    This can be accomplished by changing fs/ubifs/recovery.c:

    fs/ubifs/recovery.c:697
    	} else if (!is_empty(buf, len)) {
    		if (!is_last_write(c, buf, offs)) {
    			int corruption = first_non_ff(buf, len);
    
    			/*
    			 * See header comment for this file for more
    			 * explanations about the reasons we have this check.
    			 */
    			ubifs_err("corrupt empty space LEB %d:%d, corruption "
    				  "starts at %d", lnum, offs, corruption);
    			/* Make sure we dump interesting non-0xFF data */
    			offs += corruption;
    			buf += corruption;
    			//goto corrupted;
    		}
    	}
    
    	min_io_unit = round_down(offs, c->min_io_size);

    The only real change is to comment out (or remove) the 'goto corrupted' line. I'm still testing, but I've recovered a few units using a kernel with this modification.

  • Hi Shawn,

    I tried the same fix in recovery.c to ignore the corruption, but I was getting UBI errors again after further system restarts.

    After this I added a fix to omap_correct_data() in drivers/mtd/nand/omap2.c to make the NAND driver correct erased-block data to all 0xFF. It's a bit crude, but I haven't seen any further problems.
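
    For anyone wanting to try something similar, here is a minimal sketch of the idea in plain userspace C. It is not the actual patch: fixup_erased_sector(), count_zero_bits() and MAX_BITFLIPS are hypothetical names, and the threshold of 8 is an assumption based on BCH8 correcting up to 8 bit errors per 512-byte sector. A real driver change would probably also want to confirm that the stored ECC bytes read back as 0xFF before treating the sector as erased.

    ```c
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Assumed threshold: BCH8 corrects up to 8 bit errors per 512-byte sector. */
    #define MAX_BITFLIPS 8

    /* Count bits that read back as 0, giving up once the limit is exceeded. */
    static int count_zero_bits(const uint8_t *buf, size_t len, int limit)
    {
        int zeros = 0;

        for (size_t i = 0; i < len; i++) {
            uint8_t inv = (uint8_t)~buf[i];   /* 1 bits mark flipped bits */

            while (inv) {
                inv &= (uint8_t)(inv - 1);    /* clear lowest set bit */
                zeros++;
            }
            if (zeros > limit)
                return zeros;                 /* too many: not an erased sector */
        }
        return zeros;
    }

    /*
     * If the sector looks erased apart from a few bit flips, rewrite it to
     * all 0xFF. Returns the number of bits fixed, or -1 if the sector holds
     * too many zero bits to be treated as erased (buffer left untouched).
     */
    static int fixup_erased_sector(uint8_t *buf, size_t len)
    {
        int zeros = count_zero_bits(buf, len, MAX_BITFLIPS);

        if (zeros > MAX_BITFLIPS)
            return -1;
        if (zeros > 0)
            memset(buf, 0xff, len);
        return zeros;
    }
    ```

    Called on a 512-byte buffer of 0xFF with one bit cleared, this returns 1 and leaves the buffer all 0xFF; a buffer holding real data returns -1 and is not modified.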

    William.