SLC Nand files are nullified if powered down after writing

Gopher79


It is simple to reproduce on my board: I run a file update from a PC, during which a 5 KB file is uploaded and rewritten to the flash. I cut the power 0.5-2 seconds after the operation, and when the system boots up again the file is 0 bytes.

I suspect it may be the UBIFS "unstable bits" issue described here: http://www.linux-mtd.infradead.org/doc/ubifs.html#L_unstable_bits

I have an AM3505 with a Micron NAND MT29F8G08ADBDAH4, which has generally worked fine with BCH4 support.

Using SDK 04.02.00.07.

I reckon UBIFS should be very stable and protected from this kind of problem.

I am wondering whether kernel 2.6.37 has been updated with related stability patches.

This is a really serious customer- and production-facing issue for my company.

Please advise,

Kind regards

Yakir

over 11 years ago

0 Miroslav Kiradzhiyski XID over 11 years ago

TI__Mastermind 25235 points

Since you are using UBIFS, I can assume that this happens under Linux? Please let me know how you are uploading the file.

How is the 0.5-2 seconds delay related to the problem?

Does this happen with any file?

I'm not familiar with the AM35x SDK, but if you are using both U-Boot and Linux to manipulate the contents of the NAND flash, make sure that you are using the X-Loader, U-Boot and Linux from the same SDK. The ECC scheme should be the same.

Best regards,
Miroslav

0 Wolfgang Muees1 over 11 years ago

Genius 3685 points

#include <stdio.h>   // fopen, fwrite, fclose
#include <unistd.h>  // sync

// update:
fopen(...);
fwrite(...);
fclose(...);
sync(); // flush cached file data and file system metadata to flash
// done...
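The outline above can be filled in as a small helper. This is only a sketch under assumptions: `write_file_durably` is a hypothetical name, not something from the SDK, and it uses `fflush()` followed by `fsync()` on the one file instead of the global `sync()`, so that only this file is forced out to flash.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of any SDK): write len bytes to path and
 * force them out of the caches before returning.
 * Returns 0 on success, -1 on error. */
int write_file_durably(const char *path, const void *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    int ok = fwrite(data, 1, len, f) == len; /* user data -> stdio buffer */
    ok = ok && fflush(f) == 0;               /* stdio buffer -> kernel page cache */
    ok = ok && fsync(fileno(f)) == 0;        /* page cache -> storage, this file only */

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A call such as `write_file_durably("/path/on/flash/config.bin", buf, n)` (the path is a placeholder) returns only after `fsync()` has reported success, so a power cut after that point should not leave the file empty.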

0 Gopher79 over 11 years ago in reply to Miroslav Kiradzhiyski XID

Expert 1125 points

Yes, I am under Linux (TI's 2.6.37). To put it simply, I am just copying the file from a RAM partition to the flash.

The delay is just a figure of speech; I mean that the problem shows up when I power down shortly after writing a file. Therefore, I suspect the power-down is related to the write operation.

Yes, it did happen with other files in similar scenarios.

The AM35x SDK uses X-Loader with 1-bit Hamming ECC, while U-Boot uses BCH4, as Linux does. Again, UBIFS is used in Linux only. I haven't done any experiments in U-Boot; I am less concerned about such problems there.

Based on the brown-out experiments I've done, I doubt that UBIFS writes to flash are atomic.

Regards

Yakir

0 Gopher79 over 11 years ago in reply to Wolfgang Muees1

Expert 1125 points

Wolfgang,

Thanks for the proposal. Unfortunately, I am using OS commands like "cp" or "mv", which go down to the file system to perform the necessary operations. Therefore, if the FS is not robust enough, I guess it should be patched with an improved procedure.

Regards

0 Wolfgang Muees1 over 11 years ago in reply to Gopher79

Genius 3685 points

If you are operating with OS commands, there is a command named "sync".

The behaviour of the file system is intentional. It's a balance between caching, performance and safety.

If you want your data to be on disk immediately after writing, you can

a) use the "sync" command.

b) mount the file system with the option "sync".

Note that option b) incurs a performance penalty (and is not used in practice).
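For an application that rewrites a file itself rather than calling "cp", a common pattern is to write the new contents to a temporary file, `fsync()` it, and then `rename()` it over the old one. The sketch below assumes a POSIX system; `replace_file_atomically` and its parameters are hypothetical names chosen for illustration. Because `rename()` replaces the target atomically, the file should hold either the old or the new contents after a power cut, rather than 0 bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with new contents via a temporary file.
 * `tmp_path` is assumed to be on the same file system as `path`.
 * Returns 0 on success, -1 on error. */
int replace_file_atomically(const char *path, const char *tmp_path,
                            const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {                     /* write() may be partial */
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Force the new data to storage BEFORE it becomes visible under `path`. */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }

    /* Atomically swap the new file in place of the old one. */
    if (rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

This sketch does not `fsync()` the containing directory, so after a power cut the rename itself may not have reached the flash yet; in that case the old contents would still be in place.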

regards

Wolfgang

0 Gopher79 over 11 years ago in reply to Wolfgang Muees1

Expert 1125 points

Wolfgang,

According to the documentation I saw, UBIFS should be robust against brown-outs. Your comment may be valid, though the MTD doc says the following:

...

The solution is to teach UBIFS to erase-cycle any LEB which could potentially be written to when the power cut happened. This is not only about the journal LEBs, but also LPT, log, master and orphan LEBs. This means that the valid data from this LEB has to be read (and only once!) and then it should be written back to this LEB using the atomic LEB change UBI operation. This has to be done even if the LEB looks all-right - no corruptions, all 0xFFs at the end.

....

After that, they provide a git source for UBIFS that contains the fix. Since I am using a relatively old version of the Linux kernel, I am here looking for the right advice on which patch I should apply to the /fs section.

Again, as far as I understand - it is not about "what I can do with it" but about "how the kernel should do it".

Regards

Yakir

PS. I referred to the MTD doc in my initial post.

0 Wolfgang Muees1 over 11 years ago in reply to Gopher79

Genius 3685 points

Yakir,

I doubt that the problem you have is the MTD issue with incomplete writes.

If a write is interrupted by a power-down, only 512 or 2048 bytes of data will be invalid, nothing more. And the write time per sector is very short, so it is unlikely to trigger this.

Have you tried adding a sync command after your cp/mv? Does the behaviour change?

regards

Wolfgang

0 Gopher79 over 11 years ago in reply to Wolfgang Muees1

Expert 1125 points

Thanks a lot Wolfgang

I've tried your suggestions (mount -o sync, and running the sync command directly on partitions that I don't want to mount that way), and so far I've failed to reproduce the bug. It is looking good. My appreciation.

Regards

Yakir
