Hi!
We have a custom board with an AM3517 and a Samsung K9F2G x8 NAND, running Linux 2.6.37 from http://arago-project.org/git/people/?p=sriram/ti-psp-omap.git (OMAPPSP_04.02.00.07 branch).
When writing to flash (JFFS2 and UBI) we were getting errors like this one (for JFFS2): "Write of X bytes at Y failed. returned -5, retlen 0 Not marking the space at Y as dirty because the flash driver returned retlen zero"
Having seen other posts about this, we wanted to share our findings. We have found that, for some reason, flash read/write/erase operations can take a long time to finish in polled mode.
Increasing the timeout in omap_wait(...) in the omap2 NAND driver seems to remove all of these problems.
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -924,11 +924,11 @@ static int omap_wait(struct mtd_info *mtd, struct nand_chip *chip)
 							mtd);
 	unsigned long timeo = jiffies;
 	int status = NAND_STATUS_FAIL, state = this->state;
-
+	//HACK:
 	if (state == FL_ERASING)
-		timeo += (HZ * 400) / 1000;
+		timeo += (HZ * 4000) / 1000;
 	else
-		timeo += (HZ * 20) / 1000;
+		timeo += (HZ * 1000) / 1000;
This is a hack, however: the timeouts above are just arbitrarily large numbers, and frankly silly ones, since the Samsung spec says that 5/10/500 us should be enough for read/write/erase.
Given how the omap_wait code is written, flash performance is still pretty good, as the full timeout only elapses if an error actually occurs (we believe).
Enabling DMA prefetch also works; however, errors can still occur when flash load is high. We believe this is because of the fallback to polled mode operations when DMA is busy.
We are currently using DMA prefetch with the above change to omap_wait, which feels pretty stable.
This is what we have found so far. It would be nice to get some comments on whether this might be a problem with the NAND timings set in x-loader. We have not noticed any problems with flash in u-boot though, so this seems unlikely(?).
Another note: we cannot use subpage writes with UBI and must manually turn them off in the kernel to get UBI to work. With subpage writes off, UBI seems to work well.
Subpage writes seem to be fixed now: http://arago-project.org/git/people/?p=sriram/ti-psp-omap.git;a=commit;h=1f62a9d1143cffcdc5dbf5b433fb905fa5f78831
Nice work!
Regards,
Anton
Anton,
What was your final configuration where this worked? NAND_OMAP_PREFETCH and NAND_OMAP_PREFETCH_DMA both on? JFFS2 and/or UBIFS?
I tried the subpage write patch (on a 2.6.32 kernel, JFFS2) but I still see the problem occasionally (once every 1-4 hours with heavy file system load).
Thanks,
Orjan
Hello Orjan,
I would try increasing the timeouts for polled mode if you haven't already (without this we also got errors under heavy load).
"Final" solution for us (seems stable for both jffs2 & ubi):
* NAND_OMAP_PREFETCH_DMA
* subpage fix from arago
* Increased timeout for polled mode (this was important even when using prefetch as polled mode might still be used from time to time).
Regards,
Anton
Thanks. I wasn't sure if the increased timeout was needed once you got the subpage fix in place.
The timeout values are OK, but the code at the exit of the wait loop has to be updated: if the loop ends on a timeout, the status must be read one more time.
 	while (time_before(jiffies, timeo)) {
 		status = __raw_readb(this->IO_ADDR_R);
 		if (status & NAND_STATUS_READY)
 			break;
 		cond_resched();
 	}
+	/* if we have time-out exit, then check again */
+	if (!(status & NAND_STATUS_READY)) {
+		status = __raw_readb(this->IO_ADDR_R);
+	}
Update (concerning am335x),
The reason for this (very late) post is that we found the same problem on am3359.
David: Thank you! The fix seems to do the trick! As a side note, TI seems to have introduced a very similar fix in their am335x kernel (arago).