TI8168evm GPMC NAND read cycle time

Hi all,

I've been working on improving NAND read speeds in u-boot for the TI8168evm board.

I'm currently investigating the time to read each word from the flash chip. I have this down to about 180ns per cycle. That consists of CS0 going low for 60ns, then high for 120ns, repeating for every word read.

Theoretically the Micron NAND chip included on the EVM board can support a 50ns cycle, and I would expect to be able to get this down to about 70ns to 80ns.
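
As a rough sanity check on what these cycle times mean for throughput, the sketch below converts a per-word cycle time into a raw bus rate. The function name and the bus width parameter are illustrative only; it assumes one word is transferred per CS cycle and ignores command, address and ECC overhead.

```c
#include <stdint.h>

/*
 * Raw bus throughput in bytes per second when one word is transferred
 * every cycle_ns nanoseconds. bus_width_bits is 8 or 16 depending on
 * the NAND part (an input to this sketch, not a measured value).
 * Returns 0 for a zero cycle time.
 */
uint64_t nand_raw_bytes_per_sec(uint32_t cycle_ns, uint32_t bus_width_bits)
{
    if (cycle_ns == 0)
        return 0;
    return (1000000000ULL * (bus_width_bits / 8)) / cycle_ns;
}
```

For a 16-bit part, 180ns per word works out to roughly 11.1 MB/s, and 80ns per word to 25 MB/s.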

I can't find any timing diagrams in the TRM that explain the time between CS assertions.

There are the CYCLE2CYCLE and CYCLE2CYCLEDELAY config parameters, but I have them both set to 0.

In section 9.2.4.12.1.6 - NAND Device General Chip-Select Timing Control Requirement

It states: "Because accesses to a NAND device can be interleaved with other chip-select accesses, there is no certainty that CS always stays low between two accesses to the same chip-select. Moreover, an CS deassertion time between the same chip-select NAND accesses is likely to be required as follows: the CS deassertion requires programming CYCLETIME and RDACCESSTIME according to the CS-to-data-valid critical timing."

Since I have no other GPMC devices connected or configured, I'm at a loss as to why the CS line gets de-asserted between each word access. As for "there is no certainty that CS always stays low...", I never see it stay low.

The rest of that section talks about the prefetch engine, which I'm going to research now and probably implement next week. However, I am curious whether I'm missing something that explains why the CS line is de-asserted every time, and why it has to stay de-asserted for so long.

Any ideas would be welcomed.

Thanks,

Andy

  • Andy,

    What is the read/write throughput that you could achieve finally?

  • All results are from reading / writing a 100MB partition.

    Reads use the prefetch engine in polled mode. The partition is filled with data beforehand (reading an erased partition is quicker, as we don't use the ELM to look for errors).

    Writes are just polled (I found a bug where writes were writing incorrect ECC bytes when using the postwrite engine, and I haven't fixed it yet). The partition is fully erased beforehand.

    In Uboot:

    Reads (nand read 0x81000000 rootfs1): 15.4MB/s

    Writes: 5.4MB/s

    In Linux:

    Reads (time cat /dev/mtd6 > /dev/null): 17.33MB/s

    Reads (time cat /dev/mtd6 > /tmp/foobar): 15.4MB/s

    Writes (time nandwrite -q /dev/mtd6 /tmp/foobar): 4.6MB/s

    I got prefetch with IRQs working, but that made reads a bit slower, so we don't use it.

    I also implemented the cached read command, which improves reads quite a bit. However, it is slower for non-sequential reads, so we don't use it in Linux.

  • Thanks for the info!

    Have you tried the same in u-boot? If so, what is the throughput that you achieved in u-boot?

  • I posted my Uboot throughput in my last reply.

  • Hi Andrew, I find that NAND reads are very slow on my TI816x EVM board, so I tried to use prefetch in polled mode in U-Boot by copying and adapting code from the Linux kernel.

    It compiles successfully, but it does not work. How did you implement your changes? Could you share your code or your modifications?

  • OK, there were quite a lot of changes, and I can't guarantee that all of them are applicable, but I'll go through them anyway.

    1) Enable the icache, dcache and MMU:

    http://arago-project.org/git/projects/?p=u-boot-omap3.git;a=commit;h=7d9cb5380cbe706052c91b35b6cee92b00e8d4eb

    By default the MMU isn't enabled, which means every read / write to memory is slow as hell, and that includes executing instructions. I wrote my own code to enable these, based on patches to the main U-Boot repository. The above patch is in the TI U-Boot Git repository and should work (you may need to add calls to enable your cache to your board file); it is based on the same patches.

    Also note that once I enabled this, I found that the ethernet driver broke, which was to do with DMA being used and caches being flushed. http://slexy.org/view/s2buZ9YRCH is my patch to fix that; hopefully it'll work with the above TI cache + MMU patch.

    2) I added a dev_ready function to the NAND driver:

    http://slexy.org/view/s2EtZGLCqH

    Without this the base driver just delays for 100us after issuing a command, rather than watching the busy line.
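
    The idea behind step 2 can be sketched roughly as follows: poll a ready/busy status until the NAND reports ready or a poll budget runs out, rather than always waiting a fixed 100us. This is not the linked patch. `read_status_fn`, `nand_wait_ready`, `WAIT0_STATUS_BIT` and the fake device are hypothetical names, and the assumption that ready/busy appears as bit 8 of the GPMC status register should be checked against the TRM for your chip.

    ```c
    #include <stdint.h>

    /* Assumption: ready/busy is reflected in bit 8 of the status word. */
    #define WAIT0_STATUS_BIT (1u << 8)

    /* Hypothetical accessor: returns the current status register value. */
    typedef uint32_t (*read_status_fn)(void *ctx);

    /*
     * Poll until the device reports ready or max_polls reads have been made.
     * Returns 1 if ready was seen, 0 on timeout.
     */
    int nand_wait_ready(read_status_fn read_status, void *ctx,
                        unsigned int max_polls)
    {
        unsigned int i;

        for (i = 0; i < max_polls; i++) {
            if (read_status(ctx) & WAIT0_STATUS_BIT)
                return 1;
        }
        return 0;
    }

    /* Host-side stand-in for the hardware: busy for N polls, then ready. */
    struct fake_nand {
        unsigned int busy_polls_left;
    };

    uint32_t fake_read_status(void *ctx)
    {
        struct fake_nand *nand = ctx;

        if (nand->busy_polls_left > 0) {
            nand->busy_polls_left--;
            return 0;               /* still busy */
        }
        return WAIT0_STATUS_BIT;    /* ready */
    }
    ```

    In a real driver the accessor would read the hardware register, and the poll budget would be chosen from the longest busy time in the NAND datasheet.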

    3) Changed the timing parameters:

    http://slexy.org/view/s2jBLtHSV2

    This is the complicated bit. I spent ages reading the TRM and the datasheet for the NAND chip and tweaked all the GPMC registers to be as close as possible to the fastest the NAND chip could cope with. Note the "#ifdef CONFIG_SHARK", to separate my timings from the defaults, you probably just want to replace the defaults with the ones in the #ifdef.

    4) Optimised out a RNDOUT command:

    http://slexy.org/view/s21TeGL84l

    In the ti81xx_read_page_bch function, we always issue a RNDOUT command before reading each 512-byte sector. This is not necessary the first time: since we have just issued a READ0 command, we don't need to set the offset to 0.

    5) Optimised the rest of the RNDOUT Commands:

    http://slexy.org/view/s2in7ukK6I

    We don't need to send the page address in the RNDOUT command, as it only sets the offset within a page.
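
    To illustrate steps 4 and 5: on a typical large-page NAND, a random data output (RNDOUT) sequence is the 0x05 command, two column address bytes, then the 0xE0 confirm command, with no row (page) address cycles. The sketch below builds that sequence. `rndout_sequence` and `struct nand_cycle` are hypothetical names used for illustration and are not taken from the linked patches. Check the command set in your NAND datasheet, and note that on x16 parts the column is usually a word address.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define NAND_CMD_RNDOUT      0x05  /* random data output */
    #define NAND_CMD_RNDOUTSTART 0xE0  /* confirm cycle for RNDOUT */

    enum cycle_type { CYCLE_CMD, CYCLE_ADDR };

    struct nand_cycle {
        enum cycle_type type;
        uint8_t value;
    };

    /*
     * Fill out[] with the bus cycles that move the read pointer to `column`
     * within the page that is already loaded. Only two column address
     * cycles are emitted; no page (row) address is sent.
     * Returns the number of cycles written (4), or 0 if out[] is too small.
     */
    size_t rndout_sequence(uint16_t column, struct nand_cycle *out,
                           size_t out_len)
    {
        if (out_len < 4)
            return 0;
        out[0] = (struct nand_cycle){ CYCLE_CMD,  NAND_CMD_RNDOUT };
        out[1] = (struct nand_cycle){ CYCLE_ADDR, (uint8_t)(column & 0xFF) };
        out[2] = (struct nand_cycle){ CYCLE_ADDR, (uint8_t)(column >> 8) };
        out[3] = (struct nand_cycle){ CYCLE_CMD,  NAND_CMD_RNDOUTSTART };
        return 4;
    }
    ```

    For the first sector after READ0 the column pointer is already at 0, so the whole sequence can be skipped, which is the saving described in step 4.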

    6) Implemented the prefetch engine:

    http://slexy.org/view/s200y1flkx

    and: http://slexy.org/view/s2elpaBsXy

    The first is the implementation; the second fixes a bug that stopped Linux reading correctly.
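
    In polled mode, a prefetch read boils down to repeatedly asking the engine how many bytes are waiting in its FIFO and draining them until the requested length has been read. The sketch below shows that loop against a hypothetical `struct prefetch_ops`. Both callbacks and the fake FIFO are made up for illustration; on real hardware they would wrap the GPMC prefetch status and data accesses described in the TRM. It is not the code in the patches above.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical hardware hooks, supplied by the driver. */
    struct prefetch_ops {
        size_t (*fifo_level)(void *ctx);   /* bytes currently in the FIFO */
        uint8_t (*fifo_pop)(void *ctx);    /* pop one byte from the FIFO */
    };

    /*
     * Read len bytes into buf by polling the FIFO level and draining
     * whatever is available. Gives up after max_idle_polls consecutive
     * polls that found the FIFO empty.
     * Returns the number of bytes actually read.
     */
    size_t prefetch_read_polled(const struct prefetch_ops *ops, void *ctx,
                                uint8_t *buf, size_t len,
                                unsigned int max_idle_polls)
    {
        size_t done = 0;
        unsigned int idle = 0;

        while (done < len && idle < max_idle_polls) {
            size_t avail = ops->fifo_level(ctx);

            if (avail == 0) {
                idle++;
                continue;
            }
            idle = 0;
            while (avail > 0 && done < len) {
                buf[done++] = ops->fifo_pop(ctx);
                avail--;
            }
        }
        return done;
    }

    /* Host-side stand-in: serves bytes 0, 1, 2, ... in bursts of up to 4. */
    struct fake_fifo {
        size_t remaining;   /* bytes not yet moved into the FIFO */
        size_t in_fifo;     /* bytes currently waiting in the FIFO */
        uint8_t next;       /* next byte value to hand out */
    };

    size_t fake_level(void *ctx)
    {
        struct fake_fifo *f = ctx;

        if (f->in_fifo == 0 && f->remaining > 0) {
            f->in_fifo = f->remaining < 4 ? f->remaining : 4;
            f->remaining -= f->in_fifo;
        }
        return f->in_fifo;
    }

    uint8_t fake_pop(void *ctx)
    {
        struct fake_fifo *f = ctx;

        f->in_fifo--;
        return f->next++;
    }
    ```

    The idle-poll limit is a design choice of this sketch so that a stalled engine cannot hang the caller; a real driver might use a timer instead.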

    7) Added support for the READ_CACHE command:

    http://slexy.org/view/s2aXjBzU6o

    This is great if you want to do lots of large sequential reads, however if you read only a page here, and a page there, then this will slow down your reads.

    That's all of them that I can find. The most important ones are the MMU, the NAND timings and the prefetch engine.

    I spent weeks using a logic analyser and looking for any gaps that I could cut out, or data that was sent unnecessarily.

    Note: the read / write speeds I posted a couple of posts up are for our custom hardware, which has a different NAND chip on it, so don't be surprised if your timings are slightly off on the eval board.

  • Hi Andrew, I used your modifications, but I did not add the DCACHE part.

    The NAND speed is now very fast. Your idea is good.

    There is one small problem. I type the command:

    nand read 0x81000000 0x0 0xa

    The first attempt gives an error:

    NAND read: device 0 offset 0x0, size 0xa
    ECC: uncorrectable.
    NAND read from offset 0 failed -74
     0 bytes read: ERROR
    KEPLER#nand read 0x81000000 0x0 0xa

    The second attempt is OK:

    NAND read: device 0 offset 0x0, size 0xa
     10 bytes read: OK
    KEPLER#nand read 0x81000000 0x0 0xa

    After every reboot of U-Boot, the first nand read fails; the second, third and later reads are OK.

    I am looking for the source of the error. If I find it, I will reply.

  • Andrew, your code is correct.

    My test method was wrong. I used the kernel to write data to the NAND flash, and U-Boot to read it back.

    However, the ECC correction scheme differs between the kernel and U-Boot, so the first read gives an error.

    Thank you for your idea and for sharing!

  • Glad I could help :)

  • Hi Andrew,

    I want to interface a NAND device with the OMAP4460, and I am having issues with the NAND timing parameters.

    This is about your "Changed the timing parameters" change.

    Can you shed some light on how you chose the values for the timing parameters? I have gone through the OMAP4460 TRM but couldn't work it out.

    #define M_NAND_GPMC_CONFIG1 0x00001800
    +
    +//bits 3-0: CSONTIME = 0ns = 0 cycles
    +//bits 12-8: CSRDOFFTIME = 56ns = 7 cycles
    +//bits 20-16: CSWROFFTIME = 56ns = 7 cycles
    +#define M_NAND_GPMC_CONFIG2 0x00070700
    +
    +//bits 3-0: ADVONTIME = 0
    +//bits 12-8: ADVRDOFFTIME = 56ns = 7 cycles (this is a guess, not sure how to calculate this value)
    +//bits 20-16: ADVWROFFTIME = 56ns = 7 cycles
    +#define M_NAND_GPMC_CONFIG3 0x00070700
    +
    +//bits 3-0: OEONTIME = 0ns = 0 cycles
    +//bits 12-8: OEOFFTIME = 32ns = 4 cycles
    +//bits 19-16: WEONTIME = 0ns = 0 cycles
    +//bits 28-24: WEOFFTIME = 40ns = 5 cycles
    +#define M_NAND_GPMC_CONFIG4 0x05000400
    +
    +//bits 4-0: RDCYCLETIME = 56ns = 7 cycles
    +//bits 12-8: WRCYCLETIME = 56ns = 7 cycles
    +//bits 20-16: RDACCESSTIME = 32ns = 4 cycles
    +//bits 27-24: PAGEBURSTACCESSTIME = ??
    +#define M_NAND_GPMC_CONFIG5 0x00040707
    +
    +//todo
    +//bits 3-0: BUSTURNAROUND = 0ns = 0 cycles
    +//bit 7: CYCLE2CYCLE delay between two successive chip selects to same chip? = no
    +//bits 11-8: CYCLE2CYCLEDELAY = 0ns = 0 cycles
    +//bits 28-24: WRACCESSTIME = ??
    +#define M_NAND_GPMC_CONFIG6    0x16000000
    +#define M_NAND_GPMC_CONFIG7    0x00000008
    +
    +#else
    +
    +#define M_NAND_GPMC_CONFIG1 0x00001810
    +#define M_NAND_GPMC_CONFIG2 0x001e1e00
    +#define M_NAND_GPMC_CONFIG3 0x001e1e00
    +#define M_NAND_GPMC_CONFIG4 0x16051807
    +#define M_NAND_GPMC_CONFIG5 0x00151e1e
     #define M_NAND_GPMC_CONFIG6    0x16000f80
     #define M_NAND_GPMC_CONFIG7    0x00000008

    How did you set the values of CSONTIME, CSRDOFFTIME, ADVRDOFFTIME, ADVWROFFTIME, etc.?

    Regards,

    Shyam

  • Hi Shyam,

    Please be warned that it was quite a long time ago when I looked at this, and so I could make mistakes here.

    You need two things:

    • the TRM for your chip, specifically the GPMC peripheral chapter. I don't know anything about the OMAP4460, and so I'm using the TI8168 TRM (please refer to this for my example).
    • The datasheet for the NAND chip, in my case this is the MT29F2G....

    IIRC there are four parts to a NAND transaction:

    • Command latch - telling the NAND what you are going to do
    • Address latch - what address are you operating on
    • Read data
    • Write data

    Starting with Command latch, find the diagram in both the TI8168 TRM and the NAND datasheet. For me this is: 9.2.4.12.1.3 in the TRM and Figure 11 in the NAND datasheet.

    With a little imagination you can combine these two diagrams and figure out the parameters. For example, in the TRM the time that the CS_N line is low (active) is WRCYCLETIME and CSWROFFTIME. In the NAND datasheet that line corresponds to the one labeled CE#, and that time is t_CS + t_CH = minimum of 5ns.

    I had a granularity of 8ns for the GPMC peripheral (although I don't remember why 8ns, probably the period of the clock or something). This means that while the stated minimum was 5ns, I actually had to use a minimum of 8ns.

    In addition to that you can look at all the other timing constraints and see that the best you can do is 24ns, which is what I chose for WRCYCLETIME and CSWROFFTIME.
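
    The rounding described above (a 5ns minimum becoming 8ns on an 8ns grid) is a round-up to whole GPMC clock cycles. A minimal sketch, assuming the 8ns granularity mentioned above; `ns_to_gpmc_cycles` and `GPMC_TICK_NS` are made-up names:

    ```c
    #include <stdint.h>

    /* GPMC timing granularity in ns (8ns in this thread; it depends on
     * the GPMC clock, so treat this value as an assumption). */
    #define GPMC_TICK_NS 8u

    /* Round a minimum time in ns up to a whole number of GPMC cycles. */
    uint32_t ns_to_gpmc_cycles(uint32_t min_ns)
    {
        return (min_ns + GPMC_TICK_NS - 1) / GPMC_TICK_NS;
    }
    ```

    With this, a 5ns datasheet minimum becomes 1 cycle (8ns), and 24ns is exactly 3 cycles.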

    Attached: 0508.nand_timing_diagrams.pdf is a scanned PDF of three diagrams I drew to figure these values out for my chips. Hopefully that helps explain why I chose the values I did.

    As for the particular values you asked about:

    • CSONTIME - Everything in the TRM says this is 0. Since chip select (chip enable) is always the first thing to get asserted, this is always your time 0.
    • CSRDOFFTIME - TRM chapter 9.2.4.12.1.5 (NAND read cycle) combined with Figure 14 from the NAND datasheet.
    • ADVRDOFFTIME - TRM chapter 9.2.4.12.1.4 (NAND address latch cycle) combined with Figure 12 from the NAND datasheet.
    • ADVWROFFTIME - same
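
    Once each timing is expressed in cycles, the register value is just the fields shifted into place. The sketch below packs two of the registers using the bit positions given in the comments of the patch quoted earlier in this thread (for example CSRDOFFTIME in bits 12-8 of CONFIG2). The helper names are hypothetical, and the bit layout should be confirmed against the TRM for your device.

    ```c
    #include <stdint.h>

    /* Place `cycles` into a field starting at bit `shift`, `width` bits wide. */
    uint32_t gpmc_field(uint32_t cycles, unsigned int shift, unsigned int width)
    {
        uint32_t mask = (1u << width) - 1u;

        return (cycles & mask) << shift;
    }

    /* CONFIG2: CSONTIME [3:0], CSRDOFFTIME [12:8], CSWROFFTIME [20:16]. */
    uint32_t gpmc_config2(uint32_t cs_on, uint32_t cs_rd_off, uint32_t cs_wr_off)
    {
        return gpmc_field(cs_on, 0, 4) |
               gpmc_field(cs_rd_off, 8, 5) |
               gpmc_field(cs_wr_off, 16, 5);
    }

    /* CONFIG5: RDCYCLETIME [4:0], WRCYCLETIME [12:8], RDACCESSTIME [20:16].
     * PAGEBURSTACCESSTIME [27:24] is left at 0 in this sketch. */
    uint32_t gpmc_config5(uint32_t rd_cycle, uint32_t wr_cycle, uint32_t rd_access)
    {
        return gpmc_field(rd_cycle, 0, 5) |
               gpmc_field(wr_cycle, 8, 5) |
               gpmc_field(rd_access, 16, 5);
    }
    ```

    With the cycle counts from the quoted patch, `gpmc_config2(0, 7, 7)` gives 0x00070700 and `gpmc_config5(7, 7, 4)` gives 0x00040707, matching the defines there.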

    In addition to this research I spent a long time in front of a logic analyser / oscilloscope looking at the waveform and trying to find any section that was longer than it had to be, and tweaked things as required.

    Hope that helps. Unfortunately all these values are different depending on the clock of your GPMC module and the timing requirements of your NAND chip, so I can't give you exact values.

    If you have more questions then I'll do what I can to help.