Nor Flash

We set up the NOR flash writer to perform a write and then a read of the CFI memory in the flash, and then reran this over and over.  It would succeed maybe 6-8 times in a row and then fail.  There might be another success before another failure, then a few successes, and so on.  On average, the failure rate is somewhere around 10-20%.


In the image, yellow is CS, green is OE and pink is GPMC_D0.  Instead of the read cycle, which includes a CS assertion of ~180ns with our conservative timings, I saw a cycle that did not include an OE assertion and lasted only ~90ns.  When we had this set up, one person was running the computer and another was working the scope.  The person on the scope could tell the computer operator (running CCS) with 100% accuracy whether the read succeeded or failed just by observing the scope.  The shortened cycle was observed on every failure, while the full cycle was seen on every success.

While it did not occur to me at the time, the length of this CS is the same as it is for the write cycle.  Since we had the scope set to re-trigger after each cycle, this is likely the write before the read and there was no read cycle on the bus at all.  I can confirm this on Monday by watching OE, WE and CS and performing the same experiment, but I think this is the most likely explanation of what is happening here.
Is anyone else out there using the nor-flash-writer program successfully with this part (8168)?
  • If you modify the code a bit such that it just continually runs through a loop where it reads the NOR flash manufacturer info and the code checks for the expected response, is there any consistency in the failure?  For example does it always fail on the same pass?  You should try to make it fail in a way that does not require you to manually step through code and look at things to spot the failure.  Here's some pseudo code of what I'm thinking:

    int i = 0;
    int ret_val;

    while (1)
    {
        ret_val = read_manufacturer_id();

        /* Report only the failing passes so a pattern is easy to spot */
        if (ret_val != EXPECTED_ID)
            printf("Iteration %d: Read returned %x instead of %x\n", i, ret_val, EXPECTED_ID);

        i++;
    }

    Can you do something like that?  Is there any consistency?
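
A slightly fuller sketch of the loop above, in case it helps. It runs a fixed number of passes and records which iterations failed, so you can see at a glance whether failures land on the same pass each time. Everything here is hypothetical and not taken from the flash writer: the names `id_reader_fn`, `run_id_test` and `fake_read_id` are made up, and `fake_read_id` is only a host-side stand-in that fails on every 5th call so the logic can be tried without hardware. On the target you would pass your own routine that reads the manufacturer ID.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_LOGGED 16  /* how many failing iteration numbers to remember */

/* Hypothetical reader signature: returns the ID value read from the flash. */
typedef uint16_t (*id_reader_fn)(void);

struct id_test_result {
    uint32_t passes;
    uint32_t failures;
    uint32_t failed_at[MAX_LOGGED];  /* iteration numbers of the first failures */
};

/* Calls read_id() `iterations` times and compares each result to `expected`. */
struct id_test_result run_id_test(id_reader_fn read_id, uint16_t expected,
                                  uint32_t iterations)
{
    struct id_test_result r = {0};

    for (uint32_t i = 0; i < iterations; i++) {
        uint16_t got = read_id();

        if (got == expected) {
            r.passes++;
            continue;
        }
        if (r.failures < MAX_LOGGED)
            r.failed_at[r.failures] = i;
        r.failures++;
        printf("Iteration %u: read 0x%04x instead of 0x%04x\n",
               (unsigned)i, (unsigned)got, (unsigned)expected);
    }
    return r;
}

/* Host-side stand-in (assumption, not real hardware access):
 * returns 0x0089 except on every 5th call, where it returns 0xFFFF. */
uint16_t fake_read_id(void)
{
    static uint32_t calls = 0;

    calls++;
    return (calls % 5 == 0) ? 0xFFFFu : 0x0089u;
}
```

Calling `run_id_test(fake_read_id, 0x0089, 20)` on a PC exercises the bookkeeping. Evenly spaced entries in `failed_at` would point to something systematic, while scattered entries would point more toward a marginal signal or timing problem.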

  • I've been talking to the designer that owns the GPMC IP block.  There's no reason why the GPMC itself would not actually perform the read on the bus.  That said, it seems most likely that if you're not seeing a read on the GPMC bus, that means that the GPMC has not actually been told to perform a read.  So how would that happen?  One thing that comes to mind would be if you're using external memory for code/data and the DDR timings are slightly off.  That might cause the pointer address to be corrupted such that the CPU is requesting data from some other address.

    So how have you tested your DDR timings?  Here's a very easy test that's reasonably good:

    1. Open a memory window to the start of DDR (0x80000000).
    2. Maximize the memory window.
    3. Write a few pieces of data into the window to make sure the data stays, e.g. 0x00000001, 0x00000002, etc.
    4. Refresh the window a bunch of times.  CCS will highlight any changes, so if you're seeing things change then your DDR timings are bad.
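
The manual steps above could also be automated along these lines. This is only a minimal sketch: `ddr_pattern_test` is a made-up name, and the pattern is the same incrementing 0x00000001, 0x00000002, ... sequence suggested for the memory window. On the target you would pass something like `(volatile uint32_t *)0x80000000` as the base. Note that if the data cache is enabled, the read-back may be served from cache rather than from DDR, so this is best run with the cache off.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes an incrementing pattern into `words` 32-bit locations starting at
 * `base`, then re-reads the whole region `passes` times (the equivalent of
 * refreshing the memory window). Returns the total number of mismatching
 * words seen across all passes; 0 means the data stayed put. */
size_t ddr_pattern_test(volatile uint32_t *base, size_t words, unsigned passes)
{
    size_t errors = 0;

    for (size_t i = 0; i < words; i++)
        base[i] = (uint32_t)(i + 1);      /* 0x00000001, 0x00000002, ... */

    for (unsigned p = 0; p < passes; p++) {
        for (size_t i = 0; i < words; i++) {
            if (base[i] != (uint32_t)(i + 1))
                errors++;
        }
    }
    return errors;
}
```

A nonzero return value corresponds to CCS highlighting changed words in the memory window.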

    Software errors can also cause this issue.  For example, if software references a pointer before it has been initialized, then the behavior will be dependent upon whatever happens to be in the memory location.  Similarly, if you have a stack overflow, etc that might also corrupt things and cause "strange" behavior.
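
One common way to check for the stack overflow case is to "paint" the stack region with a known pattern at startup and later count how much of the pattern is still intact. The sketch below assumes a descending stack (the usual ARM convention), so the lowest addresses are the last to be used. The names `stack_paint`, `stack_unused_words` and the fill value are hypothetical, and on a real target you would only paint the part of the stack below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary, easy-to-spot fill pattern */

/* Fills the given region with the pattern. Call once, early in startup. */
void stack_paint(uint32_t *region, size_t words)
{
    for (size_t i = 0; i < words; i++)
        region[i] = STACK_FILL;
}

/* Counts how many words at the low end of the region still hold the pattern.
 * With a descending stack this is the headroom that was never touched;
 * a result of 0 suggests the stack reached (or ran past) the end. */
size_t stack_unused_words(const uint32_t *region, size_t words)
{
    size_t n = 0;

    while (n < words && region[n] == STACK_FILL)
        n++;
    return n;
}
```

If the unused count is 0 or very small after a failing run, increasing the stack size is worth trying before looking anywhere else.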

    I think that in order to make progress you need to try and simplify your test case as much as possible:

    1. Use internal memory only.
    2. Create a very simple test like I suggested in my previous post (preferably where the CPU can check the results itself) and see if you can make a test like that run reliably.
    3. Once you get something very simple and reliable then you can add a little bit more code and test again until it fails.

    On a more general note, how is this issue impacting your development?  For example, does this make it impossible to flash a board, or does it just take a few attempts to do so?  Have you tried running u-boot from the internal RAM?  Is it stable?  If so, another thing to try would be to run some of the u-boot flash commands.  In general I would expect u-boot to be far better tested than the flash writer utility.

    Brad

  • Brad Griffis said:
    I've been talking to the designer that owns the GPMC IP block.  There's no reason why the GPMC itself would not actually perform the read on the bus.  That said, it seems most likely that if you're not seeing a read on the GPMC bus, that means that the GPMC has not actually been told to perform a read.  So how would that happen?  One thing that comes to mind would be if you're using external memory for code/data and the DDR timings are slightly off.  That might cause the pointer address to be corrupted such that the CPU is requesting data from some other address.

    Thing one: we are running out of internal SRAM at 0x4030_0000 (L3, I think it's called).  Thing two: we have done software levelling on our DDR3, run an extensive 5-minute memory test, and verified that the DDR3 is functional.  So at present, I don't think this is the problem.

    Brad Griffis said:

    So how have you tested your DDR timings?  Here's a very easy test that's reasonably good:

    1. Open a memory window to the start of DDR (0x80000000).
    2. Maximize the memory window.
    3. Write a few pieces of data into the window to make sure the data stays, e.g. 0x00000001, 0x00000002, etc.
    4. Refresh the window a bunch of times.  CCS will highlight any changes, so if you're seeing things change then your DDR timings are bad.

    We've done this, but of course the memory test was much more extensive.

    Brad Griffis said:
    Software errors can also cause this issue.  For example, if software references a pointer before it has been initialized, then the behavior will be dependent upon whatever happens to be in the memory location.  Similarly, if you have a stack overflow, etc that might also corrupt things and cause "strange" behavior.

    We have seen what I would call some very strange behavior so I think this is a real possibility.  I'll go look and see if we can adjust the stack size.  I assume this is in the makefile somewhere?  This is an excellent idea.

    Brad Griffis said:

    I think that in order to make progress you need to try and simplify your test case as much as possible:

    1. Use internal memory only.
    2. Create a very simple test like I suggested in my previous post (preferably where the CPU can check the results itself) and see if you can make a test like that run reliably.
    3. Once you get something very simple and reliable then you can add a little bit more code and test again until it fails.

    Yes, I agree.  Excellent point.  We are working with the TI Nor Flash writer and we expected it would "just work."  Alas, this is not the case and now we're debugging a large number of lines from a program we do not fully understand.

    Brad Griffis said:
    On a more general note, how is this issue impacting your development?  For example, does this make it impossible to flash a board, or does it just take a few attempts to do so?

    We have been able to write individual words to flash, but we have never been able to get this program to work and write the whole flash.  We believe, quite frankly, that we would have been better off just writing our own from scratch.  We may end up doing this, but we are about to punt on NOR completely, thinking it might be a defect in the GPMC NOR controller.

    Brad Griffis said:
    Have you tried running u-boot from the internal RAM?

    We would very much like to do this.  Are there instructions on how to compile and run it out of memory?  We are trying to send it to the processor over serial boot.  We munged a few of the config lines from the u-boot makefile together in hopes it would make an image that we could boot, but after the xmodem transfer, we get nothing.  We don't even know what the ORG location for the file should be ... should it run at 0x4030_0000?  0x4040_0000?  DDR3 at 0x8000_0000?  Somewhere higher in memory at 0x8070_0000?  We are just clueless.  I would love to read something on how to do this rather than fumble around in the dark.

    Steve

  • Stephen Hicks said:
    We would very much like to do this.  Are there instructions on how to compile and run it out of memory?  We are trying to send it to the processor over serial boot.

    You might find this article helpful in understanding what happens as u-boot starts running, e.g. where important things like PLL/DDR init occur in the sequence and the corresponding files where you find that info:

    http://processors.wiki.ti.com/index.php/Understanding_u-boot-min_startup_for_DM814x

    By the way, no matter which memory you end up using you will need to port u-boot to your board.  So these steps will be required regardless of whether you decide to use TI's flash-writer and regardless of whether you use NAND/NOR.  So if you haven't done any of this work, now is a good time.

    So for example, most of the changes you make will likely be in the file board/ti/ti8168/evm.c.  At a minimum you will need to update the PLL/DDR configuration, though probably some other stuff too.

    Stephen Hicks said:
    We don't even know what the ORG location for the file should be ... should it run at 0x4030_0000?  0x4040_0000?

    In the TRM Section 25.8 "Peripheral Booting" you can find the following:

    "The boot image is downloaded directly into internal RAM at the location 40400000h. The maximum size
    of downloaded image is 256KB."

    If you go to board/ti/ti8168/config.mk you will see TI_LOAD_ADDR = 0x40400000 and TEXT_BASE = 0x80700000.  This corresponds to a load address of 0x40400000 (which corresponds with the peripheral booting) and a run address of 0x80700000.  In other words, after the PLL and DDR init is completed then u-boot relocates itself to DDR at address 0x80700000 (see arch/arm/cpu/arm_cortexa8/start.S). 
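
To make the load address versus run address distinction concrete, here is a small illustrative sketch. It is not u-boot's relocation code (that lives in start.S). It only reuses the two values from config.mk and the 256KB limit quoted from the TRM; the helper names `image_fits_peripheral_boot`, `load_to_run_addr` and `relocate_image` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TI_LOAD_ADDR   0x40400000u      /* where the boot ROM downloads the image */
#define TEXT_BASE      0x80700000u      /* where the image is linked to run (DDR) */
#define MAX_BOOT_IMAGE (256u * 1024u)   /* peripheral boot size limit from the TRM */

/* Returns nonzero if an image of this size can be downloaded by peripheral boot. */
int image_fits_peripheral_boot(size_t image_size)
{
    return image_size > 0 && image_size <= MAX_BOOT_IMAGE;
}

/* Translates an address inside the downloaded image (internal RAM) to the
 * address the same byte has after the image is copied to its run location. */
uint32_t load_to_run_addr(uint32_t load_addr)
{
    return load_addr - TI_LOAD_ADDR + TEXT_BASE;
}

/* Generic copy standing in for the relocation step: moves `size` bytes from
 * the load location to the run location. */
void relocate_image(void *run_base, const void *load_base, size_t size)
{
    memmove(run_base, load_base, size);
}
```

So a symbol that sits at 0x40400100 right after the download ends up at 0x80700100 once the copy to DDR is done, which is why the image has to be linked for TEXT_BASE even though it is first loaded at TI_LOAD_ADDR.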

    FYI, you will NOT want to use your gel files while debugging this u-boot code, i.e. let u-boot do all this setup. Also, I believe it's the u-boot.bin file (not u-boot.img) that you want to send over the serial port.  The TRM specifies that for peripheral booting you do NOT need the GP header.

  • FYI, I'm on vacation now, but if you have any updates please post here.  If I get a few minutes I'll try to squeeze in a reply.