    L2 cache issue when running from NOR while accessing another GPMC CS

    This question is answered
    Posted by 1962332
    on May 10 2012 08:14 AM
    Prodigy150 points

    Here's my scenario....

    - Bootloader based on U-Boot for NOR flash.

    - We want to configure the L2 first thing in the bootloader and then boot into our custom OS, with only the OS switching in a new MMU mapping table.

    - On GPMC CS0: NOR flash (the bootloader executes from this flash and sets up cache, MMU, etc.)

    - On GPMC CSx: deviceX has slower access times than the NOR flash and is synchronous, using the wait signals on reads/writes (the wait lines can delay anywhere from 1 us to 2 ms for a single word read/write access)

    - GPMC is configured with no timeout and the L3 timeout is set to max.  This allows the wait lines on CSx to stall the processor until data is ready (blocking the GPMC and the initiator-to-target access from the ARM core to the GPMC through L3).

    - MMU is configured with NOR flash as normal write-through cached, DDR and SRAM as normal write-through cached, and deviceX as strongly ordered, non-cached.
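
    For readers following along, here is a minimal sketch of how the two memory types in the last bullet could be encoded as 1 MB section entries in an ARMv7-A short-descriptor translation table. It assumes TEX remap is disabled, domain 0, and full read/write access; the names section_entry, MEM_STRONGLY_ORDERED and MEM_NORMAL_WT are hypothetical and are not taken from the bootloader discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Memory types from the list above (TEX remap assumed disabled). */
enum mem_type {
    MEM_STRONGLY_ORDERED, /* TEX=000, C=0, B=0 */
    MEM_NORMAL_WT         /* TEX=000, C=1, B=0: write-through, no write-allocate */
};

#define SECTION_TYPE  0x2u        /* bits[1:0] = 0b10 marks a section entry */
#define SECTION_C     (1u << 3)   /* C (cacheable) bit */
#define SECTION_AP_RW (3u << 10)  /* AP[1:0] = 0b11, AP[2] = 0: full access */

/* Build a first-level section descriptor for domain 0 (hypothetical helper). */
static uint32_t section_entry(uint32_t phys_addr, enum mem_type type)
{
    assert((phys_addr & 0x000FFFFFu) == 0); /* sections are 1 MB aligned */

    uint32_t entry = phys_addr | SECTION_AP_RW | SECTION_TYPE;
    if (type == MEM_NORMAL_WT)
        entry |= SECTION_C;
    return entry;
}
```

    With these assumptions, the NOR flash section at 0x08000000 would be built with MEM_NORMAL_WT and the deviceX section at 0x18000000 with MEM_STRONGLY_ORDERED.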

    So every thing will function OK in this configuration if I don't turn on L2 (L1 caching only).  I believe the reason is based on the fact the L2 module manages all memory accesses when cache is enabled (cached and not cached).  In this configuration, is it possible that the L2 can't handle the stalls that the CSx is causing?

    ISSUE:

    After doing only 2-6 accesses to CSx while running out of flash on CS0, I start to get a few different failure states if I keep power cycling.
    The first is random memory corruption in DDR.  I'm loading an image from flash into DDR and doing a CRC that passes right after the load.  Then I do an access to CSx (while instructions are executing out of CS0) and some words in the image's DDR location change.  Thus the CRC check I added after the CSx call fails.
    The second is the GPMC error type/address registers being populated with an invalid address (a memory address, not a GPMC address).  The address isn't valid for any place on the system, and it looks like it must have been a failure of a GPMC access attempted by the L2 controller to GPMC CS0 while I had a CSx access stalling.

    If I set up the MMU entry for NOR flash to be strongly ordered and not cached, the issue doesn't occur, but that drastically slows down my system when running from flash.  We're not sure if this is the correct way to resolve the issue (it could just be that we slow down execution enough that it works).  If that is the correct configuration, we were planning to move more of our boot code to DDR to speed it up and should be able to make it work.

    So I guess the simple question is how long of a GPMC wait signal stall can the L2 controller handle when it's managing the non cached memory access to that device?  Since if I disable L2 this all works and has worked for months....

    All Replies
    • 1962332
      Posted by 1962332
      on May 10 2012 08:21 AM
      Prodigy150 points

      We have applied the following ARM errata....


      460075
      430973
      458693

    • Brad Griffis
      Posted by Brad Griffis
      on May 11 2012 09:54 AM
      Guru57350 points

      Did you utilize this ROM service prior to enabling the L2?

      Can you post the code that enables the L2 cache for review?

      ---------------------------------------------------------------------------------------------------------

      Please click the Verify Answer button on this post if it answers your question.
      --------------------------------------------------------------------------------------------------------- 

    • Clayton Shotwell
      Posted by Clayton Shotwell
      on May 11 2012 12:49 PM
      Prodigy120 points

      We are currently not using the ROM service.  I will look into getting that enabled.

      Below is the code we are using to enable the L2 cache and MMU.

      // Invalidate I-Cache
      __asm__ volatile ("mcr  p15,0,%0,c7,c5,0" : : "r" (0x0));
      __asm__ volatile ("isb");

      /*
       * Setup the MMU Table here.  Leaving code out to save space.
       */

      /* Drain write buffer */
      __asm__ volatile ("mcr  p15,0,%0,c7,c10,4" : : "r" (0));

      /* TLB Flush */
      __asm__ volatile ("mcr  p15,0,%0,c8,c7,0" : : "r" (0));

      /* Load TTBR0 */
      __asm__ volatile ("mcr  p15,0,%0,c2,c0,0" : : "r" (TTBR0_BASEADDR((uint32 )trans_table) |
                                                             TTBR0_C                               |
                                                             TTBR0_RGN_WB_WA));

      // L2 Cache enable
      __asm__ volatile ("mrc p15,0,%0,c1,c0,1" : "=r" (ctrlReg) :);
      ctrlReg = ctrlReg | 0x2;
      __asm__ volatile ("mcr p15,0,%0,c1,c0,1" : : "r" (ctrlReg));

      /* Set domain access: 0xfffffffd makes domain 0 a client, domains 1-15 managers */
      __asm__ volatile ("mcr  p15,0,%0,c3,c0,0" : : "r" (0xfffffffd));

      /* Enable MMU, alignment, instruction cache, branch prediction, data cache */
      __asm__ volatile ("mrc  p15,0,%0,c1,c0,0" : "=r" (mmu_ctrl) :);
      mmu_ctrl |= (SCTLR_I | SCTLR_Z | SCTLR_C | SCTLR_A | SCTLR_M);
      __asm__ volatile ("mcr  p15,0,%0,c1,c0,0" : : "r" (mmu_ctrl));
      __asm__ volatile ("isb");

      // Invalidate I-Cache
      __asm__ volatile ("mcr  p15,0,%0,c7,c5,0" : : "r" (0x0));
      __asm__ volatile ("isb");
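
      A side note on the domain access value written above: the Domain Access Control Register holds one 2-bit field per domain, so the per-domain setting can be read off with a small helper. This is only an illustrative sketch; dacr_field and the enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings of one 2-bit field of the ARMv7 Domain Access Control Register. */
enum domain_access {
    DOMAIN_NO_ACCESS = 0,
    DOMAIN_CLIENT    = 1, /* accesses are checked against the AP bits */
    DOMAIN_RESERVED  = 2,
    DOMAIN_MANAGER   = 3  /* accesses are not checked */
};

/* Extract the access setting for one of the 16 domains (hypothetical helper). */
static enum domain_access dacr_field(uint32_t dacr, unsigned domain)
{
    assert(domain < 16);
    return (enum domain_access)((dacr >> (2u * domain)) & 0x3u);
}
```

      For 0xfffffffd this gives client access for domain 0 and manager access for domains 1 through 15.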

    • Clayton Shotwell
      Posted by Clayton Shotwell
      on May 11 2012 13:12 PM
      Prodigy120 points

      Looked more into using the ROM service to invalidate the L2 cache.  We are currently using the following code to achieve the same functionality.  We are disabling the L2 at this point and enabling it a little later when we set up the MMU.


          /* Invalidate L1 */
          mov     r0, #0                 
          mcr     p15, 0, r0, c8, c7, 0   /* Invalidate TLBs */
          isb
          mcr     p15, 0, r0, c7, c5, 0   /* Invalidate I-Cache */
          isb

          /* Invalidate L2 */
          mrc p15, 1, r0, c0, c0, 1
          ands r3, r0, #0x7000000
          mov r3, r3, lsr #23
          beq finished
          mov r10, #0

      loop1:
          add r2, r10, r10, lsr #1
          mov r1, r0, lsr r2
          and r1, r1, #7
          cmp r1, #2
          blt skip
          mcr p15, 2, r10, c0, c0, 0
          isb
          mrc p15, 1, r1, c0, c0, 0
          and r2, r1, #0x7
          add r2, r2, #4
          ldr r4, =0x3ff
          ands r4, r4, r1, lsr #3
          clz r5, r4
          ldr r7, =0x00007fff
          ands r7, r7, r1, lsr #13
      loop2:
          mov r9, r4
      loop3:
          orr r11, r10, r9, lsl r5
          orr r11, r11, r7, lsl r2
          mcr p15, 0, r11, c7, c6, 2
          subs r9, r9, #1
          bge loop3
          subs r7, r7, #1
          bge loop2
      skip:
          add r10, r10, #2
          cmp r3, r10
          bgt loop1
      finished:
         
          /* Turn off L2 Cache */
           isb
           MRC p15, 0, r0, c1, c0, 1
           and r0, r0, #0xFFFFFFFD
           MCR p15, 0, r0, c1, c0, 1
           isb
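
      For anyone reading the loop above, the following sketch shows in C how the set/way operand for the invalidate (mcr p15, 0, r11, c7, c6, 2) is assembled from the cache size ID register fields. It is an illustration only and runs on any host; setway_operand and clz32 are hypothetical names, and level is the zero-based cache level (0 for L1, 1 for L2), whereas the assembly keeps level*2 in r10.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros; returns 32 for x == 0, like the ARM clz instruction. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    while (n < 32 && !(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Build the operand for an invalidate-by-set/way operation (hypothetical helper). */
static uint32_t setway_operand(uint32_t ccsidr, unsigned level,
                               uint32_t way, uint32_t set)
{
    uint32_t line_shift = (ccsidr & 0x7u) + 4;      /* log2(line size in bytes) */
    uint32_t max_way    = (ccsidr >> 3) & 0x3FFu;   /* number of ways - 1 */
    uint32_t max_set    = (ccsidr >> 13) & 0x7FFFu; /* number of sets - 1 */
    unsigned way_shift  = clz32(max_way);           /* way index sits in the top bits */

    assert(level < 7 && way <= max_way && set <= max_set);

    uint32_t op = (set << line_shift) | ((uint32_t)level << 1);
    if (way_shift < 32) /* a direct-mapped cache has no way bits */
        op |= way << way_shift;
    return op;
}
```

      As an example, an 8-way cache with 64-byte lines and 512 sets would report max_way = 7, max_set = 511 and a line size field of 2, so the last line of level index 1 is addressed by way 7, set 511.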

    • Brad Griffis
      Posted by Brad Griffis
      on May 11 2012 13:27 PM
      Guru57350 points

      Clayton Shotwell
      Looked more into using the ROM service to invalidate the L2 cache.  We are currently using the following code to achieve the same functionality.

      It is absolutely required to use the ROM service.  It's not possible for you to achieve the same result because you cannot run in secure mode like the ROM code. 

    • Clayton Shotwell
      Posted by Clayton Shotwell
      on May 11 2012 13:51 PM
      Prodigy120 points

      I added the following lines just above the assembly I posted earlier.

      /* Use the ROM to invalidate the L2 cache */
      moveq r12, #0x1
      smc #1

      I have been working with the MMU configuration to see if there is a different configuration that might work better with our hardware setup.  I have tried configuring the NOR flash as strongly ordered or device memory; that allowed the bootloader to run, but it slowed everything way down.  Next I tried to configure the L1 and L2 cache policies separately to see if I could get better results.  I ended up configuring the L1 as non-cacheable and the L2 as write-through, no write-allocate (write-back with allocate or no-allocate also works) and not shareable (enabling the shareable bit in the MMU table slows everything down considerably).  This allows the bootloader to run without errors.  The configuration table I referenced for normal memory is on page 7-5 of the Cortex-A8 TRM, rev. r3p2, and another configuration table in section B3.8.2 of the ARMv7 Architecture Reference Manual details the register values I've set to configure the MMU.  I am not sure why I have to disable the caching in L1 to avoid the GPMC bad address errors, but it seems to help.
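
      To make the separate L1/L2 policy setting concrete, here is a minimal sketch of the section-descriptor attribute bits when TEX[2] = 1, where TEX[1:0] carries the outer policy and C,B carry the inner policy. It assumes TEX remap is disabled and, as in the paragraph above, that the inner attributes govern L1 and the outer attributes govern L2. The names cache_policy and section_cache_attr are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 2-bit cache policy encoding used for both inner and outer attributes. */
enum cache_policy {
    POLICY_NON_CACHEABLE = 0,
    POLICY_WB_WA         = 1, /* write-back, write-allocate */
    POLICY_WT_NO_WA      = 2, /* write-through, no write-allocate */
    POLICY_WB_NO_WA      = 3  /* write-back, no write-allocate */
};

/* Attribute bits of a section descriptor: TEX in [14:12], C in bit 3, B in bit 2. */
static uint32_t section_cache_attr(enum cache_policy inner, enum cache_policy outer)
{
    assert((unsigned)inner <= 3u && (unsigned)outer <= 3u);

    uint32_t tex = 0x4u | (uint32_t)outer; /* TEX[2] = 1, TEX[1:0] = outer policy */
    uint32_t cb  = (uint32_t)inner;        /* C = inner[1], B = inner[0] */
    return (tex << 12) | (cb << 2);
}
```

      Under these assumptions, the "L1 non-cacheable, L2 write-through" combination corresponds to section_cache_attr(POLICY_NON_CACHEABLE, POLICY_WT_NO_WA), which is then ORed into the section entry alongside the base address, access permissions and type bits.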

      To backtrack just a little bit for some more information on the errors I am seeing. The errors occur on the GPMC interface during reads and writes.  I am seeing bad address errors when I check the device before I try a read or a write to the deviceX (not the NOR flash device).  I checked the addresses from many different failures and none of the addresses fall into a valid address range that was configured in the GPMC config7_i register.  What's really odd is the address I am trying to access is not the address in the GPMC error address register (not even close). 

    • Brad Griffis
      Posted by Brad Griffis
      on May 11 2012 15:09 PM
      Guru57350 points

      Clayton Shotwell

      I added the following lines just above the assembly I posted earlier.

      /* Use the ROM to invalidate the L2 cache */
      moveq r12, #0x1
      smc #1

      I was hoping that would fix the issue, but in any case it's definitely the right thing to do so keep it there.

      Clayton Shotwell
      I checked the addresses from many different failures and none of the addresses fall into a valid address range that was configured in the GPMC config7_i register.

      Will you please provide the address you accessed (virtual and physical) and the address of the failure?  A few register values would be good too, e.g. GPMC_config, any registers pertinent to the error you're seeing, etc.

    • Brad Griffis
      Posted by Brad Griffis
      on May 11 2012 15:22 PM
      Guru57350 points

      One other issue comes to mind that I should mention.  When you enable cache for the bootloader you need to be very careful not to have data (i.e. the instructions) "stuck" inside the cache.  If you have configured write-through cache then that shouldn't be an issue as all levels of memory would be updated immediately.  However, in the case of write-back cache I think you would need to manually flush the data in order to "push" the instructions to the physical memory.

      Have you done any profiling of the boot sequence, i.e. back when everything worked fine but was slow, what was the biggest offender?  Was it the actual copying of the application from NOR flash to DDR?  Or was it the execution of some piece of initialization code?

      In order to narrow down your issue I think we should try to come up with a specific configuration and then work to debug it.  Right now there are a lot of variables changing and it's hard for me to understand what all is happening.  For example, I was thinking something like this:

      • DeviceX: Strongly Ordered
      • NOR flash: Normal (Cacheable/Bufferable), read allocate
      • DDR: Normal (Cacheable/Bufferable), read/write allocate, write-through

    • Clayton Shotwell
      Posted by Clayton Shotwell
      on May 11 2012 15:41 PM
      Prodigy120 points

      Here are a few of the address errors I have seen.  There doesn't seem to be any pattern in the values.  These errors occur on both reads and writes, and they do not always happen at the same point.

      Address Accessed in GPMC     Error Address from GPMC register
      0x18600006                   0x15F1ECC0
      0x18086020                   0x02BEDB40
      0x18600002                   0x35C7ED80
      0x18600006                   0x38FD6D90
      0x18600006                   0x17C7ED80
      0x18600002                   0x03726690
      Below are the address ranges I have mapped in the GPMC and the MMU mappings that are configured.
      Device     GPMC Address   GPMC Size   MMU Address   MMU Size
      NOR        0x08000000     64MB        0x08000000    16MB
                                            0x09000000    16MB
      DeviceX    0x18000000     128MB       0x18000000    16MB
                                            0x1C000000    16MB
                                            0x1D000000    16MB

      GPMC is being configured with smart idle and auto idle enabled.  DeviceX uses its wait pin while the NOR flash does not use its wait pin. 
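
      The check described above (does a logged error address fall inside any configured GPMC window?) can be sketched as follows. The two windows are the GPMC base addresses and sizes from the table in this post; in_gpmc_window and count_outside are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct window {
    uint32_t base;
    uint32_t size;
};

/* GPMC chip-select windows as listed in the table above. */
static const struct window gpmc_windows[] = {
    { 0x08000000u,  64u << 20 }, /* NOR,     64 MB  */
    { 0x18000000u, 128u << 20 }, /* DeviceX, 128 MB */
};

/* True if addr lies inside one of the configured windows. */
static bool in_gpmc_window(uint32_t addr)
{
    for (size_t i = 0; i < sizeof gpmc_windows / sizeof gpmc_windows[0]; i++) {
        /* Unsigned subtraction wraps for addr < base, so one compare suffices. */
        if (addr - gpmc_windows[i].base < gpmc_windows[i].size)
            return true;
    }
    return false;
}

/* Count how many of the given addresses fall outside every window. */
static size_t count_outside(const uint32_t *addrs, size_t n)
{
    assert(addrs != NULL || n == 0);

    size_t outside = 0;
    for (size_t i = 0; i < n; i++) {
        if (!in_gpmc_window(addrs[i]))
            outside++;
    }
    return outside;
}
```

      Feeding it the six error addresses listed above reports all of them as outside both windows, while the accessed addresses (for example 0x18600006) are inside the DeviceX window.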

    • 1962332
      Posted by 1962332
      on May 14 2012 09:14 AM
      Prodigy150 points

      Brad, Clayton is out this week and I'll be filling in.

      To continue with what you guys had already checked out....  We believe all the setup for cache and MMU is correct (very similar to other projects we've done and per the TRM for the most part).  Do we want to go through our use case of the GPMC interface and the really long processor stalls we're causing with the wait lines?  I believe that is our root cause, but I haven't been able to find enough supporting evidence to figure out how to work around or fix it.

      I have confirmed with my hardware guy that the GPMC CS3 wait line can be held from tens of microseconds up to a maximum of ~2.5 ms.

      If I set the GPMC module to time out, my transactions fail on CS3 (the one using the waits), which tells me the wait line is being held for longer than 6 us (the max GPMC wait that can be configured).  This in essence cancels my transactions, so I have to stay configured at the GPMC level with no timeout.  Next I looked at the L3 timeout settings.  When I calculate my max timeout on that interface, it's about 1.5 ms.  I don't know how to catch that interface's failure case if it's ever exceeded.  So my theory is that the GPMC stalls the L3 beyond its timeout, but the GPMC transaction still attempts to complete, which possibly leaves some unknown values locked in for the address, causing the GPMC access error.  (Can you confirm the behavior of each module if a wait line is held beyond the GPMC and L3 timeouts?  Any idea why this works with only L1 enabled?)

    • Brad Griffis
      Posted by Brad Griffis
      on May 15 2012 11:41 AM
      Guru57350 points

      Section 5.2.3.4.2 "Time-Out" of the OMAP3530 TRM discusses the various registers associated with configuring and detecting a timeout on the L3 interconnect.  Do you see a timeout error being logged in the L3_TA_AGENT_STATUS register for GPMC (0x6800 2428)?  Note that it is a 64-bit register, i.e. get the whole thing.

      What is the value of L3_TA_AGENT_CONTROL for the GPMC (0x6800 2420)?  Note that it is a 64-bit register, i.e. get the whole thing.

      Hopefully looking at the registers will show that the root cause of the issue is truly an L3 timeout.  If that's the case it seems like you would simply want to turn off the timeout.
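
      Since both registers are 64 bits wide, a dump needs both 32-bit halves. A minimal sketch of reading one, assuming the low word sits at the lower address and that two 32-bit reads are acceptable for a diagnostic dump; read_reg64 is a hypothetical helper, not an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit register as two 32-bit accesses (low word first, assumed layout). */
static uint64_t read_reg64(const volatile uint32_t *reg)
{
    assert(reg != NULL);

    uint64_t lo = reg[0];
    uint64_t hi = reg[1];
    return (hi << 32) | lo;
}

/*
 * On the target this would be called with the register address cast to a
 * pointer, e.g. read_reg64((const volatile uint32_t *)0x68002428u) for the
 * GPMC L3_TA_AGENT_STATUS register mentioned above.
 */
```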

    • 1962332
      Posted by 1962332
      on May 15 2012 16:31 PM
      Prodigy150 points

      The last time we checked, I believe it was showing one of the 5 MPU agent ids as the timeout reason in the status reg.  I'll get a dump of those for you.

      Can you completely turn off the timeout?  Would it truly stall the L2 controller, or is there some inherent timeout in that controller that we're going to run into?

    • Brad Griffis
      Posted by Brad Griffis
      on May 15 2012 16:51 PM
      Guru57350 points

      1962332
      Can you completely turn off the timeout?  Would it truly stall the L2 controller, or is there some inherent timeout in that controller that we're going to run into?

      It looks like there are a number of different time outs.  As far as I can tell both the GPMC and L3 Interconnect timeout can be turned off.  I did some quick searching of the ARM documentation and didn't see anything related to a timeout mechanism.  That said, it seems plausible that there might still be some kind of timeout mechanism in the Cortex A8 that I've not yet discovered.  Maybe I'm using the wrong word in my search or something...  That would be the only thing that would make any sense out of the fact that things work/break depending on how the cache is configured, etc.  We both need to dig some more to try and figure that out!

      In the meantime I think the register dumps related to the interconnect are a good place to start.  Once we can get rid of the timeouts at the interconnect, we can worry about the Cortex-A8.

    • Clayton Shotwell
      Posted by Clayton Shotwell
      on May 21 2012 14:35 PM
      Prodigy120 points

      Brad, I'm back from vacation and back to work.

      I did a register dump after disabling the L3 timeouts for the MPU, GPMC, and the RT (this one should disable all of the timeouts).  The timeouts are disabled right after the clocks are set up.  I get the following register values when I get a GPMC error.  I have two sets below: one is for a GPMC write error and the other is for a GPMC read error.  From what I have read, this looks like an error with an MPU transaction.  The MPU L3 Error Log register points to an initiator ID of 0x19, which is the MPU subsystem.  I'll start digging through the ARM documentation to see what I can find.

        IO Write Access Post Err @[0x18082042]
        GPMC ERR Addr  [0x0c9e8b80]
        GPMC ERR Type  [0x00000111]
        GPMC Status    [0x00000001]
        GPMC L3 Ctrl   [0x0000000003000000]
        GPMC L3 Status [0x0000000000000000]
        GPMC L3 Error  [0x0000000000000000]
        GPMC L3 ErrAddr[0x0000000000000000]
        MPU L3 Ctrl    [0x000000003e000000]
        MPU L3 Status  [0x0000000010000010]
        MPU L3 Error   [0x0000000082001901]
        MPU L3 ErrAddr [0x0000000047f20ac0]
        SI Flg Sts0    [0x0000000000000004]
        SI Flg Sts1    [0x0000000000000000]

        IO Read Access Post Err @[0x18082028]
        GPMC ERR Addr  [0x3f7ab580]
        GPMC ERR Type  [0x00000111]
        GPMC Status    [0x00000001]
        GPMC L3 Ctrl   [0x0000000003000000]
        GPMC L3 Status [0x0000000000000000]
        GPMC L3 Error  [0x0000000000000000]
        GPMC L3 ErrAddr[0x0000000000000000]
        MPU L3 Ctrl    [0x000000003e000000]
        MPU L3 Status  [0x0000000010000010]
        MPU L3 Error   [0x0000000084001900]
        MPU L3 ErrAddr [0x0000000000000000]
        SI Flg Sts0    [0x0000000000000004]
        SI Flg Sts1    [0x0000000000000000]

    • Brad Griffis
      Posted by Brad Griffis
      on May 22 2012 11:15 AM
      Guru57350 points

      Here are some notes I'm taking related to your first register dump.

      Looking at the GPMC related registers I see the following:

      Clayton Shotwell
        GPMC ERR Type  [0x00000111]

      I see the following errors:  ILLEGALMCMD and ERRORNOTSUPPADD, i.e. no timeout.
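
      As a rough sketch of that decode: the bit positions below are an assumption that is consistent with the value 0x111 and the flags named above, and should be checked against the GPMC_ERR_TYPE description in the TRM. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of GPMC_ERR_TYPE (bit positions assumed, verify against the TRM). */
struct gpmc_err_type {
    bool     error_valid;        /* bit 0 */
    bool     error_timeout;      /* bit 2 */
    bool     error_notsupp_mcmd; /* bit 3 */
    bool     error_notsupp_add;  /* bit 4 */
    unsigned illegal_mcmd;       /* bits [10:8] */
};

/* Split a raw GPMC_ERR_TYPE value into its fields. */
static struct gpmc_err_type decode_gpmc_err_type(uint32_t raw)
{
    struct gpmc_err_type e;
    e.error_valid        = (raw >> 0) & 1u;
    e.error_timeout      = (raw >> 2) & 1u;
    e.error_notsupp_mcmd = (raw >> 3) & 1u;
    e.error_notsupp_add  = (raw >> 4) & 1u;
    e.illegal_mcmd       = (raw >> 8) & 0x7u;
    return e;
}
```

      With this layout, 0x00000111 decodes to a valid error with ERRORNOTSUPPADD set, ILLEGALMCMD = 1, and the timeout flag clear.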

      Looking at the L3 interconnect related registers I see the following:

      Clayton Shotwell
        SI Flg Sts0    [0x0000000000000004]

      According to Table 5-29 "L3_SI_FLAG_STATUS_0 for Application Error" this is considered a "Functional Inband Error".

      Clayton Shotwell
        MPU L3 Status  [0x0000000010000010]

      I see INBAND_ERROR_PRIMARY and REQ_ACTIVE are set.

      Clayton Shotwell
        MPU L3 Error   [0x0000000082001901]

      This decodes as:

      MULTI=1 (There are other errors in addition to this one.)

      SECONDARY=0

      CODE=2 (Address hole according to Table 5-26 "CODE Field Definition")

      INITID=0x19

      CMD=1

      Clayton Shotwell
        MPU L3 ErrAddr [0x0000000047f20ac0]

      This looks like a Reserved/invalid address.  This doesn't even map to the GPMC which I expect is related to why we have MULTI=1 (i.e. there was probably another error that did actually get to the GPMC).
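
      The error log decode above can be sketched in C as follows. The field positions are inferred from the decoded values in this post (MULTI in bit 31, SECONDARY in bit 30, CODE in bits 27:24, INITID in bits 15:8, CMD in bits 2:0) and should be confirmed against the L3 interconnect chapter of the TRM; the names are hypothetical.

```c
#include <stdint.h>

/* Fields of the L3 agent error log (positions inferred, verify against the TRM). */
struct l3_error_log {
    unsigned multi;     /* bit 31: more than one error was logged */
    unsigned secondary; /* bit 30 */
    unsigned code;      /* bits [27:24]: error code */
    unsigned initid;    /* bits [15:8]: initiator ID */
    unsigned cmd;       /* bits [2:0]: command */
};

/* Split the low 32 bits of the 64-bit error log value into its fields. */
static struct l3_error_log decode_l3_error(uint64_t raw)
{
    uint32_t lo = (uint32_t)raw;
    struct l3_error_log e;
    e.multi     = (lo >> 31) & 0x1u;
    e.secondary = (lo >> 30) & 0x1u;
    e.code      = (lo >> 24) & 0xFu;
    e.initid    = (lo >> 8) & 0xFFu;
    e.cmd       = lo & 0x7u;
    return e;
}
```

      Applied to 0x0000000082001901 this yields MULTI=1, SECONDARY=0, CODE=2, INITID=0x19 and CMD=1, matching the notes above. The value from the read-error dump, 0x0000000084001900, yields MULTI=1, CODE=4, INITID=0x19 and CMD=0 under the same assumed layout.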
