AM3352: GPMC problem

Part Number: AM3352


Hi champs,

The customer has reached a reproducible state in which the GPMC controller does not behave as configured.

The first accesses proceed as expected; only after a certain time does the GPMC controller stop driving the Write_Enable signal.

Question:

What can cause the GPMC controller to go completely out of step and ignore the GPMC settings?

 

Below is a capture of the error:

 

The following analog measurement of the write_enable signal shows that the GPMC controller does not pull the signal low:

 

A review of the GPMC timing registers, both before the error and in the error state, shows no change to the GPMC timing that could explain this.

 

Another insight:

The problem can be avoided by adding an ARM DSB (Data Synchronization Barrier) instruction.

However, this is not possible during a DMA transfer.

 

Can you help us here?

  • Please post what software is used, and which version.

  • Hi Biser,

     

    ...got that from the software developer

    Seemingly a simple loop for memory copy

     

    We are writing in a while loop to a RAM inside the FPGA. The code looks like this:

      while (n-- != 0)
      {
        *pLocalDestination++ = *pLocalSource++;
      }
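For reference, the DSB workaround mentioned in the first post could look roughly like the sketch below. The thread does not say where the barrier was placed or what the pointer types are, so this assumes one barrier after every store and 16-bit accesses (matching the 16-bit DEVICESIZE in the register dump). The names `copy_with_barrier` and `DATA_SYNC_BARRIER` are hypothetical, and the non-ARM branch exists only so the sketch builds on a host machine.

```c
#include <stddef.h>
#include <stdint.h>

/* On ARM, issue a full-system DSB. Elsewhere, fall back to a generic
 * compiler/CPU fence (GCC/Clang builtin) so the sketch still compiles. */
#if defined(__arm__) || defined(__aarch64__)
#define DATA_SYNC_BARRIER() __asm__ volatile("dsb sy" ::: "memory")
#else
#define DATA_SYNC_BARRIER() __sync_synchronize()
#endif

/* Copy n 16-bit words to the FPGA RAM, waiting for each store to
 * complete before the next one is issued (assumed barrier placement). */
static void copy_with_barrier(volatile uint16_t *dst, const uint16_t *src,
                              size_t n)
{
    while (n-- != 0) {
        *dst++ = *src++;
        DATA_SYNC_BARRIER();
    }
}
```

A barrier per store serializes the accesses and costs throughput, which is presumably why it is not a real fix; it also cannot be applied when a DMA engine generates the writes.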

  • Hi DJ-NG,

    Interesting.

    I see CS and ALE have gone low like the beginning of all the previous GPMC cycles, but WEn did not go low as before. Why would that happen?

    Can you provide a register dump of all the GPMC Registers from the CCS register window? A dump before the issue occurs and after would be most helpful.

    Since WAITn still became asserted low, the GPMC state machine might be frozen until WAIT is released. For debugging can you force WAITn high just to see if the GPMC resumes after WAIT is released?

    The real question is why didn't WEn go low before the WAIT signal, as it should have?

    Is it possible the GPMC is performing a read instead of a write at the time you have circled? You could easily test this by probing the OEn signal as well as the WEn signal (making sure that OEn is pinmuxed correctly and validly programmed in the GPMC registers). It should go low instead of WEn if a read has been issued to the GPMC.

    What is the repeatability of this behavior? Does it correlate with any other event? Does it happen at the same address? Does it occur at the end of the loop? Could there be any other interrupt in this application? Did you try using DMA?

    Is any other part of the program changing pin mux? Can you check pinmux registers after the issue occurs and compare them to before the issue occurred? Are any error bits set in GPMC_ERR_TYPE or GPMC_ERR_ADDRESS?

    How have you programmed MMU and cache?

    Regards,
    Mark

  • Mark Mckeown said:

    Hi DJ-NG,

    Interesting.

    I see CS and ALE have gone low like the beginning of all the previous GPMC cycles, but WEn did not go low as before. Why would that happen?

    Can you provide a register dump of all the GPMC Registers from the CCS register window? A dump before the issue occurs and after would be most helpful.

    Yes, here are the dumps of GPMC_CONFIG, GPMC_STATUS and the relevant GPMC_CS0 registers:

    Before the issue (dump from the Lauterbach debugger):
    
       GPMC (General Purpose Memory Controller)
    
         Miscellaneous Registers
         GPMC_REVISION          00000060  REV                      60
         GPMC_SYSCONFIG         00000011  SIDLEMODE                Smart-idle
                                          SOFTRESET                Normal
                                          AUTOIDLE                 Applied
         GPMC_SYSSTATUS         00000001  RESETDONE                Completed
         GPMC_IRQSTATUS         00000000  WAIT1EDGEDETECTIONSTATUS Not detected
                                          WAIT0EDGEDETECTIONSTATUS Not detected
                                          TERMINALCOUNTSTATUS      >0
                                          FIFOEVENTSTATUS          <FIFOTHRESHOLD
         GPMC_IRQENABLE         00000000  WAIT1EDGEDETECTIONENABLE Masked
                                          WAIT0EDGEDETECTIONENABLE Masked
                                          TERMINALCOUNTEVENTENABLE Masked
                                          FIFOEVENTENABLE          Masked
         GPMC_TIMEOUT_CONTROL   00001FF0  TIMEOUTSTARTVALUE        01FF
                                          TIMEOUTENABLE            Disabled
         GPMC_ERR_ADDRESS       00000000  ILLEGALADD               00000000
         GPMC_ERR_TYPE          00000000  ILLEGALMCMD              0
                                          ERRORNOTSUPPADD          No error
                                          ERRORNOTSUPPMCMD         No error
                                          ERRORTIMEOUT             No error
                                          ERRORVALID               Not valid
         GPMC_CONFIG            00000800  WAIT1PINPOLARITY         Low
                                          WAIT0PINPOLARITY         Low
                                          WRITEPROTECT             Low
                                          LIMITEDADDRESS           Not supported
                                          NANDFORCEPOSTEDWRITE     Disabled
         GPMC_STATUS            00000201  WAIT1STATUS              De-asserted
                                          WAIT0STATUS              Asserted
                                          EMPTYWRITEBUFFERSTATUS   Empty
         GPMC_PREFETCH_CONFIG1  00004000  CYCLEOPTIMIZATION        0
                                          ENABLEOPTIMIZEDACCESS    Disabled
                                          ENGINECSSELECTOR         /CS0
                                          PFPWENROUNDROBIN         Disabled
                                          PFPWWEIGHTEDPRIO         1
                                          FIFOTHRESHOLD            40
                                          ENABLEENGINE             Disabled
                                          WAITPINSELECTOR          Wait0EdgeDetection
                                          SYNCHROMODE              StartEngine set
                                          DMAMODE                  Interrupt
                                          ACCESSMODE               Prefetch read
         GPMC_PREFETCH_CONFIG2  00000000  TRANSFERCOUNT            0000
         GPMC_PREFETCH_CONTROL  00000000  STARTENGINE              Stopped
         GPMC_PREFETCH_STATUS   00000000  FIFOPOINTER              00
                                          FIFOTHRESHOLDSTATUS      <=FIFOTHRESHOLD
                                          COUNTVALUE               0000
         GPMC_ECC_CONFIG        00001030  ECCALGORITHM             Hamming
                                          ECCBCHTSEL               t=8
                                          ECCWRAPMODE              0
                                          ECC16B                   8 columns
                                          ECCTOPSECTOR             4
                                          ECCCS                    Chip-select 0
                                          ECCENABLE                Disabled
         GPMC_ECC_CONTROL       00000000  ECCCLEAR                 No effect
                                          ECCPOINTER               ECC engine disabled
         GPMC_ECC_SIZE_CONFIG   FFFFF000  ECCSIZE1       FF        ECCSIZE0       FF
                                          ECC9RESULTSIZE ECCSize0  ECC8RESULTSIZE ECCSize0
                                          ECC7RESULTSIZE ECCSize0  ECC6RESULTSIZE ECCSize0
                                          ECC5RESULTSIZE ECCSize0  ECC4RESULTSIZE ECCSize0
                                          ECC3RESULTSIZE ECCSize0  ECC2RESULTSIZE ECCSize0
                                          ECC1RESULTSIZE ECCSize0
         GPMC_BCH_SWDATA        00000000  BCH_DATA       0000
    
         Chip Select #0
         GPMC_CONFIG1_CS0       00611200  WRAPBURST                Not supported
                                          READMULTIPLE             Single
                                          READTYPE                 Asynchronous
                                          WRITEMULTIPLE            Single
                                          WRITETYPE                Asynchronous
                                          CLKACTIVATIONTIME        At StartAccess
                                          ATTACHEDDEVICEPAGELENGTH 4 Words
                                          WAITREADMONITORING       Monitored
                                          WAITWRITEMONITORING      Monitored
                                          WAITMONITORINGTIME       With valid data
                                          WAITPINSELECT            WAIT1
                                          DEVICESIZE               16 bit
                                          DEVICETYPE               NOR Flash like
                                          MUXADDDATA               Addr/data-multiplexed
                                          TIMEPARAGRANULARITY      x1 latencies
                                          GPMCFCLKDIVIDER          1
         GPMC_CONFIG2_CS0       00070800  CSWROFFTIME              7
                                          CSRDOFFTIME              8
                                          CSEXTRADELAY             Not delayed
                                          CSONTIME                 0
         GPMC_CONFIG3_CS0       00010100  ADVAADMUXWROFFTIME       0
                                          ADVAADMUXRDOFFTIME       0
                                          ADVWROFFTIME             1
                                          ADVRDOFFTIME             1
                                          ADVEXTRADELAY            Not delayed
                                          ADVAADMUXONTIME          0
                                          ADVONTIME                0
         GPMC_CONFIG4_CS0       07020881  WEOFFTIME                7
                                          WEEXTRADELAY             Not delayed
                                          WEONTIME                 2
                                          OEAADMUXOFFTIME          0
                                          OEOFFTIME                8
                                          OEEXTRADELAY             Delayed
                                          OEAADMUXONTIME           0
                                          OEONTIME                 1
         GPMC_CONFIG5_CS0       0008090A  PAGEBURSTACCESSTIME      0
                                          RDACCESSTIME             8
                                          WRCYCLETIME              9
                                          RDCYCLETIME              10
         GPMC_CONFIG6_CS0       870201C0  WRACCESSTIME             7
                                          WRDATAONADMUXBUS         2
                                          CYCLE2CYCLEDELAY         1
                                          CYCLE2CYCLESAMECSEN      Delay
                                          CYCLE2CYCLEDIFFCSEN      Delay
                                          BUSTURNAROUND            0
         GPMC_CONFIG7_CS0       00000F41  MASKADDRESS              16MB
                                          CSVALID                  Enabled
                                          BASEADDRESS              01
         GPMC_NAND_COMMAND_CS0  XXXXXXXX
         GPMC_NAND_ADDRESS_CS0  XXXXXXXX
         GPMC_NAND_DATA_CS0     AA5555AA

    Dump after the issue:
    
       GPMC (General Purpose Memory Controller)
    
         Miscellaneous Registers
         GPMC_REVISION          00000060  REV                      60
         GPMC_SYSCONFIG         00000011  SIDLEMODE                Smart-idle
                                          SOFTRESET                Normal
                                          AUTOIDLE                 Applied
         GPMC_SYSSTATUS         00000001  RESETDONE                Completed
         GPMC_IRQSTATUS         00000000  WAIT1EDGEDETECTIONSTATUS Not detected
                                          WAIT0EDGEDETECTIONSTATUS Not detected
                                          TERMINALCOUNTSTATUS      >0
                                          FIFOEVENTSTATUS          <FIFOTHRESHOLD
         GPMC_IRQENABLE         00000000  WAIT1EDGEDETECTIONENABLE Masked
                                          WAIT0EDGEDETECTIONENABLE Masked
                                          TERMINALCOUNTEVENTENABLE Masked
                                          FIFOEVENTENABLE          Masked
         GPMC_TIMEOUT_CONTROL   00001FF0  TIMEOUTSTARTVALUE        01FF
                                          TIMEOUTENABLE            Disabled
         GPMC_ERR_ADDRESS       00000000  ILLEGALADD               00000000
         GPMC_ERR_TYPE          00000000  ILLEGALMCMD              0
                                          ERRORNOTSUPPADD          No error
                                          ERRORNOTSUPPMCMD         No error
                                          ERRORTIMEOUT             No error
                                          ERRORVALID               Not valid
         GPMC_CONFIG            00000800  WAIT1PINPOLARITY         Low
                                          WAIT0PINPOLARITY         Low
                                          WRITEPROTECT             Low
                                          LIMITEDADDRESS           Not supported
                                          NANDFORCEPOSTEDWRITE     Disabled
         GPMC_STATUS            00000201  WAIT1STATUS              De-asserted
                                          WAIT0STATUS              Asserted
                                          EMPTYWRITEBUFFERSTATUS   Empty
         GPMC_PREFETCH_CONFIG1  00004000  CYCLEOPTIMIZATION        0
                                          ENABLEOPTIMIZEDACCESS    Disabled
                                          ENGINECSSELECTOR         /CS0
                                          PFPWENROUNDROBIN         Disabled
                                          PFPWWEIGHTEDPRIO         1
                                          FIFOTHRESHOLD            40
                                          ENABLEENGINE             Disabled
                                          WAITPINSELECTOR          Wait0EdgeDetection
                                          SYNCHROMODE              StartEngine set
                                          DMAMODE                  Interrupt
                                          ACCESSMODE               Prefetch read
         GPMC_PREFETCH_CONFIG2  00000000  TRANSFERCOUNT            0000
         GPMC_PREFETCH_CONTROL  00000000  STARTENGINE              Stopped
         GPMC_PREFETCH_STATUS   00000000  FIFOPOINTER              00
                                          FIFOTHRESHOLDSTATUS      <=FIFOTHRESHOLD
                                          COUNTVALUE               0000
         GPMC_ECC_CONFIG        00001030  ECCALGORITHM             Hamming
                                          ECCBCHTSEL               t=8
                                          ECCWRAPMODE              0
                                          ECC16B                   8 columns
                                          ECCTOPSECTOR             4
                                          ECCCS                    Chip-select 0
                                          ECCENABLE                Disabled
         GPMC_ECC_CONTROL       00000000  ECCCLEAR                 No effect
                                          ECCPOINTER               ECC engine disabled
         GPMC_ECC_SIZE_CONFIG   FFFFF000  ECCSIZE1       FF        ECCSIZE0       FF
                                          ECC9RESULTSIZE ECCSize0  ECC8RESULTSIZE ECCSize0
                                          ECC7RESULTSIZE ECCSize0  ECC6RESULTSIZE ECCSize0
                                          ECC5RESULTSIZE ECCSize0  ECC4RESULTSIZE ECCSize0
                                          ECC3RESULTSIZE ECCSize0  ECC2RESULTSIZE ECCSize0
                                          ECC1RESULTSIZE ECCSize0
         GPMC_BCH_SWDATA        00000000  BCH_DATA       0000
    
         Chip Select #0
         GPMC_CONFIG1_CS0       00611200  WRAPBURST                Not supported
                                          READMULTIPLE             Single
                                          READTYPE                 Asynchronous
                                          WRITEMULTIPLE            Single
                                          WRITETYPE                Asynchronous
                                          CLKACTIVATIONTIME        At StartAccess
                                          ATTACHEDDEVICEPAGELENGTH 4 Words
                                          WAITREADMONITORING       Monitored
                                          WAITWRITEMONITORING      Monitored
                                          WAITMONITORINGTIME       With valid data
                                          WAITPINSELECT            WAIT1
                                          DEVICESIZE               16 bit
                                          DEVICETYPE               NOR Flash like
                                          MUXADDDATA               Addr/data-multiplexed
                                          TIMEPARAGRANULARITY      x1 latencies
                                          GPMCFCLKDIVIDER          1
         GPMC_CONFIG2_CS0       00070800  CSWROFFTIME              7
                                          CSRDOFFTIME              8
                                          CSEXTRADELAY             Not delayed
                                          CSONTIME                 0
         GPMC_CONFIG3_CS0       00010100  ADVAADMUXWROFFTIME       0
                                          ADVAADMUXRDOFFTIME       0
                                          ADVWROFFTIME             1
                                          ADVRDOFFTIME             1
                                          ADVEXTRADELAY            Not delayed
                                          ADVAADMUXONTIME          0
                                          ADVONTIME                0
         GPMC_CONFIG4_CS0       07020881  WEOFFTIME                7
                                          WEEXTRADELAY             Not delayed
                                          WEONTIME                 2
                                          OEAADMUXOFFTIME          0
                                          OEOFFTIME                8
                                          OEEXTRADELAY             Delayed
                                          OEAADMUXONTIME           0
                                          OEONTIME                 1
         GPMC_CONFIG5_CS0       0008090A  PAGEBURSTACCESSTIME      0
                                          RDACCESSTIME             8
                                          WRCYCLETIME              9
                                          RDCYCLETIME              10
         GPMC_CONFIG6_CS0       870201C0  WRACCESSTIME             7
                                          WRDATAONADMUXBUS         2
                                          CYCLE2CYCLEDELAY         1
                                          CYCLE2CYCLESAMECSEN      Delay
                                          CYCLE2CYCLEDIFFCSEN      Delay
                                          BUSTURNAROUND            0
         GPMC_CONFIG7_CS0       00000F41  MASKADDRESS              16MB
                                          CSVALID                  Enabled
                                          BASEADDRESS              01
         GPMC_NAND_COMMAND_CS0  XXXXXXXX
         GPMC_NAND_ADDRESS_CS0  XXXXXXXX
         GPMC_NAND_DATA_CS0     AA5555AA

    We cannot see any differences between the dumps before and after the issue.

    Mark Mckeown said:

    Since WAITn still became asserted low, the GPMC state machine might be frozen until WAIT is released. For debugging can you force WAITn high just to see if the GPMC resumes after WAIT is released?

    Yes, the GPMC is frozen, and so is the rest of the SoC. Access via a hardware debugger is also impossible in the error state. Only an external reset of the external device (FPGA) clears the blocked state.

    Here is the state of the GPMC pins after resetting the external device:

    Mark Mckeown said:

    The real question is why didn't WEn go low before the WAIT signal, as it should have?

    Is it possible the GPMC is performing a read instead of a write at the time you have circled? You could easily test this by probing the OEn signal as well as the WEn signal (making sure that OEn is pinmuxed correctly and validly programmed in the GPMC registers). It should go low instead of WEn if a read has been issued to the GPMC.

    The "gpmc_oen_ren" pin is already traced in the first picture of this thread (the same picture is shown again above). There it is called GPMC_nRE (green waveform).

    Mark Mckeown said:

    What is the repeatability of this behavior? Does it correlate with any other event? Does it happen at the same address? Does it occur at the end of the loop? Could there be any other interrupt in this application? Did you try using DMA?

    The behavior is highly repeatable. We experimented with the GPMC_TIMEOUT_CONTROL register. The benefit of this register is that the last access address is latched in the GPMC_ERR_* registers when the timeout fires. There we see that it is always the same address in the external memory.
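As an illustration of the timeout experiment described above, here is a minimal sketch of how the GPMC_TIMEOUT_CONTROL value could be composed and how a latched timeout could be recognised. The TIMEOUT_CONTROL field positions are inferred from the dump in this thread (0x00001FF0 decodes to TIMEOUTSTARTVALUE 0x1FF with the timeout disabled). The GPMC_ERR_TYPE bit positions (ERRORVALID at bit 0, ERRORTIMEOUT at bit 2) are an assumption and should be checked against the AM335x TRM. All macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* GPMC_TIMEOUT_CONTROL layout, inferred from the register dump. */
#define GPMC_TIMEOUT_ENABLE      (1u << 0)
#define GPMC_TIMEOUT_START_SHIFT 4
#define GPMC_TIMEOUT_START_MASK  0x1FFu

/* GPMC_ERR_TYPE bits (assumed positions, verify against the TRM). */
#define GPMC_ERR_VALID   (1u << 0)
#define GPMC_ERR_TIMEOUT (1u << 2)

/* Build a GPMC_TIMEOUT_CONTROL value from a start count and enable flag. */
static uint32_t gpmc_timeout_control(uint32_t start_value, bool enable)
{
    uint32_t reg = (start_value & GPMC_TIMEOUT_START_MASK)
                   << GPMC_TIMEOUT_START_SHIFT;
    if (enable)
        reg |= GPMC_TIMEOUT_ENABLE;
    return reg;
}

/* True when GPMC_ERR_TYPE reports a valid, latched timeout error, in
 * which case GPMC_ERR_ADDRESS should hold the offending address. */
static bool gpmc_timeout_latched(uint32_t err_type)
{
    return (err_type & GPMC_ERR_VALID) && (err_type & GPMC_ERR_TIMEOUT);
}
```

With the start value from the dump, `gpmc_timeout_control(0x1FF, true)` differs from the dumped register only in the enable bit.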

    Mark Mckeown said:

    Is any other part of the program changing pin mux? Can you check pinmux registers after the issue occurs and compare them to before the issue occurred? Are any error bits set in GPMC_ERR_TYPE or GPMC_ERR_ADDRESS?

    Yes, we have checked both. There are no changes in the pinmux registers before/after the issue, and no hints in the GPMC_ERR_* registers except when the GPMC_TIMEOUT_CONTROL register is used.

    Mark Mckeown said:

    How have you programmed MMU and cache?

    We use Linux and map the memory into user space through the UIO framework. A look at the source code shows that the memory is mapped as strongly-ordered memory.

    Any other hints? What about the note in the first post that the issue disappears when an ARM DSB (Data Synchronization Barrier) instruction is used? Does this indicate a misconfiguration, for example?

    Thank you for the support.

  • Hello Oleg,

    Sorry for the delay in this response. Thanks for supplying the information I requested.

    I agree there appear to be no differences in the register dump before or after the issue has occurred.

    I don't expect it to impact the behavior, but the rule checker flagged the following timing checks:

    * WRACCESSTIME < CSWROFFTIME

    * WEOFFTIME < CSWROFFTIME

    * Rule 7. Regardless of WAITWRITEMONITORING and GPMCFCLKDIVIDER, WEOFFTIME and CSWROFFTIME must be greater than or equal to WRACCESSTIME+1

    * Rule 8. WrCycleTime must be strictly greater than all the Off times of the control signals (OeOffTime, CsRdOffTime, CsWrOffTime, AdvRdOffTime, AdvWrOffTime, WeOffTime), plus the possible extra delays added (CsExtraDelay, AdvExtraDelay, WeExtraDelay, OeExtraDelay).

    To clear up the flags, extend WEOFFTIME to 8, CSWROFFTIME to 9, WRCYCLETIME to 10
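The four flagged checks can be reproduced from the raw register values in the dump. The following is only a minimal sketch, not the rule checker referred to above. The bit-field positions are chosen so that they decode to the same values as the Lauterbach dump, and should still be verified against the TRM. It assumes each extra delay adds half a GPMC_FCLK cycle, so Rule 8 is evaluated in half-cycle units. All names are hypothetical.

```c
#include <stdint.h>

/* Extract bits [msb:lsb] of a 32-bit register value. */
static uint32_t field(uint32_t reg, unsigned msb, unsigned lsb)
{
    return (reg >> lsb) & ((1u << (msb - lsb + 1u)) - 1u);
}

static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* One flag per violated check. */
enum {
    FAIL_WRACCESS_LT_CSWROFF = 1 << 0, /* WRACCESSTIME < CSWROFFTIME */
    FAIL_WEOFF_LT_CSWROFF    = 1 << 1, /* WEOFFTIME < CSWROFFTIME */
    FAIL_RULE7               = 1 << 2,
    FAIL_RULE8               = 1 << 3
};

/* Takes GPMC_CONFIG2..6 of one chip select, returns the violated checks. */
static unsigned gpmc_check_write_timing(uint32_t cfg2, uint32_t cfg3,
                                        uint32_t cfg4, uint32_t cfg5,
                                        uint32_t cfg6)
{
    uint32_t cswroff  = field(cfg2, 20, 16), csrdoff  = field(cfg2, 12, 8);
    uint32_t csextra  = field(cfg2, 7, 7);
    uint32_t advwroff = field(cfg3, 20, 16), advrdoff = field(cfg3, 12, 8);
    uint32_t advextra = field(cfg3, 7, 7);
    uint32_t weoff    = field(cfg4, 28, 24), weextra  = field(cfg4, 23, 23);
    uint32_t oeoff    = field(cfg4, 12, 8),  oeextra  = field(cfg4, 7, 7);
    uint32_t wrcycle  = field(cfg5, 12, 8);
    uint32_t wraccess = field(cfg6, 28, 24);
    uint32_t latest = 0; /* latest off time, in half cycles */
    unsigned fails = 0;

    if (!(wraccess < cswroff))
        fails |= FAIL_WRACCESS_LT_CSWROFF;
    if (!(weoff < cswroff))
        fails |= FAIL_WEOFF_LT_CSWROFF;
    if (weoff < wraccess + 1u || cswroff < wraccess + 1u)
        fails |= FAIL_RULE7;

    latest = max_u32(latest, 2u * oeoff + oeextra);
    latest = max_u32(latest, 2u * csrdoff + csextra);
    latest = max_u32(latest, 2u * cswroff + csextra);
    latest = max_u32(latest, 2u * advrdoff + advextra);
    latest = max_u32(latest, 2u * advwroff + advextra);
    latest = max_u32(latest, 2u * weoff + weextra);
    if (!(2u * wrcycle > latest))
        fails |= FAIL_RULE8;

    return fails;
}
```

Under these assumptions, the dumped values (CONFIG2 0x00070800, CONFIG4 0x07020881, CONFIG5 0x0008090A, CONFIG6 0x870201C0) trip the first three checks, because WEOFFTIME, CSWROFFTIME and WRACCESSTIME are all 7. Rule 8 passes in the dump as it stands. WRCYCLETIME only needs to grow to 10 because CSWROFFTIME is raised to 9.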

    =-=-=-=-

    With the ARM-DSB (Data Synchronization Barrier) playing a role, I'm starting to feel that this issue may be related to MMU/cache settings.
     
    Refer to this similar post:     https://e2e.ti.com/support/processors/f/791/t/316742

    Generally the recommendation is to make it bufferable, non-cacheable device memory, shareable, and no execute permissions (or equivalent)

    The MMU settings we recently tested with on AM572x are below (they may be different terminology to AM335x)

    mmuAttr0.accPerm = 0U; // 0: R/W at PL1             
    mmuAttr0.noExecute = 1U;
    mmuAttr0.attrIndx = 0U; //0U: non-cacheable normal memory  
    mmuAttr0.shareable = 0U; //0: not shareable        

    Possibly relevant threads:
    https://e2e.ti.com/support/processors/f/791/t/515268?AM335x-MMU-settings-in-ISDK

    https://e2e.ti.com/support/processors/f/791/t/579072?AM5728-GPMC-problem#pi320966=2

    https://e2e.ti.com/support/processors/f/791/t/193846?Program-crashes-when-enabling-I-and-D-cache-

    =-=-=-=-

    You said that the issue occurs when the same address is issued to the memory. What is that address? I cannot see it in the waveforms.


    I had a concern about MASKADDRESS being set to 16MB and wanted to recommend something larger. However, since the expected CS goes low when the problem occurs, I do not expect that the address has crossed into the next chip-select region and picked up that chip select's GPMC configuration. Worth a sanity check, however...

    One other sanity check... in the registers, WAITPINSELECT is mapped to WAIT1 but the scope shots have label "WAIT0n" - I assume it is a labeling mixup, but I have to ask.

    I'll loop in a member of the software team who can better support the MMU and cache configurations.

    Regards,
    Mark

  • Refer to ARMv7-A MMU settings (in DEN0013D_cortex_a_series_PG.pdf). This should apply to ARM Cortex-A8.

    Regards,
    Mark

  • Mark Mckeown said:

    Hello Oleg,

    Hi Mark,

    Mark Mckeown said:

    I don't expect it to impact the behavior, but the rule checker flagged the following timing checks:

    Is this "rule checker" tool available to us somewhere? I think it could be very helpful during development.

    Mark Mckeown said:


    * WRACCESSTIME < CSWROFFTIME

    * WEOFFTIME < CSWROFFTIME

    * Rule 7. Regardless of WAITWRITEMONITORING and GPMCFCLKDIVIDER, WEOFFTIME and CSWROFFTIME must be greater than or equal to WRACCESSTIME+1

    * Rule 8. WrCycleTime must be strictly greater than all the Off times of the control signals (OeOffTime, CsRdOffTime, CsWrOffTime, AdvRdOffTime, AdvWrOffTime, WeOffTime), plus the possible extra delays added (CsExtraDelay, AdvExtraDelay, WeExtraDelay, OeExtraDelay).

    To clear up the flags, extend WEOFFTIME to 8, CSWROFFTIME to 9, WRCYCLETIME to 10

    We extended the timings to the values you recommended. The failure still exists. But thanks for the hint, we have adapted our timings...

    Mark Mckeown said:

    With the ARM-DSB (Data Synchronization Barrier) playing a role, I'm starting to feel that this issue may be related to MMU/cache settings.
     
    Refer to this similar post:     https://e2e.ti.com/support/processors/f/791/t/316742

    Generally the recommendation is to make it bufferable, non-cacheable device memory, shareable, and no execute permissions (or equivalent)

    The MMU settings we recently tested with on AM572x are below (they may be different terminology to AM335x)

    mmuAttr0.accPerm = 0U; // 0: R/W at PL1             
    mmuAttr0.noExecute = 1U;
    mmuAttr0.attrIndx = 0U; //0U: non-cacheable normal memory  
    mmuAttr0.shareable = 0U; //0: not shareable        

    Possibly relevant threads:
    https://e2e.ti.com/support/processors/f/791/t/515268?AM335x-MMU-settings-in-ISDK

    https://e2e.ti.com/support/processors/f/791/t/579072?AM5728-GPMC-problem#pi320966=2

    https://e2e.ti.com/support/processors/f/791/t/193846?Program-crashes-when-enabling-I-and-D-cache-

    We had already considered a possible MMU configuration issue and checked our settings for the mapping of the FPGA memory. As I said, we are using Linux (v4.9.146-rt125) and the UIO framework to map the FPGA memory into user space. The mmap implementation in the UIO framework is:

    drivers/uio/uio.c +661
    
    static int uio_mmap_physical(struct vm_area_struct *vma)                         
    {                                                                                
            struct uio_device *idev = vma->vm_private_data;                          
            int mi = uio_find_mem_index(vma);                                           
            struct uio_mem *mem;                                                        
            if (mi < 0)                                                              
                    return -EINVAL;                                                     
            mem = idev->info->mem + mi;                                                 
                                                                                        
            if (mem->addr & ~PAGE_MASK)                                                 
                    return -ENODEV;                                                  
            if (vma->vm_end - vma->vm_start > mem->size)                             
                    return -EINVAL;                                                  
                                                                                        
            vma->vm_ops = &uio_physical_vm_ops;                                      
            vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);                                             
                                                                                        
            /*                                                                       
             * We cannot use the vm_iomap_memory() helper here,                      
             * because vma->vm_pgoff is the map index we looked                         
             * up above in uio_find_mem_index(), rather than an                         
             * actual page offset into the mmap.                                     
             *                                                                       
             * So we just do the physical mmap without a page                        
             * offset.                                                               
             */                                                                      
            return remap_pfn_range(vma,                                                 
                                   vma->vm_start,                                    
                                   mem->addr >> PAGE_SHIFT,                          
                                   vma->vm_end - vma->vm_start,                      
                                   vma->vm_page_prot);                               
    }                                                                                   
    

    where

    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

    sets the memory flags to strongly-ordered. With CONFIG_ARM_PTDUMP and the attached patch, which also dumps user-space page tables, the flags are reported as (arch/arm/mm/dump.c):

    # 0xb5ad2000-0xb6ad2000          16M USR RW NX     SO/UNCACHED

    We also switched the flags to device memory, but the failure still occurs. My understanding is that strongly-ordered is also fine for this mapping, and that device memory is only a matter of performance because it is bufferable, right?

    Mark Mckeown said:

    You said that the issue occurs when the same address is issued to the memory. What is that address? I cannot see it in the waveforms.

    The address is a specific RAM address in the FPGA.

    Mark Mckeown said:

    I had a concern about MASKADDRESS being set to 16MB and wanted to recommend something larger. However, since the expected CS goes low when the problem occurs, I do not expect that the address has crossed into the next chip-select region and picked up that chip select's GPMC configuration. Worth a sanity check, however...

    We changed the CS region to 256MB, but the failure still occurs.

    Mark Mckeown said:

    One other sanity check... in the registers, WAITPINSELECT is mapped to WAIT1 but the scope shots have label "WAIT0n" - I assume it is a labeling mixup, but I have to ask.

    Oh, yes, you are right. The waveforms were captured on our own evaluation board, which has all pins brought out. The register dumps were taken on the currently available form factor, which uses the WAIT1 pin instead of WAIT0. Sorry for the confusion.

    Any other ideas?

    Regards,
    Oleg
    From 61685720c237bf4980ac3837162e1850dc301f1b Mon Sep 17 00:00:00 2001
    From: Oleg Karfich <oleg.karfich@wago.com>
    Date: Thu, 19 Dec 2019 15:19:21 +0100
    Subject: [PATCH] arm: mm: dump userspace process pagetables
    
    Signed-off-by: Oleg Karfich <oleg.karfich@wago.com>
    ---
     arch/arm/mm/dump.c | 32 +++++++++++++++++++++++++++-----
     kernel/sysctl.c    | 30 ++++++++++--------------------
     2 files changed, 37 insertions(+), 25 deletions(-)
    
    diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
    index e1f6f0d..2696cf9 100644
    --- a/arch/arm/mm/dump.c
    +++ b/arch/arm/mm/dump.c
    @@ -16,19 +16,24 @@
     #include <linux/fs.h>
     #include <linux/mm.h>
     #include <linux/seq_file.h>
    +#include <linux/sched.h>
    +#include <linux/highmem.h>
     
     #include <asm/fixmap.h>
     #include <asm/pgtable.h>
     
    +extern int pgt_dump_process_id;
    +
     struct addr_marker {
     	unsigned long start_address;
     	const char *name;
     };
     
     static struct addr_marker address_markers[] = {
    +	{ 0, "User Space" },
     	{ MODULES_VADDR,	"Modules" },
     	{ PAGE_OFFSET,		"Kernel Mapping" },
    -	{ 0,			"vmalloc() Area" },
    +	{ 0,							"vmalloc() Area" },
     	{ VMALLOC_END,		"vmalloc() End" },
     	{ FIXADDR_START,	"Fixmap Area" },
     	{ CONFIG_VECTORS_BASE,	"Vectors" },
    @@ -256,13 +261,16 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level, u
     
     static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start)
     {
    -	pte_t *pte = pte_offset_kernel(pmd, 0);
    +	pte_t *pte;
     	unsigned long addr;
     	unsigned i;
     
    -	for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
    +	for (i = 0; i < PTRS_PER_PTE; i++) {
     		addr = start + i * PAGE_SIZE;
    -		note_page(st, addr, 4, pte_val(*pte));
    +		pte = pte_offset_map(pmd, addr);
    +		if (pte)
    +			note_page(st, addr, 4, pte_val(*pte));
    +		pte_unmap(pte);
     	}
     }
     
    @@ -302,7 +310,7 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
     
     static void walk_pgd(struct seq_file *m)
     {
    -	pgd_t *pgd = swapper_pg_dir;
    +	pgd_t *pgd;
     	struct pg_state st;
     	unsigned long addr;
     	unsigned i;
    @@ -311,6 +319,20 @@ static void walk_pgd(struct seq_file *m)
     	st.seq = m;
     	st.marker = address_markers;
     
    +	if (pgt_dump_process_id > 0) {
    +		struct task_struct* ts = find_task_by_vpid((pid_t)pgt_dump_process_id);
    +		seq_printf(m, "Page tables for process id = %d\n", pgt_dump_process_id);
    +
    +		if (ts == NULL) {
    +			seq_printf(m, "Process DNE!\n");
    +			return;
    +		}
    +		pgd = ts->mm->pgd;
    +	} else {
    +		seq_printf(m, "Page tables for kernel");
    +		pgd = swapper_pg_dir;
    +	}
    +
     	for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
     		addr = i * PGDIR_SIZE;
     		if (!pgd_none(*pgd)) {
    diff --git a/kernel/sysctl.c b/kernel/sysctl.c
    index 23f658d..e94c914 100644
    --- a/kernel/sysctl.c
    +++ b/kernel/sysctl.c
    @@ -345,8 +345,7 @@ static struct ctl_table kern_table[] = {
     		.data		= &sysctl_sched_time_avg,
     		.maxlen		= sizeof(unsigned int),
     		.mode		= 0644,
    -		.proc_handler	= proc_dointvec_minmax,
    -		.extra1		= &one,
    +		.proc_handler	= proc_dointvec,
     	},
     	{
     		.procname	= "sched_shares_window_ns",
    @@ -1795,24 +1794,6 @@ static struct ctl_table fs_table[] = {
     		.extra2		= &one,
     	},
     	{
    -		.procname	= "protected_fifos",
    -		.data		= &sysctl_protected_fifos,
    -		.maxlen		= sizeof(int),
    -		.mode		= 0600,
    -		.proc_handler	= proc_dointvec_minmax,
    -		.extra1		= &zero,
    -		.extra2		= &two,
    -	},
    -	{
    -		.procname	= "protected_regular",
    -		.data		= &sysctl_protected_regular,
    -		.maxlen		= sizeof(int),
    -		.mode		= 0600,
    -		.proc_handler	= proc_dointvec_minmax,
    -		.extra1		= &zero,
    -		.extra2		= &two,
    -	},
    -	{
     		.procname	= "suid_dumpable",
     		.data		= &suid_dumpable,
     		.maxlen		= sizeof(int),
    @@ -1861,6 +1842,8 @@ static struct ctl_table fs_table[] = {
     	{ }
     };
     
    +int pgt_dump_process_id = -1;
    +
     static struct ctl_table debug_table[] = {
     #ifdef CONFIG_SYSCTL_EXCEPTION_TRACE
     	{
    @@ -1882,6 +1865,13 @@ static struct ctl_table debug_table[] = {
     		.extra2		= &one,
     	},
     #endif
    +	{
    +		.procname = "pgt_dump_process_id",
    +		.data = &pgt_dump_process_id,
    +		.maxlen = sizeof(int),
    +		.mode = 0644,
    +		.proc_handler = proc_dointvec
    +	},
     	{ }
     };
     
    -- 
    2.7.4
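
    A possible way to use the patch from user space is sketched below. It is a minimal sketch under assumptions: the helper names are made up for illustration, the sysctl is expected at /proc/sys/debug/pgt_dump_process_id (the patch adds it to debug_table), and the dump is expected at /sys/kernel/debug/kernel_page_tables, which requires debugfs to be mounted at /sys/kernel/debug.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Write `pid` to the sysctl file added by the patch. Returns 0 on success. */
    int select_dump_pid(const char *sysctl_path, int pid)
    {
        FILE *f = fopen(sysctl_path, "w");
        if (f == NULL)
            return -1;
        int ok = fprintf(f, "%d\n", pid) > 0;
        if (fclose(f) != 0)
            ok = 0;
        return ok ? 0 : -1;
    }

    /*
     * Print every line of the page-table dump that contains `needle`
     * (for example "SO/UNCACHED") and return how many lines matched,
     * or -1 if the dump could not be opened.
     */
    int print_matching_lines(const char *dump_path, const char *needle, FILE *out)
    {
        char line[256];
        int matches = 0;
        FILE *f = fopen(dump_path, "r");
        if (f == NULL)
            return -1;
        while (fgets(line, sizeof line, f) != NULL) {
            if (strstr(line, needle) != NULL) {
                fputs(line, out);
                matches++;
            }
        }
        fclose(f);
        return matches;
    }
    ```

    Calling select_dump_pid() with the PID of the application and then print_matching_lines() with "SO/UNCACHED" should list the strongly-ordered user-space ranges, such as the 16M range shown above.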
    
    
  • It's a long shot, but please validate the configuration of the register conf_gpmc_wen at physical address 0x44E10898 in the case where you have the issue.
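
    One way to check this from Linux user space is sketched below. It is only a sketch: the helper names are hypothetical, and the bit layout in decode_pad_conf() is my reading of the AM335x pad control register description, so please verify it against the TRM. Mux mode 0 is expected to select gpmc_wen on this pad.

    ```c
    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define CONF_GPMC_WEN_PHYS 0x44E10898u  /* address given in this thread */

    /* Pad configuration fields; layout assumed from the AM335x TRM. */
    struct pad_conf {
        unsigned mmode;          /* bits 2:0, mux mode */
        unsigned pull_disabled;  /* bit 3, 1 = pull-up/down disabled */
        unsigned pull_up;        /* bit 4, 1 = pull-up, 0 = pull-down */
        unsigned rx_active;      /* bit 5, 1 = input buffer enabled */
        unsigned slow_slew;      /* bit 6, 1 = slow slew rate */
    };

    struct pad_conf decode_pad_conf(uint32_t v)
    {
        struct pad_conf c;
        c.mmode = v & 0x7u;
        c.pull_disabled = (v >> 3) & 1u;
        c.pull_up = (v >> 4) & 1u;
        c.rx_active = (v >> 5) & 1u;
        c.slow_slew = (v >> 6) & 1u;
        return c;
    }

    /* Read one 32-bit word at physical address `phys` through `mem_path`
     * (normally "/dev/mem", which requires root). Returns 0 on success. */
    int read_phys32(const char *mem_path, uint32_t phys, uint32_t *out)
    {
        long page = sysconf(_SC_PAGESIZE);
        uint32_t base = phys & ~((uint32_t)page - 1u);  /* page-aligned base */
        int fd = open(mem_path, O_RDONLY | O_SYNC);
        if (fd < 0)
            return -1;
        void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, (off_t)base);
        close(fd);
        if (map == MAP_FAILED)
            return -1;
        *out = *(volatile uint32_t *)((char *)map + (phys - base));
        munmap(map, (size_t)page);
        return 0;
    }
    ```

    Reading CONF_GPMC_WEN_PHYS with read_phys32() before and after the failure and comparing the decoded fields shows whether the pad mux changed.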

  • Thanks Oleg,

    We'll review your MMU/cache settings.

    What address is in the GPMC_ERR_ADDRESS register after the timeout? I think it could be a clue.

    If you write to this address first or by itself, does the WEn stay high instead of going low?
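
    To help interpret the error registers, here is a small decoding sketch. The offsets and bit positions are taken from my reading of the AM335x TRM register map (GPMC_ERR_ADDRESS at offset 0x44 and GPMC_ERR_TYPE at offset 0x48 from the GPMC base); treat them as assumptions and confirm them against the TRM revision you use. The struct and function names are made up for illustration.

    ```c
    #include <stdint.h>

    /* Offsets from the GPMC base address, assumed from the AM335x TRM. */
    #define GPMC_ERR_ADDRESS_OFFSET 0x44u
    #define GPMC_ERR_TYPE_OFFSET    0x48u

    /* Fields of GPMC_ERR_TYPE; bit positions assumed from the TRM. */
    struct gpmc_err_type {
        unsigned error_valid;       /* bit 0, error fields are valid */
        unsigned error_timeout;     /* bit 2, timeout error */
        unsigned unsupported_mcmd;  /* bit 3, unsupported command */
        unsigned unsupported_addr;  /* bit 4, unsupported address */
        unsigned illegal_mcmd;      /* bits 10:8, command that caused the error */
    };

    struct gpmc_err_type decode_gpmc_err_type(uint32_t v)
    {
        struct gpmc_err_type e;
        e.error_valid = v & 1u;
        e.error_timeout = (v >> 2) & 1u;
        e.unsupported_mcmd = (v >> 3) & 1u;
        e.unsupported_addr = (v >> 4) & 1u;
        e.illegal_mcmd = (v >> 8) & 0x7u;
        return e;
    }
    ```

    If error_valid is set, the value in GPMC_ERR_ADDRESS belongs to the access that triggered the error and can be compared with the FPGA RAM address that shows the problem.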

    Regards,
    Mark

  • Hi Oleg, Dirk,

    I have unlocked the thread. Please try to post your reply now.

    Regards,
    Mark

  • Hi Brad,

    Brad Griffis said:

    It's a long shot, but please validate the configuration of the register conf_gpmc_wen at physical address 0x44E10898 in the case where you have the issue.

    Thanks for the hint, but we already checked this. The configuration is still correct when the issue occurs. See my next post, which shows the accesses after the issue.
  • Hi Mark,

    Mark Mckeown said:

    We'll review your MMU/cache settings.

    Could you, and maybe somebody from your OS team, check whether you see any misconfiguration on our side?

    Mark Mckeown said:

    What address is in the GPMC_ERR_ADDRESS register after the timeout? I think it could be a clue.

    If you write to this address first or by itself, does the WEn stay high instead of going low?

    It's an address in the mapped FPGA memory. We wrote a simple application that writes different randomized patterns to this memory location and to some others. There we did not see the issue, and the WEn signal behaves correctly. The issue happens only when we start our main application with the code that we posted here, even though our simple application does the same thing.

    We have now configured slower timings so that we can disable wait monitoring and see what happens after the issue. The WAIT signal is still driven by the FPGA. The waveform shows that the chip-select signal is asserted many times without a corresponding WEn signal. That matches the issue in the other thread that you already pointed us to. The recommendation there is to use a strongly-ordered or device mapping for the GPMC memory region. We checked the mapping and the uio mmap function, and strongly-ordered is configured there.

    So the question is: how is it possible that we get "speculative accesses to the memory" even though we have a strongly-ordered mapping on this location?

    Regards,
    Oleg

  • Oleg Karfich said:
    So the question is: how is it possible that we get "speculative accesses to the memory" even though we have a strongly-ordered mapping on this location?

    Strongly ordered memory will not result in any speculative accesses.  However, you might be assuming that only a single mapping to this physical address range exists in your system.  If there is another virtual address mapping elsewhere in the system (normal memory rather than strongly ordered), this could still result in speculative accesses.

    Are you able to see the corresponding GPMC address on the pins in the case where you see the chip select assert but not nWE or nOE?

  • Hi Brad,

    Brad Griffis said:

    Strongly ordered memory will not result in any speculative accesses.  However, you might be assuming that only a single mapping to this physical address range exists in your system.  If there is another virtual address mapping elsewhere in the system (normal memory rather than strongly ordered), this could still result in speculative accesses.

    Thanks for the explanation. We already checked this yesterday when we read about "Mismatched Memory Attributes" in the ARM Architecture Reference Manual (A3-139). We have a Linux kernel driver that also maps the FPGA memory, using the ioremap_nocache() function. This results in a device MMU mapping, which differs from the strongly-ordered mapping created from user space through the uio framework. We compiled the driver out of the kernel, but the issue still remains.

    Brad Griffis said:

    Are you able to see the corresponding GPMC address on the pins in the case where you see the chip select assert but not nWE or nOE?

    I will check this tomorrow morning (Germany) and post the result here.

  • Hi Brad,

    Oleg Karfich said:
    Brad Griffis

    Are you able to see the corresponding GPMC address on the pins in the case where you see the chip select assert but not nWE or nOE?

    I will check this tomorrow morning (Germany) and post the result here.

    Are the logic analyzer captures in the first post sufficient to answer your question? They show the signals GPMC[A1:A16].

    Another question regarding the "speculative accesses" and "mismatched memory attributes": if we had this, shouldn't the nWE or nOE pins be asserted anyway?

    Regards

    Oleg

  • Oleg Karfich said:
    Are the logic analyzer captures in the first post sufficient to answer your question? They show the signals GPMC[A1:A16].

    Yes and no.  I was re-reading the thread and trying to understand the issue a bit better.  In fact, I'm wondering if maybe there are two slightly different manifestations of the issue:

    1. Your original screenshot looks like you were doing a bunch of sequential accesses and then suddenly things stopped.  However, there were a lot of questions surrounding the nWAIT signal and whether it was the "real" issue there.  Was there a conclusion on that topic, e.g. does that issue disappear if you disable the "WAIT" functionality in the GPMC?

    2. There's the behavior similar to the other thread where you might see the chip select assert when you're not even attempting to access the GPMC.  Do you see that happening, or do you only see issues within the context of accesses that you're deliberately performing?

    Oleg Karfich said:
    Another question regarding the "speculative accesses" and "mismatched memory attributes": if we had this, shouldn't the nWE or nOE pins be asserted anyway?

    In the case of speculative accesses, I've seen many cases of "phantom accesses" like the other thread mentioned.  It is consistent that the nOE/nWE signals are not asserted in those cases.

  • Hi Brad,

    Brad Griffis said:

    1. Your original screenshot looks like you were doing a bunch of sequential accesses and then suddenly things stopped.  However, there were a lot of questions surrounding the nWAIT signal and whether it was the "real" issue there.  Was there a conclusion on that topic, e.g. does that issue disappear if you disable the "WAIT" functionality in the GPMC?

    No, the issue still remains if the WAIT functionality is disabled. We described the behaviour in [1] and showed this with a wave form. So the WAIT functionality is not the problem here.

    Brad Griffis said:

    2. There's the behavior similar to the other thread where you might see the chip select assert when you're not even attempting to access the GPMC.  Do you see that happening, or do you only see issues within the context of accesses that you're deliberately performing?

    Yes, we see the issue only when we start our application and try to access the memory within a loop, as described in [2].

    Meanwhile we have narrowed the problem down to NEON instructions that are generated by our compiler. Among other compiler flags, we build our application with "-ftree-vectorize". This permits the compiler to vectorize the loop mentioned in [2]. The vectorization results in NEON load (vld1.32) and store (vst1.32) operations. If we disable the vectorization, either by removing the compiler flag or by compiling with "-Os", the issue disappears. Even the "dsb" instruction that we mentioned as a fix only had the effect of suppressing the NEON instructions, so it does not seem to be related to the root cause.

    We are using "arm-linux-gnueabihf-gcc (Linaro GCC 5.5-2017.10) 5.5.0".
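
    As an illustration of how the vectorization can be avoided for this one loop without changing the global compiler flags, here is a minimal sketch. It assumes that the FPGA RAM is accessed as 32-bit words (suggested by the .32 element size of the generated instructions); the function name copy_words is made up. GCC does not vectorize accesses through volatile-qualified pointers, so each word is transferred with its own load and store.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Copy `n` 32-bit words with one load and one store per word.
     * The volatile qualifiers oblige the compiler to emit every access
     * individually, so the loop is not turned into NEON vld1/vst1 sequences.
     */
    void copy_words(volatile uint32_t *dst, const volatile uint32_t *src, size_t n)
    {
        while (n-- != 0)
            *dst++ = *src++;
    }
    ```

    This only sidesteps the NEON accesses; it does not explain why they trigger the behaviour on the GPMC.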

    Does the GPMC controller on the AM335x SoC have problems with this kind of NEON instruction?

    Thanks for your response.

    Regards

    Oleg

    [1] https://e2e.ti.com/support/processors/f/791/p/859983/3236399#3236399

    [2] https://e2e.ti.com/support/processors/f/791/p/859983/3180379#3180379

  • Hi Oleg,

    I apologize that this thread has not found resolution yet.
    I have enlisted the help of another colleague to help to drive it to closure.
    Please help us to take a fresh look at the issue by first restating the facts.

    Can you confirm the following statements and answer the questions below?

    * The problem is that the WEn signal fails to assert low during one of the write cycles to GPMC CS0
    * This problem occurs with and without the WAIT signal asserted
    * It is repeatable and occurs when writing to the same word each time

    Does the problem occur during CPU loops like the code shown or during DMA accesses (or both)?
    If it occurs in a CPU loop, does the problem stop happening when you insert a DSB instruction? Where is the instruction placed? Inside the loop?
    If it occurs with DMA, can you share the DMA configuration? DSB cannot be used during a DMA burst.

    It's not clear to me how ARM NEON is related.
    If you disable ARM NEON, does the problem stop happening?
    Are you utilizing the SIMD execution of the NEON? Is it required?
    Have you attempted to use the GCC compiler instead of Linaro? TI does not use this compiler for the ARM Cortex-A8.

    The strongly ordered configuration is the right one to use. Confirm strongly ordered is chosen for the GPMC data space.

    Can you please provide the assembly code for the case where the problem occurs versus the case where it does not occur (showing the NEON instructions)?

    Thanks.

    Regards,
    Mark