AM6548: Use of MSMC SRAM in TI-RTOS application

Part Number: AM6548
Other Parts Discussed in Thread: SYSBIOS

Dear TI team,

we're currently trying to figure out where to place our own application that is supposed to run on the R5F cores of an AM6548, and I have a few questions regarding the MSMC SRAM.

The linker script (ti\pdk_am65xx_1_0_3\packages\ti\build\am65xx\linker_r5_sysbios.lds) that comes with the RTOS SDK version 05.02 links most of the application to the MSMC RAM, but the MSMC memory is somewhat fragmented:

0x70000000 - 0x700EFFFF - MSMC3
0x700F0000 - 0x700FFFFF - MSMC3_DMSC (Reserved for DMSC according to comment)
0x70100000 - 0x701F1FFF - MSMC3_H
0x701F2000 - 0x701FFFFF - MSMC3_NOCACHE

  • What is the purpose of MSMC3_DMSC reserved area? I couldn't find that region anywhere in the documentation.
  • I'm assuming that the MSMC3_NOCACHE region relies on a particular MPU configuration, but the r5_mpu.xs from the same folder doesn't seem to make that distinction, since there's only a single entry for the whole 0x70000000 area. What is the MSMC3_NOCACHE area supposed to be used for? Why does it need to be accessed uncached if it is coherent, or rather, with whom is it coherent?
  • There's a r5_mpu.xs in ti\pdk_am65xx_1_0_3\packages\ti\drv\usb\example\bios\am65xx that contains a separate region for the 0x701F0000 - 0x701FFFFF region, but I believe there's a mistake in the sub-region mask, since a sub-region mask of 0x80 should disable that region for the LAST 1/8 of the range, instead of the FIRST 1/8 (not yet verified).
  • The TRM talks about the 2 MB MSMC SRAM and that it's both a memory-mapped SRAM and a L3 cache, but I couldn't find a lot of information on that.
    • There is the boardcfg_msmc structure with its msmc_cache_size field that is supposed to configure which fraction of the MSMC is used as a cache, and that defaults to 0x10 (* 1/32), which would equal one half of the MSMC used as a cache.
    • There's also a MSMC_CACHE_CTRL register and its CACHE_SIZE field, and I'm guessing that this might be what boardcfg_msmc affects.
    • Can you briefly describe what the MSMC SRAM ought to be used for, and what tradeoffs there are?
    • With the default SCIClient initialization performed by the SBL, where the boardcfg pointer is NULL and thus the default gets used, can you tell me what the MSMC looks like to the R5F and to the A53?
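
For reference, the arithmetic behind the 0x10 default can be written down as a minimal sketch. It assumes, as described above, that msmc_cache_size counts 1/32 steps of the 2 MB MSMC; the helper names and constants are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

#define MSMC_TOTAL_BYTES  0x200000u  /* 2 MB MSMC SRAM, per the TRM */
#define MSMC_CACHE_UNITS  32u        /* assumption: msmc_cache_size is in 1/32 steps */

/* Hypothetical helper: bytes of MSMC used as cache for a given msmc_cache_size. */
static uint32_t msmc_cache_bytes(uint8_t msmc_cache_size)
{
    assert(msmc_cache_size <= MSMC_CACHE_UNITS);
    return (MSMC_TOTAL_BYTES / MSMC_CACHE_UNITS) * msmc_cache_size;
}

/* Hypothetical helper: bytes left over as memory-mapped SRAM. */
static uint32_t msmc_sram_bytes(uint8_t msmc_cache_size)
{
    return MSMC_TOTAL_BYTES - msmc_cache_bytes(msmc_cache_size);
}
```

Under that assumption, msmc_cache_bytes(0x10) gives 0x100000 (1 MB), leaving 1 MB as SRAM.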

Regards,

Dominic

  • Dominic,

    From an architecture standpoint, for best performance of an R5F application we typically recommend that users leverage the OCRAM and R5F TCM memory, since both exist in the MCU domain. The SDK uses MSMC because not all applications can fit in the available 256 KB of OCRAM.

    Having said that, let me explain the reserved memory in the MSMC. As you know, on this device architecture the DMSC/Cortex M3 is the power management and resource management master core, and the other cores set up proxy channels using the navigator subsystem for messaging with the M3. The reserved region in memory holds the secure proxy structure definitions that are used for messaging with the SYSFW running on the M3. The MSMC_NOCACHE is used to configure the cache for buffers backing this secure proxy and ring memory. You can find the reference to this MSMC usage in the TISCI User Guide for the SYSFW here:

    We planned to add this as an integration usage note, but it looks like the integration document notes are still waiting to be merged with the Processor SDK RTOS documentation. Here is a preview of the documentation:

    Configuration of the R5F cache shouldn't impact this last 56 KB region. If your application also uses the A53 to send messages to the M3, then using this region can impact the system. In the longer term, we plan to add an API so cores can query the MSMC usage, and also to move the MSMC_DMSC region to the end of the MSMC region so it is not in the middle of the MSMC range.

    Let me check on the MSMC usage in the USB driver and the board configuration and get back to you, but I suspect that since the MSMC is used by A53 applications to talk to the DMSC, the R5F developers don't consider it to be reserved.

    Regards,

    Rahul 

  • Hello Rahul,

    thanks for your reply.

    Okay, so MSMC is used "out of convenience" in the example linker scripts since the MSMC provides a lot more room. I'm aware that MCU SRAM and/or TCM are going to offer better performance, but since I don't know what amount of memory the application is going to use, I'm thinking of MCU SRAM and TCM more as an optimization in case we hit performance bottlenecks.

    Regarding the secure proxy backing memory:

    I re-read the TISCI user guide and I'm still slightly confused:

    Are boardcfg_secproxy.disable_main_nav_secure_proxy and boardcfg_msmc.msmc_cache_size referring to the same use of MSMC SRAM as the secure proxy backing buffer?

    Is that also what the AM65x TRM means in chapter 8.1.1 when talking about "cache" and the MSMC_CACHE_CTRL register?

    MSMC supports the following features:

    • 2MB (2 banks x 1MB) SRAM with ECC:

    – Shared coherent level 2/level 3 memory-mapped SRAM

    – Shared coherent level 3 cache

    It would be great if you could provide some explanation on the meaning of "level 3 cache" in the MSMC chapter of the TRM.

    Regarding R5F caches and the MSMC3_NOCACHE region:

    Now that I (somewhat) understand the secure proxy backing stuff, I think that this refers to something else. I'm guessing that the USB demo requires some of its memory uncached because it is shared with the USB peripheral. The problem that I'm seeing is that the MPU settings apparently try to configure an uncached 56 KB range from 0x701F2000 to 0x701FFFFF, but according to my understanding of the subregion disable bits, setting the highest subregion disable bit (0x80) should disable that mapping for the subregion with the highest address, i.e. it would create that 56 KB range at 0x701F0000-0x701FDFFF.

    From pdk_am65xx_1_0_3\packages\ti\drv\usb\example\bios\am65xx\r5_mpu.xs:

    /* make 0x701F2000 non-cache */
    attrs.enable = true;
    attrs.bufferable = false;
    attrs.cacheable = false;
    attrs.shareable = false;
    attrs.noExecute = false;
    attrs.accPerm = 1;          /* RW at PL1 */
    attrs.tex = 1;
    attrs.subregionDisableMask = 0x80; /* mask first 0x2000 */
    MPU.setRegionMeta(6, 0x701F0000, MPU.RegionSize_64K, attrs); /* last 56k of MSMC effectively */
    

    This creates a 64 KB MPU region where the LAST 0x2000 bytes are disabled.

    From the R5F reference manual:

    c6, MPU Region Size and Enable Registers
    ...
    [15:8] Sub-region disable Each bit position represents a sub-region, 0-7 a.
    ...
    a. Sub-region 0 covers the least significant addresses in the region, while sub-region 7 covers the most significant
    addresses in the region. For more information, see Subregions on page 7-3.
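
    To make the sub-region arithmetic concrete, here is a minimal host-side sketch of how I read the manual: a region is split into 8 equal sub-regions, and bit n of the disable mask switches off sub-region n, counted from the lowest address. The helper names are hypothetical and not taken from the SDK or SYS/BIOS.

```c
#include <assert.h>
#include <stdint.h>

#define MPU_SUBREGIONS 8u  /* each MPU region is split into 8 equal sub-regions */

/* First address of sub-region `index` (0 = lowest addresses, 7 = highest). */
static uint32_t subregion_start(uint32_t region_base, uint32_t region_size,
                                uint32_t index)
{
    assert(index < MPU_SUBREGIONS);
    return region_base + index * (region_size / MPU_SUBREGIONS);
}

/* Last address (inclusive) of sub-region `index`. */
static uint32_t subregion_end(uint32_t region_base, uint32_t region_size,
                              uint32_t index)
{
    return subregion_start(region_base, region_size, index)
           + (region_size / MPU_SUBREGIONS) - 1u;
}

/* Returns 1 if `addr` is inside the region and its sub-region is not
 * disabled by `disable_mask`, otherwise 0. */
static int mpu_region_applies(uint32_t region_base, uint32_t region_size,
                              uint8_t disable_mask, uint32_t addr)
{
    if (addr < region_base || (addr - region_base) >= region_size) {
        return 0;
    }
    uint32_t index = (addr - region_base) / (region_size / MPU_SUBREGIONS);
    return ((disable_mask >> index) & 1u) == 0u;
}
```

    With the values from the USB example (base 0x701F0000, size 64 KB), a mask of 0x80 switches off 0x701FE000-0x701FFFFF, while a mask of 0x01 would switch off 0x701F0000-0x701F1FFF, which is what the "make 0x701F2000 non-cache" comment seems to intend.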


    Regards,

    Dominic

  • Dominic,

    The MSMC, along with the two dual A53 clusters, forms the compute cluster on the AM65x device. The MSMC can be configured as SRAM, as in the current implementation, or as L3 cache for external EMIF memory accesses from the A53. The MSMC_CACHE_CTRL register allows you to configure the MSMC SRAM as L3 cache. This feature essentially allows the AM65x device to support two cache hierarchies:

    1. Shared L3 SRAM mode (L1 and L2 cache MSMC SRAM)      <---- Default

    2. Shared L3 cache mode (L1 and L2 and MSMC L3 cache DDR)  

    You can also support a third, hybrid hierarchy by configuring the MSMC as part cache and part SRAM. If you set 1 MB of MSMC as L3 cache, then you can use the rest as SRAM.

    An example use case would be the configuration of the quad A53 in SMP mode, where the MSMC forms the shared cache that allows for synchronization across four cores that are implemented across two clusters on the chip. Typically, devices like the AM572x (dual-core A15 device) or K2H (quad-core A15 device) have a shared L2 that provides the shared cache for the SMP mode implementation, since their cores are all implemented in a single cluster.

    An example function to setup MSMC Cache using CSL code would be as follows:

    #include <stdint.h>

    // Following values are for SOC_AM6X (Maxwell)
    // Valid Maxwell cache sizes (per MSMC3 spec):
    // 0.25MB, 0.5MB, 0.75MB, 1MB, 1.25MB, 1.5MB, 1.75MB, 2MB
        #define     MSMC_L3MAX          0x200000        // 2.0  MB
        #define     MSMC_L3MIN           0x40000        // 0.25 MB
        #define     MSMC_SRAM_BANKS             2
        #define     MSMC_SRAM_GROUPWAYS         4
        #define     MSMC_SRAM_DATAPATH         32
        #define     MSMC_SRAM_CACHELINE        64
        #define     MSMC_SRAM_ENTRIES         512
    
    
    // The following defines are used here until msmc CSLR is available in git repo
    #define CSL_MSMC_CFGS0_REGS_BASE                                        (0x0E000000U)
    #define CSL_MSMC_CFGS0_CACHE_CTRL                                       (0x00001000U)
    #define CSL_MSMC_CFGS0_CACHE_CTRL_CACHE_SIZE_MASK                       (0x000000000000000FU)
    #define CSL_MSMC_CFGS0_CACHE_STAT_SZ_TRANSITION_MASK                    (0x0000000000000010U)
    #define MSMCCFG     (CSL_COMPUTE_CLUSTER0_MSMC_CFGS0_BASE)
    
    
    
    /* Set MSMC SRAM as L3 cache based on size in bytes
     * Usage notes:
     *     This API is for setting L3 cache from non-cache (size = 0) only
     *     Less than MSMC_L3MIN returns ERROR even though size of 0 is a valid size
     *     Actual cache size is the requested size rounded down to a valid size
     *     Example: 310KB will result in 256KB cache size
     * Input: desired L3 size in bytes (has to be permissible for selected SoC)
     * Return: 0 == error, otherwise final cache size in bytes
     */
    
    uint32_t msmcSetL3CacheSize(uint32_t size)
    {
        uint32_t status;
        uint32_t    val = 0;
        volatile uint64_t *cacheCtrl_ptr = (volatile uint64_t *) (MSMCCFG + CSL_MSMC_CFGS0_CACHE_CTRL);
        // Test desired cache size for valid sizes
        // If CACHE_CTRL.CACHESIZE[3-0] is non-zero, or size is out of range, return ERROR
        if (/* (*cacheCtrl_ptr & CSL_MSMC_CFGS0_CACHE_CTRL_CACHE_SIZE_MASK) || */ ((size < MSMC_L3MIN) || (size > MSMC_L3MAX)))
            status = 0;
        else
        {
            // Calculate "Group-Ways" value for CACHE_CTRL.CACHE_SIZE
            val = (size >> 16) / MSMC_SRAM_GROUPWAYS;
            // Write cache size
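            // Clear the CACHE_SIZE field with a 64-bit bitwise NOT (not logical !), then OR in the new value
            *cacheCtrl_ptr = (*cacheCtrl_ptr & ~(uint64_t)CSL_MSMC_CFGS0_CACHE_CTRL_CACHE_SIZE_MASK) | val;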
            // Wait for transition to complete
            while ( *cacheCtrl_ptr & CSL_MSMC_CFGS0_CACHE_STAT_SZ_TRANSITION_MASK)
                ;
            // Verify again
            if ((*cacheCtrl_ptr & CSL_MSMC_CFGS0_CACHE_CTRL_CACHE_SIZE_MASK) == val)
               status =  val * MSMC_SRAM_GROUPWAYS << 16;    // return actual size used for L3 cache
            else   {
                status = 0;    // this should not happen so flag as ERROR
            }
        }
        return status;
    }
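
    The size arithmetic used by the function above can be checked on a host without touching the register. This is a minimal sketch under the same assumption as the code above, namely that the CACHE_SIZE field counts 256 KB steps; the two helper names are hypothetical.

```c
#include <stdint.h>

#define MSMC_L3MAX           0x200000u  /* 2.0  MB */
#define MSMC_L3MIN           0x40000u   /* 0.25 MB */
#define MSMC_SRAM_GROUPWAYS  4u

/* Hypothetical helper: CACHE_SIZE field value for a requested size in bytes,
 * or 0 if the request is outside [MSMC_L3MIN, MSMC_L3MAX]. */
static uint32_t msmc_cache_size_field(uint32_t size)
{
    if (size < MSMC_L3MIN || size > MSMC_L3MAX) {
        return 0u;
    }
    /* (size >> 16) counts 64 KB blocks; dividing by 4 rounds down to 256 KB steps */
    return (size >> 16) / MSMC_SRAM_GROUPWAYS;
}

/* Hypothetical helper: cache size in bytes that a field value corresponds to. */
static uint32_t msmc_cache_size_bytes(uint32_t field)
{
    return (field * MSMC_SRAM_GROUPWAYS) << 16;
}
```

    For a 310 KB request the field value is 1, which maps back to 256 KB, matching the example in the usage notes; a 1 MB request gives a field value of 4.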

    Hope this helps.

    Regards,

    Rahul