TDA4VM: Cortex-R5F core takes 105 us to execute coprocessor instructions

Part Number: TDA4VM

In the MCU domain, the Cortex-R5F calls CSL_armR5MpuCfgRegion, which writes the memory protection registers to implement memory protection. The execution time of this function measures 105 us.

I don't think this time is correct. Could an expert help me?

Figure 1

Figure 2

/root/psdk_07_03/pdk_jacinto_07_03_00_29/packages/ti/csl/arch/r5/src/csl_arm_r5_mpu.asm

Figure 3

  • Hi,

    The time seems large to me too. Can you check whether compiler optimization is moving extra instructions into the window between the GPIO going high and going low? Can you build the application in debug mode?

    Regards,

    Karan

  • Hi Karan,

    Thank you for your reply.

     

    Let me describe the application I am developing with the MPU:

    I am implementing memory protection for the task / interrupt stacks in an AUTOSAR OS, based on the MPU of the Cortex-R5F.

    When the OS runs the 10 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00000, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.

    When the OS runs the 1 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00400, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.

    When the OS runs the 5 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00800, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.
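The per-task reprogramming described above can be sketched as follows. This is a minimal illustration, not the poster's actual OS code: `mpu_set_stack_region` is a hypothetical stand-in that only records its arguments where a real port would call CSL_armR5MpuCfgRegion, and the `on_task_switch` hook name is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_REGION_NUM 10u  /* region reused for the running task's stack */

typedef struct {
    const char *name;        /* stack symbol name */
    uint32_t    stack_base;  /* start address of the 1 KB stack */
} task_stack_t;

static const task_stack_t task_stacks[] = {
    { "OsCfg_Stack_AppTask_10ms_Core0_ASILC_Dyn", 0x97c00000u },
    { "OsCfg_Stack_AppTask_1ms_Core0_QM_Dyn",     0x97c00400u },
    { "OsCfg_Stack_AppTask_5ms_Core0_QM_Dyn",     0x97c00800u },
};

/* Last values "written" to the MPU, kept so the sketch can be checked on a host. */
static uint32_t last_region_num;
static uint32_t last_base_addr;

/* Hypothetical stand-in for the real MPU register write. */
static void mpu_set_stack_region(uint32_t region_num, uint32_t base_addr)
{
    last_region_num = region_num;
    last_base_addr  = base_addr;
}

/* Hypothetical context-switch hook: repoint region 10 at the next task's stack. */
static int on_task_switch(size_t task_idx)
{
    if (task_idx >= sizeof(task_stacks) / sizeof(task_stacks[0])) {
        return -1;  /* unknown task: leave the MPU untouched */
    }
    mpu_set_stack_region(STACK_REGION_NUM, task_stacks[task_idx].stack_base);
    return 0;
}
```

Because this hook runs on every context switch, the cost of the single MPU write is added to every task activation, which is why its execution time matters here.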

    Table 1
    Stack start address    Size    Task name
    -----------------------------------------------------------------------------------------
    0x97c00000             1 KB    OsCfg_Stack_AppTask_10ms_Core0_ASILC_Dyn
    0x97c00400             1 KB    OsCfg_Stack_AppTask_1ms_Core0_QM_Dyn
    0x97c00800             1 KB    OsCfg_Stack_AppTask_5ms_Core0_QM_Dyn
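Each stack in Table 1 is 1 KB and starts on a 1 KB boundary. That matters because an ARMv7-R MPU region must be a power of two in size (32 bytes minimum) and its base must be aligned to its size. The sketch below checks that constraint and computes the architectural size-register encoding (RSize in bits [5:1], enable in bit 0, region size = 2^(RSize+1) bytes). The helper names are illustrative, and whether the CSL size macros hold the raw RSize field or the shifted register value is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encode a region size in bytes as an enabled size-register value.
 * Handles sizes from 32 bytes up to 2 GB (4 GB does not fit in uint32_t).
 * Returns false if the size is not a valid power of two. */
static bool mpu_encode_size(uint32_t size_bytes, uint32_t *reg_val)
{
    uint32_t log2_size = 0u;

    if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0u) {
        return false;
    }
    while ((1u << log2_size) != size_bytes) {
        log2_size++;
    }
    *reg_val = ((log2_size - 1u) << 1) | 1u;  /* RSize field plus enable bit */
    return true;
}

/* A region base must be a multiple of the (power-of-two) region size. */
static bool mpu_base_is_aligned(uint32_t base, uint32_t size_bytes)
{
    return (base & (size_bytes - 1u)) == 0u;
}
```

For a 1 KB region the RSize field is 9, so the enabled register value is 0x13. All three base addresses in Table 1 pass the alignment check.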

    Figure 1: mcusw/build/j721e/mcu1_0/linker_r5.lds

    Figure 2: mcusw/build/j721e/mcu1_0/linker_r5.lds

    Figure 3: const CSL_ArmR5MpuRegionCfg gCslR5MpuCfg[CSL_ARM_R5F_MPU_REGIONS_MAX]

  • Hi,

    I am implementing memory protection for the task / interrupt stacks in an AUTOSAR OS, based on the MPU of the Cortex-R5F.

    When the OS runs the 10 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00000, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.

    When the OS runs the 1 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00400, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.

    When the OS runs the 5 ms task, I call CSL_armR5MpuCfgRegion with regionNum: 10, baseAddrRegVal: 0x97c00800, sizeRegVal: 1 KB, accessCtrlRegVal: readable and writable.

    Shouldn't the MPU configuration happen in the startup code, before these runnable tasks run? Something like _enable_mpu() in <SDK>/pdk*/packages/ti/csl/arch/r5/src/startup/startup.c?

    All the regions in the MPU can be programmed in one go.
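Programming every region in one pass at startup might look like the loop below. This is a host-runnable sketch under assumptions: `region_cfg_t` is a cut-down, hypothetical version of the CSL region structure, and `program_region` only records which region IDs were touched where real code would write the MPU registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced region descriptor (the real CSL structure has more fields). */
typedef struct {
    uint32_t region_id;  /* assumed to be below 32 so it fits the bitmask */
    uint32_t enable;
    uint32_t base_addr;
} region_cfg_t;

/* Bitmask of region IDs that have been "programmed" in this sketch. */
static uint32_t programmed_mask;

/* Stand-in for the real per-region register writes. */
static void program_region(const region_cfg_t *cfg)
{
    programmed_mask |= (1u << cfg->region_id);
}

/* Walk the whole table once and program every enabled entry.
 * Returns the number of regions programmed. */
static size_t mpu_program_all(const region_cfg_t *table, size_t count)
{
    size_t programmed = 0u;

    for (size_t i = 0u; i < count; i++) {
        if (table[i].enable != 0u) {
            program_region(&table[i]);
            programmed++;
        }
    }
    return programmed;
}
```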

    Can you check whether compiler optimization is moving extra instructions into the window between the GPIO going high and going low? Can you build the application in debug mode?

    Also, is there a way for you to build your application with fewer compiler optimizations?

    Regards,

    Karan

  • Hi Karan Saxena,

    Thank you for your reply.

    Can you check whether compiler optimization is moving extra instructions into the window between the GPIO going high and going low? Can you build the application in debug mode?

             The build command for my current project is: make -s base_project_mcu2 BUILD_PROFILE=debug -j8

             The BUILD_PROFILE variable is set to debug.

             CSL_armR5MpuCfgRegion is written in assembly, so the compiler optimization level should not affect the speed of this function.

    Shouldn't the MPU configuration happen in the startup code, before these runnable tasks run? Something like _enable_mpu() in <SDK>/pdk*/packages/ti/csl/arch/r5/src/startup/startup.c?

    All the regions in the MPU can be programmed in one go.

    startup.c calls the _enable_mpu() function. The gCslR5MpuCfg code is:

    const CSL_ArmR5MpuRegionCfg gCslR5MpuCfg[CSL_ARM_R5F_MPU_REGIONS_MAX] =
    {
        {
            /* Region 0 configuration: complete 32-bit address space = 4 GB */
            .regionId         = 0U,
            .enable           = 1U,
            .baseAddr         = 0x0U,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_4GB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 1U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)FALSE,
            .cachePolicy      = 0U,
            .memAttr          = 0U,
        },
        {
            /* Region 1 configuration: 128 bytes memory for exception vector execution */
            .regionId         = 1U,
            .enable           = 1U,
            .baseAddr         = 0x0U,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_128B,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
        {
            /* Region 2 configuration: MCU MSRAM (1 MB on J721E/J7200, 512 KB otherwise) */
            .regionId         = 2U,
            .enable           = 1U,
            .baseAddr         = 0x41C00000,
    #if defined (SOC_J721E) || defined (SOC_J7200)
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_1MB,
    #else
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_512KB,
    #endif
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
        {
            /* Region 3 configuration: MSMC3 RAM (8 MB on J721E, 1 MB on J7200, 2 MB otherwise) */
            .regionId         = 3U,
            .enable           = 1U,
            .baseAddr         = 0x70000000,
    #if defined (SOC_J721E)
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_8MB,
    #elif defined (SOC_J7200)
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_1MB,
    #else
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_2MB,
    #endif
    
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
        {
            /* Region 4 configuration: 2 GB DDR RAM */
            .regionId         = 4U,
            .enable           = 1U,
            .baseAddr         = 0x80000000,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_2GB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
        {
            /* Region 5 configuration: 32 KB BTCM */
            /* The addresses of ATCM/BTCM are configured via MCU_SEC_MMR registers.
               Each can be either '0x0' or '0x41010000'. The application/boot-loader shall
               take care of this configuration, and the linker command file shall be
               in sync with it. For either of the above configurations, the
               MPU configuration will not change, as both regions will have the same
               set of permissions in almost all scenarios.
               The application can choose to overwrite this MPU configuration if needed.
               The same is true for the region corresponding to ATCM. */
            .regionId         = 5U,
            .enable           = 1U,
            .baseAddr         = 0x41010000,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_32KB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_NON_CACHEABLE,
            .memAttr          = 0U,
        },
        {
            /* Region 6 configuration: 32 KB ATCM */
            .regionId         = 6U,
            .enable           = 1U,
            .baseAddr         = 0x0,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_32KB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_NON_CACHEABLE,
            .memAttr          = 0U,
        },
        {
            /* stack */
            .regionId         = 9U,
            .enable           = 1U,
            .baseAddr         = 0x97c00000,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_16KB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD,
            .shareable        = 0U,
            .cacheable        = (uint32_t)TRUE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
    
        {
            /* init stack */
            .regionId         = 10U,
            .enable           = 1U,
            .baseAddr         = 0x97c02c00,
            .size             = CSL_ARM_R5_MPU_REGION_SIZE_1KB,
            .subRegionEnable  = CSL_ARM_R5_MPU_SUB_REGION_ENABLE_ALL,
            .exeNeverControl  = 0U,
            .accessPermission = CSL_ARM_R5_ACC_PERM_PRIV_USR_RD_WR,
            .shareable        = 0U,
            .cacheable        = (uint32_t)FALSE,
            .cachePolicy      = CSL_ARM_R5_CACHE_POLICY_WB_WA,
            .memAttr          = 0U,
        },
    };
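In the table above, region 9 makes the whole 16 KB stack area read-only and region 10 overlays a 1 KB read-write window on top of it. This relies on the ARMv7-R rule that, where regions overlap, the highest-numbered enabled region determines the attributes. The sketch below models that lookup on a host using only three of the regions; the `region_t` type and the function names are illustrative and not part of the CSL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      id;
    uint32_t base;
    uint32_t size;      /* bytes */
    bool     writable;
} region_t;

/* Reduced model of regions 4, 9 and 10 from the configuration table. */
static const region_t regions[] = {
    {  4, 0x80000000u, 0x80000000u, true  },  /* 2 GB DDR */
    {  9, 0x97c00000u, 16u * 1024u, false },  /* all stacks, read-only */
    { 10, 0x97c02c00u, 1024u,       true  },  /* init stack window, read-write */
};

/* Return the highest-numbered region containing addr, or NULL if none does. */
static const region_t *effective_region(uint32_t addr)
{
    const region_t *hit = NULL;

    for (size_t i = 0u; i < sizeof(regions) / sizeof(regions[0]); i++) {
        const region_t *r = &regions[i];
        if (addr >= r->base && (addr - r->base) < r->size) {
            if (hit == NULL || r->id > hit->id) {
                hit = r;
            }
        }
    }
    return hit;
}

/* True if the effective region for addr allows writes. */
static bool can_write(uint32_t addr)
{
    const region_t *r = effective_region(addr);
    return (r != NULL) && r->writable;
}
```

In this model, an address inside the 1 KB window resolves to region 10 and is writable, while the rest of the 16 KB stack area resolves to region 9 and is not.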

    Can you now confirm whether the execution time of the CSL_armR5MpuCfgRegion function is really about 100 us? Please measure its execution time on your side. With memory protection added, this execution time is too long to meet the task scheduling deadlines. I hope to get help from the experts.

  • Hi,

             CSL_armR5MpuCfgRegion is written in assembly, so the compiler optimization level should not affect the speed of this function.

    The code for CSL_armR5MpuCfgRegion will not be optimized, but the position of your profile points may change.

    I profiled the function CSL_armR5MpuCfgRegion() using the R5F's PMU counters. I see that CSL_armR5MpuCfgRegion() takes ~216 cycles. The R5F is running at 1 GHz, so this translates to 216 ns.
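The cycle-to-time conversion is simple arithmetic: time = cycles / clock frequency. A small helper, shown here purely as an illustration (the function names are not from the SDK), makes the comparison with the originally reported 105 us explicit.

```c
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a given core clock in Hz. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t clock_hz)
{
    return (cycles * 1000000000ull) / clock_hz;
}

/* Convert a duration in nanoseconds to cycles for a given core clock in Hz. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t clock_hz)
{
    return (ns * clock_hz) / 1000000000ull;
}
```

At 1 GHz, `cycles_to_ns(216, 1000000000)` gives 216 ns, whereas 105 us would correspond to `ns_to_cycles(105000, 1000000000)`, that is 105000 cycles, several hundred times the measured count.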

    Please apply the attached patch on top of the can_profile_app in MCUSW and try profiling this on your side. I also profiled a 10 ms sleep to make sure that the profiling is reliable.
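For readers without the patch, the core of a PMU-based measurement is reading the cycle counter (PMCCNTR) immediately before and after the call. The sketch below is an illustration under stated assumptions, not the contents of the attached patch: it uses GCC/Clang-style inline assembly for the CP15 read, assumes the cycle counter has already been enabled from privileged mode, and falls back to a dummy counter on non-ARM hosts so that it still compiles there. The helper names are hypothetical.

```c
#include <stdint.h>

#if defined(__arm__)
/* Read PMCCNTR (CP15 c9, c13, 0), the ARMv7 PMU cycle counter. */
static inline uint32_t read_cycle_counter(void)
{
    uint32_t value;
    __asm__ volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
}
#else
/* Host fallback: a dummy counter so the sketch compiles off-target. */
static inline uint32_t read_cycle_counter(void)
{
    static uint32_t fake_cycles;
    fake_cycles += 100u;
    return fake_cycles;
}
#endif

/* Unsigned subtraction stays correct across a single 32-bit counter wrap. */
static inline uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Measure one call of fn() in cycles, with the reads placed directly around it. */
static inline uint32_t profile_call(void (*fn)(void))
{
    uint32_t start = read_cycle_counter();
    fn();
    uint32_t end = read_cycle_counter();
    return elapsed_cycles(start, end);
}
```

Keeping the two counter reads directly around the call limits how much unrelated code can end up inside the measured window, which is the concern raised about the GPIO-based measurement.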

    Patch on SDK7.3 MCUSW: 

    0001-Add-profiling-code-infra-to-can_profile_app.patch

    Logs:

    SBL Revision: 01.00.10.00 (May  5 2021 - 13:39:13)
    TIFS  ver: 21.1.1--v2021.01a (Terrific Lla
    Starting Sciserver..... PASSED
    CAN Profile App:Variant - Pre Compile being used !!!
    CAN Profile App: Successfully Enabled CAN Transceiver MCU MCAN0!!!
    CAN Profile App: Successfully Enabled CAN Transceiver MCU MCAN1!!!
    CAN Profile App:Will Transmit & Receive (Internal-loopback) 10000 Messages, 2 times
    CAN Profile App:Profile CSL_armR5MpuCfgRegion
    CAN Profile App:startTime = 84
    endTime = 300
    endTime - startTime = 216
    CAN Profile App:Profile 10ms task sleep
    CAN Profile App:startTime = 81
    endTime = 9999132
    endTime - startTime = 9999051
    CAN Profile App:NOTE : Operating in interrupt mode!
    CAN Profile App:Transmit & Receive (Internal-loopback)  10000 packets 2 times
    CAN Profile App:Average of 310.7698 usecs per packet
    CAN Profile App:Average of 3223 packets in 1 second with CPU Load 1.546732%
    CAN Profile App:Packets sent: 20000, Packets recv: 20000 in total time: 12407698 us
    CAN Profile App:Measured Load: Total CPU: 1.643616%, HWI: 1.132556%, SWI:0.023025% TSK: 0.391151%
    CAN Profile App:Message Id Received c00000c0 Message Length is 64
    CAN Profile App:Test completed for 0 instance
    
    CAN Profile App: 8192 bytes used for stack
    CAN Profile App:Profiling completes sucessfully!!!

    Regards

    Karan