AM4378: About Data cache operation by HLOS API

Part Number: AM4378
Other Parts Discussed in Thread: TMDSEVM437X, SYSBIOS

Hello,

 

My customer has a question about data cache operations through the HLOS API on the TMDSEVM437X.

When they perform a data cache clean through the HLOS API (using inval_Dcache()), the program stops responding at the inval_Dcache() call.

At that point, pressing the “Suspend” button on the CCS toolbar produces the following error message.

「CortexA9: Trouble Halting Target CPU: (Error -1321 @ 0xFFFFFFFF) Device failed to enter debug/halt mode because security settings prevent debug. Power-cycle the board. If error persists, confirm configuration and/or try more reliable JTAG settings (e.g. lower TCLK). (Emulation package 8.2.0.00004)」

 

(Their environment is as follows.)

EVM: TMDSEVM437X (AM437x high security)

CCS: Ver. 9.1.0.00010 (Compiler: GNU v7.2.1 (Linaro))

Windows OS: Windows 10 (64-bit)

 

(Their test procedure is as follows.)

1) They used the LedBlinking and Loader projects in the attached file as test code.

They also placed “tiimage.exe” in the LedBlinking project folder.

(“tiimage.exe”: https://software-dl.ti.com/processor-sdk-rtos/esd/AM437X/latest/index_FDS.html)

 

・Loader project: copies the LedBlinking program from OCMC RAM (0x40320000) to DDR SDRAM (0x80000000) and runs the copied program.

・LedBlinking project: LED3 blinking sample code

 

2) Build the LedBlinking and Loader projects and select the target configuration (Loader project). (Below)


 

3) After EVM initialization is done, load “LedBlinking.dat” (binary) from the debug folder using Load Memory in CCS. The start address is “0x40320000”. (Below)


 

4) Once LedBlinking.dat has been downloaded to OCMC RAM, start debugging the Loader project.

5) When the data cache clean is performed through the HLOS API (using inval_Dcache()), the program stops responding at the inval_Dcache() call.

(The data cache clean fails before the copied LedBlinking program is run.)

6) At that point, pressing the “Suspend” button on the CCS toolbar produces the following error message. (Below)


 

(Question)

Could you please tell us why this problem occurs and how to solve it?

Any comments or advice would be appreciated.

 

Regards,

Tao_2199

6102.example.zip

  • Hi,

    Tao_2199 said:
    data cache operation(clean) by HLOS API(using inval_Dcache() ), it no response from program at inval_Dcache() call.

    The "HLOS API" isn't defined in PRSDK or TI-RTOS, but is defined in the mmuA9g.S file(s) in the customer example code which you shared:

    01_Loader/mmuA9g.S:249:inval_Dcache:
    01_Loader/mmuA9g.S:529:inval_dcache: @ void inval_dcache(UW addr, UW size)
    02_LedBlinking/mmuA9g.S:249:inval_Dcache:
    02_LedBlinking/mmuA9g.S:529:inval_dcache: @ void inval_dcache(UW addr, UW size)

    Is this correct?

    Regards,
    Frank

  • Hello, Frank.

    Thank you for your reply.
    I am a colleague of "Tao_2199" and will take over and respond to this thread.

    I have additional information and an additional question from my customer regarding your question above.


    **********
    clean_Dcache() etc. in mmuA9g.S are implemented based on "5.2.11 Services for HLOS Support - API"
    in the AM437x Technical Reference Manual (spruhl7i).

    After reviewing the code, I changed the call to clean the entire D-cache, and the switch between projects in the sample I provided then succeeded.

    <Before> clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP );
    <After>    clean_Dcache( 0, 0 );

    Q: Could you tell me why the HLOS API usage before the change prevents
         the project from executing and from being suspended in the debugger?

    *********

    Best regards,
    KANAE

  • Hi Kanae,

    Can you please explain the relationship between "5.2.11 Services for HLOS Support - API" and the example code? From what I can determine, spruhl7i, Section 5.2.11 appears to discuss code in ROM. However, it seems the example code executes entirely from RAM (e.g. the Loader app executes from OCMC_A_RAM). I don't observe any calls to code in ROM.

    What is the origin of the assembly language in the example code which causes the crash (mmuA9g.S)?

    I compiled the "LedBlinking" and "Loader" projects. I followed the instructions for loading the LedBlinking .dat file and loading/executing the Loader project. I don't currently have access to an AM437x EVM (GP or HS), so instead I'm using an AM437x IDK.

    I observe the Loader code crashes on the "smc #0" instruction in the clean_l2_cache() function which is called inside clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP );

    clean_l2_cache:                         @ void clean_l2_cache(UW start,
                                            @                   UW end, UW linesz)
          .ifdef SECMODE
            sub     r1, r1, r0
            mov     ip, #0x101              @ SMC call to clean and invalidate
            smc     #0
            bx      lr
          .else
            ldr     r3, =PL310_BASE + L2C_CLEAN_PA
            b       op_l2_cache
          .endif
    

    I don't yet understand the cause of this crash.

    Are you saying that  clean_Dcache( 0, 0 ); does not result in a crash? If so, would the code provide the desired functionality? It seems like this wouldn't properly clean the D-cache for the DDR region.

    I'll continue to work on this and keep you posted on my progress.

    Regards,
    Frank

  • Hi Kanae,

    In the Loader app, I replaced:

    clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP );

    with:

    clean_Dcache( 0, 0 );

    The Loader app doesn't crash with this change, and the Loader branches to the start of the LedBlink app (loaded at the start of DDR @ 0x80000000).

    Can you please confirm this is what you expect?

    I'll see if I can determine the cause of the failure in the case of clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP );

    Regards,
    Frank

  • Hi, Frank.

    Thank you for your reply.

    I will check the items you asked about with our customer.

    I look forward to receiving your progress report and
    thank you for your continued support.

    Best regards,
    Kanae

  • Hi Kanae,

    I think I understand the connection between the code and the TRM, 5.2.11 Services for HLOS Support – API. The example code is attempting to invoke the monitor mode primitive described in Table 5-55. L2 Cache Clean and Invalidate Range of Physical Address. The code crashes in the function clean_l2_cache() when the SMC (Secure Monitor Call) instruction is executed.

    In the case of clean_Dcache( 0, 0 ), a different path is followed through the code. There aren't any SMC instructions in this alternate path.

    Below is a rough outline of the passing and failing cases.

            PASS case: no SMC instruction in this path
                // R0=addr = 0x00000000
                // R1=size = 0x00000000
                clean_Dcache( 0, 0 )
                    SWI_CALL  clean_dcache,   2
                    swi 0
                    b SWI_Handler   // vector table entry
                    SWI_Handler()
                        clean_dcache
                            clean_dcache_all
                                op_all_dcache
                                clean_l2_cacheall
                                    op_l2_cache_all
    
            FAIL case: SMC instruction in this path
                // R0=ADDR_DDR_APP  = 0x80000000
                // R1=LEN_DDR_APP   = 0x04000000
                //
                clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP )
                    SWI_CALL  clean_dcache,   2
                    swi 0
                    b SWI_Handler   // vector table entry
                    SWI_Handler()
                        clean_dcache
                            clean_l1_dcache
                            clean_l2_cache
                                SMC #0 // crash
    

    I'm still working on determining the reason the code crashes on the SMC instruction.

    Regards,
    Frank

  • Hi Kanae,

    I see there is a Starterware API function for invoking the monitor primitives. Please see:

    • pdk_am437x_1_0_16\packages\ti\starterware\include\pub2mon.h
    • pdk_am437x_1_0_16\packages\ti\starterware\soc\armv7a\gcc\pub2mon.S
    /**
     * \brief   This is a secure monitor API which can be used to modify the secure
     *          registers. A few CPU registers including lr will be modified by the 
     *          smc, so they need to be backed up.
     *
     * \param   smc_id  SMC ID of the functionality as defined in device TRM. 
     *                  Passed via r0.
     * \param   param1  Passed via r1, specific to the ID passed. 
     *                  If not applicable for specific id pass 0.
     * \param   param2	Passed via r2, specific to the ID passed. 
     *                  If not applicable for specific id pass 0.
     *
     * \retval  Return value will be passed in r0 & r1 registers.
     **/
    uint64_t Pub2MonDispatch(uint32_t smc_id, uint32_t param1, uint32_t param2);
    

    I tried using this function and the code crashes on the SMC instruction.

    The customer is using an HS device, correct?

    Regards,
    Frank

  • Hi, Frank.

    Thank you for reporting on the status of the investigation!

    As I first posted, my customer is using TMDSEVM437X (AM437x high security).

    According to your findings, when using
    clean_Dcache (ADDR_DDR_APP, LEN_DDR_APP);
    the code crashes because the SMC (Secure Monitor Call) instruction is executed,

    and when this is changed to
    clean_Dcache (0, 0);
    a different path is taken and the SMC (Secure Monitor Call) instruction is not executed.

    Therefore, I understand that changing it to "clean_Dcache (0, 0);" is an effective countermeasure to avoid a crash?

    As described in pub2mon.h and pub2mon.S, it is a secure monitor API that can be used to change the secure registers,
    so does that mean it should be set to "0" if it does not correspond to a specific ID?


    Best regards,
    Kanae

  • Hi Kanae,

    Kanae said:
    Therefore, I understand that changing it to "clean_Dcache (0, 0);" is an effective countermeasure to avoid a crash?

    The clean_dcache() function explicitly checks for both the address and length parameters being 0:

                clean_dcache:                           @ void clean_dcache(UW addr, UW size)
                orrs    r2, r0, r1              @ if both addr and size are zero,
                beq     clean_dcache_all        @ then clean entire D-cache
    

    If both parameters are set to 0, then the function clean_dcache_all() is called. 

    This is the point at which the code paths diverge for the passing and failing cases. The function header for clean_dcache_all() states the function cleans all D-caches.

    clean_dcache_all header : "Clean all D-Caches (Inner D-Cache and Outer L2 cache)"

    The functions called in the alternate, failing case (e.g. clean_l1_dcache(), clean_l2_cache()) state the functions only clean (L1,L2) D-cache for a specified address range.

    clean_l1_dcache header : "Clean L1 D-Cache for the Given Address Range"
    clean_l2_cache header : "Clean L2 Cache for the Given Address Range"
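
The branch at the top of clean_dcache() can be restated in C as a minimal sketch. The enum and function names below are hypothetical labels for illustration and do not appear in mmuA9g.S; only the both-zero test is taken from the assembly shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two code paths; not names from mmuA9g.S. */
typedef enum {
    PATH_CLEAN_ALL,   /* clean_dcache_all: no SMC instruction on this path   */
    PATH_CLEAN_RANGE  /* clean_l1_dcache + clean_l2_cache (smc #0 if SECMODE) */
} clean_path_t;

/* Mirrors "orrs r2, r0, r1" / "beq clean_dcache_all": the whole-cache
 * path is taken only when addr and size are BOTH zero. */
clean_path_t select_clean_path(uint32_t addr, uint32_t size)
{
    return ((addr | size) == 0u) ? PATH_CLEAN_ALL : PATH_CLEAN_RANGE;
}

/* The two calls discussed in this thread. */
void select_clean_path_example(void)
{
    assert(select_clean_path(0u, 0u) == PATH_CLEAN_ALL);
    assert(select_clean_path(0x80000000u, 0x04000000u) == PATH_CLEAN_RANGE);
}
```

Note that a call with only one parameter set to zero still takes the address-range path.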

    Provided clean_dcache_all() works according to its function header, it should be a viable workaround (assuming cache "clean" means writeback & invalidate, not simply cache invalidate). However, I would confirm with detailed code inspection that this is the case. I'm concerned that the "smc" call is required to clean the L2 cache for a given address range, but not required when cleaning the entire L2 cache.

    Kanae said:
    so does that mean it should be set to "0" if it does not correspond to a specific ID?

    There isn't any code in Pub2MonDispatch() for checking a valid Function ID. The smc_id parameter contained in R0 is moved to R12 before the "smc" instruction is executed. I don't know how the smc instruction behaves if a Function ID other than those listed in "5.2.11 Services for HLOS Support – API" is used. Hence I would suggest not calling Pub2MonDispatch() with an unsupported Function ID.

    @******************************************************************************
    @ Function prototype: uint64_t Pub2MonDispatch(uint32_t smc_id, uint32_t param1,
    @                                      uint32_t param2)
    @
    @ smc_id - Passed via r0, SMC ID of the functionality as defined in device
    @          data sheet or TRM.
    @ Param1 - Passed via r1, if not applicable for specific id pass 0.
    @ Param2 - Passed via r2, if not applicable for specific id pass 0.
    @
    @ return - return value will be passed in r0 & r1 registers.
    @
    @ This is a secure monitor API which can be used to modify the secure registers.
    @ A few cpu registers including lr will be modified by the smc, so they need to
    @ be backed up.
    @
    @******************************************************************************
    Pub2MonDispatch:
        stmfd   sp!, {r4-r12, lr}
        mov     r12, r0
        mov     r0, r1
        mov     r1, r2
        dsb
        smc #0
        ldmfd   sp!, {r4-r12, pc}
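
One way to follow the suggestion of not passing unsupported Function IDs is to check the ID before dispatching. The sketch below is an illustration under stated assumptions, not PDK code: the macro names, guarded_dispatch(), and the host-side fake_dispatch() stand-in are hypothetical, and the ID list holds only the two IDs mentioned in this thread (0x101 for Table 5-55, 0x102 for Table 5-56), not the full set in TRM Section 5.2.11. Being listed in the TRM does not by itself mean an ID works on a given device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the numeric IDs come from this thread. */
#define SMC_ID_L2_CLEAN_INV_RANGE 0x101u /* TRM Table 5-55 */
#define SMC_ID_L2_SET_CONTROL     0x102u /* TRM Table 5-56 */

/* Same shape as the Pub2MonDispatch() prototype quoted above. */
typedef uint64_t (*smc_dispatch_fn)(uint32_t smc_id, uint32_t param1,
                                    uint32_t param2);

/* Returns true only for IDs in the (deliberately incomplete) list. */
static bool smc_id_is_listed(uint32_t smc_id)
{
    static const uint32_t listed_ids[] = {
        SMC_ID_L2_CLEAN_INV_RANGE,
        SMC_ID_L2_SET_CONTROL,
    };
    for (size_t i = 0; i < sizeof(listed_ids) / sizeof(listed_ids[0]); i++) {
        if (listed_ids[i] == smc_id) {
            return true;
        }
    }
    return false;
}

/* Dispatches only listed IDs. Returns false, leaving *result untouched,
 * when the ID is not listed or a pointer argument is NULL. */
bool guarded_dispatch(smc_dispatch_fn dispatch, uint32_t smc_id,
                      uint32_t param1, uint32_t param2, uint64_t *result)
{
    if (dispatch == NULL || result == NULL || !smc_id_is_listed(smc_id)) {
        return false;
    }
    *result = dispatch(smc_id, param1, param2);
    return true;
}

/* Host-side stand-in so the sketch runs without hardware. */
uint64_t fake_dispatch(uint32_t smc_id, uint32_t param1, uint32_t param2)
{
    (void)param2;
    return ((uint64_t)smc_id << 32) | param1;
}

void guarded_dispatch_example(void)
{
    uint64_t result = 0u;
    assert(guarded_dispatch(fake_dispatch, SMC_ID_L2_SET_CONTROL, 1u, 0u, &result));
    assert(!guarded_dispatch(fake_dispatch, 0x999u, 0u, 0u, &result));
}
```

On the target, Pub2MonDispatch itself could be passed as the dispatch argument, since its prototype matches the function pointer type.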
    

    I'm still trying to determine why the "smc" instruction crashes the processor.

    Regards,
    Frank

  • Hi, Frank.

    Thank you for your kind support.
    I appreciate your explaining the details to me.

    The following is the response from my customer regarding your questions.

    ===========================================================

    # Can you please explain the relationship between "5.2.11 Services for HLOS Support - API" and the example code?

    >Our prototype board boots from QSPI, and "Starter" (no OS) is executed from ROM.
    "Booter" is then extracted from the QSPI-connected Flash ROM to DDR3 SDRAM,
    and Starter branches to the destination at 0x80000000.

    I created "example.zip" to reproduce, in our development environment, the branch failure that occurs with the L2 cache ON.
    Instead of writing to ROM, I downloaded Loader and LedBlinking to OCMC RAM with the CCS debugger;
    when Loader is executed, LedBlinking is extracted from OCMC_B_RAM to DDR3 SDRAM and Loader branches to it.

    # I don't observe any calls to code in ROM. Please explain the relationship.

    >Chapter "5. Initialization" of spruhl7i describes the Public ROM code and
    the initialization code that the ROM executes for each memory device,
    so "5.2.11 Services for HLOS Support - API" may also describe only ROM code.
    I was not sure whether this was the case.

    # What is the origin of the assembly language in the example code which causes the crash (mmuA9g.S)?

    >This code was ported to AM437x by Mispo Inc. as shown in the comments at the top of the mmuA9g.S file.
    I will inquire with Mispo Ltd. to see if they have any other information sources other than "5.2.11 Services for HLOS Support - API" for the porting.

    # Are you saying that clean_Dcache( 0, 0 ); does not result in a crash?
    If so, would the code provide the desired functionality?

    >In both "example.zip" and the Starter -> Booter sequence on the prototype board, the code did not crash
    when run with the above change.
    I don't know how to check if the clean and invalidate caches work properly,
    just that I was able to branch, so I don't know if it works as expected.

    Does this mean that there is a part in the code of clean_Dcache that seems inappropriate?

    ===========================================================

    Please continue to investigate the "smc" instruction causing the crash.

    Best regards,
    Kanae

  • Hi Kanae,

    I added a call to enable_Dcache() in main() before the call to clean_Dcache():

                enable_Dcache();    // added this call
                clean_Dcache( ADDR_DDR_APP, LEN_DDR_APP );			/* clean D-cache */
    

    The enable_Dcache() function doesn't crash on a GP part. A rough outline of the call sequence for enable_Dcache() is shown below. As can be seen, "smc #0" is called with R12=0x102 in ena_l2_cache(). This corresponds to spruhl7i, "Table 5-56. L2 Cache Set Control Register". In this case, the "smc #0" instruction doesn't result in a crash. The program proceeds beyond this point until it crashes in clean_l2_cache().

                enable_Dcache()
                    SWI_CALL  enable_dcache,  0
                    swi 0
                    b SWI_Handler   // vector table entry
                    SWI_Handler()
                        enable_dcache
                            ena_l1_cache
                            ena_l2_cache
                                r12 = 0x102
                                smc #0
    

    I wasn't able to locate a call to Pub2MonDispatch() with Function ID=0x101 in AM437x PDK 1.0.16. A colleague showed me the monitor primitives are used in Linux, but never with Function ID=0x101. This suggests the monitor primitive with R12=0x101 isn't supported on the device.

    Kanae said:
    I don't know how to check if the clean and invalidate caches work properly,

    I'll take a closer look at this code, but it will take me some time. I'll get back with you early next week.

    Kanae said:
    Does this mean that there is a part in the code of clean_Dcache that seems inappropriate?

    I don't have definite proof, but this seems likely. My investigation so far leads me to believe the monitor primitive in "Table 5-55. L2 Cache Clean and Invalidate Range of Physical Address" isn't supported.

    Regards,
    Frank

  • Hi, Frank.


    Thank you for your report on the status of the investigation.
    I look forward to your response on this matter
    and appreciate your continued support.

    Best regards,
    Kanae

  • Hi Kanae,

    I wanted to see if the "smc" instruction in ena_l2_cache() has some observable effect on the state of the processor. I noticed enable_Dcache() is called from the MMU_init() in the Reset_Handler:

    • Just before the "smc" instruction, the PL310 Control Register is set to 0x00000000
    • Just after executing the "smc" instruction the PL310 Control Register is set to 0x00000001.

    This proves:

    • monitor primitives can be executed from a GP device
    • the monitor primitive in the TRM, Table 5-56 works as expected.
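
The before/after check of the PL310 Control Register can be written as a small helper. This is a sketch under stated assumptions: the Control Register is at offset 0x100 from the PL310 base with the L2 enable in bit 0 (per the PL310 TRM), the base address is supplied by the caller because its value is not given in this thread, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PL310_CONTROL_OFFSET    0x100u     /* Control Register, PL310 TRM 3.3.3 */
#define PL310_CONTROL_L2_ENABLE (1u << 0)  /* bit 0: L2 cache enable            */

/* Returns true when the L2 enable bit is set in the Control Register.
 * pl310_base points at the first PL310 register (or at a fake register
 * block when run on a host). */
bool pl310_l2_enabled(const volatile uint32_t *pl310_base)
{
    uint32_t control = pl310_base[PL310_CONTROL_OFFSET / sizeof(uint32_t)];
    return (control & PL310_CONTROL_L2_ENABLE) != 0u;
}

/* Host-side example using a fake register block in place of hardware:
 * 0x00000000 before the "smc", 0x00000001 after it, as observed above. */
void pl310_l2_enabled_example(void)
{
    static uint32_t fake_regs[0x1000u / sizeof(uint32_t)];

    fake_regs[PL310_CONTROL_OFFSET / sizeof(uint32_t)] = 0x00000000u;
    assert(!pl310_l2_enabled(fake_regs));

    fake_regs[PL310_CONTROL_OFFSET / sizeof(uint32_t)] = 0x00000001u;
    assert(pl310_l2_enabled(fake_regs));
}
```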

    For more details on PL310 registers, please see:

    • AM437x TRM (spruhl7i.pdf):
      • "Table 2-3. L4_PER Peripheral Memory Map"
      • "3.3.1.2.1 PL310 Registers"
    • PrimeCell Level 2 Cache Controller TRM (DDI0246C_l2cc_pl310_r2p0_trm.pdf):
      • "3.3.3 Control Register"

    The PL310 TRM, Table 3-16 describes cache maintenance operations. The description for "Clean and Invalidate Line by PA" leads me to believe the "L2 Cache Clean and Invalidate Range of Physical Address" monitor primitive should clean and invalidate the cache lines associated with the provided physical address range.

    Although clean_Dcache( 0, 0 ) doesn't cause the processor to crash, it doesn't seem to write back cache data to DDR. I performed this experiment to come to this conclusion:

    1. Halt processor at main()
    2. Open Memory Browser in CCS
    3. In Memory Browser
      • Select DDR base address @ 0x80000000
      • Select Physical Memory view. Take note of DDR contents.
      • Select CPU Memory view
    4. Execute code through CopyProgram(). This will update the CPU Memory view with the data copied from OCMC RAM to DDR.
    5. Execute code through call to clean_Dcache( 0, 0 ). This should write-back and invalidate the dirty cache lines associated with the DDR.
    6. In Memory Browser, select Physical Memory view. If clean_Dcache( 0, 0) was working, I would expect the Physical and CPU Memory views to match. However, the Physical Memory view displays the same information from Step #3.

    I still plan to take a closer look at the clean_Dcache( 0, 0) code.

    Regards,
    Frank

     

  • Hi, Frank.

    Thank you for your support!

    I have a reply from my customer regarding the origin of the assembly language in the example code, which you asked about on Oct-29th.

    # What is the origin of the assembly language in the example code which causes the crash (mmuA9g.S)?

    ==========
    This code was ported to AM437x by Mispo Inc. as shown in the comments at the top of the mmuA9g.S file.
    To port NORTi from the ARMv6 architecture to the ARMv7 Cortex-A9 core, Mispo Ltd. used
    information sources other than "5.2.11 Services for HLOS Support - API".
    Please refer to the following:

    "(DDI0406) ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition"
    Chapter B3 Virtual Memory System Architecture (VMSA)
    Chapter B4 System Control Registers in a VMSA implementation
    "Appendix D12 ARMv6 Differences"

    "(DEN0013) ARM Cortex-A Series Programmer's Guide"
    Chapter 8 Caches,
    Chapter 9 Memory Management Unit

    "(DDI0388) Cortex-A9 Technical Reference Manual"
    Chapter 4 System Control

    For more information on the dependencies of the L2 cache controller PL310,
    please refer to the following Reference.
    "(DDI0246F) CoreLink Level 2 Cache Controller L2C-310"
    2.3 Cache operation,
    Chapter 3: Programmers Model

    At the time of porting to AM437x, the part to manipulate L2 cache
    using HLOS in secure mode has been added by referring to the following.

    "5.2.11 Services for HLOS Support - API" in the
    AM437x ARM Cortex-A9 Processors Technical Reference Manual
    ==========

    Thank you in advance for any additional reports on this matter.

    Best regards,
    Kanae

  • Hi Kanae,

    Thanks much for the detailed feedback.

    Kanae said:
    At the time of porting to AM437x, the part to manipulate L2 cache
    using HLOS in secure mode has been added by referring to the following.

    Was the issue with the monitor primitive in "Table 5-55. L2 Cache Clean and Invalidate Range of Physical Address" encountered during the port of the software to AM437x?

    Regards,
    Frank

  • Hi Frank,


    Thank you for your support.

    My customer will contact MISPO Ltd. with your feedback so far.
    Before doing so, please take a few moments to answer the following additional confirmation.

    My customer would like to confirm whether the understanding in items 1 to 3 below is correct.

    1. clean_Dcache ( ADDR_DDR_APP, LEN_DDR_APP ) investigation status:

    You assume that the API with ID=0x101 used in clean_l2_cache, although listed in (spruhl7i),
    is likely not actually supported by AM437x.

    [Reason for speculation]
    In "example.zip", the API with ID=0x102 sets the PL310 control register as expected and the smc instruction
    does not crash. On the other hand, the API with ID=0x101 causes a crash at the smc instruction.

    In TI-RTOS AM437x PDK 1.0.16, the API is used via the Pub2MonDispatch function. The part that calls
    smc_id=SMC_ID_CLEAN_INV_PHY_ADDR (corresponding to ID=0x101) has not been found.
    Linux PDK also uses the API, but the part that uses ID=0x101 has not been found. (ID=0x101 is unlikely to be used.)

    2. Investigation of clean_Dcache( 0, 0 ):

    According to the following experiment, the write back operation from L2 cache to real memory does not work. (*)

    [Experiment results]
    After copying LedBlinking program to DDR3 SDRAM by CopyProgram function of "example.zip",
    it should have been written back from cache to DDR3 SDRAM by calling clean_Dcache( 0, 0 ).
    However, when I looked at the content of DDR3 SDRAM in the Physical Memory View of the Memory Browser in CCS,
    it was not reflected.

    (*) When I looked at op_l2_cache_all, which is called by clean_Dcache( 0, 0) when L2 cache is ON,
    it seems to be a dummy implementation (because only clean of L2 cache is not in the API at AM437x secure mode).
    Sorry for my lack of confirmation.

    3. Your additional question in your last post:

    As mentioned in 1. above, I assume that there is a high possibility that ID=0x101 is not supported by AM437x,
    and I would like to confirm this to increase the accuracy of my assumption.
    Are you asking whether we noticed any strange behavior when we ported and ran the clean & invalidate of the L2 cache using ID=0x101?
    Are there any symptoms that you think are strange?
    I am not sure which symptoms "the issue with the monitor primitive in Table 5-55" refers to.

    Best regards,
    Kanae

  • Hi Kanae,

    Kanae said:
    1. clean_Dcache ( ADDR_DDR_APP, LEN_DDR_APP ) investigation status:

    Correct. However, this is only indirect evidence for the lack of support. I don't yet have confirmation from a HLOS monitor primitives or PL310 L2 cache controller expert.

    I next plan to check the PRSDK Secondary Boot Loader to see if it copies data to DDR with D-cache enabled. If so, it should perform a cache clean & invalidate and we can check how this is performed. Would it be possible for the customer to try this experiment?

    Kanae said:
    2. Investigation of clean_Dcache( 0, 0 ):

    Ok, so we agree clean_Dcache( 0, 0 ) doesn't actually clean & invalidate the D-cache?

    Kanae said:
    I would like to confirm this to increase the accuracy of my assumption

    Understood, I would also like confirmation and I'm working toward that goal.

    Regards,
    Frank

  • Hi Frank,

    Thank you for your support.

    Here are my customer's comments.

    Regarding #1, I understand.
    As for our verification, it's difficult for us because we don't use the processor SDK in our development
    and I'm not very familiar with it. I'm sorry for my lack of knowledge.

    Regarding #2, I will check clean_Dcache( 0, 0 ) with Mispo and report back later.

    Regarding #3, I understand. I will ask MISPO about your question.

    Best regards,
    Kanae

     

  • Kanae,

    Ok, I'll run the experiment with the SBL and let you know what I find.

    Regards,
    Frank

  • Hi Kanae,

    I checked the behavior of the MMCSD Starterware Secondary Boot Loader (SBL). The SBL doesn't enable the L2 D-cache. Hence the bootloader doesn't need to clean & invalidate the L2 D-cache before executing code from DDR.

    I'll see if I can find any PDK code that uses Pub2MonDispatch().

    Regards,
    Frank

  • Hi Frank,

    Thank you for your additional report.
    I will share this with my customer.

    Here are the replies from Mispo to your questions.

    =======================================================================================================
    2. Investigation of clean_Dcache( 0, 0 ):
    Frank;
    Ok, so we agree clean_Dcache( 0, 0 ) doesn't actually clean & invalidate the D-cache?

    Mispo reply;
    # The AM437x does not have an HLOS API to clean the entire L2 cache, so that part of the API is a dummy,
    but it does perform an operation to clean the entire L1 data cache.
    This ensures that at least the contents of the L1 data cache are reflected in the L2 cache.
    The call to clean_Dcache( 0, 0 ) behaves as follows:
    1. When only the L1 cache is ON: data is written from the L1 data cache to the DDR3 SDRAM.
    2. When the L2 cache is also ON: data is written from the L1 data cache to the L2 cache.
       No data is transferred from the L2 cache to the DDR3 SDRAM.


    3. Your additional question in your post:
    Frank
    Was the issue with the monitor primitive in "Table 5-55.
    L2 Cache Clean and Invalidate Range of Physical Address" encountered
    during the port of the software to AM437x?

    Mispo reply;
    # Mispo has not had any particular problems and so had not noticed this before,
    but the phenomenon (*) you cited is reproducible,
    and the contents of the L2 cache do not seem to be reflected in the DDR
    when the HLOS API with Function ID=0x101 is executed.

    (*) The phenomenon in the experiment in which the investigator checked with the Physical Memory view
    whether the data was reflected in DDR after clean_Dcache( 0, 0 )
    =======================================================================================================
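
Mispo's description of clean_Dcache( 0, 0 ) in item 2 can be illustrated with a toy model of a single memory word. This only illustrates the described data flow and is not a model of the real cache hardware; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: the value of ONE word as held at each level. */
typedef struct {
    uint32_t l1;         /* copy in the L1 D-cache            */
    uint32_t l2;         /* copy in the L2 cache              */
    uint32_t ddr;        /* copy in DDR3 SDRAM                */
    bool     l2_enabled; /* true when the L2 cache is also ON */
} word_model_t;

/* A CPU store lands in the L1 D-cache (write-back behavior assumed). */
void cpu_store(word_model_t *m, uint32_t value)
{
    m->l1 = value;
}

/* Cleaning L1 writes the word to the next level: L2 if ON, else DDR. */
void clean_l1(word_model_t *m)
{
    if (m->l2_enabled) {
        m->l2 = m->l1;
    } else {
        m->ddr = m->l1;
    }
}

/* Cleaning L2 writes the word from L2 to DDR (no effect if L2 is OFF). */
void clean_l2(word_model_t *m)
{
    if (m->l2_enabled) {
        m->ddr = m->l2;
    }
}

/* clean_Dcache( 0, 0 ) as described by Mispo: L1 is cleaned, while the
 * whole-L2 step is a dummy, so nothing moves from L2 to DDR. */
void model_clean_dcache_all(word_model_t *m)
{
    clean_l1(m);
}

void model_example(void)
{
    word_model_t m = { 0u, 0u, 0u, true };
    cpu_store(&m, 0xCAFEu);
    model_clean_dcache_all(&m);
    assert(m.l2 == 0xCAFEu); /* reached the L2 cache */
    assert(m.ddr == 0u);     /* DDR still holds the old value */
}
```

In this model, DDR is updated with the L2 cache ON only once an explicit L2 clean step runs, which matches the Physical Memory view observation earlier in the thread.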

    If you need more confirmation, please let me know.

    Best regards,
    Kanae

  • Hi Kanae,

    Thanks for the feedback. I'm still working on identifying PDK software which uses Pub2MonDispatch(). It may take several days to conclude this investigation. I'll try to have this completed by mid week.

    Regards,
    Frank

  • Hi Frank,

    Thank you for your support.

    Please let me know the progress of your investigation,
    as I have to report back to my customer.

    Best regards,
    Kanae

  • Hi Kanae,

    I haven't had an opportunity to pursue this since my last post. I'll try to have an update by early next week. Thanks for your patience.

    Regards,
    Frank


  • Hi Frank,

    Thank you for your reply.
    I look forward to your update early next week.

    Best regards,
    Kanae


  • Hi Frank,

    Thank you for your support.

    How is the progress of your investigation?
    I need to report back to my customer.

    Just let me know your current status.

    Best regards,
    Kanae

  • Hi Kanae,

    Sorry, my further investigation has been delayed. I'll check through PDK for Pub2MonDispatch() and get back with you on Monday.

    Regards,
    Frank

  • Hi Frank,

    Thank you for your reply.
    I would like to know the progress of your investigation.

    Best regards,
    Kanae

  • Hi Kanae,

    I carefully reviewed the PDK code and wasn't able to find any usage of the Starterware function Pub2MonDispatch() with SMC_ID_CLEAN_INV_PHY_ADDR (i.e. HLOS function "L2 Cache Clean and Invalidate Range of Physical Address").

    I created a version of the Loader app using SYSBIOS to see how SYSBIOS manages the L2 cache. This test program correctly cleans and invalidates the L2 cache after CopyProgram() using the following function:

    Cache_wbInv((Ptr)ADDR_DDR_APP, LEN_DDR_APP, Cache_Type_L2, TRUE);

    I notice SYSBIOS doesn't use the HLOS primitive for the L2 cache clean and invalidate. Instead, it directly writes to the PL310 register "Clean and Invalidate Line by PA".

    I can share this test program if you think it would be helpful.

    In summary, I can't find any usage of the HLOS primitive "Table 5-55. L2 Cache Clean and Invalidate Range of Physical Address" in PDK/Starterware or SYSBIOS. 
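
For reference, the direct register approach (writing each line's physical address to the PL310 "Clean and Invalidate Line by PA" register) can be sketched as follows. This is not the SYSBIOS implementation. It assumes the PL310 TRM register offsets (0x7F0 for Clean and Invalidate Line by PA, 0x730 for Cache Sync) and the PL310's 32-byte line size; the function name is hypothetical, the base address is supplied by the caller, and the barriers and L1 maintenance that real target code needs are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define PL310_CACHE_SYNC_OFFSET        0x730u /* Cache Sync                   */
#define PL310_CLEAN_INV_LINE_PA_OFFSET 0x7F0u /* Clean and Invalidate Line PA */
#define PL310_LINE_SIZE                32u    /* bytes per L2 cache line      */

/* Issues one "Clean and Invalidate Line by PA" write per cache line that
 * overlaps [phys_addr, phys_addr + size), then a Cache Sync write.
 * Returns the number of lines written. Assumes the range does not wrap
 * past the top of the 32-bit address space. */
uint32_t pl310_clean_inv_range(volatile uint32_t *pl310_base,
                               uint32_t phys_addr, uint32_t size)
{
    uint32_t lines = 0u;

    if (size == 0u) {
        return 0u;
    }

    /* Align the first and last byte addresses down to line boundaries. */
    uint32_t line = phys_addr & ~(PL310_LINE_SIZE - 1u);
    uint32_t last = (phys_addr + size - 1u) & ~(PL310_LINE_SIZE - 1u);

    for (;;) {
        pl310_base[PL310_CLEAN_INV_LINE_PA_OFFSET / sizeof(uint32_t)] = line;
        lines++;
        if (line == last) {
            break;
        }
        line += PL310_LINE_SIZE;
    }

    /* Drain the controller's buffers once the per-line writes are issued. */
    pl310_base[PL310_CACHE_SYNC_OFFSET / sizeof(uint32_t)] = 0u;
    return lines;
}

/* Host-side example with a fake register block in place of hardware. */
void pl310_clean_inv_range_example(void)
{
    static uint32_t fake_regs[0x1000u / sizeof(uint32_t)];

    assert(pl310_clean_inv_range(fake_regs, 0x80000000u, 64u) == 2u);
    assert(fake_regs[PL310_CLEAN_INV_LINE_PA_OFFSET / sizeof(uint32_t)]
           == 0x80000020u);
}
```

With the values used in this thread (0x80000000 and 0x04000000), the loop would issue 0x200000 line operations.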

    Have you checked the MMU and page table settings from your code, or experimented with different settings?

    Are you testing on a GP or HS device? According to the TRM, the primitives are available on GP devices, but I don't have an HS device to see if I get different behavior than on a GP device.

    I'll see if I can find someone with detailed knowledge of the ROM code.

    Regards,
    Frank

  • Hi Frank,

    Thank you for your support.

    I will share your report with my customer.

    Best regards,
    Kanae

  • Hi Frank,

    Here is updated information from my customer.

    ***************************************************************************************************************************

    <Frank>
    I can share this test program if you think it would be helpful.

    # Thank you, please provide it to us.

    <Frank>
    Have you checked the MMU and page table settings from your code, or experimented with different settings?

    # We have not specifically checked the MMU and page table settings from the code.
    I also did not experiment with the HLOS API Function ID=0x101.
    Regarding the problem of starting with the L2 cache ON, Mispo Inc. said "clean_Dcache( 0, 0 ) expels L1 D cache to L2 cache",
    so I think it is possible to start without Clean & Invalidate of the L2 cache, and I am experimenting with Mispo as follows.

    ==============================================================================================================
    【Customer reports (A) and confirmations (B) to Mispo】

    (A) Program split with L2 cache ON
    [Before Change]
       vdis_psw();                          /* disable all interrupts */
       clean_Dcache(load_addr, load_size);  /* clean D-cache */
       inval_Icache(load_addr, load_size);  /* invalidate I-cache */
       loaded_program();                    /* branch to loaded program */

    [Modified]
       vdis_psw();
       clean_Dcache( 0, 0 ); /* guarantees that the copied code is written back from the L1 D-cache to the L2 cache */
       loaded_program();

    When we measured the start-up time of our prototype with (A) [Modified] above, the start-up time was reduced
    and it seemed to satisfy the start-up time requirements.

    (B) Confirmation.
    (B-1) Is the reason for clean_Dcache(load_addr, load_size) in (A) [Before] above, correct?
          [Recognition]
          When code is copied from ROM to RAM, the code at the branch destination ends up in the L1 D-cache.
          The L1 cache is split into a separate I-cache and D-cache.
          Code held in the D-cache cannot be instruction-fetched.
          Therefore, calling clean_Dcache() writes the code out of the D-cache, and it can then be instruction-fetched through the L1 I-cache from wherever it was written out to.

    (B-2) Is the following information about clean_Dcache( 0, 0 ) correct?
          [Recognition]
          clean_Dcache( 0, 0 ) is at least guaranteed to be reflected from the L1 D-Cache to the L2 cache.
          Because the L2 cache is a unified cache, the code expelled from the L1 D-cache to the L2 cache can be instruction-fetched via the L1 I-cache.

    (B-3) Is the reason for inval_Icache(load_addr, load_size) in (A) [before change] above correct as follows?
          [Recognition]
          If other code was previously stored in and executed from the RAM area that the code is copied to, the L1 I-cache may still contain that old code.
          Therefore, before the program branch, inval_Icache() is called to discard the contents of the L1 I-cache so that the newly copied code is read from the L2 cache into the L1 I-cache.

    Mispo: Yes. All of the above is as you know.
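
    The sequence described in (B-1) to (B-3) can be sketched roughly as follows. This is a minimal illustration, not Mispo's clean_Dcache()/inval_Icache() implementation: the function name sync_code_range() is hypothetical, the 32-byte line size assumes a Cortex-A9 L1 cache, and the CP15 encodings are the ARMv7-A "clean D-cache line by MVA to PoU" and "invalidate I-cache line by MVA to PoU" operations. On a non-ARM host the operations compile to no-ops so that only the line arithmetic is exercised.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define L1_LINE_SIZE 32u  /* ASSUMPTION: Cortex-A9 L1 line size in bytes */

    #if defined(__arm__) && defined(__GNUC__)
    /* ARMv7-A CP15 cache maintenance operations and barriers. */
    #define DCCMVAU(a) __asm__ volatile("mcr p15, 0, %0, c7, c11, 1" : : "r"(a) : "memory")
    #define ICIMVAU(a) __asm__ volatile("mcr p15, 0, %0, c7, c5, 1" : : "r"(a) : "memory")
    #define BPIALL()   __asm__ volatile("mcr p15, 0, %0, c7, c5, 6" : : "r"(0) : "memory")
    #define DSB()      __asm__ volatile("dsb" : : : "memory")
    #define ISB()      __asm__ volatile("isb" : : : "memory")
    #else
    /* Host build: no-ops, so the address arithmetic can still be tested. */
    #define DCCMVAU(a) ((void)(a))
    #define ICIMVAU(a) ((void)(a))
    #define BPIALL()   ((void)0)
    #define DSB()      ((void)0)
    #define ISB()      ((void)0)
    #endif

    /*
     * Hypothetical helper: make freshly copied code in [start, start + size)
     * visible to instruction fetch. Returns the number of L1 lines touched.
     */
    static uint32_t sync_code_range(uint32_t start, uint32_t size)
    {
        uint32_t first, lines, i;

        if (size == 0u)
            return 0u;
        assert(start + (size - 1u) >= start);   /* range must not wrap */

        first = start & ~(L1_LINE_SIZE - 1u);
        lines = ((start + size - 1u) - first) / L1_LINE_SIZE + 1u;

        for (i = 0u; i < lines; i++)            /* (B-1): push code out of the D-cache */
            DCCMVAU(first + i * L1_LINE_SIZE);
        DSB();                                  /* cleans complete before invalidating */

        for (i = 0u; i < lines; i++)            /* (B-3): drop stale I-cache lines */
            ICIMVAU(first + i * L1_LINE_SIZE);
        BPIALL();                               /* drop stale branch predictions */
        DSB();
        ISB();                                  /* refetch after the maintenance */
        return lines;
    }
    ```

    A loader would call sync_code_range(load_addr, load_size) just before loaded_program(). Whether cleaning only as far as the L2 cache is sufficient depends on the L2 cache being enabled and coherent for instruction fetch, as discussed in (B-2).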


    (B-4) In the case of our prototype, is our understanding below correct that it does not matter whether inval_Icache() is called before a program branch?
          [Recognition]
          In our current prototype, in the Starter→Booter→App program branches at startup, the RAM area the code is copied to is separate
          for each program, and no other code has been placed there or executed from there before.
          A reboot is only ever a hardware reset (which also resets the L1/L2 cache), so the L1 I-cache cannot contain old code
          and there is no need to call inval_Icache().
          Even if it is called, it only discards the contents of the L1 I-cache, so there is no problem.

    #Mispo: If the area to be copied is large and inval_Icache() is taking a long time, it might be faster to invalidate the entire instruction cache.
    Try replacing inval_Icache() with the four lines of asm volatile(...) shown below.

      vdis_psw();
      clean_Dcache( 0, 0 );
      asm volatile("mcr p15, 0, %0, c7, c5, 0" : : "r" (0)); /* invalidate entire I-cache */
      asm volatile("mcr p15, 0, %0, c7, c5, 6" : : "r" (0)); /* invalidate branch predictor */
      asm volatile("dsb" : : : "memory"); /* barrier instructions */
      asm volatile("isb" : : : "memory");
      loaded_program();


    ##Customer: Regarding invalidating the entire instruction cache: as described in the additional explanation above, in our current prototype the start-up time is unlikely to differ significantly between invalidating the entire instruction cache and invalidating it only for the specified address range.


    (B-5) In the case of our prototype, is there any problem with procedure (A) [Modified] above for the program split with L2 cache ON?
    Of course, once the matter of HLOS API Function ID=0x101 is resolved, we will revert to (A) [Before Change].


    #Mispo: No. I think it would be better to implement instruction cache invalidation, just to be sure.

    ##Customer: As you answered, we will implement instruction cache invalidation.
    Currently, if there is no significant difference in start-up time with or without instruction cache invalidation,
    and if there is no problem in processing time, we will implement it as an insurance policy
    in case there is old code in the instruction cache that we did not expect.

    ==============================================================================================================

    Please continue to contact me if you find out anything about the investigation you are doing into the cause of the crash with HLOS API Function ID=0x101.

    <Frank>
    Are you testing on a GP or HS device? According to the TRM the primitives are available on GP devices,
    but I don't have an HS device to see if I get different behavior than a GP device.

    # Sorry, our prototype uses GP devices.
    We have no plans to use an HS device and have no environment for one.
    Also, the EVM information in the first post was wrong: it is not "TMDSEVM437X (AM437x high security)";
    the correct model number is "TMDSEVM437X" (General Purpose module).

    ==============================================================================================================

    Could you please share your sample program in this thread?
    If anything in my post is unclear, please point it out.

    Best regards,
    Kanae

  • Hi Frank,

    My customer would like to have your sample program ASAP.
    When can you provide it to us?

    And the final customer request is as follows.

    **************************************************************************************************************

    TI support team is currently investigating this issue, but once the results are in,
    please respond to the following questions as TI's opinion.

    If the above is a problem, is the crash when cleaning and invalidating the L2 cache using the HLOS API Function ID=0x101 a bug or a usage problem?

    If the above is a bug, what is the workaround to clean and invalidate the L2 cache to DDR3 SDRAM?
    I'm assuming that the sample program you provide will be the workaround.

    Best regards,
    Kanae

  • Hi Kanae,

    Kanae said:
    When can you provide it to us?

    The example code is attached.

    Kanae said:
    I'm assuming that the sample program you provide will be the workaround.

    No, it isn't a solution or workaround for the issue with the L2 cache clean and Invalidate ROM code since SYSBIOS doesn't use this ROM code. I offered this example program to aid in understanding how TI-RTOS (SYSBIOS) uses the PL310 cache controller for managing the L2 cache. Perhaps the customer can take a similar approach to SYSBIOS in their bare-metal code. 

    Kanae said:
    No specific MMU and page table checks from the code

    What I meant is have you manually checked the MMU configuration? What is the MMU configuration? Can you share it?
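
    As a starting point for that check, the relevant CP15 registers can be read at run time and decoded. The sketch below is only an illustration: the names mmu_state, read_sctlr(), decode_sctlr() and ttbr0_table_base() are hypothetical, the bit positions are the ARMv7-A SCTLR M, C, Z and I bits, and the register read has to run in a privileged mode on the target (on a non-ARM host it is stubbed to return 0).

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* ARMv7-A SCTLR bits. */
    #define SCTLR_M (1u << 0)   /* MMU enable */
    #define SCTLR_C (1u << 2)   /* data and unified cache enable */
    #define SCTLR_Z (1u << 11)  /* branch prediction enable */
    #define SCTLR_I (1u << 12)  /* instruction cache enable */

    struct mmu_state {
        bool mmu_on;
        bool dcache_on;
        bool icache_on;
        bool branch_pred_on;
    };

    /* Read SCTLR (privileged mode only); stubbed to 0 on a non-ARM host. */
    static inline uint32_t read_sctlr(void)
    {
    #if defined(__arm__) && defined(__GNUC__)
        uint32_t v;
        __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(v));
        return v;
    #else
        return 0u;
    #endif
    }

    /* Decode the enable bits of an SCTLR value. */
    static inline struct mmu_state decode_sctlr(uint32_t sctlr)
    {
        struct mmu_state s;
        s.mmu_on         = (sctlr & SCTLR_M) != 0u;
        s.dcache_on      = (sctlr & SCTLR_C) != 0u;
        s.icache_on      = (sctlr & SCTLR_I) != 0u;
        s.branch_pred_on = (sctlr & SCTLR_Z) != 0u;
        return s;
    }

    /* Translation table base from TTBR0: bits [31:14-N], N taken from TTBCR.N. */
    static inline uint32_t ttbr0_table_base(uint32_t ttbr0, uint32_t n)
    {
        assert(n <= 7u);
        return ttbr0 & ~((1u << (14u - n)) - 1u);
    }
    ```

    Calling decode_sctlr(read_sctlr()) just before the failing call, and dumping the first-level descriptors found at the TTBR0 table base for the affected DDR range, would show which memory attributes are actually in effect.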

    Kanae said:
    If the above is a problem, is the crash when cleaning and invalidating the L2 cache using the HLOS API Function ID=0x101 a bug or a usage problem?

    I've looped in a colleague with more knowledge of the ROM code.

    Regards,
    Frank

    test_rtos.zip

  • Hi Frank,

    Thank you for providing the sample program.

    Here are my customer's comments.

    =============================================================

    I appreciate you and I will refer to your sample program. 

    I would like to confirm that the "Cache_wbInv((Ptr)ADDR_DDR_APP,
    LEN_DDR_APP, Cache_Type_L2, TRUE);" in the sample is "correct usage", isn't it?

    Regarding the MMU settings, I have not checked the register values directly,
    but only edited the "mmutableg.inc" file included in "example.zip" to configure the MMU settings.

    Do you need the MMU settings for our prototype (Starter, Booter, and Application programs)
    as well as the MMU settings for the "example.zip" that reproduces the crash symptom
    in the AM437x GP evaluation module?

    Thank you for your continued support.

    =============================================================

    Best regards,
    Kanae

  • Hi Frank,

    Could you please reply to my customer's questions dated Dec-16th?
    If anything is unclear, please let me know.

    I appreciate your continued support.

    Best regards,
    Kanae

  • Usage of the L2 cache maintenance services is documented in section 5.2.11 of the TRM.  There is no known errata associated with these services.  Most likely a crash is the consequence of user error.  

    Regards,

    James

  • Hi James,

    Thank you for your reply.

    I would like Frank's answer to the following question from my post dated Dec-15th.

    ===============================================================================

    Kanae
    If the above is a problem, is the crash when cleaning and invalidating the L2 cache
    using the HLOS API Function ID=0x101 a bug or a usage problem?

  • Kanae, it could be a number of things.  You would have to be sure to invoke the proper barrier instructions around cleaning and invalidating the cache.  Probably what is happening is that you are executing code from an improperly cleaned cache.  I would follow the examples from RTOS in the sequence.  Are they using the RTOS examples, or creating their own?

    Regards,

    James

  • Hi James,

    Thank you for your reply.
    I will share your comment with our customer.

    The code used by my customer was posted on October 23.
    The mmuA9g.S file that is causing this crash was ported to AM437x by Mispo Inc.

    The information referred to during the porting process is described in the post of November 4.

    e2e.ti.com/.../3581758

    Best regards,
    Kanae

  • Kanae, it is very difficult for us to debug code that was ported by a 3rd party.  We should have clear examples in our software offerings (either Linux or RTOS), which the 3rd party should be able to refer to.  We cannot support 3rd party software.

    Regards,

    James