[Concerto] C28 Code Execute Cycle Timing Comparison - Flash vs RAM



Hi All,

I used a simple C28 program to measure the execution time of sincos() from the C28 FPU fast RTS library, and compared the CPU cycle counts when running the code from Flash versus RAM. Here are the test conditions and results.

The C28 clock rate is set to 150 MHz, and Flash is configured with 3 wait states. The FPU math table is located in single-wait-state Boot ROM.

When the code runs from Flash, sincos() takes 156 cycles. When the code runs from RAM, it takes only 46 cycles.

Running code from Flash is clearly much slower. Is this mainly due to the 3 wait states for Flash access? That wait-state count is what TI recommends for a 150 MHz C28 clock, so is this the expected behavior when running code from Flash?

Thanks!

Holly
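A cycle count like the ones quoted above is usually taken by reading a free-running timer just before and just after the call and subtracting. The post does not say how the measurement was made, so the following is only a minimal sketch under stated assumptions: a 32-bit timer that counts down at the CPU clock rate, and a hypothetical helper name `elapsed_cycles`. The `overhead` argument stands for the cost of the two timer reads themselves, which would be calibrated by timing an empty region.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of any TI library): cycles elapsed between
 * two reads of a 32-bit free-running timer that counts DOWN once per CPU
 * cycle. Because the timer counts down, the elapsed count is start - end.
 * Unsigned 32-bit subtraction gives the right answer even if the counter
 * wrapped past zero once between the two reads.
 */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end, uint32_t overhead)
{
    uint32_t raw = start - end;   /* modulo 2^32, so one wrap is handled */

    /* Guard against a calibration value larger than the raw reading. */
    return (raw > overhead) ? (raw - overhead) : 0u;
}
```

On target, `start` and `end` would come from the timer's count register, read immediately around the sincos() call, with interrupts disabled so that no ISR inflates the result.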

  • Zhou,

    Are you using prefetch for data and program memory while doing these tests?

    Please make sure you use V140 of the header files for Concerto. There is an error in the previous versions that prevents the prefetch mechanism from working.

    EALLOW;

    //Disable Cache and prefetch mechanism before changing wait states
    FlashCtrlRegs.FRD_INTF_CTRL.bit.DATA_CACHE_EN = 0;
    FlashCtrlRegs.FRD_INTF_CTRL.bit.PREFETCH_EN = 0;

    //Set waitstates according to frequency
    //                CAUTION
    //Minimum waitstates required for the flash operating
    //at a given CPU rate must be characterized by TI.
    //Refer to the datasheet for the latest information.
    #if CPU_FRQ_150MHZ
    FlashCtrlRegs.FRDCNTL.bit.RWAIT = 0x3;
    #endif

    #if CPU_FRQ_100MHZ
    FlashCtrlRegs.FRDCNTL.bit.RWAIT = 0x2;
    #endif

    #if CPU_FRQ_60MHZ
    FlashCtrlRegs.FRDCNTL.bit.RWAIT = 0x1;
    #endif

    //Enable Cache and prefetch mechanism to improve performance
    //of code executed from Flash.
    FlashCtrlRegs.FRD_INTF_CTRL.bit.DATA_CACHE_EN = 1;
    FlashCtrlRegs.FRD_INTF_CTRL.bit.PREFETCH_EN = 1;

    //At reset, ECC is enabled.
    //If application software has disabled it and wants to enable it again,
    //write 0xA to the ENABLE field.
    FlashEccRegs.ECC_ENABLE.bit.ENABLE = 0xA;

    EDIS;

    The code in InitSysCtrl.c shows the procedure for enabling the prefetch mechanism. Alternatively, you can refer to the TRM.

    Regards

    Manish Bhardwaj

  • Hi Manish,

    Thank you for the quick reply.

    I used V130 of the Concerto files with prefetch enabled. As you pointed out, using the older version of the Concerto files was the cause of the problem.

    I replaced the Concerto files with the V140 package and repeated the same test. Running sincos() from Flash now takes only 62 cycles.

    Thank you and Best Regards

    Holly
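To put the three cycle counts from this thread in perspective (156 from Flash with V130, 62 from Flash with V140, 46 from RAM), they can be converted to wall-clock time at the 150 MHz clock and expressed as overhead relative to RAM. The helper names below are hypothetical and the code is only a small arithmetic sketch, not part of any TI package.

```c
/*
 * Hypothetical helpers for comparing cycle counts.
 * cycles_to_ns: convert a CPU cycle count to nanoseconds at a given clock.
 * overhead_percent: extra cycles of one measurement relative to a baseline.
 */
static double cycles_to_ns(unsigned long cycles, double cpu_hz)
{
    return (double)cycles * 1.0e9 / cpu_hz;
}

static double overhead_percent(unsigned long cycles, unsigned long baseline)
{
    return ((double)cycles - (double)baseline) * 100.0 / (double)baseline;
}
```

With `cpu_hz = 150.0e6`, 156 cycles is 1040 ns, 62 cycles is about 413 ns, and 46 cycles is about 307 ns. Relative to RAM, the Flash result goes from roughly 239% overhead with the V130 files to roughly 35% with V140, so working prefetch removes most, though not all, of the wait-state penalty in this test.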

  • Glad to be of help,

    regards

    Manish Bhardwaj