
Measure CPU % (percent) usage?

Does anyone have a suggestion for how to measure how much of the DSP's CPU time my firmware is using? A rough percentage, like the CPU usage figure a typical operating system displays, would be helpful.

I am using CSL, but not DSP/BIOS or anything that high-level. My application is built around several channels of I/O, all running at 125 kHz with a 128-sample frame window, so my processing loop has to repeat at 976.5625 Hz, i.e. every 1.024 ms. What I want to determine is whether I am using more cycles than are available. With the C5506 running at its maximum of 108 MHz, that gives me 110,592 clock cycles per frame. I am calling FFT routines, several assembly-optimized DSP subroutines, and a couple of USB API endpoints. I have counted cycles for many of the individual routines, but I have no idea what the total for everything might be. There is also the consideration that four channels of DMA plus USB consume cycles outside my own code: whenever there is contention for bus resources, DMA adds cycles to my execution time, which is certainly the case while the FFT is running.
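
For reference, the budget arithmetic works out as follows (a quick stand-alone sanity check in plain C, nothing device-specific):

    /* Per-frame cycle budget: 125 kHz sample rate, 128-sample frames, 108 MHz CPU. */
    #include <stdio.h>

    int main(void)
    {
        const double sampleRateHz = 125000.0;      /* per-channel sample rate      */
        const double frameSamples = 128.0;         /* samples per processing frame */
        const double cpuHz        = 108000000.0;   /* C5506 at its maximum clock   */

        double frameRateHz    = sampleRateHz / frameSamples;    /* 976.5625 Hz */
        double framePeriodS   = 1.0 / frameRateHz;               /* 1.024 ms    */
        double cyclesPerFrame = cpuHz * framePeriodS;            /* 110,592     */

        printf("frame rate   = %.4f Hz\n", frameRateHz);
        printf("frame period = %.6f s\n", framePeriodS);
        printf("cycle budget = %.0f cycles per frame\n", cyclesPerFrame);
        return 0;
    }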

Any suggestions?

  • Brian,

    You are using CCS 3.3, aren't you? CCS 3.3 will report the CPU cycle count between two breakpoints. I think you can get the total cycle count you want using this feature (it is called Clock).

    How to enable it:

    1) Under "Profile" menu, go to Clock -> Enable

    2) Under "Profile" menu, go to Clock -> View

    3) Under "Profile" menu, go to Clock -> Setup

    - Select: cycle.CPU

    - Select: Auto (reset option)

    You will see the cycle count at the bottom right of the CCS window.
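
    As a rough illustration of where the two breakpoints could go (the function below is just a placeholder for one frame of your processing, not code from your project), and how the count converts to a percentage:

        /* Placeholder sketch: bracket one frame of work with the two breakpoints. */
        static void process_one_frame(void)
        {
            /* FFTs, DSPLIB routines, USB endpoint work, ... */
        }

        void frame_task(void)
        {
            process_one_frame();   /* breakpoint 1 here: clock resets on run (Auto) */
            return;                /* breakpoint 2 here: read cycle.CPU             */
        }
        /* CPU load (%) = measured cycles / 110592 * 100, for one 1.024 ms frame at 108 MHz */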

    Good luck!

    Regards,

    Peter Chung


  • Hi Peter,

    I'm not sure where to put my breakpoints. I have two ADS chips feeding two McBSP ports, each serviced by its own DMA callback (and I have enabled both the HALF and BLOCK interrupts for each). Then I have two USB endpoints, one for each ADS, serviced from a single USB callback interrupt. I'm not sure I can find one place in my code where I can guarantee that all of the code will execute between two points. I could measure any one of four or five key routines, but I don't see where I could measure everything together.

    I have a main loop where I check four global flags (HALF and BLOCK for each of the two DMA channels) and conditionally run the service code when a flag is set; a rough sketch of that loop is at the end of this post. Then there is the USB callback for PSOF, which runs outside the main loop (though perhaps that is negligible). I think the key problem is that I have nothing to synchronize my main firmware loop to the USB frame cycle.

    Would DSP/BIOS somehow make this easier to measure? ... or does DSP/BIOS still use flags and/or events to service DMA and USB?

    P.S. I'm going to follow your instructions above and try the cycle count inside my main loop to see how that works. I predict I will get varying cycle counts, especially since the loop repeats frequently while waiting for a global flag to be set.
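
    Roughly, the loop described above looks like this (the names here are placeholders, not my actual symbols):

        /* Placeholder sketch of the flag-polling main loop. The flags are set by the DMA ISRs. */
        volatile int dma0Half, dma0Block, dma1Half, dma1Block;

        static void serviceDma0Half(void)  { /* process first half of DMA0 buffer  */ }
        static void serviceDma0Block(void) { /* process second half of DMA0 buffer */ }
        static void serviceDma1Half(void)  { /* process first half of DMA1 buffer  */ }
        static void serviceDma1Block(void) { /* process second half of DMA1 buffer */ }

        void mainLoop(void)
        {
            for (;;) {
                /* The loop spins here ("idle") until an ISR sets one of the flags. */
                if (dma0Half)  { dma0Half  = 0; serviceDma0Half();  }
                if (dma0Block) { dma0Block = 0; serviceDma0Block(); }
                if (dma1Half)  { dma1Half  = 0; serviceDma1Half();  }
                if (dma1Block) { dma1Block = 0; serviceDma1Block(); }
                /* The USB PSOF callback runs from its own interrupt, outside this loop. */
            }
        }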

  • Brian,

    Yes, I see your frustration. I checked some of the built-in DSP/BIOS features in CCS 3.3, but they do not seem to work for this. Let me ask around.

    Regards,

    Peter Chung


  • Hi Peter,

    I'm back to the question of calculating my total CPU % usage with multiple tasks running in parallel. Did you ever ask around on this topic?

    Brian

  • Hi,


    The CCS tool does not provide good estimates for this, so we put together an example that uses a hardware timer to report cycle counts for the CPU's active and idle time.

    //================================
    #include <stdio.h>      /* printf */
    #include "csl_gpt.h"
    #include "csl_intc.h"   /* IRQ_globalDisable / IRQ_globalEnable */

    /* Global variables */
    CSL_GptObj    gptObj;
    CSL_Handle    hGpt;
    CSL_Status    status;
    CSL_Config    hwConfig;

    CSL_GptObj    gptObj1;
    CSL_Handle    hGpt1;
    CSL_Config    hwConfig1;

    Uint32 timeCnt1, timeCnt2;   /* GPT_0 counts around the active code */
    Uint32 timeCnt3 = 0;         /* GPT_1 counts around the idle code   */
    Uint32 timeCnt4 = 0;

    /**
     *  \brief  Timer-based profiling helpers
     *
     *  run_GPT() brackets the active (working) code with GPT_0 and
     *  run_GPTidle() brackets the idle (waiting) code with GPT_1.
     *  Call each with flag = 1 to start a measurement and flag = 0 to end it.
     */
    void run_GPT(unsigned int flag);
    void run_GPTidle(unsigned int flag);

    void main(void) {

        ///////////////////////////////////////////////////////
        // setup timer 0
        ///////////////////////////////////////////////////////
        status = 0;

        hGpt = GPT_open(GPT_0, &gptObj, &status);
        GPT_reset(hGpt);

        /* Configure GPT module */
        hwConfig.autoLoad    = GPT_AUTO_ENABLE;
        hwConfig.ctrlTim     = GPT_TIMER_ENABLE;
        hwConfig.preScaleDiv = GPT_PRE_SC_DIV_0;   /* divide by 2 */
        hwConfig.prdLow      = 0xFFFF;
        hwConfig.prdHigh     = 0xFFFF;

        GPT_config(hGpt, &hwConfig);

        ///////////////////////////////////////////////////////
        // setup timer 1
        ///////////////////////////////////////////////////////
        status = 0;

        hGpt1 = GPT_open(GPT_1, &gptObj1, &status);
        GPT_reset(hGpt1);

        /* Configure GPT module */
        hwConfig1.autoLoad    = GPT_AUTO_ENABLE;
        hwConfig1.ctrlTim     = GPT_TIMER_ENABLE;
        hwConfig1.preScaleDiv = GPT_PRE_SC_DIV_0;  /* divide by 2 */
        hwConfig1.prdLow      = 0xFFFF;
        hwConfig1.prdHigh     = 0xFFFF;

        GPT_config(hGpt1, &hwConfig1);

        /* Measure the active code... */
        run_GPT(1);                  /* enable timer                            */
        status = your_function1();   /* <-- placeholder for your active code    */
        run_GPT(0);                  /* disable timer and print the results     */

        /* ...and the idle code. */
        run_GPTidle(1);
        your_function2();            /* <-- placeholder for your idle/wait code */
        run_GPTidle(0);
    }

    void run_GPT(unsigned int flag) {

        if (flag) {
            GPT_start(hGpt);
            IRQ_globalDisable();
            GPT_getCnt(hGpt, &timeCnt1);   /* timer counts down from the period */
            IRQ_globalEnable();
        } else {
            IRQ_globalDisable();
            GPT_getCnt(hGpt, &timeCnt2);
            IRQ_globalEnable();
            GPT_stop(hGpt);
            /* x2 because of the divide-by-2 prescaler; the idle figure is the
               most recent run_GPTidle() measurement (0 until it has run once). */
            printf("exec cks = %ld   idle cks = %ld\n",
                   2*(timeCnt1-timeCnt2), 2*(timeCnt3-timeCnt4));
        }
    }

    void run_GPTidle(unsigned int flag) {

        if (flag) {
            GPT_start(hGpt1);
            IRQ_globalDisable();
            GPT_getCnt(hGpt1, &timeCnt3);
            IRQ_globalEnable();
        } else {
            IRQ_globalDisable();
            GPT_getCnt(hGpt1, &timeCnt4);
            IRQ_globalEnable();
            GPT_stop(hGpt1);
        }
    }
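
    To turn the two printed counts into the percentage asked about at the top of the thread, divide the active count by the total. Something like this (using the same globals as the example above):

        /* Convert the measured counts into a rough CPU load percentage. */
        float cpu_load_percent(void)
        {
            Uint32 execCks = 2 * (timeCnt1 - timeCnt2);   /* active cycles         */
            Uint32 idleCks = 2 * (timeCnt3 - timeCnt4);   /* idle (waiting) cycles */

            if ((execCks + idleCks) == 0) {
                return 0.0f;                              /* nothing measured yet  */
            }
            return (100.0f * execCks) / (execCks + idleCks);
        }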

  • Thanks for the reply, Hyun. It looked very promising at first, and I had never noticed the GPT module before. However, a quick look in the CSL documentation shows that the GPT module is available on the C5501/C5502 DSPs, whereas I am working with the C5506, which does not have the GPT peripheral.