How to calculate MIPS for a given function using CCS 3.3

Other Parts Discussed in Thread: TMS320C6727, SPRC203

Hi,

I am using a TMS320C6727 DSP without DSP/BIOS. I need to calculate MIPS for a particular function. The data I have in hand are the benchmark, i.e. the clock cycles consumed by the function (4578 cycles), the board MIPS, which is the native MIPS (2400 MIPS), and the board clock rate (350 MHz). How do I estimate the MIPS for the function?

I have gone through a lot of standard formulas, but none of them gives me the correct answer. Also, optimized code will give higher MIPS than non-optimized code, right?

Please help. I have gone through a lot of forums, but all in vain.

Thanks

Anushree

  • Greetings,

    Please get the TI publication SPRABF2, Introduction to TMS320C6000 DSP Optimization, and use the example code on page 29:

    // Required header files. All of them come with the code generation tools package.
    #include <stdio.h>  // declares printf
    #include <stdint.h> // defines uint64_t
    #include <c6x.h>    // defines _itoll, TSCH, TSCL

    // In the variable declaration portion of the code:
    uint64_t start_time, end_time, overhead, cyclecount;

    // In the initialization portion of the code:
    TSCL = 0; // enable TSC (any write to TSCL starts the counter)
    start_time = _itoll(TSCH, TSCL);
    end_time = _itoll(TSCH, TSCL);
    overhead = end_time - start_time; // overhead of the measurement itself

    // Code to be profiled:
    start_time = _itoll(TSCH, TSCL);
    function_or_code_here();
    end_time = _itoll(TSCH, TSCL);
    cyclecount = end_time - start_time - overhead;
    printf("The code section took: %llu CPU cycles\n", (unsigned long long)cyclecount);

    Good Luck,

  • Hi,

    Thanks for your help, but I have already tried this and it failed, as I am using a TMS320C6727 DSP and not a 6700. The c6x.h header file does not have declarations for the 6727 DSP, so TSCL remains unrecognised in my source code.

    Regards,

    Anushree

  • Anushree,

    Your question has to do with the several definitions of the term "MIPS". We confusingly use it to mean different things, including

    • Million Instructions Per Second (the exact meaning of the acronym)
    • Maximum performance for the device (2800 MIPS @ 350 MHz, when all 8 processing elements are fully loaded)
    • Overall processing efficiency (I only get 200 MIPS on the C6727 because of memory stalls and device settings)
    • Measure of device utilization by a particular function (how many MIPS does an FFT consume?)
    • Measure of device capability for real-time operation (in 1 us I have 350 MIPS available to do an operation before the next sample)

    If you have the number of clock cycles for a function, and you know the timeframe in which that function will operate, and you know the clock frequency of the DSP, then you know everything to make your calculation. Now you just need to write out your definition of MIPS, in other words state what it is you want to know.

    By the way, the "native MIPS" number in your first post should be 2800, from the datasheet when operating at 350 MHz. But this is probably not a value that applies to what you are looking for.

    Regards,
    RandyP
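
A minimal sketch of two of the definitions listed above, using only the figures from this thread (350 MHz, 8 processing elements). The function names are hypothetical and chosen for illustration only.

```c
#include <stdint.h>

/* Peak ("native") MIPS: clock rate in MHz times the number of
   processing elements that can each issue one instruction per cycle. */
static uint32_t peak_mips(uint32_t clock_mhz, uint32_t processing_elements)
{
    return clock_mhz * processing_elements;
}

/* CPU cycles available inside a real-time window of window_us microseconds.
   A clock of N MHz delivers N cycles per microsecond. */
static uint64_t cycles_in_window(uint32_t clock_mhz, uint32_t window_us)
{
    return (uint64_t)clock_mhz * window_us;
}
```

With these, peak_mips(350, 8) evaluates to 2800 and cycles_in_window(350, 1) to 350, matching the datasheet figure and the 1 us example in the list.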

  • Greetings,

    In your project, you need to set both the -mv67p target option and the CHIP_6727 define so that all dependencies for this specific device are resolved. The -mv67p option does the first part and -d"CHIP_6727" does the rest.

    Good Luck,

  • Once you resolve your project setup, the code I pointed you to will give you an accurate CPU clock count between entering and returning from your function. That number will include all your instruction cache latencies, your data I/O, your SP/DP/FP overhead, etc.

  • Also, just in case, you will need to include the following in your linker control

    -l$(Install_dir)\sprc203\SystemPatch\applySystemPatch.obj

    -l$(Install_dir)\sprc203\SystemPatch\c672xSystemPatchV2_00_00.lib

    if you have not done so already.

    I believe this patch is needed for all C672x devices.  It is under sprc203 on TI's web site.

  • I am basically calculating the MIPS to estimate the headroom left for processing on the DSP chip. I need to compare against the board MIPS rate, which is why I have taken the board MIPS into account. OK, so it is a 2800 MIPS rate. The calculation I followed is below. It is basically a performance measure, so for my purpose MIPS need not be calculated using cycles per instruction. Please comment if the method is wrong.

    I have my clock cycle count from benchmarking, i.e. 4578 cycles for a function, and the board clock rate is given as 350 MHz. Thus I calculate the CPU utilization time, which is 4578/350, which gives me 1.30 microsec of CPU time per second. To estimate how much it has consumed from 2800 MIPS, I simply calculate 1.30 * 2800 MIPS, which gives me the instructions the chip executes in that percentage of CPU time.

    Please advise if I am wrong.

    Where should I define CHIP_6727: in the header file or in the source code?

  • Anushree,

    The 2800 "native MIPS" number is not a value that applies to what you are looking for. It is a marketing number for comparison between other multi-element processors and DSPs. For what you want to calculate, ignore this number. It is not relevant here.

    RandyP said:
    If you have the number of clock cycles for a function, and you know the timeframe in which that function will operate, and you know the clock frequency of the DSP, then you know everything to make your calculation. Now you just need to write out your definition of MIPS, in other words state what it is you want to know.

    anushree mahapatra said:
    I have my clock cycle count from benchmarking, i.e. 4578 cycles for a function, and the board clock rate is given as 350 MHz. Thus I calculate the CPU utilization time, which is 4578/350, which gives me 1.30 microsec of CPU time per second.

    You need to be careful with your decimal points.

    In my statement quoted above, the second phrase about "timeframe" is the missing part of your calculation. How often do you need to do your function, the one that takes 4578 cycles to execute? "the headroom left for processing" means that you want to calculate X us of total time - Y us of time for this function. You have calculated Y, so now you need to decide what X is, which is the total time between when you will start the function once and when you will start the function the next time.

    Regards,
    RandyP
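
As a rough sketch of the arithmetic described above: Y is the cycle count divided by the clock rate, and the headroom is X - Y. The 4578 cycles and 350 MHz come from this thread; the function names and any value chosen for the period X are assumptions for illustration.

```c
/* Execution time Y in microseconds.
   A clock of clock_mhz MHz runs clock_mhz cycles per microsecond. */
static double cycles_to_us(unsigned long cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}

/* Headroom X - Y in microseconds, where period_us (X) is the time between
   one start of the function and the next. X must come from the application. */
static double headroom_us(double period_us, unsigned long cycles, double clock_mhz)
{
    return period_us - cycles_to_us(cycles, clock_mhz);
}
```

For the numbers in this thread, cycles_to_us(4578, 350.0) is 13.08 us. If the function had to run once every 1000 us (a made-up period), headroom_us(1000.0, 4578, 350.0) would be 986.92 us.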

  • Please see the attached CCS screenshots for your project setup.

  • Once you succeed in building your project with these additions, you should be able to profile any function accordingly.

    Theoretical MIPS mean nothing when it comes to the real hardware. In your case, you need to know the clock cycles, measured with the following (sample) method:

    // Code to be profiled
    start_time = _itoll(TSCH, TSCL);
    function_or_code_here();  // from entry to exit
    end_time = _itoll(TSCH, TSCL);
    cyclecount = end_time - start_time - overhead;
    printf("The code section took: %llu CPU cycles\n", (unsigned long long)cyclecount);

    Now multiply cyclecount by 1/(your CPU speed) and you will know how much time the CPU spent in there. Your real-time budget has to be calculated from your worst-case real-time event. Say it occurs at a 1 mSec rate: if your function takes 200 uSec, you are left with ~800 uSec for everything else, including the invisible overhead that is not under your control. So if your budget is 1 mSec, use ~800 uSec as your max.

  • Hi Randy,

    Actually, I need to calculate the MIPS for a single execution of a function, since I am executing it once. I do not know the timeframe, i.e. the total time between when I start the function once and when I start it the next time. I just have the clock cycle count for a particular function being executed just "once". I need to estimate the rough MIPS (not including overhead, memory accesses, etc.) for that function. That is all. I need a value relative to the board's native MIPS, which is why I included 2800 MIPS in my calculation. Please verify my formula given the facts I have now stated in this post.

    Sam,

    I will try the profiling method you have mentioned, but I need to get the MIPS for my function first. I am new to this area, so I have only the basic concepts. Could you check whether my method is right, given the post above? Thank you so much for your help.

  • Anushree,

    In the term MIPS, the "PS" means "per second". You could say that your function takes 0.004578 million instructions to execute, but you cannot say "per second" from what you have stated in this thread.

    In my opinion, what you are asking to calculate is like saying that the tires on your car go around 4578 times when you drive from home to work, and now you want to calculate how fast the car was moving. Even if you know that your tires are 6 feet or 2 meters in circumference, calculating MPH or km/h requires a timeframe to go with the distance travelled. The same is true for a MIPS calculation for a DSP algorithm.

    What you know is that your function requires 4578 cycles. It will require 4578 cycles if you run the DSP at 1GHz and it will require 4578 cycles if you run the DSP at 1Hz, assuming all other settings maintain the same proportions.

    If you optimize the code, it will take fewer cycles. If you run from slower memory or turn off your cache, it will take more cycles.

    Your formula cannot be verified given the facts that you have stated in this post. Your formula is incorrect to try to apply the 2800 MIPS figure-of-merit to this calculation.

    I apologize that I have not been able to give you the answer you want. Perhaps someone else will be able to explain this better than I can, or perhaps they will be able to verify your formula as you want it done.

    And I tell you this only to save you some time with benchmarking tests, but the C67x+ DSP core in the C6727 device does not have the TSC feature; it does not have the TSCH/TSCL registers. Your current benchmarking that measured 4578 cycles is likely the best and only way to determine cycle counts without external measurement equipment.

    Regards,
    RandyP

  • Sam,

    Also, referring to your last post, how do I fix my max. rate (1 ms in your post) so that I can calculate the time left for other overheads?

  • Hi Randy,

    Yes, now I understand what you are trying to explain. I should not be multiplying the 0.004578 million instructions by 2800 MIPS to get my MIPS. According to my logic, after multiplying 4578/(350 MHz) by 2800 MIPS I will only get the number of instructions executed. So how do I get a rough estimate of MIPS without calculating the actual MIPS in the literal sense? I just need a measure of the performance relative to the max. MIPS the board can process, i.e. 2800 MIPS, so that I know how many more MIPS I can use beyond the MIPS I am currently executing.

    The link I followed to calculate the estimate is http://johnsantic.com/comp/mips.html

    Please tell me if I am wrong.

  • Hi Sam,

    As I have mentioned before, if you open the c6x.h header file, you can see that the 6727 chip is not defined in the header file at all. Thus, even when I put _DEBUG CHIP_6727 in the build options, it gave an error saying undefined symbol CHIP_6727, as the symbol is not defined in the header file itself. So this process won't work for the 6727 chip.

    Regards,

    Anushree

  • Greetings,

    Likely, then, the C6727 support you are using is not up to date.

    Here is a more primitive method.

    Load your program and go to main.

    From the drop-down under Profile, enable and view the clock.

    Run to where your function is called.

    Double-click the clock counter to clear it.

    Step over your function.

    Your clock counter will give you an accurate count of the clock cycles used.

    Please see attached.

    Good Luck,

     

  • Now, regarding your MIPS question.

    You have a DSP-based system that has to handle certain real-time I/O along with many transforms performed on the data. Assume the data are exchanged via your DMA engine with some 500 uSec of total overhead for input and output, at a certain rate, let us say every 5 mSec. You are then left with 4.5 mSec to perform everything else. When it comes to CPU performance, your real-time budget is less than 4.5 mSec, because you need to account for all non-CPU waste such as compulsory cache misses, memory read/write speed, interrupts, etc.

    MIPS figures are marketing numbers used for competition between vendors. A 2400 MIPS machine may be adequate for an audio-type application but is likely to fail in a radar-type or video-type application.

    It all depends on what you are going to do with your data, in my example during the ~4.5 mSec budget, using the DSP of your choice.

    Good Luck,
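
The budget in the example above can also be expressed in CPU cycles, which makes it directly comparable to a measured cycle count. This is only a sketch: the 5 mSec period and 500 uSec I/O overhead are the example figures from the post, the 350 MHz clock and 4578-cycle function come from earlier in the thread, and the function names are hypothetical. The result is an upper bound, since cache misses, memory stalls and interrupts reduce it further.

```c
#include <stdint.h>

/* Cycles left per period after subtracting the I/O overhead.
   clock_mhz cycles elapse per microsecond. */
static uint64_t budget_cycles(uint32_t period_us, uint32_t io_overhead_us,
                              uint32_t clock_mhz)
{
    if (io_overhead_us >= period_us) {
        return 0; /* no time left at all */
    }
    return (uint64_t)(period_us - io_overhead_us) * clock_mhz;
}

/* How many complete calls of a function fit into a cycle budget. */
static uint64_t max_calls(uint64_t budget, uint64_t cycles_per_call)
{
    return cycles_per_call ? budget / cycles_per_call : 0;
}
```

With the example figures, budget_cycles(5000, 500, 350) gives 1,575,000 cycles, and max_calls(1575000, 4578) gives 344 calls at most.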

     

  • Thank you, Sam, for all your comments.

    Currently I am estimating MIPS using the relation: cycle count for a function * number of function calls in 1 sec.

    Regards,

    Anushree
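
A minimal sketch of that relation. Because it counts cycles rather than instructions, the result is strictly a load in millions of cycles per second, so the natural reference is the 350 MHz clock rather than the 2800 MIPS figure. The call rate is an assumption that must come from the application, and the function names are hypothetical.

```c
/* Load of one function in millions of cycles per second:
   cycles per call times calls per second, scaled by 1e6. */
static double load_mcps(unsigned long cycles_per_call, double calls_per_sec)
{
    return (double)cycles_per_call * calls_per_sec / 1e6;
}

/* Share of the CPU consumed, in percent, for a clock given in MHz. */
static double utilization_percent(double mcps, double clock_mhz)
{
    return 100.0 * mcps / clock_mhz;
}
```

For example, at a made-up rate of 1000 calls per second, the 4578-cycle function costs load_mcps(4578, 1000.0) = 4.578 million cycles per second, which is about 1.3% of a 350 MHz clock.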