Best Measure of Execution Time



I am running CCS 5.3 with code generation tools v7.4.1 on a TMDSEVMC6678LE Evaluation Board. I would like to know the best function for measuring run time (sec, msec, usec; the unit doesn't matter). I have used

omp_get_wtime();

from the omp.h library, but I cannot interpret the value it gives when doing something like:

start = omp_get_wtime();
... code...
end = omp_get_wtime();
std::cout<<"Time: " << (end - start) << "\n";

I can't make heads or tails of what the output of (end - start) is. My best guess at this point is clock cycles. I know that under MSVC++ the same omp time call gives seconds, but it is definitely not giving seconds on the C6678 processor. Any suggestions? What is the omp call returning, and what other (better) options are there for measuring the execution time of a code segment?

Thank you,

Aaron

  • Hi Aaron,

    On an unrelated note to your question, where did you download CGT 7.4.3 from? Are you using CCS 5.3.0.00090? On windows or linux?

    Is your OpenMP program running on the TMDSEVMC6678LE board with CGT 7.4.3?

    I'm trying to get OpenMP working on the same board, and CCS 5.3.0.00090 + CGT 7.4.2 on linux is giving me an executable which produces runtime errors: http://e2e.ti.com/support/development_tools/compiler/f/343/t/256975.aspx

    Thanks for your help.

    Cheers..

    Gaurav

  • Gaurav,

    Sorry, I made a typo. I am using CGT 7.4.1, not 7.4.3 (which I am guessing doesn't exist; I edited it above). Yes, I am using CCS 5.3.0.00090 on Windows 7.

    According to this thread, http://e2e.ti.com/support/development_tools/compiler/f/343/p/232804/822182.aspx#822182, OpenMP won't work properly under 7.4.1 or 7.4.2.

    I am not really running an OpenMP program; I was just using the time call from the omp libs. I am actually having a lot of trouble getting OpenMP integrated into my existing programs and have put it on the back burner for now. My question is only about the best measure of execution time on the C6678 DSP.

    I have looked into this option http://www.cplusplus.com/reference/ctime/clock/  but when I look at the time.h file it has 

    #define CLOCKS_PER_SEC 200000000 /* 200 MHz */

    To me this seems wrong for the C6678, since this processor runs at 1.25 GHz. Also, the specs say 1.0-1.25 GHz, so which one is it?

  • Aaron Hill said:
    I am not really running an openMP program, I was just using the time call from the omp libs.

    In light of that comment, here are some ideas to consider.  The compiler RTS library supplies time() and clock() functions.  This wiki article has more details.

    Take a look at the code in this wiki tutorial which measures cycle count.

    Thanks and regards,

    -George

  • Here is what I have tried

    double get_wtime() { return (double)clock(); }

    ...

    double start, end;
    clock_t s, e;
    unsigned int t0, t1;

    TSCL = 0;

    start = get_wtime();
    s = clock();
    t0 = TSCL;

    ...(CODE CODE CODE)...

    end = get_wtime();
    e = clock();
    t1 = TSCL;
    std::cout<<"Time1: " << (end - start) << "\n";
    std::cout<<"Time2: " << (double)(e - s) << "\n";
    std::cout<<"Time3: " << (t1 - t0) << "\n";

    [OUTPUT]

    Time1: 0
    Time2: 0
    Time3: 519433

     

    Why is clock() giving me 0 as the timing result? As of now, the cycle count method seems to be the only thing working. How do I turn a cycle count into time? Do I divide by the clock frequency, so that milliseconds = (ClockCycles) / (ClockRate in MHz) / 1000? Additionally, what is the clock rate of the C6678: 1.0 or 1.25 GHz?

    Or should I take ClockCycles / CLOCKS_PER_SEC as defined in time.h? For the C6678 CGT, I have CLOCKS_PER_SEC defined as 200,000,000.

    ---------EDIT-------------

    As I test the cycle count method, I think it is insufficient for my program. The TSCL register is read as an unsigned int, whose maximum value is 4294967295. On a 1 GHz DSP this means the counter wraps around to 0 after about 4.29 seconds. I have noticed that my timing data is sometimes negative and never accurate.
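
The wrap-around arithmetic described above can be sketched in portable C++. This is only an illustration: the helper names `elapsed_cycles_32` and `wrap_period_seconds` are hypothetical and not part of any TI header, and the counter samples are passed in as plain integers rather than read from TSCL. It shows that unsigned subtraction stays correct across a single wrap, so a difference only looks negative once it is stored in or printed as a signed type, and that the result becomes ambiguous when the timed region is longer than one full wrap period.

```cpp
#include <cstdint>

// Elapsed cycles between two 32-bit counter samples (t0 is taken first).
// Unsigned arithmetic is modulo 2^32, so a single wrap between the two
// samples still yields the right difference.
inline std::uint32_t elapsed_cycles_32(std::uint32_t t0, std::uint32_t t1) {
    return t1 - t0;
}

// Seconds until a free-running 32-bit cycle counter wraps at clock_hz.
// Regions longer than this cannot be timed with the low word alone.
inline double wrap_period_seconds(double clock_hz) {
    return 4294967296.0 / clock_hz;  // 2^32 cycles
}
```

For example, `elapsed_cycles_32(4294967000u, 200u)` gives 496 even though the second sample is numerically smaller, and `wrap_period_seconds(1.0e9)` is roughly 4.29 seconds, matching the estimate above.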

  • I'm not an expert on the subject, but I think you also need to use TSCH.

    This thread may be relevant: http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/439/t/66172.aspx

  • Archaeologist said:
    I'm not an expert on the subject, but I think you also need to use TSCH.

    I read through the thread you gave but I do not understand how the poster is using TSCH. I don't even know what TSCH is, but it looks like from the code he posted that he is only reading TSCH, and I do not know what good this would do.

    As I said before, I am getting negative values when reading TSCL after long periods of code execution. Because this is a 32-bit register, I assume the negative values appear because the counter wraps, so the second read returns a lower value than the first. I tried setting TSCL to 0 every time I wanted to time the code, but that didn't help either; I still got negative values, which is odd, and I have no idea why. So I must not understand something about the cycle count registers and am looking for advice. Right now I am using time(), but its one-second resolution is not sufficient in some areas of my program.

  • TSCH and TSCL are the two 32-bit parts of a 64-bit counter
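
As a rough sketch of what "two 32-bit parts of a 64-bit counter" means in code, the two halves can be joined with a shift and an OR. The function name `combine_tsc` is made up for illustration, and the halves are passed in as ordinary integers so the snippet does not depend on c6x.h.

```cpp
#include <cstdint>

// Join the high word (as read from TSCH) and the low word (as read from
// TSCL) into one 64-bit cycle count.
inline std::uint64_t combine_tsc(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}
```

On the target, the two arguments would come from the TSCH and TSCL control registers. My understanding, which is worth checking against the CPU reference guide, is that TSCL should be read first because that read latches the matching upper half into TSCH.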

  • Ahhh HA... That bit of info led me to this site: http://guy-eschemann.de/2011/03/how-to-profile-a-section-of-code-on-ti-c64x-dsps/

    I will try to implement this today and report back my results. If you can, I would like to know how to convert this cycle count into real time (usec, msec, sec...). I have read that CLOCKS_PER_SEC is what I should divide the cycle count by, because time.h defines it as the frequency (Hz) at which the cycle count register is updated. Is this correct? Or should I use the raw processor frequency of the DSP, which for the C6678 is 1.0 or 1.25 GHz? For CGT 7.4.1, time.h defines CLOCKS_PER_SEC as 200 MHz.

    Thank you,

    Aaron

  • TSCH/TSCL are unrelated to CLOCKS_PER_SEC.  I do not know the unit of time used in TSCH/TSCL. 

    CLOCKS_PER_SEC is intended to be used only with the library function "clock."  The intent is that CLOCKS_PER_SEC gives you the estimated number of CPU clock ticks per second, but there are a wide variety of C6000 devices with varying frequencies, and counting CPU clocks doesn't take into account memory stalls, so you are not likely to get an accurate value from clock.  We provide clock (only when using CCS) just to provide a monotonically-increasing estimate of CPU time for profiling purposes.  The hard-coded value of CLOCKS_PER_SEC in time.h is an arbitrary value; don't take it as authoritative.
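
For reference, the standard pairing of clock() with CLOCKS_PER_SEC looks like the sketch below. The helper name `clock_ticks_to_seconds` is hypothetical. It only applies to values returned by clock(), not to TSCH/TSCL readings, and on this target the result is only as trustworthy as the hard-coded CLOCKS_PER_SEC described above.

```cpp
#include <ctime>

// Convert the difference of two clock() samples into seconds using the
// library's own CLOCKS_PER_SEC constant.
inline double clock_ticks_to_seconds(std::clock_t start, std::clock_t end) {
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}
```

Typical use is `std::clock_t s = std::clock();` before the region, `std::clock_t e = std::clock();` after it, then `clock_ticks_to_seconds(s, e)`.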

  • According to TMS320C66x DSP CPU and Instruction Set Reference Guide, Section 2.9.13:

    2.9.13 Time Stamp Counter Registers (TSCL and TSCH)

    The CPU contains a free running 64-bit counter that advances each CPU clock under normal operation. The counter is accessed as two 32-bit read-only control registers,
    TSCL (Figure 2-24) and TSCH (Figure 2-25).

    To get the time in seconds, you just have to divide the counter value by the device clock speed. I think this is useful if you want to know the total time spent in an application or in part of it.
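
The division described above might be sketched as follows. The helper names are made up for illustration, and the clock rate is passed in as a parameter because it depends on how the board's PLL is configured (1.0 GHz and 1.25 GHz have both been mentioned in this thread).

```cpp
#include <cstdint>

// Convert a CPU cycle count into seconds for a core running at clock_hz.
inline double cycles_to_seconds(std::uint64_t cycles, double clock_hz) {
    return static_cast<double>(cycles) / clock_hz;
}

// Same conversion, expressed in microseconds.
inline double cycles_to_microseconds(std::uint64_t cycles, double clock_hz) {
    return cycles_to_seconds(cycles, clock_hz) * 1.0e6;
}
```

Taking the 519433 cycles measured earlier in the thread, this gives about 519 usec if the core runs at 1.0 GHz and about 416 usec at 1.25 GHz.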

  • Thank you all for your responses and input. After digging into this more, based on the info provided here, I can confirm that my question has been answered. Here is how I tested this:

    #include <stdint.h> // uint64_t
    #include <c6x.h> // _itoll, TSCH, TSCL
    #include <sys/types.h>
    #include <time.h> // time, difftime
    #include <iostream> // std::cout

    #define CYCLES_PER_SEC 1000000000 //1e9 or 1.0 GHz, spec of the c6678

    ...

    time_t s, e;
    uint64_t start_time, end_time;

    s = time(NULL);
    start_time = _itoll(TSCH, TSCL);

    ... CODE CODE CODE ...

    end_time = _itoll(TSCH, TSCL);
    e = time(NULL);
    std::cout<<"Wall Time: " << difftime(e,s) << "\n";
    std::cout<<"CPU Time: " << (double)(end_time - start_time) / CYCLES_PER_SEC << "\n";

    This code snippet compares the time metric from time.h and the metric from cycle counts. In my tests the two timers came out to be nearly the same with the time() function being less accurate as expected. Also I noticed in a small number of examples the wall time was much larger than the cycle count time, though this occurred very rarely. I suspect this has to do with interrupts occurring that cycle count does not account for because it is not part of the main execution thread.

  • Aaron,

    just use

    #include "c6x.h" // needed
    ...

    unsigned int start, stop, cycle_count;

    TSCL = 0; // need to write to it to start counting

    start = TSCL;

    ///... critical code

    stop = TSCL;
    cycle_count = stop - start; // total number of CPU cycles

    as described here :

    http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/t/37848.aspx

    On the C6678 at 1.25 GHz, 1 cycle = 0.8 ns,
    so time_us = cycle_count * 0.0008;

    Clement