Incorrect results from clock() function (c66). Is there a workaround?

I'm trying to use clock() to time loops using a functional simulator for the c66 via loadti.bat on Cygwin, and I'm getting results that don't make sense. When I run the same code on an older simulator, I get the expected results. I am timing timed_loop(), which contains one trivial loop and no conditional code.

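#include <stdio.h>
#include <time.h>

clock_t t_overhead;   /* cost of a back-to-back clock() pair, measured in main() */

/* Time a trivial loop of `size` iterations and return the cycle count. */
clock_t timed_loop(int size)
{
    clock_t t_start, t_stop;
    int i;

    t_start = clock();
    #pragma MUST_ITERATE(1)
    for (i=0; i<size; i++)
    {
        asm(" NOP");
    }
    t_stop = clock();

    return t_stop - t_start - t_overhead;
}

int main(void)
{
    int j;
    clock_t t_start, t_stop;

    /* Measure the overhead of calling clock() itself. */
    t_start = clock();
    t_stop  = clock();
    t_overhead = t_stop - t_start;

    /* Time the same loop ten times; every run should report the same count. */
    for (j=0; j < 10; j++)
    {
        clock_t runtime = timed_loop(1000);
        printf("%d: #iter=%d: cyc: %d cyc (%.1f cyc/iter)\n",
           j, 1000, (int) runtime, ((float) runtime / (float) 1000));
    }

    return 0;
}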

The test case is compiled using cl6x 7.4.2 with no special flags (i.e., "cl6x bug.c -z lnk.cmd rts6200.lib"), so it can be simulated on any target.

Using loadti.bat with a ccxml file for the c674 (i.e., on Cygwin, executing "loadti.bat -c c674.ccxml a.out"), I get the expected results:

0: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
1: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
2: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
3: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
4: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
5: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
6: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
7: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
8: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
9: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)

 

With the same executable and a c66 little-endian ccxml file (e.g., "loadti.bat -c c66_sim_le.ccxml a.out"), I get incorrect results:

 

0: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
1: #iter=1000: cyc: 13015 cyc (13.0 cyc/iter)
2: #iter=1000: cyc: 2329 cyc (2.3 cyc/iter)
3: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
4: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
5: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
6: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
7: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
8: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)
9: #iter=1000: cyc: 1367 cyc (1.4 cyc/iter)

The attached 5287.ccxml.zip contains the ccxml files.

I am observing this behavior with various versions of the loop I am timing. The only loops for which I can repeatedly get correct timing using c66 simulation are software-pipelined loops. They yield the expected timing results, even when I time them multiple times (as I did with the loop above).

Is this a known bug? Is there a workaround?

 

  • This appears to be a simulator problem, so I'll move this thread to the CCS forum, where such issues are typically handled.  Before I do that, I'll pass on a possible workaround.

    Rather than calling clock, read the TSCL register in the same manner used by the example in this wiki article.  Because this is a much simpler mechanism than calling clock, it is more likely to work.

    Thanks and regards,

    -George
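For readers who cannot reach the wiki article, here is a minimal sketch of the TSCL approach, not necessarily the wiki's exact example. It assumes TI's `c6x.h` header (which declares the `TSCL` control register) and the compiler-defined `_TMS320C6X` macro. The names `tsc_init`, `timed_loop_tsc`, and the `TSC_*` / `LOOP_BODY` macros are hypothetical, and the `#else` branch is only a host-side stand-in so the sketch also builds off-target.

```c
#ifdef _TMS320C6X
#include <c6x.h>                        /* TI header declaring the TSCL control register */
#define TSC_START()  (TSCL = 0)         /* any write to TSCL starts the free-running counter */
#define TSC_READ()   (TSCL)             /* low 32 bits of the cycle count */
#define LOOP_BODY()  asm(" NOP")
#else
/* Hypothetical host-side stand-in so the sketch also builds off-target:
   the fake counter advances by one on every read. */
static unsigned int fake_tsc;
#define TSC_START()  (fake_tsc = 0)
#define TSC_READ()   (++fake_tsc)
#define LOOP_BODY()  ((void) 0)
#endif

static unsigned int tsc_overhead;       /* cost of two back-to-back counter reads */

/* Start the counter and measure the read overhead once, up front. */
void tsc_init(void)
{
    unsigned int t0, t1;

    TSC_START();
    t0 = TSC_READ();
    t1 = TSC_READ();
    tsc_overhead = t1 - t0;
}

/* Same loop as timed_loop() in the question, timed with the counter
   instead of clock(). Unsigned subtraction tolerates one 32-bit wrap. */
unsigned int timed_loop_tsc(int size)
{
    unsigned int t_start, t_stop;
    int i;

    t_start = TSC_READ();
    for (i = 0; i < size; i++)
    {
        LOOP_BODY();
    }
    t_stop = TSC_READ();

    return t_stop - t_start - tsc_overhead;
}
```

To use it in the test case above, call `tsc_init()` once at the top of `main()` and replace `timed_loop(1000)` with `timed_loop_tsc(1000)`, printing the result with `%u`. Reading the register directly avoids the run-time library's clock() path, which is why it may sidestep the simulator issue.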

  • Hi,

    I wanted to suggest an experiment. Can you edit the file <ccsinstallation folder>\...\ccs_base\simulation_keystone1\bin\configurations\tisim_c6670_pv.cfg and comment out the block below using C++-style comments, as shown? Do this for every such block (MODULE HPS2; ... END HPS2;) in the config file.

            //            MODULE HPS2;
            //                MAX_BLOCK_SIZE_HINT 65536; // 64KB, in bytes
            //                BLOCK_HIT_THRESHOLD 2000;
            //                SPLOOP OFF;
            //                CODE_COVERAGE ON;
            //            END HPS2;

    With the above change, I got the output below for a debug build.

    [TMS320C66x_0] 0: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 1: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 2: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 3: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 4: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 5: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 6: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 7: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 8: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)
    [TMS320C66x_0] 9: #iter=1000: cyc: 16015 cyc (16.0 cyc/iter)

    regards,

    Sheshadri

  • Hi,

    One more note, in case you are wondering what this change affects: functionality stays the same, but the simulator's run speed drops by about 50%.

    regards,

    Shesha