EVMK2GX: Inconsistencies between DSPLIB 3.4.0.0 C66x cycle count formulas and actual clock cycles measured with the timer

Part Number: EVMK2GX

Hello,

I am experimenting with the C66x DSP on the EVMK2GX board and would like to measure function execution times for my application. However, when I measure the cycles using the driver examples provided with the library, the measured counts are consistently about 25% higher than the theoretical counts I calculate using the formulas in Texas Instruments' "Test Results DSPLIB 3.4.0.0 C66x" (the document that comes with the library installation). For example:

Function                        Cycles formula   EVMK2GX (measured)
DSPF_sp_mat_submat_copy_cplx    5270             6692
DSPF_sp_mat_trans_cplx          3780             4631
DSPF_sp_mat_mul_gemm_cplx       55700            68989
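
To put a number on the gap for each function, the overhead can be worked out directly from the figures above. This is only a small sketch in plain C; the helper name overhead_percent is made up for illustration and is not part of DSPLIB.

```c
/* Hypothetical helper (not a DSPLIB function): returns how much larger the
 * measured cycle count is than the formula's prediction, in percent. */
double overhead_percent(double measured_cycles, double formula_cycles)
{
    return (measured_cycles / formula_cycles - 1.0) * 100.0;
}
```

Calling overhead_percent(6692, 5270), overhead_percent(4631, 3780) and overhead_percent(68989, 55700) gives roughly 27.0%, 22.5% and 23.9%, so the excess is similar across the three functions but not identical.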

I modified an example to do the measurements:

void main(void) {

    clock_t t_overhead, t_start, t_stop, t_opt;

    /* ------------------------------------------------------------------- */
    /* Compute the overhead of calling clock twice to get timing info      */
    /* ------------------------------------------------------------------- */
    /* Initialize timer for clock */
    TSCL = 0, TSCH = 0;
    t_start = _itoll(TSCH, TSCL);
    t_stop = _itoll(TSCH, TSCL);
    t_overhead = t_stop - t_start;

    /* ------------------------------------------------------------------- */
    /* Generate random inputs in range (-10, 10).                          */
    /* ------------------------------------------------------------------- */
    UTIL_fillRandSP(ptr_x1, 2 * NR1 * NC1, 10.0);

    DSPF_sp_mat_submat_copy_cplx(ptr_x1, NR1, NC1, 0, NR1, ptr_x2, 1);

    /* ------------------------------------------------------------------- */
    /* Measure the cycle count                                             */
    /* ------------------------------------------------------------------- */
    t_start = _itoll(TSCH, TSCL);
    t_start = _itoll(TSCH, TSCL); /* to remove L1P miss overhead */
    DSPF_sp_mat_submat_copy_cplx(ptr_x1, NR1, NC1, 0, NR1, ptr_x2, 1);
    t_stop = _itoll(TSCH, TSCL);
    t_opt = t_stop - t_start - t_overhead;

    /* Cast so the argument type matches the format specifier */
    printf("\tNR = %.2d\tNC1 = %.2d\tNC2 = %.2d\toptC: %ld\n", NR1, NC1, NC2, (long) t_opt);

}

All data is stored in L2 memory (see attached linker file). The document states that there might be differences due to cache misses and similar effects. However, in my case the discrepancy is not the same for every function, and the measured cycle count is consistently about 1.25 times the count calculated using the cycle formula.

What might be the issue?