Hi all.
I want to measure the computation time of an algorithm running on the C6747 (on the OMAP-L137 EVM). The algorithm is placed in a loop of 10000 iterations, and before and after each execution I call CLK_gethtime(). In the end I would like to have the minimum, maximum and average computation time in milliseconds (with a precision of, say, 0.1 ms), so I divide all measurements by CLK_countspms(). The whole thing is compiled in release mode with -o3 and -mf5, but with full symbolic debug.
What I get with this method is something like:
t_min = 2.20 ms
t_mean = 2.21 ms
t_max = 2.8 ms
But if I multiply the mean by the number of iterations, I get a total measurement time of about 22 seconds, while my stopwatch shows only something like 2.2 seconds.
So where is the problem? Should I simply divide my measurements by 10? My first thought was that the compiler optimizes the loop away, but that is not it: I get similar values if I only "iterate" once. I do think optimization is happening (which would explain why t_min and t_mean are so close), but it does not explain why the measurements are off by a factor of 10...
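In case it helps with the diagnosis, here is a minimal sketch of what I plan to print once at startup to sanity-check the conversion constants (assuming the DSP/BIOS 5 CLK/GBL API; the function name printClockConstants is just mine for illustration):

#include <std.h>
#include <clk.h>
#include <gbl.h>
#include <log.h>

extern LOG_Obj trace;   /* same LOG object used for the measurements below */

/* Dump the clock constants so I can check how CLK_gethtime() ticks
 * are supposed to relate to milliseconds. */
static Void printClockConstants(Void)
{
    /* High-resolution timer counts per millisecond -- what I divide by. */
    LOG_printf(&trace, "CLK_countspms()    = %d", (Int)CLK_countspms());
    /* Timer period register value driving the low-resolution tick. */
    LOG_printf(&trace, "CLK_getprd()       = %d", (Int)CLK_getprd());
    /* CPU frequency in kHz, in case CLK_gethtime() counts CPU cycles instead. */
    LOG_printf(&trace, "GBL_getFrequency() = %d kHz", (Int)GBL_getFrequency());
}

My idea is that comparing these numbers would tell me whether the scale factor I divide by actually matches what CLK_gethtime() is counting.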
Can anybody tell me where I'm wrong? Or some other hints? As a reference, here is the code leading to the above measurements:
const Uint16 TEST_ITER = 10000u;
float timeMean = 0.f;
float timeMin = 1e16f;
float timeMax = 0.f;
float dtime;
Uint32 timeStart;
Uint16 idxTest;

for (idxTest = 0u; idxTest < TEST_ITER; idxTest++) {
    timeStart = CLK_gethtime();
    validImg = imregions(binFilterImg, procImg, regionImg, regions, &nRegions);
    /* Convert high-resolution ticks to milliseconds. */
    dtime = (float)(CLK_gethtime() - timeStart) / (float)CLK_countspms();
    if (dtime > 0.f) {
        timeMean += dtime / (float)TEST_ITER;
        if (dtime < timeMin) timeMin = dtime;
        if (dtime > timeMax) timeMax = dtime;
    }
}

LOG_printf(&trace, "t_min  = %f", timeMin);
LOG_printf(&trace, "t_max  = %f", timeMax);
LOG_printf(&trace, "t_mean = %f", timeMean);
Thank you very much for your help.
Andreas