
Profiling irregularities in DM64x+

Hi,

I am facing an odd problem with irregular profiling results for my code. I use a hardware timer for profiling. When I profile my code I get a certain time, say "T" ms. The next time I profile it, it may take "T+30" ms, or it may again take "T" ms. This 30 ms addition is completely random and occurs irregularly, so I am not able to establish a benchmark for optimization. I run my code in release mode with program-level and O3 optimization. I am profiling using the BIOS function CLK_getltime().

Please let me know the compiler settings necessary and how I should approach this problem.

Thank you

 

  • Which device are you running on? Which version of BIOS are you using? Is it possible that the code you are benchmarking is sometimes preempted by a higher-priority task or ISR? A 30 ms delta is a big difference. How big is "T"?

    Thanks,
    -Karl-

  • Hi Karl,

    Thanks for the reply, please find my response below

    Which device are you running on?  

         TIDM6437 evaluation board

    Which version of BIOS are you using?  

    5.33.06, build tools -  6.1.12

    Is it possible that the code you are benchmarking is sometimes preempted by a higher priority task or ISR?  

    I have three tasks in the system, out of which this one is the highest priority. The remaining two are ethernet tasks meant for sending and receiving data on demand. So when this high priority task is executing the other two tasks are blocked.

    30ms delta is a big difference.    How big is "T".

    Yes, it's a big difference. "T" can be anything, but typically for one image frame "T" is around 15 ms, and it varies from frame to frame. However, this 30 ms peak appears randomly, sometimes for the same frames that were profiled earlier without it. I have also checked for timer overflow to rule out any wraparound.

    However, I have observed one thing: when the -mi compiler option is set to a value of more than 60ms, this 30 ms peak occurs less often. Could this interrupt threshold be the reason? Or can you suggest an alternative way of measuring the time (right now we are using CLK_getltime())? We can profile via the CCS GUI in debug mode, but that is not practical for us because we need to automate our system and generate results for more than 10000 frames per day.

    Regards,

    Mihir


     

  • It seems you may have already tried this, but if not, can you try compiling your code with -mi10 or some other small number for -mi and see if the latency improves? The compiler may be pipelining some of your loops with interrupts disabled. The compiler does this for maximum optimization of the loop. If you have long loops, then it is possible that interrupts will be disabled for a long time. Using a small number such as -mi10 will make the compiled code slower but improve the latency.

    You should also consider using CLK_gethtime (with an 'h'). This returns CPU instruction cycles. The 64x+ has two registers, TSCH and TSCL, which count at the instruction frequency. Reading TSCL is very fast and it will return the number of instruction cycles. You'll need to divide this by your CPU frequency to get real time. On a 1 GHz device, I think this will wrap about every 4 seconds. If you use unsigned integer math, the subtraction t1-t0 will work even when there's a wrap.

    Thanks,
    -Karl-
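
The wraparound-safe subtraction and the cycles-to-time conversion described above can be sketched in a few lines of C. This is only an illustration, not BIOS code: the helper names `elapsed_cycles` and `cycles_to_ms` are made up for this example, and the counter is assumed to be 32 bits wide, as TSCL is.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles elapsed between two 32-bit counter readings. Unsigned subtraction
 * is performed modulo 2^32, so the result stays correct across a single
 * wraparound of the counter between t0 and t1. */
static uint32_t elapsed_cycles(uint32_t t0, uint32_t t1)
{
    return (uint32_t)(t1 - t0);
}

/* Convert a cycle count to milliseconds: time = cycles / frequency.
 * cpu_hz is the CPU clock in Hz (an input the caller must supply). */
static double cycles_to_ms(uint32_t cycles, double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return (double)cycles * 1000.0 / cpu_hz;
}
```

For example, readings of 0xFFFFFF00 and 0x00000100 give an elapsed count of 0x200 cycles even though the counter wrapped in between. The subtraction cannot detect more than one wrap, so the measured interval has to be shorter than the wrap period.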

  • Thanks for the reply.

    I will try CLK_gethtime(). My processor is clocked at 594 MHz, so I think the high-resolution timer will wrap about every 7 seconds in this case.

    Thanks and Regards,

    Mihir S
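
The wrap estimates quoted in this thread follow from dividing the 2^32 counts of a 32-bit counter by the CPU frequency. A small sketch to check the arithmetic, where `wrap_period_s` is a hypothetical helper name and the counter is assumed to advance once per CPU cycle:

```c
#include <assert.h>

/* Seconds until a 32-bit counter that advances once per CPU cycle wraps. */
static double wrap_period_s(double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return 4294967296.0 / cpu_hz;   /* 2^32 counts per wrap */
}
```

At 594 MHz this comes to roughly 7.2 seconds, and at 1 GHz to roughly 4.3 seconds, so a 15 ms to 45 ms frame measurement sits far inside a single wrap period.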

  • Hi Karl,

    I tried the above option but it doesn't seem to be working. The profiling irregularity still exists. Please let me know what other options I should try?

    Also, does the Code Generation Tools version matter here? I am currently using BIOS 5.33.06 with Code Generation Tools v6.1.12. I had earlier profiled my code with the same BIOS version and Code Generation Tools v6.0.8. These profiling irregularities occur less often with the current version (v6.1.12), even though the compiler settings are exactly the same as with the earlier version (v6.0.8). What could be the reason for this?

    Thanks and Regards,

    Mihir S

  • Hi Mihir --

    CLK_gethtime() simply reads the TSCL register. This counts at the CPU frequency. It's a simple counter, and once it starts you cannot stop or reset it, so it is very reliable. What you see is probably related to some thread switch or ISR getting in the way. Can you make your task max priority before the benchmark using TSK_setpri()? Can you disable interrupts for your benchmark? I think either an ISR or an ISR plus a TSK switch is getting in your way.

    If you are using 'unsigned int' math, then subtracting t1-t0 should give you a good answer even if there's a wraparound.

    You can read TSCL directly by adding the following declaration to your C code.

    extern volatile unsigned cregister TSCL;

    -Karl-
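
A minimal sketch of timing a piece of code by reading TSCL directly, assuming the TI C6000 compiler (which predefines `_TMS320C6X` and ships `c6x.h` with a declaration of TSCL). The names `read_cycles`, `time_cycles`, and `process_frame` are hypothetical, and the `#else` branch is only a stand-in so that the logic can be compiled and checked on a PC.

```c
#include <assert.h>

#ifdef _TMS320C6X
#include <c6x.h>                      /* TI header that declares TSCL */
static unsigned int read_cycles(void) { return TSCL; }
#else
/* Host stand-in (hypothetical): every read advances a fake counter by a
 * fixed step, starting close to the wrap point to exercise wraparound. */
static unsigned int fake_tscl = 0xFFFFFE00u;
static unsigned int read_cycles(void) { return fake_tscl += 0x180u; }
#endif

/* Placeholder for the code under test (hypothetical). */
static void process_frame(void) { }

/* Time one call of `work` in CPU cycles, including the read overhead. */
static unsigned int time_cycles(void (*work)(void))
{
    unsigned int t0, t1;

    assert(work != 0);
    t0 = read_cycles();
    work();
    t1 = read_cycles();
    return t1 - t0;                   /* wrap-safe unsigned subtraction */
}
```

Calling `time_cycles(process_frame)` returns the cycle count for one invocation. As far as I know, the 64x+ counter only starts running after the first write to TSCL; with BIOS this is normally already done at startup, but a bare-metal test may need a one-time `TSCL = 0;` first.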

  • Hi Karl,

    Thanks for the reply. I tried CLK_gethtime() as well, but the issue persists. One thing I would like to mention after reading your reply: we have not configured and are not using any HWI, although I will still disable IER and try again.

    I am using the NDK from dvsdk_1_01_00_15. Basically, I am reading some images from an Ethernet client running on the DM6437 and sending the result of my algorithm back through Ethernet. I see that the Ethernet driver from the dvsdk is using PRD! Can this cause the peak in my timing measurements? Please have a look at the snapshot below, which shows my tcf file.

     

  • Hi Karl,

    Problem solved, finally! :)

    What was happening was that the function "_llTimerTick" of prdNdk was interrupting the code every 100 ms (Timer 0 HWI, interrupt 14), and this function would take 30 ms to execute. Since our code has no fixed timing (it depends on the input), this peak was appearing randomly.

    What we have done now is disable interrupt 14 before profiling begins, i.e. before the timer is started, and enable it again after profiling is done. Our measurements are free of peaks now.

    Once again, thanks for your help.

    Regards,

    Mihir
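
The fix described above amounts to clearing one bit of the interrupt enable mask around the measurement and restoring it afterwards. The sketch below models that logic on a plain variable so it can be run anywhere; it is not target code, and the names `NDK_TICK_INT_MASK`, `disable_bits`, and `restore_bits` are invented for the illustration. On the DSP itself, IER is a control register rather than a memory location, and the DSP/BIOS calls C64_disableIER() and C64_enableIER() would presumably take the place of these two helpers.

```c
#include <assert.h>

#define NDK_TICK_INT_MASK (1u << 14)  /* bit for CPU interrupt 14 */

/* Clear the given bits in an interrupt-enable mask and return the subset
 * that was previously set, so that exactly those bits can be restored. */
static unsigned int disable_bits(volatile unsigned int *ier, unsigned int mask)
{
    unsigned int was_enabled;

    assert(ier != 0);
    was_enabled = *ier & mask;
    *ier &= ~mask;
    return was_enabled;
}

/* Re-enable only the bits that disable_bits() reported as previously set. */
static void restore_bits(volatile unsigned int *ier, unsigned int was_enabled)
{
    assert(ier != 0);
    *ier |= was_enabled;
}
```

The measurement then sits between the two calls: save the result of `disable_bits`, read the start time, run the frame, read the end time, and call `restore_bits`. Anything else driven from that timer interrupt is held off for the same window, so it is worth keeping the masked region as short as the benchmark allows.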