Limits of profiler counters



Hi,

I would be interested to know the largest access count, CPU cycle count, etc. that can be correctly processed by the profiler of Code Composer 3.3. That is, how many bits wide are the counters that are used, and what is the behavior when a counter overflows? Thanks for any information.

Jim

  • Hello,

    Please see the attached presentation, especially slide 4.

    Thanks

    ki

    ccsv3_profiling.pdf
  • Ki-Soo Lee said:

    Hello,

    Please see the attached presentation, especially slide 4.

    Thanks

    ki

    Thanks for the slides. Between posting my question and getting a reply, I did some profiling that shows at least some parts of the profiler use more than 32 bits. A portion of the results is below.

    Address Range, Symbol Name, SLR, Symbol Type, Access Count, CPU Cycles: Incl. Total

    0x3e8a53-0x3e8b74, cusum_mode_update, 406-492:cusum01.c, function, 41451, 6017339474
    0x3e8a36-0x3e8a53, cusum_update, 328-399:cusum01.c, function, 49874, 6022759589
    0x3e88ab-0x3e89f4, main, 64-308:moog_cusum01.c, function, 1, 2282310293

    The last line is for main(), which was called once with an inclusive total of 2282310293 cycles. But both cusum_mode_update and cusum_update, called either directly or indirectly from main(), have cycle counts above 2^32. Adding 2^32 to the cycle count for main() gives 6577277589, which looks about right. It appears that each individual timing must be < 2^32, but the totals are accumulated with more bits. At least 33 bits must be used to store 6022759589, but how large a number can be stored is still not known. I did not see anything in the slides that mentioned more than 32 bits for storing values.

    The profiler clock also appears to wrap at 2^32 since it showed a total cycle count of 2,746,485,085.
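The wrap-around arithmetic in this post can be checked with a short sketch. It is not part of the original post: the helper names `wrap32` and `unwrap_candidates` are hypothetical, and the only assumption is that a single timing is reported modulo 2^32, as the post infers.

```python
WRAP = 2 ** 32  # modulus of a 32-bit counter


def wrap32(cycles):
    """Return the value a 32-bit counter would show for a true cycle count."""
    return cycles % WRAP


def unwrap_candidates(reported, max_wraps=3):
    """List the true counts consistent with a reported 32-bit value."""
    return [reported + k * WRAP for k in range(max_wraps + 1)]


main_reported = 2282310293       # inclusive total shown for main()
cusum_update_total = 6022759589  # inclusive total shown for cusum_update

# main() includes cusum_update, so its true total cannot be smaller.
plausible = [c for c in unwrap_candidates(main_reported)
             if c >= cusum_update_total]

print(plausible[0])                     # 6577277589 = main_reported + 2**32
print(cusum_update_total.bit_length())  # 33 bits needed for this total
```

The smallest candidate that is at least as large as the `cusum_update` total is the reported value plus one wrap, which matches the estimate in the post. The data cannot rule out additional wraps; it only fixes the minimum.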

    The last slide, on known profiling limitations, was also interesting:

    • CCS Profiling IS:
      • Intrusive (BP based) on HW and older simulators

    The slides were last updated in 2005, so "older" is a bit relative. I am currently using CCS 3.3 and the cycle-accurate simulator for the 2812. Would real hardware be more accurate, less accurate, or about the same?

    Jim

  • Note that the above presentation I posted was geared towards profiling on HW. Simulators work differently. There is a special 64 bit counter with simulators which is used by the CCS profiler. CCS Profiling on simulators is also non-intrusive since it does not rely on breakpoints.

    Simulation is certainly easier for profiling since it has fewer limitations, especially when profiling code in flash.

  • Ki-Soo Lee said:

    Note that the above presentation I posted was geared towards profiling on HW. Simulators work differently. There is a special 64 bit counter with simulators which is used by the CCS profiler. CCS Profiling on simulators is also non-intrusive since it does not rely on breakpoints.

    Simulation is certainly easier for profiling since it has fewer limitations, especially when profiling code in flash.

    If the simulators have 64-bit counters, they are not fully used by the profiler, at least not in CCS 3.3. Whenever I have seen loop or function cycle counts that should be greater than 2^32 cycles, wrapping occurs. Perhaps that has changed in later versions. It does appear that there are variables larger than 32 bits to store the cumulative totals within the profiler software since I have seen these exceed 2^32. How much larger and the response of the profiler on overflow were the original question, and I still do not know. I am certain that at least 33 bits are available. Thankfully my simulations have become much faster lately, so I have not had the opportunity to do more experimentation on the limits.

    Jim

  • Jim,

    Jim Monte said:
    If the simulators have 64-bit counters, they are not fully used by the profiler, at least not in CCS 3.3. Whenever I have seen loop or function cycle counts that should be greater than 2^32 cycles, wrapping occurs.

    Now that I think about it, you are probably right. We added a lot of new (at the time) functionality to the simulators for benchmarking (profiling, code coverage, etc.), but that was all done on the C55x and C6x simulators. If I remember correctly, you are using the F28x simulators, so the profiler is probably using the old 32-bit CLK.

    Jim Monte said:
     It does appear that there are variables larger than 32 bits to store the cumulative totals within the profiler software since I have seen these exceed 2^32.

    Yes, the buffer in the debugger that stores the cumulative count is 64-bit. This applies to all targets, simulator or not. The issue is the actual buffer that holds the count for a single non-branching range, which seems to be 32 bits in most cases.
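The behavior described in this reply can be modeled with a minimal sketch. This is a hypothetical illustration, not the actual CCS implementation: the class `RangeProfile` and its fields are invented names, and the widths (32-bit per-measurement count, 64-bit cumulative total) are taken from the description above.

```python
MASK32 = 0xFFFFFFFF          # width assumed for one measured range
MASK64 = 0xFFFFFFFFFFFFFFFF  # width assumed for the cumulative total


class RangeProfile:
    """Hypothetical model: 32-bit samples summed into a 64-bit total."""

    def __init__(self):
        self.total = 0   # cumulative cycles, kept to 64 bits
        self.count = 0   # number of measurements recorded

    def record(self, true_cycles):
        """Record one measurement; the sample itself wraps at 2**32."""
        measured = true_cycles & MASK32
        self.total = (self.total + measured) & MASK64
        self.count += 1
        return measured


# Many short measurements: the total exceeds 2**32 without loss.
short = RangeProfile()
for _ in range(3):
    short.record(3_000_000_000)
print(short.total)  # 9000000000, above 2**32 and still correct

# One long measurement: the sample wraps before it reaches the total.
long_run = RangeProfile()
print(long_run.record(5_000_000_000))  # 705032704 = 5000000000 - 2**32
```

Under these assumptions the model reproduces both observations in the thread: totals for frequently called functions can exceed 2^32, while a single long timing, such as the one call to main(), is reported modulo 2^32.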