Profiling on TMS570LS3137

Hello,

I'm currently implementing a computationally heavy algorithm (which is also why I created this thread about running from RAM, hoping to gain some performance), and I wanted to use the profiling functionality to identify the functions I need to optimize.

Sadly, I couldn't enable "Profile all Functions for CPU Cycles" under "Profile Setup" in CCSv5.3, and the only thing that seems to work is the Clock function. I couldn't find any specific information in the "Profiler" page in the Wiki, and the "supported targets FAQ" link leads to an empty page. The best I could find is this thread, which doesn't look like a very convenient solution. If that's the best available solution, I might as well continue with my current method: inserting output pin Set and Clear instructions and watching them on my 16-channel digital oscilloscope. It is neither the most precise nor the most convenient solution, but since I only need to watch around 8-10 points and the durations are on the order of milliseconds, it gives me sufficient information. If I need a little more precision I can use Clock + Breakpoints + Step, and maybe leave the PMU method for a later, more detailed analysis.

I would be very thankful if someone knows a solution for decent profiling.

Regards!

Martin

  • Hi Martin,

    Apologies, I don't think the profiling in CCS is supported on Hercules either.

    One suggestion that may help though. 

    You can manually enable the PMU cycle counter with two steps:

    -> Set bit of the register "CP15_PERFORMANCE_MONITOR_CONTROL"

    -> Set bit 31 of the register "CP15_COUNT_ENABLE_SET"

    Then you can see the cycle counter in the register "CP15_CYCLE_COUNT".

    You could then get the cycle count between two breakpoints. When I single-stepped this way, it looked like there was some overhead. It takes 6 cycles for the processor to access the cycle count register, so this may be part of the overhead. If the code section is large, as in your case, this should be a small error. I'm mainly pointing this out so that you don't step over a single instruction and take the difference as that instruction's cycle count, since that would be way off.
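
    As a rough illustration of the breakpoint method, the two values read from CP15_CYCLE_COUNT can be turned into an elapsed count as sketched below. The helper name `elapsed_cycles` and the idea of passing the overhead as a parameter are assumptions made for this example, not part of any TI library; the sketch also assumes the counter is read as a 32-bit value that wrapped at most once between the two readings.

    ```c
    #include <stdint.h>

    /* Hypothetical helper (not a TI API): cycles elapsed between two readings
     * of a 32-bit cycle counter, minus a fixed measurement overhead.
     * Unsigned subtraction is modulo 2^32, so the difference is still correct
     * if the counter wrapped around once between the two readings. */
    static uint32_t elapsed_cycles(uint32_t start, uint32_t end, uint32_t overhead)
    {
        uint32_t raw = end - start;                      /* wrap-safe difference */
        return (raw > overhead) ? (raw - overhead) : 0u; /* clamp at zero */
    }
    ```

    For example, `elapsed_cycles(first_reading, second_reading, 6u)` would subtract the 6-cycle register access mentioned above; the real overhead of a breakpoint-to-breakpoint run may differ and is best measured on the target.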

    These registers can be accessed through the "Registers" pane in CCS:

    You can also manipulate (read & write) these registers through the Scripting Console:

    The first example is just reading the value of CP15_CYCLE_COUNT,  and 124322145 decimal is 0x7690161 which you can see was the cycle count of the first capture.

    The second example evaluates the expression "CP15_COUNT_ENABLE_CLEAR = 0x80000000" which winds up writing 0x80000000 to that register, clearing bit 31.
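
    To make the bit arithmetic explicit, here is a small host-side model in C of what those writes do. It does not touch the real registers; the function names are made up for this sketch, and the write-one-to-set / write-one-to-clear behaviour is modelled the way the two examples above describe it.

    ```c
    #include <stdint.h>

    /* Bit 31 selects the cycle counter in the count enable registers. */
    #define PMU_CYCLE_COUNTER_BIT (UINT32_C(1) << 31)   /* 0x80000000 */

    /* Hypothetical model of a write to CP15_COUNT_ENABLE_SET:
     * every 1 bit in 'value' enables the matching counter. */
    static uint32_t enable_set_write(uint32_t enabled, uint32_t value)
    {
        return enabled | value;
    }

    /* Hypothetical model of a write to CP15_COUNT_ENABLE_CLEAR:
     * every 1 bit in 'value' disables the matching counter. */
    static uint32_t enable_clear_write(uint32_t enabled, uint32_t value)
    {
        return enabled & ~value;
    }
    ```

    In this model, writing 0x80000000 through `enable_clear_write` only drops bit 31 and leaves the event counter bits untouched, which matches the behaviour seen in the Scripting Console.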

    So you could probably make a script to automate some of the steps if it becomes too tedious to do this work through the GUI.

    Best Regards,

    Anthony

  • Hi Anthony,

    I searched for "CP15_PERFORMANCE_MONITOR_CONTROL" just to know the bit position. This thread is a near miss, you wrote
    > Set bit of the register "CP15_PERFORMANCE_MONITOR_CONTROL"

    Maybe the information is in the assembler code linked in SPNA138 (Execution Time Measurement for Hercules™ ARM® Safety MCUs), but I still have to learn ARM assembler.

    Besides, Martin wrote:
    > I couldn't find any specific information in the "Profiler" page in the Wiki and the "supported targets FAQ" link contains an empty page.
    Suggested linkfix: FAQ_-_CCSv5#Q:_What_targets_support_CCS_function_profiling.3F

    - Rainald

    Edit: I found Charles Tsai identifying "the X bit" with "bit4" here. And yes, it is set, even before _pmuInit_(). By stepping over the _pmu* functions, I noticed that bit 31 of the ENABLE_SET register is never set; instead, bit 0 is set by _pmuStartCounters_() and cleared again by _pmuStopCounters_(), which explains why the cycle counter stays at zero. When I use scripting to set bit 31 as you suggested above, it works. Is that a bug in sys_pmu.asm?

    Edit 2: The bug is not in the asm file but in the example code in SPNA138A, which may work for event types other than PMU_CYCLE_COUNT (0x11); I have not tested that yet. The following works for counting cycles:

    	_pmuInit_();
    	_pmuStartCounters_(pmuCYCLE_COUNTER); // enable = 0x80000000, argument changed from pmuCOUNTER0
    	volatile unsigned long pmu_start= _pmuGetCycleCount_(); // changed from _pmuGetEventCount_()
    	{int i; for (i=0; i<22000; i++) ;} // 15 cpu cycles per iteration (1.5 ms @ 220 MHz)
    	float pmu_time = (_pmuGetCycleCount_() - pmu_start - 20) / 220.f; // 20 cycles overhead compensation, 220 MHz, result is in microseconds
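
    The conversion in the last line can be checked on a PC with a small helper like the one below. The name `cycles_to_us` is invented for this sketch; the 220 MHz clock and the 15 cycles per iteration are the figures from the comments above.

    ```c
    #include <stdint.h>

    /* Hypothetical helper: convert a cycle count to microseconds.
     * cpu_mhz is the CPU clock in MHz, i.e. cycles per microsecond. */
    static float cycles_to_us(uint32_t cycles, float cpu_mhz)
    {
        return (float)cycles / cpu_mhz;
    }
    ```

    With 22000 iterations at 15 cycles each, the loop takes 330000 cycles, so `cycles_to_us(330000u, 220.0f)` gives 1500 microseconds, which is the 1.5 ms stated in the loop comment.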