FastRTS library issue with TMS320F28065

Alberto Soattin

Hi Support,

My customer is benchmarking the FastRTS library (in particular the sincos() function) on a TMS320F28065 device.

The choice of CGT (compiler version) leads to a significant difference in both compilation time and execution time!

Tested CGTs: 6.4.10 and 15.12.3.LTS

Same settings in both cases: optimization OFF, as in the following screenshots:

CGT 6.4.10

CGT 15.12.3.LTS

The execution timing is measured using GPIO set and clear on the following code (within the same source code):

MARK2_SET;
sincos(angle_test, &sin_theta, &cos_theta);
MARK2_CLR;

where:

#define MARK2_SET {\
    GpioDataRegs.GPASET.bit.MK2GPIO = 1u;\
}

#define MARK2_CLR {\
    GpioDataRegs.GPACLEAR.bit.MK2GPIO = 1u;\
}

Results

CGT 6.4.10

Compilation time: 44 seconds

Execution time ("sincos" with FastRTS): 0.98522 us

CGT 15.12.3.LTS

Compilation time: 4 minutes and 2 seconds

Execution time ("sincos" with FastRTS): 1.0714 us

This is an 8.75% increase in execution time and a huge increase in compilation time!
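As a quick check of that percentage, here is a minimal sketch in plain C. The helper name percent_increase is hypothetical (it is not part of FastRTS or any TI header), and the inputs are simply the two measured times quoted above.

```c
/* Hypothetical helper, not part of FastRTS: relative increase of
 * `after` over `before`, expressed in percent. */
static double percent_increase(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Calling percent_increase(0.98522, 1.0714) gives roughly 8.75, matching the figure quoted for the two CGT versions.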

Moreover, from the FastRTS Library user's guide (chapter 7 - Benchmarks):

The TMS320F28065 has a zero-wait-state Boot ROM (datasheet, sprs698f.pdf, Figure 6-5, page 52), so I assume 44 cycles.

At 90 MHz the "ideal" execution time would be:

ExecTime = 44 / 90000000 = 0.49 us

Even assuming 50 cycles, I get about 0.56 us.

In the best case (CGT 6.4.10), and allowing for the overhead of passing the input data, storing the results and toggling the GPIO, I get almost twice the expected time.
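To compare the measured times with the cycle counts in the user's guide, it can help to convert in both directions. The following is a minimal sketch in plain C; the function names are hypothetical, and the 90 MHz CPU clock is the value assumed in this post.

```c
/* Hypothetical helpers for converting between CPU cycles and time.
 * cpu_hz is the CPU clock in Hz (90e6 is the value assumed in this post). */

/* Cycle count -> microseconds. */
static double cycles_to_us(double cycles, double cpu_hz)
{
    return cycles / cpu_hz * 1e6;
}

/* Microseconds -> equivalent CPU cycles. */
static double us_to_cycles(double us, double cpu_hz)
{
    return us * 1e-6 * cpu_hz;
}
```

With cpu_hz = 90e6, cycles_to_us(44, 90e6) is about 0.49 us, while the measured 0.98522 us and 1.0714 us correspond to roughly 88.7 and 96.4 cycles. Under these assumptions, the gap between the two CGT versions is therefore about 8 cycles, and both figures include the call and GPIO overhead.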

It is even worse with CGT 15.12.3.LTS, not to mention the compilation time, which is far longer than with the other CGT.

What do I have to do / set in my project?

Many thanks in advance for helping me support the customer with this issue.

Alberto

over 9 years ago

0 Vishal_Coelho over 9 years ago

Hi Alberto

Can you post the disassembly for the benchmarking code? Also, when we profile the functions we use the "Clock" feature of CCS, under run->clock->enable. You open up the disassembly window and run to the line where the function is actually called, the LCR instruction:

LCR _sincos

or something similar to that. With the clock enabled and reset (double click on the clock icon in the lower right of the CCS window to reset the count to 0), step over the LCR instruction in the disassembly window to get the cycle count for that function. If you are running the function out of 0-wait memory with the tables in ROM, you should get the numbers indicated in the UG. If that matches, the overhead is probably coming from the profiling code around the function call.

0 Alberto Soattin over 9 years ago in reply to Vishal_Coelho

Vishal,

While waiting for customer feedback on this test, I received this interesting spreadsheet of the differences they see between various CGT versions.

These results confirm that something wrong is happening.

I'll keep you posted on their investigation.

regards,

Alberto
