TMS320F280049C-Q1: Double float optimization in F280049C

Karthik Raveendranath

Part Number: TMS320F280049C-Q1
Other Parts Discussed in Thread: C2000WARE

Hi all,

Optimization of Double float operation in F280049C

I need to implement a double-precision floating-point operation on the F280049C, which has only a single-precision FPU. I want to see how far a double-precision operation can be optimized.

a = a + b + c;

The statement takes 400 cycles when a, b and c are double precision (long double) and 10 cycles when they are single precision (float). Is there any way to optimize the execution for double-precision floating point? b is a constant and c varies.
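For readers who want to reproduce the comparison, here is a minimal measurement sketch. It assumes a free-running tick source is available: `read_ticks()` is a hypothetical hook, not a TI API, and on a host build the standard `clock()` stands in for it. On the target it would have to be replaced by a hardware counter or by the debugger's clock, and the cost of reading the counter should be subtracted from the result.

```c
#include <time.h>

/* Hypothetical hook, not a TI API: on the target this would read a
   free-running hardware counter; on a host, clock() stands in for it. */
static unsigned long read_ticks(void)
{
    return (unsigned long)clock();
}

/* volatile keeps the optimizer from folding the additions away. */
static volatile float       af  = 0.0f, bf  = 0.5f, cf  = 0.25f;
static volatile long double ald = 0.0L, bld = 0.5L, cld = 0.25L;

/* Elapsed ticks for one single-precision a = a + b + c. */
unsigned long time_add_float(void)
{
    unsigned long start = read_ticks();
    af = af + bf + cf;
    return read_ticks() - start;
}

/* Elapsed ticks for the same statement with long double operands. */
unsigned long time_add_long_double(void)
{
    unsigned long start = read_ticks();
    ald = ald + bld + cld;
    return read_ticks() - start;
}
```

On a host the resolution of `clock()` is far too coarse for a single addition, so the sketch only shows the structure of the measurement, not meaningful numbers.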

Thanks and Regards

Karthik

over 3 years ago

0 Shanty over 3 years ago

TI__Expert 5410 points

Hi,

I was unable to reproduce the issue. As you can see in the below image, the operation only takes 6 cycles (bottom right). Let me know if I misunderstood the query.

To optimise, you can try --fp_mode=relaxed. For further optimisation, you can try using IQMath or the Fixed Point library. For further benchmarking, you can consult https://www.ti.com/lit/pdf/spruhs1 or the respective documentation in c2000ware.
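As an illustration of the fixed-point direction, here is a minimal sketch that keeps the accumulator in a signed 64-bit integer (Q31.32 format). It is plain C, not the IQmath API: the type `q32_t` and the helper names are hypothetical, and whether this beats the software double-precision add on the C28x has not been measured here. It also assumes the values stay within about ±2^31 and that a resolution of 2^-32 is sufficient.

```c
#include <stdint.h>

/* Q31.32 fixed point: a signed 64-bit integer holding value * 2^32.
   q32_t and the helpers below are hypothetical, not IQmath names. */
typedef int64_t q32_t;

#define Q32_ONE 4294967296.0  /* 2^32 as a floating-point constant */

/* Conversion uses floating point, so do it once at start-up (or only
   for the constant b), not inside the time-critical loop. */
static q32_t q32_from_double(double x)
{
    return (q32_t)(x * Q32_ONE);
}

static double q32_to_double(q32_t q)
{
    return (double)q / Q32_ONE;
}

/* a = a + b + c using only 64-bit integer additions.
   The caller must make sure the sum cannot overflow. */
static q32_t q32_accumulate(q32_t a, q32_t b, q32_t c)
{
    return a + b + c;
}
```

If c arrives as a floating-point value on every iteration, the conversion cost has to be counted as well, so this approach pays off mainly when c is already an integer or fixed-point quantity.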

-Shantanu

0 Karthik Raveendranath over 3 years ago in reply to Shanty

Intellectual 840 points

Hi Shantanu,

Thanks for the reply. I rechecked, and it is still taking close to 400 cycles. One difference from your code is that I have declared the variables as "long double", which the C2000 compiler guide defines as the 64-bit float type.

In the snippet you shared, the variables are declared as plain "double". Maybe that's the difference. Since it takes the same number of cycles as a 32-bit single float, I assume plain "double" is treated as 32-bit, but I am not sure.
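One way to settle the question is to print the width of each type on the build in question. As far as I know, the C28x compiler treats double as 32-bit under COFF and as 64-bit under EABI, with long double being 64-bit in both, which would match the observation above; the sketch below is a way to check that rather than rely on it. The macro `BITS_OF` and the function name are made up for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Width of a type in bits. sizeof counts chars, and a char is not
   8 bits on every target, so multiply by CHAR_BIT. */
#define BITS_OF(type) ((unsigned)(sizeof(type) * CHAR_BIT))

/* Prints the width of the three floating-point types for this build. */
void print_float_widths(void)
{
    printf("float:       %u bits\n", BITS_OF(float));
    printf("double:      %u bits\n", BITS_OF(double));
    printf("long double: %u bits\n", BITS_OF(long double));
}
```

If printf is too heavy for the target, the same `BITS_OF` expressions can be inspected in the debugger or used in a compile-time check instead.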

Thanks and Regards

Karthik

0 Shanty over 3 years ago in reply to Karthik Raveendranath

TI__Expert 5410 points

Hi,

Made the change and I'm getting 278 cycles. It is still not the most optimised version you could use. Using fpu64 mode and including the rts2800_fpu64_eabi lib, I was able to optimise it down to 12 cycles:

-Shantanu

0 Karthik Raveendranath over 3 years ago in reply to Shanty

Intellectual 840 points

Hi Shantanu,

Thanks for the reply. 12 cycles is a really good value. But can we use fpu64, given that the device doesn't have FPU64 hardware?

Thanks and Regards

Karthik R

0 Shanty over 2 years ago in reply to Karthik Raveendranath

TI__Expert 5410 points

Hi Karthik,

Apologies. You cannot use FPU64 on your device. But using rts2800_fpu32_eabi.lib, I was able to greatly optimise it.

My build command is as follows:

"C:/ti/ccs1020/ccs/tools/compiler/ti-cgt-c2000_20.2.2.LTS/bin/cl2000" --float_support=fpu32 --idiv_support=idiv0 --opt_for_speed=4 --fp_mode=relaxed --advice:performance=all -g --diag_warning=225 --diag_wrap=off --display_error_number --abi=eabi -z -m"E2E_TEST.map" --warn_sections -i"C:/ti/ccs1020/ccs/tools/compiler/ti-cgt-c2000_20.2.2.LTS/lib" -i"C:/ti/ccs1020/ccs/tools/compiler/ti-cgt-c2000_20.2.2.LTS/include" --reread_libs --diag_wrap=off --display_error_number --xml_link_info="E2E_TEST_linkInfo.xml" --rom_model -o "E2E_TEST.out" "./main.obj" "../28003x_generic_ram_lnk.cmd" -l"C:/ti/ccs1020/ccs/tools/compiler/ti-cgt-c2000_20.2.2.LTS/lib/rts2800_fpu32_eabi.lib"

Can you try this out and let me know?
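To confirm which floating-point hardware a given build actually targets, the --float_support setting can be checked from C through the compiler's predefined macros. This is a minimal sketch: the macro names `__TMS320C28XX_FPU64__` and `__TMS320C28XX_FPU32__` are recalled from the TI C28x compiler guide and should be treated as assumptions to verify against your compiler version, and the function name is made up for the example.

```c
/* Reports which floating-point hardware the compiler was told to target.
   The macro names are assumptions recalled from the TI C28x compiler
   guide; verify them against your compiler version. */
const char *float_support_name(void)
{
#if defined(__TMS320C28XX_FPU64__)
    return "fpu64";   /* would not be valid on a device without FPU64 */
#elif defined(__TMS320C28XX_FPU32__)
    return "fpu32";
#else
    return "none";    /* no FPU option selected, or not a C28x build */
#endif
}
```

The FPU64 case is tested first on the assumption that an fpu64 build may also define the FPU32 macro.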

-Shantanu

0 Karthik Raveendranath over 2 years ago in reply to Shanty

Intellectual 840 points

Hi Shantanu,

Thanks for the reply. I modified my compiler flags as per your build command and it started throwing many errors.

So I reverted to the earlier flag settings and changed the following:

a) Project >> Properties >> CCS General >> Project Type and Tool chain >> Output Format = eabi (ELF)

b) Project >> Properties >> CCS General >> Project Type and Tool chain >> Run Time Support Library = rts2800_fpu32_eabi.lib

It's showing the following errors:

warning #10440-D: creating output section ".bss" without a SECTIONS specification. For additional information on this section, please see the 'C28x EABI Migration' guide at processors.wiki.ti.com/.../C28x_EABI:C28x_EABI_Migration

warning #10440-D: creating output section ".const" without a SECTIONS specification. For additional information on this section, please see the 'C28x EABI Migration' guide at processors.wiki.ti.com/.../C28x_EABI:C28x_EABI_Migration

warning #10440-D: creating output section ".sysmem" without a SECTIONS specification. For additional information on this section, please see the 'C28x EABI Migration' guide at processors.wiki.ti.com/.../C28x_EABI:C28x_EABI_Migration

warning #10440-D: creating output section ".init_array" without a SECTIONS specification. For additional information on this section, please see the 'C28x EABI Migration' guide at processors.wiki.ti.com/.../C28x_EABI:C28x_EABI_Migration

warning #10247-D: creating output section ".data" without a SECTIONS specification

undefined        first referenced
symbol           in file
---------        ----------------
F28x_usDelay     ./source_files/f28004x_sysctrl.obj

error #10234-D: unresolved symbols remain

error #10010: errors encountered during linking; "Sin_Gen.out" not built

Thanks n Regards,

Karthik R

0 Shanty over 2 years ago in reply to Karthik Raveendranath

TI__Expert 5410 points

You need to make sure these sections are allocated in the cmd file. You can use the examples provided as reference. Can you share the cmd files?

Regarding the linker error, you need to make sure that the unresolved symbols are properly linked in your project either in project files or by including the library.

-Shantanu

0 Veena Kamath over 2 years ago in reply to Shanty

TI__Mastermind 31525 points

Karthik,

Regarding the issue with F28x_usDelay, please add the following line above the definition of _F28x_usDelay in the <device>_usdelay.asm file

.if __TI_EABI__
.asg F28x_usDelay, _F28x_usDelay
.endif

The f28004x bitfield files are still in COFF format. The above code makes sure the file can also be used with the EABI compiler.

Regards,

Veena

0 Karthik Raveendranath over 2 years ago in reply to Veena Kamath

Intellectual 840 points

Hi Shantanu/Veena,

Thanks for the response.

F28x_usDelay() is declared as extern in "f28004x_examples.h" and called in the f28004x_sysctrl.c file. I couldn't find where else it is defined.

I have added the following in "f28004x_examples.h"

It is giving the following errors

-----------------------------------

>> Compilation failure

source_files/subdir_rules.mk:16: recipe for target 'source_files/f28004x_sysctrl.obj' failed

"/Sin_Gen/header_files/f28004x_examples.h", line 356: error #171: expected a declaration

"/Sin_Gen/header_files/f28004x_examples.h", line 360: warning #12-D: parsing restarts here after previous syntax error

"../source_files/f28004x_sysctrl.c", line 620: warning #225-D: function "F28x_usDelay" declared implicitly

1 error detected in the compilation of "../source_files/f28004x_sysctrl.c".

gmake: *** [source_files/f28004x_sysctrl.obj] Error 1

gmake: Target 'all' not remade because of errors.

_____________________________________

So I commented out the call to F28x_usDelay() in f28004x_sysctrl.c. Now I am getting the following errors:

-------------------------------------------------------------

warning #10247-D: creating output section ".data" without a SECTIONS specification

"../280049C_RAM_lnk.cmd", line 27: error #10099-D: program will not fit into available memory, or the section contains a call site that requires a trampoline that can't be generated for this section. placement with alignment/blocking fails for section ".cinit" size 0x9f page 0. Available memory ranges:

RAMM0 size: 0x40b unused: 0x0 max hole: 0x0

error #10010: errors encountered during linking; "Sin_Gen.out" not built

>> Compilation failure

makefile:155: recipe for target 'Sin_Gen.out' failed

gmake[1]: *** [Sin_Gen.out] Error 1

makefile:151: recipe for target 'all' failed

gmake: *** [all] Error 2

**** Build Finished ****

-----------------------------

This is my original cmd file content. I modified it based on the suggestion from

e2e.ti.com/.../979405

----------------------

MEMORY

{

PAGE 0 :

/* BEGIN is used for the "boot to SARAM" bootloader mode */

BEGIN : origin = 0x000000, length = 0x000002

RAMM0 : origin = 0x0000F5, length = 0x00040B

RAMLS : origin = 0x008000, length = 0x003000

RESET : origin = 0x3FFFC0, length = 0x000002

PAGE 1 :

BOOT_RSVD : origin = 0x000002, length = 0x0000F3 /* Part of M0, BOOT rom will use this for stack */

RAMM1 : origin = 0x000500, length = 0x000300 /* on-chip RAM block M1 */

RAMLS5 : origin = 0x00A800, length = 0x003800

RAMGS1 : origin = 0x00E000, length = 0x002000

RAMGS2 : origin = 0x010000, length = 0x002000

RAMGS3 : origin = 0x012000, length = 0x002000

}

SECTIONS

{

codestart : > BEGIN, PAGE = 0

.TI.ramfunc : > RAMM0 PAGE = 0

.text : >>RAMM0 | RAMLS , PAGE = 0

.cinit : > RAMM0, PAGE = 0

.pinit : > RAMM0, PAGE = 0

.switch : > RAMM0, PAGE = 0

.reset : > RESET, PAGE = 0, TYPE = DSECT /* not used, */

.stack : > RAMM1, PAGE = 1

.ebss : > RAMLS5, PAGE = 1

.econst : > RAMLS, PAGE = 0

.esysmem : > RAMLS5, PAGE = 1

ramgs0 : > RAMGS1, PAGE = 1

ramgs1 : > RAMGS2, PAGE = 1

}

Thanks n Regards,

Karthik R

0 Veena Kamath over 2 years ago in reply to Karthik Raveendranath

TI__Mastermind 31525 points

Karthik,

The above code should be inserted in the asm file f28004x_usdelay.asm.

We have created a ticket to include this in the next C2000ware release.

Regards,

Veena

0 Veena Kamath over 2 years ago in reply to Veena Kamath

TI__Mastermind 31525 points

Karthik,

Regarding the issue with .bss, .const etc., these are the names of the sections created by the EABI compiler. The sections created by the COFF compiler had different names (.ebss, .econst, etc.).

Please refer to this document for more details: https://software-dl.ti.com/ccs/esd/documents/C2000_c28x_migration_from_coff_to_eabi.html#eabi-sections

You can also refer to the reference linker command file in the C2000Ware\device_support\f28004x\common\cmd. Use the 28004x_generic_flash_lnk.cmd for Flash config and 28004x_generic_ram_lnk.cmd for RAM config

Regards,

Veena

0 Karthik Raveendranath over 2 years ago in reply to Veena Kamath

Intellectual 840 points

Hi Veena,

Thanks a lot for the suggestion. I made the necessary modification in the asm file and added f28004x_generic_ram_link.cmd to the project. Now all the build errors are gone. But when I try to debug on the LaunchPad, it throws the following error:

28xx_CPU1: Trouble Removing Breakpoint with the Action "Finish Auto Run" at 0x8a8a: (Error -1066 @ 0x8A8A) Unable to set/clear requested breakpoint. Verify that the breakpoint address is in valid memory. (Emulation package 9.1.0.00001)

Thanks and Regards

Karthik R

0 Karthik Raveendranath over 2 years ago in reply to Shanty

Intellectual 840 points

Hi Shantanu,

Forgot to ask: how many cycles does the 64-bit floating-point operation take now with "rts2800_fpu32_eabi.lib"?

Thanks and Regards

Karthik R

0 Veena Kamath over 2 years ago in reply to Karthik Raveendranath

TI__Mastermind 31525 points

Karthik,

Can we take the linker issue to a different thread? Please open a new E2E thread so that it is helpful for others facing a similar issue.

Regards,

Veena
