TMS320F280049: Calculation cycles comparison between float32 and uint32

Sheldon He

Part Number: TMS320F280049

For ordinary multiplication (uint32*uint32, uint16*uint16) and division (uint32/uint16), will a device with an FPU take less time than a plain fixed-point DSP core? (Compare the F280049 with the F28027, with all instructions enabled.)

For example,

uint32 a = 2, b = 3;

float32 c = 4, d = 5;

Will c * d be faster than a * b? What if the variables are uint16 and float32?

Will a / b be faster than c / d? What if the variables are uint16 and float32?

Thanks

Sheldon

over 6 years ago

0 Sira Rao80 over 6 years ago

TI__Mastermind 26875 points

Sheldon,

I am tempted to give a quick answer here, but on second thought there are some nuances (including the fact that the 28004x is a 100MHz device, whereas the 28027 is a 60MHz device), so let me analyze this carefully before getting back to you. Please give me a day or so.

Also, could you elaborate on what you mean by "What if variable is in uint16 and float32?" Do you mean a mixed operation involving fixed-point and floating-point numbers? In those cases, you will end up casting. Or did you mean to differentiate between uint16 and uint32? Please clarify.

Thanks,

Sira

0 Sheldon He over 6 years ago in reply to Sira Rao80

TI__Expert 8585 points

Hi Sira,

We may assume they are running at the same main clock speed; what I care about here is the difference between the instruction sets (for example, whether an FPU is available or not).

I mean the same test, but with uint16 in place of uint32, like below. Thanks!

uint16 a = 2, b = 3;

float32 c = 4, d = 5;

Will c * d be faster than a * b?

Will a / b be faster than c / d?

Thanks

Sheldon

0 Sira Rao80 over 6 years ago in reply to Sheldon He

TI__Mastermind 26875 points

Sheldon,

I think that, since there are single-cycle instructions for both fixed-point multiply (MPY) and floating-point multiply (MPYF32), there should be no difference on the multiply side.
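To check this on a particular toolchain, one option is to compile three small functions and look for MPY and MPYF32 in the generated assembly. The sketch below is illustrative only: the function names are hypothetical, and it uses `<stdint.h>` types and `float` in place of the `uint16`, `uint32`, and `float32` typedefs used in this thread, on the assumption that they map to the same underlying types on the target.

```c
#include <assert.h>   /* only needed for the example check in the note below */
#include <stdint.h>

/* 16 x 16 -> 32-bit product. The casts widen the operands first, so the
   multiplication cannot overflow the promoted operand type. */
uint32_t mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* 32 x 32 -> low 32 bits of the product (wraps modulo 2^32). */
uint32_t mul_u32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Single-precision floating-point multiply. */
float mul_f32(float c, float d)
{
    return c * d;
}
```

With the values from the question, `assert(mul_u32(2, 3) == 6);` and `assert(mul_f32(4.0f, 5.0f) == 20.0f);` should both hold. Whether each function reduces to a single multiply instruction depends on the compiler options and operand widths, so the assembly listing is the thing to inspect.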

On the division side, there are quite a few possibilities.

If "/" just calls the standard RTS library, it would consume a different number of cycles for fixed-point and floating-point. I presume floating-point will take longer, but I haven't measured this and don't know where it is documented.

If you use the FPUFastRTS library, it supports single-precision floating-point division (and double-precision division), and benchmarking data is available in the user guide (25 cycles for single-precision). I presume this would be fewer cycles than the RTS library single-precision floating-point division, because the purpose is to trade off accuracy for speed.

On devices that have FASTINTDIV (2838x, but not 28004x), integer division can be performed fast (benchmarking data available in the corresponding user guide - 16 cycles for uint16/uint16).

If you use the IQMath fixed-point library for fixed-point division, benchmarking data is available in the user guide (63 cycles for IQNdiv).
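As a rough illustration of how a division routine can trade accuracy for speed, the sketch below computes n / d from a reciprocal estimate refined by Newton-Raphson steps. This is a generic textbook technique written in portable C, not the FPUFastRTS implementation; the function name, the linear first guess, and the `iterations` parameter are all assumptions made for the example.

```c
#include <assert.h>
#include <math.h>

/* Approximate n / d with a reciprocal estimate plus Newton-Raphson refinement.
   Hypothetical helper for illustration only; d must be finite and non-zero. */
float approx_div_f32(float n, float d, int iterations)
{
    int e;
    int i;
    float m;
    float x;

    assert(d != 0.0f);

    /* Split |d| into m * 2^e with m in [0.5, 1). */
    m = frexpf(fabsf(d), &e);

    /* Linear first guess for 1/m on [0.5, 1). */
    x = 48.0f / 17.0f - (32.0f / 17.0f) * m;

    /* Each step roughly doubles the number of correct bits. */
    for (i = 0; i < iterations; i++) {
        x = x * (2.0f - m * x);
    }

    /* Undo the scaling, then restore the sign of d. */
    x = ldexpf(x, -e);
    return (d < 0.0f) ? -(n * x) : (n * x);
}
```

Fewer iterations cost fewer multiplies but leave a larger error, which is the kind of trade-off described above. For instance, `approx_div_f32(4.0f, 5.0f, 0)` is visibly further from 0.8 than `approx_div_f32(4.0f, 5.0f, 3)`.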

Hope this helps,

Sira

0 Sheldon He over 6 years ago in reply to Sira Rao80

TI__Expert 8585 points

Hi Sira,

Thanks for your detailed answer. Could you check whether my understanding is correct?

As for multiplication between two fixed-point numbers and between two floating-point numbers, there should be no difference.

uint32*uint32, uint16*uint16	float32*float32	Comments
C28x	C28x + FPU	No difference: the former has the MPY instruction and the latter has MPYF32, and both are single-cycle.

As for division between two fixed-point numbers and between two floating-point numbers, please see the details below.

uint32/uint32, uint16/uint16	float32/float32	Comments
C28x + FASTINTDIV	C28x + FPU + TMU	The former takes around 14 to 16 cycles with hardware support. The latter uses DIVF32, which takes 5 cycles (other instructions can execute in the remaining four cycles thanks to CPU pipelining).
C28x + IQmath	C28x + FPU + FPUfastRTS	The former takes 63 cycles by calling IQNdiv. The latter takes 24 cycles.
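To put these cycle counts in terms of time, the small helper below converts cycles to nanoseconds for a given CPU clock. It is only a sketch: the function name is hypothetical, and the 100MHz and 60MHz clocks are the figures quoted earlier in this thread for the 28004x and 28027. Applying one clock to every row is an assumption made purely for comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds for a CPU clock given in MHz.
   One cycle at f MHz lasts 1000 / f ns; multiplying first avoids
   losing precision in the integer division. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t clock_mhz)
{
    assert(clock_mhz != 0u);
    return (cycles * 1000u) / clock_mhz;
}
```

At 100MHz, `cycles_to_ns(63, 100)` gives 630 ns for the IQNdiv figure and `cycles_to_ns(24, 100)` gives 240 ns for the FPUfastRTS figure, while the same 63 cycles at 60MHz come to `cycles_to_ns(63, 60)`, or 1050 ns.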

Sheldon

0 Sira Rao80 over 6 years ago in reply to Sheldon He

TI__Mastermind 26875 points

Sheldon,

Thanks for including info on the DIVF32 supported by the FPU.

The tables look good. It would probably be good to explicitly say "IQMath Library" and "FPUFastRTS Library" to distinguish them from the HW modules like FPU, TMU, and FASTINTDIV.

Thanks,

Sira
