32x32=64-bit multiplication on C6400

Andy Polyakov

The other day there was a question about multiplication, which reminded me of a certain issue with the C6400 code generator. Consider this:

unsigned long long foo(unsigned int a, unsigned int b)
{  return (unsigned long long)a * b;   }

The C6400+ code generator (note the plus) compiles it as a single MPY32U instruction, which is totally cool. Now, given that the C6400 code generator (note the lack of plus) compiles a 32x32=32 multiplication as three 16-bit multiplications plus a shift and a pair of additions (also totally cool), one would expect a 32x32=64 multiplication to be compiled as four 16-bit multiplications plus shifts and additions. But that's not what happens :-( Instead it generates a function call to _mpyll, which apparently performs a 64x64=64 multiplication with ten 16-bit multiplications and numerous shifts and additions. This is wasteful and naturally bad for performance...
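To make the expected decomposition concrete, here is a minimal portable C sketch of both cases. It only illustrates the arithmetic and is not what the TI compiler actually emits. The function names mul32x32_32 and mul32x32_64 are made up for this example, and 32-bit unsigned int is assumed.

```c
#include <stdint.h>

/* 32x32=32: the hi*hi term lies entirely above bit 31 and is dropped,
   leaving three 16x16 multiplications, one shift and two additions. */
uint32_t mul32x32_32(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    return al * bl + ((ah * bl + al * bh) << 16);
}

/* 32x32=64: all four 16x16=32 partial products are needed,
   combined with shifts and additions (carries handled explicitly). */
uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t ll = al * bl;   /* weight 2^0  */
    uint32_t lh = al * bh;   /* weight 2^16 */
    uint32_t hl = ah * bl;   /* weight 2^16 */
    uint32_t hh = ah * bh;   /* weight 2^32 */

    /* Sum of everything that lands in bits 16..31; cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    uint32_t lo = (ll & 0xFFFFu) | (mid << 16);
    uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);

    return ((uint64_t)hi << 32) | lo;
}
```

On a host compiler, mul32x32_64(a, b) should agree with (unsigned long long)a * b for any inputs, which makes the sketch easy to check.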

over 9 years ago

George Mock over 9 years ago

TI__Guru**** 251710 points

This situation is described in detail in the application note "How to Write Multiplies Correctly in C Code".

Thanks and regards,

-George

Andy Polyakov over 9 years ago in reply to George Mock

Expert 1340 points

Once again, the C6400+ code generator manages to recognize that the upper halves of 32-bit values just converted to 64 bits are actually zeros, so the operation can be performed with a single MPY32U instruction, and that is exactly what it emits. So the question is: why doesn't the C6400 code generator recognize the same thing and perform the operation inline with four 16x16=32 multiplications (naturally with complementary adds and shifts)? In other words, the question is not about how to write correct code, but about why it isn't properly optimized.
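A rough sketch of where the numbers come from, assuming a schoolbook 64x64=64 multiply over 16-bit limbs. This is a guess at the arithmetic, not TI's actual _mpyll code, and the function name and the mults counter are hypothetical. Only partial products a[i]*b[j] with i + j <= 3 land below bit 64, which gives 4 + 3 + 2 + 1 = 10 multiplications. When both operands were zero-extended from 32 bits, limbs 2 and 3 are zero, so at most four of those products can contribute.

```c
#include <stdint.h>

/* Low 64 bits of a 64x64 product, built from 16-bit limbs.
   *mults (hypothetical out-parameter) counts the partial products
   whose limbs are both nonzero, i.e. the ones that really contribute. */
uint64_t mul64x64_64_limbs(uint64_t a, uint64_t b, int *mults)
{
    uint64_t r = 0;
    int i, j;

    *mults = 0;
    for (i = 0; i < 4; i++) {
        uint32_t ai = (uint32_t)(a >> (16 * i)) & 0xFFFFu;
        for (j = 0; i + j < 4; j++) {      /* terms at or above bit 64 are skipped */
            uint32_t bj = (uint32_t)(b >> (16 * j)) & 0xFFFFu;
            if (ai == 0 || bj == 0)
                continue;                  /* a zero limb contributes nothing */
            r += (uint64_t)(ai * bj) << (16 * (i + j));
            (*mults)++;
        }
    }
    return r;
}
```

With two full 64-bit operands the counter reaches 10; with two zero-extended 32-bit operands it never exceeds 4, which is the saving the compiler could in principle take at compile time.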

Archaeologist over 9 years ago in reply to Andy Polyakov

TI__Guru* 84285 points

You are correct, there is an opportunity for optimization. I've created enhancement request CODEGEN-1931 to track this issue.

Andy Polyakov over 9 years ago in reply to Archaeologist

Expert 1340 points

To double-clarify: the problem was observed with several versions of TI CGT v6 and v7, including the latest, v7.4.20. "Latest" refers to the fact that post-v7 compilers don't accept the -mv6400 option. The reference to -mv6400 in turn means that this option is what it takes to reproduce the problem [with the two-liner at the beginning of the thread], naturally along with an optimization flag...
