32bit fast multiplication

Ajeesh Arakkal

Other Parts Discussed in Thread: AFE4490

I am using an MSP430. I would like to implement an FIR filter for a 22-bit ADC (AFE4490 SpO2 module), and for that I need 32-bit multiplication. Is there any assembly code for 32-bit multiplication, i.e. functions like "mul16()"? Please help me.

over 11 years ago

0 old_cow_yellow over 11 years ago

Guru 58965 points

You could pick a member of MSP430 family with hardware 32 x 32 bit multiply.

0 Jens-Michael Gross over 11 years ago in reply to old_cow_yellow

Guru 227245 points

old_cow_yellow said:
You could pick a member of MSP430 family with hardware 32 x 32 bit multiply.

...and rewrite the mul16() code to do a mul32(). It's almost straightforward.
However, if your MSP only has a 16-bit hardware multiplier, then it becomes a bit more complex, and it will gain less speed compared to a standard '*'.
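As a rough illustration of why the 16-bit case is more complex, a 32x32->64 product can be assembled from four 16x16->32 partial products. The sketch below is portable C, not MSP430 assembly; the names mul16x16_32 and mul32x32_64 are hypothetical, and the helper merely stands in for one pass through a 16-bit hardware multiplier.

```c
#include <stdint.h>

/* Hypothetical helper: stands in for one 16x16->32 hardware multiply. */
static uint32_t mul16x16_32(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* Hypothetical name: unsigned 32x32->64 multiply built from four
   16x16->32 partial products (schoolbook method). */
uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    /* Split both operands into 16-bit halves. */
    uint16_t a_lo = (uint16_t)(a & 0xFFFFu);
    uint16_t a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)(b & 0xFFFFu);
    uint16_t b_hi = (uint16_t)(b >> 16);

    /* Partial products; the two cross terms are summed in 64 bits
       so that their carry is not lost. */
    uint64_t lo  = mul16x16_32(a_lo, b_lo);
    uint64_t mid = (uint64_t)mul16x16_32(a_lo, b_hi) + mul16x16_32(a_hi, b_lo);
    uint64_t hi  = mul16x16_32(a_hi, b_hi);

    /* Weight each partial product by its position and add. */
    return lo + (mid << 16) + (hi << 32);
}
```

If only the low 32 bits of the result are needed, the a_hi * b_hi term drops out entirely, which saves one of the four multiplies.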

0 Ajeesh Arakkal over 11 years ago in reply to Jens-Michael Gross

Prodigy 120 points

We can use the * operator for multiplication. I am using the IAR IDE, and in that workbench we can enable the hardware multiplier:

Option->general options->hardware multiplier. Then the compiler will use the hardware multiplier of the MSP430.

0 Jens-Michael Gross over 11 years ago in reply to Ajeesh Arakkal

Guru 227245 points

For the * operator on integer values, the compiler will automatically use the HWM, if available and enabled (which should be the default).

However, the C language specification forces the compiler to use an inefficient strategy. The C language specifies only 16*16->16 or 32*32->32 multiplications, even when the MPY32 could do a 16x16->32 multiplication without first extending the operands to 32 bits. That’s why there are functions/macros of the type “unsigned long long mul32(unsigned long a, unsigned long b)”.
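For the FIR filter in the original question, a widening multiply of this kind might be used roughly as follows. This is a minimal sketch in portable C, assuming the 22-bit samples are stored sign-extended in int32_t and the coefficients are also int32_t; the names mul32_wide and fir_output are hypothetical and do not come from any TI or IAR library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: signed 32x32->64 multiply. Widening one operand
   before the multiplication keeps the full 64-bit product. */
static int64_t mul32_wide(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}

/* Hypothetical FIR output: sum of samples[i] * coeffs[i] over n taps,
   accumulated in 64 bits so the products cannot overflow. */
int64_t fir_output(const int32_t *samples, const int32_t *coeffs, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += mul32_wide(samples[i], coeffs[i]);
    }
    return acc;
}
```

Whether the compiler maps mul32_wide onto the MPY32 directly depends on the toolchain and its settings, so the generated code is worth inspecting.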

0 Anders Lindgren over 11 years ago in reply to Jens-Michael Gross

Expert 2070 points

In fact, the IAR compiler is fully capable of recognizing the cases you mention and generating good code for them. For example:

unsigned long mul1616_to_32(unsigned short x, unsigned short y)
{
    return ((unsigned long)x) * y;
}

This is compiled into:

PUSH.W SR
DINT
NOP
MOV.W R12, &0x130
MOV.W R13, &0x138
MOV.W &0x13a, R12
MOV.W &0x13c, R13
POP.W SR
RETA

This code utilizes the 16*16->32 feature of the 32 bit hardware multiplier.

Hopefully, the TI tools do something similar.

-- Anders Lindgren, IAR Systems, Author of the IAR compiler for MSP430

0 Jens-Michael Gross over 11 years ago in reply to Anders Lindgren

Guru 227245 points

Interesting. So strictly speaking, the IAR compiler is not following the C language specification here. :)
Of course no harm is done as the outcome is the expected one, and faster than the strict implementation. So I think nobody (except a few pedantic fault-finders) has a reason to complain.

0 Anders Lindgren over 11 years ago in reply to Jens-Michael Gross

Expert 2070 points

Jens-Michael Gross said:

Interesting. So strictly speaking, the IAR compiler is not following the C language specification here. :)
Of course no harm is done as the outcome is the expected one, and faster than the strict implementation. So I think nobody (except a few pedantic fault-finders) has a reason to complain.

It most surely is following the C language specification!

The compiler is free to compile source code into any code sequence it wants, as long as the result is identical to what the "C virtual machine" would produce. In this case, the naive way to implement the code is to cast both arguments to 32 bits and then perform a full 32-bit multiplication. However, a more efficient way is to use the built-in 16x16->32 multiplication unit.

The results of both methods are the same, and hence both follow the C language specification.

(One clarification: if the code had instead been written as (unsigned long)(x * y), then the compiler must (and, of course, does) perform a 16-bit multiplication and then extend the result to 32 bits.)
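The difference between the two cast placements can be reproduced on any host. The sketch below uses fixed-width types and hypothetical function names; because int is wider than 16 bits on most desktop compilers, the second function emulates the MSP430 behaviour of (unsigned long)(x * y) by truncating the product to 16 bits explicitly, which is an assumption of this example rather than what the MSP430 compiler emits.

```c
#include <stdint.h>

/* Widen first: the multiplication itself is carried out in 32 bits,
   so the full product is kept. */
uint32_t widen_then_multiply(uint16_t x, uint16_t y)
{
    return (uint32_t)x * y;
}

/* Emulates (unsigned long)(x * y) on a target with 16-bit int:
   the product is truncated to 16 bits and only then zero-extended. */
uint32_t multiply_then_widen(uint16_t x, uint16_t y)
{
    uint16_t truncated = (uint16_t)((uint32_t)x * y);
    return (uint32_t)truncated;
}
```

With x = y = 300, the first function returns 90000, while the second returns 24464 (90000 modulo 65536).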

-- Anders Lindgren, IAR Systems, Author of the IAR compiler for MSP430

0 Jens-Michael Gross over 11 years ago in reply to Anders Lindgren

Guru 227245 points

Well, the C language defines that operands have to be extended to the largest type of any one operand before an operation, and the result of an operation is of the same type as (truncated to) this larger type.
But in the given case, the compiler neither extends the parameters nor gives a result of the same type as the operands actually used.
To get a 32 bit result from two 16 bit operands, you need to do a typecast to 32 bit on the operands. That’s what you need to do in the source code and what the compiler should compile (the ‘naïve way’, as you called it). However, the IAR compiler does not perform the typecast, which is smart because it is superfluous with the hardware multiplier. One could call it ‘optimization’ (which it indeed is).
Surely there is nothing wrong with what the compiler does. I never intended to say that. It is just not strictly following the specified procedure (and rightfully so, as the specification doesn’t know of this special case). That’s why I said ‘strictly speaking’. It is of course no bug, as the result is the expected one, which is the only thing that counts.
