How to handle 16-bit x 16-bit multiplication on the C5515?

Minzhen Ren

I got a C5515 eZdsp USB stick a couple of days ago, and I'm trying to implement a simple 32-tap FIR filter on the chip to process the input audio coming from the codec. The data from the codec is 16-bit, and I defined my filter coefficients as 16-bit (Q15). The C5515 is a 16-bit fixed-point processor, so how does it handle 16-bit x 16-bit multiplication? I know that the C5515 has a 17-bit x 17-bit multiplier and that the result is 40-bit, but how can I write C code that shifts the accumulated result by the correct amount to get the right product? (I want the product to be 16-bit as well.)

Any help will be highly appreciated!

over 14 years ago

0 Michal Szymanski over 14 years ago

Expert 1215 points

I prefer two solutions. If every clock cycle matters, I call an assembly function. Otherwise I use a very simple C procedure:

1. There are two variables of type Int16.

2. Cast one of them to Int32.

3. Multiply and shift 15 bits right.

4. Store the least significant 16 bits as the result.

You can pack all these steps into an inline function or a macro (the fastest way).
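The four steps above can be sketched as an inline function. This is only a minimal sketch, not code from this thread: the name q15_mul is hypothetical, and int16_t/int32_t from <stdint.h> stand in for the Int16/Int32 types mentioned (assumed to be 16 and 32 bits wide). It also assumes that right-shifting a negative value is an arithmetic shift, which the C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15 multiply following the four steps listed above.
 * q15_mul is a hypothetical name; int16_t/int32_t are assumed stand-ins
 * for Int16/Int32. */
static inline int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * b;  /* step 2: widen one operand; multiply gives a Q30 value */
    product >>= 15;                    /* step 3: shift right by 15 (Q30 -> Q15) */
    return (int16_t)product;           /* step 4: keep the least significant 16 bits */
}
```

For example, q15_mul(16384, 16384), which is 0.5 x 0.5 in Q15, returns 8192 (0.25). The single overflow case, -1.0 x -1.0, is not saturated in this sketch.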

regards

0 Minzhen Ren over 14 years ago in reply to Michal Szymanski

Prodigy 165 points

Hi Michal:

Thanks for your fast response!

I just tried your suggestion and defined the macro as:

#define MULT(A, B) ( ( ( ( long int ) A ) * B ) >> 15 )

I checked the result with the debugger, and it still doesn't give me what I want.

For example:

y = MULT( 0xFF84, 0x16 );

gives me y = 0xFFFF.

But with long int multiplication, 0xFF84 x 0x0016 = 0x0015F558, and after shifting right by 15 I'm expecting y = 0x002B (decimal 43).

Could you point out my error here? I'm starting to suspect I'm missing something fundamental about how multiplication and shifting work on fixed-point processors.

0 Mark M over 14 years ago in reply to Minzhen Ren

TI__Mastermind 30120 points

Hi Minzhen,

The C55x DSP Library has optimized filter functions written in ASM: http://focus.ti.com/docs/toolsw/folders/print/sprc100.html

Check out the FIR example that is located under dsplib_2.40.00\EXAMPLES\FIR

The source code for the FIR filter is located under dsplib_2.40.00\55x_src\fir.asm

This routine uses the Multiply and Accumulate operation (MACM). See page 273 of the C55x CPU Instruction Set Reference Guide:

"The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy."

You can also find an example of the FIR filtering in the Audio Filter Demo: http://code.google.com/p/c5505-ezdsp/
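The multiply-and-accumulate pattern described above can be sketched in portable C to show where the single shift belongs. This is an illustrative sketch only, not the DSPLIB routine: fir_q15 is a hypothetical name, the int64_t accumulator is a host-side stand-in for the 40-bit accumulator, and the final saturation step is an added assumption rather than something stated in this thread.

```c
#include <stdint.h>
#include <stddef.h>

/* One output sample of an ntaps-tap FIR filter with Q15 data and coefficients.
 * samples[0] is assumed to be the newest input sample.
 * fir_q15 is a hypothetical name, not a DSPLIB function. */
static int16_t fir_q15(const int16_t *coeffs, const int16_t *samples, size_t ntaps)
{
    int64_t acc = 0;  /* stand-in for the 40-bit accumulator */

    for (size_t k = 0; k < ntaps; ++k) {
        /* 16 x 16 -> 32-bit product, added to the wide accumulator */
        acc += (int32_t)coeffs[k] * samples[k];
    }

    acc >>= 15;  /* Q30 -> Q15, done once after all taps are accumulated */

    /* Assumed behaviour: clamp to the 16-bit range instead of wrapping */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

Shifting once after the loop, rather than after each product, keeps the low-order bits of every product in the sum until the end.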

Hope this helps,
Mark

0 Minzhen Ren over 14 years ago in reply to Mark M

Prodigy 165 points

Thank you Mark!

I don't think I'm supposed to use any third-party code; however, it will definitely help me understand how to work better with the chip.

0 Jim Noxon over 14 years ago in reply to Minzhen Ren

TI__Genius 14940 points

The problem isn't with your macro; it's with how the compiler is interpreting it.

As you pointed out there is a 17x17 multiplier and a 40 bit accumulator. Combined with the barrel shifter, this combination of hardware provides very efficient 16 bit fixed point arithmetic. The problem is that the official methods defined in the C standard make taking advantage of this hardware a little difficult.

According to the C standard, if you multiply a 16 bit int by another 16 bit int you will get a 16 bit result because the C standard assumes all integral types represent numbers with the implied radix point just to the right of the LSB. So, your multiply of 0xFF84 by 0x16 would result in a value of 0xF558 and when promoted to a signed type and right shifted by 15 you get 0xFFFF. Again, this is expected by the C standard.

To effect what you want within the confines of the C standard, each value would need to be promoted to a 32 bit value as in

y = (int)( ( (long) a * (long) b ) >> 15 )

At first glance this is exactly what your macro does. However, again within the confines of the C standard, this code would require a subroutine call to a 32x32 multiply routine requiring a total of 4 multiplies and 3 additions followed by a right shift and then truncation of the value into a 16 bit int again.
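To illustrate the cost mentioned above, here is a rough sketch of a 32x32 multiply built from four 16x16 partial products and three additions. It is not the compiler's actual runtime routine; the name mul32x32_from_16 is hypothetical, the operands are treated as unsigned for simplicity, and uint64_t is used on the host only to hold the full result.

```c
#include <stdint.h>

/* 32 x 32 -> 64-bit unsigned multiply from four 16 x 16 partial products.
 * Hypothetical illustration of the "4 multiplies and 3 additions" cost. */
static uint64_t mul32x32_from_16(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint64_t lo_lo = (uint64_t)a_lo * b_lo;  /* multiply 1: weight 2^0  */
    uint64_t lo_hi = (uint64_t)a_lo * b_hi;  /* multiply 2: weight 2^16 */
    uint64_t hi_lo = (uint64_t)a_hi * b_lo;  /* multiply 3: weight 2^16 */
    uint64_t hi_hi = (uint64_t)a_hi * b_hi;  /* multiply 4: weight 2^32 */

    /* three additions combine the weighted partial products */
    return lo_lo + ((lo_hi + hi_lo) << 16) + (hi_hi << 32);
}
```

When both operands really are 16-bit values, three of the four partial products are redundant, which is why a single hardware multiply is enough.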

Since the writers of the code gen tools knew that this was expected from the C5xxx processors, they created a special shortcut to accomplish what you want to do without having to call the 32x32 multiplication subroutine. Thus if you write code that appears to the compiler as

(int)( ( (long) a * (long) b ) >> n )

then it implements a single multiply instruction and shift. It gives you the intended result when applied properly but, as you have seen, it can fail under certain circumstances.

What is happening in your code is that the compiler recognizes it as a request to perform this optimization automatically. The value 0xFF84 would normally be interpreted as an unsigned int (because it is a hexadecimal constant that does not fit in a 16-bit signed int), but it is actually being interpreted as a signed int of -124, because the cast to signed long is not actually performed. The upper 16 bits are therefore effectively interpreted as 0xFFFF instead of 0x0000. Your 40-bit result ends up being 0xFF_FFFF_F558, which yields 0xFF_FFFF_FFFF when right shifted by 15 and leaves 0xFFFF to be assigned once truncated.
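The two interpretations of the same bit pattern can be reproduced on any host compiler. The sketch below is an illustration under stated assumptions, not the compiler's internal behaviour: the function names are hypothetical, fixed-width types from <stdint.h> replace the 16-bit int and 32-bit long of the target, and an arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Treat both 16-bit patterns as signed: 0xFF84 is sign-extended to -124. */
static uint16_t mul_shift_signed(uint16_t a_bits, uint16_t b_bits, unsigned n)
{
    int32_t a = (int16_t)a_bits;      /* upper 16 bits become 0xFFFF for 0xFF84 */
    int32_t b = (int16_t)b_bits;
    return (uint16_t)((a * b) >> n);  /* keep the low 16 bits of the shifted product */
}

/* Treat both 16-bit patterns as unsigned: 0xFF84 is zero-extended to 65412. */
static uint16_t mul_shift_unsigned(uint16_t a_bits, uint16_t b_bits, unsigned n)
{
    uint32_t a = a_bits;              /* upper 16 bits stay 0x0000 */
    uint32_t b = b_bits;
    return (uint16_t)((a * b) >> n);
}
```

With the values from this thread, mul_shift_signed(0xFF84, 0x16, 15) gives 0xFFFF, while mul_shift_unsigned(0xFF84, 0x16, 15) gives 0x002B (decimal 43).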

There is a solution for this, however: you merely need to recognize the limitation of the compiler's shortcut. The shortcut can work for either signed operands or unsigned operands, and you need to make sure you use the one you need. Thus you should have two macros defined as

#define SMULT( a, b, n ) ( (int)( ( (long)(int)(a) * (long)(int)(b) ) >> (n) ) )

#define UMULT( a, b, n ) ( (unsigned int)( ( (unsigned long)(unsigned int)(a) * (unsigned long)(unsigned int)(b) ) >> (n) ) )

Now if you call the UMULT macro as UMULT( 0xFF84, 0x16, 15 ) you will get the answer you are looking for.

The additional casts may seem like overkill, but it is important to tell the compiler how to interpret each operand before casting it to the long form.

Also, if the special format gets too complicated, either because equations are passed to the macro or because the macro is embedded in a larger equation, the compiler may not be able to identify the syntax and apply the optimization. Thus it is suggested that you always use the above macros by themselves, with single variables or constants passed, and assign the result to a variable. This will help to ensure the compiler produces your expected results.
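The usage advice above might look like the following sketch. The helper names scale_sample and weighted_sum2 are hypothetical, and the macro is restated with its arguments parenthesized. Whether the compiler actually emits a single multiply for this pattern is something to confirm in the generated assembly, not something this sketch can guarantee.

```c
/* Signed multiply-and-shift macro in the form discussed above */
#define SMULT( a, b, n ) ( (int)( ( (long)(int)(a) * (long)(int)(b) ) >> (n) ) )

/* Scale one sample by a Q15 gain: the macro is used alone, with plain
 * variables as arguments, and its result is assigned to a variable. */
static int scale_sample(int sample, int gain_q15)
{
    int y;
    y = SMULT(sample, gain_q15, 15);
    return y;
}

/* Combine two products in a separate statement instead of embedding the
 * macro inside a larger expression. */
static long weighted_sum2(int x0, int c0, int x1, int c1)
{
    int p0;
    int p1;
    p0 = SMULT(x0, c0, 15);
    p1 = SMULT(x1, c1, 15);
    return (long)p0 + p1;
}
```

Keeping each macro call on its own line costs a temporary variable but leaves the expression in the simple shape described above.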

Jim Noxon

0 Minzhen Ren over 14 years ago in reply to Jim Noxon

Prodigy 165 points

Thank you so much, Jim! You explained everything I was looking for! I'm about to try this out!
