How to handle 16-bit x 16-bit multiplication on the C5515?

I got a C5515 eZdsp USB stick a couple of days ago, and I'm trying to implement a simple 32-tap FIR filter on the chip to process the input audio coming from the codec. The data from the codec is 16-bit, and I defined my filter coefficients as 16-bit (Q15). The issue I'm running into is that the C5515 is a 16-bit fixed-point processor, so how does it handle 16-bit x 16-bit multiplication? I know that the C5515 has a 17-bit x 17-bit multiplier and that the result is 40-bit, but how can I write C code that shifts the accumulated result by the correct amount to get the right product? (I want the product to be 16-bit as well.)

Any help will be highly appreciated! 

 

  • I use one of two solutions. If every clock cycle matters, I call an assembler function. Otherwise I use a very simple C procedure:

    1. Start with two variables of type Int16.

    2. Cast one of them to Int32.

    3. Multiply, then shift the result right by 15 bits.

    4. Store the least significant 16 bits as the result.

    You can pack all these steps into an inline function or a macro (the fastest way).
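
The four steps above can be sketched as an inline function. This is a minimal sketch, not code from the post: the name `q15_mul` is hypothetical, and `int16_t`/`int32_t` from `<stdint.h>` stand in for the `Int16`/`Int32` types mentioned. It assumes the compiler uses an arithmetic right shift for negative values, and it does not saturate.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15 multiply following the four steps listed above.
   q15_mul is a hypothetical name; int16_t/int32_t stand in for Int16/Int32. */
static inline int16_t q15_mul(int16_t a, int16_t b)
{
    /* Step 2 and first half of step 3: widen one operand, then multiply.
       The product is a Q30 value held in 32 bits. */
    int32_t product = (int32_t)a * b;

    /* Second half of step 3 and step 4: shift right by 15 to return to Q15,
       then keep the low 16 bits. Assumes an arithmetic shift for negative
       products; there is no rounding and no saturation. */
    return (int16_t)(product >> 15);
}
```

For example, `q15_mul(0x4000, 0x4000)` multiplies 0.5 by 0.5 and returns `0x2000`, which is 0.25 in Q15.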

     

    regards

    MS

  • Hi Michal:

    Thanks for your fast response!

    I just tried based on your suggestions. I defined the Macro as:

    #define MULT(A, B)  ( ( ( ( long int ) A ) * B ) >> 15 )

    I checked the result with the debugger, and it still doesn't give me what I want.

     

    For example:

    y = MULT( 0xFF84, 0x16 );

    gives me y = 0xFFFF.

    But with long int multiplication, 0xFF84 x 0x0016 = 0x0015F558, and shifting that right by 15 gives 43 decimal, so I'm expecting y = 0x002B.

    Could you point out my error here? I'm starting to suspect I'm missing something fundamental about how multiplication and shifting work on fixed-point processors.

  • Hi Minzhen,

    The C55x DSP Library has optimized filter functions written in ASM: http://focus.ti.com/docs/toolsw/folders/print/sprc100.html

    Check out the FIR example that is located under dsplib_2.40.00\EXAMPLES\FIR

    The source code for the FIR filter is located under dsplib_2.40.00\55x_src\fir.asm

    This routine uses the Multiply and Accumulate operation (MACM). See page 273 of the C55x CPU Instruction Set Reference Guide:

    "The 32-bit result of the multiplication is sign extended to 40 bits and added to the source accumulator ACy."

    You can also find an example of the FIR filtering in the Audio Filter Demo: http://code.google.com/p/c5505-ezdsp/
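
As a rough illustration of the multiply-and-accumulate pattern quoted above, one output sample of a Q15 FIR filter can be sketched in plain C. This is not the DSPLIB `fir.asm` routine: the function name `fir_q15_sample`, the `NUM_TAPS` constant, and the use of `int64_t` as a stand-in for the 40-bit accumulator are all assumptions made for the sketch.

```c
#include <stdint.h>

#define NUM_TAPS 32  /* 32 taps, matching the filter size in the question */

/* Compute one output sample of a Q15 FIR filter (hypothetical helper).
   coeffs[k] and history[k] are Q15 values; history[0] is the newest sample.
   int64_t stands in for the 40-bit accumulator, which is an assumption
   made so the sketch runs in ordinary host C. */
static int16_t fir_q15_sample(const int16_t *coeffs, const int16_t *history)
{
    int64_t acc = 0;
    int k;

    for (k = 0; k < NUM_TAPS; k++) {
        /* 16 x 16 -> 32-bit product, sign extended and added to the accumulator */
        acc += (int32_t)coeffs[k] * history[k];
    }

    /* The sum is in Q30; shift right by 15 to return to Q15.
       No rounding or saturation is applied here. */
    return (int16_t)(acc >> 15);
}
```

Keeping the accumulator wider than 32 bits is the point of the design: the 32 products can be summed without intermediate overflow, and the shift happens once at the end.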

    Hope this helps,
    Mark

  • Thank you Mark!

    I don't think I'm supposed to use any third-party code; however, it will definitely help me understand how to work better with the chip.

     

  • The problem isn't with your macro; it's with how the compiler interprets it.

    As you pointed out, there is a 17x17 multiplier and a 40-bit accumulator.  Combined with the barrel shifter, this hardware provides very efficient 16-bit fixed-point arithmetic.  The problem is that the official methods defined in the C standard make taking advantage of this hardware a little difficult.

    According to the C standard, if you multiply a 16-bit int by another 16-bit int you get a 16-bit result, because the C standard assumes all integral types represent numbers with the implied radix point just to the right of the LSB.  So your multiply of 0xFF84 by 0x16 would result in a value of 0xF558, and when that is promoted to a signed type and right shifted by 15 you get 0xFFFF.  Again, this is what the C standard expects.

    To effect what you want within the confines of the C standard, each value would need to be promoted to a 32 bit value as in

    y = (int)( ( (long) a * (long) b ) >> 15 )

    At first glance this is exactly what your macro does.  However, again within the confines of the C standard, this code would require a subroutine call to a 32x32 multiply routine requiring a total of 4 multiplies and 3 additions followed by a right shift and then truncation of the value into a 16 bit int again.
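
To see where four multiplies and three additions can come from, here is a sketch that builds a 32 x 32 multiply out of 16 x 16 partial products. It is illustrative only and is not the compiler's actual runtime routine: the name `mul32x32_from_16` is hypothetical, the operands are treated as unsigned, and the full 64-bit product is returned for clarity.

```c
#include <stdint.h>

/* Unsigned 32 x 32 -> 64-bit multiply built from 16 x 16 partial products.
   Hypothetical helper for illustration; not the actual runtime library code. */
static uint64_t mul32x32_from_16(uint32_t a, uint32_t b)
{
    /* Split each operand into 16-bit halves. */
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFFu;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFFu;

    uint64_t lo_lo = (uint64_t)a_lo * b_lo;  /* multiply 1, weight 2^0  */
    uint64_t lo_hi = (uint64_t)a_lo * b_hi;  /* multiply 2, weight 2^16 */
    uint64_t hi_lo = (uint64_t)a_hi * b_lo;  /* multiply 3, weight 2^16 */
    uint64_t hi_hi = (uint64_t)a_hi * b_hi;  /* multiply 4, weight 2^32 */

    /* Three additions combine the weighted partial products. */
    return lo_lo + ((lo_hi + hi_lo) << 16) + (hi_hi << 32);
}
```

When both operands are really 16-bit values, the upper halves are only sign or zero extension, so three of the four partial products carry no information. That is the work a single hardware multiply avoids.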

    Since the writers of the code gen tools knew that this was expected from the C5xxx processors, they created a special shortcut to accomplish what you want to do without having to call the 32x32 multiplication subroutine.  Thus if you write code that appears to the compiler as

    (int)( ( (long) a * (long) b ) >> n )

    then it implements a single multiply instruction and shift.  It gives you the intended result when applied properly but, as you have seen, it can fail under certain circumstances.

    What is happening in your code is that the compiler recognizes it as a request to perform this optimization automatically.  Thus the value 0xFF84, which would normally be interpreted as an unsigned int (because it is a hexadecimal constant that does not fit in a 16-bit signed int), is actually interpreted as a signed int of -124: the cast to signed long is never actually performed, so the upper 16 bits are effectively taken as 0xFFFF instead of 0x0000.  Your 40-bit result therefore ends up being 0xFF_FFFF_F558, which when right shifted by 15 yields 0xFF_FFFF_FFFF and when truncated leaves 0xFFFF to be assigned.
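
The two readings of the bit pattern 0xFF84 can be checked on a host machine. The sketch below is only a model of the arithmetic, not of the compiler: the helper names are hypothetical, `int64_t` stands in for the 40-bit accumulator, `ACC40` masks a value to 40 bits for display, and an arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Mask a value to 40 bits, to mimic how a 40-bit accumulator is displayed. */
#define ACC40(x) ((uint64_t)(x) & 0xFFFFFFFFFFull)

/* Read a 16-bit pattern as an unsigned value (0xFF84 -> 65412). */
static int32_t as_unsigned16(uint16_t bits)
{
    return (int32_t)bits;
}

/* Read the same 16-bit pattern as a two's-complement value (0xFF84 -> -124). */
static int32_t as_signed16(uint16_t bits)
{
    return (bits & 0x8000u) ? (int32_t)bits - 0x10000 : (int32_t)bits;
}

/* Multiply in a wide type, then shift right by n.
   Assumes an arithmetic shift when the product is negative. */
static int64_t mul_shift(int32_t a, int32_t b, int n)
{
    return ((int64_t)a * b) >> n;
}
```

With these helpers, `mul_shift(as_unsigned16(0xFF84), 0x16, 15)` is 43 (0x2B), while `ACC40(mul_shift(as_signed16(0xFF84), 0x16, 0))` is 0xFF_FFFF_F558 and the same product shifted right by 15 is 0xFF_FFFF_FFFF, whose low 16 bits are 0xFFFF.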

    There is a solution for this, however; you merely need to recognize the limitation of the compiler's implementation of the shortcut.  The shortcut can work for either signed operands or unsigned operands, and you need to make sure you use the one you need.  Thus you should have two macros defined as

    #define SMULT( a, b, n ) ( (int)( ( (long)(int)(a) * (long)(int)(b) ) >> (n) ) )

    #define UMULT( a, b, n ) ( (unsigned int)( ( (unsigned long)(unsigned int)(a) * (unsigned long)(unsigned int)(b) ) >> (n) ) )

    Now if you call the UMULT macro as UMULT( 0xFF84, 0x16, 15 ) you will get the answer you are looking for.

    The additional casts may seem like overkill, but it is important to tell the compiler how to interpret each operand before casting it to the long form.

    Also, if the special format gets too complicated, either because you pass equations to the macro or because you embed the macro in a larger equation, the compiler may not be able to identify the syntax and apply the optimization.  Thus it is suggested that you always use the above macros by themselves, with single variables or constants passed, and assign the result to a variable.  This will help to ensure the compiler produces the results you expect.
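
For trying the signed and unsigned variants off-target, here is a host-side sketch of the same two macros. The names `SMULT_HOST` and `UMULT_HOST` are hypothetical, and the fixed-width types from `<stdint.h>` replace `int` and `long`, which are 16 and 32 bits wide on the C55x but may be wider on a PC. The sketch models only the arithmetic; it says nothing about whether a given compiler emits a single multiply instruction.

```c
#include <stdint.h>

/* Host-side analogues of the SMULT/UMULT macros above (hypothetical names).
   int16_t/int32_t replace int/long so the widths match the C55x on any host.
   The signed version assumes an arithmetic right shift for negative products. */
#define SMULT_HOST(a, b, n) \
    ((int16_t)(((int32_t)(int16_t)(a) * (int32_t)(int16_t)(b)) >> (n)))

#define UMULT_HOST(a, b, n) \
    ((uint16_t)(((uint32_t)(uint16_t)(a) * (uint32_t)(uint16_t)(b)) >> (n)))
```

Following the advice above, each macro is best used on its own with the result assigned to a variable, for example `int16_t y = SMULT_HOST(coef, sample, 15);`, where `coef` and `sample` are placeholder names. `UMULT_HOST(0xFF84, 0x16, 15)` evaluates to 43, and `SMULT_HOST(-124, 22, 15)` evaluates to -1.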

    Jim Noxon

  • Thank you so much, Jim! You explained everything I was looking for. I'm about to try this out!