16 bit convolution

Remco Poelstra

Hi,

I have an application where I need to do four convolutions at the same time, but they are correlated: I have two inputs (I1, I2), two sets of coefficients (H1, H2), and two outputs (O1 = I1*H1 + I2*H1, O2 = I1*H2 + I2*H2).

I have a C6748 LCDK to get started with this application. Although it supports floating point, I think integer math would be a better option, as I have 16-bit inputs and 16-bit outputs.

The DSPLIB for the C6748 only includes floating-point functions, and the DSPLIB with integer functions does not seem to be optimized for this processor. So it seems I have to write my own convolution function. I want to use a delay line and calculate one output sample for each input sample. Given that the number of coefficients can be more than 40,000, it needs to be well optimized.
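To make the delay-line idea concrete, here is a minimal portable C sketch of the per-sample computation, not an optimized C674x routine. The names `conv_state_t`, `conv_step`, `sat16`, the tiny `NTAPS`, and the `shift` scaling parameter are all assumptions made for illustration; the output formulas are taken literally from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NTAPS 4 /* illustrative only; the real filter may have > 40,000 taps */

/* Hypothetical state: one circular delay line per input. */
typedef struct {
    int16_t d1[NTAPS];
    int16_t d2[NTAPS];
    size_t  head; /* index of the newest sample */
} conv_state_t;

/* Saturate a wide accumulator to the 16-bit output range. */
static int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Push one sample per input, produce one sample per output.
 * O1 = I1*H1 + I2*H1 and O2 = I1*H2 + I2*H2, as written in the post.
 * 'shift' is an assumed fixed-point scaling; an arithmetic right shift
 * of negative values is assumed (true for common compilers). */
static void conv_step(conv_state_t *s,
                      const int16_t *h1, const int16_t *h2,
                      int16_t x1, int16_t x2, int shift,
                      int16_t *o1, int16_t *o2)
{
    assert(shift >= 0 && shift < 63);

    /* Step the head backwards so that head + k is the sample k steps old. */
    s->head = (s->head + NTAPS - 1) % NTAPS;
    s->d1[s->head] = x1;
    s->d2[s->head] = x2;

    int64_t acc1 = 0, acc2 = 0; /* wide accumulators avoid overflow */
    size_t idx = s->head;
    for (size_t k = 0; k < NTAPS; k++) {
        int64_t a = s->d1[idx];
        int64_t b = s->d2[idx];
        acc1 += a * h1[k] + b * h1[k];
        acc2 += a * h2[k] + b * h2[k];
        idx = (idx + 1 == NTAPS) ? 0 : idx + 1;
    }
    *o1 = sat16(acc1 >> shift);
    *o2 = sat16(acc2 >> shift);
}
```

Both inputs share one loop here, so each delay-line sample is loaded once and used for both outputs. With tens of thousands of taps the 64-bit accumulators are a conservative choice; whether 32 or 40 bits would suffice depends on the coefficient scaling.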

I've been studying some documentation and I've some questions:

The floating point functions in DSPLIB seem to try to do 4 multiplies each cycle, but the core can only do 2. Why is the loop then unrolled for 4 multiplies?

The documentation mentions that the core can normally do four 16-bit multiplies each cycle (2 units x 2 16-bit multiplies), but it seems that the DDOTP4 instruction matches the task at hand perfectly, if I order my data correctly. That would allow me to do 8 multiply/add operations each cycle. Am I correct?
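For reference, the "2 units x 2 multiplies" arrangement can be modelled in portable C as below. This is only a sketch of what a generic packed 16-bit dot product computes; it does not model DDOTP4 specifically, and `pack2`, `dot2_emulated` and `dot_packed` are hypothetical helper names, not TI intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack two signed 16-bit values into one 32-bit word (hi in bits 31..16). */
static uint32_t pack2(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Portable stand-in for a packed 16-bit dot product:
 * hi(a)*hi(b) + lo(a)*lo(b), with signed halfwords.
 * Returned as 64 bits so the corner case -32768 * -32768 * 2 fits. */
static int64_t dot2_emulated(uint32_t a, uint32_t b)
{
    int16_t a_hi = (int16_t)(uint16_t)(a >> 16);
    int16_t a_lo = (int16_t)(uint16_t)(a & 0xFFFFu);
    int16_t b_hi = (int16_t)(uint16_t)(b >> 16);
    int16_t b_lo = (int16_t)(uint16_t)(b & 0xFFFFu);
    return (int64_t)a_hi * b_hi + (int64_t)a_lo * b_lo;
}

/* Dot product over packed words. Two independent accumulators stand in
 * for two multiplier units: four 16-bit multiplies per loop iteration. */
static int64_t dot_packed(const uint32_t *x, const uint32_t *h, size_t nwords)
{
    assert(nwords % 2 == 0); /* sketch handles an even word count only */
    int64_t acc0 = 0, acc1 = 0;
    for (size_t i = 0; i < nwords; i += 2) {
        acc0 += dot2_emulated(x[i], h[i]);
        acc1 += dot2_emulated(x[i + 1], h[i + 1]);
    }
    return acc0 + acc1;
}
```

Keeping the two accumulators independent is what lets the two partial sums proceed in parallel; they are only combined once, after the loop.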

Is it possible to write such functions in C, or would I need assembly code to get it really optimized? Would it be possible to replace all floating-point data types in the DSPLIB SP function with shorts and reuse the existing optimizations there?

Thanks in advance.

Kind regards,

Remco Poelstra

over 12 years ago

Rahul Prabhu, over 12 years ago

Remco,

Please refer to the 16-bit convolution implementations provided in the C64x+ IMGLIB:

http://software-dl.ti.com/dsps/dsps_public_sw/c6000/web/c64p_imglib/latest/index_FDS.html

The C64x+ code is compatible with the C674x DSP core on the LCDK. Similarly, all integer digital signal processing functions are part of the C64x+ DSPLIB, so you may also want to take a look at that library.

Regards,

Rahul