Using LDW instead of LDDW for "dot product" software pipelining using floating point 6713

Hi,

I am trying to optimize an FIR filter function whose inner loop is a dot product. According to Rulph Chassaing's "DSP and app with c6713 and c6416" and documents such as TI's "spru198k", the floating-point dot product is implemented using LDDW, so that four floating-point numbers can be loaded in a single cycle (two per LDDW, with two LDDW instructions issued in parallel).

- One question about this: does the dot product with LDDW work for arrays that have an odd number of elements?
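
To make the odd-length question concrete, here is a minimal plain-C sketch of a dot product that consumes two elements per iteration, the way a doubleword load would, and then handles a leftover element separately. The function name `dotp_pairs` and the two running sums are my own illustration, not taken from the book or from spru198k, and the sketch says nothing about how the TI reference code treats odd lengths.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product that consumes two elements per iteration (as a paired
 * 64-bit load would) and handles one leftover element when n is odd.
 * dotp_pairs is a hypothetical name used only for this sketch. */
float dotp_pairs(const float *a, const float *b, size_t n)
{
    float sum_even = 0.0f;  /* products at even indices */
    float sum_odd = 0.0f;   /* products at odd indices */
    size_t i;

    assert(a != NULL && b != NULL);

    /* Main loop: only runs while a full pair is available. */
    for (i = 0; i + 1 < n; i += 2) {
        sum_even += a[i] * b[i];
        sum_odd += a[i + 1] * b[i + 1];
    }

    /* Tail: n is odd, so exactly one element is left. */
    if (i < n) {
        sum_even += a[i] * b[i];
    }

    return sum_even + sum_odd;
}
```

Another option, if the buffers can be over-allocated, would be to pad both arrays with one extra zero so that the paired loop covers everything and no tail step is needed.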

Also, when I use this in the FIR filter, every output value is repeated twice. I think this is because LDDW loads two floating-point numbers at a time and the load address has to be aligned to an 8-byte boundary. When the pointer is advanced by one element (4 bytes) to the next array element, the alignment causes the same two numbers to be loaded again, so the dot product gives the same result. The next figure explains this:

- Please advise whether this is correct.
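
The behaviour I suspect can be modelled in plain C as follows. This is only a sketch of my assumption that the load address is forced down to an 8-byte boundary; it is not a statement of what the C6713 hardware is documented to do. The name `load_pair_model` is hypothetical, and the model assumes that `base` itself is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a doubleword load: the element index is rounded
 * down to an even value, mimicking an address forced to an 8-byte
 * boundary. Assumes base itself is 8-byte aligned. */
void load_pair_model(const float *base, size_t idx, float out[2])
{
    size_t aligned = idx & ~(size_t)1;  /* drop the low index bit */

    assert(base != NULL && out != NULL);

    out[0] = base[aligned];
    out[1] = base[aligned + 1];
}
```

Under this model, asking for the pair at element 2 and the pair at element 3 returns the same two values, which would explain why consecutive filter outputs come out identical.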

Also, can I use LDW for a floating-point dot product with software pipelining? I have tried this, but something is wrong with the accumulator. When I accumulate the multiplication result in register A7 (which I use as the accumulator), I don't get the correct result. However, when I add the multiplication result to register A9 (which holds a constant zero, i.e. I am not accumulating into it, only testing), the result written to A9 is correct.
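
One possible explanation, which I have not confirmed against my code, is that the floating-point add takes several cycles to deliver its result. If a new add is issued every cycle, each add would read the accumulator value written a few adds earlier, so the register would effectively carry several interleaved running sums that still have to be combined after the loop. The C sketch below models that idea. The latency of 4 and the name `dotp_interleaved` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ADD_LATENCY 4  /* assumed cycles before an add result is readable */

/* Models a pipelined accumulator: with one add issued per cycle, the
 * single accumulator register behaves like ADD_LATENCY separate sums. */
float dotp_interleaved(const float *a, const float *b, size_t n)
{
    float partial[ADD_LATENCY] = {0.0f, 0.0f, 0.0f, 0.0f};
    float sum = 0.0f;
    size_t i;

    assert(a != NULL && b != NULL);

    /* Product i lands in running sum number i % ADD_LATENCY. */
    for (i = 0; i < n; i++) {
        partial[i % ADD_LATENCY] += a[i] * b[i];
    }

    /* After the loop: combine the running sums into the final result. */
    for (i = 0; i < ADD_LATENCY; i++) {
        sum += partial[i];
    }

    return sum;
}
```

If this is what is happening, reading A7 right after the loop would give only one of the running sums, whereas adding each product to a constant zero would always look correct.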

- Attached is my software-pipelined code for the dot product using LDW. Can you please check what is wrong with it? 5873.dotp_ldw.rar