Hi,
I have code that requires around (1E+6)*(1E+6) floating-point
multiplications and the same number of floating-point additions, because
I am computing an autocorrelation over 1E+6 samples by circularly
shifting the data.
When I run my code on the ARM core of the DM3730 (since I don't
know how to use the DSP core), about 50E+6 multiplications and
additions take more than 5 minutes, which is far too slow for my
requirement. My BB runs Angstrom, and I am not able to use the hard FPU.
Can somebody suggest how to speed up the computation on the BB so
that I can build a feasible system?
A snippet of my code follows. Please help, I am stuck.
The maximum value of k is 1000000.
for (i = 1; i <= k; i++)
{
    sum = 0;
    for (j = 1; j <= k; j++)
    {
        sum = sum + (*(prdc_pulse_out_store + j)) * (*(prdc_pulse_out + j));
    }
    *(Rxy + i) = (sum / k);
    // circular shifting
    temp = *(prdc_pulse_out + k);
    for (j = k; j >= 2; j--)
    {
        *(prdc_pulse_out + j) = *(prdc_pulse_out + j - 1);
    }
    *(prdc_pulse_out + 1) = temp;
}
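
For reference, here is a minimal sketch of what the loop above computes, written so that the array is never physically shifted: the shift is folded into the index instead. It assumes 0-based float arrays (the snippet above is 1-based), and the function name circ_corr and its parameters x, y, r, n are hypothetical, not taken from my real code. It removes the k element moves per lag but is still O(k*k) multiply-adds, so it is only a partial speed-up, not a full answer.

```c
#include <stddef.h>

/*
 * Circular correlation without moving data (illustrative sketch).
 * Computes r[s] = (1/n) * sum over j of x[j] * y[(j - s) mod n],
 * which matches the shift-right-by-one-per-lag loop in the snippet
 * when x plays the role of prdc_pulse_out_store and y of prdc_pulse_out.
 */
void circ_corr(const float *x, const float *y, float *r, size_t n)
{
    for (size_t s = 0; s < n; s++) {
        float sum = 0.0f;
        size_t j = 0;

        /* j < s: the shifted index wraps around to the end of y */
        for (; j < s; j++)
            sum += x[j] * y[j + n - s];

        /* j >= s: no wrap needed */
        for (; j < n; j++)
            sum += x[j] * y[j - s];

        r[s] = sum / (float)n;
    }
}
```

Calling circ_corr(store, out, Rxy, k) with 0-based buffers of length k would fill Rxy[0] to Rxy[k-1] with the values the original loop writes to Rxy[1] to Rxy[k]. Splitting the inner loop in two avoids a modulo operation per sample.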