ARM vs DSP performance in DM37x

 

Dear all,

Using the BSP 2.2 from LogicPD, we've executed the c6run example named cfft, with the following results:

DM-37x# ./cfft_arm
N=16,nTimes=100: 0.001007 s
N=32,nTimes=100: 0.002045 s
N=64,nTimes=100: 0.004791 s
N=128,nTimes=100: 0.010772 s
N=256,nTimes=100: 0.024628 s
N=512,nTimes=100: 0.055725 s
N=1024,nTimes=100: 0.124604 s
N=2048,nTimes=100: 0.272583 s
N=4096,nTimes=100: 0.596222 s
N=8192,nTimes=100: 1.29132 s
N=16384,nTimes=100: 2.74329 s
DM-37x# ./cfft_dsp
N=16,nTimes=100: 0.126648 s
N=32,nTimes=100: 0.14206 s
N=64,nTimes=100: 0.177978 s
N=128,nTimes=100: 0.260376 s
N=256,nTimes=100: 0.451263 s
N=512,nTimes=100: 0.872955 s
N=1024,nTimes=100: 1.81073 s
N=2048,nTimes=100: 3.87048 s
N=4096,nTimes=100: 8.3595 s
N=8192,nTimes=100: 18.1759 s
N=16384,nTimes=100: 39.5378 s

We are surprised to see that the ARM (with no NEON acceleration) is faster than the DSP at every size tested, by a factor of roughly 14 to 126. We know that the two processors run at different clock speeds, but we had expected the numerical performance of the DSP to be higher than that of the ARM. Has anyone been able to reproduce this behavior? Could it be caused by a problem in the software configuration?
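To make the gap easier to read, the two runs can be compared size by size. The following is only a minimal sketch in Python that reuses the timings printed above; the names `arm`, `dsp`, `slowdown` and `per_call_ms` are ours and are not part of c6run or the cfft example.

```python
# Timings copied from the console output above:
# total seconds for nTimes=100 transforms at each FFT size N.
N_TIMES = 100

arm = {
    16: 0.001007, 32: 0.002045, 64: 0.004791, 128: 0.010772,
    256: 0.024628, 512: 0.055725, 1024: 0.124604, 2048: 0.272583,
    4096: 0.596222, 8192: 1.29132, 16384: 2.74329,
}

dsp = {
    16: 0.126648, 32: 0.14206, 64: 0.177978, 128: 0.260376,
    256: 0.451263, 512: 0.872955, 1024: 1.81073, 2048: 3.87048,
    4096: 8.3595, 8192: 18.1759, 16384: 39.5378,
}


def slowdown(n):
    """How many times slower the DSP run is than the ARM run for size n."""
    return dsp[n] / arm[n]


def per_call_ms(total_seconds, n_times=N_TIMES):
    """Average time of a single transform, in milliseconds."""
    return total_seconds / n_times * 1000.0


if __name__ == "__main__":
    for n in sorted(arm):
        print(
            f"N={n:5d}  "
            f"ARM {per_call_ms(arm[n]):9.4f} ms/call  "
            f"DSP {per_call_ms(dsp[n]):9.4f} ms/call  "
            f"DSP/ARM {slowdown(n):6.1f}x"
        )
```

The ratio is largest at the smallest size and levels off at the larger sizes, and even at N=16 each DSP call averages more than a millisecond. That could point to a fixed cost per call on the DSP side, but this is only our guess from the numbers and we have not verified it.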

c6run version: 0.98.03.03

dsplink: 1.65.01.05

 

Thanks and Best Regards,

Joaquim Duran