Hi,
For my project, I am trying to use DSPLIB functions on the C6678. I have gone through the programmer's reference guide and calculated the expected execution time of the functions. The measured execution time is almost 3 times the theoretical figure. Can someone kindly let me know where the problem is? Optimization is set to level 3, and I am running the code only on Core 0.
The following is the piece of code that uses DSPLIB:
All arrays are single-precision floating-point arrays of 117912 elements. tempa is a temporary array. The CSL_tscRead function is used to read the time-stamp counter, which counts CPU cycles.
t1=CSL_tscRead();
/* Element-wise squares: usq = u*u, vsq = v*v, wsq = w*w */
DSPF_sp_vecmul(u,u,usq,117912);
DSPF_sp_vecmul(v,v,vsq,117912);
DSPF_sp_vecmul(w,w,wsq,117912);
/* vel_sq = ones - 1.5*(usq + vsq + wsq), accumulated through tempa */
DSPF_sp_w_vec(usq,ones,-1.5,tempa,117912);
DSPF_sp_w_vec(vsq,tempa,-1.5,tempa,117912);
DSPF_sp_w_vec(wsq,tempa,-1.5,vel_sq,117912);
/* usq = 4.5*usq + vel_sq, and likewise for vsq and wsq */
DSPF_sp_w_vec(usq,vel_sq,4.5,usq,117912);
DSPF_sp_w_vec(vsq,vel_sq,4.5,vsq,117912);
DSPF_sp_w_vec(wsq,vel_sq,4.5,wsq,117912);
/* uv = u + v; uv_sq = 4.5*uv*uv + vel_sq */
DSPF_sp_w_vec(v,u,1,uv,117912);
DSPF_sp_vecmul(uv,uv,uv_sq,117912);
DSPF_sp_w_vec(uv_sq,vel_sq,4.5,uv_sq,117912);
/* uv_1 = u - v; uv_1_sq = 4.5*uv_1*uv_1 + vel_sq */
DSPF_sp_w_vec(v,u,-1,uv_1,117912);
DSPF_sp_vecmul(uv_1,uv_1,uv_1_sq,117912);
DSPF_sp_w_vec(uv_1_sq,vel_sq,4.5,uv_1_sq,117912);
/* uw = u + w; uw_sq = 4.5*uw*uw + vel_sq */
DSPF_sp_w_vec(w,u,1,uw,117912);
DSPF_sp_vecmul(uw,uw,uw_sq,117912);
DSPF_sp_w_vec(uw_sq,vel_sq,4.5,uw_sq,117912);
/* uw_1 = u - w; uw_1_sq = 4.5*uw_1*uw_1 + vel_sq */
DSPF_sp_w_vec(w,u,-1,uw_1,117912);
DSPF_sp_vecmul(uw_1,uw_1,uw_1_sq,117912);
DSPF_sp_w_vec(uw_1_sq,vel_sq,4.5,uw_1_sq,117912);
/* vw = v + w; vw_sq = 4.5*vw*vw + vel_sq */
DSPF_sp_w_vec(w,v,1,vw,117912);
DSPF_sp_vecmul(vw,vw,vw_sq,117912);
DSPF_sp_w_vec(vw_sq,vel_sq,4.5,vw_sq,117912);
/* vw_1 = v - w; vw_1_sq = 4.5*vw_1*vw_1 + vel_sq */
DSPF_sp_w_vec(w,v,-1,vw_1,117912);
DSPF_sp_vecmul(vw_1,vw_1,vw_1_sq,117912);
DSPF_sp_w_vec(vw_1_sq,vel_sq,4.5,vw_1_sq,117912);
t2=CSL_tscRead();
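/* The difference is cast to a 64-bit type so the format specifier matches the argument */
printf("The time taken for this is %llu cycles\n",(unsigned long long)(t2-t1));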
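For readers who do not have DSPLIB at hand, here is a rough plain-C sketch of what I understand the two kernels to compute: DSPF_sp_vecmul gives y[i] = x1[i] * x2[i], and DSPF_sp_w_vec gives y[i] = m * x1[i] + x2[i]. The names ref_vecmul, ref_w_vec and ref_vel_sq are made up for this illustration and are not DSPLIB functions. The sketch makes no attempt to match the optimized library code; it only shows the arithmetic.

```c
#include <stddef.h>

/* Hypothetical stand-in for DSPF_sp_vecmul: y[i] = x1[i] * x2[i]. */
void ref_vecmul(const float *x1, const float *x2, float *y, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        y[i] = x1[i] * x2[i];
}

/* Hypothetical stand-in for DSPF_sp_w_vec: y[i] = m * x1[i] + x2[i]. */
void ref_w_vec(const float *x1, const float *x2, float m, float *y, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        y[i] = m * x1[i] + x2[i];
}

/* Same arithmetic as the first six calls in the snippet (three squares,
 * then three weighted sums), written as a single loop:
 * vel_sq[i] = ones[i] - 1.5 * (u[i]^2 + v[i]^2 + w[i]^2). */
void ref_vel_sq(const float *u, const float *v, const float *w,
                const float *ones, float *vel_sq, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        float usq = u[i] * u[i];
        float vsq = v[i] * v[i];
        float wsq = w[i] * w[i];
        float t = -1.5f * usq + ones[i];   /* tempa after the first w_vec  */
        t = -1.5f * vsq + t;               /* tempa after the second w_vec */
        vel_sq[i] = -1.5f * wsq + t;       /* vel_sq after the third w_vec */
    }
}
```

For example, with u = 1, v = 0, w = 2 and ones = 1, ref_vel_sq gives 1 - 1.5 * (1 + 0 + 4) = -6.5.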
Theoretically, this piece of code should execute in about 3 milliseconds (at 1 GHz), but it is taking around 10 milliseconds. My application is performance-critical. So, is there any way to optimize the code and achieve the theoretical minimum?
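To make the numbers concrete, here is a small sketch of the arithmetic. The snippet makes 27 DSPLIB calls of 117912 elements each, i.e. 3,183,624 element operations in total. Assuming roughly one cycle per element per call (an assumption for illustration only; the per-kernel cycle formulas are not quoted here), that comes to about 3.2 ms at 1 GHz, which is in line with the 3 ms estimate. A measured 10 ms would then correspond to a little over 3 cycles per element. The helper names below are made up for this example.

```c
#include <stdint.h>

/* Convert a cycle count to milliseconds for a given core clock in Hz. */
double cycles_to_ms(uint64_t cycles, double clock_hz)
{
    return (double)cycles * 1000.0 / clock_hz;
}

/* Total element operations: number of kernel calls times elements per call. */
uint64_t total_elements(uint32_t calls, uint32_t n)
{
    return (uint64_t)calls * n;
}

/* Average cycles spent per element per call for a measured cycle count. */
double cycles_per_element(uint64_t cycles, uint32_t calls, uint32_t n)
{
    return (double)cycles / (double)total_elements(calls, n);
}
```

With these helpers, cycles_to_ms(total_elements(27, 117912), 1e9) gives the estimate under the one-cycle-per-element assumption, and cycles_per_element(t2 - t1, 27, 117912) turns a measured TSC difference into an average cost per element.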
Thanks & Regards
Varun