Hello there!
I am using the 6416 DSP for face recognition in my diploma thesis and want to optimize my code. Instead of
for(i=240;i>0;i-=2) {
    for(j=320;j>0;j-=2) {
        if((i>=3 && i<=237) && (j>=3 && j <= 317)) {
            rgb[0][240-i][320-j]= ((*(SrcFrame + (320-j+1) + XMAX*(240-i+1)))+ (*(SrcFrame + (320-j+1) + XMAX*(240-i-1)))) /2;
            rgb[1][240-i][320-j]= ((*(SrcFrame + (320-j) + XMAX*(240-i-1))) + (*(SrcFrame + (320-j) + XMAX*(240-i+1)))) /2;
            rgb[2][240-i][320-j]= (*(SrcFrame + (320-j) + XMAX*(240-i)));
            rgb[0][240-i][320-j+1]= (*(SrcFrame + (320-j+1) + XMAX*(240-i+1) ));
            rgb[1][240-i][320-j+1]= (*(SrcFrame + (320-j) + XMAX*(240-i+1)));
            rgb[2][240-i][320-j+1]= ((*(SrcFrame + (320-j) + XMAX*(240-i))) + (*(SrcFrame + (320-j) + XMAX*(240-i+2))) )/2;
            rgb[0][240-i+1][320-j]= ((*(SrcFrame + (320-j+1) + XMAX*(240-i-1) ))+(*(SrcFrame + (320-j+1) + XMAX*(240-i+1))))/2;
            rgb[1][240-i+1][320-j]= (*(SrcFrame + (320-j+1) + XMAX*(240-i) ));
            rgb[2][240-i+1][320-j]= ((*(SrcFrame + (320-j) + XMAX*(240-i))) + (*(SrcFrame + (320-j+2) + XMAX*(240-i))) )/2;
            rgb[0][240-i+1][320-j+1]= (*(SrcFrame + (320-j+1) + XMAX*(240-i+1) ));
            rgb[1][240-i+1][320-j+1]= ((*(SrcFrame + (320-j) + XMAX*(240-i+1)))+ (*(SrcFrame + (320-j+1) + XMAX*(240-i) )) + (*(SrcFrame + (320-j+1) + XMAX*(240-i+2) )) + (*(SrcFrame + (320-j+2) + XMAX*(240-i+1))) )/4;
            rgb[2][240-i+1][320-j+1]= ((*(SrcFrame + (320-j) + XMAX*(240-i)))+ (*(SrcFrame + (320-j) + XMAX*(240-i+2)))+ (*(SrcFrame + (320-j+2) + XMAX*(240-i))) + (*(SrcFrame + (320-j+2) + XMAX*(240-i+2))))/4;
        }
    }
}
I want to use three unsigned char pointers *r, *g, *b (same element type as the three-dimensional rgb array) and do
r = &rgb[0][240-i][320-j];
g = &rgb[1][240-i][320-j];
b = &rgb[2][240-i][320-j];
at the beginning of the loop.
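Roughly, the pointer setup could look like the sketch below. It assumes rgb is declared as unsigned char rgb[3][240][320] (the post does not show the declaration). The helper name fill_blocks and the placeholder values rv, gv, bv are made up for illustration; the real loop would store the interpolated SrcFrame values instead.

```c
#define YMAX 240   /* image rows, matching the loop bounds above */
#define XMAX 320   /* image columns; assumed to be the row stride as well */

/* Assumed declaration; the post only says rgb is a 3-dim unsigned char array. */
static unsigned char rgb[3][YMAX][XMAX];

/* Hypothetical helper: visits the same interior 2x2 blocks as the guarded
 * loop (rows 4..237, columns 4..317) and stores one placeholder value per
 * plane. The real code would store the interpolated SrcFrame values. */
static void fill_blocks(unsigned char rv, unsigned char gv, unsigned char bv)
{
    int y, x;

    for (y = 4; y <= YMAX - 4; y += 2) {
        /* One pointer per colour plane, set once per pair of rows. */
        unsigned char *r = &rgb[0][y][4];
        unsigned char *g = &rgb[1][y][4];
        unsigned char *b = &rgb[2][y][4];

        for (x = 4; x <= XMAX - 4; x += 2) {
            /* p[0], p[1] are the upper two pixels of the block,
             * p[XMAX], p[XMAX + 1] the lower two. */
            r[0] = rv; r[1] = rv; r[XMAX] = rv; r[XMAX + 1] = rv;
            g[0] = gv; g[1] = gv; g[XMAX] = gv; g[XMAX + 1] = gv;
            b[0] = bv; b[1] = bv; b[XMAX] = bv; b[XMAX + 1] = bv;

            /* Step to the next 2x2 block in the same row pair. */
            r += 2; g += 2; b += 2;
        }
    }
}
```

Calling fill_blocks(10, 20, 30) leaves the border rows and columns untouched, like the if guard in the original loop, and only small constant offsets and increments are applied inside the inner loop.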
So now the question: why does the load6x simulator report more cycles when I use r+=240; than it does for r+=1; or r+=31;? (The magic number where it switches is 32.)
P.S.: Any advice on optimizing my Bayer-to-RGB interpolation?