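I wrote a function that copies data from in_data to out_data, and I used the TSCL register to measure its cycle count.
The cycle count is around 16313967 for a data size of 720*480*2 bytes, which is roughly 47 cycles per inner-loop iteration if width = 720 and height = 480.
I tried adding the restrict keyword to both pointers, but saw no improvement.
How can I make these loops faster?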
void AssignY(unsigned char* restrict in_data, int height, int width, unsigned char* restrict out_data)
{
    int i, j;
    int line = 0;
    int dWidth = width<<1;
    int offset;
    for (i = 0; i < height; i++)
    {
        for (j = 0; j < width; j++)
        {
            offset = line+j*2+1;
            out_data[offset] = in_data[offset];
        }// end for
        line += dWidth;
    }
}
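
One variant that might be worth trying, as a sketch only: each row starts exactly 2*width bytes after the previous one, so the odd offsets 1, 3, 5, ... form a single run over the whole buffer and the two loops can be collapsed into one. AssignYFlat and SameResult are hypothetical names I made up for illustration. SameResult only checks on the host that both versions write the same bytes. I have not measured this with TSCL, so whether the compiler schedules the single loop any better is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Original nested-loop version, kept as the reference. */
void AssignY(unsigned char* restrict in_data, int height, int width,
             unsigned char* restrict out_data)
{
    int i, j;
    int line = 0;
    int dWidth = width << 1;
    for (i = 0; i < height; i++) {
        for (j = 0; j < width; j++) {
            int offset = line + j * 2 + 1;
            out_data[offset] = in_data[offset];
        }
        line += dWidth;
    }
}

/* Hypothetical single-loop variant: the row stride equals 2 * width,
   so the copied bytes sit at offsets 1, 3, 5, ... across the buffer. */
void AssignYFlat(const unsigned char* restrict in_data, int height, int width,
                 unsigned char* restrict out_data)
{
    int n = height * width;   /* number of bytes actually copied */
    int k;
    for (k = 0; k < n; k++) {
        out_data[2 * k + 1] = in_data[2 * k + 1];
    }
}

/* Hypothetical helper: returns 1 if both versions produce identical
   output buffers for the given dimensions, 0 otherwise. */
int SameResult(int height, int width)
{
    size_t size = (size_t)height * (size_t)width * 2;
    unsigned char *in = malloc(size);
    unsigned char *a = malloc(size);
    unsigned char *b = malloc(size);
    size_t k;
    int same;

    if (!in || !a || !b) {
        free(in); free(a); free(b);
        return 0;
    }
    for (k = 0; k < size; k++) {
        in[k] = (unsigned char)(k * 7 + 3);   /* arbitrary test pattern */
    }
    memset(a, 0xAA, size);   /* same fill so untouched even bytes match */
    memset(b, 0xAA, size);

    AssignY(in, height, width, a);
    AssignYFlat(in, height, width, b);
    same = (memcmp(a, b, size) == 0);

    free(in); free(a); free(b);
    return same;
}
```

Calling SameResult(480, 720) compares the two versions on a buffer of the same size as in my measurement. The single loop has a trip count of height*width and no per-row bookkeeping, which is the only difference from the original.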