Execution time differs for 'for' loops over a fixed-length array and a dynamically allocated array

I ran some test programs under CCS 3.3 on a C6713B DSP and found that the execution time of two similar "for" loops differs significantly.

 The first test program is as follows:

#define NUM   128
float dis[NUM];

#pragma MUST_ITERATE(10,,)
for(i = 0; i < NUM; i++)
   dis[i] =  dis[i] + val1 * val2; 

The second program is like:

typedef struct tag_OBJ
{
  float *dis;
}OBJ;

typedef struct tag_PARAM
{
   int num;
}PARAM;

OBJ obj;
PARAM param;

param.num = NUM;

obj.dis = (float *)malloc(sizeof(float) * param.num);


#pragma MUST_ITERATE(10,,)
for(i = 0; i < param.num; i++)
    obj.dis[i] = obj.dis[i] + val1 * val2;

 

The compile options for both cases are set to: -o3, -pm

Surprisingly, the first case takes about 2 cycles per loop iteration, while the second case takes about 20 cycles per iteration.

What causes such a dramatic difference in execution time between these two cases?

  • One guess ... The global "dis" is in internal memory while the malloc'd "dis" is in external memory.

    Thanks and regards,

    -George
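One way to check this guess is to print where each buffer actually lives and compare the addresses with the memory ranges in the linker map. The following is only a minimal sketch: `in_region` and `report_placement` are hypothetical helper names, and the base address and size of internal memory are left as parameters that you would take from your own linker command file or map file. It also assumes a pointer fits in an `unsigned long`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if p lies inside [base, base + size).
   Assumes a pointer value fits in an unsigned long on the target. */
static int in_region(const void *p, unsigned long base, unsigned long size)
{
    unsigned long addr = (unsigned long)p;

    assert(size > 0);
    return addr >= base && addr - base < size;
}

/* Hypothetical helper: prints a buffer's address and whether it falls in the
   given region. base and size are assumptions taken from your own memory map. */
static void report_placement(const char *name, const void *p,
                             unsigned long base, unsigned long size)
{
    printf("%s at %p: %s the region\n", name, (void *)p,
           in_region(p, base, size) ? "inside" : "outside");
}
```

Calling `report_placement("dis", dis, base, size)` and `report_placement("obj.dis", obj.dis, base, size)` with the internal-memory range from the map file would show whether the heap buffer ended up somewhere slower than the global array.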

  • It might be because the value of param.num is not a known constant.  The software pipeliner can sometimes optimize better when the trip count is constant.  Try replacing "param.num" with "NUM" to see if that gets you the performance back.  If it does, you need to adjust the MUST_ITERATE pragma or add an _nassert to give the compiler more information about what the value of "param.num" could be.
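The suggestion above could look something like the sketch below. It is an illustration under assumptions, not a confirmed fix: `accumulate` and `prod` are hypothetical names, the trip count is assumed to always be exactly 128 (hence the literal bounds in MUST_ITERATE), and `restrict` is used on the assumption that no other pointer aliases the buffer inside the loop. The pragma is guarded so the code still builds with compilers other than the TI one.

```c
#include <assert.h>

#define NUM 128

/* Hypothetical helper: adds val1 * val2 to each of the n floats in dis.
   Assumption: the caller always passes n == NUM, and nothing else
   aliases dis for the duration of the loop (hence restrict). */
static void accumulate(float *restrict dis, int n, float val1, float val2)
{
    int i;
    float prod = val1 * val2;   /* loop-invariant product, computed once */

    /* MUST_ITERATE is a promise to the compiler, not a check,
       so verify it here in debug builds. */
    assert(n == NUM);

#ifdef __TI_COMPILER_VERSION__
    /* Promise: min 128 and max 128 iterations, trip count a multiple of 8. */
    #pragma MUST_ITERATE(128, 128, 8)
#endif
    for (i = 0; i < n; i++)
        dis[i] = dis[i] + prod;
}
```

With the second program this would be called as `accumulate(obj.dis, param.num, val1, val2)`. If the trip count is not really fixed, widen the pragma to the tightest minimum, maximum and multiple that you can actually guarantee.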