C66xx optimizer bug at higher optimization levels

I am using CGT 7.3.1.  If I compile the attached file using -mv6600 -O2 -DSIZE=24, the time it takes to compile roughly doubles when I increase SIZE by one -- for example, 15 seconds when SIZE=24, 29 seconds when SIZE=25, 57 seconds when SIZE=26, and so forth (most of the time seems to be spent in opt6x).  My desired value for SIZE is large enough to put the compile time in years, assuming the pattern holds.  If I compile just using -O2, or with -mv6600 -O1, it finishes quickly.

Is this a known problem?  Is there a better workaround than to put this function in its own file and reduce the optimization level for that particular file?  (Please pardon it being attached as a .txt; the forum software rejected a .c extension.)

struct data
{
	int GBi[SIZE];
};

int min_gbi_(struct data *const input[], int count)
{
	int min_gbi = 2048;
	int it;
	int ix;

	for (it = 0; it < count; ++it)
	{
		for (ix = 0; ix < SIZE; ++ix)
		{
			if (min_gbi > input[it]->GBi[ix])
			{
				min_gbi = input[it]->GBi[ix];
			}
		}
	}

	return min_gbi;
}

  • The compiler is completely unrolling the inner loop.  You can inhibit that by adding "#pragma UNROLL(1)" immediately before the inner for loop.  (Any other reasonable unroll factor would also work.)

    Complete unrolling stops at SIZE=40 in this example;  with 40 and above I see compilation times of less than a second.

    Adding an optimise-for-space option like -ms1 or -mf2 (I believe those two are equivalent) will also inhibit complete unrolling.

     

  • Hm.  Compiling with even SIZE values of at least 40 does finish in less than a second here -- but if I use SIZE=41 or SIZE=43, it triggers the exponential behavior.  It's just my luck that the case I care about has an odd SIZE with few useful divisors for unrolling :)

    Using -ms1 also avoids the problem, and is probably what I want for this code anyway.  Thank you!
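
For reference, the "#pragma UNROLL(1)" suggestion above could look roughly like the sketch below. It is the function from the original post with the pragma placed directly in front of the inner loop. The fallback value for SIZE and the __TI_COMPILER_VERSION__ guard are assumptions added here so that the file also builds with non-TI compilers; neither comes from the thread, and the sketch has not been timed against CGT 7.3.1.

```c
/* Sketch only: SIZE normally comes from -DSIZE=... on the command line.
   The fallback value here is an assumption for illustration. */
#ifndef SIZE
#define SIZE 41
#endif

struct data
{
	int GBi[SIZE];
};

int min_gbi_(struct data *const input[], int count)
{
	int min_gbi = 2048;
	int it;
	int ix;

	for (it = 0; it < count; ++it)
	{
		/* Ask the TI compiler not to unroll the inner loop.
		   Assumption: the guard hides the TI-specific pragma
		   from other compilers. */
#ifdef __TI_COMPILER_VERSION__
		#pragma UNROLL(1)
#endif
		for (ix = 0; ix < SIZE; ++ix)
		{
			if (min_gbi > input[it]->GBi[ix])
			{
				min_gbi = input[it]->GBi[ix];
			}
		}
	}

	return min_gbi;
}
```

This would be compiled with the same options as in the original post (-mv6600 -O2 -DSIZE=...). The function's result is unchanged; only the unrolling hint differs.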