Optimizer terminated abnormally

I'm using code generation toolset v6.1.6 (Linux version) to compile the following code:

 

void function_x(uint32* restrict pui_dst,
                uint32* restrict pui_hyp_buf,
                uint32  ui_l)
{
   uint32 ui_i;

   if ( ui_l == 0 )
   {
      /*
       * TI: for some unknown reason we have to disable unrolling of this loop
       * when compiling with CGT6.1.6 (Linux), otherwise we get the following:
       *   "Optimizer terminated abnormally"
       */
      #pragma UNROLL( 1 );
      for ( ui_i = 0 ; ui_i < 16 ; ui_i++ )
      {
         pui_dst[ui_i]  = _dotp2(pui_hyp_buf[ui_i], pui_hyp_buf[ui_i]);
      }
   }
   else
   {
      for ( ui_i = 0 ; ui_i < 16 ; ui_i++ )
      {
         pui_dst[ui_i] += _dotp2(pui_hyp_buf[ui_i], pui_hyp_buf[ui_i]);
      }
   }
}

 

As stated in the code comment above, this compiles only when unrolling of the first loop is disabled; otherwise the compiler aborts with the following:

 

>>>> Optimizer terminated abnormally
>>>>    in function _function_x()
>>>>    in file "myfile.c"

 

Compiler options used are:

--opt_level=2 --define=CHIP_6416 --silicon_version=6400

 

Can anyone tell me why this will not compile/optimize?

  • I submitted a compiler bug report on your behalf to support@ti.com with a reference to your post. I included a project that should make it easy for them to duplicate the failure.

    #pragma MUST_ITERATE(16,16,16) also fails, as does using an intermediate variable for the assignment.

    Changing the = to += makes it pass the compiler, but of course that is not a work-around.

    Since your code seems to be accumulating the DOTP2 results, one workaround may be to clear the dst array when ui_l == 0, then always accumulate. I did not try this, but I would expect it to work. You could use UNROLL(2) for the accumulation loop, and possibly for the clearing loop as well.

    Would that be an acceptable work-around?
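
For reference, the clear-then-accumulate idea might look roughly like the sketch below. It has not been built with CGT v6.1.6 or v6.1.9, so whether it avoids the optimizer abort is an assumption. The function name function_x_accumulate, the DOTP2 macro, and the dotp2_ref host fallback are hypothetical and exist only so the sketch also builds off-target; uint32 is assumed to be a 32-bit unsigned int, and _TMS320C6X is assumed to be the compiler's predefined target macro.

```c
/* Assumption: uint32 is a 32-bit unsigned type, as in the original post. */
typedef unsigned int uint32;

#ifdef _TMS320C6X
/* On the C6000 compiler, use the real intrinsic. */
#define DOTP2(a, b) _dotp2((a), (b))
#else
/* Hypothetical host-side stand-in: interpret a 16-bit field as signed. */
static int s16(uint32 v)
{
   int x = (int)(v & 0xFFFFu);
   return (x >= 0x8000) ? x - 0x10000 : x;
}

/* Hypothetical host-side stand-in for _dotp2:
 * signed 16x16 products of the low and high halves, summed. */
static uint32 dotp2_ref(uint32 a, uint32 b)
{
   uint32 lo = (uint32)(s16(a) * s16(b));
   uint32 hi = (uint32)(s16(a >> 16) * s16(b >> 16));
   return lo + hi;
}
#define DOTP2(a, b) dotp2_ref((a), (b))
#endif

void function_x_accumulate(uint32* restrict pui_dst,
                           uint32* restrict pui_hyp_buf,
                           uint32  ui_l)
{
   uint32 ui_i;

   /* Clear the destination only on the first pass ... */
   if ( ui_l == 0 )
   {
      for ( ui_i = 0 ; ui_i < 16 ; ui_i++ )
      {
         pui_dst[ui_i] = 0;
      }
   }

   /* ... then always accumulate, so only the "+=" form is ever compiled. */
   for ( ui_i = 0 ; ui_i < 16 ; ui_i++ )
   {
      pui_dst[ui_i] += DOTP2(pui_hyp_buf[ui_i], pui_hyp_buf[ui_i]);
   }
}
```

The trade-off is one extra pass over the 16-word dst array when ui_l == 0.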

  • I compiled this same source code in CGT v6.1.9 (currently the latest) to make sure the bug was not already fixed and the problem does still exist. As a result I have filed a bug (SDSCM00031482) on it. Hopefully this issue will be fixed in the next release of the code gen package.

    In the meantime please let us know if Randy's workaround suggestion is acceptable.

  • I also tried v6.1.9 and found the same.  Thanks for the suggestions Randy... luckily this particular piece of code is not particularly critical, so we're not too worried about optimization here.  I did find that if I manually unroll the first loop (the entire thing) then the compiler is happy.

  • Whenever you have enough of an answer, you may click Verify an Answer (I think) and we will quit throwing out more comments. :)

    When you said that you "manually unroll the first loop (the entire thing)", does that mean you replaced the for loop with 16 assignments? If so, I wonder if the optimizer was smart enough to recognize a repetitive pattern and put it back into a loop? That would be interesting to see in the listing/assembly file.

    You could also manually unroll the loop by doing what the compiler should have done with UNROLL(2):

    for ( ui_i = 0 ; ui_i < 16 ; ui_i += 2 )
    {
       pui_dst[ui_i]   = _dotp2(pui_hyp_buf[ui_i],   pui_hyp_buf[ui_i]);
       pui_dst[ui_i+1] = _dotp2(pui_hyp_buf[ui_i+1], pui_hyp_buf[ui_i+1]);
    }

    I must admit I did not try this to confirm the compiler will work with it, but if you care to try it and it fails, please let us know because that might be a different bug. It should work, and I would give it a very high probability of success.

  • Yes, by manually unrolling the entire loop I do mean replacing it with 16 assignments.  Neither the v6.1.6 nor the v6.1.9 compiler puts this back in a loop (I checked the assembly).

    What you propose, i.e. doing manually what UNROLL(2) would have done, does work.
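
For anyone landing here later, a sketch of how the manual unroll-by-2 slots into the whole function is below. Only the first loop changes; the accumulating loop is left as in the original post. The name function_x_unroll2, the DOTP2 macro, and the dotp2_host fallback are hypothetical and are there only so the sketch builds on a host compiler; uint32 is assumed to be a 32-bit unsigned int, and _TMS320C6X is assumed to be the compiler's predefined target macro.

```c
/* Assumption: uint32 is a 32-bit unsigned type, as in the original post. */
typedef unsigned int uint32;

#ifdef _TMS320C6X
/* On the C6000 compiler, use the real intrinsic. */
#define DOTP2(a, b) _dotp2((a), (b))
#else
/* Hypothetical host-side stand-in for _dotp2:
 * signed 16x16 products of the low and high halves, summed. */
static uint32 dotp2_host(uint32 a, uint32 b)
{
   int a_lo = (int)(a & 0xFFFFu), a_hi = (int)(a >> 16);
   int b_lo = (int)(b & 0xFFFFu), b_hi = (int)(b >> 16);

   /* Sign-extend each 16-bit field. */
   if (a_lo >= 0x8000) a_lo -= 0x10000;
   if (a_hi >= 0x8000) a_hi -= 0x10000;
   if (b_lo >= 0x8000) b_lo -= 0x10000;
   if (b_hi >= 0x8000) b_hi -= 0x10000;

   return (uint32)(a_lo * b_lo) + (uint32)(a_hi * b_hi);
}
#define DOTP2(a, b) dotp2_host((a), (b))
#endif

void function_x_unroll2(uint32* restrict pui_dst,
                        uint32* restrict pui_hyp_buf,
                        uint32  ui_l)
{
   uint32 ui_i;

   if ( ui_l == 0 )
   {
      /* Manual unroll by 2, replacing the pragma on the assigning loop. */
      for ( ui_i = 0 ; ui_i < 16 ; ui_i += 2 )
      {
         pui_dst[ui_i]   = DOTP2(pui_hyp_buf[ui_i],   pui_hyp_buf[ui_i]);
         pui_dst[ui_i+1] = DOTP2(pui_hyp_buf[ui_i+1], pui_hyp_buf[ui_i+1]);
      }
   }
   else
   {
      /* Unchanged: the accumulating loop was not affected. */
      for ( ui_i = 0 ; ui_i < 16 ; ui_i++ )
      {
         pui_dst[ui_i] += DOTP2(pui_hyp_buf[ui_i], pui_hyp_buf[ui_i]);
      }
   }
}
```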