TMS570LC4357: TI ARM CGT compiler bug - function stack sizes in 20.2 versions

Part Number: TMS570LC4357

I ran into the following problem when switching from TI CGT ARM 18.1.4 to 20.2.7. I assume it applies to all 18.1.x versions and all 20.2.x versions.

 

We experienced massive increases in function stack sizes when switching to the newer version of the compiler.

 

I've created a code example that reproduces the problem we see in our code.

The only compiler flag set was "-O0".

 

After compilation, I analyzed the .obj file with armofd.exe to see the function stack sizes. I also compared the generated assembly code to make sure the problem is in the compiler itself and not in armofd.exe.

 

With the 18.1.x version, the stack size of 'dummyFunctionWithSwitchInside' is 0x198; with the 20.2.x version, it increases to 0x968. (That is almost six times larger; the multiplier is close to six because the switch has six cases.)

 

I think this has to do with calculating the "worst-case" stack size for a function incorrectly in the new version, because this version calculates as if it was possible to fall through each case of the switch even though there are breaks in each case.
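A quick check of the numbers above, as a minimal sketch: it assumes int is 4 bytes on this target (modeled here with int32_t so the result is the same on any host). The names dummyStructModel, stack_growth_bytes and extra_struct_copies are only for illustration. The difference between the two reported sizes comes out to exactly five additional copies of the 400-byte structure, which fits the idea that each of the six cases gets its own slot instead of sharing one.

```c
#include <assert.h>
#include <stdint.h>

/* Model of dummyStruct from the test case; int32_t stands in for a
   4-byte int (assumption about the target). */
typedef struct {
    int32_t dummy[100];
} dummyStructModel;

/* Stack sizes reported by armofd.exe for dummyFunctionWithSwitchInside */
#define STACK_SIZE_18_1_X 0x198u   /* 408 bytes  */
#define STACK_SIZE_20_2_X 0x968u   /* 2408 bytes */

/* Growth in bytes between the two compiler versions */
unsigned stack_growth_bytes(void)
{
    return STACK_SIZE_20_2_X - STACK_SIZE_18_1_X;
}

/* How many additional struct-sized slots the growth corresponds to */
unsigned extra_struct_copies(void)
{
    return stack_growth_bytes() / (unsigned)sizeof(dummyStructModel);
}
```

Calling extra_struct_copies() gives the number of extra structure slots; the remainder of the division is zero, so the growth is a whole multiple of the structure size.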

 

Example code:

 

typedef struct
{
    int dummy[100];
}dummyStruct;

int dummyFunction(int * x, dummyStruct * value)
{
    return (*x);
}

static void dummyFunctionWithSwitchInside(int * x)
{
    int switchInput;
    switch (switchInput)
    {
        case 1:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
        case 2:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
        case 3:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
        case 4:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
        case 5:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
        case 6:
        {
            dummyStruct dummyInstance;
            dummyFunction(x, &dummyInstance);
            break;
        }
    }
}

void main()
{
    int varAgainstOptimization;
    dummyFunctionWithSwitchInside(&varAgainstOptimization);
}

  • Thank you for notifying us of this problem, and for providing a concise test case.  I can reproduce the same behavior.

    The only compiler flag set was "-O0".

    Note -O0 is equivalent to the longer form --opt_level=0.  While using version 20.2.7.LTS, if you change to --opt_level=1, then the stack size problem disappears.  Is this a practical solution?

    Thanks and regards,

    -George

  • Hi!

    Thanks for your quick response.

    It does make the problem disappear, but isn't this a bug?

    Shouldn't this be consistent in different compiler versions?

    Was it purposefully changed to allocate more stack than the exact possible maximum that is needed?

    Is this documented somewhere?

    This puts me in a situation where I either 

    - do not update the compiler and therefore I cannot have the bug fixes and improvements of the new versions

    or 

    - do update the compiler and change a project-level setting which can cause unwanted or unknown consequences

    BR,

    Adam Koros

  • Rather than address your questions directly, I think it is better to give you some insight into what happened.

    Start with the source code from your first post.  Build it with version 18.1.4.LTS and --opt_level=0.  Also add the option --src_interlist.  This option tells the compiler to keep the auto-generated assembly file, and to add comments to it which make it easier to understand.  The assembly file has the same name as the source file, with the file extension changed to .asm.  Here are a few lines of assembly to focus on.

    ; The following local variables in dummyFunctionWithSwitchInside() will be grouped together
    ; to share stack space among distinct scoping blocks.  References
    ; in the source interlisting will look like "O$1.s3_1.u4_2.s4_3.l4_4" or "&$O$O1+0".
    ;
    ;    --offset--    --reference--		 --variable--
    ;
    ;         0	  O$1.s3_1.u4_2.s4_3.l4_4        struct $$fake0 dummyInstance  [file.c:54]
    ;         0	  O$1.s3_1.u4_2.s4_5.l4_6        struct $$fake0 dummyInstance  [file.c:47]
    ;         0	  O$1.s3_1.u4_2.s4_7.l4_8        struct $$fake0 dummyInstance  [file.c:40]
    ;         0	  O$1.s3_1.u4_2.s4_9.l4_10       struct $$fake0 dummyInstance  [file.c:33]
    ;         0	  O$1.s3_1.u4_2.s4_11.l4_12      struct $$fake0 dummyInstance  [file.c:26]
    ;         0	  O$1.s3_1.u4_2.s4_13.l4_14      struct $$fake0 dummyInstance  [file.c:19]
       
        <skip a few lines>
        
    ;*****************************************************************************
    ;* FUNCTION NAME: dummyFunctionWithSwitchInside                              *
    ;*                                                                           *
    ;*   Regs Modified     : A1,V9,SP,LR,SR                                      *
    ;*   Regs Used         : A1,V9,SP,LR,SR                                      *
    ;*   Local Frame Size  : 0 Args + 400 Auto + 4 Save = 404 byte               *
    ;*****************************************************************************
    dummyFunctionWithSwitchInside:
        

    The first set of lines is a comment which describes an optimization the compiler performed.  The final effect is that all of the dummyInstance structure variables occupy the same place on the stack.  Just within this forum thread, let's call this optimization local variable grouping.  The block comment at the start of dummyFunctionWithSwitchInside confirms the total amount of stack used (titled Local Frame Size) is 404 bytes.

    Do the same thing, but with version 20.2.7.LTS.  The comment about local variable grouping is not present.  Here is the comment at the start of the function ...

    ;*****************************************************************************
    ;* FUNCTION NAME: dummyFunctionWithSwitchInside                              *
    ;*                                                                           *
    ;*   Regs Modified     : A1,V9,SP,LR,SR                                      *
    ;*   Regs Used         : A1,V9,SP,LR,SR                                      *
    ;*   Local Frame Size  : 0 Args + 2400 Auto + 4 Save = 2404 byte             *
    ;*****************************************************************************
    dummyFunctionWithSwitchInside:

    Note it uses 2404 bytes of stack.  When --opt_level=1 is used, then local variable grouping happens.
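The effect of local variable grouping on the "Auto" part of the frame can be modeled in plain C. This is only an analogy, not what the compiler does internally: a union stands in for six variables sharing offset 0, and a struct stands in for six variables that each get their own slot. The type and function names are made up for the illustration, and int32_t models the 4-byte int assumed for this target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of dummyStruct; int32_t stands in for a 4-byte int (assumption). */
typedef struct {
    int32_t dummy[100];
} dummyStructModel;

/* Grouped: the six dummyInstance variables live in distinct scoping
   blocks, so they can all share the same stack offset. */
typedef union {
    dummyStructModel case1, case2, case3, case4, case5, case6;
} groupedAutoModel;

/* Not grouped: every dummyInstance gets its own stack slot. */
typedef struct {
    dummyStructModel case1, case2, case3, case4, case5, case6;
} ungroupedAutoModel;

/* Bytes of automatic storage with grouping (matches "400 Auto") */
size_t grouped_auto_bytes(void)
{
    return sizeof(groupedAutoModel);
}

/* Bytes of automatic storage without grouping (matches "2400 Auto") */
size_t ungrouped_auto_bytes(void)
{
    return sizeof(ungroupedAutoModel);
}
```

Adding the 4 bytes listed under "Save" to each result gives the 404 and 2404 byte frame sizes shown in the two function header comments.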

    Why the change?  I don't know the specific reason.  It would take a non-trivial amount of work to determine that.  I can give you a general impression of what probably happened.  Optimizations such as local variable grouping are typically the result of lots of small decisions working together.  Yes, there is a part of the compiler dedicated to this optimization.  I'm sure that version 20.2.7.LTS under --opt_level=0 attempts it.  It is likely some earlier decision somehow causes conditions to be such that the optimization is judged to be unsafe.  Maybe some bug fix caused that change in the earlier decisions.

    Under ideal circumstances, nothing like this happens.  We do run tests that attempt to detect degradations in performance or size between releases.  But it is impossible to detect everything.

    I realize this is not entirely satisfying.  But does it help?

    Thanks and regards,

    -George

  • Thank you, this does paint a clearer picture.

     

    We'll change our settings to "-O1" and update our compiler version. It did not seem to affect our code negatively and the function stack sizes stayed the same or almost the same as in the 18.1.x versions. (There are some cases with differences, mainly 8 bytes, in both directions, which seems acceptable.)
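For projects that have to stay at -O0, a possible source-level alternative is to do the sharing by hand: declare one dummyStruct above the switch so that every case uses the same object. This is a hypothetical sketch and was not built or measured with 20.2.7.LTS in this thread. The function name dummyFunctionWithSharedBuffer, the switchInput parameter and the result variable are assumptions added for the illustration.

```c
#include <assert.h>

typedef struct
{
    int dummy[100];
} dummyStruct;

/* Same helper as in the test case; value is intentionally unused. */
int dummyFunction(int *x, dummyStruct *value)
{
    (void)value;
    return (*x);
}

/* Hypothetical variant: one shared instance declared once, so the
   function contains a single 400-byte local (assuming 4-byte int)
   instead of one per case. */
int dummyFunctionWithSharedBuffer(int *x, int switchInput)
{
    dummyStruct sharedInstance;
    int result = 0;

    switch (switchInput)
    {
        case 1:
        case 2:
        case 3:
        case 4:
        case 5:
        case 6:
            result = dummyFunction(x, &sharedInstance);
            break;
        default:
            break;
    }
    return result;
}
```

The trade-off is that the cases no longer have private, block-scoped variables, so this only fits code where the per-case data really can share one object.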

     

    Will you log this as a disclaimer/issue/defect in the release notes?

     

    BR,

    Adam Koros

  • I filed the entry EXT_EP-11105.  You are welcome to follow it with that link.

    Thanks and regards,

    -George