Missing inline function code when compiling with cgt 6.4.9

Other Parts Discussed in Thread: TMS320F28377D

Hi,


We are running the following filter function on an F2833x Delfino:


static inline FLOAT32 filter3rdOrder(FilterStruct* FilterToCalc)
{
    FilterToCalc->fl32out =
      FilterToCalc->fl32b0 * FilterToCalc->fl32in
    + FilterToCalc->fl32b1 * FilterToCalc->fl32in_k1
    + FilterToCalc->fl32b2 * FilterToCalc->fl32in_k2
    + FilterToCalc->fl32b3 * FilterToCalc->fl32in_k3
    - FilterToCalc->fl32a1 * FilterToCalc->fl32out_k1
    - FilterToCalc->fl32a2 * FilterToCalc->fl32out_k2
    - FilterToCalc->fl32a3 * FilterToCalc->fl32out_k3;

    //Save outputs
    FilterToCalc->fl32in_k3 = FilterToCalc->fl32in_k2;
    FilterToCalc->fl32in_k2 = FilterToCalc->fl32in_k1;
    FilterToCalc->fl32in_k1 = FilterToCalc->fl32in;
    FilterToCalc->fl32out_k3 = FilterToCalc->fl32out_k2;
    FilterToCalc->fl32out_k2 = FilterToCalc->fl32out_k1;
    FilterToCalc->fl32out_k1 = FilterToCalc->fl32out;

    return FilterToCalc->fl32out;
}


This function uses the data structure:


typedef struct
{
    FLOAT32 fl32in_k1;  // 0x00
    FLOAT32 fl32in_k2;  // 0x02
    FLOAT32 fl32in_k3;  // 0x04
    FLOAT32 fl32in_k4;  // 0x06
    FLOAT32 fl32in;     // 0x08
    FLOAT32 fl32out_k1; // 0x0a
    FLOAT32 fl32out_k2; // 0x0c 
    FLOAT32 fl32out_k3; // 0x0e
    FLOAT32 fl32out_k4; // 0x10
    FLOAT32 fl32out;    // 0x12
    FLOAT32 fl32a0;
    FLOAT32 fl32a1;
    FLOAT32 fl32a2;
    FLOAT32 fl32a3;
    FLOAT32 fl32a4;
    FLOAT32 fl32b0;
    FLOAT32 fl32b1;
    FLOAT32 fl32b2;
    FLOAT32 fl32b3;
    FLOAT32 fl32b4;
} FilterStruct;
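
The offset comments in the structure are what later show up as operands such as @0xa and @0xc in the disassembly, so it can help to check them. A minimal sketch, assuming FLOAT32 is a 32-bit float with no padding between members; the WORD_OFFSET macro is a hypothetical helper and not part of the original code:

```c
#include <limits.h>
#include <stddef.h>

typedef float FLOAT32;  /* assumption: 32-bit float */

typedef struct
{
    FLOAT32 fl32in_k1, fl32in_k2, fl32in_k3, fl32in_k4, fl32in;
    FLOAT32 fl32out_k1, fl32out_k2, fl32out_k3, fl32out_k4, fl32out;
    FLOAT32 fl32a0, fl32a1, fl32a2, fl32a3, fl32a4;
    FLOAT32 fl32b0, fl32b1, fl32b2, fl32b3, fl32b4;
} FilterStruct;

/* Hypothetical helper: member offset counted in 16-bit words.
 * The C28x addresses 16-bit words; scaling by CHAR_BIT makes the
 * same expression usable on a host with 8-bit bytes. */
#define WORD_OFFSET(member) (offsetof(FilterStruct, member) * CHAR_BIT / 16)
```

Under these assumptions WORD_OFFSET(fl32out_k1) is 0x0a and WORD_OFFSET(fl32out_k2) is 0x0c, which matches the comments in the structure.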

We apply compiler optimization:


                  --opt_level=2
                  --opt_for_speed

When we compile the filter function with cgt 6.1.6, correct and complete code is generated.
However, with cgt 6.4.9 (or cgt 15.12.0), the compiled code (checked by disassembling the obj or out file) for the C statement "FilterToCalc->fl32out_k2 = FilterToCalc->fl32out_k1;" is missing.
When we compile the filter function as a non-inlined function, correct and complete code is generated even with cgt 6.4.9.

Is this known and expected behavior?


Thx and regards,
Dieter
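
The symptom can also be checked at run time instead of by reading the disassembly. The following is a minimal sketch of such a check, assuming FLOAT32 is float; delay_line_ok is a hypothetical helper that is not part of the original code. It loads distinct values into the history members, calls the filter once, and verifies that every member moved by exactly one position.

```c
typedef float FLOAT32;  /* assumption: 32-bit float */

typedef struct
{
    FLOAT32 fl32in_k1, fl32in_k2, fl32in_k3, fl32in_k4, fl32in;
    FLOAT32 fl32out_k1, fl32out_k2, fl32out_k3, fl32out_k4, fl32out;
    FLOAT32 fl32a0, fl32a1, fl32a2, fl32a3, fl32a4;
    FLOAT32 fl32b0, fl32b1, fl32b2, fl32b3, fl32b4;
} FilterStruct;

static inline FLOAT32 filter3rdOrder(FilterStruct *f)
{
    f->fl32out = f->fl32b0 * f->fl32in
               + f->fl32b1 * f->fl32in_k1
               + f->fl32b2 * f->fl32in_k2
               + f->fl32b3 * f->fl32in_k3
               - f->fl32a1 * f->fl32out_k1
               - f->fl32a2 * f->fl32out_k2
               - f->fl32a3 * f->fl32out_k3;

    /* Shift the input and output history by one sample. */
    f->fl32in_k3  = f->fl32in_k2;
    f->fl32in_k2  = f->fl32in_k1;
    f->fl32in_k1  = f->fl32in;
    f->fl32out_k3 = f->fl32out_k2;
    f->fl32out_k2 = f->fl32out_k1;
    f->fl32out_k1 = f->fl32out;

    return f->fl32out;
}

/* Hypothetical helper: returns 1 if one call shifted both delay lines
 * by exactly one position, 0 otherwise. */
int delay_line_ok(void)
{
    FilterStruct f = {0};
    FLOAT32 y;

    f.fl32in = 1.0f;
    f.fl32in_k1 = 2.0f;  f.fl32in_k2 = 3.0f;  f.fl32in_k3 = 4.0f;
    f.fl32out_k1 = 5.0f; f.fl32out_k2 = 6.0f; f.fl32out_k3 = 7.0f;
    f.fl32b0 = 1.0f;  /* all other coefficients stay 0, so y == fl32in */

    y = filter3rdOrder(&f);

    return f.fl32in_k1 == 1.0f && f.fl32in_k2 == 2.0f && f.fl32in_k3 == 3.0f
        && f.fl32out_k1 == y && f.fl32out_k2 == 5.0f && f.fl32out_k3 == 6.0f;
}
```

With the faulty behavior described later in this thread, fl32out_k3 would end up as 5.0 instead of 6.0 and the check would return 0. Whether a check like this triggers the problem on the target depends on the surrounding code, as the rest of the thread shows.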

  • I cannot reproduce this result.  The difference is probably due to other code that is not shown.

    Please preprocess the source file with the problem call to filter3rdOrder, and attach it to your next post.  Also show the build options exactly as the compiler sees them.

    Thanks and regards,

    -George

  • Hello Dieter,

    Is FilterToCalc->fl32out_k2 used anywhere else in the code?  The compiler probably optimized the assignment out because FilterToCalc->fl32out_k2 is not used anywhere else in the code.

    Change the function to static inline FLOAT32 filter3rdOrder(volatile FilterStruct * FilterToCalc); and pass it a pointer to a volatile FilterStruct.

    Stephen
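
The suggested volatile variant could look roughly like this. It is only a sketch: OutHistory is a hypothetical, reduced stand-in for FilterStruct, and shift_outputs is a hypothetical name for the output-history part of the filter. Accesses through a volatile-qualified lvalue must be performed as written and in order, which costs some speed and works around the symptom rather than explaining it.

```c
typedef float FLOAT32;  /* assumption: 32-bit float */

/* Hypothetical reduced structure holding only the output history. */
typedef struct
{
    FLOAT32 fl32out_k1;
    FLOAT32 fl32out_k2;
    FLOAT32 fl32out_k3;
    FLOAT32 fl32out;
} OutHistory;

/* Every read and write below is a volatile access, so the compiler
 * may not drop, merge, or reorder these accesses relative to each other. */
static inline void shift_outputs(volatile OutHistory *h)
{
    h->fl32out_k3 = h->fl32out_k2;
    h->fl32out_k2 = h->fl32out_k1;
    h->fl32out_k1 = h->fl32out;
}
```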

  • Hello Stephen,
    fl32out_k2 is not used anywhere else in the code.
    The filter function is called periodically, and fl32out_k2 is used the next time the filter function is called. Therefore it must not be optimized out.
    The usage of fl32out_k2 is no different from that of fl32out_k1 or fl32out_k3 (or of the fl32in_kx members).

    Declaring FilterStruct "inline" resolves the problem. But we are wondering which other code might be affected by the related compiler change.

    For providing more code I would need a direct contact.


    Dieter
  • Sorry - declaring FilterStruct "volatile" resolves the problem.
  • You are correct. It doesn't seem like the optimizer should have removed that line. I am also using 6.4.9, so I am interested in finding out what is causing this issue.

  • Dieter Massa said:
    For providing more code I would need a direct contact.

    You can send the code to me privately.  Let your mouse float over my forum avatar or user name.  A window pops up with buttons in it.  Click on Send a Private Message.  A message compose interface comes up.  Use the paper clip icon to attach the file.

    Thanks and regards,

    -George

  • Thank you for submitting a test case.  I can reproduce the problem with the missing assignment to some structure members.  I filed SDSCM00052737 in the SDOWP system to have this investigated.  You are welcome to follow it with the SDOWP link below in my signature.

    Thanks and regards,

    -George

  • Meanwhile I did some deeper investigation and I found the missing instruction:

    MOVD32 is used to read the old values of fl32out_k1 and fl32in_k1

    According to SPRUE02B assembler instruction "MOVD32 RaH, mem32" operations are:

    RaH = [mem32]
    [mem32 + 2] = [mem32]
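
The two operations above can be modeled in plain C as follows. This is a host-side sketch, not device code: data memory is modeled as an array of 16-bit words, the low word of a 32-bit value is assumed to sit at the lower address, and the names mem, load32, store32 and movd32 are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical model of a small block of 16-bit word-addressed memory. */
static uint16_t mem[32];

/* Read a 32-bit value from two consecutive 16-bit words. */
uint32_t load32(unsigned addr)
{
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 16);
}

/* Write a 32-bit value to two consecutive 16-bit words. */
void store32(unsigned addr, uint32_t value)
{
    mem[addr]     = (uint16_t)(value & 0xFFFFu);
    mem[addr + 1] = (uint16_t)(value >> 16);
}

/* Model of "MOVD32 RaH, mem32":
 *   RaH = [mem32]            (the return value)
 *   [mem32 + 2] = [mem32]    (the side effect on memory) */
uint32_t movd32(unsigned addr)
{
    uint32_t value = load32(addr);
    store32(addr + 2, value);
    return value;
}
```

In this model, movd32(0x0a) returns the old fl32out_k1 and at the same time overwrites the 32-bit value at 0x0c (fl32out_k2), so there is no separate store instruction for fl32out_k2 in the disassembly.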

    The problem lies deeper:

    Compiler version 6.4.9 generates the following code for reading fl32out_k2 and fl32out_k1

    0000032f   e318       MOV32        R4H, @0xc
    00000330   040c
    00000331   e223       MOVD32       R3H, @0xa
    00000332   030a

    What we finally see is that both fl32out_k2 and fl32out_k3 receive the old value of fl32out_k1.

    This can only happen if R4H receives the new value of fl32out_k2 (the value after the "[mem32 + 2] = [mem32]" part of MOVD32 has completed).

    Code we generate with cgt 6.1.6 does not contain MOVD32 instructions.

    fl32in_k2 and fl32in_k3 are handled correctly. In this case there are a couple of other instructions between reading fl32in_k2 ("MOVL ACC, @0x2") and reading fl32in_k1 and moving fl32in_k1 to fl32in_k2 ("MOVD32 R5H, @0x0").  

    So, this looks like a pipeline error.

    Dieter

  • Yes, the compiler is generating an MOVD32 instruction, which is why you don't see an explicit write to fl32out_k2, and this is OK.

    I cannot reproduce the error.  I cannot find any problem with the generated assembly code, and the code executes correctly on an actual TMS320F28377D device.

    I am able to reproduce code which has the instructions "MOV32 R3H,@_stFilter_1+12 || ADDF32 R0H,R0H,R3H" and "MOVD32 R2H,@_stFilter_1+10" back-to-back, but I can't quite reproduce exactly the code sequence you are seeing.  We need a test case which demonstrates the error.  It would help to see the complete generated assembly code for function CtrlFunc.

  • The real problem is not visible in the assembly code.
    It is visible only in the realtime data.
    Expected behavior:
    FilterToCalc->fl32out_k3 = FilterToCalc->fl32out_k2;
    FilterToCalc->fl32out_k2 = FilterToCalc->fl32out_k1;
    FilterToCalc->fl32out_k1 = FilterToCalc->fl32out;

    Observed behavior:
    FilterToCalc->fl32out_k3 = FilterToCalc->fl32out_k1; // <- here is the problem
    FilterToCalc->fl32out_k2 = FilterToCalc->fl32out_k1;
    FilterToCalc->fl32out_k1 = FilterToCalc->fl32out;

    So, basically the real problem is not missing code - it is wrong execution of code.

    I believe there is a problem with executing the sequence
    MOV32 R4H, @0xc
    MOVD32 R3H, @0xa
    at least on an F28335 microcontroller.
    R4H appears to be loaded from offset 0xc (fl32out_k2) after the "[mem+2] = [mem]" part of MOVD32 R3H, @0xa has executed, and not before it as intended. R4H therefore holds the original value from offset 0xa (fl32out_k1), which is finally written to offset 0xe (fl32out_k3). fl32out_k2 receives the value of fl32out_k1 through the "[mem+2] = [mem]" part of MOVD32.


    Dieter
  • I'm sorry, I cannot reproduce an error, nor can I reproduce your assembly exactly.

    Could you please post the full assembly code (or object file) for function CtrlFunc? I would also like to see the complete command-line options used to compile that file.

    Can you set a breakpoint in CtrlFunc and step through the function? Does the problem still happen? At which point do the values diverge from what you expect? Does the problem happen every time the function is called, or just sometimes?
  • We have reproduced the problem also on a F28377D microcontroller.

    Here is the complete object file disassembly for the simplified code

    extern FilterStruct stFilter_1;
    
    interrupt void CtrlFunc(void)
    {
        filter3rdOrder(&stFilter_1);
    } 
    


     .sect ".text:retain"
    000000:              _CtrlFunc:
    000000:              .text:retain:
    00000000   761b       ASP         
    00000001   fff0       PUSH         RB
    00000002   abbd       MOVL         *SP++, XT
    00000003   a0bd       MOVL         *SP++, XAR5
    00000004   c2bd       MOVL         *SP++, XAR6
    00000005   c3bd       MOVL         *SP++, XAR7
    00000006   e200       MOV32        *SP++, STF
    00000007   00bd
    00000008   e203       MOV32        *SP++, R0H
    00000009   00bd
    0000000a   e203       MOV32        *SP++, R1H
    0000000b   01bd
    0000000c   e203       MOV32        *SP++, R2H
    0000000d   02bd
    0000000e   e203       MOV32        *SP++, R3H
    0000000f   03bd
    00000010   e203       MOV32        *SP++, R4H
    00000011   04bd
    00000012   e203       MOV32        *SP++, R5H
    00000013   05bd
    00000014   e203       MOV32        *SP++, R6H
    00000015   06bd
    00000016   e203       MOV32        *SP++, R7H
    00000017   07bd
    00000018   e630       SETFLG       RNDF32=1,RNDF64=1
    00000019   0600
    0000001a   2942       CLRC         OVM|PAGE0
    0000001b   5616       CLRC         AMODE
    0000001c   761f       MOVW         DP, #0x0
    0000001d   0000
    0000001e   0602       MOVL         ACC, @0x2
    0000001f   e2af       MOV32        R4H, @0x8, UNCF
    00000020   0408
    00000021   e2af       MOV32        R0H, @0x1e, UNCF
    00000022   001e
    00000023   e2af       MOV32        R1H, @0x20, UNCF
    00000024   0120
    00000025   bda9       MOV32        R6H, ACC
    00000026   0f2a
    00000027   e223       MOVD32       R5H, @0x0
    00000028   0500
    00000029   e301       MPYF32       R0H, R4H, R0H
                       || MOV32        R3H, @0x22
    0000002a   0322
    0000002b   e303       MPYF32       R2H, R5H, R1H
                       || MOV32        R7H, @0x4
    0000002c   5704
    0000002d   e2af       MOV32        R1H, @0x24, UNCF
    0000002e   0124
    0000002f   e741       MPYF32       R3H, R6H, R3H
                       || ADDF32       R0H, R0H, R2H
    00000030   00f3
    00000031   e2af       MOV32        R6H, @0x16, UNCF
    00000032   0616
    00000033   e316       ADDF32       R0H, R0H, R3H
                       || MOV32        R3H, @0xc
    00000034   030c
    00000035   e223       MOVD32       R2H, @0xa
    00000036   020a
    00000037   e700       MPYF32       R6H, R2H, R6H
    00000038   0196
    00000039   7700       NOP         
    0000003a   7700       NOP         
    0000003b   e700       MPYF32       R1H, R7H, R1H
    0000003c   0079
    0000003d   bfa7       MOV32        XAR7, R6H
    0000003e   0f2a
    0000003f   e710       ADDF32       R0H, R0H, R1H
    00000040   0040
    00000041   bda7       MOV32        R1H, XAR7
    00000042   0f16
    00000043   7700       NOP         
    00000044   7700       NOP         
    00000045   7700       NOP         
    00000046   c40e       MOVL         XAR6, @0xe
    00000047   e720       SUBF32       R1H, R0H, R1H
    00000048   0041
    00000049   bda6       MOV32        R0H, XAR6
    0000004a   0f12
    0000004b   7700       NOP         
    0000004c   e2af       MOV32        R7H, @0x18, UNCF
    0000004d   0718
    0000004e   e00e       MPYF32       R7H, R3H, R7H
                       || MOV32        @0x0, R4H
    0000004f   fc00
    00000050   e2af       MOV32        R6H, @0x1a, UNCF
    00000051   061a
    00000052   e753       MPYF32       R0H, R0H, R6H
                       || SUBF32       R1H, R1H, R7H
    00000053   9380
    00000054   1e04       MOVL         @0x4, ACC
    00000055   e020       SUBF32       R0H, R1H, R0H
                       || MOV32        @0xe, R3H
    00000056   430e
    00000057   e2af       MOV32        R7H, *--SP, UNCF
    00000058   07be
    00000059   e203       MOV32        @0x12, R0H
    0000005a   0012
    0000005b   e203       MOV32        @0xa, R0H
    0000005c   000a
    0000005d   e2af       MOV32        R6H, *--SP, UNCF
    0000005e   06be
    0000005f   e2af       MOV32        R5H, *--SP, UNCF
    00000060   05be
    00000061   e2af       MOV32        R4H, *--SP, UNCF
    00000062   04be
    00000063   e2af       MOV32        R3H, *--SP, UNCF
    00000064   03be
    00000065   e2af       MOV32        R2H, *--SP, UNCF
    00000066   02be
    00000067   e2af       MOV32        R1H, *--SP, UNCF
    00000068   01be
    00000069   e2af       MOV32        R0H, *--SP, UNCF
    0000006a   00be
    0000006b   e280       MOV32        STF, *--SP
    0000006c   00be
    0000006d   c5be       MOVL         XAR7, *--SP
    0000006e   c4be       MOVL         XAR6, *--SP
    0000006f   83be       MOVL         XAR5, *--SP
    00000070   87be       MOVL         XT, *--SP
    00000071   fff1       POP          RB
    00000072   7617       NASP        
    00000073   7602       IRET        

    This was generated with the cl2000 command-line options:

    --keep_asm --quiet --asm_listing --c_src_interlist --optimizer_interlist --large_memory_model --silicon_version=28 --float_support=fpu32 --unified_memory --symdebug:dwarf --opt_level=2 --opt_for_speed

    and with compiler:

    TMS320C2000 C/C++ Compiler              v6.4.9

    So far we have always observed this problem.

    In other words, there was never a case in which the intended operation

    FilterToCalc->fl32out_k3 = FilterToCalc->fl32out_k2;

    was executed.

    Dieter

  • Further investigation showed that the problem is not caused by the microcontroller but by code generation:
    In the complete application we see the sequence
    00ce51: 761F05CC MOVW DP, #0x5cc
    00ce53: E70001D7 MPYF32 R7H, R2H, R7H
    00ce55: A60A MOVDL XT, @0xa
    00ce56: BDA20F1A MOV32 R2H, @XAR2
    00ce58: BDAC0F16 MOV32 R1H, @XT
    00ce5a: 7700 NOP
    00ce5b: 7700 NOP
    00ce5c: 7700 NOP
    00ce5d: 7700 NOP
    00ce5e: E7000008 MPYF32 R0H, R1H, R0H
    00ce60: 7700 NOP
    00ce61: 7700 NOP
    00ce62: 7700 NOP
    00ce63: BFA60F12 MOV32 @XAR6, R0H
    00ce65: BDA00F12 MOV32 R0H, @XAR0
    00ce67: C50C MOVL XAR7, @0xc
    00ce68: A318 MOVL P, @0x18
    ...
    00ce97: C30E MOVL @0xe, XAR7

    address ce55:
    partial (second) operation of movdl: fl32out_k1 old is copied to fl32out_k2

    address ce67:
    fl32out_k2 (==fl32out_k1 old) is read

    address ce97:
    fl32out_k3 = fl32out_k2 (==fl32out_k1 old)

    So, the problem is caused by the context code.
    In the context another filter function is executed and we see both operations are nested in the assembler code.
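
The effect of this instruction order can be replayed in C. The sketch below is an illustration only: History is a hypothetical three-member stand-in for the words at offsets 0xa, 0xc and 0xe, and the local variables merely stand for the registers XT and XAR7. The store of the new output to fl32out_k1 is left out because it is not part of the excerpt.

```c
/* Hypothetical stand-in for offsets 0xa, 0xc and 0xe of FilterStruct. */
typedef struct
{
    float out_k1;  /* 0x0a */
    float out_k2;  /* 0x0c */
    float out_k3;  /* 0x0e */
} History;

/* Order seen in the full application. */
void shift_as_scheduled(History *h)
{
    float xt, xar7;

    xt = h->out_k1;         /* ce55: MOVDL XT, @0xa loads [0xa] ...      */
    h->out_k2 = h->out_k1;  /* ce55: ... and copies [0xa] to [0xc]       */
    xar7 = h->out_k2;       /* ce67: MOVL XAR7, @0xc reads the NEW k2    */
    h->out_k3 = xar7;       /* ce97: MOVL @0xe, XAR7                     */
    (void)xt;               /* XT feeds the filter sum, not modeled here */
}

/* Order required by the C source: k2 is read before it is overwritten. */
void shift_as_written(History *h)
{
    float xar7 = h->out_k2;  /* read old k2 first       */
    h->out_k2 = h->out_k1;   /* then the MOVDL-style copy */
    h->out_k3 = xar7;
}
```

Starting from out_k1 = 5, out_k2 = 6, out_k3 = 7, the scheduled order leaves out_k3 at 5, while the order required by the source leaves it at 6.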

    We are trying to find simple context code that reproduces the problem.


    Dieter
  • I think that the disassembly fragment you're showing as the location of the bug is from an entirely different function.  I cannot find any sequence of instructions resembling that fragment in the disassembly for _CtrlFunc.  Are you absolutely sure that this code is from function _CtrlFunc?

    As evidence of my claim:

    • There is no sequence of 4 NOP instructions in _CtrlFunc.
    • There is no MPYF32 followed immediately by an MOVD32 in _CtrlFunc
    • There is no MOVDL instruction in _CtrlFunc
  • Yes - the disassembly fragment is from our original code - not from CtrlFunc.
    Our original assumption that the problem is caused by the MOV32, MOVD32 sequence was wrong.
    It is really a code generation issue.
    We are trying to find new and simple CtrlFunc code that shows the problem.

    Dieter
  • Understood. Please keep in mind that even if we could say for sure that the problem exists in the compiler, we're not going to be able to analyze or fix it without a C test case that demonstrates the problem.
  • Here is the extracted code causing the problem:
    /*
    Testcode for cgt649 ff issue
    */

    typedef float FLOAT32;

    typedef struct
    {
        FLOAT32 fl32in_k1;  // 0x00
        FLOAT32 fl32in_k2;  // 0x02
        FLOAT32 fl32in_k3;  // 0x04
        FLOAT32 fl32in_k4;  // 0x06
        FLOAT32 fl32in;     // 0x08
        FLOAT32 fl32out_k1; // 0x0a
        FLOAT32 fl32out_k2; // 0x0c
        FLOAT32 fl32out_k3; // 0x0e
        FLOAT32 fl32out_k4; // 0x10
        FLOAT32 fl32out;    // 0x12
        FLOAT32 fl32a0;
        FLOAT32 fl32a1;
        FLOAT32 fl32a2;
        FLOAT32 fl32a3;
        FLOAT32 fl32a4;
        FLOAT32 fl32b0;
        FLOAT32 fl32b1;
        FLOAT32 fl32b2;
        FLOAT32 fl32b3;
        FLOAT32 fl32b4;
    } FilterStruct;

    typedef struct
    {
        FLOAT32 fl32A;
        FLOAT32 fl32B;
        FLOAT32 fl32C;
    } ABC;


    FilterStruct filter1;
    ABC ABC1;
    ABC ABC2;
    FilterStruct filter2;

    static inline FLOAT32 filter3rdOrder(FilterStruct* FilterToCalc)
    //FLOAT32 filter3rdOrder(FilterStruct* FilterToCalc)
    {
        FilterToCalc->fl32out =
          FilterToCalc->fl32b0 * FilterToCalc->fl32in
        + FilterToCalc->fl32b1 * FilterToCalc->fl32in_k1
        + FilterToCalc->fl32b2 * FilterToCalc->fl32in_k2
        + FilterToCalc->fl32b3 * FilterToCalc->fl32in_k3
        - FilterToCalc->fl32a1 * FilterToCalc->fl32out_k1
        - FilterToCalc->fl32a2 * FilterToCalc->fl32out_k2
        - FilterToCalc->fl32a3 * FilterToCalc->fl32out_k3;

        //Save outputs
        FilterToCalc->fl32in_k3 = FilterToCalc->fl32in_k2;
        FilterToCalc->fl32in_k2 = FilterToCalc->fl32in_k1;
        FilterToCalc->fl32in_k1 = FilterToCalc->fl32in;
        FilterToCalc->fl32out_k3 = FilterToCalc->fl32out_k2;
        FilterToCalc->fl32out_k2 = FilterToCalc->fl32out_k1;
        FilterToCalc->fl32out_k1 = FilterToCalc->fl32out;

        return FilterToCalc->fl32out;
    }


    void CtrlFunc(void)
    {
        ABC2.fl32A = ABC1.fl32A;
        ABC2.fl32B = ABC1.fl32B;
        ABC2.fl32C = ABC1.fl32C;

        filter3rdOrder(&filter1);

        filter3rdOrder(&filter2);
    }


    Dieter
  • Are you compiling the new test case with the same options?
  • Okay! I was able to reproduce and analyze this bug. Thank you for the reproducible test case. It was definitely a novel failure mode. Basically, at a late stage of compilation, the compiler forgets that MOVDL writes to addr+2. I'll have to think about how to fix it.
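
One way to picture this failure mode is a scheduler-style dependence check that decides whether two memory instructions may be reordered. The sketch below is purely illustrative and makes no claim about how the TI compiler is implemented; MemEffect, overlaps, may_swap, movl_load and movdl are all hypothetical names. When the copy to addr+2 is left out of the MOVDL write set, a load from addr+2 looks independent and may be moved across it.

```c
/* Hypothetical description of what an instruction reads and writes,
 * as inclusive ranges of 16-bit word addresses. lo > hi means "none". */
typedef struct
{
    unsigned read_lo, read_hi;
    unsigned write_lo, write_hi;
} MemEffect;

/* 1 if both ranges are non-empty and share at least one word. */
static int overlaps(unsigned a_lo, unsigned a_hi, unsigned b_lo, unsigned b_hi)
{
    return a_lo <= a_hi && b_lo <= b_hi && a_lo <= b_hi && b_lo <= a_hi;
}

/* 1 if a and b may be reordered: no read-after-write,
 * write-after-read, or write-after-write conflict. */
int may_swap(const MemEffect *a, const MemEffect *b)
{
    return !overlaps(a->write_lo, a->write_hi, b->read_lo, b->read_hi)
        && !overlaps(a->read_lo, a->read_hi, b->write_lo, b->write_hi)
        && !overlaps(a->write_lo, a->write_hi, b->write_lo, b->write_hi);
}

/* "MOVL reg, @addr": reads two words, writes no memory. */
MemEffect movl_load(unsigned addr)
{
    MemEffect e = { addr, addr + 1, 1, 0 };
    return e;
}

/* "MOVDL XT, @addr": reads addr..addr+1 and copies it to addr+2..addr+3.
 * model_the_copy == 0 mimics an analysis that forgets the copy. */
MemEffect movdl(unsigned addr, int model_the_copy)
{
    MemEffect e = { addr, addr + 1, 1, 0 };
    if (model_the_copy) {
        e.write_lo = addr + 2;
        e.write_hi = addr + 3;
    }
    return e;
}
```

With the copy modeled, may_swap reports a conflict between a load from 0xc and MOVDL XT, @0xa; with the copy forgotten, it reports that the two may be swapped, which corresponds to the ordering seen in the disassembly of the full application.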
  • Hi,
    please let me know with which cgt version this bug will be fixed and when these tools will be available.

    thx
    Dieter
  • This bug was fixed in C2000 compiler versions 6.4.10 and 15.12.2.LTS. I don't know exactly when they will be available, but they should both be available in 1-4 weeks.
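
Until the fixed tools are installed everywhere, a build-time guard can flag compiler versions that predate the fix. This is a sketch under stated assumptions: it relies on the TI predefined macro __TI_COMPILER_VERSION__, which encodes version X.Y.Z as X * 1000000 + Y * 1000 + Z (6.4.9 becomes 6004009), and on the compiler accepting #warning. CGT_VERSION, CGT_LACKS_CONFIRMED_FIX and cgt_lacks_confirmed_fix are hypothetical names. The version ranges are an assumption derived only from the fix versions named above; the thread does not say which earlier versions are actually affected, and 6.1.6 is reported to generate correct code even though it is flagged here.

```c
/* Hypothetical helper: encode a cgt version the way
 * __TI_COMPILER_VERSION__ does, e.g. 6.4.9 -> 6004009. */
#define CGT_VERSION(major, minor, patch) \
    ((major) * 1000000UL + (minor) * 1000UL + (patch))

/* Assumption: anything older than 6.4.10, and any 15.x release older
 * than 15.12.2, is treated as lacking the confirmed fix. */
#define CGT_LACKS_CONFIRMED_FIX(v) \
    ((v) < CGT_VERSION(6, 4, 10) || \
     ((v) >= CGT_VERSION(15, 0, 0) && (v) < CGT_VERSION(15, 12, 2)))

#if defined(__TI_COMPILER_VERSION__)
#if CGT_LACKS_CONFIRMED_FIX(__TI_COMPILER_VERSION__)
#warning "cgt version predates the MOVDL fix in 6.4.10 / 15.12.2.LTS"
#endif
#endif

/* Same predicate as a function, usable in host-side tooling. */
int cgt_lacks_confirmed_fix(unsigned long version)
{
    return CGT_LACKS_CONFIRMED_FIX(version) ? 1 : 0;
}
```

Turning the warning into #error would stop the build instead of only flagging it; which of the two fits depends on how the project handles tool upgrades.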