TMS320F28027: Error compiling __rpt_mov_imm(void * dst , int src , int count ) intrinsic

Part Number: TMS320F28027
Other Parts Discussed in Thread: CONTROLSUITE

Hi all,

I'm trying to use this intrinsic

void * result = __rpt_mov_imm(void * dst , int src , int count );

like this:

interrupt some_int()
{
...
signed int BL_ZC_HIGH_RPM_ZC_FUNC[BL_ZC_HIGH_RPM_ZC_FUNC_ARRAY_MASK + 1];//[32];


__rpt_mov_imm(BL_ZC_HIGH_RPM_ZC_FUNC, -1, BL_ZC_HIGH_RPM_ZC_FUNC_ARRAY_MASK);
....
}

and get the following compiler error:

>> Compilation failure
_code_/subdir_rules.mk:16: recipe for target '_code_/BLR2_z_cross_evo3_04_07_18.obj' failed
"../_code_/BLR2_z_cross_evo3_04_07_18.c", line 2672: warning #69-D: integer conversion resulted in a change of sign
>> ../_code_/BLR2_z_cross_evo3_04_07_18.c, line 2672:
INTERNAL ERROR: no match for COMMA


This may be a serious problem. Please contact customer support with a
description of this problem and a sample of the source files that caused this
INTERNAL ERROR message to appear.

Cannot continue compilation - ABORTING!

The code fails to compile only when I try to fill the array with a negative int.

I need to fill the array with -1, so I've tried -1 as above, 0xFFFF, and 65535, with no difference. If src is positive, it compiles fine.

Can anybody help me with that?
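
A side note on the three values tried: on a target whose int is 16 bits wide (as on the C28x), -1, 0xFFFF and 65535 all reduce to the same 16-bit pattern once converted to int, so it is not surprising that they behave identically. A minimal host-side sketch of that equivalence, assuming two's complement and using a hypothetical helper as_int16 (not part of any TI header):

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 16-bit pattern as a signed
   two's-complement value, the way a 16-bit int would hold it. */
static int16_t as_int16(uint16_t bits)
{
    if (bits & 0x8000u) {
        /* Top bit set: the value is negative, so subtract 2^16. */
        return (int16_t)((int32_t)bits - 65536);
    }
    return (int16_t)bits;
}
```

With this helper, as_int16(0xFFFF) and as_int16(65535) both compare equal to -1.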

  • Oh, forgot to say, I'm using CCS Version: 8.3.0.00009
    and a compiler TI v 18.1.4.LTS
  • Thank you for reporting the problem.  I can reproduce the same behavior.  I filed CODEGEN-5862 in the SDOWP system to have this investigated.  You are welcome to follow it with SDOWP link below in my signature.

    As a workaround, consider replacing the intrinsic with a loop that does the same thing ...

    for (i = 0; i <= BL_ZC_HIGH_RPM_ZC_FUNC_ARRAY_MASK; i++) BL_ZC_HIGH_RPM_ZC_FUNC[i] = -1;

    If you build with optimization --opt_level=2 or higher, the compiler generates the same code as the intrinsic.

    Thanks and regards,

    -George

  • Hi George,

    Thanks for the reply, but

    unfortunately no, the compiler does not generate the same code as the intrinsic.

    I use opt level=5, and the compiler generates for the

    	    for(i = 0; i <= BL_ZC_HIGH_RPM_ZC_FUNC_ARRAY_MASK; i++)
    	    {
    	        BL_ZC_HIGH_RPM_ZC_FUNC[i] = -1;             
    	    }
    

    this code:

    MOVL XAR4,#_BL_ZC_HIGH_RPM_ZC_FUNC ; [CPU_ARAU] 
    
    MOVB XAR0,#7 ; [CPU_ALU] 
    $C$L105: 
    .dwpsn file "../_code_/BLR2_z_cross_evo3_04_07_18.c",line 2674,column 10,is_stmt,isa 0
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    MOV *XAR4++,#-1 ; [CPU_ALU] |2674| 
    .dwpsn file "../_code_/BLR2_z_cross_evo3_04_07_18.c",line 2672,column 17,is_stmt,isa 0
    BANZ $C$L105,AR0-- ; [CPU_ALU] |2672|

    Any other ideas?

    I have another optimization question: can you please suggest the best code (C, intrinsic, or asm) for this if / else-if structure?

    {
        ...
        signed int var;
        ...
        if(var > 0) { var = 1; } else if(var < 0) { var = -1; }
        ...
    }

    Thanks in advance.

  • Konstantin Shirokov said:
    unfortunately no, the compiler does not generate the same code as the intrinsic.

    I should have explained the workaround better.  I don't have all of your code and build options.  So, I cannot reproduce exactly what you see.  For the simple test case I wrote, I get the same assembly code as the intrinsic.

    Konstantin Shirokov said:
    Any other ideas?

    Try changing the build options.  Use different levels of optimization.  Or, if you use it, different values for the switch --opt_for_speed.  

    Konstantin Shirokov said:
    can you please suggest the best code (C, intrinsic, or asm) for this if / else-if structure

    There is no intrinsic which performs that operation.  If you build with optimization, I expect you would get code that is as good as hand-written assembly.  Is that the case?

    Thanks and regards,

    -George

  • Hi George,

    what compiler version do you use?

    My version is 18.1.4.LTS and this is the compiler command line:

    -v28 -ml -mt -O4 --opt_for_speed=5 --include_path="..." --include_path="..." --include_path="..." --include_path="..." --include_path="..." --advice:performance=all --diag_warning=225 --diag_wrap=off --display_error_number -k

    As you see, I use opt_level = 4 and speed vs size trade-off = 5

    As to the second question: my goal in optimizing the if / else-if structure is to eliminate conditional jumps and get flat code.

    For instance, to increment a circular pointer for an array of arbitrary length, I have to write:

    Uint16 ptr;
    ...
    
    ptr++;
    
    if(ptr >= N)
    {
    ptr = 0;
    }

    But if I use an array of 2^N elements, I can do better:

    Uint16 ptr;
    ....
    ptr = (ptr + 1) & (N-1);
    ....

    So maybe the if / else-if structure mentioned above can be computed in an analogous flat way?

  • Oops, sorry, the last example should of course be
    ptr = (ptr + 1) & (2^N-1);
    where 2^N is the array length.
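
The two ways of advancing the index can be sketched as follows. This is only an illustration: LOG2_LEN, BUF_LEN, BUF_MASK and the function names are made up for the example, and uint16_t stands in for Uint16. Note that in C the ^ operator is XOR, so the power of two has to be written as a shift.

```c
#include <stdint.h>

/* Hypothetical buffer size; it must be a power of two for the mask trick. */
#define LOG2_LEN 5u
#define BUF_LEN  (1u << LOG2_LEN)   /* 32 elements (1 << N, not 2 ^ N) */
#define BUF_MASK (BUF_LEN - 1u)     /* 0x1F */

/* Advance a circular index with a compare; works for any length. */
static uint16_t next_index_branch(uint16_t idx, uint16_t len)
{
    idx++;
    if (idx >= len) {
        idx = 0;
    }
    return idx;
}

/* Advance a circular index with a mask; power-of-two length only. */
static uint16_t next_index_masked(uint16_t idx)
{
    return (uint16_t)((idx + 1u) & BUF_MASK);
}
```

For a 32-element buffer both functions wrap index 31 back to 0; only the first one also handles lengths that are not a power of two.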
  • Konstantin Shirokov said:
    what compiler version do you use?

    The same one.  Version 18.1.4.LTS.  

    You want me to find a workaround that is specific to your code. To do that, I need a test case from you which guarantees I see the same output for the same reason.  So, for the source file which contains the problem __rpt_mov_imm, please follow the directions in the article How to Submit a Compiler Test Case.

    Konstantin Shirokov said:
    As to the second question: my goal in optimizing the if / else-if structure is to eliminate conditional jumps and get flat code.

    The C28x CPU has conditional instructions that work like that.  And the compiler uses them when appropriate.  I would expect the compiler to use the conditional instructions in this case.  Is it emitting branches instead?

    Thanks and regards,

    -George

  • Hello George,

    OK, I'll submit the code a little bit later.

    As to my second question: yes, the compiler generates a B instruction.

    So I found a solution. Maybe it's not the best, but it's faster than the compiler-generated if / else-if structure, does not empty the pipeline, and takes the same number of clock cycles for the positive and negative cases:

                    /*if(zc_func > 0)
                    {
                        zc_func = 1;
                    }
                    else if(zc_func < 0)
                    {
                        zc_func = -1;
                    }*/
                    // use these intrinsics instead; the semantics are the same
                    zc_func = __min(zc_func, 1);        //10 cpu ticks
                    zc_func = __max(zc_func, -1);
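
For integers, clamping to the range [-1, 1] gives the same result as the commented-out if / else-if. A portable sketch of that equivalence follows; min_int and max_int are hypothetical stand-ins for the intrinsics so that the code builds on any host, and sign_compare is a further alternative that is not taken from this thread.

```c
/* Reference: the original if / else-if form. */
static int sign_branch(int v)
{
    if (v > 0) {
        v = 1;
    } else if (v < 0) {
        v = -1;
    }
    return v;
}

/* Hypothetical portable stand-ins for the min/max intrinsics. */
static int min_int(int a, int b) { return (a < b) ? a : b; }
static int max_int(int a, int b) { return (a > b) ? a : b; }

/* Clamp to [-1, 1]: positive values become 1, negative become -1, 0 stays 0. */
static int sign_clamp(int v)
{
    v = min_int(v, 1);
    v = max_int(v, -1);
    return v;
}

/* Alternative: difference of two comparison results (each is 0 or 1). */
static int sign_compare(int v)
{
    return (v > 0) - (v < 0);
}
```

Whether any of these forms ends up as branch-free machine code depends on the compiler and its options, so the generated assembly is still worth checking.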

  • So that I can recommend a specific workaround, I'd appreciate the test case I requested.

    Thanks and regards,

    -George

  • "C:/ti_ccs8/ccsv8/tools/compiler/ti-cgt-c2000_18.1.4.LTS/bin/cl2000" -v28 -ml -mt -O4 --opt_for_speed=5 --include_path="C:/Users/user/Desktop/WinApplication/BLR_PC_ccs8/BLR2" --include_path="C:/ti_ccs8/ccsv8/tools/compiler/ti-cgt-c2000_18.1.4.LTS/include" --include_path="C:/Users/user/Desktop/WinApplication/BLR_PC_ccs8/BLR2/_code_/Headers_from_controlSUITE/include_defs" --include_path="C:/Users/user/Desktop/WinApplication/BLR_PC_ccs8/BLR2/_code_/Headers_from_controlSUITE" --include_path="C:/Users/user/Desktop/WinApplication/BLR_PC_ccs8/BLR2/_code_/Headers_from_controlSUITE/include" --advice:performance=all --preproc_with_comment --preproc_with_compile --diag_warning=225 --diag_wrap=off --display_error_number -k --obj_directory="_code_" "../_code_/BLR2_main.c"
    Finished building: "../_code_/BLR2_main.c"

    compiler version 18.1.4.LTS

    BLR2_main.pp.txt

  • Thank you for the test case.  Here is a workaround to consider.  Change the value of the option --opt_for_speed to 3 or lower. That causes the problem loop to be implemented with RPT.    

    Thanks and regards,

    -George

  • But how will lowering the opt level impact the rest of the code? Won't that degrade overall performance? And can you please suggest how I can measure it?

    Is there another compiler for that micro? Because, honestly speaking, looking at the generated code, I think TI's compiler does not provide the optimization I expected after reading the manual.

  • George, can you please also ask the TI compiler developers what the logic behind that compiler behavior is? Why does it generate less efficient code at the maximum opt level? How should I use the opt level parameter from now on?
    Thank you very much.
  • Konstantin Shirokov said:
    But how will lowering the opt level impact the rest of the code?

    Lowering the value of the option --opt_for_speed only affects the trade-off between speed and size.  It does not affect overall optimization, which is controlled by the option --opt_level.

    Konstantin Shirokov said:
    Is there another compiler for that micro?

    No.

    Thanks and regards,

    -George

  • "Lowering the value of the option --opt_for_speed only affects the trade-off between speed and size.  It does not affect overall optimization, which is controlled by the option --opt_level."

    That's not quite clear to me. What do you mean by "overall optimization"? The statement "only affects the trade-off between speed and size" seems to contradict "does not affect overall optimization". How can that be? What is the logic behind the --opt_for_speed settings?

    Which setting should I use to get the fastest possible code, --opt_for_speed=3 or --opt_for_speed=5, and why?

  • Please see if this description of the options --opt_for_speed and --opt_level is useful.

    Thanks and regards,

    -George