Compiler/TMS320F28377D: Compiler does not emit conditional moves of floating-point registers

IainRist

Intellectual 711 points

Part Number: TMS320F28377D

Tool/software: TI C/C++ Compiler

The C28x+FPU supports conditional moves of floating-point registers via the following instructions:

MOV32 RaH, RbH {, CNDF}

MOV32 RaH, mem32 {, CNDF}

MOV32 mem32, RbH {, CNDF}

The compiler (v18.12.1.LTS) will not emit these instructions.

For example, the code:

void fp_conditional_move(float *a, float b)
{
*a = ( b == .5f ) ? -.5f : b;
}

generates the following assembly at -O2, -O3, and -O4 (trimmed for brevity):

||fp_conditional_move||:

        ADDB      SP,#2                 ; [CPU_ARAU]
        CMPF32    R0H,#16128            ; [CPU_FPU] |3|
        MOVST0    ZF, NF                ; [CPU_FPU] |3|
        B         ||$C$L1||,NEQ         ; [CPU_ALU] |3|
        MOVIZ     R0H,#48896            ; [CPU_FPU] |3|
||$C$L1||:
        SUBB      SP,#2                 ; [CPU_ARAU]
        MOV32     *+XAR4[0],R0H         ; [CPU_FPU] |3|
        LRETR     ; [CPU_ALU]

Including the superfluous (?) stack pointer operations, this will typically take 12 cycles. This assumes the branch doesn't cause a prefetch miss, which would add Flash wait-state cycles.
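A note on the immediates in the listing: CMPF32 and MOVIZ take a 16-bit immediate that supplies the upper 16 bits of an IEEE-754 single-precision value, with the lower 16 bits taken as zero. So #16128 (0x3F00) is 0.5f and #48896 (0xBF00) is -0.5f. The sketch below checks that on a host PC; the function names are made up for illustration, and it assumes a 32-bit IEEE-754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: expand a 16-bit FPU immediate into the float it
 * denotes. The immediate is the upper half of the 32-bit encoding and
 * the lower half is zero. Assumes float is 32-bit IEEE-754. */
float f32_from_hi16(uint16_t hi16)
{
    uint32_t bits = (uint32_t)hi16 << 16;
    float value;
    memcpy(&value, &bits, sizeof value);   /* bit-exact reinterpretation */
    return value;
}

/* Hypothetical inverse: the upper 16 bits of a float's encoding. Only
 * values whose lower 16 bits are zero survive the round trip exactly. */
uint16_t hi16_from_f32(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return (uint16_t)(bits >> 16);
}

/* Checks the two immediates that appear in the listing above. */
void check_listing_immediates(void)
{
    assert(f32_from_hi16(16128) == 0.5f);    /* CMPF32 R0H,#16128 */
    assert(f32_from_hi16(48896) == -0.5f);   /* MOVIZ  R0H,#48896 */
}
```

Calling check_listing_immediates() from a host test program should pass silently if the decoding above is right.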

This looks like inefficient code to me. I expected a conditional overwrite of the R0H register with the value -0.5f, i.e. (my comments):

||fp_conditional_move||:
        ; *XAR4 = (R0H == .5f) ? -.5f : R0H
        MOVIZ     R1H,#48896            ; R1H = -.5f (R1H is caller-saved, i.e. trash)
        CMPF32    R0H,#16128            ; R0H == .5f (sets ZF if equal)
        MOV32     R0H, R1H, EQ          ; if (R0H == .5f) R0H = R1H
        MOV32     *+XAR4[0],R0H         ; *XAR4 = R0H
        LRETR     ; [CPU_ALU]

This code is branchless and takes 4 cycles, making it three times as fast as the 12-cycle compiler-generated branching code (a 200% speedup).
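The data flow of the hand-written sequence can be sanity-checked on a host PC with a small C model. This is only a sketch: fp_conditional_move_model and models_agree are hypothetical names, each statement stands in for one instruction, and it says nothing about FPU pipeline delays, flag timing, or NaN handling on the real device.

```c
#include <assert.h>
#include <stdbool.h>

/* The C function from the top of this post, unchanged in behavior. */
void fp_conditional_move(float *a, float b)
{
    *a = (b == .5f) ? -.5f : b;
}

/* Hypothetical host-side model of the hand-written sequence.
 * Local variables stand in for the FPU registers and the zero flag. */
void fp_conditional_move_model(float *xar4, float r0h)
{
    float r1h = -0.5f;           /* MOVIZ  R1H,#48896               */
    bool  zf  = (r0h == 0.5f);   /* CMPF32 R0H,#16128  (ZF if equal) */
    r0h = zf ? r1h : r0h;        /* MOV32  R0H, R1H, EQ             */
    *xar4 = r0h;                 /* MOV32  *+XAR4[0],R0H            */
}

/* Returns true when the model and the C function store the same value. */
bool models_agree(float b)
{
    float expected, actual;
    fp_conditional_move(&expected, b);
    fp_conditional_move_model(&actual, b);
    return expected == actual;
}

/* Exercises the taken and not-taken cases of the conditional move. */
void check_models(void)
{
    assert(models_agree(0.5f));    /* condition true: result is -0.5f  */
    assert(models_agree(-0.5f));   /* condition false: value unchanged */
    assert(models_agree(0.0f));
    assert(models_agree(3.75f));
}
```

Agreement here only shows that the intended register-level logic matches the C source for ordinary values; whether the instruction sequence is valid on the device still has to be confirmed against the FPU instruction set documentation.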

Is my conditional move code correct?

Why does the compiler not emit conditional moves?

I have previously worked on C6000 cores where the compiler is very happy to emit conditional instructions.

over 5 years ago

0 George Mock over 5 years ago

TI Guru 239845 points

This does appear to be a performance problem in the compiler. I filed the entry CODEGEN-6788 in the SDOWP system to have this investigated. It is not filed as a bug, because the code executes correctly. Instead, it asks for the compiler to be changed to emit faster code for this input. Sometimes, the requested performance improvement is not possible, or impractical. If this turns out to be the case, an explanation will be given.

You are welcome to follow it with the SDOWP link below in my signature.

Thanks and regards,

-George

0 Anna Youssefi over 4 years ago in reply to George Mock

TI Intellectual 2560 points

Hi,

Where are you seeing the conditional store instruction listed? The floating point architecture reference guide we have only lists conditional floating point loads and register to register moves, but no conditional floating point store instructions.

Thanks,

Anna Youssefi

Compiler Support Team

0 IainRist over 4 years ago in reply to Anna Youssefi

Intellectual 711 points

Hello Anna,

Looks like I imagined the "MOV32 mem32, RaH {, CNDF}" instruction. I cannot find it in my documentation and I have confirmed my assembler (18.12.1.LTS) emits an error on that instruction. I probably assumed it existed for symmetry reasons.

I have confirmed the assembler accepts the "MOV32 RaH, mem32 {, CNDF}" and "MOV32 RaH, RbH {, CNDF}" instructions.

Thanks,

Iain
