MSP430-GCC generates poor code

Other Parts Discussed in Thread: MSP430G2553

I've been using gcc for the last couple of months to program MSP430 microcontrollers, and have noticed that the compiler often produces suboptimal code. For example, something as simple as

if (value & 0x80) {}

gets compiled (-O3 or -Os) into

  1. a right shift by 7
  2. a logical AND with 0xff

Of course, because the base MSP430 CPU has no multi-bit shift instruction (only single-bit shifts and rotates such as RRA and RRC), gcc links in about 200 bytes of helper code, and executes a total of ~30 instructions.

The straightforward implementation would be to just do the AND with 0x80 and compare the result to zero, or if one wanted to be clever, use the BIT instruction. 
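To make the comparison concrete, here is a minimal host-side C++ sketch. It is not MSP430 code and not what gcc emits verbatim; the helper names ShiftRight, TestViaShift, TestViaMask and AgreeUpTo are made up for illustration. It models the shift-based sequence as repeated single-bit shifts (roughly what a CPU without a multi-bit shifter has to do) next to the direct mask test, and checks at compile time that the two agree for every 8-bit value.

```cpp
#include <cstdint>

// Models a logical right shift by n done one bit at a time, the way a
// CPU without a multi-bit shifter has to do it (hypothetical helper).
constexpr std::uint16_t ShiftRight(std::uint16_t v, unsigned n) {
  return n == 0 ? v : ShiftRight(static_cast<std::uint16_t>(v >> 1), n - 1);
}

// The shape described above: shift right by 7, then AND with 0xff.
constexpr bool TestViaShift(std::uint8_t value) {
  return (ShiftRight(value, 7) & 0xff) != 0;
}

// The straightforward form: AND with 0x80 and compare against zero.
constexpr bool TestViaMask(std::uint8_t value) {
  return (value & 0x80) != 0;
}

// Recursively checks that both forms agree for all values in [0, v].
constexpr bool AgreeUpTo(unsigned v) {
  return TestViaShift(static_cast<std::uint8_t>(v)) ==
             TestViaMask(static_cast<std::uint8_t>(v)) &&
         (v == 0 || AgreeUpTo(v - 1));
}

static_assert(AgreeUpTo(255), "shift-based and mask-based tests must agree");
```

Since the results are identical, the only difference is cost: seven single-bit steps plus a call in one case, a single AND (or BIT) in the other.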

It is my understanding that GCC decides what instructions to use partially based on the TARGET_RTX_COSTS function in msp430.c:

#define TARGET_RTX_COSTS msp430_rtx_costs

static bool msp430_rtx_costs (rtx   x ATTRIBUTE_UNUSED,
                              int   code,
                              int   outer_code ATTRIBUTE_UNUSED,
                              int   opno ATTRIBUTE_UNUSED,
                              int * total,
                              bool  speed ATTRIBUTE_UNUSED)
{
  switch (code)
    {
    case SIGN_EXTEND:
      if (GET_MODE (x) == SImode && outer_code == SET)
        {
          *total = COSTS_N_INSNS (4);
          return true;
        }
      break;
    case ASHIFT:
    case ASHIFTRT:
    case LSHIFTRT:
      if (!msp430x)
        {
          *total = COSTS_N_INSNS (100);
          return true;
        }
      break;
    }
  return false;
}

This seems to penalize shifts, and I would expect gcc to avoid them in the case I mentioned above. However, if I change the costs in that function to anything else, such as 1 or 100000, and recompile my binary, there is absolutely no change (same md5 sum). So somehow it appears that the TARGET_RTX_COSTS function has no effect. Does anyone know why this might be happening?

Also, even if TARGET_RTX_COSTS worked correctly, it is virtually empty. For example, it makes no mention of the fact that multiplication is not supported in hardware and should be avoided. The function is 19 lines long; by comparison, for AVR microcontrollers, the equivalent is 822 lines long!

Are the maintainers of MSP430-GCC planning to invest in this area in the near future? A quick start would be to take the TARGET_RTX_COSTS function from mspgcc (MSP430-GCC's predecessor), which is about 130 lines long.

Just to be clear, I realize that this project just started recently, is in beta, and is open source. I do not mean to criticize it in any way.

Thanks!

  • To reproduce (main.cc):

    bool Test() {
      char buffer[1];
      return (buffer[0] & 0x80);
    }
    
    int main() {
      return Test();
    }

    /opt/msp430/bin/msp430-elf-gcc -mmcu=msp430g2553 -O2 -o main.elf main.cc

    /opt/msp430/bin/msp430-elf-objdump -dS main.elf | less

    0000c150 <_Z4Testv>:
        c150:       04 12           push    r4              ;
        c152:       04 41           mov     r1,     r4      ;
        c154:       0c 43           clr     r12             ;
        c156:       b0 12 90 c1     call    #49552          ;#0xc190
        c15a:       34 41           pop     r4              ;
        c15c:       30 41           ret                     
    
    0000c15e <main>:
        c15e:       04 12           push    r4              ;
        c160:       04 41           mov     r1,     r4      ;
        c162:       0c 43           clr     r12             ;
        c164:       b0 12 90 c1     call    #49552          ;#0xc190
        c168:       3c f0 ff 00     and     #255,   r12     ;#0x00ff
        c16c:       34 41           pop     r4              ;
        c16e:       30 41           ret               
    
    
    0000c190 <__mspabi_srli_7>:
        c190:       12 c3           clrc                    
        c192:       0c 10           rrc     r12             ;
    
    0000c194 <__mspabi_srli_6>:
        c194:       12 c3           clrc                    
        c196:       0c 10           rrc     r12             ;
    
    0000c198 <__mspabi_srli_5>:
        c198:       12 c3           clrc                    
        c19a:       0c 10           rrc     r12             ;
    
    0000c19c <__mspabi_srli_4>:
        c19c:       12 c3           clrc                    
        c19e:       0c 10           rrc     r12             ;
    
    0000c1a0 <__mspabi_srli_3>:
        c1a0:       12 c3           clrc                    
        c1a2:       0c 10           rrc     r12             ;
    
    0000c1a4 <__mspabi_srli_2>:
        c1a4:       12 c3           clrc                    
        c1a6:       0c 10           rrc     r12             ;
    
    0000c1a8 <__mspabi_srli_1>:
        c1a8:       12 c3           clrc                    
        c1aa:       0c 10           rrc     r12             ;
        c1ac:       30 41           ret                     
        c1ae:       3d 53           add     #-1,    r13     ;r3 As==11
        c1b0:       12 c3           clrc                    
        c1b2:       0c 10           rrc     r12             ;
    
    0000c1b4 <__mspabi_srli>:
        c1b4:       0d 93           cmp     #0,     r13     ;r3 As==00
        c1b6:       fb 23           jnz     $-8             ;abs 0xc1ae
        c1b8:       30 41           ret                     
    

    /opt/msp430/bin/msp430-elf-gcc -v
    Using built-in specs.
    COLLECT_GCC=/opt/msp430/bin/msp430-elf-gcc
    COLLECT_LTO_WRAPPER=/opt/msp430/libexec/gcc/msp430-elf/4.9.1/lto-wrapper
    Target: msp430-elf
    Configured with: ../sources/tools/configure --prefix=/opt/msp430 --target=msp430-elf --disable-itcl --disable-tk --disable-tcl --disable-libgui --disable-gdbtk
    Thread model: single
    gcc version 4.9.1 20140707 (prerelease (msp430-14r1-364)) (GNUPro 14r1) (Based on: GCC 4.8 GDB 7.7 Binutils 2.24 Newlib 2.1) (GCC)

  • Andrew,

    Thanks for the feedback and the detailed analysis.  We will get this on the list of features for next release.

    Thanks
    Greg

  • Here is an even simpler case, where we do a simple conversion from an int to bool (happens all the time in typical code).

    main.cc:

    int main() {
      volatile unsigned int v = 100;
      volatile bool i = v;
    }
    

    /opt/msp430/bin/msp430-elf-gcc -mmcu=msp430g2553 -O3 main.cc -o main.elf

    0000c150 <main>:
        c150:	04 12       	push	r4		;
        c152:	04 41       	mov	r1,	r4	;
        c154:	21 82       	sub	#4,	r1	;r2 As==10
        c156:	b4 40 64 00 	mov	#100,	-2(r4)	;#0x0064, 0xfffe
        c15a:	fe ff 
        c15c:	1c 44 fe ff 	mov	-2(r4),	r12	;
        c160:	0d 43       	clr	r13		;
        c162:	0d 8c       	sub	r12,	r13	;
        c164:	0c dd       	bis	r13,	r12	;
        c166:	b0 12 76 c1 	call	#49526		;#0xc176
        c16a:	c4 4c fd ff 	mov.b	r12,	-3(r4)	; 0xfffd
        c16e:	0c 43       	clr	r12		;
        c170:	21 52       	add	#4,	r1	;r2 As==10
        c172:	34 41       	pop	r4		;
        c174:	30 41       	ret			
    
    0000c176 <__mspabi_srli_15>:
        c176:	12 c3       	clrc			
        c178:	0c 10       	rrc	r12		;
    
    0000c17a <__mspabi_srli_14>:
        c17a:	12 c3       	clrc			
        c17c:	0c 10       	rrc	r12		;
    
    0000c17e <__mspabi_srli_13>:
        c17e:	12 c3       	clrc			
        c180:	0c 10       	rrc	r12		;
    
    0000c182 <__mspabi_srli_12>:
        c182:	12 c3       	clrc			
        c184:	0c 10       	rrc	r12		;
    
    0000c186 <__mspabi_srli_11>:
        c186:	12 c3       	clrc			
        c188:	0c 10       	rrc	r12		;
    
    0000c18a <__mspabi_srli_10>:
        c18a:	12 c3       	clrc			
        c18c:	0c 10       	rrc	r12		;
    
    0000c18e <__mspabi_srli_9>:
        c18e:	12 c3       	clrc			
        c190:	0c 10       	rrc	r12		;
    
    0000c192 <__mspabi_srli_8>:
        c192:	12 c3       	clrc			
        c194:	0c 10       	rrc	r12		;
    
    0000c196 <__mspabi_srli_7>:
        c196:	12 c3       	clrc			
        c198:	0c 10       	rrc	r12		;
    
    0000c19a <__mspabi_srli_6>:
        c19a:	12 c3       	clrc			
        c19c:	0c 10       	rrc	r12		;
    
    0000c19e <__mspabi_srli_5>:
        c19e:	12 c3       	clrc			
        c1a0:	0c 10       	rrc	r12		;
    
    0000c1a2 <__mspabi_srli_4>:
        c1a2:	12 c3       	clrc			
        c1a4:	0c 10       	rrc	r12		;
    
    0000c1a6 <__mspabi_srli_3>:
        c1a6:	12 c3       	clrc			
        c1a8:	0c 10       	rrc	r12		;
    
    0000c1aa <__mspabi_srli_2>:
        c1aa:	12 c3       	clrc			
        c1ac:	0c 10       	rrc	r12		;
    
    0000c1ae <__mspabi_srli_1>:
        c1ae:	12 c3       	clrc			
        c1b0:	0c 10       	rrc	r12		;
        c1b2:	30 41       	ret			
    

    Here is what this code does to convert an int to a bool:

    • Loads the value to be converted into R12
    • Clears R13 (R13 = 0)
    • Subtracts R12 from R13, so that R13 = -R12
    • ORs R13 into R12 (BIS R13, R12). The high bit of (v | -v) is set exactly when v is not zero
    • Shifts R12 right by 15 bits via a call to __mspabi_srli_15, leaving 0 or 1
    • Finally, stores the low byte of R12 to the bool variable

    This is a total of 37 instructions (!) to convert an integer to a bool.
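    The steps above can be written out in portable C++ as a minimal sketch. The function names NonZeroViaShift and NonZeroDirect are invented for illustration, and this models the arithmetic only, not the exact register usage.

```cpp
#include <cstdint>

// Branch-free "is non-zero" on a 16-bit value, mirroring the generated code:
// negate, OR with the original, then move bit 15 down to bit 0.
constexpr std::uint16_t NonZeroViaShift(std::uint16_t v) {
  return static_cast<std::uint16_t>(
      static_cast<std::uint16_t>(v | (0u - v)) >> 15);
}

// The direct form: a single comparison against zero.
constexpr std::uint16_t NonZeroDirect(std::uint16_t v) {
  return v != 0 ? 1 : 0;
}

static_assert(NonZeroViaShift(0) == NonZeroDirect(0), "zero");
static_assert(NonZeroViaShift(100) == NonZeroDirect(100), "small value");
static_assert(NonZeroViaShift(0x8000) == NonZeroDirect(0x8000), "high bit only");
static_assert(NonZeroViaShift(0x9000) == NonZeroDirect(0x9000), "needs the OR step");
```

    The 0x9000 case shows why the OR is needed: -0x9000 is 0x7000 in 16 bits, which has the high bit clear, so the negation alone would not be enough.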

    What the code should be doing:

    ; Input integer in R4
    MOV #0, R5
    CMP #0, R4
    JEQ $+4
    MOV #1, R5
    ; Bool value is now in R5

    4 instructions instead of 37.

    My workaround for now is to define a ToBool() function that inlines the four assembly instructions above, and to use that function explicitly for all bool conversions.
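    A sketch of what such a helper might look like is below. It is untested against msp430-elf-gcc 4.9.1, and several details are assumptions: that the compiler predefines __MSP430__, that the "=&r" and "r" constraints with a "cc" clobber are appropriate here, and that a local numeric label is preferable to a hard-coded jump offset. On any other target it falls back to a plain comparison so that the same header can be used in host-side tests.

```cpp
#if defined(__MSP430__)
// MSP430 build: hand-written compare-and-set, so no shift helper is called.
static inline bool ToBool(unsigned int v) {
  bool result;
  __asm__("mov #0, %0\n\t"  // result = false
          "cmp #0, %1\n\t"  // is v zero?
          "jeq 1f\n\t"      // yes: keep result = false
          "mov #1, %0\n"    // no: result = true
          "1:"
          : "=&r"(result)   // early-clobber: written before v is read
          : "r"(v)
          : "cc");          // CMP changes the status flags
  return result;
}
#else
// Any other target (for example host-side unit tests): plain comparison.
constexpr bool ToBool(unsigned int v) { return v != 0; }
#endif
```

    Call sites would then use ToBool(v) wherever an integer is converted to bool, instead of relying on the implicit conversion.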