Compiler/MSP430G2553: TI linker doesn't alias function sections?

Part Number: MSP430G2553

Tool/software: TI C/C++ Compiler

I can't figure out whether the TI linker (MSP430 toolchain) can alias function sections.  Without that, templatized code produces massive output bloat.  Consider the following trivial example:

#include <msp430.h> 
#include <stdint.h>

template <typename T>
const T& min(const T& a, const T& b) {
    return a < b ? a : b;
}

volatile uint16_t x = 1, y = 2, z;
volatile void* p1 = (void*)0xbeef;
volatile void* p2 = (void*)0xbeef;
volatile void* p;


/**
 * main.c
 */
int main(void)
{
	WDTCTL = WDTPW | WDTHOLD;	// stop watchdog timer
	
	z = min(x, y);
	p = min(p1, p2);

	return 0;
}

Here, the two min functions are byte-by-byte identical, since comparing pointers is exactly the same as comparing unsigned integers of the same size.

Compiling the above and checking the output with nm(1) yields:

0000c25c T _Z3minIPVvERKT_S4_S4_
0000c266 T _Z3minIVjERKT_S3_S3_

Clearly one of these function sections can be discarded and the other aliased to it.  Yeah, it would make it difficult to debug, but for release builds I don't care about this and just want to fit it into a smaller and cheaper processor.  I don't care if debugging or even profiling requires a 16k flash device, as long as all this goes away for release builds.

It's pretty easy to verify with an interleaved C source listing that the two functions do in fact emit exactly the same code.

Granted, this is a trivial example that would probably go away at higher levels of optimization due to opportunistic inlining.

I've tried disabling all debug information, as you can see from the asm listing below, but it still won't combine sections.

        .sect   ".text:_Z3minIVjERKT_S3_S3_"
        .clink
        .global _Z3minIVjERKT_S3_S3_

;*****************************************************************************
;* FUNCTION NAME: const T1 & min<volatile unsigned int>(const T1 &, const T1 &)*
;*                                                                           *
;*   Regs Modified     : SP,SR,r12                                           *
;*   Regs Used         : SP,SR,r12,r13                                       *
;*   Local Frame Size  : 0 Args + 0 Auto + 0 Save = 0 byte                   *
;*****************************************************************************
_Z3minIVjERKT_S3_S3_:
;* --------------------------------------------------------------------------*
;----------------------------------------------------------------------
;   5 | const T& min(const T& a, const T& b) {
;----------------------------------------------------------------------
;----------------------------------------------------------------------
;   6 | return a < b ? a : b;
;----------------------------------------------------------------------
        CMP.W     @r13,0(r12)           ; [] |6|
        JLO       $C$L1                 ; [] |6|
                                          ; [] |6|
;* --------------------------------------------------------------------------*
        MOV.W     r13,r12               ; [] |6|
;* --------------------------------------------------------------------------*
$C$L1:
        RET       ; []
        ; []
        .sect   ".text:_Z3minIPVvERKT_S4_S4_"
        .clink
        .global _Z3minIPVvERKT_S4_S4_

;*****************************************************************************
;* FUNCTION NAME: const T1 & min<volatile void *>(const T1 &, const T1 &)    *
;*                                                                           *
;*   Regs Modified     : SP,SR,r12                                           *
;*   Regs Used         : SP,SR,r12,r13                                       *
;*   Local Frame Size  : 0 Args + 0 Auto + 0 Save = 0 byte                   *
;*****************************************************************************
_Z3minIPVvERKT_S4_S4_:
;* --------------------------------------------------------------------------*
;----------------------------------------------------------------------
;   5 | const T& min(const T& a, const T& b) {
;----------------------------------------------------------------------
;----------------------------------------------------------------------
;   6 | return a < b ? a : b;
;----------------------------------------------------------------------
        CMP.W     @r13,0(r12)           ; [] |6|
        JLO       $C$L2                 ; [] |6|
                                          ; [] |6|
;* --------------------------------------------------------------------------*
        MOV.W     r13,r12               ; [] |6|
;* --------------------------------------------------------------------------*
$C$L2:
        RET       ; []
        ; []

Again, for things like non-trivial templatized base classes the bloat is absolutely massive, making the toolchain practically unusable.

  • If you build with --opt_level=3 or higher, then both calls to min are inlined, and no min functions are generated.  Is this a satisfactory solution?

    Thanks and regards,

    -George

  • I just used this to demonstrate the problem.  I explicitly didn't enable optimization so it wouldn't be hidden by inlining.  For example, in my project right now I have:

    typedef Uart<UCA0> Serial;
    typedef I2CBus<UCB0, I2C_BUS_SPEED> I2CBusType;
    typedef I2CDevice<I2CBusType, UCB0> I2CDeviceType;
    typedef ads1115::ADC<I2CDeviceType> ADS1115;
    typedef ad5667r::DAC<I2CDeviceType> AD5667;
    typedef n24lc04::Eeprom<I2CDeviceType> Eeprom; 
    ...

    And more.

    Inlining doesn't do much here.
  • The compiler toolchain does not do any analysis to find functions that have the same assembly code, and then combine them. 

    I'd appreciate if we could get a test case that would benefit from that analysis.  If your code is organized as a CCS project, and you are willing to send it in, please package it into a zip file according to the article Project Sharing, and attach it to your next post.  If you aren't comfortable with that, perhaps you have an instance of it that occurs in a single C++ source file, and you are willing to submit it as detailed in How To Submit A Compiler Test Case.  In either case, we would need guidance from you as to which template instantiations lead to code bloat.

    Thanks and regards,

    -George

  • Will do - I'll try to create something reasonably meaningful and self-contained in a single source file...
  • Have you had a chance to create a test case?

    Thanks and regards,

    -George
  • Since it has been a while, I presume you have resolved the problem. I would appreciate hearing how you resolved it.

    Thanks and regards,

    -George
  • The problem turned out to be related to the symbol names used for template classes.

    If X and Y are both derived from a template T, then when X instantiates T, the generated names include X; basically the functions become X::A::f().  The same happens with Y, giving Y::A::f(), so they end up as different symbols.  With gcc they are named only A::f() regardless of whether they are used from X or Y, and the linker then eliminates the duplicate, leaving only one instance in the output.  There are also differences in instantiation strategy: gcc is very greedy and emits the same code in just about every .o file it produces, relying on the linker to eliminate duplicates.  The TI compiler is more deliberate.  I'm not sure which is better (they both have their advantages), but in this case gcc's strategy worked better, so we just switched to it.  The output shrank from 13k to about half that, 7k or so.
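    A minimal sketch of the setup described above, assuming X and Y both derive from the same specialization of a class template T.  The names T<int>, value, f and shares_base_member are hypothetical and not taken from the actual project.  In standard C++ the inherited member belongs to T<int> itself, so both derived classes refer to one function, which is the single-symbol behaviour described for gcc.

```cpp
// Hypothetical stand-ins for the X, Y and T described in the post.
template <typename V>
struct T {
    V value{};
    V f() const { return value; }
};

struct X : T<int> {};
struct Y : T<int> {};

// Returns true when X and Y resolve f to the same member function.
bool shares_base_member() {
    // Both expressions have type int (T<int>::*)() const, because f is a
    // member of T<int> and is only inherited by X and Y.
    auto fx = &X::f;
    auto fy = &Y::f;
    return fx == fy;
}
```

    Under the Itanium C++ ABI mangling (the scheme visible in the _Z3min... symbols above), this member would be expected to appear once as _ZN1TIiE1fEv, with no X or Y in the name.  Whether a given toolchain emits extra, differently named copies has to be checked in its own symbol listing.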

    It's really hard to reproduce with a small test case, because in MOST cases the TI compiler generates symbols similar to gcc.  I couldn't really come up with a really small test case for it...

    Still, aliasing read-only sections with duplicate contents, even if they're named differently, would be a nice feature. :)
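    A rough sketch of what such a folding pass might look like, written as ordinary host-side C++ rather than as anything the TI linker actually does.  The Section type and fold_identical are hypothetical names, and the model assumes sections are plain byte strings with no relocations.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a read-only input section: a symbol name plus raw bytes.
struct Section {
    std::string symbol;
    std::vector<std::uint8_t> bytes;
};

// Maps every symbol to the symbol whose copy of the bytes is kept.
// Sections with identical contents alias the first one seen.
std::map<std::string, std::string> fold_identical(const std::vector<Section>& sections) {
    std::map<std::vector<std::uint8_t>, std::string> kept;  // contents -> surviving symbol
    std::map<std::string, std::string> alias;               // symbol -> surviving symbol
    for (const Section& s : sections) {
        // emplace leaves an existing entry untouched and returns an iterator to it.
        auto it = kept.emplace(s.bytes, s.symbol).first;
        alias[s.symbol] = it->second;
    }
    return alias;
}
```

    A real implementation would also have to compare relocations, and folding makes two distinct functions share one address, which can matter to code that compares function pointers.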

  • Correction to my previous post: I meant X::T::f(), Y::T::f(), and T::f(), not X::A::f(), Y::A::f(), and A::f().