MSP430FR2422: MSP430-GCC 8.3.1.0: Unused bit shift functions in compilation unit

Part Number: MSP430FR2422

When I incorporate, say, a 2-bit left shift in my code, the resulting .o file contains the entire library of shift functions and multiply functions.

The FR2422 has a hardware multiplier so I am not sure why those are pulled in.

These are my args to the compiler and linker

CFLAGS = -mmcu=$(MCU) -std=gnu11 -g -Os -fno-ipa-icf-functions \
-Wall -Wunused -ffunction-sections -fdata-sections -minrt -fomit-frame-pointer -fwrapv -MMD \
$(INCLUDES) $(USER_DEFINES)


ASFLAGS = -mmcu=$(MCU) -x assembler-with-cpp -Wa,-gstabs


LDFLAGS = -nostartfiles -Wl,--gc-sections,-Map=$(TARGET).map,-umain,-L$(TRIQDIR)/$(ARCH)/linker

The functions incorporated into the .o total 0x3c4 (964 bytes).

I have a custom bootloader/fw upload system on the FR2422, leaving only 4.8k for application code space.

I have to trim this waste down.

Note that the final code does not run like a "normal" MSP430 executable, in that there is no main() function.

The executable is not invoked by the reset interrupt, but rather a FW loader.

Given that this code does not follow the typical formula, is there a simple linker flag that can pare down these unused functions?

Or some way to exclude these extra symbols to mitigate this in the short term?

Thanks,

Dave

  • What version of MSP430-GCC are you using? This sounds like an issue that was present in MSP430-GCC 7.x, but has been fixed since MSP430-GCC 8.2.0.

    If you are using the latest version, please can you provide a reproducible test case. I wasn't able to reproduce this with your given FLAGS and putting together a small source file that uses a left shift.

    Thanks,

  • Apologies, I now see you listed MSP430-GCC 8.3.1 in the title.

    In that case, please provide a test case if possible.

    Thanks,

  • Here's what I tried:

    #include "msp430.h"
    
    int a = 2;
    int b = 4;
    
    void __attribute__((noinline))
    shifter (int c)
    {
      b = c << 2;
    }
    
    void __attribute__((interrupt(TIMER0_A0_VECTOR)))
    isr (void)
    {
      a++;
      shifter (a);
    }
    

    I can see that "__mspabi_slli" is used to perform the shift, and the linker does initially pull in other shift functions which are in the same object file. But it discards them thanks to --gc-sections:

    Discarded input sections                                                           
                                                                                       
     .text          0x0000000000000000        0x0 tester.o                             
     .data          0x0000000000000000        0x0 tester.o                             
     .bss           0x0000000000000000        0x0 tester.o                             
     .text          0x0000000000000000        0x0 /home/jozef/msp430-gcc-8.3.1.25_linux
     .data          0x0000000000000000        0x0 /home/jozef/msp430-gcc-8.3.1.25_linux
     .bss           0x0000000000000000        0x0 /home/jozef/msp430-gcc-8.3.1.25_linux
     .text.__mspabi_slli_n                                                             
                    0x0000000000000000       0x20 /home/jozef/msp430-gcc-8.3.1.25_linux
     .text.__mspabi_slll_n                                                             
                    0x0000000000000000       0x3e /home/jozef/msp430-gcc-8.3.1.25_linux
     .text.__mspabi_slll                                                               
                    0x0000000000000000        0xc /home/jozef/msp430-gcc-8.3.1.25_linux
     .text.__mspabi_sllll                                                              
                    0x0000000000000000       0x1e /home/jozef/msp430-gcc-8.3.1.25_linux

  • Jozef,

      I did a little more digging and found that I had missed the implicit multiply here:

    struct port {

      ....

    } ports[6];

    ...

    struct port *p = ports + i;

    Because the size of struct port is not a power of 2, the multiply routines are brought in.

    Even the f5 routines.

    That's when everyone else gets included too.
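
To make the implicit multiply concrete, here is a minimal sketch. The field names and the 6-byte layout of `struct port` are assumptions for illustration only (the real struct is elided above). The point is that `ports + i` with a run-time `i` needs the byte offset `i * sizeof(struct port)`. Whether that becomes a library call, inline shifts and adds, or a hardware multiplier access depends on the compiler version and options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three 16-bit fields = 6 bytes, not a power of two. */
struct port {
    uint16_t base;
    uint16_t mask;
    uint16_t state;
};

static struct port ports[6];

/* Indexing with a run-time i needs the byte offset i * sizeof(struct port). */
struct port *port_by_index(unsigned i)
{
    return ports + i;
}

/* The same offset computed explicitly, to make the hidden multiply visible. */
size_t port_byte_offset(unsigned i)
{
    return (size_t)i * sizeof(struct port);
}
```

If the struct could be padded to a power-of-two size (for example 8 bytes), the offset could in principle be computed with a shift instead, at the cost of some RAM.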

  • As I only have 6 port structures, I made a switch statement to select the offset instead of using the implicit multiply.

    The code usage shrank from 3504 to 2646, a reduction of 858 bytes.

    So I do believe it is the implicit pointer offset multiply.
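
The switch-based workaround described above might look roughly like the following. This is a sketch, not Dave's actual code: the struct fields and the function name `port_select` are made up for illustration. Each case uses a constant index, so the address of each element is known at link time and no run-time multiply by the struct size is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fields; the real struct is not shown in the thread. */
struct port {
    uint16_t base;
    uint16_t mask;
    uint16_t state;
};

static struct port ports[6];

/* Select an element by constant index in each case instead of ports + i. */
struct port *port_select(unsigned i)
{
    switch (i) {
    case 0: return &ports[0];
    case 1: return &ports[1];
    case 2: return &ports[2];
    case 3: return &ports[3];
    case 4: return &ports[4];
    case 5: return &ports[5];
    default: return NULL;   /* out of range */
    }
}
```

This only scales to a small, fixed number of elements, which matches the six ports mentioned here.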

  • Ok, so it doesn't sound like there is still a problem?

    I'm aware of some possible issues with the naming of hardware multiply routines; sometimes they are inconsistent. For the F5 routines, I can see that some of the names end in "_f5" and some don't. There shouldn't be any functional issue with this though.

    As a side note, I noticed you don't have -mmcu in your LDFLAGS. I'm guessing that -mmcu reaches the linker invocation through some other Make variable; otherwise I think you'd have bigger problems. Just in case, make sure that -mmcu is on the command line every time you invoke msp430-elf-gcc.

    Regards,

  • My initial concern about the bit shift was wrong for sure.

    While I can progress now with my workaround, I would say that pulling in 900 more bytes into your executable because of a pointer offset multiply is somewhere between "surprising" and "bug", especially when the target platform supports HW multiply.

    So I would say my issue has a workaround, but not really resolved.

  • Ok, but I do need a test case that I can build myself along with the command line invocations of the tools.

    When I perform some multiplication in my test case I can see that the linker garbage collects the unused multiply routines.

    Thanks,

  • I see what's going on, let me explain my specialized situation.

    Most likely you are building a full test application, including the final link.

    Then I can assume that this is not an issue for you as you have garbage collection.

    I am building an application to be loaded at runtime on the processor so I do not perform a final link on my application.

    I produce a .o file, export it as hex and FW loader takes the hex and loads it.

    Thus I do not have a garbage collection stage in my build process.

    Is there a way to garbage collect without requiring a full application link?

  • I'm afraid I don't fully understand how this all fits together.

    The ".o" file is the output of the assembler, which contains only the references to shift library functions and hardware multiply library functions. Is your loader satisfying these references by pulling code from the libraries?

    What are the LDFLAGS in your original post used for? You have garbage collection (--gc-sections) enabled.

    What if you perform a partial/relocatable link by passing "-r" to the linker? There is some guidance on --gc-sections with relocatable links here: https://sourceware.org/binutils/docs-2.26/ld/Options.html#Options

    Your loader may then be able to use this partially linked output as it uses the ".o" assembler output.

    Regards,

  • I am sorry I am doing a poor job explaining a process that I hacked together from my limited understanding of the tools.

    There are 2 main software components on the target processor:

    -A resident process running on the target processor that handles I2C comms and application SW uploads

    -The application SW itself

    (I am sorry for the misleading LDFLAGS, they are never invoked by the process but are used by the build for the FW loader running on the target processor)

    I compile my application SW in one command. This actually does some linking to objects present in the resident process. The linker map (application.ld) referenced ensures that all application code resides in a reserved address range. Here is an example of the command.

    /opt/ti/msp430-8.3.1.0/bin/msp430-elf-gcc -mmcu=msp430fr2422 -std=gnu11 -g -Os -fno-ipa-icf-functions -Wall -Wunused -ffunction-sections -fdata-sections -minrt -fomit-frame-pointer -fwrapv -MMD -I . -I ../ -I ../../ -I ../common -I../../../include -I/opt/ti/msp430-6.1.0.0/include -I/opt/GCC_RH/include TBIN.c -o TBIN.o -nostartfiles -T ../../TRIQ/fr2422/linker/application.ld -Wl,-Map=TBIN.map,-L../../TRIQ/fr2422/linker

    In this case, TBIN.o ends up containing all the code needed to run the application, including all the extra shift and multiply functions pulled in.

    Then TBIN.o goes through an objdump pipeline to reduce the .o to a raw binary file, which is then sent to the target processor via I2C.

    So to summarize, the compile and link are done in one step.

    The link doesn't do any garbage collection.

    I would need the link to cull out any extra library functions introduced in this command.

  • I am a little confused by this. The FR2422 has a CPUX core and as such it has native multiple bit shift operations and no need for a shifting library for 16bit ints.

  • Dave Bender said:

    I am sorry I am doing a poor job explaining a process that I hacked together from my limited understanding of the tools.

    There are 2 main software components on the target processor:

    -A resident process running on the target processor that handles I2C comms and application SW uploads

    -The application SW itself

    (I am sorry for the misleading LDFLAGS, they are never invoked by the process but are used by the build for the FW loader running on the target processor)

    Ok, thank you for clarifying.

    Dave Bender said:

    I compile my application SW in one command. This actually does some linking to objects present in the resident process. The linker map (application.ld) referenced ensures that all application code resides in a reserved address range. Here is an example of the command.

    /opt/ti/msp430-8.3.1.0/bin/msp430-elf-gcc -mmcu=msp430fr2422 -std=gnu11 -g -Os -fno-ipa-icf-functions -Wall -Wunused -ffunction-sections -fdata-sections -minrt -fomit-frame-pointer -fwrapv -MMD -I . -I ../ -I ../../ -I ../common -I../../../include -I/opt/ti/msp430-6.1.0.0/include -I/opt/GCC_RH/include TBIN.c -o TBIN.o -nostartfiles -T ../../TRIQ/fr2422/linker/application.ld -Wl,-Map=TBIN.map,-L../../TRIQ/fr2422/linker

    Why can't you just pass -Wl,--gc-sections in this link command? This should fix your problem.

    An alternative "hacky" way round this if you can't use --gc-sections is to maybe use objcopy to only copy the sections you want (the "-j" option) from the linked executable to the "real" executable you'll use. Ultimately this is really unnecessary and gc-sections should be used instead.

    Also, using the ".o" suffix for your linked executable file is a bit confusing since that is reserved for object files output by the assembler. ".out",  ".exe" or ".elf" are common suffixes for the names of linked executable files.

  • David Schultz36 said:
    I am a little confused by this. The FR2422 has a CPUX core and as such it has native multiple bit shift operations and no need for a shifting library for 16bit ints.

    GCC will generate one of the native "rotate" instructions for shift operations when the source and destination register are the same.

    But when the source and destination register are different, it will use the library functions most/all of the time. I believe if you have at least 2 instances of the same type of shift in your code, this saves code size overall.

    I'm not saying this is totally optimal, but it's the way it's always been. The general shift behaviour of the compiler is on my list of things to review and look at improving.

  • Jozef,

      gc-sections wiped out most of my application code. I think discussing my issue with you has given me a better insight in how to manage the dependency bloat.

    I do use objdump hacks to get an uploadable binary file and I have to be careful with certain things (initialized data sections, etc).

    Honestly, my app FW has to act more like a dynamic library but this will require some modification on the FW handler side as well.

    I will consider the issue resolved with lessons learned.

  • Dave Bender said:

      gc-sections wiped out most of my application code. I think discussing my issue with you has given me a better insight in how to manage the dependency bloat.

    This can happen if the entry point for the program is not set up properly. I understand your application does not have a standard setup: there may not be a single entry point, and some functions that your application actually uses may appear unused to the linker.

    What I would suggest is making use of KEEP directives in your linker script. 

    When link-time garbage collection is in use (`--gc-sections'), it is often useful to mark sections that should not be eliminated. This is accomplished by surrounding an input section's wildcard entry with KEEP(), as in KEEP(*(.init)) or KEEP(SORT_BY_NAME(*)(.ctors)).

    https://sourceware.org/binutils/docs-2.26/ld/Input-Section-Keep.html#Input-Section-Keep

    You could mark all .text sections originating from your source code to be kept, whilst leaving unused .text sections from the libraries to be garbage collected.

    Only the library functions used by your "kept" sections (and other used sections) will be saved from garbage collection. Those unused and unwanted shift and hwmult library functions will be removed.

    For example:

    .text :
    {
      ....
      KEEP (TBIN.o(.text .text.*))
      ....
    }

    Note TBIN.o is the name of the object file created by the assembler. You'll need to make sure this file is retained by compiling/assembling in a separate step, otherwise GCC will just give the assembler output file a random name in /tmp.

    It may not cause problems, but you may also want to rename your linked output file from TBIN.o to something like TBIN.{elf,out,exe}, as suggested before.

    There's useful documentation on linker scripts here if you decide to go this route.
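
As a variation on the KEEP suggestion, application functions could be placed in their own named section from the C side, so the linker script can match them by section name rather than by object file name. This is only a sketch under assumptions: the section name `.app_text`, the macro `APP_CODE`, and the function names are hypothetical, and the linker script would need a matching entry such as KEEP (*(.app_text .app_text.*)) inside its .text output section.

```c
#include <assert.h>

/* Hypothetical section name; must match a KEEP pattern in the linker script. */
#if defined(__ELF__)
#define APP_CODE __attribute__((section(".app_text")))
#else
#define APP_CODE /* section attribute syntax differs on non-ELF hosts */
#endif

static int app_state;

/* Functions tagged APP_CODE land in .app_text instead of .text.<name>. */
APP_CODE void app_start(void)
{
    app_state = 1;
}

APP_CODE int app_poll(void)
{
    return app_state << 2;
}
```

Library helpers stay in their own `.text.*` sections, so with --gc-sections only the ones reachable from the kept code should survive.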

    Dave Bender said:

    I will consider the issue resolved with lessons learned.

    I'm glad these discussions helped,

  • Hacking up the linker script strikes me as a poor way to go. The problem is that the program doesn't use the standard entry point (defined in the linker script as _start) so that the linker can't use that hook to find all referenced sections. There is of course an ld option that helps: "--entry=ENTRY".

  • David Schultz36 said:

    Hacking up the linker script strikes me as a poor way to go. The problem is that the program doesn't use the standard entry point (defined in the linker script as _start) so that the linker can't use that hook to find all referenced sections. There is of course an ld option that helps: "--entry=ENTRY".

    IIRC Dave's program sometimes uses absolute addresses of functions/data, fed in from some external source. So the linker may not see a path from any well chosen entry point to every used function.

    I definitely agree that if --entry can be made to work then that is the preferred solution.

    Regards,
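
If the loader could be changed to look functions up through a table instead of using absolute addresses, a single root symbol would give the linker a path to every function the application exposes. The sketch below assumes exactly that; `app_exports`, `app_init` and `app_tick` are hypothetical names, not anything from Dave's code.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*app_fn)(void);

static uint16_t tick_count;

static void app_init(void) { tick_count = 0; }
static void app_tick(void) { tick_count++; }

/* Hypothetical export table: one root symbol that references every
 * function the loader may call. */
const app_fn app_exports[] = { app_init, app_tick };

enum { APP_EXPORT_COUNT = sizeof(app_exports) / sizeof(app_exports[0]) };
```

Naming `app_exports` with --entry, or with -u on the link line, should then keep the table and everything it references under --gc-sections, since ld treats the entry symbol and command-line undefined symbols as roots.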

  • --entry could work in the case when a bootloader turns over complete control to the application.

    In my case though the loader code stays running to perform auxiliary functions and upload FW again if necessary.

    Another possibility would be to compile in the object files of the loader, then work on extracting the application code from the resulting binary. In this case no --entry would be needed, as the loader would have the main(). Unfortunately, in my case this would still not solve the issue of initializing RAM data without the loader explicitly doing that.
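
For reference, the explicit RAM initialization mentioned above usually amounts to copying the initial values of .data from non-volatile memory and zeroing .bss. The helper below is a generic sketch with a made-up name (`app_init_ram`). On the target, the arguments would come from symbols defined in the linker script, which are not shown in this thread, so they are left as parameters here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy initial values into the .data region and clear the .bss region.
 * On the target the pointers and lengths would come from linker-script
 * symbols; here they are plain parameters. */
void app_init_ram(void *data_dst, const void *data_src, size_t data_len,
                  void *bss_dst, size_t bss_len)
{
    memcpy(data_dst, data_src, data_len);
    memset(bss_dst, 0, bss_len);
}
```

Either the loader or the first function of the application could call something like this before any initialized globals are used.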

  • Hi Dave, Jozef,

    are we done with this? Can I close the case for now?

    Best regards

    Peter
