This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

gcc compiler bug with varargs in large mode?

I have found what looks suspiciously like a compiler bug: in some situations, 20-bit pointers are not written correctly to the stack when assembling varargs lists.

Here is the test code:

typedef unsigned int uint16_t;

extern void func(const char* format, ...);
extern uint16_t var;

void test(void)
{
	func("dummy", (void*)var);
}

If this is built with:

msp430-elf-gcc -S -Os -mlarge bug.c

...then the resulting assembly looks like this:

test:
	SUBA	#8, R1
	MOVX.A	#.LC0, @R1
	MOVX.W	&var, R12
	MOVX	R12, 4(R1)
	CALLA	#func
	ADDA	#8, R1
	RETA

You can see that var is loaded into r12 with a movx.w instruction, which is correct --- var is a uint16_t. But then it's written back to the stack using a bare movx instruction, which without a qualifier defaults to movx.w. (As confirmed by objdump.) This should be a movx.a.

As it's allocating four bytes for that stack slot, I think it likely that the size suffix has been accidentally dropped off the end of the instruction in the gcc instruction template for 'extend 16-bit word to 20-bit word and store into memory'. If I rearrange the code to force the extension to happen in a separate instruction, it works fine.

The gcc version is:

msp430-elf-gcc (GCC) 4.9.1 20140707 (prerelease (msp430-14r1-167)) (GNUPro 14r1) (Based on: GCC 4.8 GDB 7.7 Binutils 2.24 Newlib 2.1)

Can anyone confirm this? And does anyone know of a workaround or the availability of a fixed compiler?

  • Hi David,

    I'm moving this to the (DO NOT USE) TI C/C++ Compiler - Forum since it sounds like a potential compiler bug.

    Regards,
    Katie
  • Hi David,

    This is not really a bug. Converting an integer to a pointer via a cast is not a safe thing to do in C, and gcc will even give you a warning message as a hint:

    warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
    func("dummy", (void*)var);

    The correct way to perform the conversion is to use a union, like this:

    typedef unsigned int uint16_t;
    extern void func(const char* format, ...);
    extern uint16_t var;
    typedef union
    {
    void * ptr;
    uint16_t val;
    } converter;

    void test(void)
    {
    converter c;
    c.val = var;
    func("dummy", c.ptr);
    }

    This will result in gcc generating correct (although rather excessive) code:

    SUBA #8, R1
    MOV.W #0, R13
    MOVX.W &var, R12
    PUSH.W R13 { PUSH.W R12 { POPM.A #1, R12
    MOVA R12, 4(R1)
    MOVX.A #.LC0, @R1
    CALLA #func
    ADDA #8, R1
    RETA

    I hope that this helps.

    Cheers
    Nick
  • Okay, so yes, so the standard doesn't allow this, and I had totally forgotten that.

    However, luckily for my reputation, this also doesn't work, and I believe this is supported by the standard:

    #include <stddef.h>
    typedef unsigned int uint16_t;
    
    extern void func(const char* format, ...);
    extern uint16_t var;
    
    void test(void)
    {
    	func("dummy", (void*)(size_t)var);
    }
    

    A size_t is defined as an unsigned integer that's the same size of a pointer, so on the MSP430 in large mode it's a uint20_t. So this is casting a uint16_t to a uint20_t, which should zero extend, and is then converting that to a pointer, which is valid. (I hope.) It produces the same assembly as my earlier example.

  • size_t is not necessarily at least as wide as a pointer, and in fact isn't for some TI devices, although for MSP430 large model it is.

    While converting an integer to a pointer or vice versa is not "safe," it is not illegal, it is "Implementation-defined." If the result is reasonable for the target, the compiler should do the right thing. I see nothing wrong with your original test case, despite the fact it invokes implementation-defined behavior. That said, I'm not familiar with what GCC's rule is for this particular implementation-defined behavior.
  • *noise of mild curses directed at self*

    Yes, you're right about size_t; but as you say, it *is* a 20-bit variable, and I've just verified that the test case still works if I replace it with intptr_t, which is what I was getting it confused with. So I think I'm still safe.

    Regarding the standards: there's a big difference between by-the-book C and Real World C. By-the-book C doesn't define 2s complement or signed integer rollover, for example, but no reasonable C programmer would expect anything else. I'd argue that casting a 16-bit unsigned value to a 20-bit pointer should always zero extend, even though the standard says otherwise; not just because it's free and that the compiler's already done it, but because on a platform like the MSP430X you're very likely to be juggling 16- and 20-bit pointers a lot and doing anything else violates the Principle of Least Surprise.

    Interesting discussion here on what C is like in practice: www.cl.cam.ac.uk/.../notes50-2015-05-24-survey-discussion.html

    Surprise surprise, they completely dismiss embedded platforms as being irrelevant...
  • Being one myself, I can assure you that "reasonable C programmer" is an oxymoron.
  • Hi David.

    OK, yes, using intptr_t should work. Although I still say using the union is better, but nevermind.

    Anyway I have tracked down the bug - you were right about it being the template to extend a 16-bit value to a 20bit value stored in memory. I have developed this patch:

    Index: gcc/config/msp430/msp430.md
    ===================================================================
    --- gcc/config/msp430/msp430.md (revision 225222)
    +++ gcc/config/msp430/msp430.md (working copy)
    @@ -572,7 +572,9 @@
    [(set (match_operand:PSI 0 "msp_nonimmediate_operand" "=r,m")
    (zero_extend:PSI (match_operand:HI 1 "msp_nonimmediate_operand" "rm,r")))]
    ""
    - "MOVX\t%1, %0"
    + "@
    + MOVX\t%1, %0
    + MOVX.A\t%1, %0"
    )

    (define_insn "truncpsihi2"

    which I am currently testing locally. Assuming that I find no regressions I will check it in to the FSF GCC sources later on today, and also send it to TI so that they can include it in their next release.

    Cheers
    Nick
  • Awesome; thanks!

    I have done some gcc (and LLVM) hacking in the past. It's horrible. Thank you for doing this.