gcc compiler bug with varargs in large mode?

David Given

I have found what looks suspiciously like a compiler bug: in some situations, 20-bit pointers are not written correctly to the stack when assembling varargs lists.

Here is the test code:

typedef unsigned int uint16_t;

extern void func(const char* format, ...);
extern uint16_t var;

void test(void)
{
	func("dummy", (void*)var);
}

If this is built with:

msp430-elf-gcc -S -Os -mlarge bug.c

...then the resulting assembly looks like this:

test:
	SUBA	#8, R1
	MOVX.A	#.LC0, @R1
	MOVX.W	&var, R12
	MOVX	R12, 4(R1)
	CALLA	#func
	ADDA	#8, R1
	RETA

You can see that var is loaded into r12 with a movx.w instruction, which is correct --- var is a uint16_t. But then it's written back to the stack using a bare movx instruction, which without a qualifier defaults to movx.w. (As confirmed by objdump.) This should be a movx.a.

As it's allocating four bytes for that stack slot, I think it likely that the size suffix has been accidentally dropped off the end of the instruction in the gcc instruction template for 'extend 16-bit word to 20-bit word and store into memory'. If I rearrange the code to force the extension to happen in a separate instruction, it works fine.

The gcc version is:

msp430-elf-gcc (GCC) 4.9.1 20140707 (prerelease (msp430-14r1-167)) (GNUPro 14r1) (Based on: GCC 4.8 GDB 7.7 Binutils 2.24 Newlib 2.1)

Can anyone confirm this? And does anyone know of a workaround or the availability of a fixed compiler?

over 8 years ago

Katie Pier over 8 years ago

TI__Guru 68116 points

Hi David,

I'm moving this to the (DO NOT USE) TI C/C++ Compiler - Forum since it sounds like a potential compiler bug.

Regards,
Katie

Nick Clifton over 8 years ago

Intellectual 390 points

Hi David,

This is not really a bug. Converting an integer to a pointer via a cast is not a safe thing to do in C, and gcc will even give you a warning message as a hint:

warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
func("dummy", (void*)var);

The correct way to perform the conversion is to use a union, like this:

typedef unsigned int uint16_t;
extern void func(const char* format, ...);
extern uint16_t var;
typedef union
{
void * ptr;
uint16_t val;
} converter;

void test(void)
{
converter c;
c.val = var;
func("dummy", c.ptr);
}

This will result in gcc generating correct (although rather excessive) code:

SUBA #8, R1
MOV.W #0, R13
MOVX.W &var, R12
PUSH.W R13 { PUSH.W R12 { POPM.A #1, R12
MOVA R12, 4(R1)
MOVX.A #.LC0, @R1
CALLA #func
ADDA #8, R1
RETA

I hope that this helps.

Cheers
Nick

David Given over 8 years ago in reply to Nick Clifton

Intellectual 280 points

Okay, so yes, so the standard doesn't allow this, and I had totally forgotten that.

However, luckily for my reputation, this also doesn't work, and I believe this is supported by the standard:

#include <stddef.h>
typedef unsigned int uint16_t;

extern void func(const char* format, ...);
extern uint16_t var;

void test(void)
{
	func("dummy", (void*)(size_t)var);
}

A size_t is defined as an unsigned integer that's the same size of a pointer, so on the MSP430 in large mode it's a uint20_t. So this is casting a uint16_t to a uint20_t, which should zero extend, and is then converting that to a pointer, which is valid. (I hope.) It produces the same assembly as my earlier example.

Archaeologist over 8 years ago in reply to David Given

TI__Guru* 84225 points

size_t is not necessarily at least as wide as a pointer, and in fact isn't for some TI devices, although for MSP430 large model it is.

While converting an integer to a pointer or vice versa is not "safe," it is not illegal, it is "Implementation-defined." If the result is reasonable for the target, the compiler should do the right thing. I see nothing wrong with your original test case, despite the fact it invokes implementation-defined behavior. That said, I'm not familiar with what GCC's rule is for this particular implementation-defined behavior.

David Given over 8 years ago in reply to Archaeologist

Intellectual 280 points

*noise of mild curses directed at self*

Yes, you're right about size_t; but as you say, it *is* a 20-bit variable, and I've just verified that the test case still works if I replace it with intptr_t, which is what I was getting it confused with. So I think I'm still safe.

Regarding the standards: there's a big difference between by-the-book C and Real World C. By-the-book C doesn't define 2s complement or signed integer rollover, for example, but no reasonable C programmer would expect anything else. I'd argue that casting a 16-bit unsigned value to a 20-bit pointer should always zero extend, even though the standard says otherwise; not just because it's free and that the compiler's already done it, but because on a platform like the MSP430X you're very likely to be juggling 16- and 20-bit pointers a lot and doing anything else violates the Principle of Least Surprise.

Interesting discussion here on what C is like in practice: www.cl.cam.ac.uk/.../notes50-2015-05-24-survey-discussion.html

Surprise surprise, they completely dismiss embedded platforms as being irrelevant...

Archaeologist over 8 years ago in reply to David Given

TI__Guru* 84225 points

Being one myself, I can assure you that "reasonable C programmer" is an oxymoron.

Nick Clifton over 8 years ago in reply to David Given

Intellectual 390 points

Hi David.

OK, yes, using intptr_t should work. Although I still say using the union is better, but nevermind.

Anyway I have tracked down the bug - you were right about it being the template to extend a 16-bit value to a 20bit value stored in memory. I have developed this patch:

Index: gcc/config/msp430/msp430.md
===================================================================
--- gcc/config/msp430/msp430.md (revision 225222)
+++ gcc/config/msp430/msp430.md (working copy)
@@ -572,7 +572,9 @@
[(set (match_operand:PSI 0 "msp_nonimmediate_operand" "=r,m")
(zero_extend:PSI (match_operand:HI 1 "msp_nonimmediate_operand" "rm,r")))]
""
- "MOVX\t%1, %0"
+ "@
+ MOVX\t%1, %0
+ MOVX.A\t%1, %0"
)

(define_insn "truncpsihi2"

which I am currently testing locally. Assuming that I find no regressions I will check it in to the FSF GCC sources later on today, and also send it to TI so that they can include it in their next release.

Cheers
Nick

David Given over 8 years ago in reply to Nick Clifton

Intellectual 280 points

Awesome; thanks!

I have done some gcc (and LLVM) hacking in the past. It's horrible. Thank you for doing this.

Code Composer Studio™︎

Code Composer Studio forum

gcc compiler bug with varargs in large mode?