Tool/software: TI C/C++ Compiler
Hi all,
I'm struggling to get the C++ compiler in CGT v7.4.24 to perform any kind of (N)RVO. That is, I expect a function f() that returns a local Foo x to construct x directly in the storage the caller has reserved in its own stack frame. But no matter what I do, the generated assembly for f() always follows this pattern:
- Push stack sizeof(Foo)
- Do work on Foo x via SP+n
- Copy x via memcpy() from SP+n to A3 for sizeof(Foo) bytes
- Pop stack sizeof(Foo)
Meanwhile, the caller does:
- Push stack sizeof(Foo)
- Call f() with A3 = SP+n (address of Foo x in the caller's frame)
- ...
- Pop stack sizeof(Foo)
So Foo takes up stack space twice (once in the caller's frame and once in f()'s frame), and an otherwise-unnecessary call to memcpy() is generated.
What I would expect for f() given that A3 points to an already-allocated Foo is simply:
- Do work on Foo x via A3
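For reference, here is a minimal sketch of the kind of test case I mean. The Foo layout, the 64-element size, and the helper names g_local_addr and nrvo_applied() are made up for illustration; they are not from my real code. It records the address of x inside f() and compares it with the address of the caller's object, so the result can be checked at run time as well as in the assembly. Note that taking the address of x might itself influence what the optimizer does, so treat it only as a rough probe.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for Foo: plain data, large enough that it is returned
// through a caller-supplied address rather than in registers.
struct Foo {
    int data[64];
};

// Address of the local x inside f(), stored as an integer so the caller can
// compare it after x's lifetime may have ended.
static uintptr_t g_local_addr = 0;

Foo f() {
    Foo x;  // the only object ever returned, so it is an NRVO candidate
    for (int i = 0; i < 64; ++i) {
        x.data[i] = i;
    }
    g_local_addr = reinterpret_cast<uintptr_t>(&x);
    return x;
}

// Returns true when x in f() and the caller's object had the same address,
// i.e. the result was built in place with no copy on return.
bool nrvo_applied() {
    Foo result = f();
    assert(result.data[63] == 63);  // contents must arrive intact either way
    return g_local_addr == reinterpret_cast<uintptr_t>(&result);
}
```

If nrvo_applied() returns false, the compiler kept x in f()'s own frame and copied it out on return, which matches the push/memcpy/pop pattern listed above.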
The only way I've been able to get any reasonable assembly code is to explicitly allocate Foo x in the caller frame, modify f() to accept e.g. a Foo & result, and operate directly on that reference inside f(). But this pattern can't be applied to overloaded operators such as operator*(lhs,rhs), which take only their operands and have to return the result by value.
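To make the trade-off concrete, here is a rough sketch of the two forms side by side. The 4-element Foo, the element-wise product, and the names multiply() and check_forms_agree() are assumptions chosen only to keep the example short.

```cpp
#include <cassert>

// Hypothetical Foo with an element-wise product, for illustration only.
struct Foo {
    int data[4];
};

// Workaround form: the caller owns the storage and the callee writes straight
// through the reference, so no temporary and no copy on return are needed.
void multiply(const Foo& lhs, const Foo& rhs, Foo& result) {
    for (int i = 0; i < 4; ++i) {
        result.data[i] = lhs.data[i] * rhs.data[i];
    }
}

// Operator form: a binary operator* has no slot for an output parameter, so
// the result has to come back by value and relies on (N)RVO to avoid a copy.
Foo operator*(const Foo& lhs, const Foo& rhs) {
    Foo result;
    multiply(lhs, rhs, result);
    return result;
}

// Both forms must produce the same values; only the generated code differs.
void check_forms_agree(const Foo& a, const Foo& b) {
    Foo out;
    multiply(a, b, out);
    Foo prod = a * b;
    for (int i = 0; i < 4; ++i) {
        assert(out.data[i] == prod.data[i]);
    }
}
```

With the reference form the call site becomes multiply(a, b, out) instead of out = a * b, which is exactly the loss of operator syntax I'd like to avoid.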
Does "legacy" CGT support (N)RVO? Is there any way to get the compiler to optimize out the unnecessary stack push/pop and memcpy? Or is there a newer CGT with support for C6727?