Missed optimization for non-PODs?

Hello


I ran into a problem where the compiler emits very pessimistic code when I think it really shouldn't. I wonder who's wrong, though. Consider the following C++ code:

struct Test
{
#ifndef MAKEPOD
	Test(const Test& t) : x(t.x) {}
#endif

	int x;
};

void SomeOtherFunction(Test t);
void DontCall();

void SomeFunction(Test t)
{
	Test t2 = t;
	t.x = 0;
	SomeOtherFunction(t2);
	if(t.x)
	{
		DontCall();
	}
}

If I define MAKEPOD, the code is compiled to a single CALLRET instruction calling SomeOtherFunction. When I don't, the code saves a couple of registers and also includes a conditional call of DontCall(). That doesn't seem right to me. In my application, this would affect the use of a unique_ptr-like smart pointer, where deleter code is emitted all over the place, even if the (local) unique_ptr previously released its pointer (the situation is thus very similar to the above).
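To make the smart-pointer case concrete, here is a minimal sketch of the pattern I mean. It is not my actual code: Owner, DestroyInt, Consume, DemoReleaseThenCall and the g_deleted counter are all hypothetical names made up for illustration, and Consume is defined in the same file only so that the sketch is self-contained (in the real situation it would be an external function, like SomeOtherFunction above).

```cpp
#include <cassert>

// Hypothetical counter so that deleter invocations are observable.
static int g_deleted = 0;

// Hypothetical deleter; stands in for DontCall() in the example above.
static void DestroyInt(int* p)
{
    ++g_deleted;
    delete p;
}

// Minimal unique_ptr-like owner, written in C++03 style.
class Owner
{
public:
    explicit Owner(int* p) : p_(p) {}

    // Analogous to "if(t.x) DontCall();": the deleter only runs
    // when a pointer is still owned.
    ~Owner() { if (p_) DestroyInt(p_); }

    // Analogous to "t.x = 0;": gives up ownership.
    int* release() { int* p = p_; p_ = 0; return p; }

private:
    Owner(const Owner&);            // non-copyable, declared but not defined
    Owner& operator=(const Owner&);
    int* p_;
};

// Hypothetical sink that takes ownership; stands in for SomeOtherFunction().
void Consume(int* p) { DestroyInt(p); }

// Returns how many times the deleter ran.
int DemoReleaseThenCall()
{
    g_deleted = 0;
    {
        Owner o(new int(42));
        Consume(o.release());
        // ~Owner runs here. p_ is already null, so the deleter call in the
        // destructor is dead code on this path and could be dropped.
    }
    return g_deleted;
}
```

The behaviour is correct either way; the point is only that the destructor's conditional deleter call is still emitted after release(), even though the pointer is known to be null at that point.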


I am using CGT 7.4.7, the compiler flags were

-mv6400 --abi=eabi -O3 --rtti --cpp_default --gcc --display_error_number --interrupt_threshold=1024 --mem_model:const=far --mem_model:data=far --opt_for_speed=2 --gen_opt_info=2 --call_assumptions=0 --std_lib_func_not_defined -k --src_interlist 

Am I missing something?

Kind regards

Markus

  • Markus Moll said:
    Am I missing something?

    I don't think so.  I don't understand why defining MAKEPOD causes such a difference.  So I filed SDSCM00051583 as a performance issue, not a bug, in the SDOWP system.  Feel free to follow it with the SDOWP link below in my signature.  Perhaps there is some subtle detail about C++ that explains everything.  If so, this should expose it.  Or, perhaps it is a performance issue in the compiler, in which case we will address it.

    Thanks and regards,

    -George

  • Thank you. I saw the ticket, but it's actually the other way around: supplying a user-defined copy constructor degrades performance. I didn't do any more testing to see if the copy constructor is the culprit or if it's a POD vs. non-POD issue.

  • Sorry about that.  I corrected the ticket.

    -George

  • No reason to be sorry, the ifndef was in fact a bit confusing ;-)

    I made a few more observations in the meantime:

    • The same problem also exists in compilers 7.4.13 and 8.0.1
    • It's not only the copy constructor that causes worse performance:
      • The call to DontCall is optimized away and only "CALLRET SomeOtherFunction; NOP 5" is generated if
        • Test is simply "struct Test { int x; };"
        • I add a copy-assignment operator (operator=) to Test
      • The call to DontCall is also optimized away, but a more complicated calling sequence is generated (save the Test parameter and return address to the stack, CALLP, pop the return address from the stack, RET; the parameter is never read back from the stack!) if
        • Test becomes so large that it cannot be passed in registers only but is passed by reference
      • The call is not optimized away (in most but not all of these cases, an implicitly or explicitly defined copy constructor is present in the object file) if
        • I add a copy constructor to Test
        • I add a destructor to Test
        • I add a virtual function to Test
        • I add a virtual base class to Test
        • I add a data member or base class to Test that satisfies any of the above
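One way to summarize the list above is a small compile-time classification of the variants. This is only a sketch under assumptions: it needs a C++11 host compiler with <type_traits> (the CGT versions mentioned here may not provide it), all struct names and the CopyAndDestroyAreTrivial helper are hypothetical, and whether the TI compiler actually uses "trivial copy constructor and trivial destructor" as its criterion is a guess that merely fits the observations. It does not cover the large-struct case.

```cpp
#include <type_traits>

// Variants of Test from the list above (all names are hypothetical).
struct Plain { int x; };

struct WithAssign
{
    WithAssign& operator=(const WithAssign& o) { x = o.x; return *this; }
    int x;
};

struct WithCopyCtor
{
    WithCopyCtor(const WithCopyCtor& t) : x(t.x) {}
    int x;
};

struct WithDtor { ~WithDtor() {} int x; };

struct WithVirtual { virtual void f() {} int x; };

struct Base { int y; };
struct WithVirtualBase : virtual Base { int x; };

// A data member that itself falls into one of the cases above.
struct WithMember { WithDtor m; int x; };

// Hypothetical helper: true if both copy construction and destruction are
// trivial. Assumption: this is the property that separates the optimized
// cases from the non-optimized ones.
template <class T>
struct CopyAndDestroyAreTrivial
{
    static const bool value =
        std::is_trivially_copy_constructible<T>::value &&
        std::is_trivially_destructible<T>::value;
};
```

With this helper, Plain and WithAssign come out as true (the two cases where only CALLRET is generated), and all other variants come out as false (the cases where the call to DontCall is kept). Note that WithAssign is not a POD in the C++03 sense, which would suggest that POD vs. non-POD is not the deciding property.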

    I don't know if that helps, but maybe it does.

    Thank you for your efforts
    Markus

    [EDIT: I originally said the code would be compiled to CALLRET, ZERO A4, NOP 4. That's obviously wrong; I had slightly modified the file in between. The code is actually compiled to the simpler sequence CALLRET, NOP 5.]

  • Thanks for running all those additional experiments.  I noted them in the ticket.

    Thanks and regards,

    -George