Optimization not inlining nor resolving predictable pointer indirection

Other Parts Discussed in Thread: MSP430FG4618

Hi,

I previously posted my question in another TI forum. I'd like to move it here because I think it is better suited to this forum. The link is:

http://e2e.ti.com/support/microcontrollers/msp430/f/166/t/295893.aspx

To summarise the question: I expected the compiler to optimize my code down to a single assembly XOR, removing the function call and the pointer indirection, which I assume to be predictable. I have sample C code (it just toggles some LEDs) and the disassembly to show what I get. I'm using optimization level 4, with level 5 for speed, on CCS 5.4 with compiler v4.1.5, targeting an MSP430FG4618.

I tried using const and inline as much as I could, and tried some advanced compiler options, but the results are always the same.

I think the compiler can't optimize the calls because I use references to the functions I want inlined. Is that true? Does anyone have any insight on that?

Thanks for any help!

  • I can't say for sure, but it appears that the file-level analysis to determine that Porta is indeed unchanged in the module occurs after looking for inlinable functions, which means that while looking for inlinable functions, the optimizer does not yet know that Porta is constant, and thus does not know that it will always call SetPins.  Indeed, if you replace "Porta->" with "MSP_GPIO.GPIO8.", the optimizer is able to do what you want.
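The difference between the two spellings can be sketched roughly as follows. The thread does not show the original source, so everything here except the names Porta, MSP_GPIO, GPIO8 and SetPins is an assumption (the struct layout, the `SetPinsImpl` helper, the two wrapper functions), and a plain variable stands in for the port register so the sketch also builds on a host.

```c
#include <stdint.h>

/* Hypothetical layout: the real definitions are not shown in the thread. */
typedef struct GpioPort {
    volatile uint8_t *out;   /* would point at a port output register */
    void (*SetPins)(const struct GpioPort *port, uint8_t mask);
} GpioPort;

static volatile uint8_t fake_out;   /* stand-in for a memory-mapped register */

static void SetPinsImpl(const GpioPort *port, uint8_t mask)
{
    *port->out |= mask;
}

/* Constant table of ports, and a constant pointer to one entry. */
static const struct { GpioPort GPIO8; } MSP_GPIO = {
    { &fake_out, SetPinsImpl }
};
static const GpioPort *const Porta = &MSP_GPIO.GPIO8;

/* Indirect form: the optimizer must first prove what Porta points to
   before it can know which function is being called. */
uint8_t set_via_pointer(uint8_t mask)
{
    fake_out = 0;
    Porta->SetPins(Porta, mask);
    return fake_out;
}

/* Direct form: the call target is reachable from a named constant object. */
uint8_t set_direct(uint8_t mask)
{
    fake_out = 0;
    MSP_GPIO.GPIO8.SetPins(&MSP_GPIO.GPIO8, mask);
    return fake_out;
}
```

Both functions behave identically; whether a given compiler release inlines one, both, or neither has to be checked in the disassembly.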

  • A comment by Jens Michael-Gross summarizes this well ...

    However, the tricky thing is that access to hardware registers is excluded from optimization, because they are declared volatile. And they have to be, since reading or writing them may have side effects, and therefore the compiler must not optimize any access to them but must instead perform every access in the exact order and number written in the source code.

    The key line is ...

    *(pin->pRegs->out) ^= pin->bitmask;

    The field out is volatile.  There is no better way to defeat optimization than to have the code assign to a volatile qualified location.

    Thanks and regards,

    -George

  • Archaeologist said:
    Indeed, if you replace "Porta->" with "MSP_GPIO.GPIO8.", the optimizer is able to do what you want.

    Indeed, this change let the optimizer remove the function call, but the pointer indirection remains. So my supposition that the optimization could not take place because I was referencing the functions was wrong.

    The field out is volatile.  There is no better way to defeat optimization than to have the code assign to a volatile qualified location.

    The field "out" does point to a volatile location, but the pointers in the expression leading to it do not point directly to a volatile field. So why couldn't the compiler optimize away these indirections before the access to the volatile field?
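To make the qualifier question concrete, here is a minimal sketch of declarations in which every pointer in the chain is const and only the final target is volatile. The type names `PortRegs` and `Pin`, the objects `port_regs` and `led`, and the function `toggle_led` are hypothetical; only `pRegs`, `out`, `bitmask` and the key line come from the thread. A plain variable stands in for the register.

```c
#include <stdint.h>

typedef struct {
    volatile uint8_t *const out;   /* const pointer, volatile pointee */
} PortRegs;

typedef struct {
    const PortRegs *const pRegs;   /* const pointer to a const struct */
    const uint8_t bitmask;
} Pin;

static volatile uint8_t fake_out;  /* stand-in for a memory-mapped register */
static const PortRegs port_regs = { &fake_out };
static const Pin led = { &port_regs, 0x04 };

static inline void toggle(const Pin *pin)
{
    /* Only this final read-modify-write touches a volatile object;
       pin, pin->pRegs and pin->pRegs->out are ordinary const data. */
    *(pin->pRegs->out) ^= pin->bitmask;
}

uint8_t toggle_led(void)
{
    toggle(&led);
    return fake_out;
}
```

With declarations like these, the volatile qualifier obliges the compiler to keep the access to `fake_out`, but it says nothing about the loads of `pRegs` and `out`; whether those are folded away is up to the optimizer.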

  • One restriction that is perhaps less obvious is that assignment through a pointer carries some headaches. In general, after an indirect assignment the optimizer is required to assume that All Of Memory (including the pointer itself) has been modified.  If the optimizer can prove that it knows the memory target of the assignment (who remembers "noalias"?), it's off the hook, but that may or may not be easy, and it gets combinatorially more difficult with multiple indirection levels. At some point, the optimizer can be forgiven for throwing up its hands.

    I once dealt with a function that initialized a structure field-by-field (fair enough), but did it using multiple (redundant) levels of indirection. I think it was at least 5K of code, since the optimizer had to go back through all the indirection for every statement. I changed it to pre-compute the destination structure pointer at the beginning of the function, and it shrank to maybe 100 bytes.
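The rewrite described above can be sketched like this. The types and field names (`UartConfig`, `Driver`, `Device`, `baud`, and so on) are invented for illustration and are not from the original function.

```c
#include <stdint.h>

/* Hypothetical three-level structure, for illustration only. */
typedef struct { uint16_t baud; uint8_t parity; uint8_t stop_bits; } UartConfig;
typedef struct { UartConfig *cfg; } Driver;
typedef struct { Driver *drv; } Device;

/* Redundant indirection: after each store the compiler may have to assume
   that dev->drv or dev->drv->cfg changed, and walk the chain again. */
void init_redundant(Device *dev)
{
    dev->drv->cfg->baud = 9600;
    dev->drv->cfg->parity = 0;
    dev->drv->cfg->stop_bits = 1;
}

/* Hoisted: the destination pointer is computed once, so each statement
   is a single store relative to a local pointer. */
void init_hoisted(Device *dev)
{
    UartConfig *const cfg = dev->drv->cfg;

    cfg->baud = 9600;
    cfg->parity = 0;
    cfg->stop_bits = 1;
}
```

The two functions produce the same result as long as the stores do not actually alias the intermediate pointers; the hoisted version simply states that fact in the source instead of leaving the optimizer to prove it.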

  • Bruce McKenney47378 said:
    If the optimizer can prove that it knows the memory target of the assignment (who remembers "noalias"?), it's off the hook, but that may or may not be easy, and it gets combinatorially more difficult with multiple indirection levels. At some point, the optimizer can be forgiven for throwing up its hands.

    Aren't the "const" fields enough to prove the memory targets? Are 2 or 3 levels of indirection too many for the compiler to handle?

    I'm close to giving up, dissatisfied. If the compiler can't optimize my code, my design becomes unviable. I don't know if I can forgive the compiler for that. At least it is better than mspgcc, which gave me worse assembly.

  • It's not the number of levels of indirection, and it's not the lack of "const" somewhere.  It's that the optimizer did not expect the indirect call site would only ever call one function, and it does not discover this fact early enough to be able to inline it.  It's certainly a feasible optimization, but even if we were to implement it immediately, it wouldn't show up in a compiler release for several months, at the earliest.