MSP430 C compiler 4.4.1 doesn't optimize indexed store/load after variable optimized to register

Hello all:

With optimization on, the C compiler can decide to keep what was a variable in a register. If that variable is then used as the index of an indexed store/load, though, an unnecessary register copy seems to be left behind. My guess is that the copy is a leftover from when the variable was 'real' and had to be copied into a register for the indexed load/store.

This occurs with all levels of optimization. 

Practically, what this leads to is that code like this:

#include <stdint.h>
#include <msp430.h>   /* P1IN */

uint8_t buf[16];
struct my_pointers {
   uint8_t rd_ptr;
   uint8_t wr_ptr;
} pointers;
struct my_pointers *p = &pointers;

int main(void) {
  uint8_t tmp;
  tmp = p->wr_ptr;
  buf[tmp++] = P1IN;
  tmp = tmp % 16;
  p->wr_ptr = tmp;
}

is less efficient than this:

#include <stdint.h>
#include <msp430.h>   /* P1IN */

uint8_t buf[16];
struct my_pointers {
   uint8_t rd_ptr;
   uint8_t wr_ptr;
} pointers;
struct my_pointers *p = &pointers;
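/* These file-scope initializers read p, so they are not constant
   expressions in C; this compiles as C++ (the test case is main.cpp). */
uint8_t *const rd_ptr = &(p->rd_ptr);
uint8_t *const wr_ptr = &(p->wr_ptr);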

int main(void) {
  uint8_t tmp;
  tmp = *wr_ptr;
  buf[tmp++] = P1IN;
  tmp = tmp % 16;
  *wr_ptr = tmp;
}

(The 'tmp' there doesn't matter: removing it from this sample code and working directly with p->wr_ptr/*wr_ptr gives the same result.)

In the first case the first read of p->wr_ptr to a temporary variable is optimized to a register, but then the indexed store afterwards still includes a register copy before it:



        MOV.B     &pointers+1,r15       ; [] |23| 
        MOV.B     r15,r14               ; [] |23| 
        MOV.B     &PAIN_L+0,buf+0(r14)  ; [] |23| 

In the second case, wr_ptr is already a pointer to the member, so no separate read through p->wr_ptr is needed, and the compiler generates:

        MOV.B     &pointers+1,r15       ; [] |37| 
        MOV.B     &PAIN_L+0,buf+0(r15)  ; [] |37| 

In the assembly file, you can see that in the optimized case, there's been an intermediary variable optimized away - it says "r15 assigned to $O$C1".

Obviously with lots of structure pointers like this, this cost can get pretty high if there are lots of indexed loads/stores. 
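
For anyone who wants to compare the two access styles side by side, here is a minimal sketch that puts each one in its own function. It is not the attached test case: read_port() and fake_port_value are hypothetical stand-ins for P1IN so the code also builds on a host PC, and BUF_SIZE is just a name for the 16 used above. Whether the compiler emits the shorter sequence for the function-local member pointer is an assumption that has to be checked in the generated listing.

```c
#include <stdint.h>

#define BUF_SIZE 16u

uint8_t buf[BUF_SIZE];

struct my_pointers {
    uint8_t rd_ptr;
    uint8_t wr_ptr;
};

struct my_pointers pointers;
struct my_pointers *p = &pointers;

/* Hypothetical stand-in for P1IN so the sketch runs without the hardware. */
uint8_t fake_port_value = 0xA5;

uint8_t read_port(void)
{
    return fake_port_value;
}

/* Style 1: the index is read through the structure pointer. */
void push_via_struct(void)
{
    uint8_t tmp = p->wr_ptr;
    buf[tmp++] = read_port();
    p->wr_ptr = (uint8_t)(tmp % BUF_SIZE);   /* wrap at the buffer size */
}

/* Style 2: the index is read through a pointer to the member itself. */
void push_via_member_ptr(void)
{
    uint8_t *const wr_ptr = &p->wr_ptr;
    uint8_t tmp = *wr_ptr;
    buf[tmp++] = read_port();
    *wr_ptr = (uint8_t)(tmp % BUF_SIZE);     /* wrap at the buffer size */
}
```

Both functions store the same byte and advance the write index the same way, so any difference in the listing comes down to code generation rather than behavior.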

I'm attaching a "test case", which includes a 'bad style' indexed store and a 'good style' indexed load (and the reverse, commented out), so you can see.

Attachment: 8877.main.cpp

  • Thank you for notifying us of this problem and for submitting a well-defined test case.  I can reproduce your results.  I filed SDSCM00052087 in the SDOWP system to have this investigated.  It is filed as a performance issue rather than a defect report: the code generated by the compiler is correct, but not optimal.  Feel free to follow it with the SDOWP link below in my signature.

    Thanks and regards,

    -George