C28 code optimization problem

Hello,

I am having trouble with the C28 Piccolo device using your UART driverlib V136 when the optimization level is set higher than 1. Execution stalls at the while loop in UARTCharPut() (the "Wait until space is available" loop in the code snippet below). I have also included the disassembly of this function with -O1 and -O2. The "loop optimizations" invoked by -O2 did a bit too much. Compiler bug?

I have tried it with compiler version C2000 6.1.4.

I will test it with 6.1.5 as well, but I must restart Windows before doing this :-)

Best regards,
Stefan

File F2806X_MWare/driverlib/uart.c

void
UARTCharPut(unsigned long ulBase, unsigned char ucData)
{
    //
    // Check the arguments.
    //
    ASSERT(UARTBaseValid(ulBase));

    //
    // Wait until space is available.
    //
    while(!(HWREGB(ulBase + UART_O_CTL2) & UART_CTL2_TXRDY))
    {
    }

    //
    // Send the char.
    //
    HWREGB(ulBase + UART_O_TXBUF) = ucData;
}

Disassembly with optimization set to -O1

 719    {
        UARTCharPut:
3e8322:   1EA6        MOVL         @XAR6, ACC
 728        while(!(HWREGB(ulBase + UART_O_CTL2) & UART_CTL2_TXRDY))
        C$DW$L$_UARTCharPut$2$B, C$L5:
3e8323:   06A6        MOVL         ACC, @XAR6
3e8324:   0904        ADDB         ACC, #4
3e8325:   83A9        MOVL         XAR5, @ACC
3e8326:   C6C5        MOVB         AL.LSB, *+XAR5[0]
3e8327:   47A9        TBIT         @AL, #0x7
3e8328:   EFFB        SBF          C$L5, NTC
 735        HWREGB(ulBase + UART_O_TXBUF) = ucData;
        C$DW$L$_UARTCharPut$2$E:
3e8329:   06A6        MOVL         ACC, @XAR6
3e832a:   0909        ADDB         ACC, #9
3e832b:   83A9        MOVL         XAR5, @ACC
3e832c:   92A4        MOV          AL, @AR4
3e832d:   3CC5        MOVB         *+XAR5[0], AL.LSB
 736    }
3e832e:   0006        LRETR        
 764        return((HWREGB(ulBase + UART_O_CTL2) & UART_CTL2_TXEMPTY) ? false : true);
        UARTBusy:
3e832f:   0904        ADDB         ACC, #4
3e8330:   8AA9        MOVL         XAR4, @ACC
3e8331:   C6C4        MOVB         AL.LSB, *+XAR4[0]
3e8332:   FFA5        ASR          AL, 6
3e8333:   FF5E        NOT          AL
3e8334:   9001        ANDB         AL, #0x1
 765    }
3e8335:   0006        LRETR

Disassembly with optimization set to -O2

 719    {
        UARTCharPut:
3e85b6:   1EA6        MOVL         @XAR6, ACC
3e85b7:   0904        ADDB         ACC, #4
3e85b8:   83A9        MOVL         XAR5, @ACC
3e85b9:   C6C5        MOVB         AL.LSB, *+XAR5[0]
 728        while(!(HWREGB(ulBase + UART_O_CTL2) & UART_CTL2_TXRDY))
        C$DW$L$_UARTCharPut$2$B, C$L31:
3e85ba:   47A9        TBIT         @AL, #0x7
3e85bb:   EFFF        SBF          C$L31, NTC
 735        HWREGB(ulBase + UART_O_TXBUF) = ucData;
        C$DW$L$_UARTCharPut$2$E:
3e85bc:   06A6        MOVL         ACC, @XAR6
3e85bd:   0909        ADDB         ACC, #9
3e85be:   83A9        MOVL         XAR5, @ACC
3e85bf:   92A4        MOV          AL, @AR4
3e85c0:   3CC5        MOVB         *+XAR5[0], AL.LSB
 736    }
3e85c1:   0006        LRETR        
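
Read side by side, the two listings differ mainly in where the MOVB load sits: at -O1 it is inside the loop (after the C$L5 label), while at -O2 it comes before the loop label C$L31, so SBF branches back to a TBIT on a value that is never refreshed. A rough C rendering of the two shapes, run against a simulated register, might look like the sketch below. `FakeUart`, `fake_read_ctl2`, `wait_rereading`, `wait_hoisted` and the spin limit are hypothetical names invented for illustration; only the 0x80 mask (bit 7, from `TBIT @AL, #0x7`) is taken from the listings.

```c
#include <stdbool.h>
#include <stdint.h>

#define TXRDY_MASK 0x80u  /* bit 7, as tested by TBIT @AL, #0x7 */

/* Hypothetical stand-in for the hardware: the ready flag becomes set
 * on the poll that counts polls_until_ready down to zero. */
typedef struct {
    uint16_t ctl2;
    unsigned polls_until_ready;
} FakeUart;

static uint16_t fake_read_ctl2(FakeUart *u)
{
    if (u->polls_until_ready > 0 && --u->polls_until_ready == 0) {
        u->ctl2 |= TXRDY_MASK;
    }
    return u->ctl2;
}

/* Shape of the -O1 code: the register is re-read on every iteration. */
static bool wait_rereading(FakeUart *u, unsigned max_spins)
{
    while (max_spins--) {
        if (fake_read_ctl2(u) & TXRDY_MASK) {
            return true;
        }
    }
    return false;
}

/* Shape of the -O2 code: one read before the loop, then the loop
 * keeps testing the same stale copy. */
static bool wait_hoisted(FakeUart *u, unsigned max_spins)
{
    uint16_t snapshot = fake_read_ctl2(u);
    while (max_spins--) {
        if (snapshot & TXRDY_MASK) {
            return true;
        }
    }
    return false;
}
```

With a flag that only becomes set after a few polls, `wait_rereading` returns true while `wait_hoisted` exhausts its spin limit. On the device there is no spin limit, which would correspond to the stall described above.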

  • All right, as expected, the TI compiler update to 6.1.5 didn't make any difference.

    But I found the reason for this behavior in SPRU514, section 3.5.2 "Use the Volatile Keyword for Necessary Memory Accesses". The optimization removed the repeated memory access because the volatile keyword is missing from the macro HWREGB defined in hw_types.h:

    #define HWREGB(x) \
            __byte((int *)(x),0)

    How can I solve this problem in this special case? __byte((volatile int *)(x),0) does not work.
    I found a workaround using a local variable. I have to make changes within the TI middleware. Not good!

    In general, how can I check that I don't violate the rules described in section 3.5 of SPRU514? Do I have to review and test my code manually?

    Regards,
    Stefan

  • The intrinsic __byte seems to be defined incorrectly, causing this bug.  I've submitted SDSCM00049051 to track this issue.

    You can work around the problem for __byte used as a read by using a shift by 8 instead.

    #define HWREGB(x) ((volatile int *)(x)>>8)

  • Archeologist,

    thanks for the reply. My first attempt to replace the macro definition failed because the macro is used for both reading and writing registers. I modified your macro (since it produced a compiler error) and added the macro

    #define HWREGB_READ (*((volatile unsigned int *)(x)))

    and used it in places where register reads are done. It didn't work that way; the macro seems not to be correct.
    I used the HWREG macro instead and it works now. I don't understand the purpose of the HWREGB macro.
    Am I missing something?

    Regards,
    Stefan

  • The intrinsic __byte is special for two reasons:

    1. It can be used on the left or right of an '=' operator
    2. It accesses subwords of a 16-bit char

    The macro HWREGB is taking advantage of the first feature by blurring the distinction between a read and write of that register, to mimic the way you can use a plain variable.  Because in C2000 a byte (the smallest addressable unit) is 16 bits, you can't construct (in standard C) an 8-bit variable that works like a plain variable; you have to use __byte or manipulate the bits yourself.  My macro (as noted) would work only for reads.  Writes will be more complicated, and in some cases (usually involving volatile) it may not be possible to write a perfectly equivalent operation without __byte.

    When writing, the macro wants to only modify 8 bits, leaving the adjacent 8 bits untouched, which __byte was designed for.  Otherwise, you have to read the current value as a 16-bit char/int, clear the target 8 bits, OR in the new bits, and write the whole 16-bit value back to memory.
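
The manual sequence described above (read the 16-bit word, clear the target 8 bits, OR in the new bits, write the whole word back) could be sketched as follows. `write_low_byte` and `write_high_byte` are hypothetical helper names, not part of driverlib, and the sketch assumes a `uint16_t` type is available. The value is passed as a 16-bit quantity and masked, since no 8-bit type exists on this architecture. Note that this is a full-word read followed by a full-word write, which, as said above, may not be a perfectly equivalent operation for every volatile register.

```c
#include <stdint.h>

/* Hypothetical helper: replace the low 8 bits of a 16-bit word and
 * leave the high 8 bits untouched. */
static void write_low_byte(volatile uint16_t *word, uint16_t value)
{
    uint16_t current = *word;                 /* read the whole word      */
    current &= (uint16_t)0xFF00u;             /* clear the target 8 bits  */
    current |= (uint16_t)(value & 0x00FFu);   /* OR in the new bits       */
    *word = current;                          /* write the word back      */
}

/* Hypothetical helper: same idea for the high 8 bits. */
static void write_high_byte(volatile uint16_t *word, uint16_t value)
{
    uint16_t current = *word;
    current &= (uint16_t)0x00FFu;
    current |= (uint16_t)((value & 0x00FFu) << 8);
    *word = current;
}
```

For example, starting from 0xABCD, `write_low_byte(&w, 0x12)` leaves 0xAB12 in `w`.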

    I hope that's clear.

  • Thanks for the explanations. However, I already knew the purpose of __byte, since I had to deal with it quite often when porting my UART application from "normal" microcontroller code, which can do byte accesses.

    To clarify my last post:

    Your workaround suggestion did not work due to a compiler error. I modified it and used it in cases where a read with __byte was intended. It did not work as intended in uart.c. Why is a right shift necessary? The bit compares are done with masks larger than 8 bits...

    I replaced the HWREGB macro with the HWREG macro in the cases where a read was done in a while() loop, and it works as intended.

    Regards, Stefan

  • You say you get "a compiler error" but you don't say which one.

    I was mistaken about what __byte(x, 0) means.  Sorry about that; I'm not a C2000 expert.

    The equivalent of __byte(x, 0) should be ((x) & 0xff) to get the LSB of the 16-bit addressable unit.

    __byte is only supposed to return an 8-bit value, so masks larger than 8 bits do not make sense. The disassembly you posted suggests that UART_CTL2_TXRDY is equal to 0x80, which fits in 8 bits.

  • You are right. Sorry, I mixed up the byte size. Of course 0x80 fits in 8 bits!

    And I checked that the relevant registers of the SCI unit have a size of 8 bits (which is strange, since the controller can only access 16 bits).

    Thanks,
    Stefan

    P.S.: Your original workaround macro resulted in compiler error #31: "expression must have integral type".
    Now I use the following macro for the reads:

    #define HWREGB_READ(x)                                                        \
            (*((volatile unsigned short *)(x)) & 0xFF)
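
A minimal sketch of how this read macro might be used in a polling check, exercised here on a host PC against an ordinary variable standing in for the register. `tx_ready` is a hypothetical helper; the value 0x80 for UART_CTL2_TXRDY is the one inferred from the disassembly earlier in the thread, and `uintptr_t` is used instead of `unsigned long` only so that the address cast is safe on a host.

```c
#include <stdint.h>

/* Read macro as posted above: a volatile 16-bit read, then a mask
 * that keeps only the low 8 bits. */
#define HWREGB_READ(x)                                                        \
        (*((volatile unsigned short *)(x)) & 0xFF)

/* Assumed value, inferred from TBIT @AL, #0x7 in the disassembly. */
#define UART_CTL2_TXRDY 0x80

/* Hypothetical helper: nonzero when the TXRDY bit is set in the
 * register located at ctl2_addr. */
static int tx_ready(uintptr_t ctl2_addr)
{
    return (HWREGB_READ(ctl2_addr) & UART_CTL2_TXRDY) != 0;
}
```

A wait loop would then read `while (!tx_ready(addr)) { }`. Because the access goes through a volatile-qualified pointer, the read is expected to stay inside the loop at higher optimization levels.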

  • We looked at this issue. We think the workaround defined above is a good way to deal with this type of issue.

    There are two sides to an intrinsic function:

    1) At the C level, it follows the ANSI C standard, so it is treated as a function call. Every argument passed to it is converted to the corresponding parameter type. In your case, the volatile qualifier is lost during this conversion. That is why, when optimization is applied, the memory access is hoisted into a register access.

    2) Adding a prototype for the __byte() function is also not a good solution, because the argument type (with volatile) does not guarantee that the memory access is a volatile access. We would have to add specific treatment to the new intrinsic function to make sure it is marked as volatile. Our tools have many intrinsic functions that access memory; adding a volatile version of each would be a pain for us to maintain and for customers to remember. So we decided not to take that approach.

    I think the rule of thumb here is to use a basic (scalar) type to access volatile memory and then use masks to extract what you want. A scalar volatile access is always guaranteed by the compiler. On C2000 it may take a few extra cycles, but performance is reduced anyway whenever volatile accesses are involved, so we believe this is acceptable.

    If you have any other ideas, please let us know.

    Thanks for reporting this issue to us,

    Wei