Hi,
I have an optimizer problem with a piece of code that transfers multi-byte variables to a byte array, for later transmission over a byte based device such as a uart.
The C function code is the following:
void Var2buff(void *var,int *buf,int le)
{
switch(le) {
case 1:
buf[0] = __byte((int *)var,0);
break;
case 2: {
int tmp = *((int*)var);
buf[0] = __byte((int *)&tmp,0);
buf[1] = __byte((int *)&tmp,1);
}
break;
default: {
int64 tmp = *((int64*)var);
int i;
for (i = 0; i < le; i++)
buf[i] = __byte((int *)&tmp,i);
}
}
return;
}
The first 2 cases accomodate single and 2-byte variables, the for loop in the default case should handle any other case up to 8 bytes since the longest data type is 64bit.
The unoptimized assembly (annotated partly in dutch) is shown below. Note that this operates correct
Stack is 16 bit wide
32 bit variabelen occupy 2 positions, low word at low address, high word at high address
SP wijst naar volgende LEGE locatie
function entry:
&var (source data) in XAR4 (32 bit address)
&buf (uart buffer) in XAR5 (32 bit address)
le (length) in AL (16 bit value)
Stack
1,2: &var (=XAR4)
3,4: &buf (=XAR5)
5: length (=AL)
6: not used due to alignment requirements???
7,8: copy lower32 bits of tmp (default case)
9,10: copy upper 32 bits of tmp (default case)
11: for loop counter i (default case)
12: not used due to alignment requirements???
Var2buff:
// Put values on stack
33add6: FE0C ADDB SP, #12 // Reserve 12 locations
33add7: 9645 MOV *-SP[5], AL // 16 bit move, push AL (Accumulator Low) is int le (length of variable)
33add8: A044 MOVL *-SP[4], XAR5 // 32 bit move, push XAR5 is base address van buf, *+XAR5[0] = buf[0], *+XAR5[1] = buf[1] etc.
33add9: A842 MOVL *-SP[2], XAR4 // 32 bit move, push XAR4 is address van var
178 switch(le) {
33adda: 6F2C SB C$L10, UNC // Unconditional short branch naar C$L10 (evaluatie van switch conditie staat op C$L10)
-------------------------------------------------------------------------------------------------------------------------
180 buf[0] = __byte((int *)var,0);
C$L6: // One byte
33addb: 8A42 MOVL XAR4, *-SP[2] // Lees address van var terug in XAR4
33addc: 8344 MOVL XAR5, *-SP[4] // Lees base address van buf terug in XAR5
33addd: C6C4 MOVB AL.LSB, *+XAR4[0] // laad lsb van AL met waarde uit var[0]. NB: wordt MSB gewist? Ja, pag 347 van spru430
33adde: 96C5 MOV *+XAR5[0], AL // Kopieer Accumulator Low naar buf[0]
181 break;
33addf: 6F2C SB C$L11, UNC // Unconditional short branch naar C$L11
-------------------------------------------------------------------------------------------------------------------------
184 int tmp = *((int*)var);
C$L7: // Two bytes
33ade0: C542 MOVL XAR7, *-SP[2] // Lees address van var in XAR7
33ade1: 92C7 MOV AL, *+XAR7[0] // Lees waarde in address var naar 16 bit Accumulator Low
33ade2: 9646 MOV *-SP[6], AL // Kopieer Accumulator Low naar stack => tmp staat op SP-6
185 buf[0] = __byte((int *)&tmp,0);
33ade3: 8344 MOVL XAR5, *-SP[4] // Lees base address van buf terug in XAR5
33ade4: 5CAD MOVZ AR4, @SP // mov 16 bit waarde van stack pointer?????, clear AR4-high
33ade5: DC86 SUBB XAR4, #6 // trek 6 af, XAR4 wijst nu naar SP-6, dus naar inhoud van tmp
33ade6: C6C4 MOVB AL.LSB, *+XAR4[0] // Even argument: LSbyte (8 bit) wordt gekopieerd naar Accumulator Low, MSB gewist (pag 347 van spru430)
33ade7: 96C5 MOV *+XAR5[0], AL // Kopieer Accumulator Low naar buf[0]
186 buf[1] = __byte((int *)&tmp,1);
33ade8: 8344 MOVL XAR5, *-SP[4] // Lees base address van buf terug in XAR5
33ade9: 5CAD MOVZ AR4, @SP // mov 16 bit waarde van stack pointer?????, clear AR4-high
33adea: DC86 SUBB XAR4, #6 // trek 6 af, XAR4 wijst nu naar SP-6, dus naar inhoud van tmp
33adeb: C6CC MOVB AL.LSB, *+XAR4[1] // Oneven argument: MSbyte (8 bit) wordt gekopieerd naar Accumulator Low, MSB gewist (pag 347 van spru430)
33adec: 96CD MOV *+XAR5[1], AL // Kopieer Accumulator Low naar buf[1]
188 break;
33aded: 6F1E SB C$L11, UNC // Unconditional short branch naar C$L11
-------------------------------------------------------------------------------------------------------------------------
190 int64 tmp = *((int64*)var);
C$L8: // Four bytes
// Fill stack positions SP-7 (=upper), SP-8 (=lower), SP-9 (=upper) and SP-10 (=lower)
33adee: 8A42 MOVL XAR4, *-SP[2] // Copy address of 64-bit var back to XAR4
33adef: C4D4 MOVL XAR6, *+XAR4[2] // Copy upper 32- of the 64 bits (16-bit indices 2 and 3) in address var to 32 bit XAR6
33adf0: 06C4 MOVL ACC, *+XAR4[0] // Copy lower 32- of the 64 bits (16-bit indices 0 en 1) in address var to 32 bit Accumulator
33adf1: 1E4A MOVL *-SP[10], ACC // Store lower 32 bits of tmp in stack: fill stack positions SP-7 (=upper), SP-8 (=lower)
33adf2: C248 MOVL *-SP[8], XAR6 // Store upper 32 bits of tmp in stack: fill stack positions SP-9 (=upper) and SP-10 (=lower)
192 for (i = 0; i < le; i++)
33adf3: 2B4B MOV *-SP[11], #0 // -SP[11] is for loop counter (=i)
33adf4: 9245 MOV AL, *-SP[5]
33adf5: 544B CMP AL, *-SP[11] // Compare 'le' with 'i'
33adf6: 6515 SB C$L11, LEQ // Exit if i is less or equal to le
193 buf[i] = __byte((int *)&tmp,i);
C$DW$L$_Var2buff$6$B, C$L9:
33adf7: 3B01 SETC SXM
33adf8: 854B MOV ACC, *-SP[11] // Copy 'i' to Accumulator
33adf9: 8344 MOVL XAR5, *-SP[4] // Copy SP-4 (=32 bit start address of destination 'buf[]')
33adfa: 584B MOVZ AR0, *-SP[11] //
33adfb: 5CAD MOVZ AR4, @SP // Copy SP (top of stack) to AR4
33adfc: DC8A SUBB XAR4, #10 // Subtract 10, now XAR4 points to SP-10 = lsword of tmp
33adfd: 560100A5 ADDL @XAR5, ACC // Add index to XAR5 to get buf[i]
33adff: C694 MOVB AL.LSB, *+XAR4[AR0] // get byte tmp[i] where i is a byte index!. Copy single byte to Accumulator Low, MSB gewist (pag 347 van spru430)
33ae00: 96C5 MOV *+XAR5[0], AL // copy byte in Accumulator to address pointed to by XAR5 (=buf[i]
192 for (i = 0; i < le; i++)
33ae01: 0A4B INC *-SP[11] // increment for loop counter (=i)
33ae02: 9245 MOV AL, *-SP[5] // get length
33ae03: 544B CMP AL, *-SP[11] // compare length with loop counter i
33ae04: 62F3 SB C$L9, GT // branch to repeat loop
194 }
C$DW$L$_Var2buff$6$E:
33ae05: 6F06 SB C$L11, UNC // exit loop, unconditional short branch naar C$L11
-------------------------------------------------------------------------------------------------------------------------
178 switch(le) {
C$L10:
33ae06: 5201 CMPB AL, #0x1
33ae07: ECD4 SBF C$L6, EQ // Short branch fast
33ae08: 5202 CMPB AL, #0x2
33ae09: ECD7 SBF C$L7, EQ
33ae0a: 6FE4 SB C$L8, UNC
199 }
C$L11:
33ae0b: FE8C SUBB SP, #12
33ae0c: 0006 LRETR
202 {
As mentioned, this assembly code is correct.
When I enable the compiler at level 2 I get the following (broken) assembly code
Optimized code
Stack is 16 bit wide
32 bit variables occupy 2 positions, low word at the lower address, high word at the higher address
SP points to next empty location of the stack
function entry:
&var in XAR4 (32 bit address)
&buf in XAR5 (32 bit address)
le (length) in AL (16 bit waarde)
Stack
1,2,3,4: 64 bit variable tmp when le == 4
5: tmp when le==2
// AL (Accumulator Low) is int le (length of variable)
// XAR5 is base adres van buf, *+XAR5[0] = buf[0], *+XAR5[1] = buf[1] etc.
// XAR7 is address of var (XAR4 unoptimized)
33ad41: FE06 ADDB SP, #6 // Reserve 6 locations
33ad42: C5A4 MOVL XAR7, @XAR4 //????????????????
178 switch(le) {
33ad43: 5201 CMPB AL, #0x1 // Compare Accumulator Low ('le') with 0x01
33ad44: EC1B SBF C$L3, EQ // Jump for case 1:
33ad45: 5202 CMPB AL, #0x2 // Compare Accumulator Low ('le') with 0x02
33ad46: EC0E SBF C$L2, EQ // Jump for case 2:
-------------------------------------------------------------------------------------------------------------------------
// Implementation of default:
192 for (i = 0; i < le; i++)
33ad47: 5200 CMPB AL, #0x0 // Compare Accumulator Low ('le') with 0x00
33ad48: 6519 SB C$L4, LEQ // Immediate exit (case 0:)
// Default (>2 bytes) implementation
33ad49: 9CFF ADDB AL, #-1 // decrement 'le'
33ad4a: 88A9 MOVZ AR6, @AL
33ad4b: D000 MOVB XAR0, #0x0 // XAR0 is for loop counter (=i)
193 buf[i] = __byte((int *)&tmp,i);
C$DW$L$_Var2buff$5$B, C$L1:
33ad4c: 5CAD MOVZ AR4, @SP // Copy SP (top of stack) to AR4
33ad4d: DC84 SUBB XAR4, #4 // Subtract 4, now XAR4 points to SP-4 = lsword of tmp
33ad4e: C694 MOVB AL.LSB, *+XAR4[AR0] // read tmp byte from stack (NB values are not copied to the stack!)
33ad4f: 9685 MOV *XAR5++, AL
192 for (i = 0; i < le; i++)
33ad50: D801 ADDB XAR0, #1 // increment for loop counter (=i)
33ad51: 000EFFFB BANZ -5,AR6--
C$DW$L$_Var2buff$5$E:
33ad53: 6F0E SB C$L4, UNC
-------------------------------------------------------------------------------------------------------------------------
184 int tmp = *((int*)var);
C$L2: // Two bytes
33ad54: 92C7 MOV AL, *+XAR7[0] // copy value at address var to 16 bit Accumulator Low
33ad55: 9645 MOV *-SP[5], AL // Copy Accumulator Low to stack => tmp is located at SP-5
185 buf[0] = __byte((int *)&tmp,0);
33ad56: 5CAD MOVZ AR4, @SP // mov 16 bit waarde van stack pointer?????, clear AR4-high
33ad57: DC85 SUBB XAR4, #5 // trek 5 af, XAR4 wijst nu naar SP-5, dus naar inhoud van tmp
33ad58: C6C4 MOVB AL.LSB, *+XAR4[0] // Even argument: LSbyte (8 bit) wordt gekopieerd naar Accumulator Low, MSB gewist (pag 347 van spru430)
33ad59: 96C5 MOV *+XAR5[0], AL // Kopieer Accumulator Low naar buf[0]
186 buf[1] = __byte((int *)&tmp,1);
33ad5a: 5CAD MOVZ AR4, @SP // mov 16 bit waarde van stack pointer?????, clear AR4-high
33ad5b: DC85 SUBB XAR4, #5 // trek 5 af, XAR4 wijst nu naar SP-5, dus naar inhoud van tmp
33ad5c: C6CC MOVB AL.LSB, *+XAR4[1] // Oneven argument: MSbyte (8 bit) wordt gekopieerd naar Accumulator Low, MSB gewist (pag 347 van spru430)
33ad5d: 96CD MOV *+XAR5[1], AL // Kopieer Accumulator Low naar buf[1]
188 break;
33ad5e: 6F03 SB C$L4, UNC
-------------------------------------------------------------------------------------------------------------------------
180 buf[0] = __byte((int *)var,0);
C$L3: // Eén byte
33ad5f: C6C7 MOVB AL.LSB, *+XAR7[0] // laad lsb van AL met waarde uit var[0]. NB: wordt MSB gewist? Ja, pag 347 van spru430
33ad60: 96C5 MOV *+XAR5[0], AL // Kopieer Accumulator Low naar buf[0]
199 }
-------------------------------------------------------------------------------------------------------------------------
C$L4: // exit
33ad61: FE86 SUBB SP, #6 // release stack space
33ad62: 0006 LRETR
202 {
Par2buff:
33ad63: FE06 ADDB SP, #6
The default case is indicated by the comment: "// Implementation of default:"
When I follow the code, data is copied from the stack pointed to be XAR4) to the buffer (pointed to by XAR5),
but the data was never copied from var to the stack!
The C source line int64 tmp = *((int64*)var);
is not translated and not present in the annotated assembly generated by the compiler.
Does anybody have an idea if this is a compiler bug or just ambiguous code and some bad luck?
Best regards,
Paul