I am studying the "TMS320C6000 Programmer’s Guide".
While working through section 5.4, which shows how to turn the unrolled C code into assembly code, I ran into a problem.
The C code is as follows:
int dotp(short a[], short b[])
{
    int sum0, sum1, sum, i;

    sum0 = 0;
    sum1 = 0;
    for (i = 0; i < 100; i += 2) {
        sum0 += a[i] * b[i];
        sum1 += a[i + 1] * b[i + 1];
    }
    sum = sum0 + sum1;
    return (sum);
}
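As a side check, here is a minimal sketch for confirming that the unrolled loop gives the same result as a plain one-element-per-pass loop. The names dotp_ref and dotp_matches_ref and the test data are hypothetical, made up for illustration; they are not from the Programmer's Guide.

```c
#define N 100  /* element count hard-coded in the original dotp */

/* Unrolled version from the post: two accumulators, two elements per pass. */
int dotp(short a[], short b[])
{
    int sum0 = 0, sum1 = 0, i;

    for (i = 0; i < N; i += 2) {
        sum0 += a[i] * b[i];
        sum1 += a[i + 1] * b[i + 1];
    }
    return sum0 + sum1;
}

/* Hypothetical reference: the same dot product, one element per pass. */
int dotp_ref(const short a[], const short b[], int n)
{
    int sum = 0, i;

    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical helper: fills two arrays with small made-up values (small
   enough that the sums cannot overflow a 32-bit int) and returns 1 when
   both versions agree, 0 otherwise. */
int dotp_matches_ref(void)
{
    short a[N], b[N];
    int i;

    for (i = 0; i < N; i++) {
        a[i] = (short)(i - 50);
        b[i] = (short)(2 * i + 1);
    }
    return dotp(a, b) == dotp_ref(a, b, N);
}
```

Calling dotp_matches_ref() from a small main and printing its return value should be enough to run the check on a host compiler. Splitting the sum into sum0 and sum1 only changes the order of the additions, so for integer data within range the two versions should agree.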
The assembly code is as follows (page 183 of the TMS320C6000 Programmer’s Guide):
{
        LDW   .D1   *A4++,A2     ; load ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ; load bi & bi+1 from memory
||      MVK   .S1   50,A1        ; set up loop counter
||      ZERO  .L1   A7           ; zero out sum0 accumulator
||      ZERO  .L2   B7           ; zero out sum1 accumulator

  [A1]  SUB   .S1   A1,1,A1      ; decrement loop counter
||      LDW   .D1   *A4++,A2     ;* load ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;* load bi & bi+1 from memory

  [A1]  SUB   .S1   A1,1,A1      ;* decrement loop counter
||[A1]  B     .S2   LOOP         ; branch to loop
||      LDW   .D1   *A4++,A2     ;** load ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;** load bi & bi+1 from memory

  [A1]  SUB   .S1   A1,1,A1      ;** decrement loop counter
||[A1]  B     .S2   LOOP         ;* branch to loop
||      LDW   .D1   *A4++,A2     ;*** load ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;*** load bi & bi+1 from memory

  [A1]  SUB   .S1   A1,1,A1      ;*** decrement loop counter
||[A1]  B     .S2   LOOP         ;** branch to loop
||      LDW   .D1   *A4++,A2     ;**** load ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;**** load bi & bi+1 from memory

        MPY   .M1X  A2,B2,A6     ; ai * bi
||      MPYH  .M2X  A2,B2,B6     ; ai+1 * bi+1
||[A1]  SUB   .S1   A1,1,A1      ;**** decrement loop counter
||[A1]  B     .S2   LOOP         ;*** branch to loop
||      LDW   .D1   *A4++,A2     ;***** ld ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;***** ld bi & bi+1 from memory

        MPY   .M1X  A2,B2,A6     ;* ai * bi
||      MPYH  .M2X  A2,B2,B6     ;* ai+1 * bi+1
||[A1]  SUB   .S1   A1,1,A1      ;***** decrement loop counter
||[A1]  B     .S2   LOOP         ;**** branch to loop
||      LDW   .D1   *A4++,A2     ;****** ld ai & ai+1 from memory
||      LDW   .D2   *B4++,B2     ;****** ld bi & bi+1 from memory

LOOP:
        ADD   .L1   A6,A7,A7     ; sum0 += (ai * bi)
||      ADD   .L2   B6,B7,B7     ; sum1 += (ai+1 * bi+1)
||      MPY   .M1X  A2,B2,A6     ;** ai * bi
||      MPYH  .M2X  A2,B2,B6     ;** ai+1 * bi+1
||[A1]  SUB   .S1   A1,1,A1      ;****** decrement loop counter
||[A1]  B     .S2   LOOP         ;***** branch to loop
||      LDW   .D1   *A4++,A2     ;******* ld ai & ai+1 fm memory
||      LDW   .D2   *B4++,B2     ;******* ld bi & bi+1 fm memory

        ; Branch occurs here

        ADD   .L1X  A7,B7,A4     ; sum = sum0 + sum1
}
I want to know why the assembly code loads into the same registers (A2 and B2) on successive cycles, before the value previously loaded into A2 (or B2) has been used in the multiply.
I look forward to your answers.
Thanks