Question on TMS28335 matrix operation

Yang Sun

Hi everyone,

I'm trying to multiply a matrix by a vector three times within 0.1 ms in my controller application. The three matrices are 6×5, 6×7, and 2×7, respectively. My current approach is the code below:

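/* Computes z = X * y, where X is an m-by-n matrix stored row-major in x. */
void Matrix_multiply(double *x, double *y, double *z, int m, int n)
{
    int i, k;
    for (i = 0; i < m; i++)
    {
        *(z + i) = 0;                                  /* clear the output element */
        for (k = 0; k < n; k++)
            *(z + i) += *(x + i * n + k) * (*(y + k)); /* row i of X times y */
    }
    return;
}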

But this code is not efficient enough to finish within 0.1 ms. I tried to use the following function from the FPU DSP library:

void mpy_SP_RVxRV_2(float32 *y, const float32 *w, const float32 *x, const Uint16 N)

However, that function requires N to be even, and my matrices have 5 or 7 columns, so it seems it cannot be used here. Is there any way to optimize the code?
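One direction that may be worth trying before hand-written assembly is sketched here, under the assumption that single precision is sufficient for the controller. The function name matvec_f32 is hypothetical, and the use of float instead of double is an assumption, not something taken from the original code. The sketch keeps the running sum in a local variable and marks the pointers restrict, which may help the compiler keep the accumulator in a register and schedule the loads more freely.

```c
#include <stddef.h>

/* Hypothetical sketch: z = X * y, with X an m-by-n matrix stored row-major.
 * Assumes x, y and z do not overlap (that is what restrict promises). */
void matvec_f32(const float *restrict x, const float *restrict y,
                float *restrict z, int m, int n)
{
    int i, k;
    for (i = 0; i < m; i++) {
        const float *row = x + (size_t)i * (size_t)n; /* start of row i */
        float acc = 0.0f;                             /* local accumulator */
        for (k = 0; k < n; k++) {
            acc += row[k] * y[k];
        }
        z[i] = acc;                                   /* one store per row */
    }
}
```

Whether this is fast enough would have to be measured on the target with optimization enabled; it is only a starting point.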

Thanks

over 9 years ago

0 Vishal_Coelho over 9 years ago

TI__Mastermind 20850 points

Hi Yang,

I modified the mpy_SP_RVxRV_2 as follows:

        .global _mpy_SP_RVxRV_odd
        .text
_mpy_SP_RVxRV_odd:
        MOVL        XAR6, *-SP[4]    ;XAR6 = &x
        LSR         AL, #1           ;divide N by 2, discard remainder of 1
        ADDB        AL, #-1          ;subtract 1 from N since RPTB is 'n-1' 
                                     ;times and last iteration done separately
                                     
        MOV32       R0H, *XAR5++     ;load first w
        MOV32       R1H, *XAR6++     ;load first x
                                     
;---Main loop                        
        RPTB        end_loop, @AL    
                                     
        MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
        || MOV32     R0H, *XAR5++    ;load next w
        MOV32        R1H, *XAR6++    ;load next x
        MOV32        *XAR4++, R2H    ;store y[i]
                                     
        MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
        || MOV32     R0H, *XAR5++    ;load next w
        MOV32        R1H, *XAR6++    ;load next x
        MOV32        *XAR4++, R2H    ;store y[i]

end_loop:

;--- One final iteration remaining
        MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
        NOP                          ;delay slot
        MOV32        *XAR4++, R2H    ;store y[i]
                                     
;Finish up                           
        LRETR                        ;return
                                     
;end of function _mpy_SP_RVxRV_odd()
;*********************************************************************

       .end

Here I divide N by 2 and do two multiplies per pass through the repeat block. Since N is odd, one multiply is left over, which is done after the repeat block. You need to add the following prototype to the header file:

void mpy_SP_RVxRV_odd(float32 *y, const float32 *w, const float32 *x, const Uint16 N);
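For checking results on a host PC, a plain C model of what the assembly listing computes can be handy. The following is a minimal sketch, not part of the TI library: the name mpy_ref_odd is hypothetical, and standard float and uint16_t stand in for float32 and Uint16. It mirrors the structure of the listing (N/2 passes with two products each, then one trailing product). From the listing it appears that N = 1 would make the repeat count wrap around, so the sketch assumes odd N of at least 3. Note that the result is an element-wise product, so a matrix-vector product would still need the products of each row summed.

```c
#include <stdint.h>

/* Hypothetical host-side reference: y[i] = w[i] * x[i] for odd n >= 3. */
void mpy_ref_odd(float *y, const float *w, const float *x, uint16_t n)
{
    uint16_t pairs = (uint16_t)(n >> 1); /* N / 2, remainder discarded */
    uint16_t i;

    for (i = 0; i < pairs; i++) {        /* models the repeat block */
        y[2 * i]     = w[2 * i]     * x[2 * i];
        y[2 * i + 1] = w[2 * i + 1] * x[2 * i + 1];
    }
    y[n - 1] = w[n - 1] * x[n - 1];      /* the one remaining product */
}
```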

0 Yang Sun over 9 years ago in reply to Vishal_Coelho

Prodigy 150 points

Thanks. I also find that the calculation takes a long time because I call the tanh() function several times. Is there a fast tanh function for the C28x?
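One common way to trade accuracy for speed in this situation is a rational approximation with clamping. The sketch below is a generic technique, not a TI library function; the name tanh_approx is hypothetical. It uses x(27 + x²)/(27 + 9x²) for |x| < 3 and returns ±1 outside that range. The absolute error is about 0.02 at worst (near |x| ≈ 1.6), so it only suits applications that tolerate that, and it still contains one division.

```c
/* Hypothetical helper: rational approximation of tanh(x).
 * Absolute error is roughly 0.02 at worst; check that this is acceptable. */
float tanh_approx(float x)
{
    float x2;

    if (x >= 3.0f) {
        return 1.0f;   /* the formula reaches exactly 1 at x = 3 */
    }
    if (x <= -3.0f) {
        return -1.0f;  /* and exactly -1 at x = -3 */
    }
    x2 = x * x;
    return x * (27.0f + x2) / (27.0f + 9.0f * x2);
}
```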
