Question on TMS28335 matrix operation

Hi everyone,

I'm trying to multiply a matrix by a vector three times within 0.1 ms in my controller application. The three matrices are 6*5, 6*7 and 2*7 respectively. My current approach is the code below:

void Matrix_multiply(double *x, double *y, double *z, int m, int n)
{
    int i, k;
    for (i = 0; i < m; i++)
    {
        *(z+i) = 0;                          /* clear output element i */
        for (k = 0; k < n; k++)
            *(z+i) += *(x+i*n+k) * (*(y+k)); /* row i of x (m-by-n, row-major) times y */
    }
    return;
}

But this code is not fast enough to finish within 0.1 ms. I tried to use the FPU DSP library function:

void mpy_SP_RVxRV_2(float32 *y, const float32 *w, const float32 *x, const Uint16 N) 

However, it requires N to be even, so it seems it cannot be used here. Is there any way to optimize the code?
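
One C-level change that may be worth trying before moving to assembly, offered as a sketch rather than a measured fix: accumulate each dot product in a local variable and store it once, so the inner loop does not have to read and write *(z+i) through memory on every pass. The name matvec_f32 is hypothetical, and the use of float assumes single precision is acceptable for the controller. Whether this alone meets the 0.1 ms budget would have to be measured on the target.

```c
#include <stddef.h>

/* z = X * y, where X is m-by-n stored row-major.
 * matvec_f32 is a hypothetical name; float is an assumption of this sketch. */
void matvec_f32(const float *x, const float *y, float *z, int m, int n)
{
    int i, k;
    for (i = 0; i < m; i++) {
        const float *row = x + (size_t)i * (size_t)n; /* start of row i */
        float acc = 0.0f;                             /* local accumulator */
        for (k = 0; k < n; k++)
            acc += row[k] * y[k];
        z[i] = acc;                                   /* one store per output element */
    }
}
```

With a 6*5 matrix the call would look like matvec_f32(A, v, out, 6, 5), with A holding 30 values row by row.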

Thanks

  • Hi Yang,

    I modified the mpy_SP_RVxRV_2 as follows:

            .global _mpy_SP_RVxRV_odd
            .text
    _mpy_SP_RVxRV_odd:
            MOVL        XAR6, *-SP[4]    ;XAR6 = &x
            LSR         AL, #1           ;divide N by 2, discard remainder of 1
            ADDB        AL, #-1          ;subtract 1 from N since RPTB is 'n-1' 
                                         ;times and last iteration done separately
                                         
            MOV32       R0H, *XAR5++     ;load first w
            MOV32       R1H, *XAR6++     ;load first x
                                         
    ;---Main loop                        
            RPTB        end_loop, @AL    
                                         
            MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
            || MOV32     R0H, *XAR5++    ;load next w
            MOV32        R1H, *XAR6++    ;load next x
            MOV32        *XAR4++, R2H    ;store y[i]
                                         
            MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
            || MOV32     R0H, *XAR5++    ;load next w
            MOV32        R1H, *XAR6++    ;load next x
            MOV32        *XAR4++, R2H    ;store y[i]
    
    end_loop:
    
    ;--- One final iteration remaining
            MPYF32       R2H, R0H, R1H   ;y[i] = w[i]*x[i]
            NOP                          ;delay slot
            MOV32        *XAR4++, R2H    ;store y[i]
                                         
    ;Finish up                           
            LRETR                        ;return
                                         
    ;end of function _mpy_SP_RVxRV_odd()
    ;*********************************************************************
    
           .end

    Here I divide N by 2 and do two multiplies per pass of the repeat block. Since N is odd, one more multiply is left over, which is done after the repeat block. You need to add the following prototype to the header file:

    void mpy_SP_RVxRV_odd(float32 *y, const float32 *w, const float32 *x, const Uint16 N);
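
For checking the assembly on a host or in a unit test, the same scheme can be written as a plain C reference model. This is only a sketch: mpy_rvxrv_odd_ref is a hypothetical name, and float and unsigned stand in for the float32 and Uint16 typedefs used in the prototype.

```c
/* Reference model of the odd-length element-wise product y[i] = w[i] * x[i].
 * Hypothetical helper for comparison against the assembly routine.
 * n is assumed to be odd; an even n would write one element too many. */
void mpy_rvxrv_odd_ref(float *y, const float *w, const float *x, unsigned n)
{
    unsigned pairs = n >> 1;   /* N/2, remainder of 1 discarded */
    unsigned i = 0;

    while (pairs--) {          /* two products per pass, like the repeat block */
        y[i] = w[i] * x[i];
        i++;
        y[i] = w[i] * x[i];
        i++;
    }
    y[i] = w[i] * x[i];        /* the one remaining element */
}
```

Note that the result is still an element-wise product, so a matrix row times a vector needs the N outputs summed afterwards. N = 1 is fine in this C model, but I would not assume the same of the assembly without checking, since its repeat count is computed as N/2 - 1.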

  • Thanks. I also find that the calculation time is long because I call the tanh() function several times. Is there a fast tanh function for the C28x?
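
To make the question concrete, one generic way to trade accuracy for speed is a rational approximation with clamping. The sketch below is not a TI library routine; tanh_approx is a hypothetical name, and by my estimate its worst-case absolute error is roughly 0.02 to 0.03, which may or may not be acceptable for a given controller.

```c
/* Rational (Pade-style) approximation of tanh(x); hypothetical helper.
 * The formula reaches exactly +/-1 at x = +/-3, so inputs beyond that
 * range are clamped to +/-1. */
float tanh_approx(float x)
{
    float x2;

    if (x >= 3.0f)
        return 1.0f;
    if (x <= -3.0f)
        return -1.0f;

    x2 = x * x;
    return x * (27.0f + x2) / (27.0f + 9.0f * x2);
}
```

The division is still not free on a device without a hardware divide, so a lookup table with interpolation might be the better trade if memory allows. Either option would need timing on the target.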