TMS320F28377S: C2000™ microcontrollers forum

Part Number: TMS320F28377S


Hello,

The FPU library provides a C-callable assembly function for element-wise vector multiplication (the related code snippet is given as a screenshot at the end of the thread). I'm also looking for a C-callable assembly function for the vector multiplication shown in figure 1, which gives one single element at the end (the dot product).

                                figure 1. Vector Multiplication

The vector multiplication function provided in the FPU library has the following prototype: void mpy_SP_RVxRV_2(float32 *y, const float32 *w, const float *x, const Uint16 N)

I'm looking for something like this: float32 function_name(const float32 *w, const float *x, const Uint16 N)

The return value would be the sum of all the element-wise products of the two vectors.

  • Hello,

    In this example, the final vector is to be referenced by the argument, 'y'. Does your application prevent this usage?

  • Hi,

    "y" is a vector. I would rather have a single element, the sum of the element-wise products of the input vectors, than a vector output. This is why I changed the function prototype to float32 function_name(const float32 *w, const float *x, const Uint16 N). With this prototype, the result is returned as a single float32 value.

    I have no idea how to write assembly code, which is why I am writing here to ask for your help. Also, it would be good if you provided this function in a future release of the FPU library for other users.

  • Hi Shanty,

    It seems that you deleted your last reply. By the way, thanks for your reply.

    Judging by your deleted answer, you understood me correctly. My main reason for wanting assembly code is to increase the computation speed in my application. Think about doing this multiplication for a matrix and a vector. If I use the provided function (void mpy_SP_RVxRV_2(float32 *y, const float32 *w, const float *x, const Uint16 N)), then at every step I have to sum the products before assigning the result to the related matrix element. The total size of this summation (actually a for loop) will be #rows times #cols of the matrix.

    I also want to know the exact computation time. With assembly, I can calculate the computation time from the sizes of the vector and matrix. With the C code (which I've already written), on the other hand, the computation time varies with different factors (optimization settings, etc.).
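For readers following the matrix-vector case described above, here is a minimal C sketch of the single-pass approach: each output element is accumulated directly in a local variable, so no intermediate product vector and no second summation loop are needed. The function name `matvec_sketch`, the row-major layout of `A`, and the host-side typedefs are assumptions for illustration; none of them come from the FPU library.

```c
#include <stddef.h>

/* Stand-ins so the sketch builds off-target; on C2000 the device
 * headers normally supply these types. */
typedef float          float32;
typedef unsigned short Uint16;

/* Hypothetical helper, not an FPU library function.
 * Computes y = A * x, where A is rows x cols and stored row-major. */
void matvec_sketch(float32 *y, const float32 *A, const float32 *x,
                   const Uint16 rows, const Uint16 cols)
{
    Uint16 r, c;

    for (r = 0; r < rows; r++) {
        const float32 *row = &A[(size_t)r * cols]; /* start of row r */
        float32 acc = 0.0f;                        /* running dot product */

        for (c = 0; c < cols; c++) {
            acc += row[c] * x[c];
        }
        y[r] = acc;
    }
}
```

The inner loop body still runs #rows times #cols times in total, but each product is added to the accumulator as soon as it is computed instead of being written to memory and read back.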

  • Hello,

    Apologies for deleting the reply. I had missed the fact that you wanted a solution in assembly.

    The C function I had written is as follows:

    float32 function_name(const float32 *w, const float *x, const Uint16 N)
    {
        float32 y[N];        /* scratch buffer for the element-wise products */
        float32 sum = 0.0f;
        Uint16 i;

        /* single-precision routine, matching the float32 inputs */
        mpy_SP_RVxRV_2(y, w, x, N);

        /* add up the products to get the single result */
        for (i = 0; i < N; i++)
            sum += y[i];
        return sum;
    }

    In assembly, if you store N in AL and retain the address in XAR4, it is just a matter of writing a simple loop that looks like this:

        MOV XAR5, #0
        RPTB next_code, @AL
        ADD XAR5, *XAR4++
    next_code

    XAR5 should have your sum. Let me know if there are any problems.

    -Shanty

  • Hi Shanty,

    You gave me an assembly code snippet for summing the elements of a vector. If you combine this with the mpy_SP_RVxRV_2 function, it will give me what I actually want.

    I don't know anything about argument passing or local variable usage in assembly, which is why I can't write it myself. It probably seems easy to those who are familiar with assembly.

    float32 function_name(const float32 *w, const float *x, const Uint16 N): according to this prototype, you would multiply each pair of elements of w and x and accumulate the products in one local variable until N is reached. Once the loop finishes, you would return the accumulated sum.

    One last note: in my opinion this function would be very useful for other users, so please consider adding it to the FPU library in a future release.
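As a plain-C reference for the behaviour described in this reply, the sketch below multiplies the elements of w and x pairwise and accumulates the products in one local variable. The name `dot_SP_RVxRV` is made up for illustration and is not part of the FPU library; the typedefs are host-side stand-ins for the C2000 types. It is not the requested assembly routine, only something an assembly version could be checked against.

```c
/* Stand-ins so the sketch builds off-target; on C2000 the device
 * headers normally supply these types. */
typedef float          float32;
typedef unsigned short Uint16;

/* Hypothetical name, not an FPU library function.
 * Returns w[0]*x[0] + w[1]*x[1] + ... + w[N-1]*x[N-1]. */
float32 dot_SP_RVxRV(const float32 *w, const float32 *x, const Uint16 N)
{
    float32 sum = 0.0f;  /* single local accumulator */
    Uint16 i;

    for (i = 0; i < N; i++) {
        sum += w[i] * x[i];
    }
    return sum;          /* the one-element result */
}
```

Because floating-point addition is not associative, an assembly version that unrolls the loop or uses several accumulators may differ from this reference in the last bits of the result.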

  • Hello,

    We will definitely look into it. If you would like to understand how argument passing works, write some simple C code and look at the disassembly. Make sure the optimisation settings are the same as in the end application. I hope your query has been resolved.
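One way to follow this advice is to compile a tiny probe function with the same prototype as the one being asked for, then inspect its disassembly. The sketch below is only an illustration: `probe_args` is a made-up name, and the typedefs are host-side stand-ins for the C2000 types. The body touches every argument so that the compiler cannot discard any of them, which should make it easier to see where each argument arrives and where the return value is placed.

```c
/* Stand-ins so the sketch builds off-target; on C2000 the device
 * headers normally supply these types. */
typedef float          float32;
typedef unsigned short Uint16;

/* Hypothetical probe, same prototype shape as the requested function.
 * Uses both pointers and the count so none is optimised away. */
float32 probe_args(const float32 *w, const float32 *x, const Uint16 N)
{
    return w[0] + x[0] + (float32)N;
}
```

Calling the probe from a separate source file helps keep the compiler from inlining it, so the call sequence and the function entry both stay visible in the disassembly.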

  • It seems I will need to understand the variable usage in assembly first and then write my own code. Thanks for your advice.