PROCESSOR-SDK-DRA8X: Example for Matrix-Matrix multiplication using the MMALib

Part Number: PROCESSOR-SDK-DRA8X

Hello TI support,

can you provide a simple example of a matrix-matrix multiplication using the C7x MMA? The provided user guide does not give the answers I'm looking for. I'd like to know how to call the init function correctly, which parameters need to be passed for 'handle' and 'pKerInitArgs', and how the data is correctly passed to the function.

    const size_t rows    = 3;
    const size_t cols    = 3;
    const size_t matSize = rows * cols;

    int32_t matA[matSize]   = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    int32_t matB[matSize]   = {0, 1, 2, 3, 4, 5, 6, 7, 8};
    int32_t matRes[matSize] = {0};

    // For each element C(i,j)
    for (size_t i = 0; i < rows; i++)
    {
        for (size_t j = 0; j < cols; j++)
        {
            // C(i,j) = dot(A(i,:), B(:,j))
            int32_t sum = 0;
            for (size_t k = 0; k < cols; k++)
            {
                sum += matA[i * cols + k] * matB[k * cols + j];
            }
            matRes[i * cols + j] = sum;
        }
    }

    assert(matRes[0] == 15);
    assert(matRes[1] == 18);
    assert(matRes[2] == 21);
    assert(matRes[3] == 42);
    assert(matRes[4] == 54);
    assert(matRes[5] == 66);
    assert(matRes[6] == 69);
    assert(matRes[7] == 90);
    assert(matRes[8] == 111);

    std::cout << "Matrix multiplication successful!" << std::endl;

It would be great to see this simple example realized on the MMA.

Thank you and kind regards,

Florian

  • Hi Florian, 

    I have looped in the right person to respond to this. You should get a reply soon.

    Regards,

    Anshu

  • Hi Florian,

    MMALIB has a module, linalg_c7xmma, that performs linear algebra operations. It has four kernels and multiple test cases. You can look at this module; its kernels are:

    MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX

    MMALIB_LINALG_matrixMatrixMultiplyAccumulate_ixX_ixX_ixX_oxX

    MMALIB_LINALG_matrixTranspose_ixX_oxX

    MMALIB_LINALG_pointwiseMatrixMatrixMultiply_ixX_ixX_oxX

    Regards

    Asheesh

  • Hello Asheesh.

    As I said in my first post: "The provided user guide does not provide the answers I'm looking for." The MMALib version I have, "mmalib_00_09_05_00", does not contain any examples either.

        const size_t rows = 3;
        const size_t cols = 3;
        const size_t size = rows * cols;
    
        int32_t RESTRICT matA[size] = {0};
        int32_t RESTRICT matB[size] = {0};
        int32_t RESTRICT matC[size] = {0};
    
        for (size_t i = 0; i < size; ++i) {
            matA[i] = i;
            matB[i] = i;
        }
    
        MMALIB_bufParams2D_t aBuffer;
        aBuffer.data_type = MMALIB_INT32;
        aBuffer.dim_x = rows;
        aBuffer.dim_y = cols;
        aBuffer.stride_y = 0;
    
        MMALIB_bufParams2D_t bBuffer;
        bBuffer.data_type = MMALIB_INT32;
        bBuffer.dim_x = rows;
        bBuffer.dim_y = cols;
        bBuffer.stride_y = 0;
    
        MMALIB_bufParams2D_t resultBuffer;
        resultBuffer.data_type = MMALIB_INT32;
        resultBuffer.dim_x = rows;
        resultBuffer.dim_y = cols;
        resultBuffer.stride_y = 0;
    
        MMALIB_kernelHandle kernelHandle;
        MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_InitArgs initArgs;
        initArgs.funcStyle = MMALIB_FUNCTION_NATC;
        initArgs.shift = 0;
    
        int32_t handleSize = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_getHandleSize(&initArgs);
    
        std::cout << "Handle size " << std::to_string(handleSize) << std::endl;
    
        MMALIB_STATUS statusCheck = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_init_checkParams(kernelHandle,
                                                                                                    &aBuffer,
                                                                                                    &bBuffer,
                                                                                                    &resultBuffer,
                                                                                                    &initArgs);
    
        std::cout << "Init Check status " << std::to_string(statusCheck) << std::endl;
    
        MMALIB_STATUS status = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_init(kernelHandle, &aBuffer, &bBuffer, &resultBuffer, &initArgs);
    
        std::cout << "Init status " << std::to_string(status) << std::endl;
    
        MMALIB_STATUS paramCheck = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_exec_checkParams(kernelHandle,
                                                                                                   &matA[0],
                                                                                                   &matB[0],
                                                                                                   &matC[0]);
        std::cout << "Param check status " << std::to_string(paramCheck) << std::endl;
    
        MMALIB_STATUS execStatus = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_exec(kernelHandle,
                                                                                       &matA[0],
                                                                                       &matB[0],
                                                                                       &matC[0]);
        std::cout << "Execution check status " << std::to_string(execStatus) << std::endl;
    
        std::cout << "MatA:" << std::endl;
        for (int row = 0; row < rows; ++row) {
            for (int col = 0; col < cols; ++col) {
                std::cout << matA[row * cols + col] << ", ";
            }
            std::cout << "" << std::endl;
        }
        std::cout << "" << std::endl;
    
        std::cout << "MatB:" << std::endl;
        for (int row = 0; row < rows; ++row) {
            for (int col = 0; col < cols; ++col) {
                std::cout << matB[row * cols + col] << ", ";
            }
            std::cout << "" << std::endl;
        }
        std::cout << "" << std::endl;
    
        std::cout << "MatC:" << std::endl;
        for (int row = 0; row < rows; ++row) {
            for (int col = 0; col < cols; ++col) {
                std::cout << matC[row * cols + col] << ", ";
            }
            std::cout << "" << std::endl;
        }
        std::cout << "" << std::endl;
    
        assert(matC[0] == 15);
        assert(matC[1] == 18);
        assert(matC[2] == 21);
        assert(matC[3] == 42);
        assert(matC[4] == 54);
        assert(matC[5] == 66);
        assert(matC[6] == 69);
        assert(matC[7] == 90);
        assert(matC[8] == 111);
    
        std::cout << "MatMulIntrinsics done..." << std::endl;

    That is the sample program I was able to write and execute on the DSP. But the output is the following:

    Handle size 360
    Init Check status 0
    Init status 0
    Param check status 0
    Execution check status 0
    MatA:
    0, 1, 2, 
    3, 4, 5, 
    6, 7, 8, 
    
    MatB:
    0, 1, 2, 
    3, 4, 5, 
    6, 7, 8, 
    
    MatC:
    0, 3, 6, 
    0, 0, 0, 
    0, 0, 0, 

    The expected output would be:

    MatC:
    15, 18, 21, 
    42, 54, 66, 
    69, 90, 111, 
    

    I suspect that I'm doing something wrong with the kernelHandle, but I couldn't find anything about it in the user guide.

    Regards.

    Florian

  • Hi Florian,

    You can refer to the following test files:

    MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_d.c driver file for the test setup 

    MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_idat.c for test data

    MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_idat.h for test header file

    Regards

    Asheesh

  • Hi Asheesh,

    where can I find these files? I installed the latest P-SDK RTOS 6.1.0 with MMALib 1.0, but I can't find any examples in there. TI_C7X_DSP_TRAINING_00_05 has no examples in that regard either. Please point me to a location where I can get those files. Thanks.

    Kind regards,

    Florian

  • Hi Florian,

    I have edited the example code you provided and tested it on my setup.  Here are a few notes on the changes I made.

    • dim_x is the number of columns and dim_y is the number of rows. This doesn't matter in this square-matrix example, but it will for rectangular matrices.
    • The stride parameter (stride_y) is the number of bytes from the beginning of one row to the beginning of the next row in the matrix.
    • I modified the initialization of the handle. MMALIB_kernelHandle is a pointer, not the actual data, so space for the data has to be allocated and its address stored in the handle.

    Here is the code:

       // Fragment: needs <stdint.h>, <stdio.h>, <stdlib.h> and the MMALIB headers.
       const size_t rows = 3;
       const size_t cols = 3;
       size_t size = rows * cols;
       
       int32_t matA[size];
       int32_t matB[size];
       int32_t matC[size];
       int32_t matOpt[size];
       
       size_t i;
       for (i = 0; i < size; ++i) {
          matA[i] = i;
          matB[i] = i;
       }
       
       MMALIB_bufParams2D_t aBuffer;
       aBuffer.data_type = MMALIB_INT32;
       aBuffer.dim_x = cols;
       aBuffer.dim_y = rows;
       aBuffer.stride_y = aBuffer.dim_x * MMALIB_sizeof(aBuffer.data_type);
       
       MMALIB_bufParams2D_t bBuffer;
       bBuffer.data_type = MMALIB_INT32;
       bBuffer.dim_x = cols;
       bBuffer.dim_y = rows;
       bBuffer.stride_y = bBuffer.dim_x * MMALIB_sizeof(bBuffer.data_type);
       
       MMALIB_bufParams2D_t resultBuffer;
       resultBuffer.data_type = MMALIB_INT32;
       resultBuffer.dim_x = cols;
       resultBuffer.dim_y = rows;
       resultBuffer.stride_y = resultBuffer.dim_x * MMALIB_sizeof(resultBuffer.data_type);
       
       MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_InitArgs initArgs;
       initArgs.funcStyle = MMALIB_FUNCTION_NATC;
       initArgs.shift = 0;
    
       int32_t handleSize = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_getHandleSize(&initArgs);
       MMALIB_kernelHandle kernelHandle = malloc(handleSize);
    
       // Check that the parameters will generate a valid handle
       MMALIB_STATUS initCheck = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_init_checkParams(kernelHandle,
                                                                                                   &aBuffer,
                                                                                                   &bBuffer,
                                                                                                   &resultBuffer,
                                                                                                   &initArgs);
       
       printf("Init check = %d.\n", initCheck);
    
       // Generate the handle
       MMALIB_STATUS initStatus = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_init(kernelHandle, &aBuffer, &bBuffer, &resultBuffer, &initArgs);
       printf("Init status = %d.\n", initStatus);
       
       // Check that the execute arguments are valid for execution
       MMALIB_STATUS execCheck = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_exec_checkParams(kernelHandle,
                                                                                                  &matA[0],
                                                                                                  &matB[0],
                                                                                                  &matC[0]);
       printf("Exec check = %d.\n", execCheck);
       
       // Execute the kernel
       MMALIB_STATUS execStatus = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_exec(kernelHandle,
                                                                                      &matA[0],
                                                                                      &matB[0],
                                                                                      &matC[0]);
       
       printf("Exec status = %d.\n", execStatus);
       
       printf("The natural C result is: \n");
       size_t k;
       size_t idx = 0;
       for (i = 0; i < rows; i++) {
          for (k = 0; k < cols; k++){
             printf("%4d ", matC[idx++]);
          }
          printf("\n");
       }
       
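       // Note: funcStyle is changed here without calling _init again. If the handle
       // captures funcStyle at init time (an assumption, not verified here), re-run
       // _init with the updated initArgs before executing the optimized variant.
       initArgs.funcStyle = MMALIB_FUNCTION_OPTIMIZED;
       // Execute the kernel
       execStatus = MMALIB_LINALG_matrixMatrixMultiply_ixX_ixX_oxX_exec(kernelHandle,
                                                                        &matA[0],
                                                                        &matB[0],
                                                                        &matOpt[0]);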
       printf("Exec status = %d.\n", execStatus);
       
       printf("The optimized C result is: \n");
       idx = 0;
       for (i = 0; i < rows; i++) {
          for (k = 0; k < cols; k++){
             printf("%4d ", matOpt[idx++]);
          }
          printf("\n");
       }
       
       printf("MatMulIntrinsics done...\n");
       free(kernelHandle);

    This should produce

    Init check = 0.
    Init status = 0.
    Exec check = 0.
    Exec status = 0.
    The natural C result is:
      15   18   21
      42   54   66
      69   90  111
    Exec status = 0.
    The optimized C result is:
      15   18   21
      42   54   66
      69   90  111
    MatMulIntrinsics done...

    Best,

    Will

  • Hi Will,

    thanks for your code! This was really helpful and I was able to get it running on the board now.

    Kind regards.

    Florian