How to use CMATMPY intrinsic

Alexandre NGUYEN

Hi,

I'd like to know how to use the cmatmpy intrinsic.

I read in "TMS320C6000 Optimizing Compiler v7.4" that the intrinsic is __x128_t _cmatmpy (long long src1, __x128_t src2);

I'm really new at this and I don't understand how it works... I tried _cmatmpy(array,matrix); but the argument types don't match.

Can you clear that up for me please ?

Thank you,

Alex

over 11 years ago

0 HRi over 11 years ago

Guru 10750 points

Hi Alex,

Please check the document sprugh7.pdf -TMS320C66x DSP CPU and Instruction Set paragraph 4.35 CCMATMPY

Thanks,

0 Alexandre NGUYEN over 11 years ago in reply to HRi

Intellectual 260 points

Hi HR,

I already did check this document, I began with this one.

The thing is that the description and examples are in assembly code, and I'd like to use the intrinsic in order to implement C code in CCS...

The function takes two numbers as input, but we are working with a 1x2 array and a 2x2 matrix. I don't understand why we need the conversion from array/matrix to long long/__x128_t. I also looked at the C code example in the document "Optimizing Loops on the C66x DSP" without fully understanding it... Again, I'm really new at this; I'm on an internship where I learn day after day...

Best regards,

Alex

0 Alexandre NGUYEN over 11 years ago in reply to Alexandre NGUYEN

Intellectual 260 points

Hi again,

I found out how to use the function but it doesn't give me the right results...

Here is my code :

void main(){

    int matrix[2][2] = {-3,7,2,1};
    int array[2] = {-5,6};
    long long arr,arr1,arr2;
    __x128_t matrix_128,output;
    long A8,A9,A10,A11;

    arr=_itoll(array[0],array[1]); // conversion of the array in long long
    arr1=_itoll(matrix[0][0],matrix[0][1]); // conversion of the first line of the input matrix into long long
    arr2=_itoll(matrix[1][0],matrix[1][1]); // conversion of the second line of the input matrix into long long
    matrix_128=_llto128(arr1,arr2); // conversion of the input matrix in __x128_t

    output=_cmatmpy(arr,matrix_128);

    A8=_get32_128 (output, 0);
    A9=_get32_128 (output, 1);
    A10=_get32_128 (output, 2);
    A11=_get32_128 (output, 3);

    printf("A11 = %ld, A10 = %ld, A9 = %ld, A8 = %ld",A11,A10,A9,A8);
}

In the console I get :
A11 = -26, A10 = 8, A9 = 29, A8 = -7 instead of 27, 0, -29, 0

Can someone tell me what's wrong ?

Thanks,

Alex

0 Clement FR over 11 years ago in reply to Alexandre NGUYEN

Genius 4740 points

Alexandre,

It is a complex multiply, with each complex number represented as a 32-bit entity (two packed 16-bit numbers).

In your code I don't see any reference to complex numbers, nor to 16-bit numbers packed into a 32-bit one.

Are you trying to do

[ A B ] x [ C D ] = [ G H ]
          [ E F ]

with A = -5, B = 6, C = -3, D = 7, E = 2, F = 1 ?
In that case G = 27 and H = -29.

The code doesn't seem wrong per se, it's just that your understanding of what CMATMPY does is confused.

0 Alexandre NGUYEN over 11 years ago in reply to Clement FR

Intellectual 260 points

Hi Clement,

Thank you for your answer.

That's exactly what I want to do. I'm trying the function with some easy input values in order to understand it... It doesn't matter if I don't use complex values, right ?

But the output doesn't seem correct. Instead of 27, 0, -29, 0 (for real and imaginary parts) I get -26, 8, 29, -7.

Do you know why ?

0 Clement FR over 11 years ago in reply to Alexandre NGUYEN

Genius 4740 points

The output is correct. The input isn't.

The problem is that the intrinsic expects a 32-bit complex number, with the low 16 bits holding the imaginary part and the high 16 bits holding the real part.

When you write

int A = 5;

in memory you have 00000000 00000000 ; 00000000 00000101 (high part = re ; low part = im).

If you tell the compiler that this is the input of your CMATMPY intrinsic, it interprets it as 0 + 5i.

Do you understand your problem ?

0 Alexandre NGUYEN over 11 years ago in reply to Clement FR

Intellectual 260 points

Thank you, I think I have understood indeed.

I have one last question though : what do we have in memory for negative numbers, when I do int A = -5 ?

I thought we had something like 00000000 00000000 ; 10000000 00000101, but when I do the array/matrix computation myself, I do not get the values that CCS gives me.

0 Clement FR over 11 years ago in reply to Alexandre NGUYEN

Genius 4740 points

Assuming you are in a debug configuration,

use the memory browser view in CCS and/or the variable view too. You can then see your 'int' in memory.

It's a good idea to debug step by step and check in memory how things are stored (when you use itoll for example).

0 Alexandre NGUYEN over 11 years ago in reply to Clement FR

Intellectual 260 points

Ok so I saw that we use two's complement to represent a negative number.

For example, -5 is 11111111 11111111 ; 11111111 11111011 in binary.

But when I use the intrinsic, the compiler doesn't understand it as 65535 + i*65531 right ?

0 Clement FR over 11 years ago in reply to Alexandre NGUYEN

Genius 4740 points

Well I don't know

you have to try something like that :

short re, im; //16-bit
int complex; //32-bit

re = -5;
im = 0;

complex = _thegoodintrinsic(re,im);

and see what happens in memory.

or you can use a struct too

pseudo code

typedef struct {
    short re; // 16-bit real part
    short im; // 16-bit imaginary part
} complex;

complex myComplex;

myComplex.re = -5;
myComplex.im = 0;

0 Alexandre NGUYEN over 11 years ago in reply to Clement FR

Intellectual 260 points

Hi,

I'll try with your structure today. I can see that you were trying to get me to work it out on my own, but all of this is really new to me.

Anyway, thanks a lot for your help. I'll come back if necessary, but I hope I won't need to... !

Alexandre

0 Alexandre NGUYEN over 11 years ago in reply to Alexandre NGUYEN

Intellectual 260 points

Hi,

I have another problem with the function. Below is the code I wrote for multiplying a 1x2 vector by a 2x2 matrix when all elements are of type int :

typedef struct {
    int re; // 32-bit
    int im; // 32-bit
} complex;

complex arr1, arr2, mat1, mat2, mat3, mat4;

arr1.re = -5; arr1.im = 0;
arr2.re = 6;  arr2.im = 0;

mat1.re = -3; mat1.im = 0;
mat2.re = 7;  mat2.im = 0;
mat3.re = 2;  mat3.im = 0;
mat4.re = 1;  mat4.im = 0;

int array_1, array_2, matrix_1, matrix_2, matrix_3, matrix_4;

array_1 = _spack2(arr1.re, arr1.im);
array_2 = _spack2(arr2.re, arr2.im);
matrix_1 = _spack2(mat1.re, mat1.im);
matrix_2 = _spack2(mat2.re, mat2.im);
matrix_3 = _spack2(mat3.re, mat3.im);
matrix_4 = _spack2(mat4.re, mat4.im);

long long array, line1_matrix, line2_matrix;

array = _itoll(array_1, array_2);
line1_matrix = _itoll(matrix_1, matrix_2);
line2_matrix = _itoll(matrix_3, matrix_4);

__x128_t matrix, product;

matrix = _llto128(line1_matrix, line2_matrix);
product = _cmatmpy(array, matrix);

long A8, A9, A10, A11;

A8 = _get32_128(product, 0);
A9 = _get32_128(product, 1);
A10 = _get32_128(product, 2);
A11 = _get32_128(product, 3);

printf("A11 = %ld, A10 = %ld, A9 = %ld, A8 = %ld", A11, A10, A9, A8);

This code works correctly for matrices of type int. I'd like to use the intrinsic for matrices of floats. I tried something similar but I didn't succeed because of the argument types.

Can you help me to adapt this ? I think that there shouldn't be a lot to modify...

Thanks,

Alex

0 Alexandre NGUYEN over 11 years ago in reply to Alexandre NGUYEN

Intellectual 260 points

Hmm I'm wondering if that's possible.

Float numbers are coded on 32 bits, and all of them are used...

With integers it was easier since I just took the 16 LSBs of each integer to form the long long.

I can't keep only some of the bits of a float number... Otherwise the number will not be the same.

But I don't think TI has forgotten the possibility of handling matrices of float numbers...

So how can I do this ?

0 Alexandre NGUYEN over 11 years ago in reply to Alexandre NGUYEN

Intellectual 260 points

I'm bumping the topic...

To recap my problem : I'd like to use the complex matrix multiply when my matrices are made up of float numbers. I succeeded for int numbers, but for floats, I really don't see which conversions to do to get a correct input.

Can someone help me please ?

0 Alberto Chessa over 11 years ago in reply to Alexandre NGUYEN

Mastermind 6650 points

Alexandre NGUYEN said:

To recap my problem : I'd like to use the complex matrix multiply when my matrices are made up of float numbers. I succeeded for int numbers, but for floats, I really don't see which conversions to do to get a correct input.

Can someone help me please ?

The C6678 doesn't have instructions for float matrix multiplication, so you cannot do that directly with intrinsics. You have to code the multiplication yourself, maybe optimizing it with another intrinsic such as _complex_conjugate_mpysp. See the source code of the function DSPF_sp_mat_mul_cplx in DSPLIB, in the MCSDK.

0 Alexandre NGUYEN over 11 years ago in reply to Alberto Chessa

Intellectual 260 points

Hi Alberto,

Thank you for your answer !

That's what I thought... I could have looked for the solution for a long time !

Problem solved !

Bye
