Is there a reference QR decomposition algorithm available for the C64x+?

Tobias53434

Hi,

Does anyone know if there is a reference assembly implementation of the QR decomposition algorithm available for the C64x+?

Thanks.

over 15 years ago

0 Senthil Kumar Yogamani over 15 years ago

TI__Intellectual 1450 points

Hello Tobias,

Are you looking for an optimized implementation or just some reference code which runs on C64x+?

I am not aware of any optimized implementation. But I can point you to some reference code.

Also please mention the size of the matrix, the data type of the elements of the matrix and expected performance.

It would be good to start with the reference C code and observe the performance.

Regards

Senthil

0 Tobias53434 over 15 years ago in reply to Senthil Kumar Yogamani

Prodigy 150 points

Hello Senthil

Currently, I am looking for fixed-point (Q15 and Q31) reference code for a complex 2x2 QRD based on Givens rotations or the Gram-Schmidt algorithm. I already have modified fixed-point and floating-point C implementations of these algorithms.

Best regards

Tobias

0 Senthil Kumar Yogamani over 15 years ago in reply to Tobias53434

TI__Intellectual 1450 points

Tobias,

From what I understand, you have fixed-point and floating-point implementations and you are trying to convert these to Q-point implementations?

It should be straightforward if you understand Q-point arithmetic. Please let me know if you have any difficulty.

As the dimension is only 2x2, the compiler should be able to do a good job in optimizing the C code.

Regards,

Senthil
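As an illustration of the Q-point arithmetic mentioned above, here is a minimal sketch of rounded, saturating Q15 and Q31 multiplication in portable C. The function names `sat_q15`, `q15_mul` and `q31_mul` are hypothetical and were not posted in this thread. The sketch assumes that right-shifting a negative signed integer is an arithmetic shift, which is implementation-defined in C but holds on common compilers.

```c
#include <stdint.h>

/* Saturate a 32-bit value to the Q15 range [-32768, 32767]. */
int16_t sat_q15(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Q15 * Q15 -> Q15.
 * The 32-bit product of two Q15 values is in Q30, so add half an
 * output LSB for rounding and shift right by 15 to return to Q15.
 * Only (-1.0) * (-1.0) overflows, and saturation handles it. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 */
    p += (int32_t)1 << 14;                /* round to nearest */
    return sat_q15(p >> 15);              /* assumes arithmetic shift */
}

/* Q31 * Q31 -> Q31, same idea with a 64-bit intermediate (Q62). */
int32_t q31_mul(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;  /* Q62 */
    p += (int64_t)1 << 30;                /* round to nearest */
    p >>= 31;                             /* assumes arithmetic shift */
    if (p > INT32_MAX) p = INT32_MAX;     /* (-1.0) * (-1.0) case */
    return (int32_t)p;
}
```

For example, `q15_mul(16384, 16384)` multiplies 0.5 by 0.5 and returns 8192, which is 0.25 in Q15.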

0 Tobias53434 over 15 years ago in reply to Senthil Kumar Yogamani

Prodigy 150 points

Senthil,

Thank you for your time.

I have now written my own code in linear assembly. However, when I mark several instructions for parallel execution, the compiler drops some of the parallel (||) markers in the generated assembly.

(A_xx = .rega, B_xx = .regb) (C64X+)

Linear Assembly
          MPY32     .M1    A_tmp1,        A_tmp1,        A_mpy2:A_mpy3
    ||    MPY32     .M2    B_tmp1,        B_tmp1,        B_mpy2:B_mpy3
    ||    SHL       .S1    A_mpy0,        Q_int,         A_mpy0
    ||    SHL       .S2    B_mpy0,        Q_int,         B_mpy0

Assembly
1180044C 0956AA01            MPY32.M1      A21,A21,A19:A18
11800450 084E6A02 ||         MPY32.M2      B19,B19,B17:B16
11800454 04A4ACE2            SHL.S2        B9,B5,B9
11800458 67E2                SHL.S1        A7,A3,A7

Regards,

Tobias

0 Senthil Kumar Yogamani over 15 years ago in reply to Tobias53434

TI__Intellectual 1450 points

Tobias,

It is difficult to help you with the limited information above. I can't spot anything wrong with your code.

If you can send me a stand-alone project, I can try to help you better.

Please email me @ ysenthil@ti.com if possible.

Regards

Senthil

0 Tobias53434 over 15 years ago in reply to Senthil Kumar Yogamani

Prodigy 150 points

Hi,

I found the problem. There was an external link to a makefile that enabled the full debug option.

Thanks for your help.

0 Senthil Kumar Yogamani over 15 years ago in reply to Tobias53434

TI__Intellectual 1450 points

Tobias,

I am glad that you could fix the problem.

Thanks for letting us know.

Regards

Senthil
