Hi,
Does anyone know whether a reference assembly implementation of the QR decomposition algorithm is available for the C64x+?
Thanks.
Hello Tobias,
Are you looking for an optimized implementation or just some reference code which runs on C64x+?
I am not aware of any optimized implementation. But I can point you to some reference code.
Also, please mention the size of the matrix, the data type of its elements, and the expected performance.
It would be good to start with the reference C code and observe the performance.
Regards
Senthil
Hello Senthil
Currently, I am looking for fixed-point (Q15 and Q31) reference code for a complex 2x2 QRD based on the Givens rotation or Gram-Schmidt algorithm. I already have modified fixed-point and floating-point C implementations of these algorithms.
Best regards
Tobias
Tobias,
From what I understand, you have fixed-point and floating-point implementations and you are trying to convert them to Q-point implementations?
It should be straightforward if you understand Q-point arithmetic. Please let me know if you have any difficulty.
As the dimension is only 2x2, the compiler should be able to do a good job in optimizing the C code.
Regards,
Senthil
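As an illustration of the Q-point arithmetic mentioned above, the following is a minimal portable C sketch of Q15 and Q31 multiplication with rounding and saturation. The function names q15_mul and q31_mul are hypothetical, and the code assumes that right-shifting a negative signed integer is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C. It is a sketch for checking results on a host, not an optimized C64x+ routine.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15, round to nearest, saturate.
 * The 32-bit product of two Q15 values is in Q30 format. */
static int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 product */
    p += (int32_t)1 << 14;                 /* add 0.5 LSB of the Q15 result */
    p >>= 15;                              /* Q30 -> Q15 (arithmetic shift assumed) */
    if (p > INT16_MAX) return INT16_MAX;   /* only -1.0 * -1.0 reaches this */
    if (p < INT16_MIN) return INT16_MIN;
    return (int16_t)p;
}

/* Q31 x Q31 -> Q31, round to nearest, saturate.
 * The 64-bit product of two Q31 values is in Q62 format. */
static int32_t q31_mul(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * (int64_t)b;   /* Q62 product */
    p += (int64_t)1 << 30;                 /* add 0.5 LSB of the Q31 result */
    p >>= 31;                              /* Q62 -> Q31 (arithmetic shift assumed) */
    if (p > INT32_MAX) return INT32_MAX;   /* only -1.0 * -1.0 reaches this */
    if (p < INT32_MIN) return INT32_MIN;
    return (int32_t)p;
}
```

For example, q15_mul(16384, 16384) multiplies 0.5 by 0.5 and returns 8192, which is 0.25 in Q15. The only input pair that needs saturation is -1.0 times -1.0, since +1.0 is not representable in either format.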
Senthil,
thank you for your time.
I have now written my own code in linear assembly. However, when I mark a group of instructions for parallel execution, the compiler drops some of the parallel markers (||) in the generated code.
(A_xx = .rega, B_xx = .regb) (C64X+)
Linear assembly:

       MPY32 .M1  A_tmp1, A_tmp1, A_mpy2:A_mpy3
    || MPY32 .M2  B_tmp1, B_tmp1, B_mpy2:B_mpy3
    || SHL   .S1  A_mpy0, Q_int, A_mpy0
    || SHL   .S2  B_mpy0, Q_int, B_mpy0

Assembly:

    1180044C 0956AA01      MPY32.M1 A21,A21,A19:A18
    11800450 084E6A02   || MPY32.M2 B19,B19,B17:B16
    11800454 04A4ACE2      SHL.S2   B9,B5,B9
    11800458 67E2          SHL.S1   A7,A3,A7
Regards,
Tobias
Tobias,
It is difficult to help you with the limited information above. I can't spot anything wrong with your code.
If you can send me a stand-alone project, I can try to help you better.
Please email me at ysenthil@ti.com if possible.
Regards
Senthil
Hi,
I found the problem. The project linked to an external makefile that enabled the full debug option.
Thanks for your help.
Tobias,
I am glad that you could fix the problem.
Thanks for letting us know.
Regards
Senthil