Is there a reference QR decomposition algorithm available for the C64x+?

Hi,

Does anyone know if there is a reference assembly implementation of the QR decomposition algorithm available for the C64x+?

Thanks.

  • Hello Tobias,

    Are you looking for an optimized implementation or just some reference code which runs on C64x+?

    I am not aware of any optimized implementation. But I can point you to some reference code.

    Also please mention the size of the matrix, the data type of the elements of the matrix and expected performance.

    It would be good to start with the reference C code and observe the performance.

    Regards

    Senthil

  • Hello Senthil

    Currently, I am looking for fixed-point (Q15 and Q31) reference code for a complex 2x2 QRD based on Givens rotations or the Gram-Schmidt algorithm. I already have modified fixed-point and floating-point C implementations of these algorithms.

    Best regards

    Tobias
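
For readers unfamiliar with the Givens approach mentioned above, the complex 2x2 case can be written in closed form. This is a generic textbook-style sketch, not code or notation from the thread; the symbols a, b, c, d, r, G, Q and R are introduced here for illustration only, and r > 0 is assumed.

```latex
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
r = \sqrt{|a|^2 + |c|^2} > 0, \qquad
G = \frac{1}{r}\begin{pmatrix} \bar{a} & \bar{c} \\ -c & a \end{pmatrix}

R = G A = \begin{pmatrix} r & \dfrac{\bar{a}\,b + \bar{c}\,d}{r} \\[2mm] 0 & \dfrac{a d - b c}{r} \end{pmatrix}, \qquad
Q = G^{H} = \frac{1}{r}\begin{pmatrix} a & -\bar{c} \\ c & \bar{a} \end{pmatrix}, \qquad
A = Q R
```

G is unitary because its rows are orthonormal, so a single rotation is enough to zero the lower-left element. In a fixed-point version the division by r is usually the delicate step, since it needs a square root and a reciprocal with sufficient headroom.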

  • Tobias,

    From what I understand, you have fixed-point and floating-point implementations and you are trying to convert them to Q-point implementations?

    It should be straightforward if you understand Q-point arithmetic. Please let me know if you have any difficulty.

    As the dimension is only 2x2, the compiler should be able to do a good job in optimizing the C code.

    Regards,

    Senthil
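
As an illustration of the Q-point arithmetic referred to above, here is a minimal portable C sketch of saturating Q15 operations, including the complex multiply a complex QRD would rely on. It is not code from the thread: the names `q15_t`, `cq15_t`, `q15_sat`, `q15_round`, `q15_mul`, `q15_add` and `cq15_mul` are hypothetical, and the sketch assumes that right-shifting a negative signed integer is an arithmetic shift, which is the usual behavior but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical Q15 types: a raw value x represents x / 32768. */
typedef int16_t q15_t;
typedef struct { q15_t re, im; } cq15_t;

/* Clamp a wide intermediate result to the Q15 range. */
static q15_t q15_sat(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (q15_t)x;
}

/* Convert a Q30 accumulator to Q15: add half an LSB, shift, saturate.
 * Assumes >> on a negative value is an arithmetic shift. */
static q15_t q15_round(int64_t acc_q30)
{
    return q15_sat((acc_q30 + (1 << 14)) >> 15);
}

/* Q15 * Q15 gives a Q30 product, which is rounded back to Q15. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    return q15_round((int64_t)a * b);
}

/* Saturating Q15 addition. */
static q15_t q15_add(q15_t a, q15_t b)
{
    return q15_sat((int64_t)a + b);
}

/* Complex Q15 multiply. The sums are formed in 64 bits because
 * two Q30 products added together can exceed the int32_t range. */
static cq15_t cq15_mul(cq15_t x, cq15_t y)
{
    int64_t re = (int64_t)x.re * y.re - (int64_t)x.im * y.im;
    int64_t im = (int64_t)x.re * y.im + (int64_t)x.im * y.re;
    cq15_t z = { q15_round(re), q15_round(im) };
    return z;
}
```

With these helpers, 0.5 * 0.5 is `q15_mul(16384, 16384)`, which yields 8192 (0.25 in Q15). A Q31 variant would follow the same pattern with `int32_t` storage, a shift of 31, and a rounding constant of `1 << 30`.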

     

  • Senthil,

    thank you for your time.

    I have now written my own code in linear assembly. However, when I mark several instructions for parallel execution, the compiler removes some of the parallel markers (||).

    (A_xx = .rega, B_xx = .regb) (C64X+)

    Linear Assembly
              MPY32     .M1    A_tmp1,    A_tmp1,    A_mpy2:A_mpy3
        ||    MPY32     .M2    B_tmp1,    B_tmp1,    B_mpy2:B_mpy3
        ||    SHL       .S1    A_mpy0,    Q_int,     A_mpy0
        ||    SHL       .S2    B_mpy0,    Q_int,     B_mpy0

    Assembly
    1180044C 0956AA01            MPY32.M1      A21,A21,A19:A18
    11800450 084E6A02 ||         MPY32.M2      B19,B19,B17:B16
    11800454 04A4ACE2            SHL.S2        B9,B5,B9
    11800458 67E2                SHL.S1        A7,A3,A7

    Regards,

    Tobias

     

  • Tobias,

    It is difficult to help you with the limited information above. I can't spot anything wrong with your code.

    If you can send me a stand-alone project, I can try to help you better.

    Please email me at ysenthil@ti.com if possible.

    Regards

    Senthil

  • Hi,

    I found the problem: the project had an external link to a makefile that enabled the full debug option.

    Thanks for your help.

  • Tobias,

    I am glad that you could fix the problem.

    Thanks for letting us know.

    Regards

    Senthil