TMS320C6672: About the Accuracy of the QR Decomposition Routine of DSPLIB

Part Number: TMS320C6672

Hi,

I've been working on a C6672 DSP, and I've recently come across a problem regarding a DSPLIB (3.4.0.0) library function, namely "DSPF_sp_qrd", which takes a real single-precision floating-point matrix as input and decomposes it into an orthogonal matrix, Q, and an upper-triangular matrix, R. Although it yields acceptable results most of the time, I'm experiencing rather poor accuracy for certain matrices. Take the following matrix, for instance:

A =

    [  2.1945 -0.0050 -0.0000  0.0000 ]
    [ -0.0050  1.0758 -0.0007  0.0000 ]
    [ -0.0000 -0.0004  1.2757 -0.0002 ]
    [  0.0000  0.0000 -0.0001 -0.1320 ]

(These might not reflect the exact numbers I have due to being rounded to 4 decimal places, but you should get the idea.)

When I decompose this matrix, I get the following Q and R:

Q =

    [  1.0000 -0.0023 -0.0000 -0.0000 ]
    [ -0.0023 -1.0180 -0.0003 -0.0000 ]
    [ -0.0000  0.0003 -0.0778 -0.0001 ]
    [  0.0000 -0.0000  0.0010 -0.9772 ]

R =

    [  2.1945 -0.0075 -0.0000  0.0000 ]
    [  0.0000 -1.0951  0.0012 -0.0000 ]
    [  0.0000  0.0000 -0.0993 -0.0001 ]
    [  0.0000  0.0000  0.0000  0.1290 ]

I noticed that the results are incorrect because when I multiply Q and R, the result is nowhere near the original matrix. Also, Q is clearly not an orthogonal matrix; the third column gives it away. I also tested this on the C equivalent function "DSPF_sp_qrd_cn" (using GCC on my PC instead of C6672), but it gave the same results as well.
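The two observations above (Q*R does not reproduce A, and Q is not orthogonal) can be checked numerically. The following is a minimal sketch in plain C, using the rounded values posted above rather than the exact ones; the function names `qr_reconstruction_error` and `qr_orthogonality_error` are made up for this illustration and are not part of DSPLIB.

```c
#include <math.h>

#define N 4

/* Values copied from the post above (rounded to 4 decimal places). */
static const float A[N][N] = {
    { 2.1945f, -0.0050f, -0.0000f,  0.0000f},
    {-0.0050f,  1.0758f, -0.0007f,  0.0000f},
    {-0.0000f, -0.0004f,  1.2757f, -0.0002f},
    { 0.0000f,  0.0000f, -0.0001f, -0.1320f}};

static const float Q[N][N] = {
    { 1.0000f, -0.0023f, -0.0000f, -0.0000f},
    {-0.0023f, -1.0180f, -0.0003f, -0.0000f},
    {-0.0000f,  0.0003f, -0.0778f, -0.0001f},
    { 0.0000f, -0.0000f,  0.0010f, -0.9772f}};

static const float R[N][N] = {
    { 2.1945f, -0.0075f, -0.0000f,  0.0000f},
    { 0.0000f, -1.0951f,  0.0012f, -0.0000f},
    { 0.0000f,  0.0000f, -0.0993f, -0.0001f},
    { 0.0000f,  0.0000f,  0.0000f,  0.1290f}};

/* Largest absolute entry of (q*r - a). Hypothetical helper name. */
float qr_reconstruction_error(const float q[N][N], const float r[N][N],
                              const float a[N][N])
{
    float worst = 0.0f;
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k)
                sum += q[i][k] * r[k][j];
            float err = fabsf(sum - a[i][j]);
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}

/* Largest absolute entry of (q^T * q - I). Hypothetical helper name. */
float qr_orthogonality_error(const float q[N][N])
{
    float worst = 0.0f;
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k)
                sum += q[k][i] * q[k][j];   /* column i dot column j */
            float err = fabsf(sum - (i == j ? 1.0f : 0.0f));
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}
```

Calling `qr_reconstruction_error(Q, R, A)` and `qr_orthogonality_error(Q)` on these values gives errors of order 1, dominated by the third row and column: the (3,3) entry of Q*R is roughly 0.008, whereas A has 1.2757 there. A correct single-precision decomposition would be expected to give errors several orders of magnitude smaller.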

Could I be missing something? Or might the function have a restriction to its inputs that is not documented in the API reference?

Thanks in advance.

  • I'd really appreciate it if someone could help me out on this.

    Regards,
    Silacko

  • Hi 

    Sorry we missed this query. I have routed this to a colleague who is more familiar with these routines. Will keep you posted.

    Regards

    Mukul

  • Hi,
    Can you try disabling the flag “ENABLE_NR”, which is currently enabled by default in the code, and see if it makes any difference?

  • Hi, thanks for your response. I have a few questions:

    1. Do I have to rebuild DSPLIB just to disable the flag?
    2. What does NR stand for?
    3. I noticed the flag "ENABLE_NR" is not referenced at all in "DSPF_sp_qrd_cn.c", where the decomposition also yields inaccurate results, so it's safe to assume disabling the flag won't resolve the issue, don't you think?

    Regards,
    Silacko

  • Hi, please find my answers below, in the order of your questions:

    1) You can simply copy the optimized code into your project and build it along with your project. This is the simplest way to check whether the issue is resolved or not.

    2) NR stands for the Newton-Raphson method, which is used together with DSP intrinsics to increase floating-point precision. If that flag is disabled, normal division is performed.

    3) If "DSPF_sp_qrd_cn.c" also gives inaccurate results, you may try any open-source C function for this purpose. Not much SIMD vectorization is possible for this function, so simply cross-compiling open-source code might be OK. You will find that there is not much difference in compute performance between the natural C version and the optimized version.

    Regards

    Deepak Poddar
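To illustrate the Newton-Raphson idea mentioned in answer 2, here is a minimal portable-C sketch of refining a coarse reciprocal estimate. It is not the DSPLIB code: `coarse_recip` is a hypothetical stand-in that imitates a low-precision hardware reciprocal estimate by keeping only 8 mantissa bits, and `nr_recip` is an illustrative name. Each iteration x = x * (2 - a * x) roughly doubles the number of correct bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a coarse hardware reciprocal estimate:
 * compute 1/a, then keep only the sign, exponent and top 8 mantissa bits. */
static float coarse_recip(float a)
{
    float x = 1.0f / a;
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= 0xFFFF8000u;            /* clear the low 15 mantissa bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Newton-Raphson refinement of 1/a: x_{n+1} = x_n * (2 - a * x_n).
 * With an 8-bit starting estimate, two iterations are about enough
 * for single precision. Illustrative name, not a DSPLIB function. */
float nr_recip(float a, int iterations)
{
    float x = coarse_recip(a);
    for (int i = 0; i < iterations; ++i)
        x = x * (2.0f - a * x);
    return x;
}
```

For example, `nr_recip(3.0f, 0)` returns the coarse estimate, while `nr_recip(3.0f, 2)` should agree with `1.0f / 3.0f` to within about single-precision rounding. Stopping after too few iterations leaves a visible error, which is one reason to compare against plain division when debugging accuracy.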

  • Hi,

    Thanks for the reply.

    From your answer to my 3rd question, I infer that disabling NR will be pointless since the issue is not related to optimization, and that the DSPLIB routine for QR decomposition does not involve much of an optimization in the first place, which is unfortunate for me because I needed an optimized implementation for QR.

    Anyways, I think I'll either find some open-source code or implement the function myself. I'm still curious about why the DSPLIB routine is failing, though.

    Regards,

    Silacko

  • Hi,

    Basically, the operations involved in QR decomposition are hard to vectorize, so the existing code applies only simple optimizations: avoiding pointer aliasing and giving the compiler loop-count information. You can take cues from open-source code and modify the existing DSPLIB code while keeping these basic optimizations.

    Regards

    Deepak Poddar