TMS320C6672: About the Accuracy of the QR Decomposition Routine of DSPLIB

silacko

Part Number: TMS320C6672

Hi,

I've been working on a C6672 DSP, and I've recently come across a problem regarding a DSPLIB (3.4.0.0) library function, namely "DSPF_sp_qrd", which takes a real single-precision floating-point matrix as input and decomposes it into an orthogonal matrix, Q, and an upper-triangular matrix, R. Although it yields acceptable results most of the time, I'm experiencing rather poor accuracy for certain matrices. Take the following matrix, for instance:

A =

[ 2.1945 -0.0050 -0.0000 0.0000 ]

[ -0.0050 1.0758 -0.0007 0.0000 ]

[ -0.0000 -0.0004 1.2757 -0.0002 ]

[ 0.0000 0.0000 -0.0001 -0.1320 ]

(These might not reflect the exact numbers I have due to being rounded to 4 decimal places, but you should get the idea.)

When I decompose this matrix, I get the following Q and R:

Q =

[ 1.0000 -0.0023 -0.0000 -0.0000 ]

[ -0.0023 -1.0180 -0.0003 -0.0000 ]

[ -0.0000 0.0003 -0.0778 -0.0001 ]

[ 0.0000 -0.0000 0.0010 -0.9772 ]

R =

[ 2.1945 -0.0075 -0.0000 0.0000 ]

[ 0.0000 -1.0951 0.0012 -0.0000 ]

[ 0.0000 0.0000 -0.0993 -0.0001 ]

[ 0.0000 0.0000 0.0000 0.1290 ]

I noticed that the results are incorrect because when I multiply Q and R, the result is nowhere near the original matrix. Also, Q is clearly not an orthogonal matrix; the third column gives it away. I also tested this on the C equivalent function "DSPF_sp_qrd_cn" (using GCC on my PC instead of C6672), but it gave the same results as well.
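For anyone who wants to reproduce the two checks described above, here is a minimal sketch in plain C. The function names are hypothetical (they are not part of DSPLIB), and the code assumes square, row-major matrices. The first function measures how far Q*R is from A, the second how far Q^T*Q is from the identity.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Largest absolute entry of (Q*R - A); all matrices are n x n, row-major. */
float qr_reconstruction_error(size_t n, const float *A,
                              const float *Q, const float *R)
{
    float worst = 0.0f;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += Q[i * n + k] * R[k * n + j];
            float err = fabsf(acc - A[i * n + j]);
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}

/* Largest absolute entry of (Q^T*Q - I); zero for a perfectly orthogonal Q. */
float qr_orthogonality_error(size_t n, const float *Q)
{
    float worst = 0.0f;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += Q[k * n + i] * Q[k * n + j];   /* column i dot column j */
            float err = fabsf(acc - (i == j ? 1.0f : 0.0f));
            if (err > worst)
                worst = err;
        }
    }
    return worst;
}
```

With the rounded Q listed above, the orthogonality error comes out close to 1, because the third column has a norm of roughly 0.078 instead of 1.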

Could I be missing something? Or might the function have a restriction to its inputs that is not documented in the API reference?

Thanks in advance.

over 4 years ago

0 silacko over 4 years ago

Intellectual 640 points

I'd really appreciate it if someone could help me out with this.

Regards,
Silacko

0 Mukul Bhatnagar over 4 years ago in reply to silacko

TI__Guru* 84005 points

Sorry we missed this query. I have routed this to a colleague who is more familiar with these routines. Will keep you posted.

Regards

Mukul

0 Mukul Bhatnagar over 4 years ago in reply to silacko

TI__Guru* 84005 points

Hi
Can you try disabling the flag "ENABLE_NR", which is currently enabled by default in the code, and see if it makes any difference?

0 silacko over 4 years ago in reply to Mukul Bhatnagar

Intellectual 640 points

Hi, thanks for your response. I have a few questions:

1) Do I have to rebuild DSPLIB just to disable the flag?
2) What does NR stand for?
3) I noticed that the flag "ENABLE_NR" is not referenced at all in "DSPF_sp_qrd_cn.c", where the decomposition also yields inaccurate results, so it's safe to assume that disabling the flag won't resolve the issue, don't you think?

Regards,
Silacko

0 Deepak Poddar over 4 years ago in reply to silacko

TI__Expert 4725 points

Hi, please find my answers below, in the order of your questions.

1) You can simply copy the optimized source into your project and build it along with your project. That is the simplest way to check whether the issue is resolved or not.

2) NR stands for the Newton-Raphson method, which is used together with the DSP intrinsics to increase floating-point precision. If that flag is disabled, normal division is performed.

3) If "DSPF_sp_qrd_cn.c" also gives inaccurate results, you may try an open-source C function for this purpose. Not much SIMD vectorization is possible for this function, so simply cross-compiling open-source code might be OK. You will find that there is not much difference in compute performance between the natural C version and the optimized version.

Regards

Deepak Poddar

0 silacko over 4 years ago in reply to Deepak Poddar

Intellectual 640 points

Hi,

Thanks for the reply.

From your answer to my 3rd question, I infer that disabling NR will be pointless since the issue is not related to optimization, and that the DSPLIB routine for QR decomposition does not involve much of an optimization in the first place, which is unfortunate for me because I needed an optimized implementation for QR.

Anyways, I think I'll either find an open-source code or implement the function myself. I'm still curious about why the DSPLIB routine is failing, though.

Regards,

Silacko

+1 Deepak Poddar over 4 years ago in reply to silacko

TI__Expert 4725 points

Hi,

Basically, it is hard to vectorize the operations involved in QR decomposition, so the existing code applies only simple optimizations: avoiding pointer aliasing and giving the compiler loop-count information. You may take cues from open-source code and modify the existing DSPLIB code while keeping these basic optimizations.
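As a rough sketch of those two basic optimizations (not taken from the DSPLIB sources): `restrict` tells the compiler that the pointers do not alias, and the TI `MUST_ITERATE` pragma passes trip-count information. The function name `axpy_sub` and the multiple-of-4 requirement are assumptions for illustration. The pragma is guarded so that the same file also builds with a host compiler such as GCC.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] -= alpha * x[i]: the kind of column update found in a QR inner loop.
   `restrict` promises the compiler that x and y never overlap. */
void axpy_sub(size_t n, float alpha,
              const float *restrict x, float *restrict y)
{
    /* Assumption for this sketch: n is a positive multiple of 4. */
    assert(n > 0 && n % 4 == 0);
#ifdef __TI_COMPILER_VERSION__
    /* TI compiler only: at least 4 iterations, trip count a multiple of 4. */
#pragma MUST_ITERATE(4, , 4)
#endif
    for (size_t i = 0; i < n; i++)
        y[i] -= alpha * x[i];
}
```

The caller has to honor both promises. Passing overlapping buffers or a length that is not a multiple of 4 would make the hints incorrect.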

Regards

Deepak Poddar
