66AK2G12: Data-dependent errors using DSPF_sp_fftSPxSP_r2c

Adam Daluga

Part Number: 66AK2G12

Hi,

I need an efficient 32-bit float fft for real input and complex output, which is conveniently provided in DSPF_sp_fftSPxSP_r2c. In fact, the library distribution even includes precisely applicable example code, even identical in that the example is for the same 256-pt length that I need.

My problem is that I’m getting erroneous, data-dependent results. I’ve spent a lot of effort trying to diagnose, to little avail, i.e., I still experience the same errors as when I started. My results are consistent, but data dependent: perfectly accurate for some inputs, significantly wrong for others. I have a collection of passing and failing test examples that I could share in detail, if that seems useful, even though it seems to me that this approach could become rather tedious.

Right now, I’m wondering if I might be better served trying a new version of the library.

I just downloaded the latest from the TI website (http://software-dl.ti.com/sdoemb/sdoemb_public_sw/dsplib/latest//exports/dsplib_c66x_3_4_0_0_Win32.exe), and all the components I’m using match the downloaded files exactly. I seem to be missing another folder called dsplib_c66x_3_4_0_4. Does this mean I do not have the latest version? Where can I find the latest version?

Additionally: Please find the attached MSWord docx file, representing the DSP vs. Matlab fft results comparison that I’ve done, showing the data-dependent errors I experience on the DSP.

Thanks!

Adam

200226a.docx

over 5 years ago

0 Adam Daluga over 5 years ago

TI__Expert 6195 points

Hi,

Any update here?

Here are all the roughly equivalent DSPLIB function calls I’ve tried:

DSPF_sp_fftSPxSP (256, x_sp, w_sp, y_sp, brev, 4, 0, 256); // a Bad

DSPF_sp_fftSPxSP_r2c (256, x_sp, w_sp, y_sp, brev, 4, 0, 256); // b Worse

DSPF_sp_fftSPxSP (256, x_sp, w_sp, y_sp, brev, 2, 0, 256); // c Bad (=a)

DSPF_sp_fftSPxSP_r2c (256, x_sp, w_sp, y_sp, brev, 2, 0, 256); // d Worse (=b)

DSPF_sp_fftSPxSP_cn (256, x_sp, w_sp, y_sp, brev, 4, 0, 256); // e Bad (=a)

DSPF_sp_fftSPxSP_r2c_cn(256, x_sp, w_sp, y_sp, brev, 4, 0, 256); // f Bad (=a)

DSPF_sp_fftSPxSP_opt (256, x_sp, w_sp, y_sp, brev, 4, 0, 256); // g Bad (=a)

The results aren’t identical, but all are similarly “Bad” or “Worse.” I’ve been including the relevant function sources in my project and compiling them together with my own sources, rather than using the DSPLIB as an independently built link library. I tried building the library, but quite unsuccessfully.

Meanwhile, my progress is blocked, so I’m still trying to get some kind of equivalent, sub-optimal 256-pt fft function working. I’m currently trying to build and use Cygwin/gsl, and finding it more complicated than I’d hoped. Ultimately, of course, I’d like to use highly efficient, optimized functions such as those provided in the DSPLIB. Right now I most urgently need to get something working, even if it’s considerably less efficient, so I can continue development. I’d appreciate any help you can offer. I attach a slightly more complete and clearer recap of my fft test results.

Best,

Adam

200227a.docx

0 Rahul Prabhu over 5 years ago in reply to Adam Daluga

TI__Guru** 116020 points

Adam,

Updating to a newer version of the library is unlikely to change the output you are currently observing, so unless you want to move to the latest baseline with the new compiler-based support, I wouldn't expect to see any change in results.

It is not clear to me what the "bad" and "worse" comments are based on. Are they based on FFT plots? Do the natural C implementation and the optimized version produce the same output? Are you running these tests by simply replacing the test vectors in the DSPLIB that we have provided? Can you please confirm that you are meeting the assumptions mentioned in the user guide?

I will review your docs and read about your findings in a little more detail and provide additional inputs if I find any issues with your setup.

Regards,

Rahul

0 Alan Hunt over 5 years ago in reply to Rahul Prabhu

Prodigy 40 points

Rahul: I'm the actual user with the problem. Adam posted the problem to this forum in response to my email report to him. (Thanks, Adam!)

My tests comply with all the assumptions. Interrupts are not an issue. I set up my tests using the fft_sp_ex example code, with only minor changes to supply a wider variety of test cases.

My x, w, and y arrays are all set up identically to the example code, with proper 8-byte alignment, e.g.: #pragma DATA_ALIGN(x, 8)
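The `#pragma DATA_ALIGN` directive is specific to the TI compiler. One way to confirm at run time that a buffer really is 8-byte aligned is to test its address. This is a minimal sketch in portable C11, assuming a host build: the helper `is_aligned` and the buffer `x_sp_host` are hypothetical, and `alignas` stands in for the pragma.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is aligned to `align` bytes,
 * else 0. `align` must be a nonzero power of two. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(align - 1)) == 0;
}

/* Host-side stand-in for:  #pragma DATA_ALIGN(x_sp, 8)
 * The size (2 * 256 floats) is an assumption for illustration. */
alignas(8) float x_sp_host[2 * 256];
```

On the target, the same check can be applied to the real x, w, and y arrays just before the FFT call.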

It appears to me that my problem is deeper and more subtle than anything specifically stipulated in the nominal documentation.

All test results I provided were from use of DSPF_sp_fftSPxSP, which results I'm calling "bad." I tried other functions to see if any others would yield better results with the same inputs, to no avail. Some were just as "bad", some were "worse," but none were better.

-- AKH

0 Alan Hunt over 5 years ago in reply to Alan Hunt

Prodigy 40 points

Please note these important facts:

(1) I obtain the exact same erroneous fft results with DSPF_sp_fftSPxSP and DSPF_sp_fftSPxSP_cn. In other words, my problems are exactly the same even when using the TI-supplied "natural C" implementation.
(2) I get weird, complicated results even in response to the simplest of all inputs, i.e., a Kronecker delta: a single positive value followed by all zeros.
(3) I get exactly the same erroneous results when I build the fft example code directly from a freshly downloaded and installed dsplib directory.
(4) The supplied DSPF_sp_fftSPxSP_cn.c and DSPF_sp_fftSPxSP_r2c_cn.c are identical.

Conclusion: It's not just me. There are real problems with the supplied code in the dsplib distribution.
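The Kronecker delta case in (2) makes a convenient pass/fail check, because the expected answer is known in closed form: for x[0] = A and all other samples zero, every DFT bin equals A + 0j. A minimal sketch of such a check follows, assuming the output is stored as interleaved re/im floats. `check_delta_spectrum` is a hypothetical helper, not part of DSPLIB.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper. y holds n complex bins as interleaved re, im floats.
 * Returns the index of the first bin that differs from (amp + 0j) by more
 * than tol in either component, or -1 if every bin matches. */
int check_delta_spectrum(const float *y, int n, float amp, float tol)
{
    int k;

    assert(n > 0 && tol >= 0.0f);
    for (k = 0; k < n; k++) {
        if (fabsf(y[2 * k] - amp) > tol || fabsf(y[2 * k + 1]) > tol)
            return k;
    }
    return -1;
}
```

Any "complicated" spectrum returned for a delta at index 0 would show up here as a non-negative return value identifying the first offending bin.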

0 Alan Hunt over 5 years ago in reply to Alan Hunt

Prodigy 40 points

The problem appears to stem from a mistake made by someone who (1) thought that endianness needed to be "handled" even though it doesn't matter at all, and (2) applied mods that perhaps better correspond to fft vs ifft? I dunno. But anyway... Here's how to modify DSPF_sp_fftSPxSP_cn to obtain correct results.

Replace this:            | With this:
#ifdef _LITTLE_ENDIAN    |
    si1 = w[j];          |
    co1 = w[j + 1];      |
    si2 = w[j + 2];      |
    co2 = w[j + 3];      | co1 = w[j];
    si3 = w[j + 4];      | si1 = w[j + 1];
    co3 = w[j + 5];      | co2 = w[j + 2];
#else                    | si2 = w[j + 3];
    co1 = w[j];          | co3 = w[j + 4];
    si1 = -w[j + 1];     | si3 = w[j + 5];
    co2 = w[j + 2];      |
    si2 = -w[j + 3];     |
    co3 = w[j + 4];      |
    si3 = -w[j + 5];     |
#endif                   |

Note that the replacement on the right matches neither case on the left, but is similar to both. Also note that the former / broken / incorrect version of DSPF_sp_fftSPxSP_cn.c yields erroneous results identical to the erroneous results of DSPF_sp_fftSPxSP and DSPF_sp_fftSPxSP_opt.
Can we please get all these problems cleaned up in the dsplib distribution?

0 Rahul Prabhu over 5 years ago in reply to Alan Hunt

TI__Guru** 116020 points

AKH,

I will report your findings and check whether the developers can look into them and address the issue. I am not sure how this explains the fact that you were sometimes seeing accurate results while at other times the issue showed up with your code. I am assuming that all of your code is little endian and that the correct defines and library were used in your code.

Regards,

Rahul

0 Alan Hunt over 5 years ago in reply to Rahul Prabhu

Prodigy 40 points

My results never varied. I only said that the errors were "data dependent." In other words, results were sometimes ok, sometimes not too bad, sometimes ridiculous, but always consistent given the same input. The problem is that the fft is implemented incorrectly in the dsplib. It has nothing to do with endianness. The endianness check in the _cn version is a programmer's mistake. The optimized versions match the incorrect "little endian" case in the _cn function. Anyway, yes, my endianness is fine, correctly defined, just irrelevant to the problem.
