DSPF_sp_fftSPxSP producing wrong results

Kikmelon

I started with the FFT example provided with DSPLIB (version 3.4.0.0) which I found in C:\ti\dsplib_c66x_3_4_0_0\examples\fft_sp_ex\fft_example_sp.c.

I noticed that for many strictly real inputs I tried, the data does not show the expected symmetry. I modified the example and produced the output given below. The DFT function I wrote does not agree with TI's FFT function, but the DFT function does agree with Octave.

Am I missing something here?

I should mention that I've tried this with N = 256, and I get similar results.

Here is my code and my output:

#include <stdint.h>
#include <math.h>
#include <ti/dsplib/dsplib.h>
#include <stdio.h>
#include <complex.h>


#define MAXN 256

#pragma DATA_ALIGN(x_sp, 8);
float   x_sp [2*MAXN];
#pragma DATA_ALIGN(y_sp, 8);
float   y_sp [2*MAXN];
#pragma DATA_ALIGN(w_sp, 8);
float   w_sp [2*MAXN];

unsigned char brev[64] = {
    0x0, 0x20, 0x10, 0x30, 0x8, 0x28, 0x18, 0x38,
    0x4, 0x24, 0x14, 0x34, 0xc, 0x2c, 0x1c, 0x3c,
    0x2, 0x22, 0x12, 0x32, 0xa, 0x2a, 0x1a, 0x3a,
    0x6, 0x26, 0x16, 0x36, 0xe, 0x2e, 0x1e, 0x3e,
    0x1, 0x21, 0x11, 0x31, 0x9, 0x29, 0x19, 0x39,
    0x5, 0x25, 0x15, 0x35, 0xd, 0x2d, 0x1d, 0x3d,
    0x3, 0x23, 0x13, 0x33, 0xb, 0x2b, 0x1b, 0x3b,
    0x7, 0x27, 0x17, 0x37, 0xf, 0x2f, 0x1f, 0x3f
};

static void print_cpx(const float* x, int n)
{
	for(int i = 0; i < n; i++)
		printf("%4d: %10.4f %+10.4fi\n",i, x[2*i],x[2*i+1] );
	printf("\n");
}

void generateInput(int N) {
    for(int i = 0; i < N; ++i) {
    	x_sp[2*i]  = (float)i/4.0f;
    	x_sp[2*i] += (i%4)==0 ? 1.0f : 0.0f;
    	x_sp[2*i+1] = 0.0f;
    }
}

// a simple (but slow N^2) DFT function.
void slow_dft( const float complex* xin, float complex* xout, unsigned int N )
{
	double twoPi_N = 2.0f*3.141592654/(double)N;

    for(int k = 0; k < N; ++k)
    {
        double complex Xk = 0.0;

        double w = k*twoPi_N;

        for(int n = 0; n < N; ++n)
        {
        	double complex e_iwn = cexp(-I*w*n);
            Xk += xin[n]*e_iwn;
        }

        xout[k] = (float complex)Xk;
    }
}


/* Function for generating Specialized sequence of twiddle factors */
void gen_twiddle_fft_sp (float *w, int n)
{
    int i, j, k;
    double x_t, y_t, theta1, theta2, theta3;
    const double PI = 3.141592654;

    for (j = 1, k = 0; j <= n >> 2; j = j << 2)
    {
        for (i = 0; i < n >> 2; i += j)
        {
            theta1 = 2 * PI * i / n;
            x_t = cos (theta1);
            y_t = sin (theta1);
            w[k] = (float) x_t;
            w[k + 1] = (float) y_t;

            theta2 = 4 * PI * i / n;
            x_t = cos (theta2);
            y_t = sin (theta2);
            w[k + 2] = (float) x_t;
            w[k + 3] = (float) y_t;

            theta3 = 6 * PI * i / n;
            x_t = cos (theta3);
            y_t = sin (theta3);
            w[k + 4] = (float) x_t;
            w[k + 5] = (float) y_t;
            k += 6;
        }
    }
}


void main (void) {

       int N = 16;

	/* Generate the input data */
	generateInput(N);
	printf("Input:\n");
	print_cpx(x_sp,N);

	/* Genarate twiddle factors */
	gen_twiddle_fft_sp(w_sp, N);

	/* Call FFT routine */
	DSPF_sp_fftSPxSP(N, x_sp, w_sp, y_sp, brev, 4, 0, N);
	printf("TI FFT:\n");
    print_cpx(y_sp,N);


    /* Re-generate the input data */
    generateInput(N);
    slow_dft((const float complex*)x_sp,(float complex*)y_sp,N);
    printf("Simple DFT:\n");
    print_cpx(y_sp,N);
}

///////////////////////////////////////////////////////////////////////////

Input:
   0:     1.0000    +0.0000i
   1:     0.2500    +0.0000i
   2:     0.5000    +0.0000i
   3:     0.7500    +0.0000i
   4:     2.0000    +0.0000i
   5:     1.2500    +0.0000i
   6:     1.5000    +0.0000i
   7:     1.7500    +0.0000i
   8:     3.0000    +0.0000i
   9:     2.2500    +0.0000i
  10:     2.5000    +0.0000i
  11:     2.7500    +0.0000i
  12:     4.0000    +0.0000i
  13:     3.2500    +0.0000i
  14:     3.5000    +0.0000i
  15:     3.7500    +0.0000i

TI FFT:
   0:    34.0000    +0.0000i
   1:     2.0000   +10.0547i
   2:    -4.8284    +2.0000i
   3:    -2.0000    -2.9932i
   4:     2.0000    +2.0000i
   5:     2.0000    -2.9932i
   6:     4.8284    +2.0000i
   7:    -2.0000   +10.0547i
   8:     2.0000    +0.0000i
   9:     2.0000    -0.3978i
  10:     0.8284    +2.0000i
  11:    -2.0000    +1.3364i
  12:     2.0000    -2.0000i
  13:     2.0000    +1.3364i
  14:    -0.8284    +2.0000i
  15:    -2.0000    -0.3978i

Simple DFT:
   0:    34.0000    +0.0000i
   1:    -2.0000   +10.0547i
   2:    -2.0000    +4.8284i
   3:    -2.0000    +2.9932i
   4:     2.0000    +2.0000i
   5:    -2.0000    +1.3364i
   6:    -2.0000    +0.8284i
   7:    -2.0000    +0.3978i
   8:     2.0000    +0.0000i
   9:    -2.0000    -0.3978i
  10:    -2.0000    -0.8284i
  11:    -2.0000    -1.3364i
  12:     2.0000    -2.0000i
  13:    -2.0000    -2.9932i
  14:    -2.0000    -4.8284i
  15:    -2.0000   -10.0547i

over 9 years ago

0 Raja over 9 years ago

TI__Guru* 81335 points

Kikmelon,

Welcome to the TI E2E forum. I hope you will find many good answers here and in the TI.com documents and in the TI Wiki Pages (for processor issues). Be sure to search those for helpful information and to browse for the questions others may have asked on similar topics (e2e.ti.com).

Note: We strongly recommend you to create new e2e thread for your queries instead of following up on an old/closed e2e thread, new threads gets more attention than old threads and can provide link of old threads or information on the new post for clarity and faster response.

Thank you.

0 ran35366 over 9 years ago in reply to Raja

TI__Genius 12805 points

You are right, there is an issue with the routine.
First I noticed that the results that TI's routine gives are similar to the correct results with location difference and sometimes with negative sign.
Then I run the two functions on input data where x_sp[0] = 1.0 and all other values are 0. I got the following results:

Input x_sp[0] = 1.0 all other values are 0

1.0000 +0.0000i
1: 0.0000 -1.0000i
2: 0.0000 -1.0000i
3: 0.0000 -1.0000i
4: 1.0000 +0.0000i
5: 0.0000 -1.0000i
6: 0.0000 -1.0000i
7: 0.0000 -1.0000i
8: 1.0000 +0.0000i
9: 0.0000 -1.0000i
10: 0.0000 -1.0000i
11: 0.0000 -1.0000i
12: 1.0000 +0.0000i
13: 0.0000 -1.0000i
14: 0.0000 -1.0000i
15: 0.0000 -1.0000i

Simple DFT:
0: 1.0000 +0.0000i
1: 1.0000 +0.0000i
2: 1.0000 +0.0000i
3: 1.0000 +0.0000i
4: 1.0000 +0.0000i
5: 1.0000 +0.0000i
6: 1.0000 +0.0000i
7: 1.0000 +0.0000i
8: 1.0000 +0.0000i
9: 1.0000 +0.0000i
10: 1.0000 +0.0000i
11: 1.0000 +0.0000i
12: 1.0000 +0.0000i
13: 1.0000 +0.0000i
14: 1.0000 +0.0000i
15: 1.0000 +0.0000i
Last I run the two functions with x_sp[1] = 1.0 and all other values zero. I got teh following:

x-sp[1] = 1.0 all other values are 0000
0: 0.0000 +1.0000i
1: 1.0000 +0.0000i
2: 1.0000 +0.0000i
3: 1.0000 +0.0000i
4: 0.0000 +1.0000i
5: 1.0000 +0.0000i
6: 1.0000 +0.0000i
7: 1.0000 +0.0000i
8: 0.0000 +1.0000i
9: 1.0000 +0.0000i
10: 1.0000 +0.0000i
11: 1.0000 +0.0000i
12: 0.0000 +1.0000i
13: 1.0000 +0.0000i
14: 1.0000 +0.0000i
15: 1.0000 +0.0000i

Simple DFT:
0: 0.0000 +1.0000i
1: 0.0000 +1.0000i
2: 0.0000 +1.0000i
3: 0.0000 +1.0000i
4: 0.0000 +1.0000i
5: 0.0000 +1.0000i
6: 0.0000 +1.0000i
7: 0.0000 +1.0000i
8: 0.0000 +1.0000i
9: 0.0000 +1.0000i
10: 0.0000 +1.0000i
11: 0.0000 +1.0000i
12: 0.0000 +1.0000i
13: 0.0000 +1.0000i
14: 0.0000 +1.0000i
15: 0.0000 +1.0000i

I will investigate some time to understand what is the issue, and then submit a request to fix the library

Ran

0 ran35366 over 9 years ago in reply to ran35366

TI__Genius 12805 points

OK

It turns out that the twiddle factor generation is wrong. (regardless the reason, it was optimized for a different version of the code, but this is irrelevant)

Please use the following twiddle function

void tw_gen (float *w, int n)
{
int i, j, k;
const double PI = 3.141592654;

for (j = 1, k = 0; j <= n >> 2; j = j << 2)
{
for (i = 0; i < n >> 2; i += j)
{
#ifdef _LITTLE_ENDIAN
w[k] = (float) sin (2 * PI * i / n);
w[k + 1] = (float) cos (2 * PI * i / n);
w[k + 2] = (float) sin (4 * PI * i / n);
w[k + 3] = (float) cos (4 * PI * i / n);
w[k + 4] = (float) sin (6 * PI * i / n);
w[k + 5] = (float) cos (6 * PI * i / n);
#else
w[k] = (float) cos (2 * PI * i / n);
w[k + 1] = (float) -sin (2 * PI * i / n);
w[k + 2] = (float) cos (4 * PI * i / n);
w[k + 3] = (float) -sin (4 * PI * i / n);
w[k + 4] = (float) cos (6 * PI * i / n);
w[k + 5] = (float) -sin (6 * PI * i / n);
#endif
k += 6;
}
}
}

0 Kikmelon over 9 years ago in reply to ran35366

Prodigy 40 points

Cool. I copied this new twiddle function to my test code, and my FFT output looks much happier.

0 Kikmelon over 9 years ago in reply to Kikmelon

Prodigy 40 points

I have a follow-up question. My FFT output looks good when calling DSPF_sp_fftSPxSP, but when I call DSPF_sp_fftSPxSP_r2c, again, I am getting unexpected results.

Is there a different twiddle function I should be using for the case of strictly-real input?

0 ran35366 over 9 years ago in reply to Kikmelon

TI__Genius 12805 points

As you of course know, the way to do real fft faster is to present the N points real FFT as N/2 points complex FFT, and then do an additional step. You can see it in dsp-book.narod.ru/.../0270_PDF_C14.pdf
The extra step requires an additional set of twiddle factors. You can look at the test code (DSPF_sp_fftSPxSP_r2c_d.c ) and see how the twiddle factors are generated.

If you did all of this already, and the C model gives you wrong results as well as the optimized code, post it again and I will build my test program and try to understand what is wrong. But please report first your observations with the twiddle factor generation and trying the C model as well

Ran

0 Kikmelon over 9 years ago in reply to ran35366

Prodigy 40 points

Ah, thanks. I was able to take that info and get it working.

I was not able to find this kind of info in this page of the documentation:

file:///C:/ti/dsplib_c66x_3_4_0_0/docs/doxygen/html/dsplib_html/group___d_s_p_f__sp__fft_s_p_x_s_p__r2c.html

or in the examples found in:

C:\ti\dsplib_c66x_3_4_0_0\examples

Is this where I would find the best and official documentation? Or should I be looking elsewhere?

0 ran35366 over 9 years ago in reply to Kikmelon

TI__Genius 12805 points

Thanks Kikmelon, I am happy that you made it working

I tell you what I usually use for documentations.

First I look at the docgen directory that is part of the release to see what are the functions that are available, what is the API and what is the definitions of structures if structures are needed.

To understand how the function is working, what are the requirements and limitations of the parameters like the way to generate twiddle factors I look at the unit test that comes with the release and I look at two functions. The C model - the file with extension _cn.c (cn stands for c natural), it is pure C code without intrinsic and usually present the algorithm. The main purpose of the C model is to compare the results of the optimized code. Most of the optimized functions are written in C with intrinsic. I prefer C with intrinsic over assembly. But I like to understand the algorithm (I am sure you like to), for example to see if TI implemented DIF or DIT FFT.

The second file that I look at is the unit test code itself. The file with extension _d.c it shows me how to use the optimized code. It tells me how to build the correct input (should it be in bit-reverse order? how to build the twiddle factor?)

To make the "process" that I describe above easier, we have some tentative plans to put together an app-note to address some of the common DSPLib usage questions. We will add a link to this thread once it is available.

Best regards

Ran

Processors

Processors forum

DSPF_sp_fftSPxSP producing wrong results