FFT function in DSPLIB C674x returns wrong result

Diego10504

Other Parts Discussed in Thread: OMAP-L137

Hi all,
I'm using OMAP-L137 evm board.
Starting from example /src/DSPF_sp_fftSPxSP/DSPF_sp_fftSPxSP.pjt distributed with DSPLIB C674x v. 1.1,
I created a new project (attached) for testing purpose, but I don't understand why function returns a wrong fft result.

Project uses a reference function to compute twiddle array, the same included in the demostration app as suggested by http://wiki.davincidsp.com/index.php/C674x_DSPLIB#DSPF_sp_fftSPxSP_.28Mixed_Radix_Forward_FFT_with_Bit_Reversal.29

Any idea or suggestion?

PS: Project includes library and header file.

fft.zip

over 16 years ago

0 Mariana over 16 years ago

TI__Mastermind 24340 points

Hi Diego,

I'm not a fft expert, but here is my suggestion - try using the C equivalent (DSPF_sp_fftSPxSP_cn) instead and see the results. If it is a problem with your inputs, you can step trough the C equivalent code and see what could be wrong.

0 Joe Coombs over 16 years ago

TI__Expert 8890 points

I have run this data through the natural C and legacy ASM functions (both included in the demo project), and they all return the same result. Without performing alternate calculations (i.e. running an FFT in MATLAB or similar), I won't comment on the accuracy of the result. Does the result you get from the DSPLIB disagree with some other FFT algorithm?

0 Diego10504 over 16 years ago in reply to Joe Coombs

Prodigy 160 points

I try the C fft algorithm in Visual C++ platform to verify its correctness and it works:

It returns the same SCILAB fft result.

Down here I've pasted a report of various tests ( %i = immaginary unit )

*****************************************************************************
VISUAL C++
*****************************************************************************

--INPUT---

8.5000000000000000 + (0.0000000000000000) * %i
-0.5000000000000000 + (-2.5136697292327881) * %i
-0.5000000000000000 + (-1.2071068286895752) * %i
-0.5000000000000000 + (-0.7483028769493103) * %i
-0.5000000000000000 + (-0.5000000000000000) * %i
-0.5000000000000000 + (-0.3340893089771271) * %i
-0.5000000000000000 + (-0.2071067839860916) * %i
-0.5000000000000000 + (-0.0994561836123466) * %i
-0.5000000000000000 + (0.0000000000000000) * %i
-0.5000000000000000 + (0.0994561836123466) * %i
-0.5000000000000000 + (0.2071067839860916) * %i
-0.5000000000000000 + (0.3340893089771271) * %i
-0.5000000000000000 + (0.5000000000000000) * %i
-0.5000000000000000 + (0.7483028769493103) * %i
-0.5000000000000000 + (1.2071068286895752) * %i
-0.5000000000000000 + (2.5136697292327881) * %i

--OUTPUT (C function)---

1.0000000000000000 + (0.0000000000000000) * %i
2.0000000000000000 + (0.0000002384185791) * %i
3.0000000000000000 + (0.0000003258413699) * %i
4.0000004768371582 + (0.0000000000000000) * %i
5.0000000000000000 + (0.0000000000000000) * %i
6.0000000000000000 + (0.0000001192092896) * %i
7.0000000000000000 + (0.0000000317865130) * %i
8.0000000000000000 + (0.0000000000000000) * %i
9.0000000000000000 + (0.0000000000000000) * %i
10.0000000000000000 + (-0.0000002384185791) * %i
11.0000000000000000 + (-0.0000001509958025) * %i
12.0000000000000000 + (0.0000000000000000) * %i
13.0000000000000000 + (0.0000000000000000) * %i
14.0000000000000000 + (-0.0000001192092896) * %i
15.0000000000000000 + (-0.0000002066320661) * %i
16.0000000000000000 + (0.0000000000000000) * %i

done

*****************************************************************************
SCILAB
*****************************************************************************

x =

    8.5
- 0.5 - 2.5136697i
- 0.5 - 1.2071068i
- 0.5 - 0.7483029i
- 0.5 - 0.5i
- 0.5 - 0.3340893i
- 0.5 - 0.2071068i
- 0.5 - 0.0994562i
- 0.5
- 0.5 + 0.0994562i
- 0.5 + 0.2071068i
- 0.5 + 0.3340893i
- 0.5 + 0.5i
- 0.5 + 0.7483029i
- 0.5 + 1.2071068i
- 0.5 + 2.5136697i

-->fft(x)
ans =

    1.
    2.
    2.9999999
    3.9999999
    5.
    6.0000001 - 2.220D-16i
    7.0000001 - 2.220D-16i
    8.0000001
    9.
    9.9999999
    11.
    12.
    13.
    14. + 2.220D-16i
    15. + 2.220D-16i
    16.

*****************************************************************************
OMAP-L137
*****************************************************************************

--INPUT---

--OUTPUT (asm function)---

1.0000000000000000 + (0.0000000000000000) * %i
1.7807149887084961 + (-0.3204739093780518) * %i
3.0000000000000000 + (0.0000003258413699) * %i
3.9001204967498779 + (0.0845487117767334) * %i
5.0000000000000000 + (0.0000000000000000) * %i
5.9226493835449219 + (0.1669408082962036) * %i
6.5571165084838867 + (-0.0000000874227766) * %i
8.0845489501953125 + (-0.1215617656707764) * %i
9.0000000000000000 + (0.0000000000000000) * %i
9.9978437423706055 + (0.0990324020385742) * %i
11.0000000000000000 + (-0.0000001509958025) * %i
11.8784379959106445 + (0.1368930339813232) * %i
13.0000000000000000 + (0.0000000000000000) * %i
14.2987918853759766 + (0.0545006990432739) * %i
15.4428834915161133 + (-0.0000000874227766) * %i
16.1368923187255859 + (-0.0998799800872803) * %i

--OUTPUT (C function)---

--OUTPUT (asm legacy function)---

done

As you can see, DSP error affects first decimal digit.

I chose this particular input vector because of its discrete fourier trasform is simple to visualize and verify.

I first noted this problem of the DSP code execution for a real sequence (imaginary part to zero): fft of a real sequence must have simmetric real part and anti-simmetric imaginary part, except for the first bin (DC value).

Debugging step by step, I found the origin of value mismatch: when tw_gen() function calls cos(theta), for some angles it returns wrong value.

For example, I try this code on a new project

int main()
{
int i;
double theta;
const double PI = 3.141592654;
double c0, s0;
float c[N], s[N];

for (i=0; i<N; i++)
{
  theta = 2 * PI * i / N;
  c0 = cos (theta);
  s0 = sin (theta);
  c[i] = c0;
  s[i] = s0;
  printf("\ncos(%f) = %f; sin(%f) = %f;", theta, c[i], theta, s[i]);
}

return 0;
}

output:

cos(0.000000) = 1.000000; sin(0.000000) = 0.000000;
cos(0.196350) = 1.000000; sin(0.196350) = 0.195090;
cos(0.392699) = 1.000000; sin(0.392699) = 0.382683;
cos(0.589049) = 1.000000; sin(0.589049) = 0.555570;
cos(0.785398) = 1.000000; sin(0.785398) = 0.707107;
cos(0.981748) = 1.000000; sin(0.981748) = 0.831470;
cos(1.178097) = 1.000000; sin(1.178097) = 0.923880;
cos(1.374447) = 1.000000; sin(1.374447) = 0.980785;
cos(1.570796) = -1.000000; sin(1.570796) = 1.000000;
cos(1.767146) = -1.000000; sin(1.767146) = 0.980785;
cos(1.963495) = -1.000000; sin(1.963495) = 0.923880;
cos(2.159845) = -1.000000; sin(2.159845) = 0.831470;
cos(2.356194) = -1.000000; sin(2.356194) = 0.707107;
cos(2.552544) = -1.000000; sin(2.552544) = 0.555570;
cos(2.748894) = -1.000000; sin(2.748894) = 0.382683;
cos(2.945243) = -1.000000; sin(2.945243) = 0.195090;
cos(3.141593) = -1.000000; sin(3.141593) = -0.000000;
cos(3.337942) = -0.980785; sin(3.337942) = -0.195090;
cos(3.534292) = -0.923880; sin(3.534292) = -0.382683;
cos(3.730641) = -0.831470; sin(3.730641) = -0.555570;
cos(3.926991) = -0.707107; sin(3.926991) = -0.707107;
cos(4.123340) = -0.555570; sin(4.123340) = -0.831470;
cos(4.319690) = -0.382683; sin(4.319690) = -0.923880;
cos(4.516039) = -0.195090; sin(4.516039) = -0.980785;
cos(4.712389) = 0.000000; sin(4.712389) = -1.000000;
cos(4.908739) = 0.195090; sin(4.908739) = -0.980785;
cos(5.105088) = 0.382683; sin(5.105088) = -0.923880;
cos(5.301438) = 0.555570; sin(5.301438) = -0.831470;
cos(5.497787) = 0.707107; sin(5.497787) = -0.707107;
cos(5.694137) = 0.831470; sin(5.694137) = -0.555570;
cos(5.890486) = 0.923880; sin(5.890486) = -0.382683;
cos(6.086836) = 0.980785; sin(6.086836) = -0.195090;

for 0 < theta < PI, function cos(theta) returns only two values: +/-1

Is the bug only on my system or anyone has got it too?

0 Diego10504 over 16 years ago in reply to Diego10504

Prodigy 160 points

Replacing cos(theta) with sin(PI / 2 - theta) all works fine.

0 Joe Coombs over 16 years ago in reply to Diego10504

TI__Expert 8890 points

Interesting. My results (using your code as provided) are a pretty good match for your Visual C++ and SCILAB results:

--OUTPUT---

1.0000000000000000 + (0.0000000000000000) * %i
2.0000000000000000 + (0.0000001192092896) * %i
3.0000000000000000 + (0.0000000874227766) * %i
4.0000000000000000 + (0.0000000000000000) * %i
5.0000000000000000 + (0.0000000000000000) * %i
6.0000004768371582 + (0.0000000000000000) * %i
7.0000000000000000 + (-0.0000000874227766) * %i
8.0000000000000000 + (0.0000000000000000) * %i
9.0000000000000000 + (0.0000000000000000) * %i
10.0000000000000000 + (-0.0000001192092896) * %i
11.0000000000000000 + (0.0000000874227766) * %i
12.0000000000000000 + (0.0000000000000000) * %i
13.0000000000000000 + (0.0000000000000000) * %i
14.0000000000000000 + (0.0000000000000000) * %i
15.0000000000000000 + (-0.0000000874227766) * %i
16.0000000000000000 + (0.0000000000000000) * %i

I wonder if we could be using different versions of the RTS libraries. What version are your codegen tools? (To check this in CCS, go to Help->About.) I am using version 6.1.10.

0 Diego10504 over 16 years ago in reply to Joe Coombs

Prodigy 160 points

Hi Joe,

my Code Generation Tools version is 6.1.5, how can I update it to the latest version?

0 TimHarron over 16 years ago in reply to Diego10504

TI__Mastermind 18130 points

Visit here: http://tiexpressdsp.com/index.php/Compiler_Releases

0 Joe Coombs over 16 years ago in reply to Diego10504

TI__Expert 8890 points

Diego,

The codegen tools are available on the CCS Update Advisor website. Try this URL:

https://www-a.ti.com/downloads/sds_support/CodeGenerationTools.htm

If that doesn't work, try going through Help->Update Advisor->Check for Updates in CCS.

You should close CCS before installing the codegen tools. Then, go to the CCS component manager (accessed through the Help->About screen, bizarrely) and select the new codegen tools under Build Tools->TMS32067XX. You'll probably need to restart CCS again after that.

0 Mariana over 16 years ago in reply to Joe Coombs

TI__Mastermind 24340 points

In Component manager, you need to select the CGT you downloaded for both TMS320C64XX and TMS320C67XX for C674x.

0 Diego10504 over 16 years ago in reply to Mariana

Prodigy 160 points

Thank you all,
with new version of Code Generation Tools cos() function works fine.

0 TimHarron over 16 years ago in reply to Diego10504

TI__Mastermind 18130 points

Hi Diego,

Glad to hear it. I can confirm that this was a known issue fixed in CGT v6.1.8, but I am having trouble finding the specific bug number for your reference. If I come across it I will edit this post as an FYI.

0 André over 12 years ago in reply to Mariana

Prodigy 20 points

Hy.

Can anyone send me the "DSPF_sp_fftSPxSP_cn.c" used in this post?

I need the "DSPLIB C674x v. 1.1" because i'm still using CCS 3.3.

Thanks! André Águas.

0 Rahul Prabhu over 12 years ago in reply to André

TI__Guru** 116170 points

Please find the C files in the attachment.

Fullscreen 6177.DSPF_sp_fftSPxSP_cn.c Download

/* ======================================================================= */
/* DSPF_sp_fftSPxSP_cn.c -- Forward FFT with Mixed Radix                   */
/*                 Natural C Implementation                                */
/*                                                                         */
/* Rev 0.0.3                                                               */
/*                                                                         */
/* Copyright (C) 2011 Texas Instruments Incorporated - http://www.ti.com/  */ 
/*                                                                         */
/*                                                                         */
/*  Redistribution and use in source and binary forms, with or without     */
/*  modification, are permitted provided that the following conditions     */
/*  are met:                                                               */
/*                                                                         */
/*    Redistributions of source code must retain the above copyright       */
/*    notice, this list of conditions and the following disclaimer.        */
/*                                                                         */
/*    Redistributions in binary form must reproduce the above copyright    */
/*    notice, this list of conditions and the following disclaimer in the  */
/*    documentation and/or other materials provided with the               */
/*    distribution.                                                        */
/*                                                                         */
/*    Neither the name of Texas Instruments Incorporated nor the names of  */
/*    its contributors may be used to endorse or promote products derived  */
/*    from this software without specific prior written permission.        */
/*                                                                         */
/*  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS    */
/*  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT      */
/*  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR  */
/*  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT   */
/*  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,  */
/*  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT       */
/*  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,  */
/*  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY  */
/*  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT    */
/*  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE  */
/*  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.   */
/*                                                                         */
/* ======================================================================= */

#pragma CODE_SECTION(DSPF_sp_fftSPxSP_cn, ".text:natural");

#include "DSPF_sp_fftSPxSP_cn.h"

#ifndef _TMS320C6x

unsigned int _bitr(unsigned int a) 
{
    unsigned int i, b=0;

    for(i=0; i<32; i++)
    {
        b |=  (((a >> i) & 0x1) << (31 -i));
    }

    return(b);
}

#endif

void DSPF_sp_fftSPxSP_cn (int N, float *ptr_x, float *ptr_w, float *ptr_y,
    unsigned char *brev, int n_min, int offset, int n_max)
{

    int i, j, k, l1, l2, h2, predj;
    int tw_offset, stride, fft_jmp;

    float x0, x1, x2, x3, x4, x5, x6, x7;
    float xt0, yt0, xt1, yt1, xt2, yt2, yt3;
    float yt4, yt5, yt6, yt7;
    float si1, si2, si3, co1, co2, co3;
    float xh0, xh1, xh20, xh21, xl0, xl1, xl20, xl21;
    float x_0, x_1, x_l1, x_l1p1, x_h2, x_h2p1, x_l2, x_l2p1;
    float xl0_0, xl1_0, xl0_1, xl1_1;
    float xh0_0, xh1_0, xh0_1, xh1_1;
    float *x, *w;
    int l0, radix;
    float *y0, *ptr_x0, *ptr_x2;

    radix = n_min;

    stride = N;                                                /* n is the number of complex samples */
    tw_offset = 0;
    while (stride > radix)
    {
        j = 0;
        fft_jmp = stride + (stride >> 1);
        h2 = stride >> 1;
        l1 = stride;
        l2 = stride + (stride >> 1);
        x = ptr_x;
        w = ptr_w + tw_offset;

        for (i = 0; i < N; i += 4)
        {
            co1 = w[j];
            si1 = w[j + 1];
            co2 = w[j + 2];
            si2 = w[j + 3];
            co3 = w[j + 4];
            si3 = w[j + 5];

            x_0 = x[0];
            x_1 = x[1];
            x_h2 = x[h2];
            x_h2p1 = x[h2 + 1];
            x_l1 = x[l1];
            x_l1p1 = x[l1 + 1];
            x_l2 = x[l2];
            x_l2p1 = x[l2 + 1];

            xh0 = x_0 + x_l1;
            xh1 = x_1 + x_l1p1;
            xl0 = x_0 - x_l1;
            xl1 = x_1 - x_l1p1;

            xh20 = x_h2 + x_l2;
            xh21 = x_h2p1 + x_l2p1;
            xl20 = x_h2 - x_l2;
            xl21 = x_h2p1 - x_l2p1;

            ptr_x0 = x;
            ptr_x0[0] = xh0 + xh20;
            ptr_x0[1] = xh1 + xh21;

            ptr_x2 = ptr_x0;
            x += 2;
            j += 6;
            predj = (j - fft_jmp);
            if (!predj)
                x += fft_jmp;
            if (!predj)
                j = 0;

            xt0 = xh0 - xh20;
            yt0 = xh1 - xh21;
            xt1 = xl0 + xl21;
            yt2 = xl1 + xl20;
            xt2 = xl0 - xl21;
            yt1 = xl1 - xl20;

            ptr_x2[l1] = xt1 * co1 + yt1 * si1;
            ptr_x2[l1 + 1] = yt1 * co1 - xt1 * si1;
            ptr_x2[h2] = xt0 * co2 + yt0 * si2;
            ptr_x2[h2 + 1] = yt0 * co2 - xt0 * si2;
            ptr_x2[l2] = xt2 * co3 + yt2 * si3;
            ptr_x2[l2 + 1] = yt2 * co3 - xt2 * si3;
        }
        tw_offset += fft_jmp;
        stride = stride >> 2;
    }                                                        /* end while */

    j = offset >> 2;

    ptr_x0 = ptr_x;
    y0 = ptr_y;

    /* l0 = _norm(n_max) +3; get size of fft */

    l0 = 0;
    for (k = 30; k >= 0; k--)
        if ((n_max & (1 << k)) == 0)
            l0++;
        else
            break;
    l0 = l0 + 3;
    if (radix <= 4)
        for (i = 0; i < N; i += 4)
        {
            /* reversal computation */
            k = _bitr(j) >> l0;
            j++;                                            /* multiple of 4 index */

            x0 = ptr_x0[0];
            x1 = ptr_x0[1];
            x2 = ptr_x0[2];
            x3 = ptr_x0[3];
            x4 = ptr_x0[4];
            x5 = ptr_x0[5];
            x6 = ptr_x0[6];
            x7 = ptr_x0[7];
            ptr_x0 += 8;

            xh0_0 = x0 + x4;
            xh1_0 = x1 + x5;
            xh0_1 = x2 + x6;
            xh1_1 = x3 + x7;

            if (radix == 2)
            {
                xh0_0 = x0;
                xh1_0 = x1;
                xh0_1 = x2;
                xh1_1 = x3;
            }

            yt0 = xh0_0 + xh0_1;
            yt1 = xh1_0 + xh1_1;
            yt4 = xh0_0 - xh0_1;
            yt5 = xh1_0 - xh1_1;

            xl0_0 = x0 - x4;
            xl1_0 = x1 - x5;
            xl0_1 = x2 - x6;
            xl1_1 = x3 - x7;

            if (radix == 2)
            {
                xl0_0 = x4;
                xl1_0 = x5;
                xl1_1 = x6;
                xl0_1 = x7;
            }

            yt2 = xl0_0 + xl1_1;
            yt3 = xl1_0 - xl0_1;
            yt6 = xl0_0 - xl1_1;
            yt7 = xl1_0 + xl0_1;

            if (radix == 2)
            {
                yt7 = xl1_0 - xl0_1;
                yt3 = xl1_0 + xl0_1;
            }

            y0[k] = yt0;
            y0[k + 1] = yt1;
            k += n_max >> 1;
            y0[k] = yt2;
            y0[k + 1] = yt3;
            k += n_max >> 1;
            y0[k] = yt4;
            y0[k + 1] = yt5;
            k += n_max >> 1;
            y0[k] = yt6;
            y0[k + 1] = yt7;
        }
}

/* ======================================================================= */
/*  End of file:  DSPF_sp_fftSPxSP_cn.c                                    */
/* ----------------------------------------------------------------------- */
/*            Copyright (c) 2011 Texas Instruments, Incorporated.          */
/*                           All Rights Reserved.                          */
/* ======================================================================= */

Processors

Processors forum

FFT function in DSPLIB C674x returns wrong result