This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

FFT function in DSPLIB C674x returns wrong result

Other Parts Discussed in Thread: OMAP-L137

 

 

Hi all,
I'm using OMAP-L137 evm board.
Starting from example /src/DSPF_sp_fftSPxSP/DSPF_sp_fftSPxSP.pjt distributed with DSPLIB C674x v. 1.1,
I created a new project (attached) for testing purpose, but I don't understand why function returns a wrong fft result.

Project uses a reference function to compute twiddle array, the same included in the demostration app as suggested by http://wiki.davincidsp.com/index.php/C674x_DSPLIB#DSPF_sp_fftSPxSP_.28Mixed_Radix_Forward_FFT_with_Bit_Reversal.29

Any idea or suggestion?

PS: Project includes library and header file.

fft.zip
  • Hi Diego,

    I'm not a fft expert, but here is my suggestion - try using the C equivalent (DSPF_sp_fftSPxSP_cn) instead and see the results. If it is a problem with your inputs, you can step trough the C equivalent code and see what could be wrong.

  • I have run this data through the natural C and legacy ASM functions (both included in the demo project), and they all return the same result.  Without performing alternate calculations (i.e. running an FFT in MATLAB or similar), I won't comment on the accuracy of the result.  Does the result you get from the DSPLIB disagree with some other FFT algorithm?

  • I try the C fft algorithm in Visual C++ platform to verify its correctness and it works:

    It returns the same SCILAB fft result.

    Down here I've pasted a report of various tests ( %i = immaginary unit )

    *****************************************************************************
     VISUAL C++
    *****************************************************************************

    --INPUT---

    8.5000000000000000 + (0.0000000000000000) * %i
    -0.5000000000000000 + (-2.5136697292327881) * %i
    -0.5000000000000000 + (-1.2071068286895752) * %i
    -0.5000000000000000 + (-0.7483028769493103) * %i
    -0.5000000000000000 + (-0.5000000000000000) * %i
    -0.5000000000000000 + (-0.3340893089771271) * %i
    -0.5000000000000000 + (-0.2071067839860916) * %i
    -0.5000000000000000 + (-0.0994561836123466) * %i
    -0.5000000000000000 + (0.0000000000000000) * %i
    -0.5000000000000000 + (0.0994561836123466) * %i
    -0.5000000000000000 + (0.2071067839860916) * %i
    -0.5000000000000000 + (0.3340893089771271) * %i
    -0.5000000000000000 + (0.5000000000000000) * %i
    -0.5000000000000000 + (0.7483028769493103) * %i
    -0.5000000000000000 + (1.2071068286895752) * %i
    -0.5000000000000000 + (2.5136697292327881) * %i

    --OUTPUT (C function)---

    1.0000000000000000 + (0.0000000000000000) * %i
    2.0000000000000000 + (0.0000002384185791) * %i
    3.0000000000000000 + (0.0000003258413699) * %i
    4.0000004768371582 + (0.0000000000000000) * %i
    5.0000000000000000 + (0.0000000000000000) * %i
    6.0000000000000000 + (0.0000001192092896) * %i
    7.0000000000000000 + (0.0000000317865130) * %i
    8.0000000000000000 + (0.0000000000000000) * %i
    9.0000000000000000 + (0.0000000000000000) * %i
    10.0000000000000000 + (-0.0000002384185791) * %i
    11.0000000000000000 + (-0.0000001509958025) * %i
    12.0000000000000000 + (0.0000000000000000) * %i
    13.0000000000000000 + (0.0000000000000000) * %i
    14.0000000000000000 + (-0.0000001192092896) * %i
    15.0000000000000000 + (-0.0000002066320661) * %i
    16.0000000000000000 + (0.0000000000000000) * %i

    done


    *****************************************************************************
     SCILAB
    *****************************************************************************

     x  =
     
        8.5              
      - 0.5 - 2.5136697i 
      - 0.5 - 1.2071068i 
      - 0.5 - 0.7483029i 
      - 0.5 - 0.5i       
      - 0.5 - 0.3340893i 
      - 0.5 - 0.2071068i 
      - 0.5 - 0.0994562i 
      - 0.5              
      - 0.5 + 0.0994562i 
      - 0.5 + 0.2071068i 
      - 0.5 + 0.3340893i 
      - 0.5 + 0.5i       
      - 0.5 + 0.7483029i 
      - 0.5 + 1.2071068i 
      - 0.5 + 2.5136697i 
     
    -->fft(x)
     ans  =
     
        1.                     
        2.                     
        2.9999999              
        3.9999999              
        5.                     
        6.0000001 - 2.220D-16i 
        7.0000001 - 2.220D-16i 
        8.0000001              
        9.                     
        9.9999999              
        11.                    
        12.                    
        13.                    
        14. + 2.220D-16i       
        15. + 2.220D-16i       
        16.                    

    *****************************************************************************
     OMAP-L137
    *****************************************************************************

    --INPUT---

    8.5000000000000000 + (0.0000000000000000) * %i
    -0.5000000000000000 + (-2.5136697292327881) * %i
    -0.5000000000000000 + (-1.2071068286895752) * %i
    -0.5000000000000000 + (-0.7483028769493103) * %i
    -0.5000000000000000 + (-0.5000000000000000) * %i
    -0.5000000000000000 + (-0.3340893089771271) * %i
    -0.5000000000000000 + (-0.2071067839860916) * %i
    -0.5000000000000000 + (-0.0994561836123466) * %i
    -0.5000000000000000 + (0.0000000000000000) * %i
    -0.5000000000000000 + (0.0994561836123466) * %i
    -0.5000000000000000 + (0.2071067839860916) * %i
    -0.5000000000000000 + (0.3340893089771271) * %i
    -0.5000000000000000 + (0.5000000000000000) * %i
    -0.5000000000000000 + (0.7483028769493103) * %i
    -0.5000000000000000 + (1.2071068286895752) * %i
    -0.5000000000000000 + (2.5136697292327881) * %i

    --OUTPUT (asm function)---

    1.0000000000000000 + (0.0000000000000000) * %i
    1.7807149887084961 + (-0.3204739093780518) * %i
    3.0000000000000000 + (0.0000003258413699) * %i
    3.9001204967498779 + (0.0845487117767334) * %i
    5.0000000000000000 + (0.0000000000000000) * %i
    5.9226493835449219 + (0.1669408082962036) * %i
    6.5571165084838867 + (-0.0000000874227766) * %i
    8.0845489501953125 + (-0.1215617656707764) * %i
    9.0000000000000000 + (0.0000000000000000) * %i
    9.9978437423706055 + (0.0990324020385742) * %i
    11.0000000000000000 + (-0.0000001509958025) * %i
    11.8784379959106445 + (0.1368930339813232) * %i
    13.0000000000000000 + (0.0000000000000000) * %i
    14.2987918853759766 + (0.0545006990432739) * %i
    15.4428834915161133 + (-0.0000000874227766) * %i
    16.1368923187255859 + (-0.0998799800872803) * %i

    --OUTPUT (C function)---

    1.0000000000000000 + (0.0000000000000000) * %i
    1.7807149887084961 + (-0.3204739093780518) * %i
    3.0000000000000000 + (0.0000003258413699) * %i
    3.9001204967498779 + (0.0845487117767334) * %i
    5.0000000000000000 + (0.0000000000000000) * %i
    5.9226493835449219 + (0.1669408082962036) * %i
    6.5571165084838867 + (-0.0000000874227766) * %i
    8.0845489501953125 + (-0.1215617656707764) * %i
    9.0000000000000000 + (0.0000000000000000) * %i
    9.9978437423706055 + (0.0990324020385742) * %i
    11.0000000000000000 + (-0.0000001509958025) * %i
    11.8784379959106445 + (0.1368930339813232) * %i
    13.0000000000000000 + (0.0000000000000000) * %i
    14.2987918853759766 + (0.0545006990432739) * %i
    15.4428834915161133 + (-0.0000000874227766) * %i
    16.1368923187255859 + (-0.0998799800872803) * %i

    --OUTPUT (asm legacy function)---

    1.0000000000000000 + (0.0000000000000000) * %i
    1.7807149887084961 + (-0.3204739093780518) * %i
    3.0000000000000000 + (0.0000003258413699) * %i
    3.9001204967498779 + (0.0845487117767334) * %i
    5.0000000000000000 + (0.0000000000000000) * %i
    5.9226493835449219 + (0.1669408082962036) * %i
    6.5571165084838867 + (-0.0000000874227766) * %i
    8.0845489501953125 + (-0.1215617656707764) * %i
    9.0000000000000000 + (0.0000000000000000) * %i
    9.9978437423706055 + (0.0990324020385742) * %i
    11.0000000000000000 + (-0.0000001509958025) * %i
    11.8784379959106445 + (0.1368930339813232) * %i
    13.0000000000000000 + (0.0000000000000000) * %i
    14.2987918853759766 + (0.0545006990432739) * %i
    15.4428834915161133 + (-0.0000000874227766) * %i
    16.1368923187255859 + (-0.0998799800872803) * %i

    done

    As you can see, DSP error affects first decimal digit.

    I chose this particular input vector because of its discrete fourier trasform is simple to visualize and verify.

    I first noted this problem of the DSP code execution for a real sequence (imaginary part to zero): fft of a real sequence must have simmetric real part and anti-simmetric imaginary part, except for the first bin (DC value).


    Debugging step by step, I found the origin of value mismatch: when tw_gen() function calls cos(theta), for some angles it returns wrong value.

    For example, I try this code on a new project

    int main()
    {
     int i;
     double theta;
     const double PI = 3.141592654;
     double c0, s0;
     float c[N], s[N];

     for (i=0; i<N; i++)
     {
      theta = 2 * PI * i / N;
      c0 = cos (theta);
      s0 = sin (theta);
      c[i] = c0;
      s[i] = s0;
      printf("\ncos(%f) = %f;  sin(%f) = %f;", theta, c[i], theta, s[i]);
     }


     return 0;
    }

    output:

    cos(0.000000) = 1.000000;  sin(0.000000) = 0.000000;
    cos(0.196350) = 1.000000;  sin(0.196350) = 0.195090;
    cos(0.392699) = 1.000000;  sin(0.392699) = 0.382683;
    cos(0.589049) = 1.000000;  sin(0.589049) = 0.555570;
    cos(0.785398) = 1.000000;  sin(0.785398) = 0.707107;
    cos(0.981748) = 1.000000;  sin(0.981748) = 0.831470;
    cos(1.178097) = 1.000000;  sin(1.178097) = 0.923880;
    cos(1.374447) = 1.000000;  sin(1.374447) = 0.980785;
    cos(1.570796) = -1.000000;  sin(1.570796) = 1.000000;
    cos(1.767146) = -1.000000;  sin(1.767146) = 0.980785;
    cos(1.963495) = -1.000000;  sin(1.963495) = 0.923880;
    cos(2.159845) = -1.000000;  sin(2.159845) = 0.831470;
    cos(2.356194) = -1.000000;  sin(2.356194) = 0.707107;
    cos(2.552544) = -1.000000;  sin(2.552544) = 0.555570;
    cos(2.748894) = -1.000000;  sin(2.748894) = 0.382683;
    cos(2.945243) = -1.000000;  sin(2.945243) = 0.195090;
    cos(3.141593) = -1.000000;  sin(3.141593) = -0.000000;
    cos(3.337942) = -0.980785;  sin(3.337942) = -0.195090;
    cos(3.534292) = -0.923880;  sin(3.534292) = -0.382683;
    cos(3.730641) = -0.831470;  sin(3.730641) = -0.555570;
    cos(3.926991) = -0.707107;  sin(3.926991) = -0.707107;
    cos(4.123340) = -0.555570;  sin(4.123340) = -0.831470;
    cos(4.319690) = -0.382683;  sin(4.319690) = -0.923880;
    cos(4.516039) = -0.195090;  sin(4.516039) = -0.980785;
    cos(4.712389) = 0.000000;  sin(4.712389) = -1.000000;
    cos(4.908739) = 0.195090;  sin(4.908739) = -0.980785;
    cos(5.105088) = 0.382683;  sin(5.105088) = -0.923880;
    cos(5.301438) = 0.555570;  sin(5.301438) = -0.831470;
    cos(5.497787) = 0.707107;  sin(5.497787) = -0.707107;
    cos(5.694137) = 0.831470;  sin(5.694137) = -0.555570;
    cos(5.890486) = 0.923880;  sin(5.890486) = -0.382683;
    cos(6.086836) = 0.980785;  sin(6.086836) = -0.195090;

    for 0 < theta < PI, function cos(theta) returns only two values: +/-1

    Is the bug only on my system or anyone has got it too?

  • Replacing cos(theta) with sin(PI / 2 - theta) all works fine.

  • Interesting.  My results (using your code as provided) are a pretty good match for your Visual C++ and SCILAB results:

    --OUTPUT---

    1.0000000000000000 + (0.0000000000000000) * %i
    2.0000000000000000 + (0.0000001192092896) * %i
    3.0000000000000000 + (0.0000000874227766) * %i
    4.0000000000000000 + (0.0000000000000000) * %i
    5.0000000000000000 + (0.0000000000000000) * %i
    6.0000004768371582 + (0.0000000000000000) * %i
    7.0000000000000000 + (-0.0000000874227766) * %i
    8.0000000000000000 + (0.0000000000000000) * %i
    9.0000000000000000 + (0.0000000000000000) * %i
    10.0000000000000000 + (-0.0000001192092896) * %i
    11.0000000000000000 + (0.0000000874227766) * %i
    12.0000000000000000 + (0.0000000000000000) * %i
    13.0000000000000000 + (0.0000000000000000) * %i
    14.0000000000000000 + (0.0000000000000000) * %i
    15.0000000000000000 + (-0.0000000874227766) * %i
    16.0000000000000000 + (0.0000000000000000) * %i

    I wonder if we could be using different versions of the RTS libraries.  What version are your codegen tools?  (To check this in CCS, go to Help->About.)  I am using version 6.1.10.

  • Hi Joe,

    my Code Generation Tools version is 6.1.5, how can I update it to the latest version?

  • Diego,

    The codegen tools are available on the CCS Update Advisor website.  Try this URL:

    https://www-a.ti.com/downloads/sds_support/CodeGenerationTools.htm

    If that doesn't work, try going through Help->Update Advisor->Check for Updates in CCS.

    You should close CCS before installing the codegen tools.  Then, go to the CCS component manager (accessed through the Help->About screen, bizarrely) and select the new codegen tools under Build Tools->TMS32067XX.  You'll probably need to restart CCS again after that.

  • In Component manager, you need to select the CGT you downloaded for both TMS320C64XX and TMS320C67XX for C674x.

  • Thank you all,
       with new version of Code Generation Tools cos() function works fine.

     

  • Hi Diego,

    Glad to hear it. I can confirm that this was a known issue fixed in CGT v6.1.8, but I am having trouble finding the specific bug number for your reference. If I come across it I will edit this post as an FYI.

  • Hy.

    Can anyone send me the "DSPF_sp_fftSPxSP_cn.c" used in this post?

    I need the "DSPLIB C674x v. 1.1" because i'm still using CCS 3.3.


    Thanks! André Águas.

  • Please find the C files in the attachment.

    /* ======================================================================= */
    /* DSPF_sp_fftSPxSP_cn.c -- Forward FFT with Mixed Radix                   */
    /*                 Natural C Implementation                                */
    /*                                                                         */
    /* Rev 0.0.3                                                               */
    /*                                                                         */
    /* Copyright (C) 2011 Texas Instruments Incorporated - http://www.ti.com/  */ 
    /*                                                                         */
    /*                                                                         */
    /*  Redistribution and use in source and binary forms, with or without     */
    /*  modification, are permitted provided that the following conditions     */
    /*  are met:                                                               */
    /*                                                                         */
    /*    Redistributions of source code must retain the above copyright       */
    /*    notice, this list of conditions and the following disclaimer.        */
    /*                                                                         */
    /*    Redistributions in binary form must reproduce the above copyright    */
    /*    notice, this list of conditions and the following disclaimer in the  */
    /*    documentation and/or other materials provided with the               */
    /*    distribution.                                                        */
    /*                                                                         */
    /*    Neither the name of Texas Instruments Incorporated nor the names of  */
    /*    its contributors may be used to endorse or promote products derived  */
    /*    from this software without specific prior written permission.        */
    /*                                                                         */
    /*  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS    */
    /*  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT      */
    /*  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR  */
    /*  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT   */
    /*  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,  */
    /*  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT       */
    /*  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,  */
    /*  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY  */
    /*  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT    */
    /*  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE  */
    /*  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.   */
    /*                                                                         */
    /* ======================================================================= */
    
    #pragma CODE_SECTION(DSPF_sp_fftSPxSP_cn, ".text:natural");
    
    #include "DSPF_sp_fftSPxSP_cn.h"
    
    #ifndef _TMS320C6x
    
    unsigned int _bitr(unsigned int a) 
    {
        unsigned int i, b=0;
    
        for(i=0; i<32; i++)
        {
            b |=  (((a >> i) & 0x1) << (31 -i));
        }
    
        return(b);
    }
    
    #endif
    
    void DSPF_sp_fftSPxSP_cn (int N, float *ptr_x, float *ptr_w, float *ptr_y,
        unsigned char *brev, int n_min, int offset, int n_max)
    {
    
        int i, j, k, l1, l2, h2, predj;
        int tw_offset, stride, fft_jmp;
    
        float x0, x1, x2, x3, x4, x5, x6, x7;
        float xt0, yt0, xt1, yt1, xt2, yt2, yt3;
        float yt4, yt5, yt6, yt7;
        float si1, si2, si3, co1, co2, co3;
        float xh0, xh1, xh20, xh21, xl0, xl1, xl20, xl21;
        float x_0, x_1, x_l1, x_l1p1, x_h2, x_h2p1, x_l2, x_l2p1;
        float xl0_0, xl1_0, xl0_1, xl1_1;
        float xh0_0, xh1_0, xh0_1, xh1_1;
        float *x, *w;
        int l0, radix;
        float *y0, *ptr_x0, *ptr_x2;
    
        radix = n_min;
    
        stride = N;                                                /* n is the number of complex samples */
        tw_offset = 0;
        while (stride > radix)
        {
            j = 0;
            fft_jmp = stride + (stride >> 1);
            h2 = stride >> 1;
            l1 = stride;
            l2 = stride + (stride >> 1);
            x = ptr_x;
            w = ptr_w + tw_offset;
    
            for (i = 0; i < N; i += 4)
            {
                co1 = w[j];
                si1 = w[j + 1];
                co2 = w[j + 2];
                si2 = w[j + 3];
                co3 = w[j + 4];
                si3 = w[j + 5];
    
                x_0 = x[0];
                x_1 = x[1];
                x_h2 = x[h2];
                x_h2p1 = x[h2 + 1];
                x_l1 = x[l1];
                x_l1p1 = x[l1 + 1];
                x_l2 = x[l2];
                x_l2p1 = x[l2 + 1];
    
                xh0 = x_0 + x_l1;
                xh1 = x_1 + x_l1p1;
                xl0 = x_0 - x_l1;
                xl1 = x_1 - x_l1p1;
    
                xh20 = x_h2 + x_l2;
                xh21 = x_h2p1 + x_l2p1;
                xl20 = x_h2 - x_l2;
                xl21 = x_h2p1 - x_l2p1;
    
                ptr_x0 = x;
                ptr_x0[0] = xh0 + xh20;
                ptr_x0[1] = xh1 + xh21;
    
                ptr_x2 = ptr_x0;
                x += 2;
                j += 6;
                predj = (j - fft_jmp);
                if (!predj)
                    x += fft_jmp;
                if (!predj)
                    j = 0;
    
                xt0 = xh0 - xh20;
                yt0 = xh1 - xh21;
                xt1 = xl0 + xl21;
                yt2 = xl1 + xl20;
                xt2 = xl0 - xl21;
                yt1 = xl1 - xl20;
    
                ptr_x2[l1] = xt1 * co1 + yt1 * si1;
                ptr_x2[l1 + 1] = yt1 * co1 - xt1 * si1;
                ptr_x2[h2] = xt0 * co2 + yt0 * si2;
                ptr_x2[h2 + 1] = yt0 * co2 - xt0 * si2;
                ptr_x2[l2] = xt2 * co3 + yt2 * si3;
                ptr_x2[l2 + 1] = yt2 * co3 - xt2 * si3;
            }
            tw_offset += fft_jmp;
            stride = stride >> 2;
        }                                                        /* end while */
    
        j = offset >> 2;
    
        ptr_x0 = ptr_x;
        y0 = ptr_y;
    
        /* l0 = _norm(n_max) +3; get size of fft */
    
        l0 = 0;
        for (k = 30; k >= 0; k--)
            if ((n_max & (1 << k)) == 0)
                l0++;
            else
                break;
        l0 = l0 + 3;
        if (radix <= 4)
            for (i = 0; i < N; i += 4)
            {
                /* reversal computation */
                k = _bitr(j) >> l0;
                j++;                                            /* multiple of 4 index */
    
                x0 = ptr_x0[0];
                x1 = ptr_x0[1];
                x2 = ptr_x0[2];
                x3 = ptr_x0[3];
                x4 = ptr_x0[4];
                x5 = ptr_x0[5];
                x6 = ptr_x0[6];
                x7 = ptr_x0[7];
                ptr_x0 += 8;
    
                xh0_0 = x0 + x4;
                xh1_0 = x1 + x5;
                xh0_1 = x2 + x6;
                xh1_1 = x3 + x7;
    
                if (radix == 2)
                {
                    xh0_0 = x0;
                    xh1_0 = x1;
                    xh0_1 = x2;
                    xh1_1 = x3;
                }
    
                yt0 = xh0_0 + xh0_1;
                yt1 = xh1_0 + xh1_1;
                yt4 = xh0_0 - xh0_1;
                yt5 = xh1_0 - xh1_1;
    
                xl0_0 = x0 - x4;
                xl1_0 = x1 - x5;
                xl0_1 = x2 - x6;
                xl1_1 = x3 - x7;
    
                if (radix == 2)
                {
                    xl0_0 = x4;
                    xl1_0 = x5;
                    xl1_1 = x6;
                    xl0_1 = x7;
                }
    
                yt2 = xl0_0 + xl1_1;
                yt3 = xl1_0 - xl0_1;
                yt6 = xl0_0 - xl1_1;
                yt7 = xl1_0 + xl0_1;
    
                if (radix == 2)
                {
                    yt7 = xl1_0 - xl0_1;
                    yt3 = xl1_0 + xl0_1;
                }
    
                y0[k] = yt0;
                y0[k + 1] = yt1;
                k += n_max >> 1;
                y0[k] = yt2;
                y0[k + 1] = yt3;
                k += n_max >> 1;
                y0[k] = yt4;
                y0[k + 1] = yt5;
                k += n_max >> 1;
                y0[k] = yt6;
                y0[k + 1] = yt7;
            }
    }
    
    /* ======================================================================= */
    /*  End of file:  DSPF_sp_fftSPxSP_cn.c                                    */
    /* ----------------------------------------------------------------------- */
    /*            Copyright (c) 2011 Texas Instruments, Incorporated.          */
    /*                           All Rights Reserved.                          */
    /* ======================================================================= */