What is the maximum N for the DSPF_sp_icfftr2_dif function in DSPLIB?

Hi all, 

I'm using DSPF_sp_icfftr2_dif() for the inverse FFT, from library version dsplib_c674x_3_4_0_0.

The output is all zeros when N is set to 32768; up to 16384 it works fine. Is there a size limitation for the inverse FFT?

There is no information about the valid range of N in the DSPLIB user guide. Could someone please respond to this query soon?

  • We are working on your post.
  • Hi Titus, 

    Thank you for your response. Is there any solution for this? The issue is a showstopper for us at the moment, and I'm looking forward to a quick response.

    When I use the library call in my actual code, it causes a buffer overflow after the function call, which is a major concern.

  • Hi abz,

    abz said:
    The output is all zeros when N is set to 32768; up to 16384 it works fine. Is there a size limitation for the inverse FFT?

    There is no information about the valid range of N in the DSPLIB user guide. Could someone please respond to this query soon?

    For queries such as the maximum and minimum limits of the parameters, you can look at the API reference documents, such as the one below:

    C:\ti\dsplib_c674x_3_1_1_1\docs\doxygen\html\dsplib_html\group___d_s_p_f__sp__fft_s_p_x_s_p.html

    void DSPF_sp_fftSPxSP ( int N,
    float * ptr_x,
    float * ptr_w,
    float * ptr_y,
    unsigned char * brev,
    int n_min,
    int offset,
    int n_max
    )

    The benchmark performs a mixed radix forwards fft.

    Parameters:
    N length of FFT in complex samples
    ptr_x pointer to complex data input
    ptr_w pointer to complex twiddle factor
    ptr_y pointer to complex output data
    brev pointer to bit reverse table containing 64 entries
    n_min should be 4 if N can be represented as Power of 4 else, n_min should be 2
    offset index in complex samples of sub-fft from start of main fft
    n_max size of main fft in complex samples
    Algorithm:
    DSPF_sp_fftSPxSP_cn.c is the natural C equivalent of the optimized linear assembly code without restrictions. Note that the linear assembly code is optimized and restrictions may apply.
    Assumptions:
    N needs to be power of 2
    8 <= N <= 131072
    Arrays pointed by ptr_x, ptr_w, and ptr_y should not overlap
    Arrays pointed by ptr_x, ptr_w, and ptr_y should align on the double words boundary

    For DSPF_sp_icfftr2_dif(), please see ~\ti\dsplib_c674x_3_1_1_1\docs\doxygen\html\dsplib_html\group___d_s_p_f__sp__icfftr2__dif.html
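
The assumptions quoted above for DSPF_sp_fftSPxSP can be checked before the call. The following is only a sketch: the helper names (`is_pow2`, `fft_size_ok`, `n_min_for`, `is_dword_aligned`) are hypothetical and not part of DSPLIB, and the limits are taken from the excerpt above, so they may not apply to DSPF_sp_icfftr2_dif().

```c
#include <stdint.h>

/* Nonzero if n is a positive power of two. */
static int is_pow2(long n)
{
    return n > 0 && (n & (n - 1)) == 0;
}

/* Size assumption from the excerpt: N is a power of 2, 8 <= N <= 131072. */
static int fft_size_ok(long n)
{
    return is_pow2(n) && n >= 8 && n <= 131072;
}

/* n_min per the parameter list: 4 if N is a power of 4, otherwise 2.
 * A power of two is a power of four when its single set bit sits at an
 * even bit position, which the 0x55555555 mask selects. */
static int n_min_for(long n)
{
    return (is_pow2(n) && (n & 0x55555555L) != 0) ? 4 : 2;
}

/* Nonzero if p sits on a double-word (8-byte) boundary. */
static int is_dword_aligned(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}
```

With these helpers, N = 32768 passes the size check for DSPF_sp_fftSPxSP and would use n_min = 2, since 32768 = 2^15 is not a power of 4.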

  • Hi Sankari,
    Can you please read my question carefully? The inverse FFT in question is DSPF_sp_icfftr2_dif(), and its size parameter is an unsigned short, which makes 2^15 the maximum input, given that N must be a power of 2.
    The issue occurs above 16384 samples: if I increase the sample count to 2^15, the output from the library is all zeros.
    When I use the library function call in my original code, it causes a memory overflow. Using the C code of the same function works well, but the library function call does not.

    Regards,
    Nancy
  • Nancy,

    In the documentation I have, the n argument for DSPF_sp_icfftr2_dif() is short which is signed by C default conventions, not unsigned short. Unfortunately, this limits the positive range for n to (2^15)-1, which means 32768 is not valid.
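
The range argument can be illustrated with a small host-side sketch. It only compares candidate values of N against the 16-bit limits from `<stdint.h>`; the helper names are hypothetical, and it says nothing about which type the library really uses.

```c
#include <stdint.h>

/* Nonzero if n fits unchanged in a 16-bit signed parameter. */
static int fits_int16(long n)
{
    return n >= INT16_MIN && n <= INT16_MAX;
}

/* Nonzero if n fits unchanged in a 16-bit unsigned parameter. */
static int fits_uint16(long n)
{
    return n >= 0 && n <= UINT16_MAX;
}

/* Largest power of two that does not exceed limit (limit >= 1). */
static long largest_pow2_upto(long limit)
{
    long p = 1;
    while (p <= limit / 2)
        p *= 2;
    return p;
}
```

Under a signed 16-bit parameter the largest usable power of two is 16384; under an unsigned one it is 32768.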

    You probably have access to the library source and could change that parameter to allow it to go higher. There is no useful reason on a 32-bit machine to pass a count argument as a short, although as soon as I say that there will be a good example countering it. You may have already looked at the library and seen that the argument is not limited to signed, but I have not, and am going by the documentation and C header file.

    When you are getting to larger counts like this, it can be a problem when you are using internal memory. There may be multiple buffers that have to hold that same number of values, and internal memory buffers could reach the limit of the size of the internal memory. You can always move everything to external memory and test it for functionality, then work on the best allocation of internal memory as cache or SRAM or for which buffers should go where.
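
To make the memory concern concrete, here is a rough sizing sketch. It assumes complex single-precision data stored as interleaved real/imaginary floats; the helper names are hypothetical, and any memory budget passed in must come from the datasheet of the actual device.

```c
#include <stddef.h>

/* Bytes needed for n complex samples stored as interleaved re/im floats. */
static size_t complex_buf_bytes(size_t n)
{
    return 2u * n * sizeof(float);
}

/* Nonzero if the listed buffer sizes together fit in budget bytes. */
static int fits_in_budget(const size_t *sizes, size_t count, size_t budget)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += sizes[i];
    return total <= budget;
}
```

With 4-byte floats, a single complex buffer for N = 32768 already takes 262144 bytes (256 KB), before the twiddle table or any other buffer is counted.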

    Regards,
    RandyP
  • Hi Randy, 

    Thank you for the response.

    As you have mentioned, the source code in the library takes the argument as unsigned short, which makes the limit 2^16 - 1, so 32768 samples is the largest N that is a power of 2.

    But the issue I'm facing is memory corruption that occurs when I use 32768 samples with the optimized linear assembly code. There is no problem with the memory or the output when I use the C equivalent of the optimized linear assembly code, which points to an issue in the library's generated code.

    Has anyone reported this before? Is it a known problem? I'm keen on using the optimized code.

    Meanwhile I will check the memory allocation, but I'm pretty sure the heap is set to the maximum it can go up to, and the buffer that holds the input values is well within the internal memory size.

  • Nancy,

    Have you had a chance to debug this any more?

    Can you send a snippet of the code before, including, and after the function call, especially something that shows how the memory corruption shows itself? It would be best if the code can be built and run and shows the problem you are seeing. Input and expected outputs will be needed, too.

    Regards,
    RandyP
  • Is this thread closed? If not, please update us.

    Adding to what Randy said:

    I wonder what happens if you put all the vectors (data, twiddle, bit reverse) in DDR, disable all caches, and then run the code. Do you still have memory corruption? Do you see the correct results?
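
A possible way to set up that experiment with the TI compiler is sketched below. The `DATA_SECTION` and `DATA_ALIGN` pragmas are TI compiler features; the section names `.ddr_fft_x` and `.ddr_fft_w` are hypothetical and would have to be mapped to external memory (with whatever gaps are wanted) in the linker command file. The buffer sizes follow the 2*N figures posted later in this thread. Cache configuration is not shown.

```c
#include <stddef.h>

#define FFT_N 32768

/* TI-compiler-only placement; other compilers skip these pragmas.
 * The section names are placeholders for this sketch. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(fft_x, ".ddr_fft_x")
#pragma DATA_ALIGN(fft_x, 8)
#pragma DATA_SECTION(fft_w, ".ddr_fft_w")
#pragma DATA_ALIGN(fft_w, 8)
#endif

float fft_x[2 * FFT_N]; /* data buffer */
float fft_w[2 * FFT_N]; /* coefficient (twiddle) table */
```

Giving each buffer its own section lets the linker command file decide where each one lands, so the layout can be changed without touching the code.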

    Ran
  • Hi Randy, 

    I'm continuing to use the C code of the library function. Meanwhile, I found this link.

    I reckon the problem I'm facing is the same as the one described in the link below, except that the function I use, DSPF_sp_icfftr2_dif, is different.

    And DSPLIB still hasn't fixed the problem.

    e2e.ti.com/.../208987 (if the link isn't working, please search for the question below)

    DSPF_sp_ifftSPxSP with N=131072

  • I read your latest posting. I will let the developers know that the same problem seen with DSPF_sp_ifftSPxSP also exists for DSPF_sp_icfftr2_dif.

    Can you close the thread now (again?)

    Ran

  • Nancy,

    The failure in the ifft case was corruption at addresses before the primary buffer target. I do not know why that would be caused by a larger N and larger buffer size, but it most likely is something harder to find than just common over-reaching of the output pointers; I would have expected the over-reaching would happen all the time even with smaller N, also.

    If you put all of your buffers in external memory (where there is a lot of space) and allocate those buffers so there are large gaps between them, you should either find a work-around or a helpful data point for us to examine.

    If the problem is simply epilog overflow, then the real buffer will have the right contents and some memory around that buffer will be corrupted. If that is the case, then you can use the optimized version and just expect the corruption and defend against it by allocating dummy space in that area.

    But if it is not so simple, then the data in the real buffer will be wrong because more complicated errors were made with the pointers. In that case the solution will be harder to find.
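
One way to tell the two cases apart is to surround the buffer with guard regions filled with a known value and inspect them after the call. The sketch below is an assumption about how such a check could look, not something from DSPLIB: `GUARD_LEN`, `GUARD_VALUE`, and all function names are made up, and `fake_kernel` only simulates a one-element overrun in place of the real library call.

```c
#include <stddef.h>

#define GUARD_LEN   64          /* guard floats on each side (arbitrary) */
#define GUARD_VALUE (-12345.0f) /* sentinel unlikely to appear in real data */

/* Fill both guard regions of raw[], laid out as guard | data | guard. */
static void guard_fill(float *raw, size_t data_len)
{
    for (size_t i = 0; i < GUARD_LEN; i++) {
        raw[i] = GUARD_VALUE;
        raw[GUARD_LEN + data_len + i] = GUARD_VALUE;
    }
}

/* Count overwritten guard floats before (*pre) and after (*post) the data. */
static void guard_check(const float *raw, size_t data_len,
                        size_t *pre, size_t *post)
{
    *pre = 0;
    *post = 0;
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (raw[i] != GUARD_VALUE)
            (*pre)++;
        if (raw[GUARD_LEN + data_len + i] != GUARD_VALUE)
            (*post)++;
    }
}

/* Stand-in for the routine under test: clears the data and also writes
 * one float just before the buffer, to simulate an overrun. */
static void fake_kernel(float *data, size_t data_len)
{
    for (size_t i = 0; i < data_len; i++)
        data[i] = 0.0f;
    data[-1] = 0.0f; /* lands in the leading guard region */
}
```

In use, the data pointer handed to the routine would be `raw + GUARD_LEN`. Nonzero counts from `guard_check` together with correct data would point toward the simple overflow case, while clean guards with wrong data would point toward the harder one.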

    It would still be helpful if you can supply a testcase. I know it is easier to tell us to just run the function to see it, but too often doing that does not duplicate the problem or point to the right problem. It is easier for us to get someone to look at a bug if they are handed a testcase to use. Personally, I have no idea what an icfftr2_dif function is supposed to do, so I could never create such a testcase - and I am not on the team of smart programmers who write these things.

    Regards,
    RandyP
  • Hi Ran,
    Thanks, I will close the thread.
    But can you kindly let me know when it will be fixed in DSPLIB?

    @Randy

    Please find below the piece of code that performs the IFFT:

    #define N 32768

    float x[2 * N]; // data buffer (2*N floats, as posted)
    float w[2 * N]; // coefficient table (2*N floats, as posted)

    void main(void)
    {
        gen_w_r2(w, N);               // Generate coefficient table
        bit_rev(w, N >> 1);           // Bit-reverse coefficient table
        DSPF_sp_cfftr2_dit(x, w, N);  // Radix-2 DIT forward FFT: input in normal order,
                                      // output in bit-reversed order,
                                      // coefficient table in bit-reversed order
        DSPF_sp_icfftr2_dif(x, w, N); // Inverse radix-2 FFT: input in bit-reversed order,
                                      // output in normal order,
                                      // coefficient table in bit-reversed order
        divide(x, N);                 // Scale inverse FFT output;
                                      // result is the same as original input
    }

    This is just as in spru657c.pdf.
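
The expected behaviour of that sequence (forward transform, unscaled inverse, then division by N giving back the original input) can be illustrated on a host without DSPLIB. The sketch below is a naive 4-point DFT with hard-coded twiddles. It is a stand-in for illustration only, not the radix-2 algorithm used by the library, and all names in it are hypothetical.

```c
#include <stddef.h>

#define NPTS 4

/* exp(-2*pi*j*m/4) for m = 0..3, i.e. 1, -j, -1, +j. */
static const float tw_re[NPTS] = { 1.0f,  0.0f, -1.0f, 0.0f };
static const float tw_im[NPTS] = { 0.0f, -1.0f,  0.0f, 1.0f };

/* Naive 4-point DFT on interleaved re/im data. With inverse != 0 the
 * twiddles are conjugated, giving the unscaled inverse transform. */
static void dft4(const float *in, float *out, int inverse)
{
    for (int k = 0; k < NPTS; k++) {
        float acc_re = 0.0f, acc_im = 0.0f;
        for (int n = 0; n < NPTS; n++) {
            int m = (k * n) % NPTS;
            float wr = tw_re[m];
            float wi = inverse ? -tw_im[m] : tw_im[m];
            float xr = in[2 * n], xi = in[2 * n + 1];
            acc_re += xr * wr - xi * wi;
            acc_im += xr * wi + xi * wr;
        }
        out[2 * k] = acc_re;
        out[2 * k + 1] = acc_im;
    }
}

/* Scale n complex samples by 1/n, like the divide() step in the post. */
static void divide_by_n(float *x, int n)
{
    for (int i = 0; i < 2 * n; i++)
        x[i] /= (float)n;
}

/* Largest absolute difference between x and its forward/inverse round trip. */
static float roundtrip_max_error(const float *x)
{
    float freq[2 * NPTS], back[2 * NPTS], max_err = 0.0f;
    dft4(x, freq, 0);
    dft4(freq, back, 1);
    divide_by_n(back, NPTS);
    for (int i = 0; i < 2 * NPTS; i++) {
        float d = back[i] - x[i];
        if (d < 0.0f)
            d = -d;
        if (d > max_err)
            max_err = d;
    }
    return max_err;
}
```

A testcase for the library could use the same kind of check: compare the round-trip result against the saved input and report the largest difference, which would make an all-zero output show up immediately.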

  • I will try

    Two comments:

    1. We do not understand what "corrupt output" means. Did you fill the output vector with a known value and then see it change?
    2. Try changing the function parameter from short to int in the prototype file and see if it works with 32K.

    Best Regards

    Ran
  • Hi,
    By output I meant the values in x, which I print using fprintf. Memory corruption occurs when I use the same function in my code with the optimized linear assembly version. It works fine if I use the C version of the same code.
    The values are calculated in float; short is only the type of the size N.

    I don't want to repeat the steps again and again. Please read my questions, the link I shared, and my other comments.

    e2e.ti.com/.../208987