What is the maximum N limit for the DSPF_sp_icfftr2_dif function in DSPLIB?

NancyJ

Hi all,

I'm using the function DSPF_sp_icfftr2_dif() for the inverse FFT, from library version dsplib_c674x_3_4_0_0.

With N set to 32768 the output is all zeros; up to 16384 it works fine. Is there a size limitation for the inverse FFT?

There is no information about the limits on N in the DSPLIB user guide. Can someone respond to this query soon, please?

over 7 years ago

0 Titusrathinaraj Stalin over 7 years ago

TI__Guru** 116100 points

We are working on your post.

0 NancyJ over 7 years ago in reply to Titusrathinaraj Stalin

Intellectual 995 points

Hi Titus,

Thank you for your response. Is there any solution for it? The issue is a showstopper for us right now, so I am looking forward to a quick response.

When I use the library call in my actual code, it causes a buffer overflow after the function call, which is a major concern.

0 Shankari G over 7 years ago in reply to NancyJ

TI__Mastermind 43955 points

Hi abz,

abz said:
I'm getting the output as zero for N value set to 32768 upto 16384 it works fine, is there any size limitation for the inverse FFT?
there is no information about the N value in the dsplib user guide, can some one respond to this query soon please.

For queries such as the maximum and minimum limits of the parameters, you can look at the API reference documents, for example:

C:\ti\dsplib_c674x_3_1_1_1\docs\doxygen\html\dsplib_html\group___d_s_p_f__sp__fft_s_p_x_s_p.html

void DSPF_sp_fftSPxSP(int N,
                      float *ptr_x,
                      float *ptr_w,
                      float *ptr_y,
                      unsigned char *brev,
                      int n_min,
                      int offset,
                      int n_max)

The benchmark performs a mixed radix forwards fft.

Parameters:

	N	length of FFT in complex samples
	ptr_x	pointer to complex data input
	ptr_w	pointer to complex twiddle factor
	ptr_y	pointer to complex output data
	brev	pointer to bit reverse table containing 64 entries
	n_min	should be 4 if N can be represented as Power of 4 else, n_min should be 2
	offset	index in complex samples of sub-fft from start of main fft
	n_max	size of main fft in complex samples

Algorithm:: DSPF_sp_fftSPxSP_cn.c is the natural C equivalent of the optimized linear assembly code without restrictions. Note that the linear assembly code is optimized and restrictions may apply.

Assumptions:

	N needs to be a power of 2
	8 <= N <= 131072
	Arrays pointed to by ptr_x, ptr_w, and ptr_y should not overlap
	Arrays pointed to by ptr_x, ptr_w, and ptr_y should be aligned on a double-word boundary
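The assumptions listed above for DSPF_sp_fftSPxSP can be checked before the call. The following is only a sketch: fft_args_ok is a hypothetical helper, not part of DSPLIB, and it assumes a double word is 8 bytes. It does not check for overlap, since that needs the array lengths as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of DSPLIB: returns 1 when the arguments
 * satisfy the documented assumptions for DSPF_sp_fftSPxSP, otherwise 0.
 * Overlap between the arrays is NOT checked here. */
static int fft_args_ok(int n, const float *ptr_x, const float *ptr_w,
                       const float *ptr_y)
{
    /* Documented range: 8 <= N <= 131072 */
    if (n < 8 || n > 131072)
        return 0;

    /* N must be a power of 2, i.e. exactly one bit is set */
    if ((n & (n - 1)) != 0)
        return 0;

    /* Each array must start on a double-word boundary (assumed 8 bytes) */
    if (((uintptr_t)ptr_x % 8u) != 0 ||
        ((uintptr_t)ptr_w % 8u) != 0 ||
        ((uintptr_t)ptr_y % 8u) != 0)
        return 0;

    return 1;
}
```

Calling this in a debug build just before the FFT call makes it easy to rule out a bad N or a misaligned buffer before looking for problems inside the library.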

For DSPF_sp_icfftr2_dif(), please see ~\ti\dsplib_c674x_3_1_1_1\docs\doxygen\html\dsplib_html\group___d_s_p_f__sp__icfftr2__dif.html

0 NancyJ over 7 years ago in reply to Shankari G

Intellectual 995 points

Hi Shankari,
Can you carefully read my question, please? The inverse FFT function is DSPF_sp_icfftr2_dif(), and the parameter it takes is an unsigned short, which makes 2^15 the maximum input, given that N must be a power of 2.
The issue occurs above 16384 samples: if I increase the sample count to 2^15, the values that come out of the library are all zeros.
When I use the library function call in my original code, it causes a memory overflow. Using the C code of the same function works well, but the library function call does not.

Regards,
Nancy

0 RandyP over 7 years ago in reply to NancyJ

TI__Guru* 84110 points

Nancy,

In the documentation I have, the n argument for DSPF_sp_icfftr2_dif() is short which is signed by C default conventions, not unsigned short. Unfortunately, this limits the positive range for n to (2^15)-1, which means 32768 is not valid.
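The range argument above can be written down as a small check. This is a sketch that assumes a 16-bit short, so SHRT_MAX is 32767 and USHRT_MAX is 65535; the helper names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if n survives being passed through a (signed) short parameter. */
static int fits_in_short(long n)
{
    return n >= SHRT_MIN && n <= SHRT_MAX;
}

/* Returns 1 if n survives being passed through an unsigned short parameter. */
static int fits_in_ushort(long n)
{
    return n >= 0 && n <= USHRT_MAX;
}
```

With a 16-bit short, 16384 fits in both types, while 32768 fits only in unsigned short. So whether N = 32768 is even representable depends on which prototype the build actually sees.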

You probably have access to the library source and could change that parameter to allow it to go higher. There is no useful reason on a 32-bit machine to pass a count argument as a short, although as soon as I say that there will be a good example countering it. You may have already looked at the library and seen that the argument is not limited to signed, but I have not, and am going by the documentation and C header file.

When you are getting to larger counts like this, it can be a problem when you are using internal memory. There may be multiple buffers that have to hold that same number of values, and internal memory buffers could reach the limit of the size of the internal memory. You can always move everything to external memory and test it for functionality, then work on the best allocation of internal memory as cache or SRAM or for which buffers should go where.
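To put a number on the buffer sizes, here is a rough sketch. It assumes the complex data is stored as interleaved real/imaginary floats (2*N floats for N complex samples) and that float is 4 bytes; complex_buf_bytes is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for n complex single-precision samples stored as
 * interleaved real/imaginary floats (2*n floats in total). */
static size_t complex_buf_bytes(size_t n)
{
    return 2u * n * sizeof(float);
}
```

Under those assumptions, N = 32768 needs 262144 bytes (256 KB) for a single complex buffer, before counting the twiddle table or any other working buffers, so it is worth comparing that figure against the internal memory actually available on the device.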

Regards,
RandyP

0 NancyJ over 7 years ago in reply to RandyP

Intellectual 995 points

Hi Randy,

Thank you for the response.

As you mentioned, the source code in the library takes the argument as an unsigned short, which makes the limit 2^16 - 1, so 32768 is the largest N that is still a power of 2.

But the issue I'm facing is memory corruption that occurs when I use 32768 samples with the optimized linear assembly code. There is no problem with the memory or the generated output when I use the C equivalent of the optimized linear assembly code, which points to an issue in the library's optimized code.

Has anyone reported this before? Is it a known problem? I'm keen on using the optimized code.

Meanwhile I will check the memory allocation, but I'm pretty sure the heap is set to the maximum it can go up to, and the buffer used to hold the input values is well within the internal memory size.

0 RandyP over 7 years ago in reply to NancyJ

TI__Guru* 84110 points

Nancy,

Have you had a chance to debug this any more?

Can you send the snippet of code before, including, and after the function call, especially with something that shows how the memory corruption shows itself? It would be best if the code can be built and run and shows the problem you are seeing. Input and expected outputs will be needed, too.

Regards,
RandyP

0 ran35366 over 7 years ago in reply to RandyP

TI__Genius 12805 points

Is this thread closed? If not, please update us.

Adding to what Randy said:

I wonder what happens if you put all the vectors (data, twiddle, bit reverse) in DDR, disable all caches, and then run the code. Do you still have memory corruption? Do you see the correct results?

Ran

0 NancyJ over 7 years ago in reply to ran35366

Intellectual 995 points

Hi Randy,

I'm continuing to use the C code of the library function. Meanwhile, I found this link.

I reckon the problem I'm facing is the same as the one described in the link below, except that the function I use is different (DSPF_sp_icfftr2_dif).

And DSPLIB still hasn't fixed the problem.

e2e.ti.com/.../208987tion , if the link isn't working, please search for the question

DSPF_sp_ifftSPxSP with N=131072

0 ran35366 over 7 years ago in reply to NancyJ

TI__Genius 12805 points

I read your latest posting. I will update the developers that the same problem seen with DSPF_sp_ifftSPxSP also exists for DSPF_sp_icfftr2_dif.

Can you close the thread now (again?)

Ran

0 RandyP over 7 years ago in reply to NancyJ

TI__Guru* 84110 points

Nancy,

The failure in the ifft case was corruption at addresses before the primary buffer target. I do not know why that would be caused by a larger N and larger buffer size, but it most likely is something harder to find than just common over-reaching of the output pointers; I would have expected the over-reaching would happen all the time even with smaller N, also.

If you put all of your buffers in external memory (where there is a lot of space) and allocate those buffers so there are large gaps between them, you should either find a work-around or a helpful data point for us to examine.

If the problem is simply epilog overflow, then the real buffer will have the right contents and some memory around that buffer will be corrupted. If that is the case, then you can use the optimized version and just expect the corruption and defend against it by allocating dummy space in that area.
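One way to test for this kind of overflow is to surround the real buffer with guard regions filled with a known pattern and inspect them after the call. The sketch below is only an illustration: guarded_buf, GUARD_BYTES, GUARD_FILL and the helper functions are made-up names, the guard size is arbitrary, and it uses malloc rather than any particular memory section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 1024u  /* arbitrary guard size; a multiple of 8 keeps alignment */
#define GUARD_FILL  0xA5   /* known pattern written into both guards */

typedef struct {
    unsigned char *raw;    /* start of the whole allocation (pass to free) */
    float *data;           /* the real buffer, between the two guards */
    size_t data_bytes;     /* size of the real buffer in bytes */
} guarded_buf;

/* Allocate n_floats floats with a filled guard before and after. Returns 1 on success. */
static int guarded_alloc(guarded_buf *g, size_t n_floats)
{
    assert(g != NULL);
    g->data_bytes = n_floats * sizeof(float);
    g->raw = malloc(GUARD_BYTES + g->data_bytes + GUARD_BYTES);
    if (g->raw == NULL)
        return 0;
    memset(g->raw, GUARD_FILL, GUARD_BYTES);
    memset(g->raw + GUARD_BYTES + g->data_bytes, GUARD_FILL, GUARD_BYTES);
    g->data = (float *)(g->raw + GUARD_BYTES);
    return 1;
}

/* Count the bytes in one guard region that no longer hold the fill pattern. */
static size_t guard_damage(const unsigned char *guard)
{
    size_t i, bad = 0;
    for (i = 0; i < GUARD_BYTES; i++)
        if (guard[i] != GUARD_FILL)
            bad++;
    return bad;
}

/* Damaged bytes in the guard BEFORE the real buffer. */
static size_t damage_before(const guarded_buf *g)
{
    return guard_damage(g->raw);
}

/* Damaged bytes in the guard AFTER the real buffer. */
static size_t damage_after(const guarded_buf *g)
{
    return guard_damage(g->raw + GUARD_BYTES + g->data_bytes);
}
```

Allocate x and w this way, run the optimized function, then check damage_before and damage_after on each buffer. If the result in data is correct and only a guard is damaged, the damaged side and byte count tell you how much dummy space to reserve.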

But if it is not so simple, then the data in the real buffer will be wrong because more complicated errors were made with the pointers. In that case the solution will be harder to find.

It would still be helpful if you can supply a testcase. I know it is easier to tell us to just run the function to see it, but there are too many times that doing that does not duplicate the problem or point to the right problem. It is easier for us to get someone to look at a bug if they are handed a testcase to use. Personally, I have no idea what an icfftr2_dif function is supposed to do, so I could never create such a testcase - and I am not on the team of smart programmers who write these things.

Regards,
RandyP

0 NancyJ over 7 years ago in reply to ran35366

Intellectual 995 points

Hi Ran,
Thanks, I will close the thread.
But can you kindly let me know when it will be fixed in DSPLIB?

@Randy

Please find below the piece of code that performs the IFFT:

N = 32768

x = 2*N

w = 2*N

void main(void)
{
    gen_w_r2(w, N);               // Generate coefficient table
    bit_rev(w, N>>1);             // Bit-reverse coefficient table

    // Radix-2 DIT forward FFT: input in normal order, output in
    // bit-reversed order, coefficient table in bit-reversed order
    DSPF_sp_cfftr2_dit(x, w, N);

    // Inverse radix-2 FFT: input in bit-reversed order, output in
    // normal order, coefficient table in bit-reversed order
    DSPF_sp_icfftr2_dif(x, w, N);

    divide(x, N);                 // Scale inverse FFT output; result is the same as the original input
}

just as in the spru657c.pdf
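Since the forward FFT, inverse FFT, and divide should reproduce the original input, the result can be checked against a saved copy of x. The helpers below are a sketch with made-up names (all_zero, max_abs_diff); what counts as an acceptable error is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every element of x is exactly zero (the symptom reported for N = 32768). */
static int all_zero(const float *x, size_t n_floats)
{
    size_t i;
    assert(x != NULL);
    for (i = 0; i < n_floats; i++)
        if (x[i] != 0.0f)
            return 0;
    return 1;
}

/* Largest absolute element-wise difference between two buffers of n_floats floats. */
static float max_abs_diff(const float *a, const float *b, size_t n_floats)
{
    size_t i;
    float worst = 0.0f;
    assert(a != NULL && b != NULL);
    for (i = 0; i < n_floats; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Copy x into a second array of 2*N floats before the forward FFT, then call max_abs_diff(x, copy, 2*N) after divide. Running the same check on the natural C version and on the optimized version gives a concrete number to compare instead of inspecting printed values.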

0 ran35366 over 7 years ago in reply to NancyJ

TI__Genius 12805 points

I will try.

Two comments:

1. We do not understand what "corrupt output" means. Did you fill the output vector with a known value and then see it change?
2. Try changing the function parameter from short to int in the prototype file and see if it works with 32K.

Best Regards

Ran

0 NancyJ over 7 years ago in reply to ran35366

Intellectual 995 points

Hi,
By output I mean the values in x, which I read using fprintf. The memory corruption occurs when I use the same function in my code with the optimized linear assembly version. It works fine if I use the C version of the same code.
The values are calculated in float; the short is only for the size N.

I don't want to repeat the same steps again and again. Please read my questions, the link I shared, and my other comments.

e2e.ti.com/.../208987
