RTOS/AM5728: FFTLIB from Processor SDK

Louis Thiery31

Part Number: AM5728
Other Parts Discussed in Thread: FFTLIB

Tool/software: TI-RTOS

I am trying to use the DSPs via OpenCL to do an IFFT. I've had success using DSPLIB, but I now need to do a non-power-of-two FFT (50 elements, to be exact). I read that FFTLIB has non-power-of-two FFTs, and indeed fftlib_c66x_2_0_0_2 provides the following documentation (I've only included a short extract to give some context):

int ifft_sp_1d_c2r_batch_direct(fft_param_u u, void *edmaState)

Parameters:

N = IFFT size
M = Power-of-2 IFFT size, if the Bluestein algorithm is used

Assumptions:

Batch size is at least 4.
N is a positive value.

This looked great, but there was no ae66 library for the whole project. I then found fftlib_3_1_0_0 in ti-processor-sdk-linux-am57xx-evm-05.01.00.11, which seemed promising because it is much more recent and includes the library. The problem is that its function documentation indicates that non-power-of-two sizes are more or less unsupported, which suggests a rollback in features. Here is another documentation extract, this time from fftlib_3_1_0_0:

void ifft_sp_plan_1d_c2r_batch(int N, int mode, fft_callout_t fxns, fft_plan_t *p,
                               float *in, float *out, float *tw, float *work)

Parameters:

N = IFFT size

Assumptions:

Batch size is at least 8.
N is a positive value.
N is multiple of 8.

So my concern is: did the IFFT in version 2_0_0_2 not work for non-powers of 2? Or was there a feature rollback?
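For context on the M parameter in the 2_0_0_2 extract: Bluestein's algorithm rewrites a length-N DFT as a convolution, which is then evaluated with power-of-two FFTs of some length M >= 2N - 1. The following is a minimal sketch of how such an M could be chosen. The helper name bluestein_padded_size is hypothetical and is not an FFTLIB function, and FFTLIB itself may pick M differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTLIB): returns the smallest power of
 * two M with M >= 2*N - 1, the minimum convolution length that Bluestein's
 * algorithm needs for a length-N transform. */
size_t bluestein_padded_size(size_t n)
{
    assert(n > 0);            /* N must be a positive value */
    size_t m = 1;
    while (m < 2 * n - 1) {
        m <<= 1;              /* grow to the next power of two */
    }
    return m;
}
```

For the 50-element case, 2N - 1 = 99, so this sketch returns M = 128.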

over 5 years ago

0 Rahul Prabhu over 5 years ago

TI Guru 114410 points

Louis,

As you may know, FFTLIB is based on the FFTW library, so the newer versions are optimized versions of the FFTW release that was made available. The only IFFT issue I am aware of from this update is that the FFTLIB ifft function scales its output by 1/N, which does not match the FFTW implementation.

If you are interested in the implementation from the earlier library, you could either use the older library version or extract the source of the earlier function and use it with your OpenCL implementation. It is also possible that dropping non-power-of-two support was a result of the optimization that was done on the function as part of the library integration on the TI DSP.

Regards,
Rahul

0 Louis Thiery31 over 5 years ago in reply to Rahul Prabhu

Prodigy 70 points

Thanks for the quick response, Rahul.

So just to recap, we are unsure why the non-power-of-two support was dropped, but it was most likely due to following the lead of FFTW or a TI DSP optimization. And so, as far as we know, the older version does not have any known bugs that would've caused the feature to be dropped?

Best,

Louis

0 Parian Golchin over 5 years ago

Prodigy 100 points

Part Number: AM5728

Tool/software: Code Composer Studio

Hello,

I have used the FFTLIB library in my project. At first there seemed to be no problems and no errors. However, after the 7th time I run my program, it hangs in the “fft_execute” function. Tracking the issue down leads to “EdmaMgr_alloc” in “void *fft_assign_edma_resources(void)”.

As mentioned in the related question "https://e2e.ti.com/support/processors/f/791/p/754731/2800231", the AM5728 uses only one DSP EDMA instance. Accordingly, I have changed the following in the configuration file “fft_c6678_config.c”:

#define EDMA_MGR_NUM_EDMA_INSTANCES 1

#define NUM_EDMA_INSTANCES 1

Moreover, I have changed the "Global Register Region of CC Registers" base addresses:

#define DSP1_EDMA3_CC_BASE_ADDR (0x01D10000)
#define DSP1_EDMA3_TC0_BASE_ADDR (0x01D05000)
#define DSP1_EDMA3_TC1_BASE_ADDR (0x01D06000)

But my program still hangs after exactly the 7th time.

Could you please help me with this issue?

Best regards,

Parian Golchin

0 Rahul Prabhu over 5 years ago

TI Guru 114410 points

Parian,

We need some more information on how you are running this test code. Are you enabling the DSP cache? Where is the code placed? Where is the DSP PC when the hang occurs? Is it in the EDMA code or in the FFT code? Are the EDMA transfers bigger than 32KB?

Please check for existing issues like this one that you may be running into:
e2e.ti.com/.../518566

Regards,
Rahul

0 Parian Golchin over 5 years ago in reply to Rahul Prabhu

Prodigy 100 points

Dear Rahul,

Are you enabling the DSP cache, and where is the code placed? I do not know whether it is enabled, or how to enable or disable it.
Where is the DSP PC when the hang occurs? It hangs in the “EdmaMgr_alloc” function.
Is it in the EDMA code or in the FFT code? In the EDMA code.
Are EDMA transfers bigger than 32KB? No, they are not.

One more thing I noticed: I get two warnings before running my program. The warnings are the following:
warning #10247-D: creating output section ".ddr_mem" without a SECTIONS specification
warning #10247-D: creating output section ".ll2_mem" without a SECTIONS specification

Thank you very much,
Best regards,
Parian Golchin

0 Omid MDB over 5 years ago

Intellectual 690 points

Part Number: AM5728

Tool/software: TI-RTOS

Hi,

I built and ran the fft_sp_1d_r2c FFTLIB example (C:\ti\fftlib_c66x_2_0_0_2\packages\ti\fftlib\src\fft_sp_1d_r2c\k1\fft_sp_1d_r2c_k1_66_LE_ELF), but I encountered an issue with EDMA3.

The console output:

EdmaMgr_alloc() failed (0)

Please help me.

--------------------

CCS Version: 8.2.0.00007

EDMA3 Version: 2.12.3

Framework Component Version: 2.12.3

PDK Version: 1.0.13

FFTLIB Version: fftlib_c66x_2_0_0_2

Best regards, Omid

0 Rahul Prabhu over 5 years ago in reply to Louis Thiery31

TI Guru 114410 points

I have reached out to the FFTLIB maintainer for his comment and will post a response by the end of this week.

0 Rahul Prabhu over 5 years ago in reply to Louis Thiery31

TI Guru 114410 points

I have reached out to the FFTLIB maintainer for their comment on this feature and will get back to you by the end of this week.

Regards,
Rahul

0 Louis Thiery31 over 5 years ago in reply to Rahul Prabhu

Prodigy 70 points

Thanks, Rahul! I'm still very interested in the response.

0 Rahul Prabhu over 5 years ago in reply to Omid MDB

TI Guru 114410 points

Omid,

It looks like you are running into the same issue as described by Parian, so I am merging the two threads on the same topic.

I will check with our system test team and confirm that they have run this test case.

Regards,
Rahul

0 Rahul Prabhu over 5 years ago in reply to Omid MDB

TI Guru 114410 points

Omid,

It appears that the FFTLIB released in 2014 was validated on C66x and K2H platforms.

For AM572x, FFTLIB is only supported from Processor SDK Linux, where the FFTLIB functionality can be offloaded to the DSP from ARM Linux using OpenCL, as mentioned here:
wiki.tiprocessors.com/.../Processor_SDK_Libraries

The version of FFTLIB for the DSP has been upgraded to 3.1.0.0 and should have the required EDMA porting done for the DSP on AM572x.

Regards,
Rahul

0 Omid MDB over 5 years ago in reply to Rahul Prabhu

Intellectual 690 points

Hi Rahul,

Thank you for your reply. I will do so.

Best regards, Omid

0 Parian Golchin over 5 years ago in reply to Omid MDB

Prodigy 100 points

Hi Rahul,

Thank you very much for your fast response.

I have tried FFTLIB 3.1.0.0, and that problem is now solved.

However, I now have a new issue: "[core 0] Memory allocation error!".

When I tracked the issue inside the "fft_omp_sp_2d_r2c_ecpy" function, I found that it allocates memory for "data_wLocal", "workbuf_wLocal", and "workbuf_tLocal" by calling the "lib_smem_falloc" function. Please see below:

data_wLocal = (float*)lib_smem_falloc (fft_mem_handle, 4*N*FFT_OMP_SP_2D_R2C_NUMOFLINEBUFS*sizeof(float), 8);

workbuf_wLocal = (float*)lib_smem_falloc (fft_mem_handle, 4*N*FFT_OMP_SP_2D_R2C_NUMOFLINEBUFS*sizeof(float), 8);

workbuf_tLocal = (float*)lib_smem_falloc (fft_mem_handle, 2*N*FFT_OMP_SP_2D_R2C_NUMOFLINEBUFS*sizeof(float), 8);

However, the allocations fail and all three pointers are NULL.
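As a rough sanity check, the total size requested by the three lib_smem_falloc calls quoted above can be computed and compared against the memory actually available to fft_mem_handle. The sketch below only mirrors the size expressions from those calls. The function name fft_2d_r2c_scratch_bytes is hypothetical, the value of FFT_OMP_SP_2D_R2C_NUMOFLINEBUFS is not given in this thread and is therefore passed in as a parameter, and any padding added for the 8-byte alignment is ignored.

```c
#include <stddef.h>

/* Hypothetical helper (not part of FFTLIB): sums the byte counts requested
 * by the three lib_smem_falloc calls in fft_omp_sp_2d_r2c_ecpy.
 * num_line_bufs stands in for FFT_OMP_SP_2D_R2C_NUMOFLINEBUFS, whose real
 * value must be taken from the FFTLIB headers. Alignment padding is ignored. */
size_t fft_2d_r2c_scratch_bytes(size_t n, size_t num_line_bufs)
{
    size_t data_w = 4 * n * num_line_bufs * sizeof(float);  /* data_wLocal    */
    size_t work_w = 4 * n * num_line_bufs * sizeof(float);  /* workbuf_wLocal */
    size_t work_t = 2 * n * num_line_bufs * sizeof(float);  /* workbuf_tLocal */
    return data_w + work_w + work_t;
}
```

In total this is 10 * N * num_line_bufs * sizeof(float) bytes, so the request grows linearly with N. If that figure exceeds the size of the region behind fft_mem_handle, NULL results would be expected.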

I would greatly appreciate your help!

Thank you!

Best regards,

Parian Golchin

0 Rahul Prabhu over 5 years ago in reply to Parian Golchin

TI Guru 114410 points

Parian,

The memory allocation problem is a known issue, which has been reported here:
e2e.ti.com/.../768577

It appears that when BIOS, EDMA, and dependent components were updated, the memory requirements of FFTLIB increased, so you are running into an issue that we are tracking in our bug system. This library is in maintenance mode, so please expect some delays.

I will post a response if I get an intermediate fix, so you don't need to wait for a new release.

Regards,
Rahul

0 Parian Golchin over 5 years ago in reply to Rahul Prabhu

Prodigy 100 points

Hi Rahul,

Thanks for your reply.

I will wait for an intermediate fix.

Best regards,

Parian
