Hi.

For a customer project I am using DMA and XINTF to sample two 14-bit ADCs at 12.5 MHz. The ADCs are simply connected to the data bus, and the address bus is not in use. Once the DMA has filled a buffer with the sample data, I mask out the relevant bits to extract the data for each individual ADC. Then, for each ADC, I run the data through an FIR filter with a cutoff frequency of 1.25 MHz, and then downsample to 2.5 MHz by taking every 5th sample. This leaves us with 1024 samples. After downsampling I calculate the FFT of the signal.

I have used the fixed-point DSP library because, according to the benchmark info provided in the documentation, the fixed-point functions were faster than the floating-point ones. The processor used is the TMS320F28335.

The format from the ADC is 14-bit signed integer. I read this into a regular int, xn. Obviously xn is not 14 bits, and I can't think of a good way of reading the 14 bits directly from the DMA buffer into xn and getting the sign bit and everything in xn set correctly. The current code for reading the data from one ADC into xn is as follows:

    if(__DMABuf1__[j] & 0x2000)
        // Sign bit (the 14th bit) is set: mask out the 13 LSBs of DMABuf1,
        // discarding the sign bit, and convert to a negative number.
        xn = -(8192 - (__DMABuf1__[j] & 0x1FFF));
    else
        // Sign bit is clear: mask out the 13 LSBs of DMABuf1
        xn = __DMABuf1__[j] & 0x1FFF;

Question 1: Is there perhaps a faster/better way to do the above?

The whole code for processing data from one ADC looks like this:

    unsigned int j;
    unsigned int i, n;

    // --------------------------------------------
    // ADC2
    // --------------------------------------------
    fir_filter_reset(); // Reset filter before starting on ADC2

    // Decimation. Downsampling by picking every 5th sample
    for(i = 0, j = 0; i < ADC_BUF_SIZE; i++)
    {
        for(n = 0; n < 5; n++, j++)
        {
            if(__DMABuf1__[j] & 0x2000)
                // Sign bit (the 14th bit) is set: mask out the 13 LSBs of DMABuf1,
                // discarding the sign bit, and convert to a negative number.
                xn = -(8192 - (__DMABuf1__[j] & 0x1FFF));
            else
                // Sign bit is clear: mask out the 13 LSBs of DMABuf1
                xn = __DMABuf1__[j] & 0x1FFF;

            fir_filter.input=xn;
            fir_filter.calc(&fir_filter);
        }

        // "Convert output to Q30" (sort of) and store in adc1buf. ADC value is 14-bit (signed).
        __adcbuf__[i] = (long)fir_filter.output * ((long)1 << 17);
    }

    fft1024_calc((long *)__adcbuf__); // Calculate FFT for ADC2

fir_filter_reset() is a convenience function for resetting the filter (I thought this might be desirable, as the same filter is used for both ADCs in order to save resources):

    #pragma DATA_SECTION(fir_filter, "firfilt");
    FIR16 fir_filter = FIR16_DEFAULTS;

    // Cleans up the delay buffer
    // Re-calls the filter's init() function.
    void fir_filter_reset(void)
    {
        unsigned int i;

        // Clean up delay buffer
        for(i = 0; i < (FIR_ORDER_REV+3)/2; i++)
        {
            delay_buf[i]=0;
        }

        fir_filter.init(&fir_filter);
    }

fft1024_calc() is a convenience function for calculating a 1024-point FFT. Since the FFT's input is Q31, I multiply the ADC value by 2^17. The buffer containing the input signal is also used to store the output:

    void fft1024_calc(long *signal)
    {
        unsigned int i;

        // Clean up computation buffer
        for(i=0; i < 2*N; i++)
        {
            fft_comp_buf[i] = 0;
        }

        RFFT32_brev(signal, fft_comp_buf, N);   /* real FFT bit reversing */

        fft.magptr=signal;                      /* Magnitude output buffer */
        fft.winptr=(long *)hamming_win;         /* Window coefficient array */
        fft.win(&fft);
        fft.calc(&fft);
        fft.mag(&fft);
    }

An init function for the FFT is called at startup:

    void fft1024_init(void)
    {
        fft.ipcbptr=fft_comp_buf;               /* FFT computation buffer */
        fft.winptr=(long *)hamming_win;         /* Window coefficient array */
        fft.init(&fft);                         /* Twiddle factor pointer initialization */
    }

Now, I am having some performance issues. To test how fast the code is, I set a GPIO when it starts and reset it when it has finished for both ADCs. Then I measure the time it takes by connecting an oscilloscope probe to the GPIO.
According to the benchmark info in the C28x fixed-point DSP documentation, my 1024-point FFT should take about 1/4 of the time it takes to run 5120 samples through a 16-bit FIR filter. In practice the FFT takes almost as long to execute, probably because I put the twiddle factors in flash, as I was not able to make room for them in RAM. I knew it would have an effect on performance, but not by that much.

Question 2: How much memory do the twiddle factors actually require? The documentation is a bit ambiguous; it is mentioned on p. 17 in the C28x fixed point library documentation that "Twiddle factors are assembled into “FFTtf” section and contains 768 entries (32-bit words) to facilitate complex FFT computation of upto 1024 points.", but on p. 20 it is said that "The section “FFTtf” holds 4096 twiddle factors, each 32-bits or 2 words wide. A total of 8192 (0x2000) contiguous words need to be allotted this section in memory. When running in emulation mode it may be necessary to allocate an entire RAM block in the linker command file." Which is correct?

Question 3: This is really the most important question. Is there a way to speed up the FIR filter? It seems very inefficient to loop through the buffer and call the calc() function for each sample. I chose the fixed-point library because its FIR filter was faster than the floating-point version, but it is still just too slow. If I am able to move some memory around to speed up the FFT, then the FIR filter will be the limiting factor in terms of performance.

Question 4: The documentation does not say a lot about the window function for the FFT. My Hamming window seems to be working, but I would like to make sure that I have done it the correct way. I assume the win(), calc() and other FFT functions should be called in the order I have used in fft1024_calc().
Hi Simon,
Simon Voigt Nesbo wrote:
> Is there perhaps a faster/better way to do the above?

    xn = ((int)(__DMABuf1__[j] << 2) >> 2);

should sign-extend the 14-bit number to 16 bits.

Simon Voigt Nesbo wrote:
> The documentation is a bit ambiguous; it is mentioned on p. 17 in the C28x fixed point library documentation that "Twiddle factors are assembled into “FFTtf” section and contains 768 entries (32-bit words) to facilitate complex FFT computation of upto 1024 points.", but on p. 20 it is said that "The section “FFTtf” holds 4096 twiddle factors, each 32-bits or 2 words wide. A total of 8192 (0x2000) contiguous words need to be allotted this section in memory. When running in emulation mode it may be necessary to allocate an entire RAM block in the linker command file." Which is correct?

You are correct that the statement is ambiguous and incorrect; we have already flagged it for correction in the next release. Basically, for any N-point FFT you need N - N/4 twiddle factors for our implementation of the FFT (theoretically you only need N/2 because of symmetry). Since the largest FFT you can do with the fixed-point lib is 4096 points, we have a table of 3072 twiddle factors (each 32-bit complex), so we require a maximum space of 3072 * 2 (complex) * 2 (32-bit) = 0x3000 16-bit words. For N = 1024, you will need 768 twiddle factors.

Having the twiddles in flash will slow down the FFT; you have to factor in the flash wait states. I have attached a twiddle factor generator script to this post, which you can use to generate the smaller table, i.e. 768 entries, in Q30 format. The name of the generated table will be TF_Q30. You will need to:

1. Include the header file in your project.
2. Assign the table to the FFTtf section: #pragma DATA_SECTION(TF_Q30,"FFTtf")
3. When creating the FFT structure in code, use this macro (the changes from the default initialization macro are the commented lines):

    #define CFFT32_1024P_NEW_DEFAULTS { \
        (long *)NULL, \
        (long *)NULL, \
        1024, \
        10, \
        (long *)NULL, \
        (long *)NULL, \
        0, \
        0, \
        1,                        /* skip factor is 1, not 4 */ \
        (void (*)(void *))NULL,   /* NOTICE: the init function is not assigned here */ \
        (void (*)(void *))FFT32_izero, \
        (void (*)(void *))FFT32_calc, \
        (void (*)(void *))NULL, \
        (void (*)(void *))NULL }

Simon Voigt Nesbo wrote:
> This is really the most important question. Is there a way to speed up the FIR filter? It seems very inefficient to loop through the buffer and call the calc() function for each sample. I chose the fixed-point library because its FIR filter was faster than the floating-point version, but it is still just too slow. If I am able to move some memory around to speed up the FFT, then the FIR filter will be the limiting factor in terms of performance.

The filter was designed to take in samples from the ADC in real time and use a delay line to calculate one output each sampling period. If you are buffering data and want to filter blockwise, it might be easier to write a convolution routine.

Simon Voigt Nesbo wrote:
> The documentation does not say a lot about the window function for the FFT. My Hamming window seems to be working, but I would like to make sure that I have done it the correct way. I assume the win(), calc() and other FFT functions should be called in the order I have used in fft1024_calc().

The windowing function, at present, is meant to window real data and store it in bit-reversed order. We are revising this function to work with complex data. I would feed in shifted impulses and check that they appear at the correct bit-reversed addresses.
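To make the sign-extension and blockwise-filtering suggestions concrete, here is a minimal sketch in plain C. It is not part of the TI fixed-point library: the names sign_extend14 and fir_decimate_block, the Q15 coefficient format, the 32-bit accumulator output and the zero initial history are all assumptions made for illustration. The sign extension uses a mask/XOR/subtract form rather than the shift pair, only because that form does not depend on how a compiler right-shifts negative values. The filter computes only every decim-th output, so with a decimation by 5 about four fifths of the multiply-accumulates of a sample-by-sample loop are never performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 14-bit two's-complement sample stored in the low bits of a
 * 16-bit bus word. Bits 14 and 15 are masked off first. */
static int16_t sign_extend14(uint16_t raw)
{
    int32_t v = (int32_t)(raw & 0x3FFFu);     /* keep the 14 data bits */
    return (int16_t)((v ^ 0x2000) - 0x2000);  /* flip sign bit, subtract 8192 */
}

/* Hypothetical block FIR with decimation (direct convolution).
 *   x      : n_in raw bus words (14-bit samples, not yet sign-extended)
 *   h      : n_taps coefficients, assumed to be Q15
 *   decim  : decimation factor (5 in the question)
 *   y      : receives one 32-bit accumulator per output, still Q15-scaled
 * Outputs are produced at input indices decim-1, 2*decim-1, ... which matches
 * "run decim samples through the filter, keep the last output".
 * Samples before x[0] are treated as zero.
 * Assumes the sum of |h| is small enough that the accumulator cannot overflow.
 * Returns the number of outputs written. */
static size_t fir_decimate_block(const uint16_t *x, size_t n_in,
                                 const int16_t *h, size_t n_taps,
                                 size_t decim, int32_t *y)
{
    size_t n_out = 0;
    size_t n, k;

    assert(decim > 0 && n_taps > 0);

    for (n = decim - 1; n < n_in; n += decim) {
        int32_t acc = 0;
        /* y[n] = sum over k of h[k] * x[n - k], skipping taps before x[0] */
        for (k = 0; k < n_taps && k <= n; k++) {
            acc += (int32_t)h[k] * (int32_t)sign_extend14(x[n - k]);
        }
        y[n_out++] = acc;
    }
    return n_out;
}
```

With the numbers from the question this would be called once per ADC with n_in = 5120 and decim = 5, giving 1024 outputs. Whether it actually beats the library's assembly FIR on the C28x would have to be measured.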
Regards,
Vishal
Forgot the attachment
6433.C28xFixedPointLib_Twiddle_Factor_Generator.m
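As a quick check of the sizing arithmetic given in the reply (N - N/4 entries, each counted as complex and 32-bit), here is a small sketch in C. The function names are made up for illustration, and the layout of the table that the generator script actually emits is assumed rather than verified, so treat the result as an estimate for the FFTtf section size.

```c
#include <stdint.h>

/* Number of twiddle entries for an n_fft-point FFT, using the N - N/4 rule
 * stated in the reply. */
static uint32_t twiddle_entries(uint32_t n_fft)
{
    return n_fft - n_fft / 4u;
}

/* Size in 16-bit words, following the reply's formula:
 * entries * 2 (complex) * 2 (each 32-bit value is two 16-bit words). */
static uint32_t twiddle_words16(uint32_t n_fft)
{
    return twiddle_entries(n_fft) * 2u * 2u;
}
```

Under these assumptions a 1024-point table has 768 entries and needs 0xC00 16-bit words, compared with 0x3000 words for the full 4096-point table.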
Thank you Vishal, that's what I was looking for.
Is there a word wrap function for this forum by the way? Some of your post goes outside of the screen and can't be seen.
Oddly enough, it comes out word-wrapped in email if you subscribe to the forum.
Sorry about that, I should really preview before I post. I couldn't find a word wrap option anyway, so I edited and reposted.