Performing FFT on TM4C1294XL MCU using CMSIS DSP Library

Howdy, 

I am working on my senior design project, and I need the MCU named in the subject line to perform an FFT on real-valued data sampled from an external source (let's say a function generator).

I followed the application note by Amit (from TI) on how to build the ARM CMSIS DSP library in Code Composer Studio for the M4 processors. The library compiled successfully, and the examples that ARM ships with it ran correctly.

Application note:

www.ti.com/.../spma041g.pdf

Now, I am stuck.

I need to essentially do the following:

1. Sample an external analog signal through the MCU's ADC (fs = 1 MHz, number of samples = 512).

2. Feed that data into the CMSIS DSP real FFT function to get the energy per frequency bin.

3. Have some way to export or view this data so I can verify that the MCU is indeed performing an FFT (can CCS export data to a plot, perhaps?).

I will try my best to reply to any questions as fast as possible.

I have no idea where to begin. Sampling real signals and performing FFTs on MCUs is not unheard of, but I would appreciate some guidance on how to accomplish this (coding/syntax-wise).

Thank you all in advance!

  • Hello Scott,

    Excellent post on putting out the problem statement. I wish I could share the project with you that does the same for audio :-)

    Since it's your senior project, I would gladly help you though. Some points to seed:

    1. An ADC example is available in TivaWare under examples/peripherals/adc/single_ended.c. This will help you establish sampling.
    2. Since you may be sampling continuously, it would make sense to have two buffers for storing the samples. The CPU uses buffer 1 for the FFT while buffer 2 is being written with ADC data, and vice versa.
    3. You would first need to establish that the time it takes to acquire 512 ADC samples is greater than the time it takes to run the FFT analysis on those 512 samples.
    4. CCS has a plotter for both time and frequency domain as one of its features.

    Regards,
    Amit
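
The two-buffer scheme from point 2 could look roughly like the sketch below. It is plain C with no TivaWare calls, and every name in it (`on_adc_sample`, `take_full_buffer`, `overrun_detected`) is hypothetical. The assumption is that the ADC interrupt handler (or a DMA-complete handler adapted to whole blocks) delivers the samples, while the main loop polls for a full buffer and runs the FFT on it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SAMPLES 512u

static uint16_t buffers[2][NUM_SAMPLES];    /* ping and pong */
static volatile uint32_t write_buf = 0;     /* buffer currently being filled */
static volatile uint32_t write_pos = 0;     /* next free slot in that buffer */
static volatile bool buffer_ready = false;  /* a full buffer awaits processing */
static volatile bool overrun = false;       /* a full buffer was never picked up */

/* Hypothetical hook: call once per ADC sample, e.g. from the ADC ISR. */
void on_adc_sample(uint16_t sample)
{
    uint32_t pos = write_pos;

    buffers[write_buf][pos] = sample;
    pos++;
    if (pos == NUM_SAMPLES) {
        if (buffer_ready) {
            overrun = true;     /* previous full buffer was not consumed in time */
        }
        pos = 0;
        write_buf ^= 1u;        /* continue filling the other buffer */
        buffer_ready = true;
    }
    write_pos = pos;
}

/* Main-loop side: returns the most recently filled buffer, or NULL if none is pending. */
const uint16_t *take_full_buffer(void)
{
    if (!buffer_ready) {
        return NULL;
    }
    buffer_ready = false;
    return buffers[write_buf ^ 1u];
}

/* True if the main loop ever failed to pick up a full buffer before the next one completed. */
bool overrun_detected(void)
{
    return overrun;
}
```

The main loop has to finish with a buffer before the other one fills up, which is exactly the timing condition in point 3. On real hardware the flag handling in `take_full_buffer` would also need protecting against the interrupt, which this sketch leaves out.
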
  • The CMSIS DSP lib example named "arm_fft_bin_example" in .../Libraries/CMSIS/DSP_Lib/Examples/arm_fft_bin_example would be a good starting point (this is at least the relevant path on my system, but not for a TI MCU ...). This example does a 2048 point transformation on one predefined input data block, and checks for the expected result. You would need to reconfigure it for 512 points, and call the transformation in a loop, triggered by a "sample buffer full" event (might be interrupt or DMA). Not sure about the TI version, I have mostly worked with the DSP lib for a competitor's device ...

    But 512 points at 1Msps might be a bit challenging. I measured about 4ms for a 2048 point FFT (80MHz, TM4C). I think the FFT itself is doable, but not much time left for anything else. As Amit suggests, evaluate the FFT timing first. Fortunately, the CMSIS routines proved quite optimization-proof (i.e. produce correct result even with maximal/aggressive optimization settings).

  • Hello f.m.

    I checked the 512 point FFT on TM4C123 at 80MHz from my notes and it turns out that it takes 1.2449 ms for CFFT to Energy per bin calculations. Considering 512 samples at 1 MHz (0.512 ms to acquire), the FFT calculation time is almost 2.5 times the sampling time for the required samples.

    Regards
    Amit
  • FFTs apparently scale as n log n, so your 2048-point example should take 2.4x as long as the 512 points suggested by the OP. Not optimistic about continuous operation succeeding.

    Robert
  • Hi Amit,

    I checked the 512 point FFT on TM4C123 at 80MHz from my notes and it turns out that it takes 1.2449 ms for CFFT to Energy per bin calculations.

    this confuses me.

    I was referring to a 2048 point FFT (the slightly adapted DSP lib example, with GPIO toggling to measure the runtime with a scope). That was actually with an LM4F Launchpad (before "rebranding"), with Crossworks 2.4 (gcc based), and -O3 optimization level.

    Perhaps you can try a more aggressive optimization (-O2 / -O3), but generally I agree - there will not be much leeway for other tasks. And IMHO the most would be gained by relaxing the sample frequency requirements.

  • Hello f.m.

    I was referring to the 512 point FFT that I did, for which I made time measurements of the different steps of the CMSIS DSP FFT. I agree that the sampling frequency must be relaxed.

    Regards
    Amit
  • It fits with what you would expect from Big O timing, f.m. A 512 pt FFT would be expected to take ~40% of the time of a 2048 pt FFT.

    Robert

  • I would have expected less. According to the N*log(N) scaling, it should be about 20%. However, I had compared several MCUs with the same code - and the given 2048 points. And for my audio application (max. 44100Hz), the CFFT consumed only about 10% CPU time.
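
The 20% figure can be checked with a few lines of C. This is only a sketch of the N*log2(N) rule of thumb, not a timing model of the CMSIS routines; the function names are made up for illustration, and both lengths are assumed to be powers of two.

```c
#include <stdint.h>

/* log2 of a power of two, computed by counting shifts. */
static uint32_t log2_pow2(uint32_t n)
{
    uint32_t bits = 0;

    while (n > 1u) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Estimated cost of an n_small-point FFT relative to an n_large-point FFT,
 * assuming the cost grows as N * log2(N). */
double fft_cost_ratio(uint32_t n_small, uint32_t n_large)
{
    double small = (double)n_small * (double)log2_pow2(n_small);
    double large = (double)n_large * (double)log2_pow2(n_large);

    return small / large;
}
```

For 512 versus 2048 points this gives (512 * 9) / (2048 * 11) = 4608 / 22528, or about 0.20. Applied to the roughly 4 ms measured above for 2048 points, the rule of thumb predicts something near 0.8 ms for 512 points, as a rough estimate only.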

  • Yep, I miscalculated. That gets us to 40% of the processor dedicated to the fft calculation. Overhead might add enough to be an issue, but it is not obviously condemned to failure.

    Robert
  • Hello f.m.

    Question: How are you ascertaining 10% of the CPU time for 2048 point FFT?

    Regards
    Amit
  • Question: How are you ascertaining 10% of the CPU time for 2048 point FFT?

    I mean, in relation to the time it takes to fill one buffer of 2048 items. This buffer, btw, was filled by a Timer->ADC->DMA chain, giving a "transfer complete" interrupt after 2048 items. Actually, the load was even less than 10%, for an audio signal of 44100 Hz max. sampling rate (2.5ms per FFT vs. 46ms to fill the buffer). For 1 MHz, that would become a challenge. Even the "channeling out" of that amount of data is non-trivial. If I was the O.P., I would look for alternatives - either relaxed specs, or other hardware (DSP?).
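
The load figure here is the FFT time divided by the time it takes to fill one buffer. A minimal sketch of that arithmetic in C, with hypothetical function names, using only the numbers quoted in this thread:

```c
#include <stdint.h>

/* Time in milliseconds needed to acquire num_samples at sample_rate_hz. */
double buffer_fill_time_ms(uint32_t num_samples, double sample_rate_hz)
{
    return 1000.0 * (double)num_samples / sample_rate_hz;
}

/* Fraction of each buffer period spent in the FFT.
 * Values above 1.0 mean the FFT cannot keep up with continuous sampling. */
double fft_load(double fft_time_ms, uint32_t num_samples, double sample_rate_hz)
{
    return fft_time_ms / buffer_fill_time_ms(num_samples, sample_rate_hz);
}
```

With 2048 samples at 44100 Hz the buffer takes about 46.4 ms to fill, so a 2.5 ms FFT is a load of roughly 5%. With 512 samples at 1 MHz the buffer fills in 0.512 ms, and the 1.2449 ms figure quoted earlier gives a load of about 2.4, meaning the FFT falls behind.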

  • Hello f.m.

    OK, so the load was measured as processing time relative to total sampling time, assuming that the CPU was idle for the rest of the period (sampling time - FFT time). I would in fact suggest a codec like the TLV320AIC, which has the filtering done internally.

    Regards
    Amit
  • Or a different algorithm entirely, i.e. ditch the FFT in favour of something else.

    Robert
  • I agree.

    Even with FFT, there are other implementations, perhaps faster, more specialized ones.

    However, that (to repeat myself) depends on the O.P.'s requirements for the project. There are several ways to go.

  • Hi Amit,

    that project, a kind of hobby project of mine, had been "finished" about two years ago. It was an audio spectrum analyzer for a specific hardware I had (based on a STM32F407 with 320x240 LCD). And the core wasn't too idle in between FFT processing - that board came with a closed-source graphics driver, so I had to write everything from scratch, including the low-level LCD interfacing code. With a signal display, a level indicator and the spectral display, this graphics code took about 20ms per cycle to redraw - much more than the FFT itself. That's one reason why I think a 1MHz-realtime-FFT approach is likely bound to fail, at least in this M4 realm ...

  • Hello f.m.,

    A 1 MHz realtime FFT is not going to be achievable anyway, considering the time the FFT takes vs. the sampling time for the number of samples. The user has to relax the requirement (well, we have not heard much from the original poster).

    Regards
    Amit
  • Sorry for the delay Amit,

    I followed the example workbook for the 1294XL and completed lab 6, which essentially continuously samples 3 analog inputs and then plots them in CCS. So essentially I have completed points 1 and 4 from your suggestions.

    I know now that I need to loop 512 times to "build" the input data array, then call some of the CMSIS DSP functions to create the frequency spectrum data. That should produce a new "frequency spectrum" array, which I can then plot in CCS.

    Forgive me for being an amateur, I am still learning how to program MCUs. I have found example code online where the CMSIS DSP library is implemented on a Stellaris MCU, but the code is muddled with graphics code which I don't quite understand.

    Any guidance on how to build these arrays, and call these specific functions would be appreciated!

  • Hello Scott

    The CMSIS DSP examples are not tied to a device or vendor. You can create an array of ADC sampled values like ADCInput[512]. In the CMSIS installation, the example should be at the following path:

    CMSIS\DSP_Lib\Examples\arm_fir_example

    where you may replace the 10K input stream with the ADCInput array. Note that while reading the values from the ADC into the array, typecast them to float32_t.

    Regards
    Amit
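
For the FFT route discussed earlier in the thread, building the arrays and calling the library might look like the sketch below. Several things are assumptions rather than anything confirmed here: the ADC is treated as 12-bit with mid-scale meaning zero signal, the helper names are invented, `USE_CMSIS_DSP` is a hypothetical build flag that keeps the library calls out of a host build, and the installed CMSIS DSP version is assumed to provide `arm_rfft_fast_f32`. The scaling in `adc_counts_to_float` is optional; a plain typecast as suggested above also works.

```c
#include <stddef.h>
#include <stdint.h>

#define FFT_SIZE      512u   /* number of ADC samples per transform */
#define ADC_MID_SCALE 2048   /* assumption: 12-bit ADC, mid-scale = zero signal */

/* Convert raw ADC counts to floats in roughly [-1, 1), removing the DC offset. */
void adc_counts_to_float(const uint32_t *raw, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = ((float)raw[i] - (float)ADC_MID_SCALE) / (float)ADC_MID_SCALE;
    }
}

/* Centre frequency in Hz of FFT bin `bin` for sample rate fs_hz and FFT length n. */
float bin_to_hz(uint32_t bin, float fs_hz, uint32_t n)
{
    return (float)bin * fs_hz / (float)n;
}

#ifdef USE_CMSIS_DSP   /* hypothetical flag: define it only for the target build */
#include "arm_math.h"

static arm_rfft_fast_instance_f32 fft_instance;
static float32_t fft_output[FFT_SIZE];       /* interleaved real/imaginary pairs */
static float32_t magnitude[FFT_SIZE / 2u];   /* one value per frequency bin */

/* Call once at start-up; returns ARM_MATH_SUCCESS if the length is supported. */
arm_status spectrum_init(void)
{
    return arm_rfft_fast_init_f32(&fft_instance, FFT_SIZE);
}

/* Transform one block of FFT_SIZE samples and return the strongest bin.
 * The input array may be overwritten by the library. */
uint32_t spectrum_compute(float32_t *samples)
{
    float32_t peak_value;
    uint32_t peak_index;

    arm_rfft_fast_f32(&fft_instance, samples, fft_output, 0);  /* 0 = forward FFT */
    arm_cmplx_mag_f32(fft_output, magnitude, FFT_SIZE / 2u);

    /* Bin 0 packs the DC and Nyquist terms together, so search from bin 1. */
    arm_max_f32(&magnitude[1], FFT_SIZE / 2u - 1u, &peak_value, &peak_index);
    return peak_index + 1u;
}
#endif
```

The `magnitude` array is the "frequency spectrum" array to point the CCS plotter at, and `bin_to_hz` turns the returned bin index into a frequency: at 1 MHz and 512 points each bin is 1953.125 Hz wide. If energy rather than magnitude is wanted, `arm_cmplx_mag_squared_f32` takes the same arguments.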