
    Fixed point DSP - FIR filter performance, etc.

    This question is answered
    Posted by Simon Voigt Nesbo on Aug 14 2012 10:15 AM
    Hi.
    
    For a customer project I am using DMA and XINTF to sample two 14-bit ADCs at 12.5 MHz.
    The ADCs are simply connected to the data bus, and the address bus is not in use. Once the
    DMA has filled a buffer with the sample data, I mask out the relevant bits to extract the
    data for each individual ADC. Then, for each ADC, I run the data through an FIR filter with
    a cutoff frequency of 1.25 MHz, and then downsample to 2.5 MHz by taking every 5th sample.
    This leaves us with 1024 samples. After downsampling I calculate the FFT of the signal.

    I have used the fixed point DSP library because, according to the benchmark info provided in
    the documentation, the fixed point functions were faster than the floating point ones.
    The processor used is the TMS320F28335.

    The format from the ADC is a 14-bit signed integer. I read this into a regular int, xn.
    Obviously xn is not 14 bits wide, and I can't think of a good way of reading the 14 bits
    directly from the DMA buffer into xn while getting the sign bit and everything else in xn
    set correctly. The current code for reading the data from one ADC into xn is as follows:

        if(__DMABuf1__[j] & 0x2000)
            // Sign bit (bit 13, the 14th bit) is set: keep the 13 LSBs of DMABuf1
            // and convert to a negative number.
            xn = -(8192 - (__DMABuf1__[j] & 0x1FFF));
        else
            // Sign bit is clear: keep the 13 LSBs of DMABuf1.
            xn = __DMABuf1__[j] & 0x1FFF;

    Question 1: Is there perhaps a faster/better way to do the above?

    The whole code for processing data from one ADC looks like this:
        unsigned int j;
        unsigned int i, n;

        // --------------------------------------------
        // ADC2
        // --------------------------------------------
        fir_filter_reset();    // Reset filter before starting on ADC2

        // Decimation. Downsampling by picking every 5th sample
        for(i = 0, j = 0; i < ADC_BUF_SIZE; i++)
        {
            for(n = 0; n < 5; n++, j++)
            {
                if(__DMABuf1__[j] & 0x2000)
                    // Sign bit (bit 13, the 14th bit) is set: keep the 13 LSBs of DMABuf1
                    // and convert to a negative number.
                    xn = -(8192 - (__DMABuf1__[j] & 0x1FFF));
                else
                    // Sign bit is clear: keep the 13 LSBs of DMABuf1.
                    xn = __DMABuf1__[j] & 0x1FFF;

                fir_filter.input = xn;
                fir_filter.calc(&fir_filter);
            }

            // "Convert output to Q30" (sort of) and store in adc1buf. ADC value is 14-bit (signed).
            __adcbuf__[i] = (long)fir_filter.output * ((long)1 << 17);
        }

        fft1024_calc((long *)__adcbuf__);    // Calculate FFT for ADC2

    fir_filter_reset() is a convenience function for resetting the filter (I thought this might
    be desirable, as the same filter is used for both ADCs in order to save resources):

        #pragma DATA_SECTION(fir_filter, "firfilt");
        FIR16 fir_filter = FIR16_DEFAULTS;

        // Cleans up the delay buffer.
        // Re-calls the filter's init() function.
        void fir_filter_reset(void)
        {
            unsigned int i;

            // Clean up delay buffer
            for(i = 0; i < (FIR_ORDER_REV+3)/2; i++)
            {
                delay_buf[i] = 0;
            }

            fir_filter.init(&fir_filter);
        }

    fft1024_calc() is a convenience function for calculating a 1024-point FFT. Since the FFT's
    input is Q31, I multiply the ADC value by 2^17. The buffer containing the input signal is
    also used to store the output:

        void fft1024_calc(long *signal)
        {
            unsigned int i;

            // Clean up computation buffer
            for(i = 0; i < 2*N; i++)
            {
                fft_comp_buf[i] = 0;
            }

            RFFT32_brev(signal, fft_comp_buf, N);    /* real FFT bit reversing */

            fft.magptr = signal;                     /* Magnitude output buffer */
            fft.winptr = (long *)hamming_win;        /* Window coefficient array */
            fft.win(&fft);
            fft.calc(&fft);
            fft.mag(&fft);
        }

    An init function for the FFT is called at startup:

        void fft1024_init(void)
        {
            fft.ipcbptr = fft_comp_buf;              /* FFT computation buffer */
            fft.winptr = (long *)hamming_win;        /* Window coefficient array */
            fft.init(&fft);                          /* Twiddle factor pointer initialization */
        }

    Now, I am having some performance issues. To test how fast the code is, I set a GPIO when it
    starts and reset it when it has finished for both ADCs. Then I measure the time it takes by
    connecting an oscilloscope probe to the GPIO.

    According to the benchmark info in the C28XX fixed point DSP documentation, my 1024-point FFT
    should take about 1/4 of the time it takes to run 5120 samples through a 16-bit FIR filter.
    In practice the FFT takes almost as long to execute, probably because I put the twiddle
    factors in FLASH, as I was not able to make room for them in RAM. I knew it would have an
    effect on performance, but not by that much.

    Question 2: How much memory do the twiddle factors actually require?
    The documentation is a bit ambiguous; it is mentioned on p. 17 in the C28x fixed point library
    documentation that "Twiddle factors are assembled into “FFTtf” section and contains 768
    entries (32-bit words) to facilitate complex FFT computation of upto 1024 points."
    , but on p. 20
    it is said that "The section “FFTtf” holds 4096 twiddle factors, each 32-bits or 2 words wide.
    A total of 8192 (0x2000) contiguous words need to be allotted this section in memory.
    When running in emulation mode it may be necessary to allocate an entire RAM block in
    the linker command file."
    Which is correct?

    Question 3: This is really the most important question. Is there a way to speed up the FIR filter?
    It seems very inefficient to loop through the buffer and call the calc() function
    for each sample. I chose the fixed point library because its FIR filter was faster than
    the floating point version, but it is still just too slow. If I am able to move some
    memory around to speed up the FFT, then the FIR filter will be the limiting factor in terms of
    performance.

    Question 4: The documentation does not say a lot about the window function for the FFT.
    My Hamming window seems to be working, but I would like to make sure that
    I have done it the correct way. I assume the win(), calc() and other FFT
    functions should be called in the order I have used in fft1024_calc().
    All Replies
    • Posted by Vishal_Coelho on Aug 14 2012 15:43 PM
      Verified Answer (verified by Simon Voigt Nesbo)

      Hi Simon,

      Simon Voigt Nesbo
      Is there perhaps a faster/better way to do the above?

      xn = ((int)(__DMABuf1__[j] << 2) >> 2); should sign extend the 14 bit number to 16 bits.
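
The one-liner above relies on int being 16 bits wide (as it is on the C28x) and on the compiler performing an arithmetic right shift on signed values. A minimal sketch of the same idea with fixed-width types, so it can also be checked on a PC, is shown here. The function names are hypothetical, and the XOR/subtract variant is an alternative that is not from this thread; it avoids depending on the shift behaviour.

```c
#include <stdint.h>

/* Reference: the branch-based conversion from the original post. */
static int16_t sx14_branch(uint16_t raw)
{
    if (raw & 0x2000u)
        return (int16_t)-(8192 - (int)(raw & 0x1FFFu));
    return (int16_t)(raw & 0x1FFFu);
}

/* Shift-based: move bit 13 up to bit 15, reinterpret as signed, then shift
 * back down. Assumes two's complement and an arithmetic right shift, which
 * is what common compilers provide. */
static int16_t sx14_shift(uint16_t raw)
{
    return (int16_t)((int16_t)(uint16_t)(raw << 2) >> 2);
}

/* XOR/subtract: flip the sign bit, then subtract its weight. No shifts of
 * negative values are involved. */
static int16_t sx14_xor(uint16_t raw)
{
    return (int16_t)((int)((raw & 0x3FFFu) ^ 0x2000u) - 0x2000);
}
```

All three ignore bits 14 and 15 of the raw word, so any unrelated data on the upper bus lines does not need to be masked first.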

      Simon Voigt Nesbo
      The documentation is a bit ambiguous; it is mentioned on p. 17 in the C28x fixed point library
      documentation that "Twiddle factors are assembled into “FFTtf” section and contains 768
      entries (32-bit words) to facilitate complex FFT computation of upto 1024 points."
      , but on p. 20
      it is said that "The section “FFTtf” holds 4096 twiddle factors, each 32-bits or 2 words wide.
      A total of 8192 (0x2000) contiguous words need to be allotted this section in memory.
      When running in emulation mode it may be necessary to allocate an entire RAM block in
      the linker command file."
      Which is correct?


      You are correct that the statement is ambiguous and incorrect; we have already flagged it
      for correction in the next release. Basically, for any N-point FFT you need N - N/4 twiddle
      factors for our implementation of the FFT (theoretically you only need N/2 because of
      symmetry). Since the largest FFT you can do with the fixed point lib is 4096 points, we have
      a table of 3072 twiddle factors (each 32-bit complex), so we require a maximum space of
      3072*2(complex)*2(32-bit) = 0x3000 16-bit words. For N = 1024, you will need 768 twiddle
      factors. Having the twiddles in flash will slow down the FFT; you have to factor in the
      FLASH wait states.
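
The arithmetic above can be written out as a small sketch. The helper names are hypothetical, and it assumes, as the reply's own calculation implies, that each twiddle factor has a 32-bit real part and a 32-bit imaginary part, with sizes counted in 16-bit words.

```c
#include <stdint.h>

/* Number of complex twiddle factors for an N-point FFT, per the rule
 * quoted above: N - N/4. */
static uint32_t twiddle_count(uint32_t n)
{
    return n - n / 4u;
}

/* Table size in 16-bit words: 2 components (real, imaginary) per factor,
 * and 2 words per 32-bit component. */
static uint32_t twiddle_words16(uint32_t n)
{
    return twiddle_count(n) * 2u * 2u;
}
```

For N = 4096 this gives 3072 factors and 0x3000 words; for N = 1024 it gives 768 factors and 0xC00 words, a quarter of the full table.
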
      I have attached a twiddle factor generator script to this post, which you can use to
      generate the smaller 768-entry table in Q30 format.
      The name of the generated table will be TF_Q30. You will need to:
      1. Include the header file in your project.
      2. Assign the table to the FFTtf section: #pragma DATA_SECTION(TF_Q30,"FFTtf")
      3. When creating the FFT structure in code, use the macro below (changes from the default
      initialization macro are marked with comments):
      #define CFFT32_1024P_NEW_DEFAULTS    { (long *)NULL,\
              (long *)NULL,\
              1024,\
              10,\
              (long *)NULL,\
              (long *)NULL,\
              0,\
              0,\
              1,                          /* skip factor is 1, not 4 */ \
              (void (*)(void *))NULL,     /* NOTICE: the init function is not assigned here */ \
              (void (*)(void *))FFT32_izero,\
              (void (*)(void *))FFT32_calc,\
              (void (*)(void *))NULL,\
              (void (*)(void *))NULL}

      Simon Voigt Nesbo
      This is really the most important question. Is there a way to speed up the FIR filter?
      It seems very ineffective having to loop through the buffer and call the calc() function
      for each sample. I chose the fixed point library because its FIR filter was faster than
      the floating point version, but still it is just too slow. If I am able to move around some
      memory to speed up the FFT, then the FIR filter will be the limiting factor in terms of
      performance.


      The filter was designed to take in samples from the ADC in real time and use a delay line to
      calculate one output each sampling period. If you are buffering data and want to filter
      blockwise, it might be easier to write a convolution routine.
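
As a rough illustration of such a routine, here is a minimal sketch of a block convolution that also does the decimation. Because only every 5th output is kept, it computes just those outputs and skips the other four. The function name, the Q15 coefficient format, and the 64-bit accumulator are assumptions made for this sketch; it is plain C, not part of the TI library, and not tuned for the C28x.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: filter the block x[0..n_in-1] with the taps
 * h[0..n_taps-1] (assumed Q15) and keep one output per `decim` inputs.
 * Samples before the start of the block are treated as zero, which
 * corresponds to resetting the delay line. Returns the number of outputs. */
static size_t fir_decimate_q15(const int16_t *x, size_t n_in,
                               const int16_t *h, size_t n_taps,
                               size_t decim, int16_t *y)
{
    size_t n_out = 0;

    /* Visit only the input positions whose output is kept:
     * decim-1, 2*decim-1, ... (the last sample of each group). */
    for (size_t n = decim - 1; n < n_in; n += decim) {
        int64_t acc = 0;    /* wide accumulator so the sum cannot overflow */

        for (size_t k = 0; k < n_taps && k <= n; k++) {
            acc += (int32_t)h[k] * (int32_t)x[n - k];
        }

        acc >>= 15;         /* Q15 product back to the input scale */
        if (acc > INT16_MAX) acc = INT16_MAX;    /* saturate */
        if (acc < INT16_MIN) acc = INT16_MIN;
        y[n_out++] = (int16_t)acc;
    }
    return n_out;
}
```

With 5120 input samples and decim = 5 this produces 1024 outputs, at the same positions as the original loop that reads the filter output after every 5th sample.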

      Simon Voigt Nesbo
      The documentation does not say a lot about the window function for the FFT.
      My hamming window seems to be working, but I would like to make sure that
      I have done it the correct way. I assume the win(), calc() and other FFT
      functions should be called in the order I have done in fft1024_calc().


      The windowing function, at present, is meant to window real data and store it in bit-reversed
      order. We are revising this function to work with complex data. I would feed in shifted
      impulses and check that they appear at the correct bit-reversed addresses.
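
For the impulse check, a helper that computes the expected bit-reversed index may be useful. This is a minimal sketch of plain bit reversal over a given number of address bits. The function name is hypothetical, and how the library actually packs real input data in its buffer is not confirmed here, so the expected addresses should be compared against the library documentation.

```c
#include <stdint.h>

/* Reverse the lowest `bits` bits of `index`
 * (bits = 10 for a 1024-entry buffer). */
static uint16_t bit_reverse(uint16_t index, unsigned bits)
{
    uint16_t r = 0;

    for (unsigned b = 0; b < bits; b++) {
        r = (uint16_t)((r << 1) | (index & 1u));    /* shift in the next LSB */
        index >>= 1;
    }
    return r;
}
```

With 10 bits, an impulse at input index 1 would be expected at address 512, and one at index 2 at address 256, if the buffer uses plain bit reversal.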

      Regards,

      Vishal

    • Posted by Vishal_Coelho on Aug 14 2012 15:45 PM

      Forgot the attachment

      6433.C28xFixedPointLib_Twiddle_Factor_Generator.m

      Regards,

      Vishal

    • Posted by Simon Voigt Nesbo on Aug 16 2012 05:32 AM

      Thank you Vishal, that's what I was looking for.

      Is there a word wrap function for this forum by the way? Some of your post goes outside of the screen and can't be seen.

    • Posted by Vishal_Coelho on Aug 16 2012 15:34 PM

      Oddly enough, it comes out word-wrapped in email if you subscribe to the forum. Sorry about that, I should really preview before I post. I couldn't find a word wrap option anyway, so I edited and reposted.

      Regards,

      Vishal
