No instruction executes after CMSIS DSP Library FFT functions with Optimization Level 2

Hello Community Members

I am currently using the CMSIS DSP library in Code Composer Studio. I followed Amit's document on integrating the CMSIS DSP library into Code Composer Studio for the Tiva C series, and I can use these functions.

But I have a problem: after the CMSIS DSP FFT functions, no further code is executed. My code does not enter the while(1) loop. When I change the optimization level from 2 (Global Optimization) to off, it works fine. I wonder why this happens with this optimization setting.

#include <stdint.h>
#include <stdbool.h>
#include <math.h>
#include "inc/hw_memmap.h"
#include "inc/hw_types.h"
#include "driverlib/fpu.h"
#include "driverlib/sysctl.h"
#include "driverlib/rom.h"
#include "arm_math.h"
#include "arm_const_structs.h"

#define TEST_LENGTH_SAMPLES 2048

extern float32_t testInput_f32_10khz[TEST_LENGTH_SAMPLES];
static float32_t testOutput2[128];

uint32_t fftSize = 1024;
uint32_t ifftFlag = 0;
uint32_t doBitReverse = 1;

uint32_t refIndex = 213, testIndex = 0,testIndex2 = 0;

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define SERIES_LENGTH 256
#define Sampling_Freq 3200.0f
#define Line_Freq 50.0
float32_t gSeriesData[SERIES_LENGTH];
float fRadians;
int32_t i32DataCount = 0,success;
float32_t Fundamental_Freq;

void main()
{

    FPULazyStackingEnable();
    FPUEnable();
    SysCtlClockSet(SYSCTL_SYSDIV_4 | SYSCTL_USE_PLL | SYSCTL_XTAL_16MHZ | SYSCTL_OSC_MAIN);

    while(i32DataCount < SERIES_LENGTH)
    {
        gSeriesData[i32DataCount] = 2 * sinf( Line_Freq * (i32DataCount / Sampling_Freq) * 2 * M_PI);
        i32DataCount++;
        gSeriesData[i32DataCount] = 0;
        i32DataCount++;
    }

      float32_t maxValue,maxValue2;
      arm_cfft_f32(&arm_cfft_sR_f32_len128, gSeriesData, ifftFlag, doBitReverse);

      arm_cmplx_mag_f32(gSeriesData, testOutput2, 128);


      arm_max_f32(gSeriesData, 256, &maxValue2, &testIndex2);
      arm_max_f32(testOutput2, 128, &maxValue, &testIndex);

      // No instruction executes after this line; the code gets stuck at line 60
      while(1)
      {
          Fundamental_Freq = testIndex * Sampling_Freq/SERIES_LENGTH;
      }
}

Regards

  • Hello Serkan,

    Which compiler version are you using and where does the code get stuck?

    Regards
    Amit
  • Hello Amit

    I am using 6.1.0.00104 Version of Code Composer Studio.

    My code gets stuck at line 60 (while(1)) and does not enter the while(1) loop.

    Regards

    Serkan
  • Hello Serkan,

    Code optimization changes how the code behaves during debugging. Did you check, by viewing the variables, that the values are being evaluated correctly?

    Regards
    Amit
  • My code gets stuck at line 60 (while(1)) and does not enter the while(1) loop.

    Most probably because the optimizer has determined the content of the loop is pointless, and thrown it out.

    Fundamental_Freq = testIndex * Sampling_Freq/SERIES_LENGTH;

    In the debugger, check the assembler view to see what this line was actually compiled to.

    All items are constants or variables that were set earlier and never change. Use the volatile keyword to keep the compiler from discarding your code.

    The CMSIS code is fine - the same FFT function works for me with optimization level 3.
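
A minimal sketch of the volatile suggestion, not taken from the original project: the names g_fundamental_freq_hz, bin_to_hz and publish_fundamental are hypothetical and exist only for this illustration. Writing the result through a volatile object makes each store an observable side effect, so the compiler has to emit it even when nothing else in the program reads the value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical global for this sketch. Because it is volatile, every
 * write counts as an observable side effect and may not be optimized
 * away, even if the program never reads the value back. */
volatile float g_fundamental_freq_hz;

/* Mirrors the expression from the question:
 *   testIndex * Sampling_Freq / SERIES_LENGTH */
static float bin_to_hz(uint32_t bin, float fs_hz, uint32_t length)
{
    assert(length > 0u);
    return ((float)bin * fs_hz) / (float)length;
}

/* Store the result through the volatile global. */
void publish_fundamental(uint32_t peak_bin, float fs_hz, uint32_t length)
{
    g_fundamental_freq_hz = bin_to_hz(peak_bin, fs_hz, length);
}
```

On the target, a call such as publish_fundamental(testIndex, Sampling_Freq, SERIES_LENGTH) would sit inside the while(1) loop. How the loop looks when single-stepping at higher optimization levels still depends on the compiler, so the assembler view remains the reliable check.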

  • Hello Amit

    When the code runs with no optimization, all variables seem OK. With the optimizer, the FFT variables seem OK too, but as I said, no instruction after these functions is executed and the other variables keep their default values.

    Serkan
  • Hello f.m

    When I changed the variable Fundamental_Freq to a volatile variable, the line is executed only once, and after that the code gets stuck again.

    Serkan
  • When I changed the variable Fundamental_Freq to a volatile variable, the line is executed only once, and after that the code gets stuck again.

    So the variable is evaluated at least once, then.

    Have you checked in the assembler view WHAT the compiler made of the C code of this loop?

  • Hello Serkan

    I agree with f.m. Evaluation of the assembly code will tell what it is doing.

    Regards
    Amit
  • Hello f.m and Amit

    OK, I will do as you said.

    I would like to ask you something else.

    When I include the CMSIS DSP Library in my workspace, the memory allocation shows that this library takes 186k of FLASH. That's why I couldn't use it in my application. I would like to use only its FFT functions. How can I reduce this size?

    Serkan
  • Hello Serkan,

    When you use the precompiled library in a project, the linker will optimize away any unused functions.

    Regards
    Amit
  • And, it will unroll loops if the right side of an expression is constant - even if the left side (your variable) is "volatile".

    And regarding your size issue - that is a "special" feature of the CMSIS FFT example. It uses predefined static test (input) data, which will end up in Flash. The example then checks if the evaluated maximum refers to the expected bin.

    Remove this static test data array and provide your own buffer in RAM, and the used Flash size will reduce drastically.
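
A rough sketch of the Flash versus RAM point above, under the assumption that the toolchain places const data with static storage in a read-only section (usually Flash on Cortex-M parts) and zero-initialized non-const data in RAM. The names k_flash_table, g_ram_buffer, fill_ram_buffer and ram_sample are hypothetical, and the sizes are toy values rather than the ones from the CMSIS example.

```c
#include <assert.h>
#include <stddef.h>

#define RAM_BUFFER_LEN 256u

/* const with static storage: typically ends up in a read-only section
 * (.const or .rodata) and therefore costs Flash. Kept tiny on purpose. */
static const float k_flash_table[4] = {0.0f, 1.0f, 0.0f, -1.0f};

/* Non-const and zero-initialized: typically placed in .bss, so it
 * occupies RAM but adds nothing to the Flash image. */
static float g_ram_buffer[RAM_BUFFER_LEN];

/* Generate the input data at run time instead of storing a large
 * precomputed array in Flash. */
void fill_ram_buffer(void)
{
    for (size_t i = 0; i < RAM_BUFFER_LEN; i++) {
        g_ram_buffer[i] = k_flash_table[i % 4u];
    }
}

/* Bounds-checked read access to the RAM buffer. */
float ram_sample(size_t i)
{
    assert(i < RAM_BUFFER_LEN);
    return g_ram_buffer[i];
}
```

The actual placement is decided by the compiler and the linker command file, so the map file is the place to confirm where each object landed.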

  • Hello Amit

    I did everything as described in your CMSIS DSP integration document, and compiled the library before including it in my workspace.

    While creating the library as described in your document, it took 10-15 minutes to compile.

    I think that with the steps in your document, it became a precompiled library. Am I doing something wrong here?

    Regards

    Serkan
  • Hello f.m

    I removed the FFT's static array. There is now only my buffer, which holds the sinf values. But it is impossible that all of the 185 kB is related to this buffer.

    The size decreased by only 1 kB.

    When I look at the memory allocation, it seems that the CMSIS DSP .text section takes up a huge amount of space.

    Serkan
  • Hello Serkan,

    The steps are correct. I would check with f.m.'s last post as well.

    Regards
    Amit
  • Hello Serkan

    Did you check "remove unused section" options?

    Regards
    Amit
  • Hello Amit

    Where can I find this "remove unused section" option?

    Thanks

    Regards
    Serkan
  • Hello Serkan

    Right click the project, go to Properties

    Then under Build -> ARM Linker -> Advanced Options -> Linktime Optimization

    The first option there is "Eliminate sections not needed in the executable". Select ON in the drop-down menu.

    Regards
    Amit
  • Hello Amit

    I set this to "ON" and compiled my project again, but the result seems the same.

  • I'm not a CCS user, Amit can surely provide better instructions here.
    But with 185k, something definitely seems wrong. In my project involving FFT, the CMSIS part used just a few kBytes. However, I compiled just the (DSPlib) source files I needed. And my builds never took longer than 4..5 seconds (full build).

    If in doubt, check the map file to see where that Flash space goes. (You probably need to change the build/project settings to create a map file during the build ...)
  • Hello Serkan,

    Did you use the "Rebuild All" option instead of a simple Build? If yes, can you check which functions take up the most space in flash and whether they are actually being called?

    Regards
    Amit
  • Hello Amit

    Yes, I tried that too. I am sharing the memory allocation of my workspace. As you can see below, it seems that 177 kB of memory is used for .const variables.

    Regards

    Serkan

  • Hello Serkan

    That is what f.m. also highlighted. If the tables are not required, then you may want to selectively trim the table file.

    Regards
    Amit
  • Hello Amit

    Yes, I don't use most of them. How can I leave them out, then? Can you help me with this?

    Thanks

    Regards

    Serkan
  • Hello Serkan,

    You may need to remove them manually, or put #ifdef guards around the tables so that only the required entries are defined and used by your project. The overhead this adds is that whenever you change the main project for filter points, the library also needs to be recompiled.
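
The guard idea can be sketched as follows. This is an illustration only: the macros USE_TABLE_LEN4 and USE_TABLE_LEN8, the toy tables, and select_table are hypothetical and are not the names or contents used in arm_common_tables.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switches. In a real build they would come from the
 * project's predefined symbols instead of being defined in the source. */
#define USE_TABLE_LEN4 1
/* #define USE_TABLE_LEN8 1 */

#ifdef USE_TABLE_LEN4
/* Toy stand-in for a small coefficient table. */
static const float table_len4[4] = {1.0f, 0.0f, -1.0f, 0.0f};
#endif

#ifdef USE_TABLE_LEN8
/* Not compiled unless USE_TABLE_LEN8 is defined, so it costs no Flash. */
static const float table_len8[8] = {
    1.0f, 0.70710678f, 0.0f, -0.70710678f,
    -1.0f, -0.70710678f, 0.0f, 0.70710678f
};
#endif

/* Return the table for the requested length, or NULL if that table
 * was not enabled at build time. */
const float *select_table(size_t len)
{
#ifdef USE_TABLE_LEN4
    if (len == 4u) {
        return table_len4;
    }
#endif
#ifdef USE_TABLE_LEN8
    if (len == 8u) {
        return table_len8;
    }
#endif
    return NULL;
}
```

Every piece of code that refers to a guarded table needs a matching guard, which is why changing the required length means rebuilding the library.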

    Regards
    Amit
  • Hello Amit

    In summary, I should go back, change the CMSIS DSP Library, remove the unused parts, and recompile it. That way I will create a new library specific to my application, and I will include this new library in my workspace. Am I right?

    Regards
    Serkan
  • Hello Serkan

    Never remove the unused parts. Put them inside defines. I have not done this myself, as in my applications I care more about flexibility of change than about code size.

    Regards
    Amit
  • Hello Amit

    OK. Thanks for your suggestion. So I should put the unused parts inside defines, recompile the library as in your integration document, and include the new library in my workspace, right?

    Regards
    Serkan
  • Hello Serkan

    Yes. That should be OK. Again: I have not done any such optimizations, so to a large degree you would need to ascertain the impact of the change yourself.

    Regards
    Amit
  • Hello Serkan,

    Changing third-party code (like the source code of the CMSIS DSPlib) is not the intended way. Otherwise you have to redo all your changes with every new version. Rather, set up your project to include just what you need. For a gcc-like compiler, you could specify build options that put each function in a separate section, and data too (-ffunction-sections, -fdata-sections). That would reduce the amount of dead code.

  • Hello f.m

    I am really confused by this. I am not an experienced CCS user, which is why these terms seem unfamiliar to me.

    Could you suggest a guideline, an example, or anything else that would help me do it?

    Regards

    Serkan

  • Hello Amit

    I tried to create a new library and put defines around the unused parts and consts. But these variables are referenced by a lot of .c files in the library, so everything in the library ends in errors. I am in a really bad situation now.

    I just wonder how other users can use the FFT library with this code size. I am unable to work it out.

    Regards
    Serkan
  • I am not an experienced CCS user, which is why these terms seem unfamiliar to me.

    I am not a CCS user either, so I can't give very specific hints for this IDE. Nor did I use the TI-specific DSP-lib.

    However, from one of your posts above, it seems you linked in factor tables ("twiddle-factor" tables, FFT algorithm specific) for data types you don't use. Since you use float (single precision floating point), you can throw out (i.e. exclude from the build) everything named "q15" or "q31". The screenshot only showed constant tables, but I guess you also included the q15/q31 versions of the routines in your build. Exclude them, too.

    Perhaps this document helps: ti.com/lit/an/spma041g/spma041g.pdf

  • Hello Serkan,

    It depends on what functions are required. It is not going to be an easy task modifying the library, but persistence will still pay.

    Hello f.m.

    The DSP Lib is not TI specific. It is the generic CMSIS DSP library from ARM, only ported to the CCS IDE (TI ARM Compiler), as the CCS IDE was not supported by the original build.

    Regards
    Amit
  • Hello Amit,

    The DSP Lib is not TI specific.

    I'm aware that the "basic" DSP lib is generic, and not specific to TI. However, I am not sure how (if at all) it is repackaged with TI specifics like build files and peripheral driver code. For instance, the original lib contains code for M0 and M3 cores, which is irrelevant for TI. And other vendors tend to bundle it with their peripheral libs.

    Finally, the O.P. must come to grips with his IDE, build environment, and project setup. And that is usually learning-by-doing ...

  • To comment myself:

    The arm_common_tables.c file of the DSP lib does not contain any #ifdefs around the tables. You need to instruct the linker to remove all unused data/functions. On gcc-based systems, this is usually achieved with the linker flag "--gc-sections", in addition to the compiler flags -ffunction-sections and -fdata-sections.

    I suggest consulting the CCS documentation in this regard.
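
For a gcc-based toolchain, the flags named above could be tried on a small file like the sketch below. The file names (tables.c, main.c, app.elf, app.map), the arm-none-eabi-gcc driver name, and the table and function names are assumptions made for illustration. The sketch says nothing about the equivalent TI ARM compiler options.

```c
#include <assert.h>

/* Possible build steps on a gcc-based toolchain (file names hypothetical):
 *
 *   arm-none-eabi-gcc -Os -ffunction-sections -fdata-sections -c tables.c
 *   arm-none-eabi-gcc -Os -ffunction-sections -fdata-sections -c main.c
 *   arm-none-eabi-gcc -Wl,--gc-sections -Wl,-Map=app.map \
 *       tables.o main.o -o app.elf
 *
 * With one section per function and per data object, the linker can drop
 * every section that nothing references. The map file lists what stayed.
 */

/* Referenced by lookup_used(), so its section is kept. */
const float used_table[4] = {1.0f, 2.0f, 3.0f, 4.0f};

/* Large table that only lookup_unused() touches. If no caller ever
 * references lookup_unused(), both sections become candidates for removal. */
const float unused_table[1024] = {1.0f};

float lookup_used(unsigned i)
{
    assert(i < 4u);
    return used_table[i];
}

float lookup_unused(unsigned i)
{
    return unused_table[i % 1024u];
}
```

Comparing the map file with and without --gc-sections shows whether unused_table was actually discarded.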

  • Hello f.m.

    No change has been made to the original code. The only change is the addition of the files in the support package.

    Regards
    Amit
  • Hello Amit and f.m

    I finally reduced the code size by importing an older version of the CMSIS DSP Library.

    I used version 3.2 instead of the newest version. It is also big, but at least it fits in my application. The older version takes 100 kB instead of the 180 kB of the newest version. It works OK, but I will try to reduce this size further.

    Regards

    Serkan

  • Hello Amit again

    I would like to ask you something. I know I am asking a lot of things, but I need your experience :)

    My workspace shows advice to use optimization level 3, but in your document we are using optimization level 2.

    CCS says "Not all available code size is being used."

    Which optimization level should I use in my workspace? Could you give me some brief information about it?

    Thanks so much

    Regards
    Serkan

  • Hello Serkan,

    Every optimization step has an impact on performance and size. Please refer to the wiki on CCS Compiler Optimization. I would suggest posting the next question about optimization on the CCS Forum. (I generally stick to 2, as it is a tried and tested option on TM4C for both speed and code size.)

    processors.wiki.ti.com/.../Optimizer_Assistant

    Regards
    Amit