Performance issue: DSPF_sp_fftSPxSP function

Other Parts Discussed in Thread: TMS320DM8148

Hello, 

cfg_and_bld.rar

I have recently run into an FFT performance issue.

As the following results show, there is a large gap between the theoretical cycle count (cycles(soll), the expected value) and the measured one (cycles(ist)).

Performance is especially poor for an input length of 32768.

FFT benchmark for 500000000 Hz
DSPLIB SPxSP (1024) cycles(soll)=14464.0000 cycles(ist)=32799.0000 cycles(ratio)=0.4409 time=0.0655 ms
DSPLIB SPxSP (4096) cycles(soll)=69781.0000 cycles(ist)=145969.0000 cycles(ratio)=0.4780 time=0.2919 ms
DSPLIB SPxSP (8192) cycles(soll)=164010.0000 cycles(ist)=439627.0000 cycles(ratio)=0.3730 time=0.8792 ms
DSPLIB SPxSP (16384) cycles(soll)=327850.0000 cycles(ist)=1020660.0000 cycles(ratio)=0.3212 time=2.0413 ms
DSPLIB SPxSP (32768) cycles(soll)=753855.0000 cycles(ist)=5856574.0000 cycles(ratio)=0.1287 time=11.7131 ms
DSPLIB cfftr2(32768) cycles(soll)=983082.0000 cycles(ist)=3171030.0000 cycles(ratio)=0.3100 time=6.3420 ms

I'm using a DM814x, and the DSPF_sp_fftSPxSP function comes from dsplib674x.h.

Program code and data are placed in external memory.

The MAR registers are configured, and the cache appears to be working as well.

The cfg and bld files are attached.

Could you please help me with this? What else should I check?

  • Mark,

    Please go through the below links:

    http://processors.wiki.ti.com/index.php/C674x_DSPLIB#DSPF_sp_fftSPxSP_.28Mixed_Radix_Forward_FFT_with_Bit_Reversal.29
    http://processors.wiki.ti.com/index.php/C674x_DSPLIB_Known_Issues
    http://processors.wiki.ti.com/index.php/MexExample
    http://processors.wiki.ti.com/index.php/OpenMP_Accelerator_Model_User%27s_Guide
    http://processors.wiki.ti.com/index.php/C6Accel_Signal_Processing_API_Reference_guide#int_C6accel_DSPF_sp_fftSPxSP_.28Floating_point_Mixed_Radix_Forward_FFT_with_Bit_Reversal.29

    http://e2e.ti.com/support/embedded/tirtos/f/355/t/213819
    http://e2e.ti.com/support/embedded/tirtos/f/355/t/319936

    Best regards,
    Pavel
  • Pavel,

    Thank you for your advice. 

    There is still one thing I don't understand: going from N=16384 to N=32768, the measured cycle count increases by more than 5 times!

    Could you tell me the reason for this?

    Best regards,

    Mark

  • Mark,

    Mark Stojecki said:

    There is still one thing I don't understand: going from N=16384 to N=32768, the measured cycle count increases by more than 5 times!

    Could you tell me the reason for this?

    Can you provide the exact steps to reproduce this issue on my side?

    BR
    Pavel

  • Pavel,

    I'm using:

    TMS320DM8148 chip

    dsplib674 v1.03.00.00, I think (please see the attachment, where I have included the library)

    DEPOT = /opt/ti/ezsdk_5_05_02_00

    BIOS_INSTALL_DIR = $(DEPOT)/component-sources/bios_6_33_05_46
    CGT_C674_ELF_INSTALL_DIR= $(DEPOT)/../C6000CGT7.4.2
    DSPLIB_INSTALL_DIR = $(DEPOT)/../c674x-dsplib
    EDMA3_INSTALL_DIR = $(DEPOT)/component-sources/edma3lld_02_11_05_02
    IPC_INSTALL_DIR = $(DEPOT)/component-sources/ipc_1_24_03_32
    SYSLINK_INSTALL_DIR = $(DEPOT)/component-sources/syslink_2_20_02_20
    XDC_INSTALL_DIR = $(DEPOT)/component-sources/xdctools_3_23_03_53

    I simply run the benchmark function (test_dsplib_SPxSP) for various values of uiN, one after another, inside a normal task.

            uiN=1024;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );
    
            uiN=2048;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );
    
            uiN=4096;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );
    
    
            uiN=8192;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );
    
    
            uiN=16384;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );
    
    
            uiN=32768;
    
            fCurrentTime = test_dsplib_SPxSP( uiN );
            System_printf( "DSPLIB SPxSP (%d) cycles(soll)=%f cycles(ist)=%f cycles(ratio)=%f time=%f ms \n",
                            uiN, cycles_SPxSP(uiN), fCurrentTime,
                            cycles_SPxSP(uiN)/fCurrentTime,
                            fCurrentTime/((float)(oMachineSpeed.lo/1000.0)) );

    You can find the test_dsplib_SPxSP function in the attachment. Thank you for your help.

    c674x-dsplib.rar

    benchmark.rar

  • Any answer, please? I'm wondering whether these benchmark results are normal for this DSP.

  • Can you try with the latest DSPLIB version of 3_4_0_0?