TMS320C6748: DSPF_sp_fftSPxSP function runs away when L2 cache is enabled

Part Number: TMS320C6748

I am testing with the DSPF_sp_fftSPxSP_674_LE_ELF project provided by TI.

The project directory is  ..\ti\dsplib_c674x_3_4_0_0\packages\ti\dsplib\src\DSPF_sp_fftSPxSP\c674\DSPF_sp_fftSPxSP_674_LE_ELF

In the project, the default "MAXN" is 256 (N is the FFT length), and the project does not enable the cache.

I use the following lines to enable the cache:

    CacheEnableMAR((unsigned int)0xC0000000, (unsigned int)0x10000000);
    CacheEnableMAR((unsigned int)0x80000000, (unsigned int)0x20000);
    CacheEnable(L1DCFG_L1DMODE_32K | L1PCFG_L1PMODE_32K  |  L2CFG_L2MODE_256K); 
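
For reference, the MAR ranges touched by the two CacheEnableMAR calls above can be worked out by hand. The sketch below assumes that each MAR register on the C674x controls cacheability for one 16 MB window, so the MAR index is simply the top 8 bits of the byte address. The names mar_span and mar_span_for are hypothetical, used for illustration only, and are not part of any TI library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one MAR register governs one 16 MB (2^24 byte) window. */
#define MAR_WINDOW_SHIFT 24u

/* Hypothetical result type: inclusive range of MAR indices. */
typedef struct {
    unsigned first;  /* index of the first MAR touched */
    unsigned last;   /* index of the last MAR touched  */
} mar_span;

/* Hypothetical helper: which MAR indices does [base, base + size) touch?
 * Assumes size > 0 and that the range does not wrap past 0xFFFFFFFF. */
static mar_span mar_span_for(uint32_t base, uint32_t size)
{
    mar_span s;
    uint32_t last_byte;

    assert(size > 0u);
    last_byte = base + (size - 1u);
    s.first = (unsigned)(base >> MAR_WINDOW_SHIFT);
    s.last  = (unsigned)(last_byte >> MAR_WINDOW_SHIFT);
    return s;
}
```

Under that assumption, mar_span_for(0xC0000000, 0x10000000) gives MAR192 to MAR207 for the DDR range, and mar_span_for(0x80000000, 0x20000) gives MAR128 only for the SHRAM range. Since the granularity is a whole window, the cacheable setting would apply to the entire 16 MB window starting at 0x80000000, not just to the 0x20000 bytes requested.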

I find that:

1. When N<=256, the program runs correctly with or without the cache enabled.

2. When N>=1024, the program runs correctly with the cache disabled.

3. When N>=1024, the program runs correctly with the L1P and L1D caches enabled.

4. When N>=1024, the program fails with the L1P, L1D, and L2 caches all enabled: it runs away. Sometimes it also reports the error: "C674X_0: Trouble Reading Memory Block at 0xc00178dc......."

      The natural C version of fftSPxSP has no problem. Only the assembly version of fftSPxSP has the problem.

      The program is attached below.

Can you help me understand why the assembly version of fftSPxSP does not work correctly with the L2 cache enabled?

For efficiency I have to use the L2 cache, and I have to use fftSPxSP with N>=1024.

Thank you.

DSPF_sp_fftSPxSP.zip

  • I also find that if I delete the following lines from the link.cmd file, the program works normally.

    .kernel: {
        *.obj (.text:optimized) { SIZE(_kernel_size) }
    }

    If I change it to

    .kernel: {
        *.obj (.text:optimized)
    }

    the program also runs away with N=1024.

    I don't know exactly what these lines mean. It seems that this optimization of .text stops the program from working normally.

    But why?

    This makes me worry about whether the code will still work normally if I turn on some optimization options in CCS.
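
The thread does not establish a root cause, but one thing that might be worth ruling out when a linker placement interacts with the L2 cache setting is whether any linked section sits in the part of L2 that is turned into cache. The sketch below is only a hedged illustration. It assumes the C6748 L2 RAM is 256 KB at global address 0x11800000 and that the cache is carved from the top of that range; overlaps_l2_cache is a hypothetical helper, not a TI API, and the section address and length would have to be read from the .map file.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not confirmed in this thread): C6748 L2 RAM is 256 KB at
 * global address 0x11800000, and L2 cache is taken from the top of it. */
#define L2_BASE 0x11800000u
#define L2_SIZE 0x00040000u

/* Hypothetical helper: returns 1 if [start, start + len) overlaps the part
 * of L2 used as cache when cache_bytes of L2 are configured as cache.
 * Assumes global addresses and no wraparound past 0xFFFFFFFF. */
static int overlaps_l2_cache(uint32_t start, uint32_t len, uint32_t cache_bytes)
{
    uint32_t cache_lo, cache_hi, end;

    assert(cache_bytes <= L2_SIZE);
    if (len == 0u || cache_bytes == 0u) {
        return 0;
    }
    cache_hi = L2_BASE + L2_SIZE;      /* exclusive upper bound */
    cache_lo = cache_hi - cache_bytes; /* first byte used as cache */
    end = start + len;                 /* exclusive end of the section */
    return (start < cache_hi) && (end > cache_lo);
}
```

With L2CFG_L2MODE_256K the whole 256 KB would be cache under these assumptions, so any section linked anywhere in L2 RAM would overlap; a section linked in DDR or SHRAM would not.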


    The "*.obj (.text:optimized)" line saves clock cycles when N=256. I compared:

    with *.obj (.text:optimized), the 1st call takes 3597 clock cycles.

    without *.obj (.text:optimized), the 1st call takes 4777 clock cycles.

  • Frank,

    From the information you have provided, it is not clear whether the issue is cache related, compiler related, or something else.

    1. Can you provide the version of the compiler that you are using, and confirm whether the issue goes away with a lower compiler optimization level (-o2 or -o1)?

    2. As the user guide states, the function is designed to work for 8 <= N <= 131072, so N = 1024 should be known to work. Please make sure that all your data buffers, twiddle factors, and output buffers are aligned and placed in DDR or SHRAM on the device. Does the issue also occur if you put code and data in SHRAM instead of SDRAM/DDR memory?

    3. Are you running this on a known good TI platform (EVM, LCDK), using a GEL file to initialize the clocks and DSP? Or is this a custom platform with different DDR? Have you confirmed that the external memory interface is stable? When you say it is not working with cache enabled, can you indicate whether the output is incorrect or partially correct?

    4. Please compare, or provide, the linker command files with the two .kernel settings and see if there is any difference between the two.
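
The alignment check in point 2 and the "incorrect or partially correct" question in point 3 could be sketched roughly as follows. This is a minimal illustration, assuming the buffers need 8-byte (double-word) alignment and that the output of the natural C version is available as a reference; is_dword_aligned and first_mismatch are hypothetical helper names, not DSPLIB functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p sits on an 8-byte (double-word) boundary, else 0. */
static int is_dword_aligned(const void *p)
{
    return ((uintptr_t)p & 7u) == 0u;
}

/* Returns the index of the first element where |got - ref| exceeds tol,
 * or -1 if every element matches. n is the number of floats, which is
 * 2 * N for N interleaved complex points. A NaN in either array counts
 * as a mismatch because the <= comparison is then false. */
static long first_mismatch(const float *got, const float *ref,
                           size_t n, float tol)
{
    size_t i;

    assert(got != NULL && ref != NULL);
    for (i = 0; i < n; i++) {
        float d = got[i] - ref[i];
        if (d < 0.0f) {
            d = -d;
        }
        if (!(d <= tol)) {
            return (long)i;
        }
    }
    return -1;
}
```

If first_mismatch returns -1 the two outputs agree within the tolerance. A small index would suggest the output is wrong from the start, while a large index would suggest a partially correct result.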

    Regards,

    Rahul

  • Hi Rahul,

        Thank you very much for your reply.

        1. The compiler version is 8.3.4. The problem still occurs with optimization level -o2 or -o1.

        2. All data are aligned, as shown in the attached project. The issue still occurs with SHRAM.

              As with DDR2, everything works correctly if ".kernel: {*.obj (.text:optimized) { SIZE(_kernel_size) }}" is deleted.

        3. I use the EVM board. "It is not working" means the program runs away: it ends up executing somewhere I cannot identify.

        4. Attached are the link.cmd file and the whole project.

        Thank you.

    Frank

    4718.DSPF_sp_fftSPxSP.zip

  • Frank,

    Can you share the latest status on this issue? Does it occur only with the latest compiler? Please report whether the same issue occurs with the older CGT 7.4.14 or CGT 8.1. It would also be good to know whether you are able to single-step through the code and indicate exactly where it runs into the weeds.

    Regards,

    Rahul