TMS320C6748: Noise in signal processing when using L1 cache and MATHLIB

Part Number: TMS320C6748
Other Parts Discussed in Thread: MATHLIB

I have already posted this question, but I am posting it again because the situation is urgent and I need an answer quickly.

I apologize for posting the same content twice.

Once this post is confirmed, I will remove the previous question.

Environment

 - Board : TMS320C6748

 - CCS 8.3.1

 - compiler version : 8.3.4

 - Used library : DSP_LIB, MATH_LIB

 - L1_CACHE_ENABLED

 - All data is stored in DDR2

Starting from the McASP example code, I am experimenting with a signal processing algorithm.

The experiment conditions are:

Sampling rate : 16000Hz

Buffer size : 512
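
For reference, these two settings fix how long one buffer lasts, and therefore how much time the processing has per buffer. A minimal sketch of the arithmetic, assuming the buffer size of 512 counts samples per channel (the helper name buffer_period_us is hypothetical):

```c
#include <stdint.h>

/* Duration of one audio buffer in microseconds.
 * frames:  samples per buffer (assumed per channel)
 * rate_hz: sampling rate in Hz */
uint32_t buffer_period_us(uint32_t frames, uint32_t rate_hz)
{
    /* The 64-bit intermediate avoids overflow for large frame counts. */
    return (uint32_t)(((uint64_t)frames * 1000000u) / rate_hz);
}
```

With 512 samples at 16000 Hz this gives 32000 microseconds, so each buffer covers 32 ms of audio.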

With L1_CACHE_ENABLED defined and log10sp_v in use, there is noise in the speaker output. The noise is absent when I comment out L1_CACHE_ENABLED or log10sp_v.

Of course, I verified in the code that the output values are clean before they are transferred to the speaker output.

In this case, how can I remove the noise without disabling the L1 cache or MATHLIB?

I am using the rCSL cache functions. The following is the rCSL cache example code.

static void setup_DDR2_cache (void)
{
	// Set DDR2 (MAR 192 - 223, i.e. 0xC0000000 - 0xDFFFFFFF) as cacheable
	for(counter = 192; counter < 224; counter++)
		CSL_FINST(cacheRegs->MAR[counter], CACHE_MAR_PC, CACHEABLE);
}/* setup_DDR2_cache */

/*---------------------------------------------------------------------------*/

static void enable_L1 (void)
{
	// Set L1P size to 32K
	CSL_FINST(cacheRegs->L1PCFG, CACHE_L1PCFG_MODE, 32K);
	stall = cacheRegs->L1PCFG;
	
	// Set L1D size to 32K
	CSL_FINST(cacheRegs->L1DCFG, CACHE_L1DCFG_MODE, 32K);
	stall = cacheRegs->L1DCFG;
}/* enable_L1 */

/*---------------------------------------------------------------------------*/

static void enable_L2 (void)
{
	// Set L2 size to 256K
	CSL_FINST(cacheRegs->L2CFG, CACHE_L2CFG_MODE, 256K);
	stall = cacheRegs->L2CFG;
}/* enable_L2 */

As you can see, the example includes code that enables the L2 cache, but if I use that code, the program does not work correctly.

So I used only the two functions setup_DDR2_cache() and enable_L1().

Since I do not use any of the L2 cache code, I do not know how L2 behaves in my project.
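
The MAR range used in setup_DDR2_cache() can be cross-checked against the DDR2 address range 0xC0000000 to 0xDFFFFFFF. A minimal sketch, assuming the C674x convention that each MAR register controls one 16 MB block of the address space (the helper name mar_index is hypothetical and is not part of rCSL):

```c
#include <stdint.h>

/* Each MAR register is assumed to control the cacheability of one
 * 16 MB (2^24 byte) block of the 32-bit address space. */
#define MAR_BLOCK_SHIFT 24u

/* Index of the MAR register that covers a given byte address. */
uint32_t mar_index(uint32_t addr)
{
    return addr >> MAR_BLOCK_SHIFT;
}
```

Under that assumption, 0xC0000000 maps to MAR 192 and 0xDFFFFFFF maps to MAR 223, which matches the loop bounds 192 to 223 in the example code.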

The following is my linker command file.

MEMORY
{
    DSPL2ROM     o = 0x00700000  l = 0x00100000   /* 1MB L2 Internal ROM */
    DSPL2RAM     o = 0x00800000  l = 0x00040000   /* 256kB L2 Internal RAM */
    DSPL1PRAM    o = 0x00E00000  l = 0x00008000   /* 32kB L1 Internal Program RAM */
    DSPL1DRAM    o = 0x00F00000  l = 0x00008000   /* 32kB L1 Internal Data RAM */
    SHDSPL2ROM   o = 0x11700000  l = 0x00100000   /* 1MB L2 Shared Internal ROM */
    SHDSPL2RAM   o = 0x11800000  l = 0x00040000   /* 256kB L2 Shared Internal RAM */
    SHDSPL1PRAM  o = 0x11E00000  l = 0x00008000   /* 32kB L1 Shared Internal Program RAM */
    SHDSPL1DRAM  o = 0x11F00000  l = 0x00008000   /* 32kB L1 Shared Internal Data RAM */
    EMIFACS0     o = 0x40000000  l = 0x20000000   /* 512MB SDRAM Data (CS0) */
    EMIFACS2     o = 0x60000000  l = 0x02000000   /* 32MB Async Data (CS2) */
    EMIFACS3     o = 0x62000000  l = 0x02000000   /* 32MB Async Data (CS3) */
    EMIFACS4     o = 0x64000000  l = 0x02000000   /* 32MB Async Data (CS4) */
    EMIFACS5     o = 0x66000000  l = 0x02000000   /* 32MB Async Data (CS5) */
    SHRAM        o = 0x80000000  l = 0x00020000   /* 128kB Shared RAM */
    DDR2         o = 0xC0000000  l = 0x20000000   /* 512MB DDR2 Data */
}

SECTIONS                                       
{                                              
    .text          >  DDR2
    .stack         >  DDR2
    .bss           >  DDR2
    .cio           >  DDR2
    .const         >  DDR2
    .data          >  DDR2
    .switch        >  DDR2
    .sysmem        >  DDR2
    .far           >  DDR2
    .args          >  DDR2
    .ppinfo        >  DDR2
    .ppdata        >  DDR2
  
    /* COFF sections */
    .pinit         >  DDR2
    .cinit         >  DDR2
  
    /* EABI sections */
    .binit         >  DDR2
    .init_array    >  DDR2
    .neardata      >  DDR2
    .fardata       >  DDR2
    .rodata        >  DDR2
    .c6xabi.exidx  >  DDR2
    .c6xabi.extab  >  DDR2
}

As you can see, all of my code and data are stored in DDR2 memory.

So I think some code is moved from DDR2 into the DSP's internal memory.

Below I explain what I mean by "the noise goes away with the cache enabled but without log10sp_v".

There are two scenarios.

First:

If I comment out the setup_DDR2_cache() and enable_L1() calls, it works correctly.

In that case I still use MATHLIB functions such as log10sp_v.

Second:

If I comment out the MATHLIB code, it also works correctly.

In that case I still use the cache functions.

So I suspect both the cache functions and MATHLIB.

I do not know exactly what is causing the problem, so please give me some advice.

Thank you.

  • Hi,

    Can you post which Processor SDK RTOS version you are using?

    Best Regards,
    Yordan

  • I don't use an RTOS. My project does not use any OS.

    Thank you.

  • Kim,

    Sorry for the delay in getting back to you on this. I have checked the archives and don't see any issues related to the log10sp_v function in MATHLIB.

    If the code works without MATHLIB and with the cache enabled, can you describe the nature of the failure when you include MATHLIB? Does the DSP core hang or go off into the weeds? How big is the vector passed to log10sp_v? Also, please indicate the compiler version and the optimization level used.

    Is there a significant performance improvement that makes you prefer log10sp from MATHLIB over the compiler RTS library? One issue I can see is that log10sp_v uses the inlined version of the function, so it can significantly increase the code size and cause the code to grow larger than the cache. I would check whether removing the inlining resolves the issue.

    Regards,

    Rahul

  • Thank you for your reply.

    My project takes sound input from the microphone, applies some signal processing, and sends the result to the speakers.

    When I started using MATHLIB, periodic ticking noises appeared in the processed sound.

    So, to check whether the problem is in my signal processing algorithm, I wrote 0 to the output and listened again.

    The result was the same ticking sound, and it disappeared when I commented out MATHLIB.
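
    The zero-output test described above can also be checked in software before the buffer is handed to the audio output. A minimal sketch, assuming 16-bit samples (the helper names mute_buffer and count_nonzero are hypothetical and are not taken from the McASP example):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrite the output buffer with silence (all-zero samples). */
void mute_buffer(int16_t *buf, size_t n)
{
    memset(buf, 0, n * sizeof buf[0]);
}

/* Count samples that are not zero. A result of 0 means the buffer
 * is silent as seen by the CPU. */
size_t count_nonzero(const int16_t *buf, size_t n)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < n; i++) {
        if (buf[i] != 0) {
            count++;
        }
    }
    return count;
}
```

    Note that this only checks the values as the CPU reads them, which is the same kind of check as verifying the output values in the code.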

    The compiler version is 8.3.4 and the optimization level is -o3.

    log10sp_v is not the only MATHLIB function I use. It is just the one I have confirmed, so other functions may have the same problem.

    Overall, the many MATHLIB functions I use give a meaningful performance improvement.

    I am not familiar with the RTS library. Is it a math library available on the C6748?

    I will look into the RTS library and check whether it solves this problem.

    Thank you.

  • The RTS library is the default compiler library; it provides the log10sp function as defined in standard C. MATHLIB optimizes this by making some compromises in precision and tuning it for the DSP architecture. If you add libc.a and remove the link to MATHLIB, the API will call into the RTS library instead of MATHLIB for log10sp.

    I would also recommend trying the -O2 option and a slightly older version of the compiler than CGT 8.3.4. There is a performance issue reported with CGT 8.3.3 that we are looking into internally, but it would be good to try the code with the --legacy compiler flag, or with the compiler version listed in the release notes of the package.

    Regards,

    Rahul