TMS320C6748: Why is CPU usage on the board different from the simulator? How to reduce processing time | Optimization Help Needed

Part Number: TMS320C6748

Hi TI Team,

I’m using a TMS320C6748 with CCS 5.5 and an XDS100v2 emulator for a noise suppression application.
The same code shows a low CPU load on the CCS simulator (~48 MHz), but on the actual board the CPU load is much higher (~2794 MHz), as calculated using TSCL.
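The post does not show how the MHz figures were derived from TSCL. A minimal sketch of one common approach, assuming a frame-based algorithm: read TSCL before and after one frame, then scale the cycle count by the frame rate. The names `measure_frame`, `elapsed_cycles`, and `load_mhz`, and the host stand-in for TSCL, are hypothetical and not taken from the original code.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_TMS320C6X)
#include <c6x.h>                  /* TI header that declares the TSCL register */
#define START_TSC() (TSCL = 0)    /* a write to TSCL starts the free-running counter */
#define READ_TSCL() ((uint32_t)TSCL)
#else
/* Host stand-in so the sketch compiles off-target; it is not a real counter. */
static uint32_t fake_tscl;
#define START_TSC() ((void)0)
#define READ_TSCL() (fake_tscl)
#endif

/* Cycles between two 32-bit TSCL reads; unsigned math tolerates one wrap. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Run one frame of processing and return the cycles it took. */
static uint32_t measure_frame(void (*process_frame)(void))
{
    uint32_t t0, t1;
    START_TSC();
    t0 = READ_TSCL();
    process_frame();
    t1 = READ_TSCL();
    return elapsed_cycles(t0, t1);
}

/* Convert cycles per frame into a load in MHz, i.e. millions of cycles
 * needed per second of real-time input. */
static double load_mhz(uint32_t cycles_per_frame, double frames_per_second)
{
    assert(frames_per_second > 0.0);
    return (double)cycles_per_frame * frames_per_second / 1e6;
}
```

As an illustration with made-up numbers, 480000 cycles per frame at 100 frames per second corresponds to a 48 MHz load. A load above the device clock rate would mean the frame cannot be processed in real time.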

Key Details:
We use ~850 KB of constant data buffers.
The buffers are placed in SHRAM using the .cmd file and #pragma DATA_SECTION.
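For readers unfamiliar with the mechanism, here is a minimal sketch of placing one constant buffer with `#pragma DATA_SECTION`. The buffer name `ns_table`, its size, and the section name `.ns_const` are hypothetical, since the original declarations are not shown in the post. The pragma is guarded so the file also compiles with a non-TI host compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_TABLE_LEN 1024u   /* hypothetical length, for illustration only */

#if defined(__TI_COMPILER_VERSION__)
/* TI C syntax: DATA_SECTION(symbol, "section name"), placed before the
 * definition of the symbol it applies to. */
#pragma DATA_SECTION(ns_table, ".ns_const")
#endif
const int16_t ns_table[NS_TABLE_LEN] = { 0 };

/* The linker command file then needs a matching line in SECTIONS, e.g.
 *     .ns_const > SHRAM
 * where SHRAM is whichever MEMORY range the buffer should live in. */

/* Size in bytes, useful when checking that the buffers fit a memory range. */
size_t ns_table_bytes(void)
{
    return sizeof ns_table;
}
```

Keeping each large buffer in its own named section makes it possible to move individual buffers between memory ranges by editing only the .cmd file.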

Questions:
Why is CPU usage so high on the board compared to the simulator?
Are we placing the buffers correctly? Should we move some data to another memory block for faster access?
How can we further reduce processing time on the board? Are there any memory map settings we should change?

We’re looking for urgent guidance on optimizing this setup to reduce on-board CPU usage and match simulator performance.

.cmd: 

MEMORY
{
//  DSPL2ROM     o = 0x00700000  l = 0x00100000  /* 1MB L2 Internal ROM */
    DSPL2RAM     o = 0x00800000  l = 0x00040000  /* 256kB L2 Internal RAM */
    DSPL1PRAM    o = 0x00E00000  l = 0x00008000  /* 32kB L1 Internal Program RAM */
    DSPL1DRAM    o = 0x00F00000  l = 0x00008000  /* 32kB L1 Internal Data RAM */
    SHDSPL2ROM   o = 0x11700000  l = 0x00100000  /* 1MB L2 Shared Internal ROM */
    SHDSPL2RAM   o = 0x11800000  l = 0x00040000  /* 256kB L2 Shared Internal RAM */
    SHDSPL1PRAM  o = 0x11E00000  l = 0x00008000  /* 32kB L1 Shared Internal Program RAM */
    SHDSPL1DRAM  o = 0x11F00000  l = 0x00008000  /* 32kB L1 Shared Internal Data RAM */
//  EMIFACS0     o = 0x40000000  l = 0x20000000  /* 512MB SDRAM Data (CS0) */
    EMIFACS2     o = 0x60000000  l = 0x02000000  /* 32MB Async Data (CS2) */
    EMIFACS3     o = 0x62000000  l = 0x02000000  /* 32MB Async Data (CS3) */
    EMIFACS4     o = 0x64000000  l = 0x02000000  /* 32MB Async Data (CS4) */
    EMIFACS5     o = 0x66000000  l = 0x02000000  /* 32MB Async Data (CS5) */
    SHRAM        o = 0xC0000000  l = 0x20000000  /* 128kB Shared RAM */
//  DDR2         o = 0xC0000000  l = 0x20000000  /* 512MB DDR2 Data */
}

SECTIONS
{
    .text          > SHRAM
    .stack         > SHRAM
    .bss           > SHRAM
    .cio           > SHRAM
    .const         > SHRAM
    .data          > SHRAM
    .switch        > SHRAM
    .sysmem        > SHRAM
    .far           > SHRAM
    .args          > SHRAM
    .ppinfo        > SHRAM
    .ppdata        > SHRAM

    /* COFF sections */
    .pinit         > SHRAM
    .cinit         > SHRAM

    /* EABI sections */
    .binit         > SHRAM
    .init_array    > SHRAM
    .neardata      > SHRAM
    .fardata       > SHRAM
    .rodata        > SHRAM
    .c6xabi.exidx  > SHRAM
    .c6xabi.extab  > SHRAM
}
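One way to sanity-check the MEMORY block above is a small host-side lookup over its active entries. The sketch below copies the origins and lengths from the post; the helper names `region_of` and `region_kib` are hypothetical. It may be worth noting that SHRAM, as declared, has the same origin and length as the commented-out DDR2 line (0x20000000 bytes at 0xC0000000), although its comment says 128kB. If that range is external DDR on the board, every section in the SECTIONS block would be placed in external memory, which could plausibly account for part of the gap with the simulator. This should be verified against the device memory map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *name;
    uint32_t origin;
    uint32_t length;
} mem_range;

/* Active (uncommented) entries copied from the MEMORY block in the post. */
static const mem_range memory_map[] = {
    { "DSPL2RAM",    0x00800000u, 0x00040000u },
    { "DSPL1PRAM",   0x00E00000u, 0x00008000u },
    { "DSPL1DRAM",   0x00F00000u, 0x00008000u },
    { "SHDSPL2ROM",  0x11700000u, 0x00100000u },
    { "SHDSPL2RAM",  0x11800000u, 0x00040000u },
    { "SHDSPL1PRAM", 0x11E00000u, 0x00008000u },
    { "SHDSPL1DRAM", 0x11F00000u, 0x00008000u },
    { "EMIFACS2",    0x60000000u, 0x02000000u },
    { "EMIFACS3",    0x62000000u, 0x02000000u },
    { "EMIFACS4",    0x64000000u, 0x02000000u },
    { "EMIFACS5",    0x66000000u, 0x02000000u },
    { "SHRAM",       0xC0000000u, 0x20000000u }
};

#define MAP_COUNT (sizeof memory_map / sizeof memory_map[0])

/* Name of the first range containing addr, or NULL if none matches. */
static const char *region_of(uint32_t addr)
{
    size_t i;
    for (i = 0; i < MAP_COUNT; ++i) {
        /* Compare the offset to the length to avoid origin + length overflow. */
        if (addr >= memory_map[i].origin &&
            addr - memory_map[i].origin < memory_map[i].length) {
            return memory_map[i].name;
        }
    }
    return NULL;
}

/* Declared size of a named range in KiB, or 0 if the name is unknown. */
static uint32_t region_kib(const char *name)
{
    size_t i;
    for (i = 0; i < MAP_COUNT; ++i) {
        if (strcmp(memory_map[i].name, name) == 0) {
            return memory_map[i].length / 1024u;
        }
    }
    return 0u;
}
```

With this table, `region_kib("SHRAM")` evaluates to 524288 (512 MB), while `region_kib("DSPL2RAM")` evaluates to 256, which is smaller than the ~850 KB of buffers mentioned above.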

Thanks & Regards,
Priyanka