TDA2PXEVM: TDA2PXEVM

Soumya V S

Part Number: TDA2PXEVM

Hi,

I have a custom algorithm plugin running under RTOS on the DSP core, but the algorithm is taking too long. While debugging, I observed that four "for" loops run sequentially, and this is what causes the delay. How can I optimise the code running on the DSP?

over 5 years ago

Dave Bell over 5 years ago

TI__Genius 14680 points

Soumya,

There is a good training series at: https://training.ti.com/c6000-embedded-design-workshop

Also, this is an old appnote but holds up really well and will be a good companion for the training series -- http://www.ti.com/lit/an/sprabg7/sprabg7.pdf

For AM57 in general, we have a great deal of training material that can be reviewed at https://training.ti.com/am57x-sitara-processors-training-series

Some references that should be useful:

TMS320C6000 Optimizing C Compiler Tutorial (Rev. A) http://www.ti.com/lit/pdf/spru425

TMS320C6000 Programmer's Guide (Rev. K) http://www.ti.com/lit/pdf/spru198

Processor SDK DSP section - http://software-dl.ti.com/processor-sdk-rtos/esd/docs/latest/rtos/DSP_Software.html

Best regards,

Dave

Soumya V S over 5 years ago in reply to Dave Bell

Intellectual 260 points

Thanks Dave.

I have a question about #pragma MUST_ITERATE.

My code contains four nested "for" loops that run sequentially:

for (row...)
{
    for (col)
    {
        // some operations
    }
}

The loop iterates 450 x 1800 times.

The loop takes 114 ms. If I use this pragma, will there be any improvement in execution time?

Dave Bell over 5 years ago in reply to Soumya V S

TI__Genius 14680 points

Soumya,

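The MUST_ITERATE pragma gives the compiler the minimum number of iterations through the loop (it can also take a maximum and a factor that the iteration count is always a multiple of). The pragma applies to the loop that immediately follows it, so the number of nested loops doesn't necessarily matter there, but since your loop iterates a known 450 x 1800 times you should be able to apply it.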
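As a rough sketch of how this could look on a loop nest like the one in the question: the function name add_offset, the per-element operation, and the split into 450 rows by 1800 columns are assumptions made for illustration, not details from the original code. The pragma is guarded with the TI compiler's predefined __TI_COMPILER_VERSION__ macro so the file also builds with other compilers.

```c
#include <stdint.h>

#define ROWS 450    /* assumed: 450 rows, taken from the 450 x 1800 figure */
#define COLS 1800   /* assumed: 1800 columns */

/* Hypothetical per-element operation: add a constant to every element
 * (the result wraps around on 8-bit overflow). */
void add_offset(const uint8_t *restrict in, uint8_t *restrict out,
                uint8_t offset)
{
    int row, col;

    /* Outer loop: always runs exactly 450 times. */
#ifdef __TI_COMPILER_VERSION__
#pragma MUST_ITERATE(450, 450)
#endif
    for (row = 0; row < ROWS; row++) {
        /* Inner loop: always runs exactly 1800 times, and 1800 is a
         * multiple of 8, so unrolling by up to 8 needs no cleanup loop. */
#ifdef __TI_COMPILER_VERSION__
#pragma MUST_ITERATE(1800, 1800, 8)
#endif
        for (col = 0; col < COLS; col++) {
            out[row * COLS + col] = (uint8_t)(in[row * COLS + col] + offset);
        }
    }
}
```

Whether this reduces the 114 ms depends on what the loop body does; the compiler's software pipelining feedback for the inner loop is the place to check whether the hint changed the schedule.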

Note that you can additionally use _nassert() on the loop counters for the nested and outer loops, and also on your address pointers (for example, to state their alignment), to give the compiler hints about ways to unroll and parallelize.
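A minimal sketch of _nassert() on a pointer and on a loop count follows. The function sum_samples and its arguments are hypothetical, and the 8-byte alignment and multiple-of-4 count are assumptions that the caller would have to guarantee. _nassert() is a TI compiler intrinsic that generates no code, so the sketch defines it away on other compilers.

```c
#include <stdint.h>

/* _nassert() only exists in the TI compiler; make it a no-op elsewhere. */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(cond) ((void)0)
#endif

/* Hypothetical kernel: sum 'count' 16-bit samples.
 * Assumed preconditions (promised to the compiler, not checked at run time):
 *   - data is 8-byte aligned
 *   - count is greater than zero and a multiple of 4 */
int32_t sum_samples(const int16_t *restrict data, int count)
{
    int i;
    int32_t sum = 0;

    _nassert(((int)data & 0x7) == 0);  /* pointer is 8-byte aligned  */
    _nassert(count > 0);               /* loop runs at least once    */
    _nassert((count & 0x3) == 0);      /* trip count is a multiple of 4 */

    for (i = 0; i < count; i++) {
        sum += data[i];
    }
    return sum;
}
```

If a promise made through _nassert() is false at run time, the generated code may misbehave, so the buffers need to be allocated with the stated alignment (for example with the DATA_ALIGN pragma).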

Another optimization technique is to merge two (nested) loops into one.
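One way to read this, sketched below, is collapsing the row and column loops into a single loop over a flat index. This only works when the image is stored contiguously and the operation does not need the row and column values separately. The threshold operation, the function names, and the 450 by 1800 layout are assumptions for illustration.

```c
#include <stdint.h>

#define ROWS 450    /* assumed image height */
#define COLS 1800   /* assumed image width  */

/* Nested version: the inner loop starts and ends 450 times, and each
 * restart pays the software-pipeline prolog and epilog again. */
void threshold_nested(const uint8_t *restrict in, uint8_t *restrict out,
                      uint8_t level)
{
    int row, col;
    for (row = 0; row < ROWS; row++) {
        for (col = 0; col < COLS; col++) {
            out[row * COLS + col] = (in[row * COLS + col] > level) ? 255 : 0;
        }
    }
}

/* Merged version: one loop over all 450 * 1800 = 810000 elements,
 * with no per-element index multiplication. */
void threshold_flat(const uint8_t *restrict in, uint8_t *restrict out,
                    uint8_t level)
{
    int i;
    for (i = 0; i < ROWS * COLS; i++) {
        out[i] = (in[i] > level) ? 255 : 0;
    }
}
```

Both functions produce the same output; the merged form simply gives the compiler one long loop to pipeline instead of many short ones.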

Best regards,

Dave
