TMS320F28P550SG: Code execution time difference between CPU and CLA.

Part Number: TMS320F28P550SG

Dear TI Gurus,

I am observing a large difference (about 10 times) in execution time between the CPU and the CLA for exactly the same code. Execution time is measured by toggling a GPIO pin and observing it on a DSO.

Please refer to the attached code snippet.

The CPU executes the code in ~100 ns, whereas the CLA takes ~1000 ns, which is almost 10 times slower.

Is this expected, or am I missing something? The optimization level is the same for both builds, but I am not sure whether the clock rate, the RAM locations of the data variables, or some other issue is causing the difference in performance.

Please provide your feedback.

F28P55SJ_CPU_vs_CLA.txt 

Best Regards

Milan

 

  • Hi Milan,

    The CLA is a smaller processor and is limited in some operation types: it does not have the same instructions as the C28x CPU, so some operations take longer.

    It performs better with 32-bit floating-point data types. 16-bit int types should be limited to load/store register accesses where possible. You could try using float32 in your CLA code and see whether performance improves.

    Regards

    Lori
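
A minimal sketch of the suggestion above, assuming a hypothetical scale-and-offset kernel (the contents of the attached file are not shown in this thread). The function names `scale_q15` and `scale_f32` and their parameters are illustrative and do not come from the original code. The first version does its arithmetic on 16-bit integers; the second treats the 16-bit value only as a load, converts it once, and does the arithmetic in 32-bit float, which is the pattern the reply recommends trying on the CLA.

```c
#include <stdint.h>

/* Hypothetical kernel: the attached code is not visible in the thread,
 * so this scale-and-offset step is only an illustration. */

/* 16-bit integer version with a Q15 gain (0.5 is represented as 16384).
 * Arithmetic on narrow integer types like this is what the reply
 * suggests keeping to a minimum in CLA code. */
int16_t scale_q15(int16_t sample, int16_t gain_q15, int16_t offset)
{
    int32_t product = (int32_t)sample * (int32_t)gain_q15; /* Q15 product */
    return (int16_t)((product >> 15) + offset);            /* back to Q0  */
}

/* 32-bit float version: the 16-bit value is only loaded, converted once,
 * and all arithmetic is done in float. */
float scale_f32(int16_t raw_sample, float gain, float offset)
{
    float sample = (float)raw_sample; /* convert once, right after the load */
    return sample * gain + offset;
}
```

Both versions return 510 for a sample of 1000, a gain of 0.5, and an offset of 10. Whether the float version actually runs faster on the CLA would need to be confirmed on hardware with the same GPIO measurement.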