Why is the C6748 so slow?

Other Parts Discussed in Thread: TMS320C6748, MATHLIB

Hello everyone,

I am using the floating-point unit of the TMS320C6748 for double-precision calculations, and I am not sure whether my configuration is correct. My settings are:

Build Options -> Compiler -> Basic -> Target Version: C674x
Build Options -> Linker -> Libraries -> Incl. Libraries: rts6740.lib
Linker command file (CMD): -l rts67plus.lib -levmomap138_bsl.lib

I am using CCS 3.3.81.5 with CGT 6.1.21.

My test results:

double l, tmpl, tmpd;

l = 10.00;          // 13 clocks
tmpl = 20.00;       // 13 clocks
tmpd = 30.00;       // 13 clocks
if (l >= tmpl)      // 14 clocks
    l = tmpl;
tmpl = tmpd;        // 16 clocks
l = tmpl * tmpd;    // 26 clocks
tmpd = l / tmpl;    // 901 clocks

Why is the calculation so time consuming? How can I improve the calculation speed?

Thanks!

  • Bo,

    I repeated your test, and here are the numbers I get with the cycle-accurate C674x simulator. The simulator assumes that data and code are in internal memory, so these numbers also reflect optimal memory placement.

    l = 10.00;          // 3 clocks
    tmpl = 20.00;       // 3 clocks
    tmpd = 30.00;       // 3 clocks
    if (l >= tmpl)      // 14 clocks
        l = tmpl;
    tmpl = tmpd;        // 7 clocks
    l = tmpl * tmpd;    // 18 clocks
    tmpd = 1/(tmpl);    // 514 clocks

    I believe these numbers are correct. In addition, if you use the divdp function from the math library (MATHLIB) for C674x, you can bring the division down to 97 cycles. I recommend taking a look at this library:

    http://www.ti.com/tool/mathlib
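
    To illustrate why a dedicated division routine can be much cheaper than a generic one, here is a minimal sketch of one common approach: take a rough reciprocal estimate, refine it with Newton-Raphson iterations, then multiply. This thread does not say how divdp itself is implemented, so treat this only as an illustration. The name nr_divide is hypothetical and not part of MATHLIB, and the sketch assumes a finite, nonzero divisor in the normal double range.

```c
#include <math.h>

/* Hypothetical helper (not a MATHLIB function): computes a / b as
 * a * (1 / b), where 1 / b comes from Newton-Raphson refinement.
 * Assumes b is finite, nonzero, and in the normal double range. */
double nr_divide(double a, double b)
{
    int e;
    int i;
    /* Split |b| into m * 2^e with 0.5 <= m < 1. */
    double m = frexp(fabs(b), &e);

    /* Linear seed for 1 / m on [0.5, 1); relative error is at most 1/17.
     * The constant divisions are folded at compile time. */
    double x = 48.0 / 17.0 - (32.0 / 17.0) * m;

    /* Each iteration roughly squares the relative error, so four
     * iterations take the seed below double-precision rounding error. */
    for (i = 0; i < 4; i++) {
        x = x * (2.0 - m * x);
    }

    /* Undo the scaling and restore the sign: 1 / b = (1 / m) * 2^(-e). */
    x = ldexp(x, -e);
    if (b < 0.0) {
        x = -x;
    }
    return a * x;
}
```

    For example, nr_divide(600.0, 20.0) returns a value within rounding error of 30.0. The loop uses only multiplies and subtracts, which is why this style of routine can avoid the cost of a general-purpose division.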

    To obtain the same numbers on hardware, you need to place all the sections of your code in DSP internal memory. Placing code in specific sections is described in the compiler documentation. Also, for best performance I recommend moving to the latest version of the compiler, and not linking with the rts67plus library when you are also linking in the C674x RTS library (rts6740.lib).
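
    As a rough sketch of the section placement described above, the TI compiler's CODE_SECTION and DATA_SECTION pragmas can assign a function or variable to a named section. The section names .fast_code and .fast_data, and the names scale_ratio and ratio_table, are assumptions made up for this example. Your linker command file would still need to map those sections to an internal memory range, whose name depends on your project.

```c
/* Section names are hypothetical; they must match output sections that
 * your linker command file maps to DSP internal memory. The pragmas are
 * guarded so the file also builds with non-TI compilers. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(ratio_table, ".fast_data")
#endif
double ratio_table[4] = { 10.0, 20.0, 30.0, 40.0 };

#ifdef __TI_COMPILER_VERSION__
#pragma CODE_SECTION(scale_ratio, ".fast_code")
#endif
/* Hypothetical time-critical routine that should run from internal memory. */
double scale_ratio(int index, double divisor)
{
    return ratio_table[index] / divisor;
}
```

    With a large program that cannot fit entirely in internal memory, this kind of placement lets you keep only the time-critical routines and their data on chip.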

    Also, on the C674x, single-precision performance is much better than double-precision, so I recommend declaring variables as float rather than double wherever possible.
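
    A minimal sketch of what switching to single precision looks like in C. The function names are hypothetical. One detail worth watching: an unsuffixed constant such as 0.5 has type double, so mixing it into a float expression promotes the arithmetic back to double precision. Use the f suffix to keep it in single precision.

```c
/* Double-precision version, as in the original test. */
double ratio_dp(double num, double den)
{
    return num / den;
}

/* Single-precision version: same logic with float operands. */
float ratio_sp(float num, float den)
{
    return num / den;
}

/* The f suffix keeps the constant, and so the multiply, in single
 * precision. Writing 0.5 instead would promote the multiply to double. */
float half_ratio_sp(float num, float den)
{
    return 0.5f * (num / den);
}
```

    Whether float gives enough accuracy depends on your algorithm, so check results against the double version before converting everything.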

    Regards,

    Rahul

  • Rahul,

    Thanks.

    Because the program is relatively large, it may not fit in internal RAM, so it needs to run from external RAM. My tests were run from external RAM.

    How can I improve performance using fastrts?

    I have about 50,000 lines of code with many floating-point assignments and calculations, division in particular, and the whole calculation must complete within a millisecond. I feel that the C6748 may not be suitable. What do you think?

    Do you have any suggestions?

    Regards,

    Bo


  • Bo,

    The mathlib (or fastrts) library contains optimized versions of key math functions used in floating-point computation. The unique feature of this library is that you do not need to modify your original code. To link it in, all you need to do is specify mathlib earlier in the link order than the normal RTS library. The linker scans for symbols from left to right, so with mathlib listed first, the symbols for RTS functions are resolved from mathlib, which should give higher performance for those functions.

    This process should be described in the library documentation. However, the library currently supports development in CCSv4 and CCSv5 only. I may have an older version that you can use in CCSv3.3. If you would like me to share it, along with an introductory document on quick ways to optimize your code, please write to us on the software developers list mentioned here:

     http://processors.wiki.ti.com/index.php/Software_libraries#Developer_Mailing_List

    Regards,

    Rahul