
How do I know when to stop optimizing an algorithm?

Hi RandyP:


In the thread
http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/112/p/265550/929354.aspx#929354
I followed your advice and searched for the "C6000 Training" keyword on
http://processors.wiki.ti.com.

        So I found a useful workshop, the TMS320C6000 DSP Optimization
Workshop. The workshop included a PDF file, op6000_student_guide_v1.51.pdf,
in which I saw the page below, and I couldn't agree more!


Question 1: Why spend time optimizing if real-time performance has
already been met with plain C code?

      In my opinion, of course you shouldn't spend time optimizing if the
performance requirement has already been met. And it is easy to test whether
your code's performance meets your needs using the profiler, CLK_getltime(),
or clock(). Before you start to optimize your C code you must prove that;
it is very important.

     But as to Question 2, "Why spend time optimizing if the algorithm
cannot be improved?", I don't know how to test that. In other words, I
don't know when to stop optimizing. I don't know what the optimum is, or
what the best possible performance of the algorithm would be.

      I just know that I have to start optimizing my poorly performing
algorithm, because its performance doesn't meet my needs. But I don't
know when to stop.

    That is a rather terrible situation: knowing when to start, but not
when to stop. What can I do? Please help!

  • Steve,

    Since I did not write that slide, I do not know exactly what the author was trying to describe. I see two different meanings, and the author may have been discussing one of these, or both, or something else.

    a. The "mathematical algorithm" is the mathematical description of the procedure you want to run on the DSP. A sum-of-products for an FIR filter is repetitive and easy to optimize; an IIR filter has feedback paths that make it more difficult to implement on a DSP. An FFT requires implementation of a large and complex algorithm. If you can figure out a way to improve on the basic mathematical requirements for the task, then you may be able to get faster performance from your implementation. One example in matrix mathematics would be the case of recognizing that a particular matrix will always have the lower half be filled with zero, so you can remove some portions of your algorithm to apply that knowledge.

    b. The "logical algorithm" is the way you have chosen to implement the mathematical description. This may also involve mathematics, but it would more often involve the logical implementation. Techniques described in that optimization chapter are looking at ways to improved the implementation "algorithm". An example is manually unrolling a loop, or changing a pair of nested loops into a single loop.

    In either case, how do you know whether more improvement is possible? This could be the subject of a Master's thesis or a Doctoral paper on Digital Signal Processing. Each algorithm will be different, and you will get different answers from one engineer to another. Sometimes you can count up the number of math operations and divide by the number a specific processor can execute per cycle, but that can be challenging with the advanced multiple-data instructions in TI's newer architectures. The answer can be very different when comparing the C6713 to the C6678.

    You should always start with what you believe to be a good mathematical algorithm. Then you should always use the techniques described in the workshop material to improve your logical algorithm. If all of your logical algorithm improvements do not get the improvement you need, then you can start back looking at the science behind the mathematical algorithm.

    Regards,
    RandyP

  • I agree with what RandyP said here. He covered it quite well.

    What we were intending with the comment in question is that "you should try and estimate the best performance for your algorithm". As RandyP says, this can be tricky with modern, SIMD architectures, but it's a good practice to establish. Think of it as coming up with a "Theory" of the best possible optimization; then you try to "Prove" that theory.

    This idea came from the fact that we saw so many folks trying to optimize code beyond what was even possible. For example, given that the data path for one CPU is two 64-bit buses, if your algorithm needs more data than that per loop, that alone establishes a minimum number of cycles. The same goes for the number of multiplies and so on...

    The goal here is not to be exact, but rather to make sure you don't waste your time trying to "fix" something that isn't "broken".

    I hope this helps clear things up.

    Scott

  • See question 1. :-)

    You do some optimizations, then check whether the performance meets the requirement.

    If not, you do another round.