Why do some SIMD instructions lower the MACs number?

Hi,

I read the following link:

http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/187669/674810.aspx#674810

It says that performance drops from 320GMACs to 80GMACs because of SIMD instructions. I am curious about this. Could someone explain it to me?

 

Thanks,

 

"Bringing this back down to earth some, there's a lot of other SIMD type operations that are going to put it @ around 80GMACs range that are applicable over a wider range of applications."

  • The 320GMACs number is based on a specific matrix multiplication instruction.  It performs 16 MACs per cycle per functional unit, or 32 MACs per cycle per core (32 MACs per cycle per core * 8 cores * 1.25GHz -> 320GMACs).

    A lot of DSP-related work is more along the lines of dot products for filtering-type routines, which are going to be more in the neighborhood of 4 MACs per cycle for the SIMD instructions.

    Best Regards,
    Chad

  • Hi,

    The following also mentions "Shannon". I know the core it is talking about is the C6678. Is Shannon another name for it, or something else?

    Thanks,

    .........

    Yes, it's based upon the peak performance, specifically a complex conjugate matrix multiplication routine, in which you effectively get 16 multiplies per cycle from the instruction.  It runs on 2 functional units (.M1 and .M2), so it's 32 multiplies per cycle per core.  That gives you 256 per cycle on Shannon.  When running at 1.25GHz, that's 320GMACs for this specific operation.

  • Sorry, that's an internal code name for the C6678; I shouldn't have used it.  I replaced it in the other thread.

    Best Regards,
    Chad
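
The arithmetic behind the two figures in this thread can be checked with a short sketch. The core count, the two .M units, and the 1.25GHz clock come from the posts above. The function and variable names are made up for illustration. Reading "4 MACs per cycle" as a per-functional-unit rate is an assumption: the thread does not say so explicitly, but that reading reproduces the quoted 80GMACs figure.

```python
# Figures quoted in the thread for the C6678.
CORES = 8
M_UNITS_PER_CORE = 2   # .M1 and .M2
CLOCK_GHZ = 1.25


def peak_gmacs(macs_per_cycle_per_unit,
               units=M_UNITS_PER_CORE,
               cores=CORES,
               clock_ghz=CLOCK_GHZ):
    """Return the theoretical peak in GMACs (billions of MACs per second)."""
    return macs_per_cycle_per_unit * units * cores * clock_ghz


# Matrix multiplication instruction: 16 MACs per cycle per .M unit.
matrix_peak = peak_gmacs(16)

# Dot-product style SIMD instructions.
# ASSUMPTION: the "4 MACs per cycle" in the thread is per .M unit.
dot_product_peak = peak_gmacs(4)

print(f"matrix multiply: {matrix_peak:g} GMACs")
print(f"dot product:     {dot_product_peak:g} GMACs")
```

Both numbers are theoretical peaks for the same hardware and clock. Under the assumption above, the factor of 4 between them comes entirely from how many MACs a single instruction performs, so the SIMD instructions are not slowing the device down.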