TDA4VM: Hypothesis - You can use C7x DSP and MMA concurrently

Ash Shav

Prodigy 10 points

Part Number: TDA4VM

Tool/software:

Is it possible for the MMA and C7x DSP to operate simultaneously without issues, or are there limitations due to how resources are shared between these two IPs? Could you please explain?
Can the maximum performance for the MMA (xx TOPS) and the DSP (xx GFLOPS) be achieved concurrently during benchmarking?

over 1 year ago

0 Asha Bhandarkar over 1 year ago

TI__Genius 10170 points

Hello,

The MMA is tightly coupled with the C7x core, and the two do not operate independently of each other. Because of this, performance is determined on a case-by-case basis, depending on the algorithm implemented and the optimization techniques used. Instructions for both the C7x and the MMA are pipelined together in an execution loop.

Best,

Asha

0 Ash Shav over 1 year ago in reply to Asha Bhandarkar

Prodigy 10 points

Thank you so much, Asha. Can you please suggest a use case where they work concurrently? Because TI advertises DSP GFLOPS and MMA TOPS separately, I am wondering whether we can demonstrate the advertised peak performance for both IPs simultaneously.

0 Asha Bhandarkar over 1 year ago in reply to Ash Shav

TI__Genius 10170 points

Hi,

Let me clarify what I meant in my previous post when I described the C7x+MMA architecture as tightly coupled and not independent. Instructions that exercise the accelerator and those that don't (purely C7x instructions) map to the same functional units in the hardware. The maximum performance numbers given for the C7x and the MMA each assume access to all required functional units at once. For that reason, you will not be able to reach the GFLOPS metric for the C7x and the TOPS metric for the MMA simultaneously, because the compiler pipelines both kinds of instructions together.

Best,

Asha
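
The trade-off described in the reply above can be illustrated with a toy model. This is only a sketch and not a TI formula. It assumes a single pool of issue slots that is split linearly between MMA work and plain C7x work. The function name `shared_unit_utilization` and the parameter `mma_fraction` are hypothetical, and results are expressed as fractions of each advertised peak rather than as actual TOPS or GFLOPS figures. Real behaviour depends on which functional units each instruction needs, so the linear split is a simplification.

```python
def shared_unit_utilization(mma_fraction: float) -> tuple[float, float]:
    """Toy model (assumption): MMA and plain C7x instructions compete for one
    shared pool of issue slots, and the split between them is linear.

    mma_fraction is the share of slots given to MMA instructions.
    Returns (fraction of MMA peak, fraction of C7x peak).
    """
    if not 0.0 <= mma_fraction <= 1.0:
        raise ValueError("mma_fraction must be between 0 and 1")
    # Whatever share the MMA gets is unavailable to plain C7x instructions.
    return mma_fraction, 1.0 - mma_fraction


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5, 1.0):
        mma, c7x = shared_unit_utilization(share)
        print(f"MMA at {mma:.0%} of peak, C7x at {c7x:.0%} of peak")
```

Under this assumption the two fractions always sum to 1, so both cannot be at 100% of peak in the same loop, which matches the conclusion in the reply.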
