C6746 HPI Throughput and SIMD performance

takahiro kusunoki

Prodigy 130 points

Hello Team,

Let me ask you some questions about HPI and SIMD.

There are several options for connecting an FPGA to the C6746:

1. EMIF

2. uHPI

3. uPP

I'd like to see throughput measurement results for these interfaces.

I found the uPP wiki page, but I could not find similar results for the other interfaces.

http://processors.wiki.ti.com/index.php/Introduction_to_uPP

Q1) Do you have throughput measurement results for HPI and EMIF?

I also checked the following document and found Table 1 on page 5.

http://www.ti.com/lit/an/sprabg7/sprabg7.pdf

It seems that the SIMD specification of the C674x is similar to that of the C64x+ (32-bit: 2 × 16-bit or 4 × 8-bit).

Q2) Is my understanding correct?

Best Regards,

Taka Kusunoki

over 12 years ago

Mukul Bhatnagar over 12 years ago

TI__Guru* 85095 points

On Q1, we do not have any throughput data to share for HPI or EMIFA. For EMIFA, you can generally expect to get somewhere between 60% and 70% of the theoretical maximum when using EDMA with the default 32-byte burst size. The theoretical maximum is approximately EMIFA_CLK / (Setup + Hold + Strobe).
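As a rough illustration of the EMIFA rule of thumb above, here is a minimal sketch of the arithmetic. It assumes that Setup, Strobe and Hold are counted in EMIFA clock cycles and that each access moves one bus-width word. The function names, the 100 MHz clock, the 1/3/1 cycle timings and the 16-bit bus width are all hypothetical example values, not measured or recommended settings.

```python
def emifa_theoretical_max(emifa_clk_hz, setup_cycles, strobe_cycles,
                          hold_cycles, bus_width_bits=16):
    """Theoretical peak in bytes/second (hypothetical helper).

    Assumes the three timings are in EMIFA clock cycles and that one
    access transfers one word of bus_width_bits.
    """
    cycles_per_access = setup_cycles + strobe_cycles + hold_cycles
    accesses_per_second = emifa_clk_hz / cycles_per_access
    return accesses_per_second * bus_width_bits / 8


def expected_range(theoretical_max, low=0.60, high=0.70):
    """Apply the 60-70% guidance from the reply to a theoretical peak."""
    return theoretical_max * low, theoretical_max * high


# Hypothetical example: 100 MHz EMIFA clock, 1 + 3 + 1 cycles, 16-bit bus.
peak = emifa_theoretical_max(100e6, setup_cycles=1, strobe_cycles=3,
                             hold_cycles=1, bus_width_bits=16)
low_est, high_est = expected_range(peak)
print(f"peak {peak / 1e6:.1f} MB/s, "
      f"expected {low_est / 1e6:.1f} to {high_est / 1e6:.1f} MB/s")
```

With your own timing values substituted, this gives only an upper-bound style estimate; actual numbers depend on the EDMA setup and on other traffic in the system.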

On Q2, yes your understanding is correct.
C674x is essentially C64x+ plus a floating-point unit.
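To make the 2 × 16-bit and 4 × 8-bit packed formats concrete, the sketch below emulates packed addition on a 32-bit word in plain Python. It is only an illustration of the lane behaviour (each lane wraps around on its own and no carry crosses a lane boundary), in the spirit of non-saturating packed adds such as ADD2 and ADD4; the function names are hypothetical and this is not DSP code.

```python
def packed_add16(a, b):
    """Add two 32-bit words as two independent 16-bit lanes (wraparound)."""
    lo = ((a & 0xFFFF) + (b & 0xFFFF)) & 0xFFFF
    hi = (((a >> 16) & 0xFFFF) + ((b >> 16) & 0xFFFF)) & 0xFFFF
    return (hi << 16) | lo


def packed_add8(a, b):
    """Add two 32-bit words as four independent 8-bit lanes (wraparound)."""
    result = 0
    for lane in range(4):
        shift = 8 * lane
        lane_sum = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane_sum << shift
    return result


# The low lane overflows, but the carry does not leak into the high lane.
print(hex(packed_add16(0x0001FFFF, 0x00010001)))
print(hex(packed_add8(0x01FF01FF, 0x01010101)))
```

A plain 32-bit addition of the first pair would give 0x30000, because the carry out of the low half would propagate into the high half; the packed version keeps the two halves separate.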

Regards

Mukul

takahiro kusunoki over 12 years ago in reply to Mukul Bhatnagar

Prodigy 130 points

Hi Mukul,

Thank you for your reply.

Please let me ask you an additional question.

How much throughput can I expect from HPI?

I understand that you don't have performance results for HPI, but I'd appreciate an estimate based on your experience.

Regards,

Taka Kusunoki

Mukul Bhatnagar over 12 years ago in reply to takahiro kusunoki

TI__Guru* 85095 points

Hi Kusunoki-san

Since we don't have any first-hand data from actual hardware, please treat my response purely as guidance and a ballpark estimate.

A quick, coarse way to calculate the maximum UHPI throughput is to look at the HSTROBE timings in the datasheet (Table 5-114, No. 3 and 4). With M taken as the SYSCLK2 period (about 6.67 ns), the (2M+15) ns timing gives a strobe period of 28.34 ns, or a strobe rate of ~35 MHz (assuming a 300 MHz CPU frequency and a 150 MHz SYSCLK2).

For the 16-bit UHPI on this device, this would mean about 70 MBytes/sec. Assuming a realistic 60-80% utilization, you are looking at throughput numbers in the range of ~50 MBytes/sec.

The 60-80% bandwidth utilization assumes:

1) Use of data autoincrementing mode (address followed by a block of data)

2) Higher utilization for bigger block sizes (if you transfer smaller chunks of data, utilization will go down)

3) Standalone throughput (without contention from other peripherals in the system accessing the same data memory).
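The ballpark UHPI numbers above can be reproduced with a short calculation. This is a minimal sketch that assumes M is the SYSCLK2 period in ns, as the estimate above implies; the function name and its default arguments are hypothetical, and the result is an estimate rather than measured data. The values differ slightly from the figures quoted above because M is not rounded to 6.67 ns here.

```python
def uhpi_estimate(sysclk2_hz=150e6, bus_width_bits=16,
                  utilization=(0.60, 0.80)):
    """Coarse UHPI throughput estimate based on the (2M + 15) ns timing.

    Assumes M is the SYSCLK2 period in ns (hypothetical helper).
    Returns (strobe_period_ns, strobe_rate_hz, peak_bytes_per_s,
             (low_bytes_per_s, high_bytes_per_s)).
    """
    m_ns = 1e9 / sysclk2_hz               # SYSCLK2 period in ns
    strobe_period_ns = 2 * m_ns + 15      # (2M + 15) ns
    strobe_rate_hz = 1e9 / strobe_period_ns
    peak = strobe_rate_hz * bus_width_bits / 8
    low, high = (peak * u for u in utilization)
    return strobe_period_ns, strobe_rate_hz, peak, (low, high)


period_ns, rate_hz, peak, (low, high) = uhpi_estimate()
print(f"strobe period {period_ns:.2f} ns, rate {rate_hz / 1e6:.1f} MHz")
print(f"peak {peak / 1e6:.1f} MB/s, "
      f"60-80% range {low / 1e6:.1f} to {high / 1e6:.1f} MB/s")
```

The 60-80% range brackets the ~50 MBytes/sec figure, which is consistent with treating that number as a midpoint estimate.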

In general, the UHPI data rate will be highly dependent on the external master's strobe frequency and the device's UHPI ready signal.

Hope this helps

Regards

Mukul

0 takahiro kusunoki over 12 years ago in reply to Mukul Bhatnagar

Prodigy 130 points

Hi Mukul-san

Let me close this ticket. Thanks to your help, I now understand the throughput of UHPI.

I appreciate your support.

Best Regards,

Taka Kusunoki
