C6746 HPI Throughput and SIMD performance

Hello Team,

Let me ask you some questions about HPI and SIMD.

When an FPGA is connected to the C6746, there are several interface options:

1. EMIF

2. uHPI

3. uPP

I'd like to know the measured throughput of each of these interfaces.

I found the uPP wiki page, but I could not find equivalent data for the other interfaces.

http://processors.wiki.ti.com/index.php/Introduction_to_uPP

Q1) Do you have throughput measurement results for HPI and EMIF?

I also checked the following document and found Table 1 on p. 5.

http://www.ti.com/lit/an/sprabg7/sprabg7.pdf

It seems that the SIMD specification of the C674x is similar to that of the C64x+:

32-bit packed data (2 × 16-bit or 4 × 8-bit)

Q2) Could you confirm whether my understanding is correct?

Best Regards,

Taka Kusunoki

  • Hi

    On Q1, we do not have any throughput data to share for HPI or EMIFA. For EMIFA, you can generally expect somewhere between 60-70% of the theoretical maximum when using EDMA with the default 32-byte burst size. The theoretical maximum is approximately EMIFA_CLK / (Setup + Hold + Strobe).
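
As a rough illustration of the EMIFA estimate above, the formula can be turned into a small calculation. This is only a sketch: the clock rate, the cycle counts, and the 16-bit bus width below are hypothetical example values, not measured or recommended settings, and multiplying by the bus width to convert accesses per second into bytes per second is an added assumption. Turnaround cycles and any other overhead are ignored.

```python
def emifa_async_estimate(emifa_clk_hz, setup, strobe, hold, bus_bytes, efficiency):
    """Ballpark EMIFA asynchronous throughput in bytes per second.

    setup, strobe and hold are cycle counts in EMIFA_CLK cycles.
    Returns (theoretical_peak, expected_sustained).
    """
    cycles_per_access = setup + strobe + hold
    accesses_per_sec = emifa_clk_hz / cycles_per_access   # formula from the reply
    peak = accesses_per_sec * bus_bytes                   # assumption: one bus word per access
    return peak, peak * efficiency


# Hypothetical example: 100 MHz EMIFA_CLK, 1 + 2 + 1 cycles, 16-bit bus.
peak, low = emifa_async_estimate(100e6, setup=1, strobe=2, hold=1,
                                 bus_bytes=2, efficiency=0.60)
_, high = emifa_async_estimate(100e6, setup=1, strobe=2, hold=1,
                               bus_bytes=2, efficiency=0.70)

print(f"peak      : {peak / 1e6:.1f} MB/s")
print(f"sustained : {low / 1e6:.1f} to {high / 1e6:.1f} MB/s")
```

The 60-70% factor is the guidance quoted in the reply; the actual figure depends on the programmed timings and on the EDMA configuration.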

    On Q2, yes, your understanding is correct.
    The C674x is essentially a C64x+ plus a floating-point unit.
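
To make the packed-data format from Q2 concrete, the following sketch emulates what a 2 × 16-bit and a 4 × 8-bit packed addition do to a 32-bit word. It is a plain Python model for illustration only: `add2` and `add4` are hypothetical helper names defined here, and wraparound (non-saturating) lane arithmetic is an assumption of this example rather than a statement about any particular instruction.

```python
MASK32 = 0xFFFFFFFF


def add2(a, b):
    """Add two 32-bit words as two independent 16-bit lanes (wraparound)."""
    lo = ((a & 0xFFFF) + (b & 0xFFFF)) & 0xFFFF
    hi = (((a >> 16) & 0xFFFF) + ((b >> 16) & 0xFFFF)) & 0xFFFF
    return ((hi << 16) | lo) & MASK32


def add4(a, b):
    """Add two 32-bit words as four independent 8-bit lanes (wraparound)."""
    result = 0
    for shift in (0, 8, 16, 24):
        lane = (((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)) & 0xFF
        result |= lane << shift
    return result & MASK32


# A carry out of one lane never reaches its neighbour.
print(hex(add2(0x0001FFFF, 0x00010001)))
print(hex(add4(0x01FF7F10, 0x01010101)))
```

The point of the model is that one 32-bit operation processes two or four data elements at once, with each lane isolated from the others.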

    Regards

    Mukul

  • Hi Mukul,

    Thank you for your reply.

    Please let me ask you an additional question.

    How much throughput can I expect from HPI?

    I understand that you don't have measured HPI performance results, but I'd appreciate it if you could give me an estimate based on your experience.

    Regards,

    Taka Kusunoki

  • Hi Kusunoki-san

    Since we don't have any first-hand data from actual hardware, please treat my response purely as guidance and a ballpark estimate.

    A quick, coarse way to calculate the maximum UHPI throughput is to look at the HSTROBE timings in the datasheet (Table 5-114, No. 3 and 4). The (2M + 15) ns figure gives a strobe period of 28.34 ns, or a strobe rate of ~35 MHz (assuming a 300 MHz CPU frequency and a 150 MHz SYSCLK2).

    For the 16-bit UHPI on this device, that means about 70 MBytes/sec. Assuming a realistic 60-80% utilization, you are looking at throughput in the range of ~50 MBytes/sec.

    The 60-80% bandwidth utilization assumes:
    1) Use of the data auto-incrementing mode (an address followed by a block of data)

    2) Larger block sizes, which give higher utilization (with smaller chunks of data, utilization goes down)

    3) Standalone throughput (without contention from other peripherals in the system accessing the same data memory)

    In general, the UHPI data rate is highly dependent on the external master's strobe frequency and the device's UHPI ready signal.
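
The UHPI estimate above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: M is taken to be the SYSCLK2 period in nanoseconds, which is the reading that matches the 28.34 ns figure in the reply, and one 16-bit transfer per strobe cycle is assumed. The function name and its parameters are hypothetical.

```python
def uhpi_estimate(sysclk2_hz=150e6, bus_bits=16, utilization=(0.60, 0.80)):
    """Ballpark UHPI throughput from the (2M + 15) ns strobe period."""
    m_ns = 1e9 / sysclk2_hz               # assumption: M = SYSCLK2 period in ns
    strobe_period_ns = 2 * m_ns + 15      # (2M + 15) ns from the reply
    strobe_rate_hz = 1e9 / strobe_period_ns
    peak = strobe_rate_hz * bus_bits / 8  # bytes/s, one transfer per strobe
    low, high = (peak * u for u in utilization)
    return strobe_period_ns, peak, low, high


period_ns, peak, low, high = uhpi_estimate()
print(f"strobe period : {period_ns:.2f} ns")
print(f"peak          : {peak / 1e6:.1f} MB/s")
print(f"60-80% range  : {low / 1e6:.1f} to {high / 1e6:.1f} MB/s")
```

The 60-80% window brackets the ~50 MBytes/sec ballpark quoted above; small blocks or contention from other masters would push the result toward or below the low end.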

    Hope this helps

    Regards

    Mukul

  • Hi Mukul-san

    Let me close this ticket. Thanks to your help, I now understand the UHPI throughput.

    I appreciate your support.

    Best Regards,

    Taka Kusunoki